IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 8, NO. 4, DECEMBER 2007

On Color-, Infrared-, and Multimodal-Stereo Approaches to Pedestrian Detection

Stephen J. Krotosky and Mohan Manubhai Trivedi

Abstract—This paper presents an analysis of color-, infrared-, and multimodal-stereo approaches to pedestrian detection. We design a four-camera experimental testbed consisting of two color and two infrared cameras for capturing and analyzing various configuration permutations for pedestrian detection. We incorporate this four-camera system in a test vehicle and conduct comparative experiments of stereo-based approaches to obstacle detection using unimodal color and infrared imageries. A detailed analysis of the color and infrared features used to classify detected obstacles into pedestrian regions is used to motivate the development of a multimodal solution to pedestrian detection. We propose a multimodal trifocal framework consisting of a stereo pair of color cameras coupled with an infrared camera. We use this framework to combine multimodal-image features for pedestrian detection and to demonstrate that the detection performance is significantly higher when color, disparity, and infrared features are used together. This result motivates experiments and discussion toward achieving multimodal-feature combination using a single color and a single infrared camera arranged in a cross-spectral stereo pair. We demonstrate an approach to registering multiple objects across modalities and provide an experimental analysis that highlights issues and challenges of pursuing the cross-spectral approach to multimodal and multiperspective pedestrian analysis.

Index Terms—Active safety, collision avoidance, intelligent vehicles, person detection, tracking.

I. INTRODUCTION

PEDESTRIAN safety is a problem of global significance. Of the 1.17 million yearly worldwide traffic fatalities, 65% are pedestrian-related [1].
In fully industrialized nations, pedestrian safety remains a high priority, with pedestrian fatalities accounting for 10.9% of all traffic deaths in the United States [2] and fatalities in Britain twice as likely for pedestrians as for vehicle occupants [3]. In rapidly industrializing countries, pedestrian fatalities are overwhelmingly more costly in both proportion and sheer volume. Studies have found that pedestrian fatalities accounted for over half of all traffic deaths in both China [4] and India [5]. Naturally, an issue of this impact has received significant attention from all aspects of the research community. Ongoing computer-vision research is making strides to detect and to track pedestrians from both moving vehicles and transportation infrastructure. These approaches to pedestrian detection use visual or infrared imagery [6] in both monocular and stereo-camera configurations. The choice of visual or infrared imagery is significant, as each provides disparate, yet complementary information about a scene. Visual cameras capture the reflective light properties of objects in a scene, whereas infrared cameras are sensitive to the thermal emissivity properties of the same objects. Features extracted from each modality can be used to determine the presence of pedestrians in a scene. Additionally, their combination can provide a level of feature robustness beyond what is readily obtained from a single camera type.

Manuscript received January 15, 2007; revised April 23, 2007 and June 21, 2007. This work was supported in part by the Technical Support Working Group and in part by the U.C. Discovery Grant. The Associate Editor for this paper was U. Nunes. The authors are with the Computer Vision and Robotics Research Laboratory, University of California, San Diego, CA USA. Color versions of one or more of the figures in this paper are available online. Digital Object Identifier /TITS
Additionally, multiple-camera systems have been incorporated into pedestrian detection in order to extract depth estimates, which are crucial to the tasks of collision mitigation and occlusion handling. To register unimodal stereo imagery, correspondence-matching techniques [7] are often sufficient. However, in a multimodal multiperspective system, the different appearance of objects in the visual and infrared imagery makes finding a robust correspondence technique challenging [8]. This paper presents research toward the development of a multimodal multiperspective system that can extract the features that are necessary for robust pedestrian detection. We design an experimental testbed consisting of two color and two infrared cameras for comparing multicamera approaches to pedestrian detection. We perform comparative experiments of stereo-based detection approaches using unimodal imagery, demonstrate the high obstacle-detection rate achievable with both color and infrared imageries, and analyze the features and properties of the color and infrared imageries that are useful in classifying the detected obstacles into pedestrian regions. From this analysis, we propose a multimodal trifocal framework consisting of a stereo pair of color cameras coupled with a single infrared camera. Using a calibrated three-camera setup allows accurate and robust registration of color, disparity, and infrared features using the properties of the trifocal tensor. We demonstrate that the combination of color, disparity, and infrared information can yield significant gains in pedestrian detection compared with detectors trained on only unimodal or stereo features. This result motivates experiments and discussion of a cross-spectral stereo framework for pedestrian detection.
Using a single color and a single infrared camera arranged in a stereo pair, we demonstrate an approach to registering color and infrared features and discuss the issues and challenges of pursuing the cross-spectral framework to multimodal and multiperspective pedestrian analysis.
II. RELATED RESEARCH

Our focus on pedestrian detection is concerned with the methodologies and challenges of conventional camera systems. Specifically, we will review studies that utilize the color and infrared imageries in single- and multicamera configurations. For a more comprehensive review of computer-vision-based approaches to pedestrian detection, we refer the reader to a recent survey paper by Gandhi and Trivedi [9]. Typically, to find pedestrians in crowded and varied scenes with a single camera, a trained set of features used to identify pedestrian regions is extracted. In color imagery, common features include Haar wavelet [10] or Gabor filter [11] responses, component-based gradient responses [12], image contours with mean field models [13], implicit shape models [14], and local receptive fields [15]. Similarly, features are extracted in monocular infrared approaches. Typically, the features extracted from the infrared imagery are selected for their relation to the unique thermal signature of humans that enables straightforward segmentation. Such features include thermal hotspots [16], body-model templates [17], shape-independent multidimensional histograms, inertial and contrast-based features [18], and histograms of oriented gradients [19]. The features extracted from monocular imagery are then typically used in a classification scheme using many positive and negative examples. The most common approach to classification is to use a support vector machine (SVM) [10], [12], [13], [15], [16], [19], [20]. Additional approaches to classification include template matching [17], [21], convolutional neural networks [22], and Chamfer distance matching [14]. While good pedestrian detection in monocular imagery can be achieved, a single-camera approach is limited in one critical area: accurate and reliable depth estimation.
To achieve this, a multicamera system is necessary, typically arranged in a stereo-vision configuration. Visual stereo-camera systems [23]-[25] utilized dense-stereo matching to identify candidate pedestrian regions and to determine their distance from the camera. Infrared-stereo-camera systems have followed, which combine the benefits of infrared features with the powerful depth estimation inherent in stereo vision [21], [26]. Additionally, a four-camera system that separately combines color-stereo and infrared-stereo systems has been investigated [27]. In typical stereo approaches to pedestrian detection, depth estimates yield an initial set of obstacle regions that can then be classified as pedestrians using monocular-image features.

III. STEREO-BASED PEDESTRIAN DETECTION

A fundamental step to analyzing pedestrians in stereo imagery is to detect obstacles and to localize their position in 3-D space. We adapt a classical approach to obstacle detection in stereo imagery proposed by Labayrade et al. [28], which utilizes the concept of v-disparity to identify obstacles in a scene. The v-disparity is a histogram of the stereo disparity image that accumulates the disparity values present in each row of the image. This histogram has been shown to be useful in identifying obstacles when the camera is relatively parallel to the imaged scene, so that objects appear at distinct planes in the disparity domain [24], [25], [27].

Fig. 1. Flowchart of the stereo disparity-based obstacle-detection algorithm.

A. Disparity-Based Obstacle Detection

Our goal is to provide a comparative analysis of color-stereo and infrared-stereo imageries for pedestrian detection. We use the v-disparity approach to obstacle detection so that it can be implemented for both color-stereo and infrared-stereo imageries without modification. We examine each approach's ability to generate robust stereo disparities for determining obstacle areas in a scene.
This comparison of low-level detection accuracy will lead to an evaluation of each camera type's potential for higher level pedestrian classification and analysis. Fig. 1 shows a flowchart of the obstacle-detection algorithm.

1) Dense-Stereo Matching: We first perform dense-stereo matching to yield disparity estimates of the imaged scene. We select the correspondence-matching algorithm by Konolige [29] for its ease of use and reliable disparity generation for both color-stereo and infrared-stereo imageries. Example disparity images from each approach are shown in Fig. 2.

2) u- and v-Disparity Image Generation: The u- and v-disparity images are histograms that bin the disparity values d for each column or row in the image, respectively. The resulting v-disparity histogram image indicates the density of disparities for each image row v, whereas the u-disparity image shows the density of disparities for each image column u. Fig. 3 shows an example of u-disparity images, and Fig. 4 shows the corresponding v-disparity images generated from the color-stereo and infrared-stereo disparity maps in Fig. 2. Notice that the u-disparity images in Fig. 3 show three distinct horizontal regions corresponding to the three pedestrians in the scene. It is these regions that we wish to detect in order to build candidate pedestrian areas. The region spanning
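Since the u- and v-disparity images are plain histograms over the disparity map, their construction can be sketched in a few lines. This is an illustrative numpy version of the idea, not the authors' implementation; the explicit loops favor clarity over speed:

```python
import numpy as np

def uv_disparity(disp, d_max=64):
    """Accumulate disparity histograms per column (u) and per row (v).

    disp: H x W integer disparity map; invalid pixels (< 0) are ignored.
    Returns (u_disp, v_disp): d_max x W and H x d_max histogram images.
    """
    h, w = disp.shape
    u_disp = np.zeros((d_max, w), dtype=np.int32)
    v_disp = np.zeros((h, d_max), dtype=np.int32)
    for v in range(h):
        for u in range(w):
            d = disp[v, u]
            if 0 <= d < d_max:
                u_disp[d, u] += 1   # disparity density per image column
                v_disp[v, d] += 1   # disparity density per image row
    return u_disp, v_disp
```

Horizontal runs at a fixed d in u_disp then mark obstacle spans, while in v_disp an upright obstacle appears as a vertical peak and the ground as a sloped line.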
the entire length at the top of the u-disparity image indicates the background plane and can be filtered from processing. Similarly, the v-disparity images in Fig. 4 show vertical peaks of high density for both the background plane and the range of disparities containing pedestrians. These regions also need to be detected to build pedestrian candidates. Additionally, the downward-sloping trend for each row in the v-disparity image is exploited to estimate the ground plane in the scene [28].

Fig. 2. Example disparity images from color- and infrared-stereo images. (a) Color. (b) Infrared.

Fig. 3. Example u-disparity images from color- and infrared-stereo images. (a) Color. (b) Infrared.

Fig. 4. Example v-disparity images from color- and infrared-stereo images along with the detected ground plane. (a) Color with ground plane. (b) Infrared with ground plane.

Fig. 5. ROI generation in u- and v-disparity images with color- and infrared-stereo images. (a) Color u-disparity. (b) Infrared u-disparity. (c) Color v-disparity. (d) Infrared v-disparity.

Fig. 6. Bounding-box candidates with color- and infrared-stereo images. (a) Color. (b) Infrared.

3) Ground-Plane Estimation: To estimate the ground plane, we extract candidate points in the v-disparity image. For each column corresponding to a disparity d in the v-disparity image, we select the lowest pixel location whose value is above a threshold as a candidate ground-plane point. The ground plane is estimated by fitting these candidate points to a line with a robust linear-regression scheme that uses weighted least squares, iteratively reweighted with the bisquare weighting function. Fig. 5(b) and (d) shows the v-disparity images for color-stereo and infrared-stereo imageries with the candidate ground-plane points in red and the fitted ground-plane estimate plotted in cyan.
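The candidate extraction and bisquare-reweighted line fit described above can be sketched as follows. This is an illustrative reconstruction; the threshold, iteration count, and scale constant are our choices, not values from the paper:

```python
import numpy as np

def fit_ground_plane(v_disp, thresh=5, iters=10):
    """Robust line fit v = a*d + b to ground-plane candidate points.

    For each disparity column d, the lowest image row whose v-disparity
    count exceeds `thresh` is a candidate point. The line is fit by
    iteratively reweighted least squares with bisquare weights.
    """
    h, n_d = v_disp.shape
    ds, vs = [], []
    for d in range(n_d):
        rows = np.nonzero(v_disp[:, d] > thresh)[0]
        if rows.size:
            ds.append(d)
            vs.append(rows[-1])          # lowest pixel = largest row index
    ds, vs = np.array(ds, float), np.array(vs, float)
    A = np.stack([ds, np.ones_like(ds)], axis=1)
    w = np.ones_like(vs)
    for _ in range(iters):
        Aw = A * w[:, None]              # weighted least squares
        coef, *_ = np.linalg.lstsq(Aw, vs * w, rcond=None)
        r = vs - A @ coef
        s = 6.0 * np.median(np.abs(r)) + 1e-9   # robust residual scale
        u = np.clip(r / s, -1, 1)
        w = (1 - u**2) ** 2              # bisquare weights; outliers -> 0
    return coef                          # (slope a, intercept b)
```

Outlying candidates (e.g., an obstacle touching the bottom of a column) receive near-zero weight after a few iterations, so the fit locks onto the dominant ground line.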
Using dense stereo with robust point-candidate generation and iterative line fitting, we obtain robust ground-plane estimates in both color- and infrared-stereo imageries.

4) Candidate-Bounding-Box Generation: Bounding-box candidates can be extracted from regions of interest (ROIs) in the u- and v-disparity images. The ROIs in the u-disparity image are extracted by scanning the rows of the image for continuous spans where the histogram value exceeds the given threshold. Fig. 5(a) and (b) overlays the extracted regions in green on the u-disparity image. The ROIs are extracted from the v-disparity image by selecting columns where the sum of the histogram values above the ground plane is greater than the threshold. The ROI spans from the ground plane to the highest point in the column that exceeds the given threshold. Fig. 5(c) and (d) shows the extracted regions in green on the v-disparity image. The candidate bounding boxes are selected from the ROIs in the u- and v-disparity images based on their disparity values. For a given disparity d, the widths of the bounding boxes are determined by the ROIs found in the u-disparity image, and the heights are derived from the ROIs in the v-disparity image. Large bounding boxes associated with background regions are filtered, and the remaining candidates are shown in Fig. 6.

5) Candidate Filtering and Merging: As shown in Fig. 6, there are often multiple overlapping candidate bounding boxes
generated. This occurs because the disparities associated with a single pedestrian span a range of values, particularly as the pedestrian moves closer to the camera. We merge significantly overlapping candidates if the disparities that are associated with the bounding boxes are close. The final pedestrian-candidate bounding boxes are shown in Fig. 7. Notice how the overlapping candidates have merged into the correct bounding boxes corresponding to the pedestrians in the scene.

Fig. 7. Example of the final selection of pedestrian candidates after bounding-box merging with color- and infrared-stereo images. (a) Color. (b) Infrared.

Fig. 8. Experimental testbed: Two color cameras and two infrared cameras arranged in stereo pairs and mounted to the front of the LISA-P testbed.

B. Experimental Framework and Testbed

We establish a framework for experimenting with and analyzing pedestrian-detection approaches to facilitate a direct side-by-side comparison of the data coming from color-stereo and infrared-stereo imageries. A custom rig was designed, consisting of a matched color-stereo pair and a matched infrared-stereo pair. The two pairs share identical baselines and are aligned in pitch, roll, and yaw to maximize the similarities in the field of view. Calibration data were obtained by illuminating a checkerboard pattern with high-intensity halogen bulbs so that the checks would be visible in both color and infrared imageries, and standard calibration techniques could be applied to obtain the intrinsic and extrinsic parameters of the cameras. The calibrated rig was mounted on the grill of the Laboratory for Intelligent and Safe Automobiles (LISA)-P testbed [30], [31], a Volkswagen Passat equipped with the computing, power, and cabling requirements necessary to synchronously capture and save the four simultaneous camera streams. Fig.
8 shows the four-camera rig properly arranged and mounted on the LISA-P.

Fig. 9. Merge and miss errors from pedestrian-candidate generation. (a) Color merged. (b) IR merged. (c) Color missed. (d) IR missed.

C. Experimental Analysis of Disparity-Based Obstacle Detection in Color- and Infrared-Stereo Imageries

Experiments were conducted so that multiple pedestrians walk in front of the LISA-P testbed with varying degrees of depth, complexity, and occlusion. To allow for direct comparison, color and infrared videos were captured synchronously and were analyzed using the disparity-based obstacle-detection algorithm in Section III-A. Successful detection was indicated by a bounding box that is correctly overlaid on a corresponding pedestrian region. If our merging process combined two separate pedestrian regions, we consider the detection correct, yet note it as a merge error [Fig. 9(a) and (b)]. We reason that errors associated with the lack of sophistication in our chosen merging algorithm should not adversely affect the detection rate, as the desire is to evaluate the effectiveness of identifying pedestrian regions and not the robustness of the merging procedure. This is also a fair assessment for collision mitigation, as finding all the critical areas in the scene is given priority over discerning the merged bounding boxes. Therefore, false negatives were counted only if a pedestrian region was missed [Fig. 9(c) and (d)], and false positives were counted when a bounding box selected a nonpedestrian region. Still, had we incorporated the merge errors, the total detection rate would decrease by only 1% for the color and 1.4% for the infrared. Table I shows the compiled results of the comparative experiments, and Fig. 10 shows additional examples of detection.

IV. STEREO-BASED PEDESTRIAN-DETECTION ANALYSIS

Our comparative experiments in Section III with stereo-based pedestrian detection for the color and infrared imageries indicate a very high level of detection accuracy and a low
false-positive rate in both modalities. However, we provide a deeper analysis to help understand and evaluate the success of these experiments.

TABLE I. COMPARISON BETWEEN COLOR- AND INFRARED-STEREO IMAGERIES FOR DISPARITY-BASED OBSTACLE DETECTION

We note that the difference in the pedestrian counts in Table I comes from the position and view differences of the color-stereo and infrared-stereo cameras. As only pedestrians that are fully visible in the image are considered, there are frames where a pedestrian is only visible in one modality. However, given the high number of examples, the detection rates can be directly compared despite the different tallies. The experiments yielded such a high rate of detection because the captured images did not include nonpedestrian obstacles, such as other vehicles or bicyclists, so any detected obstacle region is assumed to be a pedestrian. For our experiments, this assumption is appropriate, as we are interested in evaluating how color and infrared dense-stereo correspondences can be used in low-level pedestrian detection. In that respect, our experiments demonstrate that both achieve high rates of low-level obstacle detection, which is an imperative first step toward robust pedestrian detection and collision mitigation. However, in real-world driving scenarios, this is not sufficient for pedestrian detection. Detected obstacles can include a variety of objects found in common driving scenes other than pedestrians, and additional processing is necessary to filter the detected obstacles to identify pedestrians. For example, bounds on pedestrian bounding-box features, such as size, disparity, and aspect ratio, can be learned or heuristically selected to filter out bounding boxes associated with other objects in the scene [27].
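As an illustration of such heuristic bounds, a size/aspect filter can use the stereo depth Z = f·B/d and the pinhole model to convert a box's pixel height into meters. All numeric values here (focal length, baseline, height and aspect ranges) are hypothetical, chosen for the example rather than taken from the paper:

```python
def plausible_pedestrian(box, d, f=800.0, baseline=0.3,
                         h_range=(1.0, 2.2), ar_range=(0.2, 0.7)):
    """Heuristic filter on a candidate bounding box (illustrative bounds).

    box: (u0, v0, u1, v1) in pixels; d: disparity in pixels.
    Depth from stereo: Z = f * baseline / d; the metric height follows
    from the pinhole model: H = pixel_height * Z / f.
    """
    if d <= 0:
        return False
    z = f * baseline / d                          # depth in metres
    height_m = (box[3] - box[1]) * z / f          # metric height of the box
    aspect = (box[2] - box[0]) / max(box[3] - box[1], 1)  # width / height
    return h_range[0] <= height_m <= h_range[1] and \
           ar_range[0] <= aspect <= ar_range[1]
```

At d = 24 px (i.e., 10 m away under these assumed parameters), a 54 × 136 px box maps to a 1.7-m-tall, upright shape and passes, while a wide, car-like box fails on aspect ratio.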
However, such size-based filtering techniques will have difficulty with nonpedestrian bounding boxes that fall within the selected bounds of pedestrian candidates. Additionally, the selection of appropriately robust bounds is a challenging task, as bounding-box sizes can vary significantly with changes in pedestrian pose and disparity fidelity. To achieve a more reliable detection of pedestrian candidates, it is necessary to use discriminant image features in a learning framework, such as those discussed in Section II. While justification can be made for selecting either color or infrared features for pedestrian detection, a more interesting proposition is to use both, so that a system can incorporate a much larger set of discriminant image features to improve detection. For example, the thermal hotspots of humans that often make pedestrians easily segmentable can be combined with the color-segmentation features common to challenging tasks, such as detecting articulated poses for classifying human interactions [32]. Although stereo color and infrared analyses can be separately combined [27], a more economical and desirable solution would be to combine the color, disparity, and infrared features in an integrated detection framework. In Section V, we propose a multimodal trifocal framework consisting of a stereo pair of color cameras coupled with a single infrared camera. Such a setup allows for accurate and robust registration of the color and infrared imageries using the trifocal tensor. We use this registration framework to design a pedestrian detector that integrates color, disparity, and infrared features and yields higher detection rates than using separate features. In Section VI, we investigate the feasibility of this integrated detection framework using a minimum-camera cross-spectral stereo system with a single color and a single infrared camera.
The challenge is to register image features in cross-spectral stereo, where conventional and state-of-the-art stereo-correspondence algorithms fail due to the disparate nature of the color and infrared imageries. As a step toward a dense-stereo algorithm for cross-spectral stereo imagery, we propose a stereo-registration algorithm for multimodal imagery [8], evaluate its applicability to pedestrian detection, and highlight the challenges of achieving robustness in this framework.

V. MULTIMODAL TRIFOCAL FRAMEWORK FOR PEDESTRIAN DETECTION

The benefits of color-, disparity-, and infrared-image features can be incorporated using a three-camera approach consisting of a standard color-stereo rig paired with a single infrared camera. The trifocal framework, shown in Fig. 11, uses disparity estimates from the stereo imagery to register corresponding pixels in the infrared imagery. This can be done quickly and efficiently with the trifocal tensor, the set of matrices relating the correspondences between the three images. The trifocal tensor can be estimated by minimizing the algebraic error of point correspondences [33]. The point correspondences can be obtained for trifocal imagery using the same calibration techniques used for stereo calibration, where the calibration board is visible in each trifocal view. While only seven point-point-point correspondences are required to compute the trifocal tensor, in practice, we use many more correspondences to smooth errors in the point estimates. The resulting trifocal tensor is written as T = [T_1, T_2, T_3], where T_i is a 3 × 3 matrix for the ith image in the set. From this tensor notation, standard two-view geometry parameters, such as the fundamental matrices F, the epipoles e, and the projection matrices P, can be determined.
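The tensor slices T_i and the point transfer they support can be exercised in a short sketch. This is our illustration, not the paper's implementation; it uses the standard canonical-camera construction T_i = a_i b_4^T − a_4 b_i^T and the point-line-point transfer from multiple-view geometry texts, assuming the first camera is P_1 = [I | 0]:

```python
import numpy as np

def trifocal_from_cameras(P2, P3):
    """Trifocal tensor slices for canonical cameras P1 = [I | 0].

    T_i = a_i b4^T - a4 b_i^T, with a_i, b_i the columns of P2, P3.
    """
    a, b = P2, P3                                 # 3x4 camera matrices
    return [np.outer(a[:, i], b[:, 3]) - np.outer(a[:, 3], b[:, i])
            for i in range(3)]

def transfer_point(T, x1, x2):
    """Transfer a correspondence x1 <-> x2 to the third view.

    x3^k = x1^i l2_j T_i^{jk}, where l2 is any line through x2 other
    than the epipolar line of x1 (a generic choice is used here).
    """
    M = sum(x1[i] * T[i] for i in range(3))       # 3x3 contraction over i
    l2 = np.cross(x2, [1.0, 0.7, -0.3])           # some line through x2
    return M.T @ l2                               # homogeneous point x3
```

The transferred point is only defined up to scale, as usual for homogeneous coordinates, so agreement is checked by collinearity with the true projection.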
Additionally, given a point correspondence x ↔ x′, we can estimate the point transfer to the third-image point x″ as

[x′]_× ( Σ_i x^i T_i ) [x″]_× = 0_{3×3}.  (1)

The dense-stereo matching gives the x ↔ x′ correspondences, and the infrared point transfer is estimated and aligned
to the color reference image. Fig. 12 shows an example set of the registered trifocal imagery.

Fig. 10. Example of the final selection of pedestrian candidates with color- and infrared-stereo input images.

Fig. 11. Flowchart of the trifocal-tensor approach to pedestrian detection for the color-stereo and infrared framework.

Fig. 12. Registered color, disparity, and infrared imageries using the trifocal tensor. (a) Color. (b) Disparity. (c) Aligned infrared.

A. Experimental Evaluation of Pedestrian Detection Using Color-, Disparity-, and Infrared-Image Features

To determine the effect of using multimodal features for pedestrian detection, we use the trifocal framework to register the color, disparity, and infrared imageries into a single five-channel multispectral image, allowing for the comparison of pedestrian detectors that make use of different combinations of image features. To train the detectors, positive pedestrian samples are manually annotated, and for each positive sample, ten negative samples are generated by moving the positive bounding box to a random nonoverlapping position in the image. All samples are resized to a common size (24 × 60 pixels), as shown in Fig. 13. We elect to extract histogram-of-oriented-gradient features similar to those proposed by Dalal and Triggs [34]. For each of the color, disparity, and infrared images, we compute an X × Y × Θ-element histogram, where X, Y, and Θ are the numbers of histogram bins in width, height, and gradient orientation, respectively. For our experiments, we use a histogram resulting in a 128-element feature vector for each image type. This descriptor was selected on the notion that gradient information can discriminate a pedestrian from other objects.
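The per-channel descriptor can be sketched as follows. Note that the 4 × 4 × 8 bin split is our assumption (the text states only the 128-element total, and 4 · 4 · 8 = 128), and this simplified, unnormalized histogram merely stands in for the HOG-style descriptor of [34]:

```python
import numpy as np

def grad_orientation_hist(img, nx=4, ny=4, nth=8):
    """X*Y*Theta magnitude-weighted histogram of gradient orientations.

    Bin counts nx=ny=4, nth=8 are assumed, matching the stated
    128-element vector; the paper's exact binning is not given.
    """
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)       # unsigned orientation
    h, w = img.shape
    hist = np.zeros((ny, nx, nth))
    ys = np.minimum((np.arange(h) * ny) // h, ny - 1)   # row -> cell
    xs = np.minimum((np.arange(w) * nx) // w, nx - 1)   # col -> cell
    ts = np.minimum((ang * nth / np.pi).astype(int), nth - 1)
    for v in range(h):
        for u in range(w):
            hist[ys[v], xs[u], ts[v, u]] += mag[v, u]
    return hist.ravel()                           # 128-element vector

def multispectral_feature(color_gray, disparity, infrared):
    """Concatenate per-modality descriptors into one feature vector."""
    return np.concatenate([grad_orientation_hist(c)
                           for c in (color_gray, disparity, infrared)])
```

Concatenating the three 128-element vectors yields the 384-element color + disparity + infrared representation that a single SVM can then be trained on.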
While we make no claims of feature optimality, gradient-based features are common in the pedestrian-detection literature, and we feel that their use is sufficient for evaluating the effect of multispectral-image features on detection accuracy. We train pedestrian detectors for all combinations of the color, disparity, and infrared features using an SVM with a radial basis function as the kernel type [35]. We train each SVM using 865 annotated positive samples (and 8650 negative samples) collected from video obtained while driving the LISA-P testbed in store parking lots and local roads in La Jolla, California. Similarly, we evaluate using a test set of 641 positive samples and 6410 negative samples from a separate set of videos obtained while driving the LISA-P. Pedestrians in the training and testing sets range from approximately 3 to 30 m from the vehicle. The resulting receiver-operating-characteristic (ROC) curves are plotted in Fig. 14, and detection rates for a 5% false-positive rate are shown in Table II. The pedestrian detector that combines the color, disparity, and infrared features outperforms the other detectors by a significant margin. By integrating the features, we exploit the complementary nature of multimodal imagery to yield more than a 5% increase in detection for a 5% false-positive rate. We also note that the combinations of color + infrared and color + disparity do not outperform the detector that is trained only on color. We suspect that this is because gradient-based features are not suitable for discriminating pedestrians in the low-contrast disparity and infrared images. This drop in performance is evident in the detectors that are trained only on disparity or infrared. Given the relatively low number of positive samples, the addition of only disparity or infrared seems only to add noise. It is then all the more interesting
that the color + disparity + infrared-trained detector performs so well. The discriminant gains from combining all the features greatly outweigh the noise added from nonideal gradient features. We anticipate that greater gains in accuracy could be achieved by using more discriminant features in each image spectrum.

Fig. 13. Selection of positive and negative samples used for training pedestrian detectors. Each sample consists of color, disparity, and infrared images. (a) Positive samples. (b) Negative samples.

Fig. 14. ROC for pedestrian detection. The combination of color, disparity, and infrared features performs the best.

TABLE II. PEDESTRIAN-DETECTION RATE FOR 5% FALSE-POSITIVE RATE

VI. CROSS-SPECTRAL STEREO-CORRESPONDENCE MATCHING FOR PEDESTRIAN DETECTION

The multimodal trifocal framework demonstrates the benefit of integrating the color, disparity, and infrared features for pedestrian detection. While an attractive framework, its requirement of two color cameras for the stereo-correspondence matching is redundant from a feature perspective. We investigate achieving the stereo-correspondence matching using cross-spectral stereo: a single color and a single infrared camera. While a cross-spectral stereo system has the potential to integrate the color, disparity, and infrared detail, the nontrivial problem of accurate and robust stereo registration must first be resolved. Toward achieving this, we have developed an algorithm for matching regions in cross-spectral stereo images [8]. This approach gives a robust disparity estimation with statistical confidence values for images that have an initial object segmentation. Fig. 15 shows the algorithmic framework of the region-based stereo algorithm. The acquired and rectified image pairs are denoted as I_L, for the left color image, and I_R, for the right infrared image. Due to the high differences in the imaging characteristics, the matching is focused on the foreground pixels from an initial-segmentation estimate. To obtain the segmentation in a moving vehicle, we use an optical-flow-based approach to detect moving pedestrians in the scene [36]. Our experiments have shown that this approach is relatively robust at low speeds (<10 mi/h) and could be adapted for higher speeds with egomotion estimation. Low-speed analysis is useful in a variety of driving scenarios, including parking lots, residential and shopping areas, and starting or stopping at a traffic signal. Additionally, while stationary pedestrians pose a segmentation issue for optical-flow techniques, we expect that static objects above the ground can be identified through long-term tracking of the scene. Given the optical-flow estimates for motion in the horizontal m_u and vertical m_v directions, as well as occluded regions m_occ, we estimate the foreground regions F where there is motion in either the horizontal or vertical direction and no occlusion:

F = ((|m_u| > 0) ∨ (|m_v| > 0)) ∧ (m_occ = 0).  (2)

Morphological operations smooth the estimate. We denote the color and infrared foreground images as F_L and F_R, respectively, which are shown in Fig. 16. The color
The color image is also converted to grayscale for mutual-information-based matching. The matching is performed by fixing a window in one foreground image and sliding a correspondence window along the second image. Given the height h and width w of the image, for each column i ∈ {0, ..., w}, let W_L,i be a reference window in the left image of height h and width M. The width M is experimentally determined for a given scene and is typically less than the width of the target object in the scene; in our case, M = 31 pixels. The height h is the largest vertical span of the foreground within the reference window. The correspondence window W_R,i,d in the right image also has height h but is located at column i + d, where d is a disparity offset. For a given column i, a reference window is determined, and correspondence values are found for all d ∈ {d_min, ..., d_max}.

Given the two correspondence windows W_L,i and W_R,i,d, we first linearly quantize the image to N levels such that N ≤ √(Mh/8) [37], as this has been shown to give a good number of levels for maximizing the mutual information between image regions. The similarity between the two image patches can be measured by the mutual information between them, which is defined as

I(L, R) = Σ_{l,r} P_{L,R}(l, r) log [ P_{L,R}(l, r) / (P_L(l) P_R(r)) ]   (3)

where P_{L,R}(l, r) is the joint probability mass function (pmf), and P_L(l) and P_R(r) are the marginal pmfs of the left and right image patches, respectively. P_{L,R}(l, r) is computed as the normalized 2-D histogram of the image intensities, and the marginal pmfs are determined by summing along one dimension of this histogram. We define the mutual information between the two correspondence windows as I_{i,d}, where i is the center of the reference window and i + d is the center of the moving window. For each column i, we compute I_{i,d} for d ∈ {d_min, ..., d_max} and choose the best disparity d_i as the one that maximizes the mutual information

d_i = arg max_d I_{i,d}.   (4)

Fig. 15. Flowchart of region-based correspondence matching in cross-spectral stereo for pedestrian detection.
Fig. 16. Outlined foreground extraction for color and infrared images. (a) Color segmentation. (b) Infrared segmentation.

Fig. 17 shows example correspondence windows and a plot of the mutual information over the range of disparities. The red box in the color image is the reference window, and the green boxes in the infrared image are the candidate match windows. We assign a vote for d_i to all the foreground pixels in the reference window. Define a disparity voting matrix D_L of size (h, w, d_max − d_min + 1) over the range of disparities. Then, for each foreground pixel in a given reference window W_L,i, (u, v) ∈ (W_L,i ∩ F_L), we accumulate the disparity voting matrix at D_L(u, v, d_i). Since the correspondence windows are M pixels wide, each column in the disparity voting matrix will receive M votes. For each pixel (u, v), D_L can be thought of as a distribution of matching disparities from the correspondence windows. Since it is assumed that a single person is at a single distance from the camera, a good match should concentrate a large number of votes on a single disparity value, whereas a poor match will be distributed across the range of disparity values. The best disparity value and its corresponding confidence at each pixel are then found as

D*_L(u, v) = arg max_d D_L(u, v, d)   (5)
C*_L(u, v) = max_d D_L(u, v, d).   (6)

For a pixel (u, v), the value of C*_L(u, v) is the number of votes for the best disparity value D*_L(u, v). A higher confidence value indicates that the disparity maximized the mutual information for a large number of correspondence windows and, in turn, is more likely to be accurate.
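A minimal sketch of the mutual-information matching of (3) and (4), estimating the joint pmf with a 2-D histogram and sweeping the candidate disparities for one reference column. The function names, the fixed bin count, and the treatment of out-of-bounds windows are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def mutual_information(a, b, n_levels=32):
    """I(A;B) from the normalized joint histogram of two equal-size patches, as in (3)."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=n_levels)
    p_ab = hist / hist.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal pmf of patch A
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal pmf of patch B
    nz = p_ab > 0                           # only nonzero joint entries contribute
    return float((p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum())

def best_disparity(left, right, i, M, d_range):
    """Sketch of (4): slide the window over candidate disparities, keep the argmax."""
    h, w = left.shape
    ref = left[:, i:i + M]
    scores = []
    for d in d_range:
        if 0 <= i + d and i + d + M <= w:
            scores.append(mutual_information(ref, right[:, i + d:i + d + M]))
        else:
            scores.append(-np.inf)          # window falls outside the image
    return d_range[int(np.argmax(scores))]
```

Because mutual information depends only on the co-occurrence statistics of the two patches, the same score can be maximized across a grayscale and an infrared window even though their intensities are unrelated pixel-for-pixel.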
Fig. 17. Mutual information for finding corresponding windows in cross-spectral stereo imagery. (a) Color image. (b) Infrared image. (c) Mutual information for the correspondence window.
TABLE III. CROSS-SPECTRAL STEREO REGISTRATION OF PEDESTRIAN REGIONS
Fig. 18. Resulting disparity image D* from combining the left and right disparity images D*_L and D*_R, as defined in (7). (a) Disparity image. (b) Unaligned. (c) Aligned.

Values for D*_R and C*_R are similarly determined by making the right image the reference. The values of D*_R and C*_R are then shifted by their disparities so that they align to the left image. The aligned disparity images are then combined using an AND operation, which experimentally gives the most robust results. For all pixels (u, v) such that C*_L(u, v) > 0 and C*_R(u, v) > 0

D*(u, v) = D*_L(u, v), if C*_L(u, v) ≥ C*_R(u, v)
           D*_R(u, v), if C*_L(u, v) < C*_R(u, v).   (7)

The resulting disparity image D*(u, v) can be used to register multiple objects in the scene, even at very different depths from the camera. Fig. 18 shows the registration result for the images carried throughout the algorithmic derivation. Fig. 18(a)-(c) shows the disparity image D*, the initial alignment of the color and infrared images, and the alignment after shifting the foreground pixels by the resulting disparity image, respectively. The infrared foreground pixels are overlaid (in green) on the color foreground pixels (in purple). The cross-spectral stereo-correspondence matching successfully aligns the foreground areas of the three people in the scene.

A. Experimental Analysis of Cross-Spectral Stereo-Correspondence Matching for Pedestrian Detection

Using the same experiments performed in Section III-C, we analyze the cross-spectral stereo-correspondence matching of pedestrian regions in an outdoor environment.
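The vote-reduction and left-right combination steps of (5)-(7) can be sketched as follows. The voting matrices are assumed to be already accumulated, with the right-image votes shifted into the left image's coordinates; the sentinel value -1 for pixels without a match on both sides is our convention for this sketch:

```python
import numpy as np

def combine_votes(D_L, D_R_aligned):
    """Sketch of (5)-(7): reduce each voting matrix to a best-disparity map plus
    a confidence (vote count), then keep the higher-confidence side per pixel.

    D_L, D_R_aligned : (h, w, n_disp) vote accumulators; the third axis indexes
    disparities d_min..d_max, so an index k corresponds to disparity d_min + k.
    """
    d_best_L = D_L.argmax(axis=2)            # (5): disparity index with most votes
    conf_L = D_L.max(axis=2)                 # (6): number of votes it received
    d_best_R = D_R_aligned.argmax(axis=2)
    conf_R = D_R_aligned.max(axis=2)

    both = (conf_L > 0) & (conf_R > 0)       # AND of the two confidence maps
    d = np.where(conf_L >= conf_R, d_best_L, d_best_R)  # (7)
    d[~both] = -1                            # no agreed match at this pixel
    return d, np.maximum(conf_L, conf_R)
```

The per-pixel vote count acts as a self-diagnosed reliability measure: a tall, single-peaked vote distribution means many overlapping windows agreed on the same disparity.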
The goal was to demonstrate successful matching for configurations of people at different positions, distances from the camera, and levels of occlusion. We evaluate the registration by visually inspecting the alignment of the corresponding color and infrared pedestrian regions: visually well-aligned regions are considered correct, and misaligned, missing, or partially aligned regions are deemed incorrect. Table III summarizes our experimental analysis, and Fig. 19 shows examples of correct correspondence matching. Additional experiments [38] demonstrate robustness to different capture devices and environmental conditions.

Fig. 19. Cross-spectral stereo-registration results for pedestrian detection. (a) Color. (b) Infrared. (c) Unaligned. (d) Aligned.
Fig. 20. Disparity discontinuity errors in cross-spectral stereo analysis due to artifacting arising from windowed correspondence matching. (a) Color. (b) Infrared. (c) Disparity. (d) Aligned.

One challenge associated with this approach to cross-spectral stereo lies in the vertical artifacts from the multiple voting windows, which give the resulting registration hard vertical edges at disparity discontinuities. This is most evident when the inherent disparity discontinuity of occluding pedestrians is forced to a vertical edge, as shown in Fig. 20. Despite these artifacts, we still identify the two distinct obstacle regions. Additionally, incorporating subpixel interpolation would improve the registration, as the integer-based disparity matching of our approach can easily be off by a pixel in either direction of the correct match.

The initial segmentation, while necessary for the success of this algorithm, is limiting in several respects. First, segmentation is challenging, and the result can often be noisy, easily over- or underestimating the true object boundaries. We motivated the initial segmentation as a way of providing appropriately sized regions for matching the features in the color and infrared imageries. However, the very idea of an initial segmentation precludes registration estimates for regions that are not within the segmentation boundaries. Clearly, a better approach would be to register features from the entire image. Achieving this is an open research challenge that we are actively pursuing. We feel that a multifeature-matching approach that can integrate structural feature matching, such as edges, with pixel- or area-based matching is promising.

VII. DISCUSSION AND CONCLUDING REMARKS

The depth estimates obtained from the vehicle-mounted stereo imagery give rise to a v-disparity-based approach for extracting the obstacle regions from the scene.
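The v-disparity representation [28] accumulates one histogram of disparity values per image row, so that the ground plane appears as a slanted line and upright obstacles as near-vertical lines. A minimal sketch, where the integer disparity map layout and the use of a negative sentinel for invalid pixels are our assumptions:

```python
import numpy as np

def v_disparity(disp, n_disp):
    """Accumulate a v-disparity image: a per-row histogram of disparity values.

    disp   : (h, w) integer disparity map (negative entries treated as invalid)
    n_disp : number of disparity levels (columns of the v-disparity image)
    """
    h, w = disp.shape
    vdisp = np.zeros((h, n_disp), dtype=np.int32)
    for v in range(h):
        row = disp[v]
        valid = row >= 0
        # Count how many pixels in this row take each disparity value.
        vdisp[v] = np.bincount(row[valid], minlength=n_disp)[:n_disp]
    return vdisp
```

Obstacle extraction then reduces to detecting strong vertical line segments in this much smaller h x n_disp image rather than reasoning over the full disparity map.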
We have outlined such an algorithm and have provided comparative experiments indicating that color- and infrared-based stereo disparities are both capable of highly accurate pedestrian detection (> 98%) with low false positives (< 1%). Given these high detection rates, the selection of an appropriate camera system for pedestrian detection turns on each modality's ability to classify the detected obstacles as pedestrians. Because of the disparate physical processes that yield color and thermal images, extractable features are largely unique to each modality. As previous approaches have demonstrated that both color- and infrared-image features can be used for classifying pedestrians, we propose a multimodal trifocal framework that integrates color, depth, and infrared features for pedestrian detection.

The multimodal trifocal solution pairs a color-stereo rig with a single infrared camera to accurately register pixels in each image. We use this framework to demonstrate that integrating color, disparity, and infrared features for training a pedestrian detector yields improved accuracy over detectors that use only unimodal or stereo features. From a cost-benefit perspective, we suggest that the multimodal trifocal framework is likely the best approach, as it achieves the benefits of multimodality seen in solutions with more cameras, yet maintains a robustness not yet achieved by two-camera cross-spectral solutions.

Future areas for investigation include a more extensive evaluation of the color, disparity, and infrared features. Additionally, an integrated object-candidate generation and pedestrian-detection algorithm using the multimodal trifocal framework would be useful for evaluating robustness to various lighting and environmental conditions.

In cross-spectral stereo analysis, the disparate nature of the multimodal imagery that we hope to exploit in feature extraction makes correspondence matching challenging.
We have established an object-level registration scheme for correspondences and have experimentally demonstrated successful registration of object regions across the color and infrared imageries. The 87% registration rate shows the feasibility of creating a multimodal-feature set in a cross-spectral stereo framework. Although the initial-segmentation requirement places limits on the generality and robustness of the approach, we feel that this is a good first step toward the development of a cross-spectral stereo-correspondence algorithm that generates disparity images similar to those of conventional stereo algorithms for unimodal imagery. We believe that advancement may be obtained by exploring multiple-feature or hierarchical matching schemes that can integrate structural feature matching, such as edges, with pixel- or area-based matching.

These multimodal and multiperspective approaches provide insight into the overall active-safety paradigm. Pedestrian safety is one of the many aspects of the driving environment that needs to be monitored to ensure safety in the vehicle and the surrounding areas [31]. The multimodal-feature set that is extractable from a multimodal trifocal or cross-spectral stereo solution could provide a robust and unified framework for analyzing the vehicular environment [39], as well as higher level driver-intent analysis, such as lane changing [40], turning [20], or braking [41].

REFERENCES
[1] [Online]. Available: safety.htm
[2] "Traffic safety facts 2004: A compilation of motor vehicle crash data from the fatality analysis reporting system and the general estimates system," Nat. Highway Traffic Safety Assoc., U.S. Dept. Transp. [Online]. Available: TSFAnn/TSF2004.pdf
[3] J. R. Crandall, K. S. Bhalla, and N. J. Madeley, "Designing road vehicles for pedestrian protection," Brit. Med. J., vol. 324, no. 7346, pp. , May
[4] D. Mohan, "Traffic safety and health in Indian cities," J. Transp. Infrastruct., vol. 9, no. 1, pp. ,
[5] S. K. Singh, "Review of urban transportation in India," J. Public Transp., vol. 8, no. 1, pp. ,
[6] Y. Fang, K. Yamada, Y. Ninomiya, B. Horn, and I. Masaki, "Comparison between infrared-image-based and visible-image-based approaches for pedestrian detection," in Proc. IEEE Intell. Veh. Symp., 2003, pp.
[7] D. Scharstein and R. Szeliski, "Middlebury College stereo vision research page," [Online]. Available:
[8] S. J. Krotosky and M. M. Trivedi, "Mutual information based registration of multimodal stereo videos for person tracking," Comput. Vis. Image Underst., vol. 106, no. 2/3, pp. , May/Jun.
[9] T. Gandhi and M. M. Trivedi, "Pedestrian protection systems: Issues, survey, and challenges," IEEE Trans. Intell. Transp. Syst., vol. 8, no. 3, pp. , Sep.
[10] L. Andreone, F. Bellotti, A. De Gloria, and R. Laulette, "SVM-based pedestrian recognition on near-infrared images," in Proc. 4th Int. Symp. Image Signal Process. Anal., 2005, pp.
[11] H. Cheng, N. Zheng, and J. Qin, "Pedestrian detection using sparse Gabor filter and support vector machine," in Proc. IEEE Conf. Intell. Veh., 2005, pp.
[12] A. Shashua, Y. Gdalyahu, and G. Hayun, "Pedestrian detection for driving assistance systems: Single-frame classification and system level performance," in Proc. IEEE Conf. Intell. Veh., 2004, pp.
[13] Y. Wu, T. Yu, and G. Hua, "A statistical field model for pedestrian detection," in Proc. Comput. Vis. Pattern Recog., 2005, pp.
[14] B. Leibe, E. Seemann, and B. Schiele, "Pedestrian detection in crowded scenes," in Proc. Comput. Vis. Pattern Recog., 2005, pp.
[15] S. Munder and D. Gavrila, "An experimental study on pedestrian classification," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 11, pp. , Nov.
[16] F. Xu, X. Liu, and K. Fujimura, "Pedestrian detection and tracking with night vision," IEEE Trans. Intell. Transp. Syst., vol. 6, no. 1, pp. , Mar.
[17] A. Broggi, A. Fascioli, P. Grisleri, T. Graf, and M. Meinecke, "Model-based validation approaches and matching techniques for automotive vision based pedestrian detection," in Proc. Comput. Vis. Pattern Recog., 2005, p. 1.
[18] Y. Fang, K. Yamada, Y. Ninomiya, B. K. P. Horn, and I. Masaki, "A shape-independent method for pedestrian detection with far-infrared images," IEEE Trans. Veh. Technol., vol. 53, no. 6, pp. , Nov.
[19] F. Suard, A. Rakotomamonjy, A. Bensrhair, and A. Broggi, "Pedestrian detection using infrared images and histograms of oriented gradients," in Proc. IEEE Conf. Intell. Veh., 2006, pp.
[20] S. Cheng and M. M. Trivedi, "Turn-intent analysis using body pose for intelligent driver assistance," Pervasive Comput., vol. 5, no. 4, pp. , Oct.-Dec.
[21] M. Bertozzi, A. Broggi, C. Caraffi, M. D. Rose, M. Felisa, and G. Vezzoni, "Pedestrian detection by means of far-infrared stereo vision," Comput. Vis. Image Underst., vol. 106, no. 2/3, pp. , May/Jun.
[22] M. Szarvas, A. Yoshizawa, M. Yamamoto, and J. Ogata, "Pedestrian detection with convolutional neural networks," in Proc. IEEE Intell. Veh. Symp., 2005, pp.
[23] L. Zhao and C. Thorpe, "Stereo- and neural network-based pedestrian detection," IEEE Trans. Intell. Transp. Syst., vol. 1, no. 3, pp. , Sep.
[24] G. Grubb, A. Zelinsky, L. Nilsson, and M. Rilbe, "3D vision sensing for improved pedestrian safety," in Proc. IEEE Conf. Intell. Veh., 2004, pp.
[25] P. Alfonso, D. F. Llorca, M. A. Sotelo, L. M. Bergasa, P. Revenga de Toro, J. Nuevo, M. Ocana, and M. A. G. Garrido, "Combination of feature extraction methods for SVM pedestrian detection," IEEE Trans. Intell. Transp. Syst., vol. 8, no. 2, pp. , Jun.
[26] X. Lie and K. Fujimura, "Pedestrian detection using stereo night vision," IEEE Trans. Veh. Technol., vol. 53, no. 6, pp. , Nov.
[27] M. Bertozzi, A. Broggi, M. Felias, G. Vezzoni, and M. Del Rose, "Low-level pedestrian detection by means of visible and far infra-red tetravision," in Proc. IEEE Conf. Intell. Veh., 2006, pp.
[28] R. Labayrade, D. Aubert, and J.-P. Tarel, "Real time obstacle detection in stereovision on non flat road geometry through v-disparity representation," in Proc. IEEE Conf. Intell. Veh., 2002, pp.
[29] K. Konolige, "Small vision systems: Hardware and implementation," in Proc. 8th Int. Symp. Robot. Res., 1997, pp.
[30] M. M. Trivedi, S. Y. Cheng, E. M. C. Childers, and S. J. Krotosky, "Occupant posture analysis with stereo and thermal infrared video: Algorithms and experimental evaluation," IEEE Trans. Veh. Technol., vol. 53, no. 6, pp. , Nov.
[31] M. M. Trivedi, T. Gandhi, and J. McCall, "Looking-in and looking-out of a vehicle: Computer-vision-based enhanced vehicle safety," IEEE Trans. Intell. Transp. Syst., vol. 8, no. 1, pp. , Mar.
[32] S. Park and M. M. Trivedi, "Multi-person interaction and activity analysis: A synergistic track- and body-level analysis framework," Mach. Vis. Appl., Special Issue on Novel Concepts and Challenges for the Generation of Visual Surveillance Systems, vol. 18, no. 3/4, pp. , Aug.
[33] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge, U.K.: Cambridge Univ. Press,
[34] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proc. Comput. Vis. Pattern Recog., 2005, pp.
[35] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," [Online]. Available: cjlin/libsvm
[36] A. S. Ogale and Y. Aloimonos, "A roadmap to the integration of early visual modules," Int. J. Comput. Vis., Special Issue on Early Cognitive Vision, vol. 72, no. 1, pp. 9-25, Apr.
[37] P. Thevenaz and M. Unser, "Optimization of mutual information for multiresolution image registration," IEEE Trans. Image Process., vol. 9, no. 12, pp. , Dec.
[38] S. J. Krotosky and M. M. Trivedi, "Multimodal stereo image registration for pedestrian detection," in Proc. IEEE Conf. Intell. Transp. Syst., 2006, pp.
[39] T. Gandhi and M. M. Trivedi, "Vehicle surround capture: Survey of techniques and a novel omni-video-based approach for dynamic panoramic surround maps," IEEE Trans. Intell. Transp. Syst., vol. 7, no. 3, pp. , Sep.
[40] J. McCall, D. Wipf, M. Trivedi, and B. Rao, "Lane change intent analysis using robust operators and sparse Bayesian learning," IEEE Trans. Intell. Transp. Syst., vol. 8, no. 3, pp. , Sep.
[41] J. McCall and M. Trivedi, "Driver behavior and situation aware brake assistance for intelligent vehicles," Proc. IEEE, Special Issue on Advanced Automobile Technologies, vol. 95, no. 2, pp. , Feb.

Stephen J. Krotosky received the B.S. degree in computer engineering from the University of Delaware, Newark, in 2001, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of California, San Diego, in 2004 and 2007, respectively, specializing in signal and image processing. He is currently an Algorithm Development Engineer with the Advanced Multimedia and Signal Processing Division, Science Applications International Corporation, San Diego, CA.

Mohan Manubhai Trivedi received the Ph.D. degree in electrical engineering from Utah State University, Logan. He is a Professor with the Department of Electrical and Computer Engineering and the Founding Director of the Computer Vision and Robotics Research Laboratory, University of California, San Diego. His research interests include computer vision, intelligent vehicles and transportation systems, and human-machine interfaces. Dr. Trivedi is a member of the IEEE Computer Society, from which he received both the Pioneer Award and the Meritorious Service Award, and a Fellow of the International Society for Optical Engineering.
A Summary of Projective Geometry Copyright 22 Acuity Technologies Inc. In the last years a unified approach to creating D models from multiple images has been developed by Beardsley[],Hartley[4,5,9],Torr[,6]
More informationHigh-Level Fusion of Depth and Intensity for Pedestrian Classification
High-Level Fusion of Depth and Intensity for Pedestrian Classification Marcus Rohrbach 1,3, Markus Enzweiler 2 and Dariu M. Gavrila 1,4 1 Environment Perception, Group Research, Daimler AG, Ulm, Germany
More informationVision-based Lane Analysis: Exploration of Issues and Approaches for Embedded Realization
2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops Vision-based Lane Analysis: Exploration of Issues and Approaches for Embedded Realization R. K. Satzoda and Mohan M. Trivedi Computer
More informationAn Evaluation of the Pedestrian Classification in a Multi-Domain Multi-Modality Setup
Sensors 205, 5, 385-3873; doi:0.3390/s506385 OPEN ACCESS sensors ISSN 424-8220 www.mdpi.com/journal/sensors Article An Evaluation of the Pedestrian Classification in a Multi-Domain Multi-Modality Setup
More informationEE795: Computer Vision and Intelligent Systems
EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 14 130307 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Stereo Dense Motion Estimation Translational
More informationProf. Fanny Ficuciello Robotics for Bioengineering Visual Servoing
Visual servoing vision allows a robotic system to obtain geometrical and qualitative information on the surrounding environment high level control motion planning (look-and-move visual grasping) low level
More informationVehicle Detection Method using Haar-like Feature on Real Time System
Vehicle Detection Method using Haar-like Feature on Real Time System Sungji Han, Youngjoon Han and Hernsoo Hahn Abstract This paper presents a robust vehicle detection approach using Haar-like feature.
More informationImproving Latent Fingerprint Matching Performance by Orientation Field Estimation using Localized Dictionaries
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,
More informationDetecting motion by means of 2D and 3D information
Detecting motion by means of 2D and 3D information Federico Tombari Stefano Mattoccia Luigi Di Stefano Fabio Tonelli Department of Electronics Computer Science and Systems (DEIS) Viale Risorgimento 2,
More informationPedestrian detection from traffic scenes based on probabilistic models of the contour fragments
Pedestrian detection from traffic scenes based on probabilistic models of the contour fragments Florin Florian, Ion Giosan, Sergiu Nedevschi Computer Science Department Technical University of Cluj-Napoca,
More informationA Quantitative Approach for Textural Image Segmentation with Median Filter
International Journal of Advancements in Research & Technology, Volume 2, Issue 4, April-2013 1 179 A Quantitative Approach for Textural Image Segmentation with Median Filter Dr. D. Pugazhenthi 1, Priya
More informationBiometric Security System Using Palm print
ISSN (Online) : 2319-8753 ISSN (Print) : 2347-6710 International Journal of Innovative Research in Science, Engineering and Technology Volume 3, Special Issue 3, March 2014 2014 International Conference
More informationMultiview Pedestrian Detection Based on Online Support Vector Machine Using Convex Hull
Multiview Pedestrian Detection Based on Online Support Vector Machine Using Convex Hull Revathi M K 1, Ramya K P 2, Sona G 3 1, 2, 3 Information Technology, Anna University, Dr.Sivanthi Aditanar College
More informationPeople detection in complex scene using a cascade of Boosted classifiers based on Haar-like-features
People detection in complex scene using a cascade of Boosted classifiers based on Haar-like-features M. Siala 1, N. Khlifa 1, F. Bremond 2, K. Hamrouni 1 1. Research Unit in Signal Processing, Image Processing
More informationReal-Time Detection of Road Markings for Driving Assistance Applications
Real-Time Detection of Road Markings for Driving Assistance Applications Ioana Maria Chira, Ancuta Chibulcutean Students, Faculty of Automation and Computer Science Technical University of Cluj-Napoca
More informationHuman detection using local shape and nonredundant
University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2010 Human detection using local shape and nonredundant binary patterns
More informationTowards Practical Evaluation of Pedestrian Detectors
MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Towards Practical Evaluation of Pedestrian Detectors Mohamed Hussein, Fatih Porikli, Larry Davis TR2008-088 April 2009 Abstract Despite recent
More informationDeep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks
Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin
More informationMultiview Image Compression using Algebraic Constraints
Multiview Image Compression using Algebraic Constraints Chaitanya Kamisetty and C. V. Jawahar Centre for Visual Information Technology, International Institute of Information Technology, Hyderabad, INDIA-500019
More informationHuman Detection with a Multi-sensors Stereovision System
Human Detection with a Multi-sensors Stereovision System Y. Benezeth 1,P.M.Jodoin 2, B. Emile 3,H.Laurent 4,andC.Rosenberger 5 1 Orange Labs, 4 rue du Clos Courtel, 35510 Cesson-Sévigné -France 2 MOIVRE,
More informationAn Object Detection System using Image Reconstruction with PCA
An Object Detection System using Image Reconstruction with PCA Luis Malagón-Borja and Olac Fuentes Instituto Nacional de Astrofísica Óptica y Electrónica, Puebla, 72840 Mexico jmb@ccc.inaoep.mx, fuentes@inaoep.mx
More informationPerson Detection in Images using HoG + Gentleboost. Rahul Rajan June 1st July 15th CMU Q Robotics Lab
Person Detection in Images using HoG + Gentleboost Rahul Rajan June 1st July 15th CMU Q Robotics Lab 1 Introduction One of the goals of computer vision Object class detection car, animal, humans Human
More informationSOME stereo image-matching methods require a user-selected
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 3, NO. 2, APRIL 2006 207 Seed Point Selection Method for Triangle Constrained Image Matching Propagation Qing Zhu, Bo Wu, and Zhi-Xiang Xu Abstract In order
More informationExploitation of GPS-Control Points in low-contrast IR-imagery for homography estimation
Exploitation of GPS-Control Points in low-contrast IR-imagery for homography estimation Patrick Dunau 1 Fraunhofer-Institute, of Optronics, Image Exploitation and System Technologies (IOSB), Gutleuthausstr.
More informationVideo Alignment. Literature Survey. Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin
Literature Survey Spring 2005 Prof. Brian Evans Multidimensional Digital Signal Processing Project The University of Texas at Austin Omer Shakil Abstract This literature survey compares various methods
More informationThe Pennsylvania State University. The Graduate School. College of Engineering ONLINE LIVESTREAM CAMERA CALIBRATION FROM CROWD SCENE VIDEOS
The Pennsylvania State University The Graduate School College of Engineering ONLINE LIVESTREAM CAMERA CALIBRATION FROM CROWD SCENE VIDEOS A Thesis in Computer Science and Engineering by Anindita Bandyopadhyay
More informationFast and Stable Human Detection Using Multiple Classifiers Based on Subtraction Stereo with HOG Features
2011 IEEE International Conference on Robotics and Automation Shanghai International Conference Center May 9-13, 2011, Shanghai, China Fast and Stable Human Detection Using Multiple Classifiers Based on
More informationCOSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor
COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality
More informationTraffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers
Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane
More informationHistogram of Oriented Gradients for Human Detection
Histogram of Oriented Gradients for Human Detection Article by Navneet Dalal and Bill Triggs All images in presentation is taken from article Presentation by Inge Edward Halsaunet Introduction What: Detect
More informationVehicle Dimensions Estimation Scheme Using AAM on Stereoscopic Video
Workshop on Vehicle Retrieval in Surveillance (VRS) in conjunction with 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance Vehicle Dimensions Estimation Scheme Using
More informationIntegrated Vehicle and Lane Detection with Distance Estimation
Integrated Vehicle and Lane Detection with Distance Estimation Yu-Chun Chen, Te-Feng Su, Shang-Hong Lai Department of Computer Science, National Tsing Hua University,Taiwan 30013, R.O.C Abstract. In this
More informationNIH Public Access Author Manuscript Proc Int Conf Image Proc. Author manuscript; available in PMC 2013 May 03.
NIH Public Access Author Manuscript Published in final edited form as: Proc Int Conf Image Proc. 2008 ; : 241 244. doi:10.1109/icip.2008.4711736. TRACKING THROUGH CHANGES IN SCALE Shawn Lankton 1, James
More informationAutomatic Fatigue Detection System
Automatic Fatigue Detection System T. Tinoco De Rubira, Stanford University December 11, 2009 1 Introduction Fatigue is the cause of a large number of car accidents in the United States. Studies done by
More informationLOW-DENSITY PARITY-CHECK (LDPC) codes [1] can
208 IEEE TRANSACTIONS ON MAGNETICS, VOL 42, NO 2, FEBRUARY 2006 Structured LDPC Codes for High-Density Recording: Large Girth and Low Error Floor J Lu and J M F Moura Department of Electrical and Computer
More informationOn Road Vehicle Detection using Shadows
On Road Vehicle Detection using Shadows Gilad Buchman Grasp Lab, Department of Computer and Information Science School of Engineering University of Pennsylvania, Philadelphia, PA buchmag@seas.upenn.edu
More informationMeasurement of Pedestrian Groups Using Subtraction Stereo
Measurement of Pedestrian Groups Using Subtraction Stereo Kenji Terabayashi, Yuki Hashimoto, and Kazunori Umeda Chuo University / CREST, JST, 1-13-27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan terabayashi@mech.chuo-u.ac.jp
More informationTHE development of in-vehicle assistance systems dedicated
1666 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 53, NO. 6, NOVEMBER 2004 Pedestrian Detection for Driver Assistance Using Multiresolution Infrared Vision Massimo Bertozzi, Associate Member, IEEE,
More informationA Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images
A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images Karthik Ram K.V & Mahantesh K Department of Electronics and Communication Engineering, SJB Institute of Technology, Bangalore,
More informationFundamental Matrices from Moving Objects Using Line Motion Barcodes
Fundamental Matrices from Moving Objects Using Line Motion Barcodes Yoni Kasten (B), Gil Ben-Artzi, Shmuel Peleg, and Michael Werman School of Computer Science and Engineering, The Hebrew University of
More informationA Fast Moving Object Detection Technique In Video Surveillance System
A Fast Moving Object Detection Technique In Video Surveillance System Paresh M. Tank, Darshak G. Thakore, Computer Engineering Department, BVM Engineering College, VV Nagar-388120, India. Abstract Nowadays
More informationColor Local Texture Features Based Face Recognition
Color Local Texture Features Based Face Recognition Priyanka V. Bankar Department of Electronics and Communication Engineering SKN Sinhgad College of Engineering, Korti, Pandharpur, Maharashtra, India
More informationMotion Tracking and Event Understanding in Video Sequences
Motion Tracking and Event Understanding in Video Sequences Isaac Cohen Elaine Kang, Jinman Kang Institute for Robotics and Intelligent Systems University of Southern California Los Angeles, CA Objectives!
More informationINTELLIGENT transportation systems have a significant
INTL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 205, VOL. 6, NO. 4, PP. 35 356 Manuscript received October 4, 205; revised November, 205. DOI: 0.55/eletel-205-0046 Efficient Two-Step Approach for Automatic
More informationEye Detection by Haar wavelets and cascaded Support Vector Machine
Eye Detection by Haar wavelets and cascaded Support Vector Machine Vishal Agrawal B.Tech 4th Year Guide: Simant Dubey / Amitabha Mukherjee Dept of Computer Science and Engineering IIT Kanpur - 208 016
More information