Pedestrian Detection Algorithm for On-board Cameras of Multi View Angles


S. Kamijo, IEEE, K. Fujimura, and Y. Shibayama

Abstract — In this paper, we present a general algorithm for pedestrian detection by an on-board monocular camera which can be applied to cameras of various view ranges in a unified manner. The Spatio-Temporal MRF model extracts and tracks foreground objects such as pedestrians and non-pedestrians, distinguishing them from background scenes such as buildings by referring to motion differences. During the tracking sequences, cascaded HOG classifiers classify the foreground objects into the two classes of pedestrians and non-pedestrians. Before the classification, geometrical constraints on the relationship between the heights and positions of the objects are examined in order to exclude non-pedestrian objects. This pre-processing contributes to reducing the processing time of the classification while maintaining the classification accuracy. Because the tracker lets the classifier make decisions over Regions of Interest (ROIs) carrying the same ID across consecutive images, the algorithm operates quite robustly against noise and classification errors in individual image frames.

I. INTRODUCTION

Recently, from the viewpoint of pedestrian safety, driving assistance systems that avert pedestrian traffic accidents have been actively studied. These systems have first come into practical use as infrastructure systems [1][2]. However, on roads where infrastructure maintenance is difficult, an on-board system is valuable. Methods to detect pedestrians in on-board driving assistance systems include the laser sensor, the millimeter-wave radar, and the camera. The laser sensor can detect the presence of an object by scanning in the horizontal direction. The millimeter-wave radar also has problems in spatial resolution. Therefore, it is difficult to distinguish a pedestrian among the objects detected by these sensors. On the other hand, on-board camera systems can detect pedestrians visually.
These systems utilize monocular, stereo, or infrared cameras. With a stereo camera, it is difficult to catch the lateral movement of a pedestrian, because distortion is introduced into the image according to the view angle of the camera. An infrared camera is difficult to use in the daytime. Pedestrian detection by an on-board camera is also one of the most difficult problems, because the camera moves arbitrarily and the background image is highly complicated. Existing pedestrian detection techniques can be classified into two groups: texture based and motion based. Texture based approaches utilize appearance features extracted using Haar wavelets [3], edge templates [4], histograms of oriented gradients [5], etc. Papageorgiou and Poggio [3] and Oren et al. [6] utilized extracted Haar-wavelet features and an SVM classifier to validate candidate regions from static image frames. Gavrila and Munder [7] and Gavrila et al. [4] detected Regions of Interest (ROIs) based on pedestrian edge template matching followed by a verification stage based on a neural network architecture. Munder et al. [8] described a multi-cue (i.e., shape, texture, depth) object model within a Bayesian framework for detection and tracking based on particle filtering. Shashua et al. [9] presented a system which breaks ROIs into sub-regions and feeds the processed sub-region vectors to an AdaBoost classifier for verification. A similar approach was presented by Dalal and Triggs [5], which uses the fact that the shape of an object can be represented by a distribution of local intensity gradients or edge directions. For classification, an SVM is trained using gradient histogram features from the pedestrian and non-pedestrian classes.

(Manuscript received January 18, 2010. This work was funded in part by STARC. S. Kamijo, K. Fujimura, and Y. Shibayama are with the Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan; phone: +81-3-545-673; e-mail: kamijo@iis.u-tokyo.ac.jp.)
On the other hand, motion based techniques rely on short-term motion estimated from optical flow. Cutler and Davis [10] focused on the periodic pattern of human walking motion as the main cue for pedestrian detection. Sidenbladh's [11] technique was based on collecting examples of human and non-human motion patterns and learning an SVM with an RBF kernel to create a human classifier. Viola et al. [12] presented a detection algorithm which combines motion and appearance information to build a robust model of walking humans. Curio et al. [13] detected walking pedestrians at road intersections. Their initial detection process is based on a fusion of texture analysis, model-based matching of pedestrian contour features, and inverse-perspective mapping (binocular vision). Additionally, motion patterns of limb movements are analyzed to distinguish pedestrians from other objects. Elzein et al. [14] detected ROIs by computing optical flow only in regions selected by frame differencing, and the selected ROIs are searched to find pedestrians using manually selected Haar-wavelet features. Although the above-mentioned research looks promising to some extent, more research will be needed before such driving assistance systems can serve as life-saving tools in moving vehicles. In this paper, we present an inter-layer collaborative pedestrian tracking algorithm where initial foreground segmentation is done by a motion based object detection algorithm. A cascade structure of rejection-style classifiers introduced by Viola et al. [12] is utilized to separate pedestrian and non-pedestrian objects. Finally, Spatio-Temporal Markov

Random Field model (S-T MRF) [15] based tracking is performed to track pedestrians.

II. SYSTEM OVERVIEW

A. Target of Our System

Our system targets pedestrian protection on open roads, including intersections. In this case, the required detection environment differs from detection on straight roads. If the vehicle travels straight at high speed, the system should detect pedestrians as far away as possible with a narrow view angle. In contrast, if the vehicle turns at an intersection, the system needs to detect nearby pedestrians surrounding the vehicle with a wide view angle. The detection view angle and distance form a trade-off in the specifications of an on-board camera. Our proposed system aims at constructing a framework within which this trade-off can be selected. In the following experiments, our system is verified with on-board cameras of various view angles.

B. Strategy Overview

Existing texture based pedestrian detection techniques have a problem in the extraction of ROIs, and false detections are easily caused if the entire image is searched. Therefore, we employed a motion-based method for object detection which focuses on the difference in motion between foreground objects and the on-board background image. Image regions of detected foreground compose ROIs, which indicate candidates for pedestrians. Following the ROI detection process, texture patterns in the ROIs are analyzed to distinguish pedestrians from other objects such as poles, trees, buses, and other roadside facilities. Walking pedestrians can be classified thanks to the difference between their motion and the motion of background infrastructure such as buildings. At the same time, foreground facilities such as poles, trees, and signboards would be detected along with pedestrians. Moreover, a standing person who is close to a wall cannot be detected by the motion detection. However, this would not be a problem from the viewpoint of driving assistance systems, because such a case is considered to be outside the scope of support.
Object classification by the texture-based techniques is done following the motion based object detection. As a result, it is expected that pedestrians can be detected with high accuracy compared with conventional techniques. The method of motion-based object detection and the method of object classification are described in the next sections.

C. Motion Based Object Detection and Tracking

In general, the motion of the on-board background image is formulated by considering the motion of the on-board camera itself. In this paper, we assumed that the motion of the on-board background image can be approximated by a linear formula along the horizontal axis. This is because the on-board camera moves horizontally, and the motion of the background image varies depending on the distance between the camera and the background infrastructure. The details of this method are explained in [16].

(Figure 1: Results of motion based object detection. (a) Objects detected by motion difference. (b) Object tracking.)

Segmentation of the object region in the spatio-temporal image is equivalent to tracking the object through occlusion (see Fig. 7). This is the principal idea of the S-T MRF model. We defined our S-T MRF model [15] so as to divide an image into blocks as groups of pixels, and to optimize the labeling of such blocks by referring to texture and labeling correlations among them, in combination with their motion vectors. Combined with a stochastic relaxation method, the S-T MRF optimizes object boundaries precisely, even when serious occlusions occur. A detailed explanation of the S-T MRF tracker can be found in [15].

III. GEOMETRICAL CONSTRAINT ON ROI

A. Geometry Estimation of Objects

Since the motion based algorithm detects ROIs of both pedestrians and non-pedestrians, the ROIs should be classified into the two classes of pedestrians and non-pedestrians.
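As a sketch of the linear background-motion approximation of Section II-C: the horizontal background motion is modeled as u(x) = a·x + b over image columns, and blocks whose measured motion deviates from the fitted model become foreground candidates. The least-squares fit, block positions, and tolerance below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_background_motion(cols, motions):
    """Least-squares fit of the linear model u(x) = a*x + b to
    per-block horizontal motion measurements (illustrative)."""
    A = np.stack([cols, np.ones_like(cols)], axis=1).astype(float)
    (a, b), *_ = np.linalg.lstsq(A, motions.astype(float), rcond=None)
    return a, b

def foreground_mask(cols, motions, a, b, tol=1.5):
    """Blocks whose motion disagrees with the fitted background model
    by more than `tol` pixels/frame become foreground candidates."""
    return np.abs(motions - (a * cols + b)) > tol

# Toy data: background motion grows linearly with column; the block at
# column 32 (e.g. a crossing pedestrian) moves against the flow.
cols = np.array([0, 8, 16, 24, 32, 40])
motions = np.array([0.0, 0.5, 1.0, 1.5, -3.0, 2.5])
a, b = fit_background_motion(np.delete(cols, 4), np.delete(motions, 4))
mask = foreground_mask(cols, motions, a, b)  # only index 4 is flagged
```

In the actual system the per-block motions come from the S-T MRF's motion vectors rather than from a toy array, and the surviving blocks are grouped into ROIs.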
In order to reduce the processing time of pattern classification, we applied geometrical constraints to the ROIs beforehand to exclude ROIs which have poor likelihood as pedestrians according to the relationship between their positions and heights in the image. Camera calibration was performed as follows to determine the relationship between positions and heights, in pixel units, in the image. Figure 2 shows the condition of our camera setup. In this paper,

(Figure 2: Top view and profile plane of the camera geometry. (a) Top view of the camera geometry. (b) Profile view of the camera.)
(Figure 3(a): Relationship between p [pixels/meter] and D [meters].)

on-board cameras are set up so that the optical axes of the cameras are horizontal. Figure 2(a) shows the top view of the camera setup, and Figure 2(b) shows the profile of the camera setup, obtained along the plane indicated in Figure 2(a). Although the length f_φ appears to vary as φ in Figure 2(a) varies, this difference in f_φ should be canceled by the lens design so that f_φ can be regarded as a constant focal length f. Thus, the relationship between the distance from the camera to an object and the height of the object in image space can be represented as Eq.(1). In Eq.(1), the distance between the object and the camera is represented by D [meters], and the height in the image corresponding to 1 meter in real-world coordinates is represented by p [pixels/meter]. The relationship between D and p is depicted in Figure 3(a), where the four curves correspond to camera view angles of 30, 60, 75, and 100 degrees. In Figure 2, [X,Y] represents the coordinates of the object in the real world on the road plane, and [x,y] represents the coordinates of the object in image space. [x,y] is measured at the bottom of the ROI, and the position [x,y] is transformed into the position [X,Y] in the real world by Eq.(2). Thus, the distance D in the real world can be derived from the position [x,y] in image space as Eq.(3), and the relationship between [x,y] and D is visualized in Figure 3(b). The difference in curvature among the four graphs in Figure 3(b) comes from the difference in the four values of f due to the view angles. Finally, the combination of Eq.(1) and Eq.(3) provides the relationship between [x,y] and p as represented in Eq.(4), and the height H of the object in the real world is obtained by Eq.(5), where h represents the height of the ROI in image space.
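Since Eqs. (1)–(5) do not survive this transcription, the following is a hedged pinhole-model sketch of the quantities they relate: the focal length f, the per-meter image scale p, the distance D recovered from the ROI's bottom row, and the real-world height H. The flat-road assumption, the camera height, and the image width are illustrative parameters, not the paper's calibration.

```python
import math

def focal_px(view_angle_deg, image_width_px):
    """Focal length in pixels implied by a horizontal view angle
    (simple pinhole model)."""
    return (image_width_px / 2.0) / math.tan(math.radians(view_angle_deg) / 2.0)

def distance_m(y_bottom, y_horizon, f_px, cam_height_m):
    """Eq.(3)-style estimate: with a horizontal optical axis over a flat
    road, the ROI bottom row gives D = f * H_cam / (y_bottom - y_horizon)."""
    return f_px * cam_height_m / float(y_bottom - y_horizon)

def pixels_per_meter(f_px, d_m):
    """Eq.(1)-style scale p [pixels/meter] at distance D."""
    return f_px / d_m

def height_m(h_roi_px, f_px, d_m):
    """Eq.(5)-style real-world height H from ROI pixel height h."""
    return h_roi_px / pixels_per_meter(f_px, d_m)
```

For example, with a 90 degree view angle and a 640-pixel-wide image, f = 320 pixels; an ROI whose bottom row lies 32 pixels below the horizon, with an assumed 1.2 m camera height, is placed at D = 12 m.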
Consequently, when the position [x,y] and the height h of an ROI are obtained, the height H in the real world can be estimated by the above geometrical consideration.

(Figure 3: Geometrical estimation of objects. (b) Relationship between D [meters] from the camera and position (x, y) [pixels], for view angles of 30, 60, 75, and 100 degrees.)

[Equations (1)–(5), relating p, D, the image position [x,y], and the real-world height H, are garbled in this transcription and omitted.]

B. Calibration on Estimated Height of Pedestrian

As derived in the above geometrical consideration, when the position [x,y] and the height h of an ROI are obtained, the height H in the real world can be estimated. Then, ROIs having an estimated height H of more or less

than the thresholds should be excluded from the pedestrian candidates. In order to determine the upper and lower thresholds that decide which ROIs are excluded, the deviation of H among pedestrian ROIs should be examined. Roll of the vehicle during right and left turns displaces the optical axis of the on-board camera from the horizontal position. This displacement of the optical axis causes deviations in the positions [x,y] of ROIs in image space. As a result, the estimated distance D and the value p deviate from the true values which would have been obtained if the optical axis were horizontally positioned. h also deviates owing to the variation in the heights of pedestrians. Consequently, the estimated pedestrian height H deviates owing to the deviations of h and p. Generally, it is difficult to measure the displacement of the optical axis caused by vehicle roll and to correct [x,y], D, and p in real time while driving. Therefore, we decided to calibrate H directly from experiments in order to determine the upper and lower thresholds for pedestrians. This experimental calibration reveals the probable range of H considering the deviations of both h and p simultaneously. In this paper, we do not aim at calibrating the internal parameters of the cameras, since such deviations in camera production would be quite small compared with the deviations of h and p. In addition, since the significance of excluding certain ROIs from the pedestrian candidates is to reduce the number of candidates to be classified by the HOG/Fisher algorithm, the thresholds on H need not be theoretically exact. Figure 4 shows plots of the relationship between the estimated heights and estimated distances of pedestrians for the four different view angles of 30, 60, 75, and 100 degrees. The plots were extracted from video sequences obtained during practical vehicle driving. From the four graphs in Figure 4, the upper and lower thresholds were determined as 2.2
and 0.9 meters, respectively, to exclude ROIs from the pedestrian candidates.

(Figure 4: Experimental plots of h [pixels] vs. D [meters] from the camera, for view angles of 30, 60, 75, and 100 degrees.)

IV. CASCADE CLASSIFIER BY HOG FEATURE

A. HOG Feature Analyses of ROI

Among the variety of algorithms for pedestrian detection, algorithms employing the HOG feature are well known to be quite successful. Combined with a learning machine such as an SVM (Support Vector Machine) or a Fisher linear classifier, it shows good performance on object classification problems. The following is an overview of the HOG feature extraction process implemented in our system. The HOG (Histograms of Oriented Gradients) feature represents the spatial distribution of edge directions in the scene. Image data of pedestrians, including the various background images behind them, are used as training data for the Fisher classifier. In this paper, training images are scaled to 64 x 128 pixels, and a cell of 8 x 8 pixels is defined to extract local HOG features. The orientation of the gradient is estimated by applying an edge operator to each pixel, and the orientation is quantized into nine bins. The 64 quantized orientations within an 8 x 8 pixel cell are accumulated into a histogram over the nine bins. This histogram is translated into a nine-dimensional vector whose element values represent the nine magnitudes of the histogram. Gradient strengths vary among images and among locations within an image owing to illumination and foreground-background contrast. In order to cancel such effects, normalization is performed on the descriptor vectors. In this paper, a descriptor block of 16 x 16 pixels consisting of four 8 x 8 pixel cells is defined, and a 36-dimensional descriptor vector is obtained by concatenating the four 9-dimensional vectors. A sequence of descriptor blocks is obtained by shifting the region by 8 pixels in raster-scan order, and the sequence consists of 7 x 15 descriptor blocks.
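The dimensional bookkeeping of the layout above (64 x 128 window, 8 x 8 cells, nine orientation bins, 16 x 16 blocks shifted by 8 pixels) can be checked in a few lines; the function below is only an arithmetic sanity check, not a HOG implementation.

```python
def hog_layout(win_w=64, win_h=128, cell=8, cells_per_block=2,
               stride=8, bins=9):
    """Number of descriptor blocks and total descriptor length for the
    HOG layout described in the text."""
    block_px = cell * cells_per_block            # 16-pixel blocks
    blocks_x = (win_w - block_px) // stride + 1  # 7 across
    blocks_y = (win_h - block_px) // stride + 1  # 15 down
    per_block = cells_per_block ** 2 * bins      # 4 cells x 9 bins = 36
    return blocks_x, blocks_y, blocks_x * blocks_y * per_block

# 7 x 15 = 105 blocks of 36 dimensions -> 3780-dimensional descriptor
```

This reproduces the 7 x 15 block grid and the 3780-dimensional descriptor stated in the text.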
Each descriptor vector is normalized with the square norm to yield a normalized descriptor vector. Thus, a 3780-dimensional vector is obtained by concatenating the 105 normalized descriptor vectors for a training image. An ROI rectangle is scaled to a height of 128 pixels, whereas the width of the rectangle does not have to be 64 pixels. The scaled ROI is then scanned to detect the correct region of a pedestrian by HOG/Fisher. In this paper, three different scales of training data, with heights of 128, 120, and 112 pixels, are prepared to train three sets of Fisher classifiers, respectively. The L1-sqrt norm of the original paper [5] was employed in this paper.

B. Cascade of HOG/Fisher Classifiers

Since non-pedestrian data are distributed over various regions of the HOG feature space, it is difficult to separate pedestrian data from non-pedestrian data with a single hyperplane. On the other hand, hypersurfaces of higher order than a quadratic surface, as in kernel SVMs, suffer from the overfitting problem. Therefore, we decided to construct a cascade structure connecting classifiers, each a hyperplane trained on non-pedestrian learning data of a different category, as shown in Figure 5(a). In this paper, we employed a Fisher classifier for each step in the cascade because of its simplicity and competitive

performance with a linear SVM. The cascade is constructed by connecting four HOG/Fisher classifiers. In each step of the cascade, data determined to be non-pedestrian are excluded, and the residual data are fed into the classifier of the next step. Consequently, residual data from the final step of the cascade are determined to be pedestrians. Figure 5(b) shows the pedestrian training images for the HOG/Fisher classifier. Of these, 1399 pedestrian images were obtained from the INRIA training database [17], and 11 pedestrian images were extracted from our original video sequences. Training data for the non-pedestrian classes were extracted from our original video sequences. We performed classification experiments using 663 non-pedestrian data as a feasibility study. As a result, many of the images falsely determined to be pedestrians were dominated by the strong edges of vehicles and buildings. We therefore prepared image classes dominated by vertical edges and by horizontal edges to construct the cascade of classifiers. In this paper, when an ROI belonging to the same object is classified as a pedestrian in three or more frames out of four consecutive frames, the object is determined to be a pedestrian. Therefore, failure of pedestrian detection in fewer than two frames does not degrade the classification result. Thus, tracking by the S-T MRF model is important for improving the stability of the classification algorithm.

(Figure 5: Cascade classifier by HOG/Fisher. (a) Cascade structure. (b) Training images of pedestrians: KMJ, 11 images; INRIA [17] (http://lear.inrialpes.fr/data), 1399 images.)

V. EXPERIMENTAL RESULTS

A. Experimental Results

Cameras were set up at the left side, right side, and center of the vehicle. A CCD camera with a 30 degree view angle was employed for the center camera. CCD cameras with 60, 75, and 100 degree view angles were used for the left and right side cameras. In this paper, the center camera is supposed to be used for avoiding frontal collisions with pedestrians crossing in front of a vehicle traveling at high speed along straight roads.
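The rejection cascade and the per-object temporal voting rule can be sketched as follows. Each stage is a Fisher-style linear discriminant (w, b) that must accept the ROI for it to survive; the toy weights, the threshold at zero, and the window/vote parameters are illustrative assumptions (the voting rule follows our reading of "three or more of four consecutive frames").

```python
import numpy as np

def cascade_classify(x, stages):
    """True iff the HOG feature vector x passes every (w, b) stage of
    the rejection cascade (four stages in the paper)."""
    return all(float(np.dot(w, x) + b) > 0.0 for w, b in stages)

def temporal_vote(per_frame_flags, need=3, window=4):
    """An object is declared a pedestrian if at least `need` of any
    `window` consecutive frames were classified positive."""
    flags = list(per_frame_flags)
    return any(sum(flags[i:i + window]) >= need
               for i in range(max(1, len(flags) - window + 1)))

# Toy stages: accept only when both feature components are large enough.
stages = [(np.array([1.0, 0.0]), -0.2), (np.array([0.0, 1.0]), -0.5)]
```

Because the vote is taken over ROIs sharing the same S-T MRF track ID, a single misclassified frame does not flip the object's label.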
Therefore, we employed a camera with a narrow view angle for the center camera. On the other hand, the left and right side cameras are supposed to be used for avoiding side collisions with pedestrians crossing around a vehicle traveling at relatively slow speed through an intersection. Therefore, we employed cameras with wide view angles for the left and right side cameras. Video sequences were acquired at intersections and streets in downtown Tokyo. For the evaluation of pedestrian detection in this paper, 17 scenes including 209 pedestrians were examined. The training data in Figure 5 were extracted from scenes other than the above 17 evaluation scenes. Examinations were performed by two different procedures, represented in Eq.(6) and Eq.(7). In Eq.(6), N_ped_exist_true_frames represents the number of frames in which pedestrians exist. The count is accumulated per object and per frame, and each object-frame is added to N_ped_exist_true_frames. For example, suppose there exist two pedestrians, of which one exists for 10 frames and the other for 8 frames; then N_ped_exist_true_frames is estimated as 18.

(Figure 5(c): Training images of non-pedestrians: all categories, 663 images; horizontal edges, 3817 images; vertical edges, 41 images; complicated textures, 189 images.)

N_ped_detect_correct_frames represents the number of frames in which pedestrians are detected correctly. Therefore, DetectRate_frame represents the rate at which pedestrians were successfully detected among the existing pedestrians as the

ground truth. N_ROI_detect_asped_frames represents the number of frames in which the algorithm determined the ROIs to be pedestrians. The count is accumulated per object and per frame, and each object-frame is added to N_ROI_detect_asped_frames. N_ROI_detect_false_frames represents the number of frames in which the algorithm determined the ROIs to be pedestrians while they were actually non-pedestrians. Therefore, FalseRate_frame represents the rate at which the detection results include false detections. By analogy with Eq.(6), Eq.(7) represents the detection rate and the false alarm rate estimated per object. In this estimation, the number of frames is not considered. When a pedestrian is detected in one or more frames during a sequence of image frames, the result is regarded as a successful detection in estimating DetectRate_object. When a non-pedestrian object is detected as a pedestrian in one or more frames during a sequence of image frames, the result is regarded as a false detection in estimating FalseRate_object. From the above definitions, DetectRate_frame is stricter than DetectRate_object as an estimate of the detection rate, and FalseRate_object is stricter than FalseRate_frame as an estimate of the false alarm rate.

DetectRate_frame = N_ped_detect_correct_frames / N_ped_exist_true_frames,
FalseRate_frame = N_ROI_detect_false_frames / N_ROI_detect_asped_frames.   (6)

DetectRate_object = N_ped_detect_correct_objects / N_ped_exist_true_objects,
FalseRate_object = N_ROI_detect_false_objects / N_ROI_detect_asped_objects.   (7)

Figure 6 exemplifies the results of pedestrian detection, where C030 represents the center camera with a 30 degree view angle, L075 represents a left camera with a 75 degree view angle, R100 represents a right camera with a 100 degree view angle, and so on. Rectangles represent objects which were detected by motion difference and tracked by the S-T MRF.
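The rates of Eqs. (6) and (7) are plain ratios and can be written directly. The example counts below follow the object-based motion-detection row of Table 1 as we read it (205 correct detections among 209 pedestrians, 441 false positives among 646 ROIs); they are reconstructed values, not independently verified ones.

```python
def detect_rate(n_detect_correct, n_exist_true):
    """DetectRate_frame / DetectRate_object of Eqs. (6) and (7)."""
    return n_detect_correct / n_exist_true

def false_rate(n_detect_false, n_detect_asped):
    """FalseRate_frame / FalseRate_object of Eqs. (6) and (7)."""
    return n_detect_false / n_detect_asped

# Object-based motion-detection row (reconstructed from Table 1):
dr = detect_rate(205, 209)  # about 0.9809 -> 98.09 %
fr = false_rate(441, 646)   # about 0.6827 -> 68.3 %
```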
ROIs bounded by blue rectangles were determined to be pedestrians, and ROIs bounded by red rectangles were determined to be non-pedestrians by the HOG/Fisher classifier. Figure 6(a) shows a sequence of object tracking by the S-T MRF model and classification by HOG/Fisher. The sequence shows that pedestrians were tracked successfully while the on-board camera was moving, and a pillar was successfully excluded from the pedestrian candidates in frame 258. Persons traveling on bicycles were also classified as pedestrians, since HOG/Fisher examines their ROIs by scanning the HOG window through the whole area of the ROI, as explained in Section IV-A. In Figure 6(b) L060, a parked vehicle was determined to be a pedestrian, which is a classification failure of the HOG/Fisher classifier. The cascade classifier comprises two descriptors trained on images dominated by horizontal edges and by vertical edges, in order to exclude vehicles, pillars, buildings, and so on; however, some objects were not excluded. In Figure 6(b) R060, pedestrians standing beside a building were not detected by the motion detector, because they stood quite close to the building and the motion differences were quite small. This case was counted as a detection failure in the estimation of the detection rate in this paper. However, since people beside a building are not in danger in the practical scene, it would not be necessary to support drivers in such a case.

B. Discussion

From our experiments, objects with heights of more than 4 blocks, that is, 32 pixels, were detected without lowering the performance of pedestrian detection. Since this restriction of 32-pixel height comes from the resolution of the image, the restriction applies commonly across variations in camera view angle. From Figure 3(a), pedestrians 1.7 meters tall should be located at distances of 41, 31, 26, and 17 meters from the camera for the view angles of 30, 60, 75, and 100 degrees, respectively.
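The discussion above ties the 32-pixel minimum ROI height to a maximum detection range per view angle. A back-of-envelope version follows, assuming a pinhole camera and a 640-pixel image width; the paper's sensor resolution is not stated here, so the absolute distances will not reproduce the 41/31/26/17 m figures exactly, but the trade-off direction is the same.

```python
import math

def max_range_m(view_angle_deg, image_width_px=640,
                person_height_m=1.7, min_height_px=32):
    """Farthest distance at which a pedestrian of the given height still
    spans the minimum pixel height (pinhole model, assumed resolution)."""
    f_px = (image_width_px / 2.0) / math.tan(math.radians(view_angle_deg) / 2.0)
    return person_height_m * f_px / min_height_px

ranges = [max_range_m(a) for a in (30, 60, 75, 100)]
# wider view angle -> shorter focal length -> shorter detection range
```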
We suppose that a wide-angle camera, such as the 100 degree view angle case, should be used to avoid side collisions with pedestrians at intersections. Since the vehicle travels at slow speed in such situations, a detection range of 17 meters would be practically acceptable. In this paper, people who show no motion difference, as in Figure 6(b) R060-01, have been counted in the results. If we exclude such people, the detection rate rises from 91.39 % to 93.30 % in the object based evaluations and from 86.61 % to 88.61 % in the frame based evaluations.

TABLE 1: Performance Analyses of Cascade Classifier

(a) Object-based evaluations: A = 209 pedestrian objects appeared

  Stage                          Sum D=B+C   Pedestrian B   Non-ped C   Detect rate B/A   False alarm rate C/D
  Motion detection                     646            205         441           98.09 %                68.26 %
  Geometrical constraint on ROI        378            205         173           98.09 %                45.76 %
  All categories                       319            200         119           95.69 %                37.30 %
  Vertical edges                       255            196          59           93.78 %                23.13 %
  Horizontal edges                     229            192          37           91.87 %                16.16 %
  Complex textures                     227            191          36           91.39 %                15.86 %

(b) Frame-based evaluations: A = 5883 frames in which pedestrians appeared

  Stage                          Sum D=B+C   Pedestrian B   Non-ped C   Detect rate B/A   False alarm rate C/D
  Motion detection                    7558           5751        1807           97.76 %                23.91 %
  Geometrical constraint on ROI       6960           5736        1224           97.50 %                17.59 %
  All categories                      6748           5553        1195           94.39 %                17.71 %
  Vertical edges                      6464           5466         998           92.91 %                15.44 %
  Horizontal edges                    5853           5145         708           87.46 %                12.10 %
  Complex textures                    5797           5095         702           86.61 %                12.11 %

VI. CONCLUSION

In this paper, we have developed a general method for pedestrian detection which is applicable to various kinds of

view angles. The algorithm was examined using cameras of 30-100 degree view angles, and achieved high accuracy in pedestrian detection. In addition, this algorithm requires quite a simple and practical calibration of the camera, based on an experiment on pedestrian height. The distance ranges of the algorithm, 41 meters with the camera of 30 degrees view angle and 17 meters with the camera of 100 degrees view angle, would be practically acceptable. Thus, practical systems can be designed by selecting the camera specifications suitable to the systems, while the algorithm for pedestrian detection can be uniformly applied.

REFERENCES

[1] J. A. Misener, "PATH Investigations in Vehicle-Roadside Cooperation and Safety: A Foundation for Safety and Vehicle-Infrastructure Integration Research," Proceedings of the 9th IEEE Intelligent Transportation Systems Conference (ITSC'06), Toronto, Canada, 2006.
[2] L. Alexander, P. M. Cheng, A. Gorjestani, A. Menon, B. Newstrom, C. Shankwitz and M. Donath, "The Minnesota Mobile Intersection Surveillance System," Proceedings of the 9th IEEE Intelligent Transportation Systems Conference, Toronto, Canada, 2006.
[3] C. Papageorgiou, T. Poggio, "A trainable system for object detection," International Journal of Computer Vision 38(1) (2000) 15-33.
[4] D. M. Gavrila and S. Munder, "Multi-cue pedestrian detection and tracking from a moving vehicle," Int. J. Comput. Vis., 2007.
[5] N. Dalal, B. Triggs, "Histograms of oriented gradients for human detection," IEEE Conference on Computer Vision and Pattern Recognition (2005) 886-893.
[6] M. Oren, C. Papageorgiou, P. Sinha, E. Osuna, and T. Poggio, "Pedestrian detection using wavelet templates," in Proceedings of IEEE CVPR'97, pages 193-199, 1997.
[7] D. M. Gavrila, J. Giebel, and S. Munder, "Vision-Based Pedestrian Detection: The PROTECTOR System," in Proc. IEEE Intell. Veh. Symp., June 2004.
[8] S. Munder, C. Schnörr, D. M.
Gavrila, "Pedestrian Detection and Tracking Using a Mixture of View-Based Shape-Texture Models," IEEE Transactions on Intelligent Transportation Systems, Volume 9, Issue 2, pages 333-343, June 2008.
[9] A. Shashua, Y. Gdalyahu, and G. Hayun, "Pedestrian detection for driving assistance systems: single-frame classification and system level performance," IEEE Intelligent Vehicles Symposium, 2004.
[10] R. Cutler, L. Davis, "Robust real-time periodic motion detection: analysis and applications," IEEE Transactions on Pattern Analysis and Machine Intelligence 22(8) (2000) 781-796.
[11] H. Sidenbladh, "Detecting human motion with support vector machines," Proceedings of the 17th International Conference on Pattern Recognition, 2:188-191, 2004.
[12] P. Viola, M. J. Jones, and D. Snow, "Detecting pedestrians using patterns of motion and appearance," IEEE International Conference on Computer Vision, 2:734-741, 2003.
[13] C. Curio, J. Edelbrunner, T. Kalinke, C. Tzomakas, and W. von Seelen, "Walking Pedestrian Recognition," IEEE Transactions on Intelligent Transportation Systems, Vol. 1, No. 3, September 2000.
[14] H. Elzein, S. Lakshmanan, P. Watta, "A motion and shape-based pedestrian detection algorithm," Proceedings of the IEEE Intelligent Vehicles Symposium, 2003.
[15] S. Kamijo, M. Sakauchi, "Simultaneous Tracking of Pedestrians and Vehicles in Cluttered Images at Intersections," 10th World Congress on ITS, Madrid, November 2003, CD-ROM.
[16] B. Shen, K. Fujimura, S. Kamijo, "Pedestrian Detection by On-board Camera Using Collaboration of Inter-layer Algorithm," ITSC 2009, pp. 588-595, October 2009, St. Louis.
[17] http://lear.inrialpes.fr/data

(a) A sequence of pedestrian detection (left camera of 75-degree view angle: L075; frames 04, 07, 14, 21, 33, 58)

(b) Results for each camera specification (C030, L060, R060, L075, R075, L100, R100)

Figure.6 : Results of pedestrian detection