Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation

M. Blauth, E. Kraft, F. Hirschenberger, M. Böhm
Fraunhofer Institute for Industrial Mathematics, Fraunhofer-Platz 1, D-67663 Kaiserslautern
University of Applied Sciences Kaiserslautern, Morlauterer Strasse 31, D-67657 Kaiserslautern
email: marco_blauth@web.de, erwin.kraft@itwm.fraunhofer.de, f.hirschenberger@fh-kl.de, martin.boehm@fh-kl.de

Abstract

We present a state-of-the-art traffic sign recognition system that is able to detect and classify 43 different German traffic signs. To maintain high performance and scalability, we use a two-stage approach that separates the detection from the classification step. We use a fast shape detection method based on multi-block local binary patterns (MB-LBPs) [1], which has been shown to give good results in a face detection application. Color segmentation is used alongside the shape detector; this increases the affine invariance properties and the overall detection rate of the system. For the classification we rely on HOG features [2] and non-linear support vector machines (SVMs). Our classifier achieves a score of 96.93% on the German Traffic Sign Recognition Benchmark [3], which can be considered a top-scoring result. Since this benchmark is a pure classification benchmark that uses bounding boxes with centered signs, we also evaluated the detection capabilities of our system on 108 traffic images provided by a street inspection company. Our system detected and classified 97% of the traffic signs in this data set correctly while the false positive rate was very low. The results show the high applicability of our system under real-world conditions.

1 Introduction

Reliable traffic sign recognition can be considered a key aspect of driver assistance systems. Most commercially available systems are limited to a subset of important traffic signs, like speed limit or give way signs, that are detected from a frontal view. In this paper we deal with the problem of large-scale traffic sign recognition, where we have to process a wide variety of traffic sign classes. Our motivation is driven by the needs of a street inspection company: cameras are mounted on inspection cars, and images of the road scene are taken every four meters while the car moves along. Our goal is to find all traffic signs in the scene. The cameras observe the scene from different viewing angles, which means that the traffic signs are more likely to be distorted than would be the case with frontal views. Our system has to differentiate between visually very similar traffic signs, for example different speed limit signs. It also has to be robust under real-world conditions such as occlusions, image blur, varying lighting conditions and other distortions. Some examples are shown in figure 1.

Figure 1: Successful detection and classification of different traffic signs. Our system is able to discriminate between visually similar signs and has good scale and affine invariance properties.

The literature shows numerous works on the detection and classification of traffic signs. Since many published methods already handle a limited number of important signs well, the area of interest has shifted to large-scale settings with more difficult environmental conditions. The recently published German Traffic Sign Recognition Benchmark (GTSRB) [3] was designed to address the classification problem in such a setting. Earlier methods like [4], [5], [6] and others do not perform well on this data set, since they require a precise segmentation of the signs and their text/pictograms, which is often not possible for the low-contrast and distorted images in this benchmark. Newer approaches like [7] or [8] use HOG features [2], which have been shown to give excellent results in difficult detection and classification scenarios. In our experiments they significantly outperformed other features, for example distance-to-border features [4]. HOG features [2] are often used by sliding-window detectors that run across various scales, where each window is classified by a support vector machine (SVM).

From our experience, such detectors can become a performance bottleneck when dealing with many different classes and non-linear kernel functions. We therefore use a two-stage approach in which we first detect candidate regions in a more efficient way based on shape and color information. These candidates are then classified using HOG features [2] to determine whether they contain a traffic sign. Our approach is very fast: it processes a high-resolution image of 1388 × 1038 pixels in under 2 seconds on a normal workstation. It also increases the detection and classification rate for distorted traffic signs that are viewed from steep angles.

2 Detection of traffic signs

The goal of our detection stage is to identify image regions that may contain a traffic sign. To ensure a high system performance we focus on fast detection methods. Shape and color information are the primary cues used to detect regions of interest. The found image regions are then normalized and classified using HOG features [2]. Our HOG-classifier is robust against distortions and is also very well suited to rejecting false positives. Therefore, we focus on a high detection rate for both the shape and the color detection rather than on a low number of false positives. We use 108 street images to evaluate our detection results, because the GTSRB [3] is a pure classification benchmark and is not well suited for this purpose. Table 1 shows the detection results on our data set.

Table 1: Detection results for a set of 108 street images.

    Method                         Detection rate   Number of false positives
    MB-LBPs                        97.7%            218
    Haar-like features             92%              377
    Color segmentation             99%              1396
    Color segmentation + MB-LBPs   100%             1396 + 218

2.1 Color based detection

Color segmentation and blob detection are used to find interesting image regions. We look for the color ranges red, blue and yellow in the HSI color space, because these are important traffic sign colors. The HSI color space separates the color information from the greyscale intensity: the color is encoded in the hue (H) and saturation (S) channels, while the greyscale intensity (I) has its own channel. However, color information is not reliable if the saturation is very low. We therefore define intervals [H_min, H_max], [S_min, S_max] and [I_min, I_max] for all three channels and use a simple binary thresholding function:

B(i, j) = \begin{cases} 1, & \text{if } (H_{\min} \le H(i,j) < H_{\max}) \wedge (S_{\min} \le S(i,j) < S_{\max}) \wedge (I_{\min} \le I(i,j) < I_{\max}) \\ 0, & \text{otherwise} \end{cases}

The intervals are chosen by grid search so that the detection rate is maximized for a training set of images. Our color detection results are shown in table 1.
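
As a rough illustration of this color segmentation step (a sketch, not the authors' implementation), the code below uses OpenCV's HSV color space as a stand-in for HSI; the interval bounds and the minimum blob area are placeholder values that the grid search described above would have to provide.

```python
# Minimal sketch of the color segmentation step, assuming OpenCV's HSV space
# as a stand-in for the HSI space used above. The interval bounds are
# placeholders; the system described in the text selects them by grid search
# on a training set.
import cv2

def color_candidates(bgr_image, min_area=100):
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)

    # Red wraps around the hue axis (H in [0, 180) in OpenCV), so two ranges
    # are combined; blue and yellow each need a single range.
    red = cv2.inRange(hsv, (0, 70, 50), (10, 255, 255)) | \
          cv2.inRange(hsv, (170, 70, 50), (180, 255, 255))
    blue = cv2.inRange(hsv, (100, 70, 50), (130, 255, 255))
    yellow = cv2.inRange(hsv, (20, 70, 50), (35, 255, 255))
    mask = red | blue | yellow

    # Blob detection: connected components above a minimum area become
    # candidate regions, returned as (x, y, w, h) bounding boxes.
    num, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    return [tuple(stats[i, :4]) for i in range(1, num)
            if stats[i, cv2.CC_STAT_AREA] >= min_area]

# Usage: boxes = color_candidates(cv2.imread("street_scene.jpg"))
```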

As one can see in table 1, the detection rate is very high, but there are also many false positives. Some of the false positives are shown in figure 2.

Figure 2: False positive detections of our color segmentation method. Both traffic signs are detected, but the red colored leaves on the ground produce a large number of false positives. These false positives are later rejected by our HOG-classifier.

The detected regions are normalized to increase the affine and scale invariance properties of our system. An example is shown in figure 3: the detected traffic sign is distorted because of the steep viewing angle, and a coarse spatial normalization procedure is sufficient to obtain a correct classification result.

Figure 3: Affine normalization of segmented image regions. The left image is too distorted for our HOG-classifier. After normalization the region can be correctly classified as a valid traffic sign.
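
The text does not spell out the normalization procedure, so the following sketch should be read as one plausible coarse variant under that assumption: it fits a rotated rectangle to the segmented blob and warps it onto an axis-aligned square. All function names are illustrative.

```python
# Hypothetical sketch of a coarse spatial normalization: fit a rotated
# rectangle to the segmented blob and warp it to an axis-aligned square.
import cv2
import numpy as np

def order_corners(pts):
    # Order four corners as top-left, top-right, bottom-right, bottom-left.
    s = pts.sum(axis=1)
    d = np.diff(pts, axis=1).ravel()
    return np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                     pts[np.argmax(s)], pts[np.argmax(d)]], dtype=np.float32)

def normalize_region(image, blob_mask, size=40):
    pts = cv2.findNonZero(blob_mask)           # pixel coordinates of the blob
    if pts is None:
        return None
    rect = cv2.minAreaRect(pts)                # rotated bounding rectangle
    corners = order_corners(cv2.boxPoints(rect))

    # Warp the (possibly distorted) region onto an axis-aligned square patch.
    target = np.array([[0, 0], [size - 1, 0],
                       [size - 1, size - 1], [0, size - 1]], dtype=np.float32)
    H = cv2.getPerspectiveTransform(corners, target)
    return cv2.warpPerspective(image, H, (size, size))
```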

2.2 Shape based detection

For the shape based detection of traffic signs we rely on the methods presented in [1] and [9]. The detector uses multi-block local binary patterns (MB-LBPs). An MB-LBP is a local descriptor that encodes greyscale differences between local image sub-regions. Figure 4 gives a schematic overview of the computation of MB-LBPs. The computation is very simple and can be sped up effectively with integral images.

Figure 4: Computation of the MB-LBP descriptor. The mean grey values of the surrounding blocks are compared to the mean value of the center block. The resulting binary pattern forms a compact descriptor.

Training is done using a variant of the popular AdaBoost algorithm. AdaBoost combines so-called weak classifiers h_j(x) into a strong classifier h(x). A weak classifier is restricted to use only one feature f_j from a predefined set:

h_j(x) = \begin{cases} 1, & \text{if } p_j f_j(x) < p_j \theta_j \\ 0, & \text{otherwise} \end{cases}

where θ_j is a threshold and p_j a parity. The function f_j(x) applies a feature f_j to an image region x. The final strong classifier h(x) is a combination of weak classifiers h_t(x) that were selected at training stage t:

h(x) = \begin{cases} 1, & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases}

where α_t = log(1/β_t) and β_t = ε_t / (1 − ε_t); ε_t is the overall error of the weak classifier h_t(x). Weak classifiers are trained and selected so that they complement each other.

The final detector arranges multiple strong classifiers in a cascade: at each stage a classifier is trained by AdaBoost, and the number of allowed features is increased so that classifiers at later stages become more reliable but also more complex than classifiers at early stages. The main idea is that image regions that do not contain objects of interest are discarded at early stages of the cascade, so the detector becomes very efficient when used in a sliding-window manner. More details on the training and construction of the cascade can be found in [1] and [9].

We used images from the training set of the German Traffic Sign Recognition Benchmark (GTSRB) to train our shape detector. The detector was able to predict the correct traffic sign shape on the test set with a success rate of 96%. We also tested the detector on our 108 street images: the detection rate on this data set was 97.7% and the number of false positives was 218 (see table 1). We also compared the MB-LBPs with Haar-like features; the MB-LBPs produced a smaller number of false positives while the detection rate was higher. Some detection results and false positives can be seen in figure 5.
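
To make the detector structure concrete, here is an illustrative sketch, written against the equations above rather than the authors' code, of a single MB-LBP feature evaluated via an integral image and of the strong-classifier decision h(x); positions, look-up tables and weights are placeholders that AdaBoost training would provide.

```python
# Illustrative sketch: one MB-LBP feature computed with an integral image and
# the AdaBoost strong-classifier decision h(x). Not the authors' detector.
import numpy as np

def integral_image(window):
    # Integral image with an extra leading row/column of zeros.
    return np.pad(window.astype(np.int64), ((1, 0), (1, 0)),
                  mode="constant").cumsum(axis=0).cumsum(axis=1)

def block_sum(ii, x, y, w, h):
    # Sum of the w x h block with top-left corner (x, y), in constant time.
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def mb_lbp(ii, x, y, bw, bh):
    # Compare the 8 surrounding blocks of a 3x3 block grid with the centre
    # block and pack the comparisons into an 8-bit code (equal block areas,
    # so comparing sums is the same as comparing mean grey values).
    around = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    center = block_sum(ii, x + bw, y + bh, bw, bh)
    code = 0
    for bit, (dx, dy) in enumerate(around):
        if block_sum(ii, x + dx * bw, y + dy * bh, bw, bh) >= center:
            code |= 1 << bit
    return code

def strong_classifier(window, weak_classifiers):
    # h(x) = 1 if sum_t alpha_t * h_t(x) >= 0.5 * sum_t alpha_t, else 0.
    # Each weak classifier is (x, y, bw, bh, lut, alpha), where lut maps an
    # MB-LBP code to {0, 1}.
    ii = integral_image(window)
    score = sum(alpha * lut[mb_lbp(ii, x, y, bw, bh)]
                for (x, y, bw, bh, lut, alpha) in weak_classifiers)
    return 1 if score >= 0.5 * sum(wc[-1] for wc in weak_classifiers) else 0
```

A full cascade would simply chain several such strong classifiers, rejecting a window as soon as one stage returns 0.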

Figure 5: False positive detections of our shape detector. The traffic sign is detected, but so are other round objects like tires. The false positives are later rejected by our HOG-classifier.

3 Classification of traffic signs

In the classification stage we determine whether a detected image region contains a particular traffic sign or whether it has to be rejected as a false positive. The detected regions are first normalized to a size of 40 × 40 pixels, and HOG features [2] are then computed on a dense grid. A HOG feature is a local image feature that captures shape and structure information. It is a weighted 3D histogram that quantizes the spatial positions and directions of the image gradients. To avoid binning effects it is very important to use linear interpolation between all adjacent bins of the 3D histogram. Figure 6 gives an overview of the computation of HOG features; further details can be found in [2].

Figure 6: Computation of HOG features. The image region is divided into cells. For each cell the directions of the image gradients are quantized into bins. Linear interpolation is used between adjacent cells and direction bins.

Our HOG features consist of 2 × 2 cells, and each cell covers an area of 5 × 5 pixels. All HOG features are concatenated to form a high-dimensional feature vector that describes the whole image region. These feature vectors are used as inputs for non-linear support vector machines (SVMs), which are trained in a one-vs-one manner so that they are able to classify 43 different traffic signs.
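
A minimal sketch of this classification stage, assuming scikit-image's hog() and scikit-learn's SVC as stand-ins for the authors' own HOG and non-linear SVM implementation; the 40 × 40 patch, 5 × 5-pixel cells and 2 × 2-cell blocks follow the numbers above, while the orientation-bin count and SVM hyperparameters are assumptions.

```python
# Minimal sketch of the classification stage: dense HOG on a 40 x 40 patch,
# fed into an RBF-kernel SVM (scikit-learn trains multi-class SVC one-vs-one).
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import SVC

def hog_descriptor(region):
    patch = resize(region, (40, 40))            # normalize to 40 x 40 pixels
    return hog(patch,
               orientations=9,                  # gradient direction bins (assumed)
               pixels_per_cell=(5, 5),
               cells_per_block=(2, 2),
               block_norm="L2-Hys")

def train_classifier(regions, labels):
    # regions: greyscale candidate patches, labels: 0..42 (43 GTSRB classes).
    features = np.array([hog_descriptor(r) for r in regions])
    return SVC(kernel="rbf", C=10.0, gamma="scale").fit(features, labels)

# Prediction for a single detected region:
#   label = clf.predict(hog_descriptor(region)[np.newaxis, :])[0]
```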

The HOG-classifier was trained using the training set of the GTSRB [3]. On the test set it was able to classify 96.93% of the images correctly, which is a very high success rate. We also tested the classifier on the detected regions of our 108 street images: 97% of the traffic signs were classified correctly while only 44 false positives remained. Some examples are shown in figure 1: the classifier is able to discriminate between very similar signs and significantly reduces the number of false positives from the detection stage.

Table 2: Detection and classification results for a set of 108 street images.

    Recognition rate (detection and classification)   Number of false positives
    97%                                                44

We also tried to compute HOG features on different color channels but found that the classification rate could not be increased that way.

4 Conclusions

A traffic sign detection and recognition system was presented that is able to reliably recognize a large number of traffic signs under real-world conditions. The detection and classification capabilities of the system were tested extensively on two different data sets: the German Traffic Sign Recognition Benchmark [3] and a detection data set consisting of 108 street images. Our HOG-classifier achieved a score of 96.93% on the GTSRB. The overall recognition rate on our detection data set was 97%, with only 44 false positives.

5 Acknowledgements

This work was supported by the Bundesministerium für Bildung und Forschung (BMBF).

References

[1] S. C. Liao, X. X. Zhu, Z. Lei, L. Zhang, and S. Z. Li. Learning multi-scale block local binary patterns for face recognition. In International Conference on Biometrics (ICB), volume 4642, pages 828–837, 2007.
[2] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In International Conference on Computer Vision & Pattern Recognition (CVPR), volume 2, pages 886–893, 2005.
[3] German Traffic Sign Recognition Benchmark. http://benchmark.ini.rub.de.
[4] S. Lafuente-Arroyo, P. Gil-Jiménez, R. Maldonado-Bascón, F. López-Ferreras, and S. Maldonado-Bascón. Traffic sign shape classification evaluation I: SVM using distance to borders. In Proceedings of the 2005 IEEE Intelligent Vehicles Symposium, 2005.
[5] L.-W. Tsai, J.-W. Hsieh, C.-H. Chuang, Y.-J. Tseng, K.-C. Fan, and C.-C. Lee. Road sign detection using eigen colour. In IET Computer Vision, volume 2, pages 164–177, 2007.

[6] X. Chen, J. Yang, J. Zhang, and A. Waibel. Automatic detection and recognition of signs from natural scenes. In IEEE Transactions on Image Processing, volume 13, pages 87–99, 2004.
[7] F. Zaklouta, B. Stanciulescu, and O. Hamdoun. Traffic sign classification using k-d trees and random forests. In IEEE International Joint Conference on Neural Networks (IJCNN), pages 2151–2155, 2011.
[8] I. Bonaci, I. Kusalic, I. Kovacek, Z. Kalafatic, and S. Segvic. Addressing false alarms and localization inaccuracy in traffic sign detection and recognition. In 16th Computer Vision Winter Workshop, pages 1–8, 2011.
[9] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In International Conference on Computer Vision & Pattern Recognition (CVPR), pages 511–518, 2001.