ROBUST FACE DETECTION UNDER CHALLENGES OF ROTATION, POSE AND OCCLUSION

Phuong-Trinh Pham-Ngoc, Quang-Linh Huynh
Department of Biomedical Engineering, Faculty of Applied Science, Hochiminh University of Technology, Vietnam
{equation/huynhqlinh}@hcmut.edu.vn

Abstract: Face detection has been an active research domain for decades because it can be applied in many fields such as human-machine interaction, surveillance, commercial applications and health care. In this paper, we propose an automatic face detection system based on human skin detection, natural properties of faces, and the classification strength of Local Binary Patterns (LBPs) and embedded Hidden Markov Models (eHMMs). We develop a skin color model that effectively rejects skin-like colors, which otherwise cause noise, and so gives better skin detection for different human races. Within the detected skin regions, natural properties of faces are used to discard non-face objects and retain the most reasonable face candidates. A flexible classification combining LBP histogram matching and eHMMs then determines whether each detected candidate is a face. The advantage of this classification is that it effectively reduces the impact of face rotation, pose and occlusion. The experiments show that our system robustly detects human faces in both video sequences and still images, with 93% correct detection over a variety of facial test databases formed from different sources.

Keywords: Face detection, skin segmentation, Local Binary Patterns (LBPs), embedded Hidden Markov Models (eHMMs), face pose, face rotation, occlusion.

1 INTRODUCTION

As society ages and develops rapidly, the cost and importance of health care are increasing year by year. Many researchers have concentrated on creating robot-aided health care systems for elderly people, for taking care of children, etc. These systems always require human-robot interaction (HRI), whose principal step is face detection.
The better the face detection system is, the better the HRI is. Face detection has also been investigated because of its wide applications in surveillance, commerce, security, etc. Many methods for face detection in images [1] have been published and have achieved encouraging results. However, face detection is still a challenging task due to variability in rotation, pose and occlusion. The face detection method implemented in OpenCV by Rainer Lienhart [2] is very similar to the state-of-the-art one published and patented by Paul Viola and Michael Jones [3]. Although this is one of the most popular and useful algorithms nowadays, it still has difficulty detecting rotated, profile and occluded faces. This paper presents an improved face detection system to solve these problems.

This paper is organized as follows. Human skin detection by the developed skin-color model is introduced in section 2. In the next section, face candidate localization based on natural properties of human faces is described. In section 4, we present a hybrid classification method to verify which face candidates are real faces. To reduce the influence of occlusion, the human face is divided into two parts and each of them gives one LBP histogram. The mixed histogram formed from these two histograms is regarded as the facial representation. A hybrid classification combining template matching and an appearance-based method is used to decide whether each face candidate is a face or not. This is a combination of LBP histogram matching and eHMMs in a hierarchical classifier. The experiments are shown and discussed in section 5. Finally, we conclude in section 6.

2 HUMAN SKIN DETECTION

Skin color is important and powerful information for locating human faces. The use of color information can simplify the task of localization in complex environments. It allows fast processing and is highly robust to geometric variations of the face pattern. For the skin detection task, many color spaces with different properties have been applied. Many researchers have achieved encouraging results with RGB, normalized rgb, HSI, YCrCb and RGB-space ratios. A survey of skin color detection can be found in [4]. However, there are many challenges in this task such as different illumination conditions, human races, and skin-like colors.

We create an improved skin model in RGB space to obtain better skin detection for various human races. This skin model effectively rejects skin-like colors that cause noise in the skin segmentation process. These colors can be yellow, white, orange, pink, red, the brown color of wood, and the yellow color of sand. Other skin models such as the works of Peer [5] and Lin [6] show good skin detection, but they still tend to retain those non-skin colors. The model of Lin often retains skin-like white, yellow, orange, sand and grey colors, as shown in Fig. 1(b). As shown in Fig. 1(c), the model of Peer tends to keep red and pink colors. This weak point sometimes becomes a serious problem because too many non-skin regions are retained to determine the correct face regions.

Our approach is to build a skin classifier that explicitly defines the boundaries of the skin cluster in RGB space. This approach is simple and leads to a fast classifier. The decision rule of our skin model is as follows:

\[ \delta(P(x,y)) = \begin{cases} 1, & \text{if the set of conditions is satisfied} \\ 0, & \text{otherwise} \end{cases} \tag{1} \]

where P(x,y) is a pixel of the color image and the set of conditions is listed in Table 1.

Table 1: A set of conditions defining skin pixels. These conditions should be satisfied simultaneously.
R range     (R-G)      (G-B)      (R-B)        (R+B-2G)    B          G
[70,85]     [30,55]    [-5,35]    -            -           [20,255]   [30,255]
[86,100]    [30,60]    [-5,40]    -            -           [30,255]   [40,255]
[101,150]   [0,30]     [-10,45]   [15,75]      [-15,285]   -          -
[101,150]   [31,75]    [-5,90]    [-255,120]   [-20,285]   -          [50,255]
[151,200]   [15,20]    [-5,40]    [20,255]     [-20,285]   -          -
[151,200]   [31,85]    [-15,70]   [20,255]     [0,285]     [40,255]   -
[201,255]   [5,25]     [40,70]    -            [-30,285]   -          -
[201,255]   [26,100]   [0,70]     -            [-15,285]   -          -
(A dash marks a quantity with no condition in that row.)

In fact, the strongest component among R, G and B decides the color. For skin color, the R component is generally the strongest one because human skin carries the color of blood. A color is not a skin color if the differences between R, G and B are too big or too small, or if the R value is smaller than 70. The level of red affects the decision rule of our skin model. We therefore divide the R component, for values of at least 70, into five main ranges, and adjust the admissible differences between R, G and B according to these ranges. The proposed skin model gives better skin detection for different human races in various environments. Several skin detection results shown in Fig. 1 demonstrate the advantage of our skin model compared to others.

Figure 1: Comparison of skin detection results: (a) Original color image, (b), (c) and (d) Skin detection results from the skin color models of Lin, Peer and the proposed skin model, respectively.

3 FACE CANDIDATE LOCALIZATION

An overview of the proposed face detection system is given in Fig. 2.

Color Image (Frame) -> Human Skin Detection -> Face Candidate Localization based on Natural Properties of Faces -> Face Candidates -> Hybrid Classification based on LBP Histogram Matching and eHMMs -> Detected Faces

Figure 2: Face detection system.
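As a concrete illustration of the first stage in Fig. 2, the decision rule of Eq. (1) with the ranges of Table 1 can be sketched as a simple table lookup. This is only a sketch under stated assumptions: the grouping of the table into eight rows (two rows of (R-G) sub-ranges per R range, with the first two R ranges having one row each) is one reading of the flattened table, a pixel is taken as skin when all conditions of at least one row hold, and the names `SKIN_RULES`, `is_skin` and `skin_mask` are hypothetical.

```python
# Assumed reading of Table 1: (R range, {quantity: (low, high)}).
# A "-" cell in the table is simply omitted from the dictionary.
SKIN_RULES = [
    ((70, 85),   {"R-G": (30, 55),  "G-B": (-5, 35),  "B": (20, 255), "G": (30, 255)}),
    ((86, 100),  {"R-G": (30, 60),  "G-B": (-5, 40),  "B": (30, 255), "G": (40, 255)}),
    ((101, 150), {"R-G": (0, 30),   "G-B": (-10, 45), "R-B": (15, 75),
                  "R+B-2G": (-15, 285)}),
    ((101, 150), {"R-G": (31, 75),  "G-B": (-5, 90),  "R-B": (-255, 120),
                  "R+B-2G": (-20, 285), "G": (50, 255)}),
    ((151, 200), {"R-G": (15, 20),  "G-B": (-5, 40),  "R-B": (20, 255),
                  "R+B-2G": (-20, 285)}),
    ((151, 200), {"R-G": (31, 85),  "G-B": (-15, 70), "R-B": (20, 255),
                  "R+B-2G": (0, 285), "B": (40, 255)}),
    ((201, 255), {"R-G": (5, 25),   "G-B": (40, 70),  "R+B-2G": (-30, 285)}),
    ((201, 255), {"R-G": (26, 100), "G-B": (0, 70),   "R+B-2G": (-15, 285)}),
]


def is_skin(r, g, b):
    """Return 1 if the 8-bit RGB pixel satisfies every condition of some row, else 0."""
    values = {
        "R-G": r - g,
        "G-B": g - b,
        "R-B": r - b,
        "R+B-2G": r + b - 2 * g,
        "B": b,
        "G": g,
    }
    for (r_low, r_high), conditions in SKIN_RULES:
        if not r_low <= r <= r_high:
            continue
        if all(low <= values[name] <= high
               for name, (low, high) in conditions.items()):
            return 1
    return 0


def skin_mask(image):
    """Apply is_skin to an image given as rows of (R, G, B) tuples."""
    return [[is_skin(r, g, b) for (r, g, b) in row] for row in image]
```

For example, `skin_mask([[(220, 170, 140), (255, 255, 255)]])` marks the first pixel as skin and rejects pure white, because white has R-G = 0, which falls outside both (R-G) ranges listed for R in [201,255].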

After skin segmentation by our skin-color model, as shown in Fig. 3(a), we label connected skin regions and erase regions whose areas are smaller than a threshold. In our experiment, this threshold is 108 pixels, taken as half the area of the smallest face to be detected. We call this step reducing small noise, as shown in Fig. 3(b). It removes unreasonable candidates and so improves processing speed. Moreover, skin segmentation is generally affected by different illumination conditions, so some face regions may be lost. We recover those regions by labeling connected non-skin regions inside each skin region and changing them to skin. This process ignores non-skin regions that connect directly to the boundary of their skin region and those whose areas are greater than selected thresholds. An example of this step is shown in Fig. 3(c). This step also helps protect our system under different illumination conditions.

Figure 3: Several examples of steps in face candidate localization, panels (a) to (f).

Skin regions that have the properties of human faces are preserved by covering their areas with skin ellipses. Because human faces have a nearly elliptic shape, we use an area condition for the special ellipse region defined in Fig. 4 to check this property. For each skin region, the area condition for ellipse region E is as follows,

\[ \sum_{i} P_E(i) \ge T \cdot H \cdot W \tag{2} \]

where P_E(i) is a binary value which equals 1 if skin pixel i belongs to the inner region of ellipse E, T is a selected constant, and H and W are the height and width of the skin region rectangle.

Figure 4: Skin region rectangle with special ellipse E.

These steps are important for face candidate localization because correct faces could otherwise be lost under the strong separation scheme introduced next. In some cases, human faces in images are connected to each other or to other things such as a hand or an arm. This is one of the challenges of the face detection task. Some researchers have tried various ways to locate faces, such as using the Hough transform to find an elliptic skin region considered as a face candidate [7]. However, these methods are prone to failure because the apparent shape of a human face changes when such connections occur. We use histograms of skin pixel intensity to separate connected objects. For each skin region rectangle, we calculate horizontal and vertical histograms of skin pixel intensity. In these histograms, concave regions appear at the junctions between different parts. We set all histogram values in those concave regions to zero to divide the parts, according to function (3),

\[ h(i) = \begin{cases} h(i), & h(i) \ge t \\ 0, & \text{otherwise} \end{cases} \tag{3} \]

where i is the i-th bin of the horizontal or vertical histogram, h(i) is the histogram value of the i-th bin and t is a selected coefficient. After dividing connected objects, we reject non-face skin regions by several geometric conditions, as shown in function (4),

\[ \delta(H, W, S) = \begin{cases} 1, & S \ge \text{threshold and } \{(H \le W \le 3H) \lor (W \le H \le 4W)\} \\ 0, & \text{otherwise} \end{cases} \tag{4} \]

where H and W are the height and width of the skin region rectangle and S is the number of skin pixels belonging to the skin region rectangle. An example of skin ellipses and the separation process is shown in Fig. 3(d), and an example of reducing non-face regions is displayed in Fig. 3(e). Finally, as shown in Fig. 3(f), we obtain the most likely face candidates, which are used in the recognition step. In the next sections, we present how our system recognizes which face candidates are faces and which ones are not.
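The separation and rejection steps of functions (3) and (4) can be sketched in a few lines. This is a minimal sketch under explicit assumptions, not the authors' implementation: the histograms are taken to be per-row and per-column counts of skin pixels in a binary mask, function (3) is read as zeroing every bin below t, function (4) is read with the bounds H <= W <= 3H or W <= H <= 4W, and all function names and the `min_pixels` parameter are hypothetical.

```python
def row_col_histograms(mask):
    """Count skin pixels per row and per column of a binary mask (rows of 0/1)."""
    row_hist = [sum(row) for row in mask]
    col_hist = [sum(col) for col in zip(*mask)]
    return row_hist, col_hist


def suppress_concave_bins(hist, t):
    """Assumed form of function (3): keep bins with h(i) >= t, zero the others."""
    return [h if h >= t else 0 for h in hist]


def split_runs(hist):
    """Return inclusive (start, end) index pairs of consecutive non-zero bins.

    Each run corresponds to one separated part along that axis.
    """
    runs, start = [], None
    for i, h in enumerate(hist):
        if h > 0 and start is None:
            start = i
        elif h == 0 and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(hist) - 1))
    return runs


def is_face_like(height, width, skin_pixels, min_pixels):
    """Assumed form of function (4): enough skin pixels and a plausible aspect ratio."""
    big_enough = skin_pixels >= min_pixels
    aspect_ok = (height <= width <= 3 * height) or (width <= height <= 4 * width)
    return 1 if big_enough and aspect_ok else 0
```

With a column histogram of `[5, 6, 1, 7, 8]` and t = 3, the valley at index 2 is zeroed and `split_runs` returns two parts, columns 0 to 1 and columns 3 to 4. The value passed as `min_pixels` is a tuning choice; the text does not state the threshold used in function (4).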

4 A HYBRID CLASSIFICATION

We use a hybrid method for recognizing objects. The hybrid method of our system is a combination of template matching and appearance-based methods. It is a hierarchical classifier, shown in Fig. 5, applied to each face candidate to determine whether it is a human face or not.

Figure 5: Hierarchical classifier scheme.

4.1 Face and Non-face Class Selection

To use template matching and appearance-based methods, we first have to create the face database for training. Face detection would be an easy problem if we had clear models of the face and non-face classes. However, it is difficult to model the non-face class because anything that is not a face belongs to it. In our method, we collect 200 frontal and profile face images to create face samples. To create the non-face class, we choose three main non-face objects: arm (50 samples), hand (50 samples) and noise (100 samples). All samples are 72x93 color images. In our experiments, those samples are enough to represent the face and non-face classes.

4.2 A Modified Local Binary Patterns for Face Representation

The human face is a near-regular texture pattern generated by facial components and their configurations. Considering facial components such as eyebrow, eye, pupil, nose and face boundary, we select eight main different spatial templates, shown in Fig. 6, to preserve the shape information of facial components.

Figure 6: Eight main spatial templates, labeled (0) to (7).

With only those spatial templates, we can describe all facial components; for example, a union of templates d, b and c can describe an eyebrow. However, we combine both spatial and local texture information to improve the capacity for describing faces. Instead of comparing the central pixel P_C with each single neighborhood pixel as the original LBP operator did [8], our method compares the central pixel P_C with each pair of neighborhood pixels (P_i1, P_i2) chosen according to the spatial templates. The eight spatial templates form the eight binary digits of the mLBP number.
Therefore, the mLBP operator produces 256 different values. Equation (5) gives the computation of the mLBP number:

\[ \mathrm{mLBP} = \sum_{i=0}^{7} S_i(x)\, 2^{i} \tag{5} \]

where S_i is the i-th binary digit of the mLBP number,

\[ S_i(x) = \begin{cases} 1, & (P_C > P_{i1})\ (P_C > P_{i2}) \\ 0, & \text{otherwise} \end{cases} \tag{6} \]

In fact, mLBP gives us information about both local shapes, through the eight spatial templates, and local textures. We thus retrieve more information to represent face patterns effectively.

4.3 Mixed mLBP Histogram Matching

We use the histogram of mLBP coefficients to represent a face. If we use only a single mLBP histogram for the whole face candidate image, occlusion seriously affects the template-matching algorithm. To reduce the impact of occlusion, we divide the human face into two parts: the upper part from the nose up to the forehead and the lower part from the nose down to the neck. We calculate an individual histogram for each part and concatenate them to create one mixed 255x2-bin histogram representing the face candidate image. In this way, we effectively reduce the influence of occlusion. Given an image I, its mixed mLBP histogram is denoted by H^{mLBPmix}(I). We adopt an error measurement because it is simple and fast to compute. The distance measurement is defined as:

\[ D\big(H^{mLBPmix}(I_1), H^{mLBPmix}(I_2)\big) = \sum_{i=1}^{n} \left| H_i(I_1) - H_i(I_2) \right| \tag{7} \]

where H^{mLBPmix}(I_1) and H^{mLBPmix}(I_2) are two mixed mLBP 255x2-bin histograms, and n is the number of bins. Given a face database with m samples, for any sample P, we convert it from a color image to a grayscale one and define its histogram-matching feature as the average distance to the face training samples as follows:

\[ f_{face}(P) = \frac{1}{m} \sum_{i=1}^{m} D\big(H^{mLBPmix}(P), H^{mLBPmix}(X_i)\big) \tag{8} \]

where X_i is a face training sample. This histogram-matching feature can discriminate between face and non-face patterns. Figure 7(a), which shows the distribution of the distance measure over 156 face samples (positive) and 121 non-face samples (negative), demonstrates this property.

Figure 7: Distribution of distance measurement: (a) Distribution of f_face, (b) Distribution of D.

With the feature f_face, we use thresholds called T_face to classify face and non-face objects. In our experiment, if f_face is smaller than 1800, the face candidate can be considered a face with 99% correct detection. If f_face is bigger than 3500, the face candidate is almost certainly not a human face. Only in the range [1800, 3500] between the two thresholds T_face is it still hard to say whether the face candidate is a face. We improve face detection in this range with the following feature. With the non-face database, for any sample P, we also define its histogram-matching feature as the minimum of the three average distances to the training samples of the three non-face objects, given by

\[ f_{nonface}(P) = \min\big(f_{arm}(P), f_{hand}(P), f_{noise}(P)\big) \tag{9} \]

where f_arm, f_hand and f_noise are calculated following (8). The difference between f_nonface and f_face, shown in Fig. 7(b), can also discriminate between face and non-face patterns. We call this difference D, given by

\[ D(P) = f_{nonface}(P) - f_{face}(P) \tag{10} \]

We use D to improve the face detection rate when f_face is in [1800, 3500]. We define explicit thresholds T_D for D to distinguish face and non-face patterns. We specify matching conditions for both f_face and D and use them jointly for the first two matching stages of the hierarchical classifier. After those two stages, the face detection rate can reach over 80%. To increase the performance of our system, eHMMs are used as the last step to check face candidates that do not satisfy the first two template-matching steps and to give the final decision.

4.4 Embedded Hidden Markov Models

In our algorithm, we define the non-face class as three different sub-classes: arm, hand, and noise. This turns our face detection into a four-class pattern classification problem. eHMMs [9] perform pattern recognition for the four-class problem by determining the maximum likelihood to find the most similar class for the candidate object. Given training sets of positive and negative samples, we obtain four eHMM models corresponding to the four classes: face, arm, hand and noise. A face candidate that was not decided by the first two stages of the face detection system is checked by the eHMMs. The candidate is finally rejected as a human face if the eHMM stage assigns it to a non-face class.

Figure 8: Scheme of the eHMMs algorithm used in face detection.

If faces appear in the image, the result of our face detection system is the extracted human faces, as shown in Fig. 9.

Figure 9: An example result of the face detection system.

5 EXPERIMENTS

To evaluate our system, we built both static and video sequence facial databases. The static database includes a total of 500 color images from different sources: the Caltech face database, the Smugmug image library, family photos and the World Wide Web. The video sequence database was captured by different cameras and extracted from several movies such as Harry Potter. Both databases contain multi-face images with frontal and profile faces under variations in rotation, position, size, and facial expression.

With the proposed skin model, our system effectively detects multiple faces with various skin tones, facial expressions, and sizes. It can also detect profile faces at an angle of about or even more than 90 degrees under complex backgrounds. It also successfully detects occluded faces when the occlusion covers less than 50% of the face region. Our system can detect rotated faces as well, but it may fail to detect horizontal faces because of the mixed LBP histogram matching. Several examples are shown in Fig. 10 to show how well our system works. The summary results for those problems are given in Table 2. Finally, the correct face detection rate of our proposed system is 93%, which indicates that our method is more effective at detecting faces than mLBP histogram matching or eHMMs used alone, as shown in Table 3. The speed of our system is about 5 fps for a 320x240 image, so we can apply this face detection method in real HRI applications.

Figure 10: Several face detection results under multi-appearance, rotation, pose and.

Table 2: Face detection results under rotation, occlusion, connection and pose.

                          Rotation   Occlusion   Pose
Face detection rate (%)   93         75          92

Table 3: Comparison of face detection results between different methods.

                     Proposed algorithm   mLBP histogram matching   eHMMs
Detection rate (%)   93                   80                        64

6 CONCLUSIONS

In this paper, we proposed an improved color-based face detection method. The contributions of our paper are an improved skin-color model that removes more noise to obtain more reasonable face candidates, an improved LBP histogram matching to overcome the problem of rotation, and a hybrid classifier to reduce the effect of occlusion and pose. In particular, our system can detect profile faces with a pose angle of about 90 degrees. Our hybrid method shows a better face detection capacity than eHMMs or LBP histogram matching used separately. In future work, we plan to apply and develop this system for the face recognition task.

ACKNOWLEDGEMENTS

The authors would like to thank the Department of Engineering Physics, Hochiminh University of Technology, for supporting this work.

REFERENCES

1. M. H. Yang, D. J. Kriegman and N. Ahuja: Detecting Faces in Images: A Survey. IEEE Trans. on PAMI, vol. 24, no. 1, pp. 34-58, 2002.
2. R. Lienhart, J. Maydt: An Extended Set of Haar-like Features for Rapid Object Detection. Proc. of the IEEE Conf. on Image Processing (ICIP 02), 2002.
3. P. Viola and M. Jones: Robust Real-time Object Detection. International Journal of Computer Vision, 2002.
4. V. Vezhnevets, V. Sazonov and A. Andreeva: A Survey on Pixel-based Skin Color Detection Techniques. Proc. Graphicon, pp. 85-92, 2003.
5. P. Peer, J. Kovac and F. Solina: Human Skin Colour Clustering for Face Detection. EUROCON, 2003.
6. C. Lin and K. C. Fan: A Color-Triangle-Based Approach to the Detection of Human Face. BMCV, vol. 1811, pp. 359-368, May 2000.
7. R. Seguier: A Very Fast Adaptive Face Detection System. International Conference on Visualization, Imaging and Image Processing, 2004.
8. L. G. Shapiro and G. C. Stockman: Computer Vision. Prentice Hall, New Jersey, 2001.
9. A. Nefian and M. Hayes: Face Recognition using an embedded HMM. Proc. Audio and Video-based Biometric Person Authentication, pp. 19-44, 1999.