Face Tracking in Video


Hamidreza Khazaei and Pegah Tootoonchi Afshar
Stanford University, 350 Serra Mall, Stanford, CA 94305, USA

I. INTRODUCTION

Object tracking is an active area of research with many practical applications, including navigation, security, robotics, and vehicular communications. It has also found applications in biology, where it is used to study cells and mitosis. One of the most important applications of object tracking is face tracking. In face tracking applications, boosted cascade detectors have gained great popularity because of their efficiency in selecting features for faces. However, these detectors suffer from high misclassification rates, and they depend on the orientation of the face: they require the face to be in a fully frontal position. If the face deviates from this position, for example if it is tilted by 45 degrees or turns to a profile view, AdaBoost fails to detect it.

We implement a face tracking algorithm and compare it to other well-known tracking algorithms presented in the literature. Specifically, we enhance the AdaBoost algorithm and compare the resulting system to one that uses AdaBoost alone. Our algorithm has three components: an image representation, an appearance model, and a motion model. Images are represented using Haar-like features, which are computed efficiently with the integral image representation; this allows each feature to be evaluated in constant time. Once the features have been obtained, face detection is performed using two classifiers; the AdaBoost algorithm creates a classifier by combining a set of weak classifiers. The motion model tracks the aspect ratio and the center of the detected face. The paper is organized as follows.
Section II provides the necessary background and presents a detailed description of our model. Section III describes the experimental setup, Section IV presents the results, and Section V concludes and discusses possible future work.

II. BACKGROUND AND SYSTEM MODEL

A. Haar features

The features used to represent the pictures are the Haar features originally proposed by Papageorgiou et al. [1]. Three types of rectangular features can be computed; for example, a two-rectangle feature is computed as the difference between the pixel sums of two rectangular blocks. Viola and Jones proposed a fast method for computing these features, which makes them an attractive option for face detection [2]. Their method, called the integral image, can be computed in one pass over the picture and uses an intermediate array that stores, at each point (x, y), the sum of the pixels above and to the left of (x, y):

    ii(x, y) = Σ_{x' ≤ x, y' ≤ y} i(x', y'),    (1)

where ii(x, y) is the integral image and i(x', y') is the original image. With this representation, any rectangular sum, and therefore any Haar feature, can be computed efficiently. For example, the sum over the rectangular region D in Figure 1 is ii(4) − ii(3) − ii(2) + ii(1).
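As a concrete illustration, the integral image of Eq. (1) and the four-corner rectangle sum can be sketched in a few lines of NumPy (a minimal sketch; the function names are ours, not from the paper's implementation):

```python
import numpy as np

def integral_image(img):
    """ii(x, y): sum of all pixels above and to the left of (x, y), inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1+1, c0:c1+1] from the four corners: ii(4) - ii(3) - ii(2) + ii(1)."""
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

img = np.arange(16, dtype=np.int64).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))   # 30 (= 5 + 6 + 9 + 10)
print(img[1:3, 1:3].sum())        # 30, the same value computed directly

# A two-rectangle Haar feature: difference of two adjacent blocks.
feat = rect_sum(ii, 0, 0, 3, 1) - rect_sum(ii, 0, 2, 3, 3)
print(feat)                       # -16
```

Each rectangle sum costs four array lookups regardless of the rectangle's size, which is what makes evaluating many Haar features per window affordable.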

Fig. 1. Demonstration of the way the integral image is used to compute the sum of pixels in a rectangle.

B. AdaBoost

The number of Haar-like features in each window is much larger than the number of pixels, and it is not feasible to classify the image efficiently using all of the features. However, an effective classifier can be created using only a small subset of them. The algorithm used in this work is AdaBoost [3], which extracts the essential features to create a well-performing classifier. AdaBoost builds a classifier by combining a set of weak classifiers, each of which is trained on only one feature. The algorithm can be explained in two steps. First, each weak classifier optimizes its performance, i.e., minimizes the classification error over the training set. In this application, a weak classifier consists of a feature f_j, a threshold θ_j, and a parity p_j. It tries to find the threshold that correctly classifies the maximum possible number of examples; the parity determines the direction of the inequality sign:

    h_j(x) = 1 if p_j f_j(x) < p_j θ_j, and 0 otherwise.

After the first round of learning, a weight is assigned to each training example (in the first round all examples are weighted equally). In the next step, a strong classifier is constructed by combining the best-performing weak classifiers. It can be shown that the training error of the strong classifier approaches zero exponentially in the number of rounds [3]. The AdaBoost algorithm is summarized below.

    Given training examples (x_i, y_i), where y_i = 0, 1 for negative and positive examples respectively.
    Initialize the weights w_{1,i} = 1/(2m) for y_i = 0 and w_{1,i} = 1/(2l) for y_i = 1, where m and l are the numbers of negative and positive examples respectively.
    For t = 1, ..., T:
      1. Normalize the weights: w_{t,i} = w_{t,i} / Σ_j w_{t,j}.
      2. Train a classifier h_j for each feature f_j. The weighted error with respect to w_t is ε_j = Σ_i w_{t,i} |h_j(x_i) − y_i|.
      3. Choose the classifier h_t with the lowest error ε_t.
      4. Update the weights: w_{t+1,i} = w_{t,i} β_t^{1−e_i}, where e_i = 0 if example x_i is classified correctly and e_i = 1 otherwise, and β_t = ε_t / (1 − ε_t).

The final strong classifier is

    h(x) = 1 if Σ_t α_t h_t(x) ≥ (1/2) Σ_t α_t, and 0 otherwise,

where α_t = log(1/β_t).

C. Background Subtraction

Given a frame sequence from a fixed camera, all foreground objects are detected based on the difference between the current frame and a stored (and frequently updated) image of the background:

    if |Frame_ij − Background_ij| ≥ Threshold_ij, then Frame_ij is foreground; otherwise, Frame_ij is background.

The threshold was chosen proportional to the variance of the pixel values of the current frame (in a more complex model, each pixel could have its own threshold based on its gradual changes). After the foreground objects have been detected, the background image is updated as follows:

Fig. 2. Overview of the face tracking system.

    if Frame_ij was detected as background, Background_ij = α · Frame_ij + (1 − α) · Background_ij; otherwise Background_ij is left unchanged,

where α is the update rate, which was chosen to be a small value. Furthermore, to remove noise from the subtracted foreground image, we used the common morphological image-processing filters known as dilation and erosion. Dilation generally increases the size of objects, while erosion causes objects to shrink; how much, and in what way, they affect the objects depends on their structuring elements. Put simply, dilation takes each binary object pixel (value 1) and sets all background pixels (value 0) connected to it (as determined by the structuring element) to 1, whereas erosion takes each binary object pixel that is connected to background pixels and sets its value to 0 [5].

D. Overall Model

To display its results, the face tracking system draws a rectangle around the region it has recognized as a face. The system maintains a set of parameters, namely the center, width, height, and aspect ratio of the rectangle, for both the previous frame and the current frame. The system initializes when the AdaBoost classifier detects the first instance of a face, and all parameters are initialized accordingly. In the next frame, the AdaBoost classifier again tries to draw a rectangle around the position of the face. If the classifier cannot find a face, or if the rectangle's parameters differ significantly from those of the previous rectangle, then the background-subtraction algorithm described in Section II-C processes the image and removes the

background. The system then draws a number of random rectangles, N_rectangles, in the vicinity of the previous center. Within each rectangle the sum of the pixels is calculated, and the rectangle with the maximum sum is selected. The parameters are then updated using this rectangle so that they can be used in the next frame. An overview of the entire system is shown in Figure 2.

TABLE I
COMPARISON OF THE ADABOOST-ONLY ALGORITHM AND THE ALGORITHM DESCRIBED IN THIS PAPER (TOTAL OF 1803 FRAMES)

                          Number of frames missed    Percent missed
    AdaBoost only         726                        40.3%
    Proposed algorithm    193                        10.7%

III. EXPERIMENT SETUP

A 480 x 720 video with 1803 frames was used to test the system. As described in Section II-D, when the parameters of the current rectangle are significantly different from those of the previous rectangle, the background-subtraction algorithm is used. Two criteria were used to decide whether the previous and current rectangles differ. First, the difference between the centers of the rectangles was calculated; if it was greater than ten percent of the video width (720) in the x-direction or ten percent of the video height (480) in the y-direction, the two rectangles were classified as significantly different. Second, if the aspect ratios of the two rectangles differed by more than ten percent, the rectangles were likewise classified as significantly different. The threshold described in Section II-C was set to sixty percent of the variance of the frame; this value was selected to account for the change in lighting from frame to frame. Finally, the N_rectangles parameter described in Section II-D was set to ten.

IV. RESULTS

The face tracking system was tested using the experimental setup described above and was compared to a face tracking system that uses only an AdaBoost classifier to track faces. The results are summarized in Table I. As can be seen, our face tracking system outperformed the AdaBoost-only system: we reduced the number of missed frames from 726 to 193, a significant improvement. It must be noted that the results do not distinguish between no detection and a false-positive detection; in other words, we count a missed frame as either a false positive or a frame with no detection.

V. CONCLUSIONS

The face tracking algorithm proposed in this paper performs very well, and it was shown to enhance performance significantly. The main drawback of the algorithm is that it is slow, which means face tracking cannot be done in real time. Currently, the face tracking system is implemented in Matlab, and reading in a frame and processing it takes a long time. To increase the speed of the system, the algorithm could be implemented in C++, where reading and writing files can be done efficiently. Another possible modification is to reduce the number of times the AdaBoost classifier is used to detect the face: instead of running it on every frame, the algorithm could call it only every ten or twenty frames. These implementation changes are left for future work to increase the speed of the face tracking algorithm.

REFERENCES

[1] C. Papageorgiou, M. Oren, and T. Poggio, "A general framework for object detection," International Conference on Computer Vision, 1998.
[2] P. Viola and M. Jones, "Robust real-time object detection," Second International Workshop on Statistical and Computational Theories of Vision - Modeling, Learning, Computing, and Sampling, 2001.
[3] Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," in Computational Learning Theory: EuroCOLT '95, pp. 23-37, Springer-Verlag, 1995.

[4] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, "Detecting moving objects, ghosts and shadows in video streams," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1337-1342, Oct. 2003.
[5] "Image Processing Fundamentals," Delft University of Technology.
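For concreteness, the overall model of Section II-D together with the fallback criteria of Section III can be sketched as follows. This is a minimal illustrative sketch, not the authors' Matlab implementation: detect_face and subtract_background stand in for the AdaBoost cascade and the background-subtraction step, rectangles are plain dictionaries, and all names and the +/- 20 pixel search range are our own choices.

```python
import numpy as np

N_RECT = 10                      # random candidate rectangles per fallback (Section III)
RNG = np.random.default_rng(0)   # fixed seed for reproducibility

def significantly_different(prev, cur, width, height):
    # Section III criteria: centers differ by more than ten percent of the
    # frame dimensions, or aspect ratios differ by more than ten percent.
    if abs(prev["cx"] - cur["cx"]) > 0.1 * width:
        return True
    if abs(prev["cy"] - cur["cy"]) > 0.1 * height:
        return True
    return abs(prev["ar"] - cur["ar"]) / prev["ar"] > 0.1

def fallback_rect(prev, fg_mask):
    # Draw N_RECT random rectangles near the previous center and keep the
    # one whose foreground-pixel sum is largest (Section II-D).
    h, w = fg_mask.shape
    best, best_sum = prev, -1
    for _ in range(N_RECT):
        cx = int(np.clip(prev["cx"] + RNG.integers(-20, 21), 0, w - 1))
        cy = int(np.clip(prev["cy"] + RNG.integers(-20, 21), 0, h - 1))
        x0, x1 = max(cx - prev["w"] // 2, 0), min(cx + prev["w"] // 2, w)
        y0, y1 = max(cy - prev["h"] // 2, 0), min(cy + prev["h"] // 2, h)
        s = int(fg_mask[y0:y1, x0:x1].sum())
        if s > best_sum:
            best_sum, best = s, dict(prev, cx=cx, cy=cy)
    return best

def track(frames, detect_face, subtract_background, width, height):
    # The system initializes on the first AdaBoost detection; afterwards it
    # falls back to background subtraction whenever the detector fails or
    # the new rectangle differs significantly from the previous one.
    prev, out = None, []
    for frame in frames:
        cur = detect_face(frame)  # rectangle dict, or None if no face found
        if prev is not None and (cur is None or
                                 significantly_different(prev, cur, width, height)):
            cur = fallback_rect(prev, subtract_background(frame))
        prev = cur if cur is not None else prev
        out.append(cur)
    return out
```

The same loop structure would accommodate the speed-up suggested in the conclusions: calling detect_face only every ten or twenty frames and relying on the fallback path in between.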