Implementation of a Face Recognition System for Interactive TV Control System

Sang-Heon Lee 1, Myoung-Kyu Sohn 1, Dong-Ju Kim 1, Byungmin Kim 1, Hyunduk Kim 1, and Chul-Ho Won 2

1 Dept. IT convergence, DGIST, Daegu, Republic of Korea
{pobbylee, smk, radioguy, bmkim, hyunduk00}@dgist.ac.kr
2 Dept. of High Tech. Medical System, Kyungil University, Daegu, Republic of Korea
chulho@kiu.ac.kr

Abstract. In this paper, we propose a face recognition system that can be applied to an Interactive TV Control System (ITCS). The face recognition system consists of three subsystems. The first registers the user's face, the second detects the user's face, and the third recognizes the user from the detected face. The face recognition system is used with an ITCS to provide personalized services such as the selection of favorite channels or parental guidance. To detect a face, we extract Uniform Local Binary Pattern (ULBP) histogram features from near-infrared face images and use a Support Vector Machine (SVM) as the classifier. To recognize a face, we extract local Gabor binary pattern histogram sequences (LGBPHS) and compare faces using a chi-square distance measure.

Keywords: face recognition, Interactive TV, Near Infrared, face detection, local Gabor binary pattern histogram sequences (LGBPHS).

1 Introduction

Face recognition systems are widely studied for use in various applications, such as surveillance systems, human-like robots, smile-shot cameras, and driver-drowsiness detection systems in intelligent vehicles [1]. Face recognition systems consist of three subsystems. The first registers the user's face and the second detects the user's face. The main aim of face detection is to determine whether an image contains any faces and, if so, to return each face's location. The face registration subsystem requires that the users initially
register their faces in the system. The face recognition subsystem is used after face registration and detection. After face detection, the detected face image is stored in a face database.

We used uniform local binary pattern (ULBP) features and a support vector machine (SVM) classifier for face detection [2]. The most important properties of the LBP feature are its computational simplicity and its invariance to monotonic gray-scale transformations. ULBPs are a subset of all theoretically possible LBPs; with eight sampling points there are 58 uniform patterns, and a histogram with one additional bin for all non-uniform patterns has 59 bins in total. The SVM, a binary classification method, is used for face detection, and its equal error rate (EER) is determined by varying the decision threshold. For face recognition, Local Gabor Binary Pattern Histogram Sequences (LGBPHS) are used as feature vectors, and the chi-square measure is used as the distance between histograms.

The face recognition system is the primary module of an ITCS, which is widely used at present and can provide personalized services such as the selection of favorite channels or parental control services. Fig. 1 shows a scenario of interactive TV using a face recognition system. In this paper, an implementation method for a face recognition system for an ITCS is proposed.

Fig. 1. Scenario of Interactive TV using a face recognition system

2 Face Recognition System for an ITCS

2.1 Face Detection with ULBPs (Uniform Local Binary Patterns)

2.1.1 LBP (Local Binary Pattern) feature

Due to its high discriminative power, robustness against illumination changes, and computational simplicity, the LBP has been widely adopted in various applications, such as visual inspection, image and video retrieval, biomedical image analysis, and facial image analysis [3]. The value of the LBP operator for a center pixel is given in Eq. (1), where $P$ is the number of neighboring pixels, $R$ is the radius of the circle, $g_c$ is the gray value of the center pixel, and $g_p$ ($p = 0, \ldots, P-1$) are the gray values of the $P$ equally spaced pixels on that circle. As shown in Fig. 2, the value of the center pixel is subtracted from the value of each neighbor, and the sign of the difference is encoded as 1 or 0. The resulting bits form the LBP binary code of the center pixel, from which the decimal value is obtained.

$$\mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{1}$$

The input image is scanned pixel by pixel, and the operator outputs are accumulated into a discrete LBP histogram. The LBP histogram contains information about the distribution of local micro-patterns, such as edges, spots, and flat areas, over the entire image, so it can be used to describe the image's characteristics statistically. Face images can be seen as compositions of such micro-patterns and can therefore be described effectively by LBP histograms.

Fig. 2. Circularly symmetric neighbor sets for LBP.

A Local Binary Pattern is called uniform (ULBP) if it contains at most two bitwise transitions from 0 to 1 or vice versa when the binary string is considered to be circular. Fig. 3 shows the templates for ULBPs using eight sampling points. Black points correspond to a binary value of 1, and white points to 0.

Fig. 3. Templates for ULBPs using eight sampling points
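The LBP operator of Eq. (1) and the uniformity test can be illustrated with a short Python sketch. It is not the implementation used in this work: it assumes P = 8 and approximates the circle of radius R = 1 by the 3 × 3 square neighborhood (no interpolation), and the function names and the neighbor ordering are chosen here for illustration only.

```python
# Illustrative sketch of an 8-neighbor LBP and a uniform-LBP histogram.
# Assumptions (not from the paper): 3x3 square neighborhood instead of an
# interpolated circle, and a fixed clockwise neighbor order starting top-left.

OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Return the LBP code of pixel (r, c); img is a list of rows of gray values."""
    center = img[r][c]
    code = 0
    for p, (dr, dc) in enumerate(OFFSETS):
        # s(g_p - g_c) = 1 when the neighbor is at least as bright as the center
        if img[r + dr][c + dc] >= center:
            code |= 1 << p
    return code


def transitions(code, bits=8):
    """Count 0/1 transitions in the bit string of `code`, read circularly."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")


def is_uniform(code):
    """A pattern is uniform if it has at most two circular transitions."""
    return transitions(code) <= 2


# Each uniform code gets its own bin; all non-uniform codes share the last bin.
UNIFORM_CODES = [code for code in range(256) if is_uniform(code)]
BIN_OF = {code: i for i, code in enumerate(UNIFORM_CODES)}
NON_UNIFORM_BIN = len(UNIFORM_CODES)


def ulbp_histogram(img):
    """Accumulate the uniform-LBP histogram over all interior pixels of img."""
    hist = [0] * (len(UNIFORM_CODES) + 1)
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[BIN_OF.get(lbp_code(img, r, c), NON_UNIFORM_BIN)] += 1
    return hist
```

For eight sampling points the sketch enumerates 58 uniform codes and one shared bin for the remaining codes, which matches the 59 patterns mentioned in the Introduction.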
2.2 Face recognition using the LGBPHS

Fig. 4. Flowchart of the face detection and recognition system

Fig. 4 shows the flow of the face detection and recognition system. We used the local Gabor binary pattern histogram sequence (LGBPHS) method for face recognition. The LGBPHS jointly makes use of Gabor filters and local binary patterns (LBPs). Here, Gabor magnitude images at four resolutions and eight orientations are converted to LBP images. Each LBP image is further divided into non-overlapping rectangular regions of size 4 × 4, and a histogram is computed for each region, as shown in Fig. 5. Finally, the LBP histograms of all the LBP images are concatenated to form the histogram sequence that serves as the face feature. For face recognition, the chi-square distance is used as the similarity measure between two LGBPHSs.

Fig. 5. Gabor filtering of a face image: (a) magnitude map and (b) LBP images of (a)

2.3 Interactive TV Control System

We implemented an interactive TV control system with gesture recognition and face recognition capabilities. The overall architecture is illustrated in Fig. 6. The system has two parts: the RTU and the IUI. The RTU (Recognition to UI) is the core recognition subsystem, which contains the face recognition module and the gesture recognition module. The IUI (Interactive User Interface) is a graphical user interface subsystem controlled by the RTU. The user's face is recognized in the RTU and the results are sent to the IUI. The RTU and the IUI are implemented as separate processes on the operating system (OS). Face recognition is followed by personalized services, such as a favorite channel, or other services. After the user authentication process, the last channel that the user watched is played. Three menus are provided while the channel is being played: the change-volume menu, the change-channel menu, and the function menu. In the function menu, the user can, for instance, turn off the TV or enter their user information. Fig. 7 shows the graphical user interface on the screen.

Fig. 6. Overall architecture of the ITCS

Fig. 7. Screen captures of the ITCS: (a) main menu and (b) face recognition menu

3 Experimental Results

3.1 Database and Training

In this study, we used our own database (the DGIST database) and the SVM classifier to evaluate the face detection and recognition method. The database includes variations that make face detection more difficult, such as smiling, looking up or down, tilting one's head, and wearing glasses. It contains images of 58 people, with 100 to 240 images per person. Each image is 320 × 240 or 640 × 480 pixels in size. For SVM training, the face image set (positive) is composed of 10,865 images of 32 × 32 pixels. The non-face image set (negative) is derived from the 320 × 240 images of the database by a sub-window method and is composed of 19,848 images of the same size.
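The sub-window extraction of 32 × 32 training patches from a 320 × 240 image can be sketched as follows. The window stride and the way face regions were excluded from the negative set are not stated here, so the sketch assumes non-overlapping windows (stride equal to the window size); the names `sub_windows` and `crop` are hypothetical.

```python
# Illustrative sketch of sub-window extraction for building a patch set.
# Assumption (not from the paper): stride equals the window size.

def sub_windows(width, height, size=32, stride=32):
    """Yield the (x, y) top-left corner of every window that fits in the image."""
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            yield x, y


def crop(img, x, y, size=32):
    """Cut a size x size patch out of img, given as a list of pixel rows."""
    return [row[x:x + size] for row in img[y:y + size]]


if __name__ == "__main__":
    image = [[0] * 320 for _ in range(240)]          # dummy 320 x 240 image
    patches = [crop(image, x, y) for x, y in sub_windows(320, 240)]
    print(len(patches), "patches of", len(patches[0][0]), "x", len(patches[0]))
```

A smaller stride would produce overlapping windows and therefore more patches per image; which setting produced the 19,848 negative images is not specified.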
3.2 Face detection evaluation

Face detection was performed on an image pyramid in which each level is scaled by a factor of 1/1.1 relative to the previous one. The test data set is composed of 11,092 images of 39 people, each 320 × 240 pixels in size. With the SVM classifier, the overall face detection rate was 99.93%.

3.3 Face recognition result

The face recognition experiment was performed using a total of 310 images: 10 images for each of 31 people. For each person, the 10 images were divided into two groups, five reference images and five test images. Using LGBPHS, the final recognition accuracy was 96.77%.

4 Conclusions

In this paper, a face recognition system that can be applied to an interactive TV control system (ITCS) is proposed. In our experiments, the face detection rate was 99.93% and the face recognition accuracy was 96.77%. These results suggest that the proposed system is suitable as a commercial product for interactive TV systems. A gesture recognition system can be applied to this TV system later as future work.

Acknowledgments. This work was supported by the DGIST R&D program of the Ministry of Education, Science and Technology of Korea (12-IT-03).

References

1. S. Z. Li and A. K. Jain, Handbook of Face Recognition, Springer, 2005.
2. N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, 2000.
3. T. Ojala, M. Pietikäinen, and T. Mäenpää, Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 7, July 2002, pp. 971-987.