Face Recognition Based on SVM and 2DPCA

International Journal of Signal Processing, Image Processing and Pattern Recognition
Vol. 4, No. 3, September, 2011

Thai Hoang Le, Len Bui
Faculty of Information Technology, HCMC University of Science
Faculty of Information Sciences and Engineering, University of Canberra
lhthai@fit.hcmus.edu.vn, len.bui@canberra.edu.au

Abstract

This paper presents a novel approach to the face recognition problem. Our method combines 2D Principal Component Analysis (2DPCA), one of the prominent methods for extracting feature vectors, and Support Vector Machine (SVM), a powerful discriminative method for classification. Experiments based on the proposed method have been conducted on two public data sets, FERET and AT&T; the results show that the proposed method can improve classification rates.

Keywords: 2DPCA, SVM.

1. Introduction

Human faces contain a lot of important biometric information. This information can be used in a variety of civilian and law enforcement applications. For example, identity verification for physical access control in buildings or security areas is one of the most common face recognition applications. At the access point, an image of the claimant's face is captured by a camera and compared with stored images of the claimed person. Access is granted only if the images match. For high-security areas, a combination with card terminals is possible, so that a double check is performed.

Figure 1. Image Representations in PCA and 2DPCA

Since Matthew Turk and Alex Pentland [1] used Principal Component Analysis (PCA) to deal with the face recognition problem, PCA has become the standard method to extract feature vectors in face recognition because it is stable and has good performance. Nevertheless, PCA cannot capture all local variances of images unless this information is explicitly provided in the training data. To deal with this problem, some researchers proposed other approaches. For example, Wiskott et al. [2] suggested a technique known as elastic bunch graph matching to extract local features of face images.
Penev and Atick [3] proposed using local features to represent faces; they used PCA to extract local feature vectors and reported a significant improvement in face recognition. Bartlett et al. [4] proposed using independent component analysis (ICA) for face representation to extract higher-order dependencies of face images that cannot be represented by Gaussian distributions, and reported that it performed better than PCA. Ming-Hsuan Yang [5] suggested Kernel PCA (a nonlinear subspace method) for face feature extraction and recognition and reported that it outperformed PCA (a linear subspace method). However, the computational costs of these methods are higher than that of PCA. To solve these problems, Jian Yang [6] proposed a new method called 2D Principal Component Analysis (2DPCA). In conventional PCA, face images are represented as vectors, for example by concatenation. In contrast, 2DPCA represents face images by matrices, i.e. 2D images, instead of vectors (Fig. 1). Using 2D images directly is simple and preserves the local information of the original images, which may yield more important features for facial representation.

In face identification, some face images are easy to recognize, but others are hard to identify; for example, frontal face images are easier to recognize than profile face images. Therefore, we propose a weighted 2DPCA model to deal with this difficulty.

In 1995, Vapnik and Cortes [7] presented the foundations of the Support Vector Machine (SVM). Since then, it has become a prominent method for solving problems in pattern classification and regression. The basic idea behind SVM is to find the optimal linear hyperplane such that the expected classification error for future test samples is minimized, i.e., to achieve good generalization performance. The goal of a classifier is not simply to reach the lowest training error. For example, a k-NN classifier with k=1 achieves a training accuracy of 100%; in practice, however, it is often a poor classifier because it has high structural risk.

Figure 2. Curves of Testing Error and Training Error (error, from low to high, plotted against model risk or model complexity, from low to high; the goal is the point of lowest testing error)

They suggested the formula testing error = training error + risk of model (Fig. 2).
To achieve the lowest testing error, they proposed the structural risk minimization inductive principle. It states that a discriminative function that classifies the training data accurately and belongs to a set of functions with the lowest VC dimension will generalize best, regardless of the dimensionality of the input space. Based on this principle, an optimal linear discriminative function is found. For linearly non-separable data, SVM maps the input to a higher-dimensional feature space where a linear hyperplane can be found. Although there is no guarantee that a linear solution will always exist in the higher-dimensional space, effective solutions can be found in practice.

Many researchers [8-11] have applied SVM to face gender classification and reported very positive experimental results. In our research, we combine the strengths of both methods, weighted 2DPCA and SVM, to solve the face recognition problem.
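The remark above that a k-NN classifier with k=1 reaches 100% training accuracy can be checked with a small sketch. Everything in it is an illustrative assumption rather than part of our experiments: the toy 2D data, the 30% label noise, and the helper names `nearest_neighbor_label`, `accuracy` and `sample`.

```python
import random


def nearest_neighbor_label(query, points, labels):
    """Return the label of the training point closest to `query` (squared L2)."""
    best = min(
        range(len(points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], query)),
    )
    return labels[best]


def accuracy(queries, truths, points, labels):
    """Fraction of queries whose 1-NN prediction equals the given label."""
    hits = sum(
        nearest_neighbor_label(q, points, labels) == t
        for q, t in zip(queries, truths)
    )
    return hits / len(queries)


def sample(rng, n, noise=0.3):
    """Hypothetical toy data: class is the sign of the first coordinate,
    with a fraction `noise` of the labels flipped at random."""
    points, labels = [], []
    for _ in range(n):
        p = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        clean = 1 if p[0] > 0 else -1
        labels.append(-clean if rng.random() < noise else clean)
        points.append(p)
    return points, labels


rng = random.Random(0)
train_x, train_y = sample(rng, 200)
test_x, test_y = sample(rng, 200)

# Each training point is its own nearest neighbor, so the training error is zero.
train_acc = accuracy(train_x, train_y, train_x, train_y)
# On unseen points the memorized label noise is reproduced.
test_acc = accuracy(test_x, test_y, train_x, train_y)
print(train_acc, test_acc)
```

The training accuracy is perfect by construction, while the accuracy on the second sample is lower, which is the gap between training error and testing error that Fig. 2 refers to.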

The remaining sections of our paper discuss the implementation of our face recognition system, the related theory, and the experiments. Section 2 gives details of 2DPCA. Section 3 discusses how to use SVM in face classification. Section 4 describes the implementation and experiments. Finally, Section 5 is our conclusion.

2. 2D Principal Component Analysis

2.1. Face Model Construction

As mentioned above, we propose a weighted 2DPCA to deal with practical situations in which some face images in the database are difficult to identify due to their poses (frontal or profile) or their quality (noise, blur).

Training data $D = \{(A^{(i)}, w^{(i)}),\ i = 1, \ldots, N\}$, where $A^{(i)}$ is the $i$-th training image and $w^{(i)}$ is its weight.

Algorithm 1: Construct proposed face model

Step 1: Compute the mean image

$$\bar{A} = \frac{1}{\sum_{i=1}^{N} w^{(i)}} \sum_{i=1}^{N} w^{(i)} A^{(i)} \tag{1}$$

Step 2: Compute matrix

$$G = \frac{1}{\sum_{i=1}^{N} w^{(i)}} \sum_{i=1}^{N} w^{(i)} \left(A^{(i)} - \bar{A}\right)^{T} \left(A^{(i)} - \bar{A}\right) \tag{2}$$

Step 3: Compute eigenvectors $\{\Omega_1, \Omega_2, \ldots, \Omega_n\}$ and eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_n\}$ of $G$.

2.2. Feature Extraction

First, the projection of an image $A$ onto the 2DPCA space is the matrix $X = [X_1, X_2, \ldots, X_d]$, where

$$X_k = \left(A - \bar{A}\right) \Omega_k, \quad k = 1, \ldots, d \tag{3}$$

Second, the matrix $X$ is projected onto a PCA space to convert the matrix to a vector and reduce the dimension.

3. Support Vector Machine

The goal of SVM classifiers is to find a hyperplane that separates the largest fraction of a labeled data set $\{(x^{(i)}, y^{(i)});\ x^{(i)} \in \mathbb{R}^n;\ y^{(i)} \in \{-1, +1\};\ i = 1, \ldots, N\}$. The most important requirement on the classifier is that it maximizes the distance, or margin, between each class and the hyperplane (Fig. 3). In most real applications, the data cannot be linearly separated. To deal with this problem, we transform the data into a higher-dimensional feature space and assume that the data in this space can be linearly separated (see Fig. 4).
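Algorithm 1 and the first projection step of Section 2.2 can be sketched as follows. This is a minimal illustration, not our implementation: it assumes NumPy, it assumes that the weights enter both the mean image and the matrix G as a normalized weighted sum, and it assumes that the d eigenvectors with the largest eigenvalues are kept. The names `fit_weighted_2dpca` and `project` are hypothetical, and random arrays stand in for face images. The second stage (PCA on the flattened projection) is omitted.

```python
import numpy as np


def fit_weighted_2dpca(images, weights, d):
    """images: (N, h, w) array; weights: (N,) array; d: number of axes kept.

    Returns the weighted mean image, the (w, d) projection axes and the
    d largest eigenvalues of G.
    """
    images = np.asarray(images, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()

    # Step 1: weighted mean image, shape (h, w).
    mean = np.tensordot(weights, images, axes=1) / total

    # Step 2: weighted image scatter matrix G, shape (w, w):
    # G = sum_i w_i (A_i - mean)^T (A_i - mean) / sum_i w_i
    centered = images - mean
    G = np.einsum("i,ihj,ihk->jk", weights, centered, centered) / total

    # Step 3: eigen-decomposition of the symmetric matrix G.
    # eigh returns eigenvalues in ascending order, so reorder them.
    eigvals, eigvecs = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1][:d]
    return mean, eigvecs[:, order], eigvals[order]


def project(image, mean, axes):
    """Project one image onto the 2DPCA axes: X = (A - mean) @ [Omega_1..Omega_d]."""
    return (np.asarray(image, dtype=float) - mean) @ axes


# Toy stand-in data: six 8x5 "images" with hypothetical weights.
rng = np.random.default_rng(0)
images = rng.random((6, 8, 5))
weights = [3, 2, 1, 3, 2, 1]

mean, axes, eigvals = fit_weighted_2dpca(images, weights, d=2)
features = project(images[0], mean, axes)
print(features.shape)  # (8, 2): one d-column feature matrix per image
```

Keeping the image as a matrix means G is only w-by-w (here 5-by-5), which is much smaller than the covariance matrix PCA would build from flattened h*w vectors.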

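$$\Phi : x \in \mathbb{R}^n \mapsto \Phi(x) \in \mathbb{R}^m \tag{4}$$

In fact, determining the optimal hyperplane is a constrained optimization problem and can be solved using quadratic programming techniques. The discriminant hyperplane is defined as follows:

$$y(x) = \sum_{i=1}^{N} \alpha_i\, y^{(i)} K\!\left(x^{(i)}, x\right) + b \tag{5}$$

where $K(x', x'')$ is the kernel function.

Figure 3. An SVM Classifier

Figure 4. Input Space and Feature Space

3.1. Classifier Construction Phase

Algorithm 2: Construct classifier

Step 1: Compute matrix H

$$H_{ij} = y^{(i)} y^{(j)} K\!\left(x^{(i)}, x^{(j)}\right) \tag{6}$$

Step 2: Use a quadratic solver to solve the optimization problem with objective function:

$$\alpha = \arg\min_{\alpha} \left( \frac{1}{2} \alpha^{T} H \alpha - \sum_{i=1}^{N} \alpha_i \right) \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{N} \alpha_i\, y^{(i)} = 0 \tag{7}$$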

Step 3: Compute b

$$idx = \{\, i : \alpha_i > 0 \,\}, \qquad b = \frac{1}{|idx|} \sum_{i \in idx} \left( y^{(i)} - \sum_{j \in idx} \alpha_j\, y^{(j)} K\!\left(x^{(j)}, x^{(i)}\right) \right) \tag{8}$$

3.2. Classification Phase

Algorithm 3: Classify

Step 1: Compute the value y

$$y = \operatorname{sgn}\!\left( \sum_{i=1}^{N} \alpha_i\, y^{(i)} K\!\left(x^{(i)}, x\right) + b \right) \tag{9}$$

Step 2: Classify x
if y = +1 then x belongs to class {+1}
if y = -1 then x belongs to class {-1}

3.3. SVM for Face Identification

To apply SVM to face recognition, we use the One-Against-All decomposition to transform the multi-class problem into a set of two-class problems.

The training set $D = \{(x^{(i)}, y^{(i)});\ x^{(i)} \in \mathbb{R}^n;\ y^{(i)} \in \{1, \ldots, c\};\ i = 1, \ldots, N\}$, where $c$ is the number of classes, is transformed into a series of sets

$$D_k = \{(x^{(i)}, y_k^{(i)});\ y_k^{(i)} \in \{-1, +1\}\} \tag{10}$$

where

$$y_k^{(i)} = \begin{cases} +1 & \text{if } y^{(i)} = k \\ -1 & \text{if } y^{(i)} \neq k \end{cases} \tag{11}$$

Algorithm 2 is used to compute the discriminant function corresponding to each $D_k$:

$$f_k(x) = \sum_{i=1}^{N} \alpha_i^{k}\, y_k^{(i)} K\!\left(x^{(i)}, x\right) + b^{k} \tag{12}$$

In the classification phase, we use the following rule to identify the class of an input x:

$$\hat{k} = \arg\max_{k} f_k(x) \tag{13}$$

4. Implementation and Experiments

We selected the FERET and AT&T databases to evaluate our approach. The FERET database [12] was collected at George Mason University between August 1993 and July 1996. It contains 1564 sets of images for a total of 14,126 images that include 1199 individuals and 365 duplicate sets of images. In our experiments, face regions of FERET images were identified and extracted from the background of the input images using the ground truth information of the images. Some images do not contain information on face locations; in this case, we used the well-known algorithm developed by Viola and Jones [13, 14] to find face positions. The face regions were then scaled to 50-by-50 resolution. In the dataset building task, we constructed a dataset D containing 1000 individuals chosen from the sets fa, fb, fc, dup1 and dup2 of the 1996 FERET database. All images of the dataset D are frontal face images. Next, we randomly divided the dataset into 3 separate subsets A, B and C. The reported results were obtained with cross-validation analysis on these subsets. We also used the training set M of the database provided by FERET for PCA feature extraction and 2DPCA feature extraction.

The AT&T database was taken at AT&T Laboratories. It contains 400 images (92-by-112) of 40 individuals; each person has ten images. We performed the same tasks to build datasets for experiments.

Figure 5. a) Three faces from AT&T b) Three processed faces from FERET

4.1. Experiments on AT&T Database

We implemented five methods to conduct experiments on the AT&T database:

MLP (PCA): This method uses PCA to extract feature vectors and a Multi Layer Perceptron (MLP) for classification. The MLP has three layers: the input layer has 163 nodes, the hidden layer has 100 nodes, and the output layer has 40 nodes. The MLP is trained with the Gradient Back-Propagation algorithm. The activation function of the MLP is the sigmoid function (14), and the learning rate ranges between 0.3 and 0.7.

$$f(x) = \frac{1}{1 + e^{-x}}, \quad f(x) \in [0, 1] \tag{14}$$

k-NN (PCA): We use PCA to obtain feature vectors and employ k-Nearest Neighbor (k-NN) with the L2 distance metric for classification.

SVM (PCA): This method uses PCA to get feature vectors and applies SVM with two kernel functions (Polynomial and Radial Basis Function, RBF) for classification (15). The degree $d$ of the Polynomial kernel is 3; for the RBF kernel we used values $C \in \{2^{-5}, \ldots, 2^{14}\}$ and $\gamma \in \{2^{-15}, \ldots, 2^{3}\}$.

$$K(x, x') = \left(x^{T} x' + 1\right)^{d}, \qquad K(x, x') = e^{-\gamma \lVert x - x' \rVert_2^2} \tag{15}$$

k-NN (2DPCA): This method uses our proposed 2DPCA to get feature vectors and employs k-NN for classification.

SVM (2DPCA): This method uses the proposed 2DPCA to get feature vectors and SVM for classification.

We used the subset M to create the PCA feature extractor. The default dimension of the feature vector is k=163. With this k, we obtain a reasonable PCA reconstruction error of MSE = 0.0015. We also used the same subset M to create the 2DPCA feature extractor. The weight of each training image is its rotation angle. The dimension of the feature vector is k = 20. For each method, we conducted three experimental trials on subsets A, B and C; that is, we trained classifiers on two subsets and evaluated them on the remaining subset. The average performance scores are reported in Table I. The cumulative match score vs. rank curve for each method is shown in Fig. 6. Each point on a curve is the percentage of correct matches in the top n matches, up to rank 5.

The experimental results indicate that our proposed method for feature extraction is better than the PCA and 2DPCA methods. As mentioned above, PCA is a method for reducing the dimension; there is no mathematical evidence that it will increase the recognition rate. Our method has an advantage over traditional 2DPCA because it can create a subspace that preserves some important discriminative information of face images, such as pose.

The experimental results also show that MLP gave the lowest classification accuracy and SVM the highest. MLP is prone to overfitting because its training focuses on finding the lowest error rate, even though we used techniques such as cross-validation to limit this weakness. In contrast, SVM consistently gave a suitable solution.

TABLE I. EXPERIMENT RESULTS ON AT&T DATABASE

| Feature extraction | Classification | Accuracy (%) |
|--------------------|----------------|--------------|
| PCA                | MLP            | 75.2         |
| PCA                | k-NN           | 95.2         |
| PCA                | SVM            | 95.7         |
| 2DPCA              | k-NN           | 96.2         |
| 2DPCA              | SVM            | 97.3         |

Figure 6. Identification Performance on AT&T Database (cumulative match score, 75 to 100, vs. rank, 1 to 5, for MLP (PCA), k-NN (PCA), SVM (PCA), k-NN (2DPCA) and SVM (2DPCA))

4.2. Experiments on FERET Database

We implemented four methods to conduct experiments on the FERET database: k-NN (PCA), SVM (PCA), k-NN (2DPCA) and SVM (2DPCA). We built the feature extractors in the same way. First, we used the subset M to create the PCA feature extractor. The default dimension of the feature vector is k=100. Then, we used the same subset M to create the 2DPCA feature extractor. In our experiments, we set the weight to 3 for female subjects, 2 for male subjects, and 1 for individuals wearing glasses; that is, an image that is easier to recognize has a higher weight. The dimension of the feature vector is k = 10. We conducted three experimental trials on subsets A, B and C for each method. The average performance scores are reported in Table II, and the cumulative match score vs. rank curve (up to rank 50) for each method is shown in Fig. 7. 2DPCA with SVM for classification again achieves the best performance on the FERET dataset.

TABLE II. EXPERIMENT RESULTS ON FERET DATABASE

| Feature extraction | Classification | Accuracy (%) |
|--------------------|----------------|--------------|
| PCA                | k-NN (L2)      | 80.1         |
| PCA                | SVM            | 85.2         |
| 2DPCA              | k-NN (L2)      | 90.1         |
| 2DPCA              | SVM            | 95.1         |

Figure 7. Identification Performance on FERET Database (cumulative match score, 80 to 100, vs. rank, 0 to 50, for k-NN (PCA), SVM (PCA), k-NN (2DPCA) and SVM (2DPCA))

5. Conclusions

In summary, we have proposed a new approach for face recognition. The first contribution of this paper is a novel face model based on conventional 2DPCA for extracting feature vectors. The second contribution is the combination of our proposed face model with SVM. We compared our method with traditional methods; it achieved the highest accuracy on both the AT&T database (97.3%) and the FERET database (95.1%).

References

[1] M. A. Turk and A. P. Pentland, "Face recognition using eigenfaces," in Computer Vision and Pattern Recognition, 1991. Proceedings CVPR '91., IEEE Computer Society Conference on, 1991, pp. 586-591.
[2] L. Wiskott, et al., "Face recognition by elastic bunch graph matching," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 19, pp. 775-779, 1997.
[3] P. Penev and J. Atick, "Local feature analysis: a general statistical theory for object representation," Network: computation in neural systems, vol. 7, pp. 477-500, 1996.

[4] M. S. Bartlett, et al., "Face recognition by independent component analysis," Neural Networks, IEEE Transactions on, vol. 13, pp. 1450-1464, 2002.
[5] Y. Ming-Hsuan, "Kernel Eigenfaces vs. Kernel Fisherfaces: Face recognition using kernel methods," in Automatic Face and Gesture Recognition, 2002. Proceedings. Fifth IEEE International Conference on, 2002, pp. 215-220.
[6] Y. Jian, et al., "Two-dimensional PCA: a new approach to appearance-based face representation and recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 26, pp. 131-137, 2004.
[7] C. Cortes and V. Vapnik, "Support-vector networks," Machine learning, vol. 20, pp. 273-297, 1995.
[8] C. Huajie and W. Wei, "Pseudo-Example Based Iterative SVM Learning Approach for Gender Classification," in Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on, 2006, pp. 9528-9532.
[9] B. Moghaddam and Y. Ming-Hsuan, "Gender classification with support vector machines," in Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on, 2000, pp. 306-311.
[10] H. Xia, et al., "Gender Classification Based on 3D Face Geometry Features Using SVM," in CyberWorlds, 2009. CW '09. International Conference on, 2009, pp. 114-118.
[11] L. Xue-Ming and W. Yi-Ding, "Gender classification based on fuzzy SVM," in Machine Learning and Cybernetics, 2008 International Conference on, 2008, pp. 1260-1264.
[12] P. Phillips, et al., "The FERET evaluation methodology for face-recognition algorithms," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 22, pp. 1090-1104, 2000.
[13] P. Viola and M. J. Jones, "Robust real-time face detection," International Journal of Computer Vision, vol. 57, pp. 137-154, 2004.
[14] L. Thai Hoang and B. Len Tien, "A hybrid approach of AdaBoost and Artificial Neural Network for detecting human faces," in Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on, 2008, pp. 79-85.
Authors

Thai Hoang Le received the B.S. and M.S. degrees in Computer Science from Hanoi University of Technology, Vietnam, in 1995 and 1997. He received the Ph.D. degree in Computer Science from Ho Chi Minh University of Sciences, Vietnam, in 2004. Since 1999, he has been a lecturer at the Faculty of Information Technology, Ho Chi Minh University of Science, Vietnam. His research interests include soft computing, pattern recognition, image processing, biometrics and computer vision. Dr. Thai Hoang Le has co-authored over twenty-five papers in international journals and international conferences.

Len Bui received the B.S. and M.S. degrees in Computer Science from Ho Chi Minh University of Science, Vietnam, in 1996 and 2001. Since 2008, he has been a Ph.D. student in Computer Science at the Faculty of Information Sciences and Engineering, University of Canberra, Australia.
