Feature Extraction Based on Maximum Nearest Subspace Margin Criterion


Neural Process Lett, DOI 10.1007/s11063-012-9252-y

Feature Extraction Based on Maximum Nearest Subspace Margin Criterion

Yi Chen, Zhenzhen Li, Zhong Jin

© Springer Science+Business Media New York 2012

Y. Chen, Z. Jin: School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, People's Republic of China. e-mail: cystory@qq.com
Z. Li: School of Information Engineering, Jiangxi Manufacturing Technology College, Nanchang 330095, China

Abstract Based on the classification rule of sparse representation-based classification (SRC) and linear regression classification (LRC), we propose the maximum nearest subspace margin criterion for feature extraction. The proposed method can be seen as a preprocessing step for SRC and LRC. By maximizing the inter-class reconstruction error and minimizing the intra-class reconstruction error simultaneously, the proposed method significantly improves the performance of SRC and LRC. Compared with linear discriminant analysis, the proposed method avoids the small sample size problem and can extract more features. Moreover, we extend LRC to overcome the potential singularity problem. Experimental results on the extended Yale B (YALE-B), AR, PolyU finger knuckle print and CENPARMI handwritten numeral databases demonstrate the effectiveness of the proposed method.

Keywords Feature extraction, Dimensionality reduction, Face recognition, Finger knuckle print recognition, Linear regression classification

1 Introduction

Recently, classifications based on reconstruction errors [1-6] have attracted a lot of researchers. Among the existing reconstruction-error-based classifications, the most popular ones are SRC [5] and LRC [6]. The main difference between SRC and LRC is the reconstruction strategy. Let us take face recognition as an example. SRC, which is based on sparse representation, represents a face image as a sparse combination of all the face images. Differently, based on a linear regression model, LRC represents a face image as a linear combination of all the face images from one class.

Although SRC and LRC have different reconstruction strategies, their classification rules are based on the same assumption: the probe image belongs to the class with the minimum reconstruction error. Suppose $x$ is a probe image and $\hat{x}_i$ ($i = 1, 2, \dots, c$) is the image reconstructed by the $i$th class. The distance from $x$ to the $i$th class is defined as $d_i = \|x - \hat{x}_i\|_2$. Then the label of $x$ is assigned as the class with the minimum reconstruction error, $\min_i d_i(x)$.

Unfortunately, although SRC and LRC have been successfully applied to face recognition, we notice that their performance degrades severely under illumination and noise conditions. That means the intra-class reconstruction errors are probably larger than the inter-class reconstruction errors when the images contain variations of illumination and noise. In other words, the classification rule of SRC and LRC may not hold well in the original space. To achieve high performance, a good classification rule considers the characteristics of the feature space; a classifier works effectively only if the feature space fits the classification rule of the classifier. A question then arises: what is the optimal feature space for SRC and LRC? Intuitively, the optimal feature space should fit the classification rule of SRC and LRC as far as possible. To the best of our knowledge, most existing feature extraction methods, such as [7-15], are designed based on data structures rather than on classifiers. Without considering the classification rule of SRC and LRC, the feature subspaces learned by the above methods may not hold the assumption of SRC and LRC well; therefore, the performance of SRC and LRC potentially degrades.

To enhance the performance of SRC and LRC, we propose the maximum nearest subspace margin criterion (MNSMC) according to the classification rules of SRC and LRC. Based on MNSMC, a new feature extractor is developed to find the optimal feature subspace for SRC and LRC.

The rest of the paper is organized as follows. Related works are reviewed in Sect. 2. In Sect. 3, MNSMC is described in detail. In Sect. 4, experiments on well-known databases demonstrate the effectiveness of the proposed method. Finally, conclusions are drawn in Sect. 5.

2 Related Works

In this section, we briefly review SRC and LRC. Suppose $x_i^j \in \mathbb{R}^n$ ($i = 1, 2, \dots, c$, $j = 1, 2, \dots, n_i$) is the $j$th sample from the $i$th class and $X_i = [x_i^1, x_i^2, \dots, x_i^{n_i}]$ is the set of all samples from the $i$th class. Let $X = [X_1, X_2, \dots, X_c]$ be the set of original training samples.

2.1 Sparse Representation-Based Classification

SRC considers that sparse representation has natural discriminating power: taking face images into account, the most compact expression of a certain face image is generally given by the face images from the same class [5]. Denote by $z$ a probe image. We represent $z$ in an overcomplete dictionary whose basis vectors are the training samples themselves, i.e.,

$z = X\beta$  (1)

The sparsest solution to Eq. (1) can be sought by solving the following optimization problem:

$\hat{\beta} = \arg\min \|\beta\|_0$, subject to $z = X\beta$  (2)

where $\|\cdot\|_0$ denotes the $L_0$-norm, which counts the number of nonzero entries in a vector. Recent research efforts reveal that for certain dictionaries, if the solution $\hat{\beta}$ is sparse enough, finding the solution of the $L_0$ optimization problem is equivalent to finding the solution to the following $L_1$ optimization problem [16-18]:

$\hat{\beta} = \arg\min \|\beta\|_1$, subject to $z = X\beta$  (3)

Then find the class with the minimum reconstruction error:

identity$(z) = \arg\min_i \{\varepsilon_i\}$  (4)

where $\varepsilon_i = \|z - X_i\hat{\beta}_i\|_2$, $\hat{\beta} = [\hat{\beta}_1; \hat{\beta}_2; \dots; \hat{\beta}_c]$ and $\hat{\beta}_i$ is the coefficient vector associated with class $i$.
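To make this rule concrete, the following is a minimal Python/NumPy sketch of SRC, not the authors' implementation. It substitutes scikit-learn's unconstrained Lasso for the equality-constrained L1 problem of Eq. (3), which is a common practical relaxation; the function name src_classify and the penalty alpha are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import Lasso

    def src_classify(X_list, z, alpha=0.01):
        # X_list: one (n, n_i) column-sample matrix per class; z: (n,) probe.
        # Stack all training samples into the overcomplete dictionary of Eq. (1).
        X = np.hstack(X_list)
        # Approximate the L1 problem of Eq. (3) with an unconstrained Lasso fit.
        lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
        beta = lasso.fit(X, z).coef_
        # Per-class reconstruction errors and the decision rule of Eq. (4).
        errors, start = [], 0
        for X_i in X_list:
            stop = start + X_i.shape[1]
            errors.append(np.linalg.norm(z - X_i @ beta[start:stop]))
            start = stop
        return int(np.argmin(errors))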

2.2 Linear Regression Classification

LRC is based on the assumption that samples from a specific object class lie on a linear subspace. Using this concept, a linear model is developed in which a probe image is represented as a linear combination of class-specific samples; the task of recognition thereby becomes a problem of linear regression. Least-squares estimation (LSE) [19-21] is used to estimate the reconstruction coefficients of a given probe image against all class models. Finally, the label is assigned as the class with the most precise estimation.

Suppose $z$ is a probe sample from the $i$th class; it should be represented as a linear combination of the images from the same class (lying on the same subspace), i.e.,

$z = X_i\beta_i$  (5)

where $\beta_i \in \mathbb{R}^{n_i \times 1}$ is the vector of reconstruction coefficients. Given that $n \ge n_i$, the system of equations in Eq. (5) is well conditioned and $\beta_i$ can be estimated by LSE:

$\hat{\beta}_i = (X_i^T X_i)^{-1} X_i^T z$  (6)

The probe sample can then be reconstructed by Eq. (7):

$\hat{z}_i = X_i\hat{\beta}_i = X_i(X_i^T X_i)^{-1} X_i^T z$  (7)

The label is assigned as the class with the minimum reconstruction error, i.e.,

identity$(z) = \arg\min_i \|z - X_i\hat{\beta}_i\|_2$  (8)
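A corresponding sketch of LRC follows: it regresses the probe onto each class-specific subspace and keeps the class with the smallest residual, as in Eqs. (5)-(8). np.linalg.lstsq is used instead of the explicit normal equations of Eq. (6), a standard numerically safer choice; the function name is an illustrative assumption.

    import numpy as np

    def lrc_classify(X_list, z):
        # X_list: one (n, n_i) column-sample matrix per class; z: (n,) probe.
        errors = []
        for X_i in X_list:
            # LSE estimate of the class-specific coefficients, Eq. (6).
            beta = np.linalg.lstsq(X_i, z, rcond=None)[0]
            # Distance from z to the class subspace, Eq. (8).
            errors.append(np.linalg.norm(z - X_i @ beta))
        return int(np.argmin(errors))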

3 Feature Extraction by Maximum Nearest Subspace Margin Criterion

3.1 Motivation

In the application of face recognition, SRC and LRC demonstrate impressive results. When we investigate the misclassifications, we find that the variation of illumination can significantly affect the classification results. An intuitive example is provided in Fig. 1 to illustrate this problem. On the YALE-B face database, we select four face images of the first class under different illumination conditions. According to the classification rule of SRC and LRC, a face image can be classified correctly only when the reconstruction error of the first class is minimal. However, taking LRC for example, as can be seen in Fig. 1, only the face image with frontal illumination (Fig. 1a) has the minimum intra-class reconstruction error. In an extreme case, i.e., the face image with backlight illumination (Fig. 1d), the intra-class reconstruction error is even larger than most of the inter-class reconstruction errors.

[Fig. 1 The reconstruction errors of four images of the first class under different illumination conditions on the YALE-B database. Panels a-d plot reconstruction error against class label.]

The above example indicates that the original image space is not suitable for classification due to the variations of illumination. To solve this problem, we aim to find a feature subspace that reduces the impact of illumination and enhances the performance of SRC and LRC. According to the classification rule of SRC and LRC, a feature subspace with larger inter-class reconstruction errors and smaller intra-class reconstruction errors will lead to good performance. To find the optimal feature subspace, we develop MNSMC for feature extraction.

3.2 Maximum Nearest Subspace Margin Criterion

In this section we introduce MNSMC in detail. First let us introduce some notation and preliminary definitions. Let $x_i^j \in \mathbb{R}^n$ ($i = 1, 2, \dots, c$, $j = 1, 2, \dots, n_i$) be the $j$th sample from the $i$th class and $X_i = [x_i^1, x_i^2, \dots, x_i^{n_i}]$ the set of all samples from the $i$th class. Based on the reconstruction errors, we first define the nearest subspace.

Definition 1 (Nearest subspace) For a given sample $x_i^j$, its nearest subspace is the subspace spanned by the class with the minimum reconstruction error of $x_i^j$.

To introduce the label information, we further define two types of nearest subspaces.

Definition 2 (Homogeneous nearest subspace) For a given sample $x_i^j$, its homogeneous nearest subspace is the subspace spanned by the $i$th class.

Definition 3 (Heterogeneous nearest subspace) For a given sample $x_i^j$, its heterogeneous nearest subspace is the subspace spanned by the class with the minimum inter-class reconstruction error of $x_i^j$.

Before describing the nearest subspace margin, we provide a way to compute two types of distances: the point-to-intra-class distance and the point-to-inter-class distance. In this paper, we focus on reconstruction errors. Thus, the point-to-intra-class distance is defined as the intra-class reconstruction error:

$\varepsilon_i^j = \|x_i^j - \tilde{X}_i^j \beta_i^j\|^2$  (9)

where $\tilde{X}_i^j = [x_i^1, x_i^2, \dots, x_i^{j-1}, x_i^{j+1}, \dots, x_i^{n_i}]$ indicates that $x_i^j$ is excluded from $X_i$, and $\beta_i^j$ is the reconstruction coefficient vector obtained from LSE, i.e.,

$\beta_i^j = \big((\tilde{X}_i^j)^T \tilde{X}_i^j\big)^{-1} (\tilde{X}_i^j)^T x_i^j$  (10)

For each $x_i^j$, we can find its $k$ heterogeneous nearest subspaces $N_{ij}^e$. The point-to-inter-class distance is defined as:

$\eta_i^j = \frac{1}{|N_{ij}^e|} \sum_{X_m \in N_{ij}^e} \|x_i^j - X_m \beta_m^j\|^2$  (11)

where $|\cdot|$ represents the cardinality of a set, $X_m$ is one of the $k$ heterogeneous nearest subspaces and $\beta_m^j$ is the corresponding reconstruction coefficient vector. Based on the point-to-intra-class and point-to-inter-class distances, we can define the nearest subspace margin.

Definition 4 (Nearest subspace margin) The nearest subspace margin $\gamma_i^j$ for $x_i^j$ is defined as:

$\gamma_i^j = \eta_i^j - \varepsilon_i^j$  (12)

This margin measures the distances from $x_i^j$ to its own class and to similar heterogeneous classes. According to the classification rule of SRC and LRC, a large $\eta_i^j$ and a small $\varepsilon_i^j$, i.e., a large $\gamma_i^j$, indicate that $x_i^j$ is easy to classify correctly. The total nearest subspace margin for the whole data set is then defined as follows.

Definition 5 (Total nearest subspace margin) The total nearest subspace margin $\gamma$ for all the samples is defined as:

$\gamma = \sum_{ij} \gamma_i^j = \sum_{ij} \big(\eta_i^j - \varepsilon_i^j\big) = \sum_{ij} \eta_i^j - \sum_{ij} \varepsilon_i^j$  (13)

Geometrically, $\sum_{ij} \eta_i^j$ measures the class separability and $\sum_{ij} \varepsilon_i^j$ measures the compactness of the intra-class samples. From the definitions, a large $\sum_{ij} \eta_i^j$ and a small $\sum_{ij} \varepsilon_i^j$, i.e., a large $\gamma$, will lead to good separability.
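The two distances and the resulting margin can be computed directly from the definitions. The sketch below is illustrative; the helper names are hypothetical, least squares is used for Eq. (10), and selecting the k smallest inter-class errors realizes the k heterogeneous nearest subspaces of Eq. (11).

    import numpy as np

    def intra_class_error(X_i, j):
        # Eq. (9): reconstruct the j-th column of X_i from the remaining
        # columns of its own class (leave-one-out), coefficients as in Eq. (10).
        x = X_i[:, j]
        X_rest = np.delete(X_i, j, axis=1)
        beta = np.linalg.lstsq(X_rest, x, rcond=None)[0]
        return np.sum((x - X_rest @ beta) ** 2)

    def inter_class_error(X_list, i, x, k=1):
        # Eq. (11): mean reconstruction error over the k heterogeneous
        # nearest subspaces, i.e. the k other classes that reconstruct x best.
        errs = []
        for m, X_m in enumerate(X_list):
            if m == i:
                continue
            beta = np.linalg.lstsq(X_m, x, rcond=None)[0]
            errs.append(np.sum((x - X_m @ beta) ** 2))
        return float(np.mean(sorted(errs)[:k]))

    # Eq. (12): nearest subspace margin of sample X_list[i][:, j]
    # gamma = inter_class_error(X_list, i, X_list[i][:, j], k) \
    #         - intra_class_error(X_list[i], j)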

3.3 Linear Feature Extraction

When performing dimensionality reduction, we aim to find a mapping $W = [w_1, w_2, \dots, w_d] \in \mathbb{R}^{n \times d}$ from the original space to some feature space such that $\gamma$ is maximized after the transformation, i.e.,

$W = \arg\max_W \gamma(W)$  (14)

Let $y_i^j = W^T x_i^j$ be the image of $x_i^j$ in the projected subspace, and let $Y_m = W^T X_m$. Then

$\sum_{ij} \frac{1}{|N_{ij}^e|} \sum_{X_m \in N_{ij}^e} \|y_i^j - Y_m\beta_m^j\|^2 = \sum_{ij} \frac{1}{|N_{ij}^e|} \sum_{X_m \in N_{ij}^e} \|W^T x_i^j - W^T X_m\beta_m^j\|^2 = \sum_{ij} \frac{1}{|N_{ij}^e|} \sum_{X_m \in N_{ij}^e} \mathrm{tr}\big(W^T (x_i^j - X_m\beta_m^j)(x_i^j - X_m\beta_m^j)^T W\big) = \mathrm{tr}\big(W^T S_b^R W\big)$

where $\mathrm{tr}(\cdot)$ denotes the trace operator and

$S_b^R = \sum_{ij} \frac{1}{|N_{ij}^e|} \sum_{X_m \in N_{ij}^e} (x_i^j - X_m\beta_m^j)(x_i^j - X_m\beta_m^j)^T$  (15)

Similarly, with $\tilde{Y}_i^j = W^T \tilde{X}_i^j$,

$\sum_{ij} \|y_i^j - \tilde{Y}_i^j\beta_i^j\|^2 = \sum_{ij} \|W^T x_i^j - W^T \tilde{X}_i^j\beta_i^j\|^2 = \mathrm{tr}\big(W^T S_w^R W\big)$

where

$S_w^R = \sum_{ij} (x_i^j - \tilde{X}_i^j\beta_i^j)(x_i^j - \tilde{X}_i^j\beta_i^j)^T$  (16)

The total nearest subspace margin in the projected subspace becomes:

$\gamma(W) = \mathrm{tr}\big(W^T (S_b^R - S_w^R) W\big) = \sum_{k=1}^{d} w_k^T (S_b^R - S_w^R) w_k$  (17)

To eliminate the freedom of scaling, we add the constraint $w_k^T w_k = 1$. Thus the goal of MNSMC is to solve the following optimization problem:

$\max \sum_{k=1}^{d} w_k^T (S_b^R - S_w^R) w_k$, subject to $w_k^T w_k = 1$  (18)

It is easy to prove that the optimal solution $W$ is composed of the $d$ eigenvectors corresponding to the $d$ largest eigenvalues of Eq. (19):

$(S_b^R - S_w^R) w = \lambda w$  (19)

Compared with LDA, the proposed method does not need to compute the inverse of $S_w^R$. Therefore, MNSMC avoids the small sample size (SSS) problem and can extract more features [9].

To show the effectiveness of MNSMC, we recalculate the reconstruction errors of the four images shown in Fig. 1 in the MNSMC subspace and illustrate the results in Fig. 2. As can be seen in Fig. 1, only one face image has the smallest intra-class reconstruction error. But in Fig. 2, the intra-class reconstruction errors are significantly smaller, and three of the four face images have the minimum intra-class reconstruction error. These results indicate that MNSMC can reduce the impact of illumination and improve the performance of SRC and LRC.

[Fig. 2 The reconstruction errors of four images of the first class under different illumination conditions in the MNSMC subspace. Panels a-d plot reconstruction error against class label.]
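Putting Eqs. (15), (16) and (19) together, training MNSMC reduces to accumulating two residual scatter matrices and one symmetric eigendecomposition. The following sketch is my own illustration under those equations (the function name and the plain double loop are assumptions, with no attempt at the efficiency a real implementation would need); it returns the top-d eigenvectors, each unit-norm as required by Eq. (18).

    import numpy as np

    def fit_mnsmc(X_list, d, k=1):
        # X_list: one (n, n_i) column-sample matrix per class.
        n = X_list[0].shape[0]
        S_b = np.zeros((n, n))
        S_w = np.zeros((n, n))
        for i, X_i in enumerate(X_list):
            for j in range(X_i.shape[1]):
                x = X_i[:, j]
                # Intra-class residual (Eqs. (9)-(10)) accumulates into S_w, Eq. (16).
                X_rest = np.delete(X_i, j, axis=1)
                beta = np.linalg.lstsq(X_rest, x, rcond=None)[0]
                r = x - X_rest @ beta
                S_w += np.outer(r, r)
                # Residuals to the k heterogeneous nearest subspaces
                # accumulate into S_b, Eqs. (11) and (15).
                resids = []
                for m, X_m in enumerate(X_list):
                    if m == i:
                        continue
                    beta_m = np.linalg.lstsq(X_m, x, rcond=None)[0]
                    resids.append(x - X_m @ beta_m)
                resids.sort(key=lambda r_m: r_m @ r_m)
                for r_m in resids[:k]:
                    S_b += np.outer(r_m, r_m) / k
        # S_b - S_w is symmetric, so eigh solves Eq. (19) directly.
        vals, vecs = np.linalg.eigh(S_b - S_w)
        order = np.argsort(vals)[::-1][:d]
        return vecs[:, order]   # W: (n, d); columns satisfy w^T w = 1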

Note that, in Eq. (10), when the training number $n_i$ of the $i$th class is larger than the dimension of $x_i^j$, the matrix $(\tilde{X}_i^j)^T \tilde{X}_i^j$ is singular, so its inverse cannot be computed directly. In this case, we can apply ridge regression (RR) [22,23] to solve the singularity problem. Different from LSE, RR aims to minimize the following cost function:

$\min \big(\|x_i^j - \tilde{X}_i^j\beta_i^j\|^2 + \lambda \|\beta_i^j\|^2\big)$  (20)

where $\lambda$ is a positive factor that shrinks the solution space. The cost function can be rewritten as:

$\|x_i^j - \tilde{X}_i^j\beta_i^j\|^2 + \lambda \|\beta_i^j\|^2 = (x_i^j)^T x_i^j - 2(\beta_i^j)^T (\tilde{X}_i^j)^T x_i^j + (\beta_i^j)^T (\tilde{X}_i^j)^T \tilde{X}_i^j \beta_i^j + \lambda (\beta_i^j)^T \beta_i^j$

Taking the derivative with respect to $\beta_i^j$ and setting it to zero, the optimal solution of the cost function is obtained as:

$\beta_i^j = \big((\tilde{X}_i^j)^T \tilde{X}_i^j + \lambda I\big)^{-1} (\tilde{X}_i^j)^T x_i^j$  (21)

where $I$ is the identity matrix. It is easy to prove that $(\tilde{X}_i^j)^T \tilde{X}_i^j + \lambda I$ is nonsingular. As can be seen from Eq. (6), LRC suffers from the same singularity problem as MNSMC. Using RR instead of LSE, Eq. (6) can be rewritten as:

$\hat{\beta}_i = (X_i^T X_i + \lambda I)^{-1} X_i^T z$  (22)

Based on RR, we extend LRC to any case without restriction; the modified LRC is called ridge regression classification (RRC) in this paper.
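A sketch of the closed-form ridge solution and of RRC follows. np.linalg.solve is used rather than forming the inverse in Eqs. (21)/(22) explicitly, the standard numerically preferable route; the function names are illustrative assumptions.

    import numpy as np

    def ridge_coefficients(X_i, z, lam=0.01):
        # Eqs. (21)/(22): (X^T X + lam*I)^{-1} X^T z; the regularized
        # Gram matrix is nonsingular for any lam > 0.
        G = X_i.T @ X_i + lam * np.eye(X_i.shape[1])
        return np.linalg.solve(G, X_i.T @ z)

    def rrc_classify(X_list, z, lam=0.01):
        # Ridge regression classification: LRC with RR in place of LSE.
        errors = [np.linalg.norm(z - X_i @ ridge_coefficients(X_i, z, lam))
                  for X_i in X_list]
        return int(np.argmin(errors))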

3.4 The Algorithm of MNSMC

The main steps of MNSMC are summarized in Table 1.

Table 1 The algorithm of MNSMC
Input: column sample matrix X, the nearest subspace size k
Output: transform matrix W
Step 1: Construct $S_w^R$ and $S_b^R$ using X.
Step 2: Solve the generalized eigenvectors of $(S_b^R - S_w^R)w = \lambda w$ and construct $W = [w_1, w_2, \dots, w_d]$ from the eigenvectors corresponding to the d largest eigenvalues.
Step 3: Output W.

4 Experiments

To evaluate the performance of the proposed method, we compare it with three feature extraction methods, i.e., principal component analysis (PCA) [24], LDA and the maximum margin criterion (MMC) [25], over three classifiers, i.e., the nearest neighbor classifier (NNC) [26], SRC and LRC, on four well-known databases. The details of the databases are summarized in Table 2.

Table 2 The details of the four databases
Database    Size       Number of classes   Samples per class   Training samples per class
YALE-B      32 x 32    38                  64                  10
AR          50 x 40    120                 26                  5
FKP         55 x 110   660                 12                  5
CENPARMI    121        10                  600                 200

As a baseline, we directly employ NNC, SRC and LRC to classify the raw data. Then we investigate whether MNSMC can enhance the performance of SRC and LRC. We also compare MNSMC with PCA, LDA and MMC to find which feature subspace is more suitable for SRC and LRC. NNC is compared with SRC and LRC to determine which classifier is more suitable for MNSMC. Note that LDA can extract at most $c - 1$ features ($c$ is the total number of classes); therefore, the dimension of LDA is limited in the figures.

4.1 Parameter Selection

For efficiency, on the YALE-B, AR and FKP databases, PCA is first applied to reduce the dimensionality; the experiments are then performed on the 150-dimensional PCA subspaces. On the YALE-B, AR and FKP databases, there is only one model parameter, i.e., the number k of heterogeneous nearest subspaces. On the CENPARMI database, the parameter $\lambda$ is additionally introduced to overcome the singularity problem. In the experiments, the values of the parameters are set empirically as in Table 3.

Table 3 The parameter settings of the four databases
Database    PCA dimensions   k    lambda
YALE-B      150              1    None
AR          150              1    None
FKP         150              10   None
CENPARMI    120              1    0.01
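For orientation, the sketches given earlier can be chained in the way this protocol describes (the PCA preprocessing step used on the image databases is omitted here). The toy example below runs on synthetic data and is purely illustrative; it assumes the hypothetical fit_mnsmc and rrc_classify sketches defined above, not the authors' code.

    import numpy as np

    rng = np.random.default_rng(0)
    # Toy stand-in for a gallery: 3 classes, 20 samples each, 64 dimensions.
    X_list = [rng.normal(loc=mu, scale=1.0, size=(64, 20)) for mu in (0.0, 3.0, 6.0)]

    W = fit_mnsmc(X_list, d=10, k=1)          # learn the MNSMC subspace
    Y_list = [W.T @ X_i for X_i in X_list]    # project the gallery
    probe = rng.normal(loc=3.0, scale=1.0, size=64)
    print(rrc_classify(Y_list, W.T @ probe))  # expected output: 1

Note that after projection each class has 20 samples in only 10 dimensions, so the Gram matrix in LRC would be singular; this is exactly the situation where RRC applies.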

4.2 Face Recognition

The YALE-B database [27-29] consists of 2432 frontal face images of 38 subjects under various lighting conditions. The database is divided into five subsets: subset 1, consisting of 266 images (seven images per subject) under nominal lighting conditions, is used as the gallery. Subsets 2 and 3, each consisting of 12 images per subject, characterize slight-to-moderate luminance variations, while subset 4 (14 images per person) and subset 5 (19 images per person) depict severe light variations. The images are grayscale and normalized to a resolution of 32 x 32 pixels. Sample images of one person from the YALE-B face database are shown in Fig. 3.

[Fig. 3 Sample images of one person from the YALE-B face database]

We randomly choose 10 sample images per subject from all the subsets and use the remaining images for testing. This procedure is repeated 50 times. The average recognition rates and the corresponding standard deviations are reported in Table 4, and the recognition rates versus the dimensions are illustrated in Figs. 4, 5 and 6.

Table 4 The average maximal recognition rates on the YALE-B database
        Baseline     PCA          LDA          MMC          MNSMC
NNC     44.3 ± 1.0   43.1 ± 1.0   .0 ± 1.4     79.7 ± 1.4   .8 ± 2.0
SRC     86.2 ± 1.1   87.7 ± 0.8   85.4 ± 1.2   85.2 ± 1.3   88.3 ± 1.1
LRC     81.7 ± 1.0   83.2 ± 1.3   84.1 ± 1.1   87.6 ± 1.0   91.1 ± 0.9
Bold values are the highest recognition rates of the corresponding classifiers

The AR face database [30,31] contains over 4,000 color face images of 126 people (70 men and 56 women), including frontal views of faces with different facial expressions, lighting conditions and occlusions. The pictures of most persons were taken in two sessions (separated by two weeks). Each session contains 13 color images, and 120 individuals (65 men and 55 women) participated in both sessions. The images of these 120 individuals were selected and used in our experiments. We manually cropped the face portion of each image and then normalized it to 50 x 40 pixels. Sample images of one person from the AR face database are shown in Fig. 7. We randomly choose 5 sample images from each person and use the remaining images for testing. This procedure is repeated 50 times. The average recognition rates and the corresponding standard deviations are reported in Table 5, and the recognition rates versus the dimensions are illustrated in Figs. 8, 9 and 10.

4.3 Finger Knuckle Print Recognition

In the PolyU FKP database [32-34], FKP images were collected from 165 volunteers, including 125 males and 40 females. Among them, 143 subjects were 20-30 years old and the others were 30-50 years old.

[Fig. 4 The recognition rates of 4 methods plus LRC on the YALE-B database. Recognition rate vs. dimension; curves: PCA+LRC, LDA+LRC, MMC+LRC, MNSMC+LRC.]

[Fig. 5 The recognition rates of 4 methods plus SRC on the YALE-B database. Recognition rate vs. dimension; curves: PCA+SRC, LDA+SRC, MMC+SRC, MNSMC+SRC.]

Table 5 The average maximal recognition rates on the AR database
        Baseline     PCA          LDA          MMC          MNSMC
NNC     59.6 ± 1.3   57.4 ± 1.1   88.6 ± 1.0   84.5 ± 1.1   85.4 ± 1.3
SRC     87.5 ± 0.7   91.8 ± 0.8   .9 ± 0.7     92.0 ± 0.9   92.4 ± 0.5
LRC     74.8 ± 1.3   72.5 ± 0.4   .4 ± 1.1     88.4 ± 1.0   92.5 ± 0.9
Bold values are the highest recognition rates of the corresponding classifiers

[Fig. 6 The recognition rates of MNSMC plus SRC and LRC versus the baselines of SRC and LRC on the YALE-B database. Recognition rate vs. dimension; curves: SRC, LRC, MNSMC+LRC, MNSMC+SRC.]

[Fig. 7 Sample images of one person from the AR face database]

The samples were collected in two separate sessions. In each session, the subject was asked to provide six images for each of the left index finger, the left middle finger, the right index finger and the right middle finger. Therefore, 48 images from four fingers were collected from each subject. In total, the database contains 7,920 images from 660 different fingers. The average time interval between the first and the second sessions was about 25 days; the maximum and minimum time intervals were 96 days and 14 days, respectively. All the samples in the database are histogram equalized and resized to 55 x 110. Sample images of one person from the FKP database are shown in Fig. 11. We randomly choose five sample images from each person and use the remaining images for testing. This procedure is repeated 50 times. The average recognition rates and the corresponding standard deviations are reported in Table 6, and the recognition rates versus the dimensions are illustrated in Figs. 12, 13 and 14.

[Fig. 8 The recognition rates of 4 methods plus LRC on the AR database. Recognition rate vs. dimension; curves: PCA+LRC, LDA+LRC, MMC+LRC, MNSMC+LRC.]

[Fig. 9 The recognition rates of 4 methods plus SRC on the AR database. Recognition rate vs. dimension; curves: PCA+SRC, LDA+SRC, MMC+SRC, MNSMC+SRC.]

Table 6 The average maximal recognition rates on the FKP database
        Baseline     PCA          LDA          MMC          MNSMC
NNC     86.6 ± 1.4   88.7 ± 1.6   83.5 ± 1.6   88.7 ± 1.6   92.8 ± 1.5
SRC     86.6 ± 1.1   91.7 ± 1.2   91.4 ± 1.2   93.0 ± 0.8   93.8 ± 0.9
LRC     86.1 ± 1.2   91.5 ± 1.4   92.4 ± 1.3   91.3 ± 1.2   94.5 ± 1.2
Bold values are the highest recognition rates of the corresponding classifiers

[Fig. 10 The recognition rates of MNSMC plus SRC and LRC versus the baselines of SRC and LRC on the AR database. Recognition rate vs. dimension; curves: SRC, LRC, MNSMC+LRC, MNSMC+SRC.]

[Fig. 11 Sample images of the right index finger from one individual]

4.4 Handwritten Numeral Recognition

The Concordia University CENPARMI handwritten numeral database [35] is used to test the performance of MNSMC plus RRC. The database contains 10 numeral classes, and each class has 600 samples. In our experiments, we randomly choose 200 samples of each class for training and use the remaining 400 samples for testing. Thus, the total number of training samples is 2,000 while the total number of testing samples is 4,000. Since the dimensionality of one digit (121) is less than the number of training samples (200) of one class, LRC fails under this circumstance, so RRC is employed in this experiment. The recognition rates versus the dimensions are illustrated in Figs. 15, 16 and 17.

Table 7 The recognition rates on the four subsets using different methods and classifiers
         Subset 2              Subset 3              Subset 4              Subset 5
         NNC    SRC    LRC     NNC    SRC    LRC     NNC    SRC    LRC     NNC    SRC    LRC
PCA      .1     41.5   98.6    98.6                  15.8   74.0   76.2    8.2    26.1   30.4
LDA      99.8   94.3   99.8    98.6                  22.8   56.3   62.8    7.6    20.7   25.6
MMC      95.8   66.2   99.6    99.6                  12.5   61.4   67.2    7.4    17.6   22.3
MNSMC    99.3   73.8                                 16.8   83.4   88.3    10.2   33.6   39.8
Bold values are the highest recognition rates of the corresponding classifiers

[Fig. 12 The recognition rates of 4 methods plus LRC on the FKP database. Recognition rate vs. dimension; curves: PCA+LRC, LDA+LRC, MMC+LRC, MNSMC+LRC.]

[Fig. 13 The recognition rates of 4 methods plus SRC on the FKP database. Recognition rate vs. dimension; curves: PCA+SRC, LDA+SRC, MMC+SRC, MNSMC+SRC.]

[Fig. 14 The recognition rates of MNSMC plus SRC and LRC versus the baselines of SRC and LRC on the FKP database. Recognition rate vs. dimension; curves: LRC, SRC, MNSMC+LRC, MNSMC+SRC.]

[Fig. 15 The recognition rates of 4 methods plus RRC on the CENPARMI database. Recognition rate vs. dimension; curves: PCA+RRC, LDA+RRC, MMC+RRC, MNSMC+RRC.]

4.5 Face Recognition Under Illumination and Noise Conditions

Further experiments on the YALE-B face database are conducted to investigate the performance of MNSMC under illumination and noise conditions. The YALE-B database is divided into five subsets according to the illumination directions. Sample images of one person from the subsets are shown in Fig. 18. We follow the evaluation protocol reported in [28,36]: training is conducted on subset 1 and the system is validated on the remaining subsets. The experimental results are listed in Table 7.

[Fig. 16 The recognition rates of 4 methods plus SRC on the CENPARMI database. Recognition rate vs. dimension; curves: PCA+SRC, LDA+SRC, MMC+SRC, MNSMC+SRC.]

[Fig. 17 The recognition rates of MNSMC plus SRC and RRC versus the baselines of SRC and RRC on the CENPARMI database. Recognition rate vs. dimension; curves: RRC, SRC, MNSMC+RRC, MNSMC+SRC.]

In the next set of experiments, we contaminate the face images in subset 2 with salt and pepper noise [37]. Figure 19 shows face images distorted with various densities of salt and pepper noise. The experimental results can be found in Table 8.
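As a reference for how such contamination is typically generated, here is a minimal sketch of salt-and-pepper corruption; it assumes 8-bit grayscale images and is my own illustration, not the authors' code.

    import numpy as np

    def salt_and_pepper(img, density, rng=None):
        # Force a `density` fraction of pixels to black (pepper) or white (salt).
        rng = np.random.default_rng(0) if rng is None else rng
        out = img.copy()
        mask = rng.random(img.shape) < density          # pixels to corrupt
        out[mask] = rng.choice([0, 255], size=int(mask.sum()))
        return out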

Table 8 The recognition rates on subset 2 using different methods and classifiers
         20 %                  40 %                  60 %                  80 %
         NNC    SRC    LRC     NNC    SRC    LRC     NNC    SRC    LRC     NNC    SRC    LRC
PCA      .0     .3     95.3    93.2                  43.6   76.4   51.1    18.2   20.1   16.2
LDA      99.1   86.6   98.7    96.4                  40.8   85.1   61.5    11.4   23.3   15.8
MMC      97.4   88.3   99.2    96.6                  42.6   83.7   56.8    13.8   21.2   15.3
MNSMC    98.3   92.1   98.2                          50.7   87.4   65.2    23.7   26.5   25.8
Bold values are the highest recognition rates of the corresponding classifiers

Table 9 The average maximal recognition rates on the CENPARMI database
        Baseline     PCA          LDA          MMC          MNSMC
NNC     88.3 ± 0.3   88.3 ± 0.3   87.4 ± 0.4   .0 ± 0.4     92.3 ± 0.5
SRC     93.2 ± 0.3   93.0 ± 0.8   88.7 ± 0.7   92.8 ± 0.7   94.5 ± 0.4
RRC     .4 ± 0.4     .9 ± 0.4     86.2 ± 0.4   91.7 ± 0.6   94.2 ± 0.5
Bold values are the highest recognition rates of the corresponding classifiers

4.6 Discussions

From the experimental results, we can draw the following conclusions:

(1) Since the YALE-B and AR databases contain variations of illumination, occlusion and expression, the recognition rates are not good when the classifiers are directly applied to the raw data. More importantly, we find that not all feature extraction methods are helpful to the classifiers. As can be seen in Tables 4 and 9, the recognition rates of LDA plus SRC are lower than the corresponding baselines. This means that selecting an appropriate feature extraction method is very important for a specific classifier.

[Fig. 18 Each row represents typical images from subsets 1, 2, 3, 4 and 5, respectively]

[Fig. 19 Face images with a 20 %, b 40 %, c 60 %, d 80 % salt and pepper noise density]

(2) Since MNSMC is designed according to the classification rule of SRC and LRC, it matches SRC and LRC well. As can be seen in Figs. 6, 10, 14 and 17, the recognition rates of MNSMC plus SRC or LRC are consistently higher than the corresponding baselines. Meanwhile, we observe that MNSMC plus LRC outperforms MNSMC plus SRC on the YALE-B, AR and FKP databases. As introduced above, SRC and LRC are based on different reconstruction strategies, and MNSMC shares the same reconstruction strategy with LRC. Thus, MNSMC plus LRC performs better in most of the experiments.

(3) Compared with PCA, LDA and MMC, MNSMC is the best feature extraction method for SRC and LRC. As can be seen in Figs. 4, 5, 8, 9, 12, 13, 15 and 16, MNSMC plus SRC and LRC consistently outperforms the other combinations. Because MNSMC considers the reconstruction error, it fits the classification rule of SRC and LRC; therefore, MNSMC performs better than the other feature extraction methods.

(4) The experimental results also indicate that, compared with SRC and LRC, NNC is not very suitable for MNSMC. Technically, NNC assumes that a sample and its nearest neighbor are in the same class. However, MNSMC considers the nearest subspace rather than the nearest neighbor. Therefore, MNSMC may not match NNC well.

5 Conclusions

In this paper, according to the classification rule of SRC and LRC, a new feature extractor based on MNSMC is proposed. By maximizing the inter-class reconstruction error and minimizing the intra-class reconstruction error simultaneously, the proposed method improves the performance of SRC and LRC significantly. Our method avoids the SSS problem and can extract more features than LDA. Moreover, based on RR, we develop RRC to overcome the potential singularity problem of LRC. The experimental results on the YALE-B, AR, FKP and CENPARMI handwritten numeral databases show the effectiveness of the proposed method.

Acknowledgements This work is partially supported by the National Science Foundation of China under grant nos. 90820306, 60873151, 60973098 and 61005008.

References

1. Li SZ (1998) Face recognition based on nearest linear combinations. In: Proceedings of IEEE international conference on computer vision and pattern recognition, pp 839-844

2. Li SZ, Lu J (1999) Face recognition using the nearest feature line method. IEEE Trans Neural Netw 10(2):439-443
3. Li SZ, Chan KL, Wang CL (2000) Performance evaluation of the nearest feature line method in image classification and retrieval. IEEE Trans Pattern Anal Mach Intell 22(11):1335-1339
4. Chien J-T, Wu C-C (2002) Discriminant waveletfaces and nearest feature classifiers for face recognition. IEEE Trans Pattern Anal Mach Intell 24(12):1644-1649
5. Wright J, Yang A, Ganesh A, Sastry S, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210-227
6. Naseem I, Togneri R, Bennamoun M (2010) Linear regression for face recognition. IEEE Trans Pattern Anal Mach Intell 32(11):2106-2112
7. Jolliffe IT (1986) Principal component analysis. Springer, New York
8. Belhumeur PN, Hespanha J, Kriegman D (1997) Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711-720
9. Li H, Jiang T, Zhang K (2003) Efficient and robust feature extraction by maximum margin criterion. In: Proceedings of advances in neural information processing systems, pp 97-104
10. He X, Yan S, Hu Y, Niyogi P, Zhang H-J (2005) Face recognition using Laplacianfaces. IEEE Trans Pattern Anal Mach Intell 27(3):328-340
11. He X, Cai D, Yan S, Zhang HJ (2005) Neighborhood preserving embedding. In: Proceedings of the 10th IEEE international conference on computer vision, pp 1208-1213
12. Yan S, Xu D, Zhang B, Zhang H, Yang Q, Lin S (2007) Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans Pattern Anal Mach Intell 29(1):40-51
13. Yang J, Zhang D, Yang J, Niu B (2007) Globally maximizing, locally minimizing: unsupervised discriminant projection with applications to face and palm biometrics. IEEE Trans Pattern Anal Mach Intell 29(4):650-664
14. Chen H-T, Chang H-W, Liu T-L (2005) Local discriminant embedding and its variants. In: IEEE conference on computer vision and pattern recognition (CVPR 2005), pp 846-853
15. Wang F, Wang X, Zhang D, Zhang CS, Li T (2009) MarginFace: a novel face recognition method by average neighborhood margin maximization. Pattern Recognit 42(11):2863-2875
16. Candès E, Tao T (2006) Near optimal signal recovery from random projections: universal encoding strategies? IEEE Trans Inf Theory 52(12):5406-5425
17. Donoho D (2006) For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution. Commun Pure Appl Math 59(6):797-829
18. Candès E, Romberg J, Tao T (2006) Stable signal recovery from incomplete and inaccurate measurements. Commun Pure Appl Math 59(8):1207-1223
19. Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning: data mining, inference and prediction. Springer, New York
20. Seber GAF (2003) Linear regression analysis. Wiley-Interscience, Hoboken
21. Ryan TP (1997) Modern regression methods. Wiley-Interscience, Hoboken
22. Hoerl AE, Kennard RW (1970) Ridge regression: applications to nonorthogonal problems. Technometrics 12(1):69-82
23. Hoerl AE, Kennard RW (1970) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1):55-67
24. Jolliffe IT (1986) Principal component analysis. Springer, New York
25. Li H, Jiang T, Zhang K (2003) Efficient and robust feature extraction by maximum margin criterion. In: Proceedings of advances in neural information processing systems, pp 97-104
26. Cover TM, Hart PE (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21-27
27. Lee K, Ho J, Kriegman D (2005) Acquiring linear subspaces for face recognition under variable lighting. IEEE Trans Pattern Anal Mach Intell 27(5):684-698
28. Georghiades A, Belhumeur P, Kriegman D (2001) From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Trans Pattern Anal Mach Intell 23(6):643-660
29. The extended YALE-B database: http://www.zjucadcg.cn/dengcai/data/facedata.html
30. Martinez AM, Benavente R (1998) The AR face database. CVC technical report, no. 24
31. Martinez AM, Benavente R (2003) The AR face database. http://rvl1.ecn.purdue.edu/~aleix/aleix_face_db.html
32. Zhang L, Zhang L, Zhang D, Zhu H (2010) Online finger-knuckle-print verification for personal authentication. Pattern Recognit 43(7):2560-2571
33. Zhang L, Zhang L, Zhang D (2009) Finger-knuckle-print: a new biometric identifier. In: Proceedings of the IEEE international conference on image processing
34. The FKP database. http://www.comp.polyu.edu.hk/~biometrics/fkp.htm

35. Liao SX, Pawlak M (1996) On image analysis by moments. IEEE Trans Pattern Anal Mach Intell 18(3):254-266
36. Weilong C, Joo ME, Wu S (2006) Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain. IEEE Trans Syst Man Cybern 36(2):458-464
37. Gonzalez RC, Woods RE (2007) Digital image processing. Pearson Prentice Hall, Upper Saddle River