Facial Expression Recognition Based on Local Binary Patterns and Local Fisher Discriminant Analysis

WSEAS TRANSACTIONS on SIGNAL PROCESSING
Shiqing Zhang, Xiaoming Zhao, Bicheng Lei

SHIQING ZHANG, XIAOMING ZHAO, BICHENG LEI
School of Physics and Electronic Engineering, Taizhou University, Taizhou 318000, CHINA
tzczsq@163.com, leibicheng@163.com
Department of Computer Science, Taizhou University, Taizhou 318000, CHINA
tzxyzxm@163.com

Abstract: - Automatic facial expression recognition is an interesting and challenging subject in signal processing, pattern recognition, artificial intelligence, etc. In this paper, a new method of facial expression recognition based on local binary patterns (LBP) and local Fisher discriminant analysis (LFDA) is presented. The LBP features are first extracted from the original facial expression images. Then LFDA is used to produce low-dimensional discriminative embedded data representations from the extracted high-dimensional LBP features, with a striking performance improvement on facial expression recognition tasks. Finally, a support vector machine (SVM) classifier is used for facial expression classification. The experimental results on the popular JAFFE facial expression database demonstrate that the presented method based on LBP and LFDA obtains the best recognition accuracy of 90.7% with reduced features, outperforming the other methods used, namely principal component analysis (PCA), linear discriminant analysis (LDA) and locality preserving projection (LPP).

Key-Words: - Facial expression recognition, local binary patterns, local Fisher discriminant analysis, support vector machines, principal component analysis, linear discriminant analysis, locality preserving projection

1 Introduction
Facial expression is one of the most powerful, natural, and immediate means for human beings to communicate their emotions and intentions. Automatic facial expression recognition has attracted increasing attention due to its important applications in natural human-computer interaction, data-driven animation, video indexing, etc.
An automatic facial expression recognition system involves two crucial parts: facial feature representation and classifier design. Facial feature representation extracts a set of appropriate features from the original face images for describing faces. There are mainly two types of approaches to extracting facial features: geometry-based methods and appearance-based methods [1]. In geometric feature extraction, the shape and location of various face components are considered. Geometry-based methods require accurate and reliable facial feature detection, which is difficult to achieve in real-time applications. In contrast, in appearance-based methods, image filters are applied either to the whole face image (holistic features) or to specific regions of the face image (local features) to extract the appearance changes in the face image. So far, principal component analysis (PCA) [2], linear discriminant analysis (LDA) [3], and Gabor wavelet analysis [4] have been applied to either the whole face or specific face regions to extract the facial appearance changes. Nevertheless, it is computationally expensive to convolve the face images with a set of Gabor filters to extract multi-scale and multi-orientation coefficients. It is thus inefficient in both time and memory because of the high redundancy of Gabor wavelet features.

Local binary patterns (LBP) [5], originally proposed for texture analysis [6] as a nonparametric method that efficiently summarizes the local structures of an image, have received increasing interest for facial image representation. The most important properties of LBP features are their tolerance to illumination changes and their computational simplicity. In recent years, LBP has been successfully applied as a local feature extraction method in facial expression recognition [7-11].

When the extracted LBP features, which form a high-dimensional data set, are used to train and test a classifier, the so-called curse of dimensionality emerges, so irrelevant feature data must be removed as a preprocessing step before classification. One feasible way to solve this problem is to perform dimensionality reduction, generating a few new features that contain most of the valuable facial expression information. The two most widely used dimensionality reduction methods are PCA and LDA. However, both methods have inherent drawbacks that reduce their performance on facial expression recognition tasks to some extent. In detail, PCA, as an unsupervised learning method, fails to extract the discriminative embedded information from high-dimensional data. In contrast, LDA is a supervised learning method, but still has an essential limitation: the number of embedded features produced by LDA must be less than the number of data classes, due to the rank deficiency of the between-class scatter matrix [3]. In recent years, a new dimensionality reduction method called local Fisher discriminant analysis (LFDA) [12] has been proposed to overcome this limitation of LDA. LFDA effectively combines the ideas of LDA and locality preserving projection (LPP) [13], that is, LFDA maximizes between-class separability and preserves within-class local structure at the same time.

E-ISSN: 2224-3488, Issue 1, Volume 8, January 2012

LFDA is thus capable of extracting low-dimensional discriminative embedded data representations. Motivated by the lack of studies on LFDA for facial expression recognition, in this work we explore the performance of LFDA on facial expression recognition tasks. We first use LFDA to extract low-dimensional discriminative embedded data representations from the extracted high-dimensional LBP features. Then the popular support vector machine (SVM) is adopted for facial expression classification. To verify the effectiveness of LFDA, we compare LFDA with PCA, LDA and LPP for facial expression recognition. We conduct facial expression recognition experiments on the popular Japanese female facial expression (JAFFE) [14] database.

The remainder of this paper is organized as follows. Local binary patterns (LBP) are described in Section 2. In Section 3, PCA, LDA and LPP are reviewed. In Section 4, LFDA is described. SVM is introduced in Section 5. The JAFFE facial expression database is introduced in Section 6. Section 7 shows the experimental results and analysis. Finally, the conclusions are given in Section 8.

2 Local Binary Patterns
The original local binary patterns (LBP) [5] operator takes a local neighborhood around each pixel, thresholds the pixels of the neighborhood at the value of the central pixel, and uses the resulting binary-valued image patch as a local image descriptor. It was originally defined for 3 × 3 neighborhoods, giving 8-bit codes based on the 8 pixels around the central one. In other words, the operator labels each pixel of an image by thresholding its 3 × 3 neighborhood with the center value and reading the result as a binary number, and the 256-bin histogram of the LBP labels computed over a region is used as a texture descriptor. Fig. 1 gives an example of the basic LBP operator.

Fig. 1 An example of the basic LBP operator

The limitation of the basic LBP operator is that its small 3 × 3 neighborhood cannot capture dominant features with large-scale structures.
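The basic operator described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in the paper: the clockwise neighbour order starting at the top-left pixel, the `>=` comparison, and the function names `lbp_code` and `lbp_histogram` are assumptions made for the example, and border pixels are simply skipped.

```python
def lbp_code(patch):
    """Return the basic LBP code of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Assumed convention: neighbours are visited clockwise from the top-left pixel,
    # and the first neighbour contributes the least significant bit.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:  # threshold the neighbour at the centre value
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of basic LBP codes over the interior pixels of an image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


patch = [[5, 9, 1],
         [4, 4, 6],
         [7, 2, 3]]
print(lbp_code(patch))  # bits 0, 1, 3, 6 and 7 are set: 1 + 2 + 8 + 64 + 128 = 203
```

A different starting neighbour or bit order only permutes the labels; the histogram carries the same information as long as one convention is used consistently.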
As a result, to deal with texture at different scales, the operator was later extended to use neighborhoods of different sizes [5]. Fig. 2 gives an example of the extended LBP operator, where the notation (P, R) denotes a neighborhood of P equally spaced sampling points on a circle of radius R that form a circularly symmetric neighbor set. A further extension defines the so-called uniform patterns: an LBP is uniform if it contains at most one 0-1 and one 1-0 transition when viewed as a circular bit string. For instance, 00000000, 00111000 and 11100001 are uniform patterns. It is observed that uniform patterns account for nearly 90% of all patterns in the (8, 1) neighborhood and for about 70% in the (16, 2) neighborhood in texture images. Accumulating the patterns which have more than 2 transitions into a single bin yields an LBP operator, $LBP^{u2}_{P,R}$, with fewer than $2^P$ bins. Here, the superscript u2 in $LBP^{u2}_{P,R}$ indicates using only uniform patterns and labeling all remaining patterns with a single label.

Fig. 2 An example of the extended LBP with different (P, R)

After labeling an image with the LBP operator, a histogram of the labeled image $f_l(x, y)$ can be defined as

$$ H_i = \sum_{x,y} I(f_l(x, y) = i), \quad i = 0, \ldots, n-1 \tag{1} $$

where n is the number of different labels produced by the LBP operator and

$$ I(A) = \begin{cases} 1, & A \text{ is true} \\ 0, & A \text{ is false} \end{cases} \tag{2} $$

This LBP histogram contains information about the distribution of the local micro-patterns, such as edges, spots and flat areas, over the whole image, so it can be used to statistically describe image characteristics.

For efficient face representation, face images are equally divided into m small regions $R_1, R_2, \ldots, R_m$. Once the m regions are determined, a histogram is computed independently within each of them. The resulting m histograms are concatenated into a single, spatially enhanced histogram which encodes both the appearance and the spatial relations of facial regions. In this spatially enhanced histogram, we effectively have a description of the face image on three different levels of locality: the labels for the histogram contain information about the patterns on a pixel level, the labels are summed over a small region to produce information on a regional level, and the regional histograms are concatenated to build a global description of the face image.

3 Review of PCA, LDA and LPP
The general dimensionality reduction problem is as follows. Given n data points $\{x_1, x_2, \ldots, x_n\}$ with dimension D, dimensionality reduction techniques transform the data set $X = [x_1, x_2, \ldots, x_n]$ into a new data set $Y = [y_1, y_2, \ldots, y_n]$ with dimension d (d < D), while retaining the geometry of the data as much as possible. In the following subsections, we briefly review PCA, LDA and LPP.

3.1 PCA
Principal component analysis (PCA) [2] is a well-known and widely used linear dimensionality reduction technique. PCA aims to produce a low-dimensional representation of high-dimensional data that preserves the greatest sources of variation within the data set. This is achieved by performing a linear transformation of the data, projecting it onto the axes of greatest variance, called the principal components. The resulting low-dimensional features are uncorrelated and ordered such that the greatest variance by any projection of the data set is accounted for by the first dimension, the second greatest variance by the second dimension, and so on. In order to find a linear mapping M, PCA maximizes the following objective function:

$$ J_F(M) = \mathrm{trace}(M^T \mathrm{cov}(X) M) \tag{3} $$

where $\mathrm{cov}(X)$ is the sample covariance matrix of the data $X = [x_1, x_2, \ldots, x_n]$. Then, PCA solves the following eigenproblem:

$$ \mathrm{cov}(X) M = \lambda M \tag{4} $$

The d principal eigenvectors of the covariance matrix form the linear mapping M, and the low-dimensional data representations are then computed by $Y = XM$. Here, X is assumed to be centered, i.e. to have zero mean. In face recognition, $x_i$ represents a face image, and the eigenvectors are the so-called eigenfaces.

3.2 LDA
Linear discriminant analysis (LDA) [3] seeks the discriminant vectors such that the ratio of the between-class scatter to the within-class scatter is maximized. Let $x_i \in R^D$ be D-dimensional samples and $l_i \in \{1, 2, \ldots, c\}$ be the associated class labels, where n is the number of samples and c is the number of classes. Let $y_i \in R^d$ (d < D) be the low-dimensional data representation of a sample $x_i$, where d is the dimension of the embedding space. Then the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$ are constructed as follows:

$$ S_b = \sum_{i=1}^{c} n_i (m_i - m_0)(m_i - m_0)^T \tag{5} $$

$$ S_w = \sum_{i=1}^{c} \sum_{j=1}^{n_i} (x_j^{(i)} - m_i)(x_j^{(i)} - m_i)^T \tag{6} $$

where $x_j^{(i)}$ is the jth sample of class i (i = 1, 2, ..., c), $n_i$ is the number of samples in class i, $m_i$ is the mean vector of the samples in class i, and $m_0$ is the mean vector of all samples. The LDA method tries to find the projection matrix that maximizes the ratio of the between-class scatter to the within-class scatter in the projected space:

$$ J_F(V) = \max_V \frac{\mathrm{trace}(V^T S_b V)}{\mathrm{trace}(V^T S_w V)} \tag{7} $$

where V can be obtained via the generalized eigenvalue problem:

$$ S_b V = \lambda S_w V \tag{8} $$

where the eigenvectors in V correspond to the d largest eigenvalues $\lambda$. Then the d-dimensional representation is $Y = XV$. Since the between-class scatter matrix $S_b$ has rank at most c-1, LDA can find at most c-1 meaningful features. This is an essential limitation of LDA for dimensionality reduction.

3.3 LPP
While PCA aims to preserve the global structure of the data, LPP [13] seeks to preserve the local (i.e., neighborhood) structure of the data by learning a locality preserving submanifold. Based on spectral graph theory, LPP constructs a weighted graph $G = (v, \varepsilon, P)$, where v is the set of all points, $\varepsilon$ is the set of edges connecting the points, and P is a similarity matrix whose weights characterize the likelihood of two points. The objective function of LPP is as follows:

$$ \min_w \sum_{i,j} (y_i - y_j)^2 P_{ij} \tag{9} $$

where $y_i = w^T x_i$, $i = 1, 2, \ldots, n$, and $P = (P_{ij})_{n \times n}$ is a similarity matrix which is defined as follows:

$$ P_{ij} = \begin{cases} \exp(-\|x_i - x_j\|^2 / t) & \text{if } x_i \text{ is among the kNN of } x_j \text{ or } x_j \text{ is among the kNN of } x_i \\ 0 & \text{otherwise} \end{cases} \tag{10} $$

With a simple reformulation, the objective function is equivalent to minimizing

$$ \frac{1}{2} \sum_{i,j} (y_i - y_j)^2 P_{ij} = \frac{1}{2} \sum_{i,j} (w^T x_i - w^T x_j)^2 P_{ij} = w^T X (D - P) X^T w = w^T X L X^T w \tag{11} $$

where D is a diagonal matrix whose entries are the row sums of P, i.e., $d_{ii} = \sum_j p_{ij}$, and $L = D - P$ is the Laplacian matrix.
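The step from the pairwise objective to the Laplacian quadratic form stated above can be checked numerically. The sketch below is an illustration under stated assumptions: the five 2-D points, the projection vector `w`, and the values `k=2` and `t=5.0` are made up for the example, and the pairwise sum is taken with the conventional factor of one half so that it equals $y^T (D - P) y$ exactly.

```python
import math


def knn_heat_similarity(points, k, t):
    """Symmetric kNN similarity matrix with heat-kernel weights exp(-||xi - xj||^2 / t)."""
    n = len(points)
    sq = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    neighbours = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: sq[i][j])
        neighbours.append(set(others[:k]))
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # An edge exists if either point is among the other's k nearest neighbours.
            if i != j and (j in neighbours[i] or i in neighbours[j]):
                P[i][j] = math.exp(-sq[i][j] / t)
    return P


def pairwise_cost(y, P):
    """One half of the weighted sum of squared differences of the embedded points."""
    n = len(y)
    return 0.5 * sum(P[i][j] * (y[i] - y[j]) ** 2 for i in range(n) for j in range(n))


def laplacian_cost(y, P):
    """Quadratic form y^T (D - P) y, with D the diagonal matrix of row sums of P."""
    n = len(y)
    degree = [sum(row) for row in P]
    quad_d = sum(degree[i] * y[i] ** 2 for i in range(n))
    quad_p = sum(P[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    return quad_d - quad_p


# Hypothetical toy data and projection vector, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 0.2), (0.9, 1.1), (3.0, 3.0), (3.2, 2.7)]
w = (0.6, -0.8)
y = [w[0] * p[0] + w[1] * p[1] for p in points]
P = knn_heat_similarity(points, k=2, t=5.0)
print(abs(pairwise_cost(y, P) - laplacian_cost(y, P)) < 1e-9)
```

The two costs agree because P is symmetric; the "or" rule in the neighbourhood test is what guarantees that symmetry.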
In order to remove the arbitrary scaling factor in the embedding, LPP imposes the following constraint:

$$ w^T X D X^T w = 1 \tag{12} $$

This constraint sets the mapping (embedding) scale and makes the vertices with high similarities be mapped nearer to the origin. Finally, the minimization problem reduces to

$$ \min_w \frac{w^T X L X^T w}{w^T X D X^T w} \tag{13} $$

The optimal w is given by the minimum eigenvalue solution to the following generalized eigenvalue problem:

$$ X L X^T w = \lambda X D X^T w \tag{14} $$

That is, LPP seeks a transformation such that nearby data pairs in the original space are kept close in the embedding space. Thus, LPP tends to preserve the local structure of the data. In our experiments, the number of nearest neighbours is fixed and the parameter t is empirically set to 5.

4 LFDA
Local Fisher discriminant analysis (LFDA) [12] finds a transformation matrix T such that an embedded representation $y_i$ of a sample $x_i$ is given by

$$ y_i = T^T x_i \tag{15} $$

where $T^T$ denotes the transpose of the matrix T. Let $n_l$ be the number of samples in class l:

$$ \sum_{l=1}^{c} n_l = n \tag{16} $$

Let $S^{(lw)}$ and $S^{(lb)}$ be the local within-class scatter matrix and the local between-class scatter matrix:

$$ S^{(lw)} = \frac{1}{2} \sum_{i,j=1}^{n} W^{(lw)}_{i,j} (x_i - x_j)(x_i - x_j)^T \tag{17} $$

$$ S^{(lb)} = \frac{1}{2} \sum_{i,j=1}^{n} W^{(lb)}_{i,j} (x_i - x_j)(x_i - x_j)^T \tag{18} $$

$$ W^{(lw)}_{i,j} = \begin{cases} A_{i,j} / n_l & \text{if } l_i = l_j = l \\ 0 & \text{if } l_i \neq l_j \end{cases} \tag{19} $$

$$ W^{(lb)}_{i,j} = \begin{cases} A_{i,j} (1/n - 1/n_l) & \text{if } l_i = l_j = l \\ 1/n & \text{if } l_i \neq l_j \end{cases} \tag{20} $$

where A is an affinity matrix between $x_i$ and $x_j$. Using the local scaling heuristic, $A_{i,j}$ is defined as

$$ A_{i,j} = \exp\left(-\frac{\|x_i - x_j\|^2}{\sigma_i \sigma_j}\right) \tag{21} $$

where $\sigma_i$ is the local scaling around $x_i$, defined by $\sigma_i = \|x_i - x_i^{(k)}\|$, and $x_i^{(k)}$ is the k-th nearest neighbor of $x_i$. The heuristic choice k = 7 has been reported to perform well. The LFDA transformation matrix $T_{LFDA}$ is defined as

$$ T_{LFDA} = \arg\max_{T \in R^{D \times d}} \left[ \mathrm{trace}\left( T^T S^{(lb)} T \, (T^T S^{(lw)} T)^{-1} \right) \right] \tag{22} $$

That is, LFDA seeks a transformation matrix T such that nearby data pairs in the same class are made close and data pairs in different classes are separated from each other, while data pairs in the same class that are far apart are not forced to be close.

5 Support Vector Machines
Support vector machines (SVM) [17] are based on the structural risk minimization principle of statistical learning theory, which aims to limit both the empirical risk on the training data and the capacity of the decision function. The basic concept of SVM is to transform the input vectors to a higher-dimensional space by a nonlinear transform, in which an optimal hyperplane separating the data can be found.

Given a training data set $(x_1, y_1), \ldots, (x_l, y_l)$, $y_i \in \{-1, 1\}$, to find the optimal hyperplane, a nonlinear transform, $Z = \Phi(x)$, is used to make the training data linearly separable.
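The idea that a nonlinear transform can make data linearly separable can be illustrated with a tiny, hand-made example. Everything in this sketch is an assumption chosen for illustration: the four 1-D training points, the feature map $\Phi(x) = (x, x^2)$, and the hand-picked hyperplane; no SVM is trained here.

```python
def separable_by_threshold(xs, ys):
    """True if a single threshold on the raw 1-D inputs separates the two classes."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    return max(pos) < min(neg) or max(neg) < min(pos)


def feature_map(x):
    """Hypothetical nonlinear transform z = Phi(x) = (x, x^2)."""
    return (x, x * x)


def decision(w, b, z):
    """Signed value w . z + b of a point z with respect to the hyperplane (w, b)."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


xs = [-2.0, -1.0, 1.0, 2.0]
ys = [1, -1, -1, 1]            # the outer points form one class, the inner points the other
w, b = (0.0, 1.0), -2.5        # hand-picked hyperplane x^2 = 2.5 in the transformed space

zs = [feature_map(x) for x in xs]
signs = [1 if decision(w, b, z) > 0 else -1 for z in zs]
print(separable_by_threshold(xs, ys), signs == ys)  # prints: False True
```

In the raw input space no threshold separates the classes, whereas in the transformed space a single hyperplane does.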
A weight w and offset b satisfying the following criteria will be found:

$$ \begin{cases} w \cdot z_i + b \geq 1, & y_i = 1 \\ w \cdot z_i + b \leq -1, & y_i = -1 \end{cases} \tag{23} $$

The above procedure can be summarized as follows:

$$ \min_{w,b} \Phi(w) = \frac{1}{2} (w \cdot w) \tag{24} $$

subject to $y_i (w \cdot z_i + b) \geq 1$, $i = 1, 2, \ldots, l$. If the sample data are not linearly separable, the following function should be minimized instead:

$$ \Phi(w) = \frac{1}{2} w \cdot w + C \sum_{i=1}^{l} \xi_i \tag{25} $$

where $\xi_i$ can be understood as the classification error and C is the penalty parameter for this term. By using the Lagrange method, with $w_0 = \sum_{i=1}^{l} \lambda_i y_i z_i$, the decision function will be

$$ f = \mathrm{sgn}\left[ \sum_{i=1}^{l} \lambda_i y_i (z \cdot z_i) + b \right] \tag{26} $$

From functional theory, a non-negative symmetric function K(u, v) uniquely defines a Hilbert space H, where K is the reproducing kernel of the space H:

$$ K(u, v) = \sum_i \alpha_i \varphi_i(u) \varphi_i(v) \tag{27} $$

This stands for an inner product in a feature space:

$$ z_i \cdot z = \Phi(x_i) \cdot \Phi(x) = K(x_i, x) \tag{28} $$

Then the decision function can be written as:

$$ f = \mathrm{sgn}\left[ \sum_{i=1}^{l} \lambda_i y_i K(x_i, x) + b \right] \tag{29} $$

The development of an SVM classification model depends on the selection of the kernel function. Several kernels can be used in SVM models, including the linear, polynomial, radial basis function (RBF) and sigmoid kernels:

$$ K(x_i, x_j) = \begin{cases} x_i^T x_j & \text{Linear} \\ (\gamma x_i^T x_j + \text{coefficient})^{\text{degree}} & \text{Polynomial} \\ \exp(-\gamma \|x_i - x_j\|^2) & \text{RBF} \\ \tanh(\gamma x_i^T x_j + \text{coefficient}) & \text{Sigmoid} \end{cases} \tag{30} $$

Many real-world data sets involve multi-class problems. Since SVMs are inherently binary classifiers, binary SVMs need to be extended to multi-class SVMs for such problems. Currently, there are two types of approaches for building multi-class SVMs. One is the single machine approach, which attempts to construct a multi-class SVM by solving a single optimization problem. The other is the divide and conquer approach, which decomposes the multi-class problem into several binary sub-problems and builds a standard SVM for each. The most popular decomposing strategy is probably one-against-all, which builds one SVM per class and aims to distinguish the samples in a single class from the samples in all remaining classes. Another popular decomposing strategy is one-against-one, which builds one SVM for each pair of classes. When applied to a test point, each classifier gives one vote to the winning class, and the point is labeled with the class having the most votes. In practice, the one-against-one approach is more effective than the one-against-all approach due to its computational simplicity and comparable performance.

6 Facial Expression Database
The popular JAFFE facial expression database [14] used in this study contains 213 facial images from 10 Japanese females. Each image has a resolution of 256 × 256 pixels. The head is almost in frontal pose. The number of images corresponding to each of the 7 categories of expression (neutral, happiness, sadness, surprise, anger, disgust and fear) is almost the same. A few of them are shown in Fig. 3.

Fig. 3 Examples of facial images from the JAFFE database

Fig. 4 (a) Location of the two eyes, (b) the final cropped image of 110 × 150 pixels

As done in [7, 8, 15], we normalized the faces to a fixed distance of 55 pixels between the centers of the two eyes. Generally, it is observed that the width of a face is roughly two times this distance, and the height is roughly three times. Therefore, based on the centers of the two eyes, facial images of 110 × 150 pixels were cropped from the original images. To locate the centers of the two eyes, automatic face registration was performed using a robust real-time face detector based on a set of rectangular Haar-like features [16]. From the results of automatic face detection, including face location, face width and face height, two square bounding boxes for the left eye and the right eye were created automatically. Then, the centers of the two eyes can be quickly worked out as the centers of the two square bounding boxes. Fig. 4 shows the process of locating the two eyes and the final cropped image. No further alignment of facial features, such as alignment of the mouth, was performed. Additionally, no attempt was made to remove illumination changes, owing to the gray-scale invariance of LBP.

7 Experiments and Results Analysis
The cropped facial images of 110 × 150 pixels contain the main facial components such as mouth, eyes, brows and nose. For simplicity, we applied the LBP operator to the whole region of the cropped facial images. As done in [7, 8], we selected the 59-bin operator $LBP^{u2}_{P,R}$ and divided the 110 × 150 pixel face images into regions of 18 × 21 pixels, giving a good trade-off between recognition performance and feature vector length. Thus face images were divided into 42 (6 × 7) regions and represented by LBP histograms of length 2478 (59 × 42). The reduced feature dimension is limited to the range [2, 20]. We used the LIBSVM package [18] to implement the SVM algorithm with the radial basis function (RBF) kernel, kernel parameter optimization, and the one-against-one strategy for the multi-class classification problem. All extracted LBP features were normalized by a mapping to [0, 1] before any further processing.

To test the performance of LFDA, we use the JAFFE database to perform two types of facial expression recognition experiments: person-dependent experiments and person-independent experiments. For person-dependent experiments, the training data and testing data contain different images of the same persons. A more challenging application is to create a person-independent facial expression recognition system, since a facial expression recognition system in real-world scenarios should be able to recognize the expressions of new persons. Therefore, for person-independent experiments, each person appears in either the training data or the testing data, so that the persons in the training data are guaranteed to be independent of the persons in the testing data.

7.1 System Structure
In order to clarify how dimensionality reduction techniques like LFDA are employed on facial expression recognition tasks, Fig. 5 shows the basic structure of a facial expression recognition system based on dimensionality reduction techniques. As shown in Fig. 5, this system consists of three main components: feature extraction, feature dimensionality reduction and facial expression recognition.

In the feature extraction stage, the original facial images from the JAFFE facial expression database are divided into two parts: training data and testing data. The corresponding LBP features for the training data and testing data are extracted. The result of this stage is the extracted facial feature data represented by a set of high-dimensional LBP features.

The second stage aims at reducing the size of the LBP features and generating new low-dimensional embedded features with dimensionality reduction techniques such as LFDA, PCA, LDA and LPP. Note that the low-dimensional embedding is learnt on the training data only and must then be applied to the testing data. This is realized by using the out-of-sample extensions of the dimensionality reduction methods. Because all the methods used, i.e., LFDA, PCA, LDA and LPP, are linear, their out-of-sample extensions are performed straightforwardly by multiplying the testing data with the linear mapping matrix.

In the last stage, the trained SVM classifier is used in the low-dimensional embedded feature space to predict the facial expression categories of the testing data, and the recognition results are given.

Fig. 5 The basic structure of a facial expression recognition system based on dimensionality reduction
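The second stage can be sketched for LFDA with NumPy as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: it assumes samples are stored as rows, solves the trace criterion through the generalized eigenvectors of the local between-class and within-class scatter matrices (a common way to solve such ratio criteria), and adds a small ridge term `reg` to the within-class scatter because that matrix is singular when there are fewer samples than features. The function name `lfda_fit`, the parameter `reg`, and the toy data are hypothetical.

```python
import numpy as np


def lfda_fit(X, labels, d, k=7, reg=1e-6):
    """Learn a (D, d) LFDA mapping from training samples X of shape (n, D)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, D = X.shape

    # Squared Euclidean distances between all pairs of samples.
    gram = X @ X.T
    norms = np.diag(gram)
    sq = np.maximum(norms[:, None] + norms[None, :] - 2.0 * gram, 0.0)

    # Local scaling: sigma_i is the distance to the k-th nearest neighbour of x_i.
    kth = min(k, n - 1)
    sigma = np.maximum(np.sqrt(np.sort(sq, axis=1)[:, kth]), 1e-12)
    A = np.exp(-sq / np.outer(sigma, sigma))

    # Local within-class and between-class weights.
    same = labels[:, None] == labels[None, :]
    class_size = np.array([np.sum(labels == c) for c in labels], dtype=float)
    W_lw = np.where(same, A / class_size[:, None], 0.0)
    W_lb = np.where(same, A * (1.0 / n - 1.0 / class_size[:, None]), 1.0 / n)

    def scatter(W):
        # (1/2) sum_ij W_ij (x_i - x_j)(x_i - x_j)^T equals X^T (diag(W 1) - W) X.
        return X.T @ (np.diag(W.sum(axis=1)) - W) @ X

    S_lw = scatter(W_lw) + reg * np.eye(D)   # assumed ridge term for invertibility
    S_lb = scatter(W_lb)

    # Leading generalized eigenvectors of S_lb v = lambda * S_lw v.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_lw, S_lb))
    order = np.argsort(-eigvals.real)[:d]
    return eigvecs[:, order].real


# Hypothetical toy data: two classes separated along the first coordinate.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal([0.0, 0.0], [0.3, 1.0], size=(20, 2)),
                     rng.normal([5.0, 0.0], [0.3, 1.0], size=(20, 2))])
y_train = np.array([0] * 20 + [1] * 20)
X_test = np.array([[0.1, 0.5], [4.8, -0.3]])

T = lfda_fit(X_train, y_train, d=1)
Z_train = X_train @ T    # embedded training data
Z_test = X_test @ T      # out-of-sample extension: multiply by the learnt mapping
```

The mapping `T` is learnt from the training data only; the testing data are embedded by the same matrix product, after which any classifier can be trained on `Z_train` and applied to `Z_test`.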

7.2 Person-dependent Experiments
To evaluate the performance of the algorithms for person-dependent facial expression recognition on the JAFFE database, a 10-fold stratified cross-validation scheme was used for the facial expression recognition experiments and the average recognition results were reported. In 10-fold cross-validation, the original samples are randomly partitioned into ten subsets. Of the ten subsets, one subset is retained as the validation data for testing the model, and the remaining nine subsets are used as training data. This process is then repeated ten times, with each of the ten subsets used exactly once as the validation data. Then the average result across the ten folds is computed.

The person-dependent recognition results of the different dimensionality reduction methods, i.e., PCA, LDA, LPP and LFDA, are given in Fig. 6. It is worth pointing out that the reduced dimension of LDA is restricted to the range [2, 6], because LDA can find at most 6 (one less than the 7 categories of expression) meaningful embedded features due to the rank deficiency of the between-class scatter matrix [3]. The best accuracy of the different methods with the corresponding reduced dimension is presented in Table 1. Note that the Baseline method denotes the result obtained on the original 2478-dimensional LBP features without any dimensionality reduction.

From the results in Fig. 6 and Table 1, we can make the following observations. First, LFDA obtains the highest accuracy of 90.7% with reduced features, outperforming the other methods, i.e., Baseline, LDA, LPP and PCA. This indicates that LFDA is capable of extracting the most discriminative low-dimensional embedded data representations for facial expression recognition. Second, LDA performs better than PCA and LPP, since LDA is a supervised dimensionality reduction method and can extract low-dimensional embedded data representations with higher discriminative power than PCA and LPP. Third, PCA outperforms LPP.
This may be because PCA retains the information relevant to variation while reducing redundant information, so that PCA is more capable of extracting discriminative information than LPP. Finally, there is no significant improvement in facial expression recognition performance if more reduced feature dimensions are used. This shows that in our experiments it is acceptable to confine the reduced target feature dimension to the range [2, 20].

To further explore the recognition accuracy of each expression when LFDA performs best, Table 2 gives the confusion matrix of the 7-class facial expression recognition results in the person-dependent case. From Table 2 we can observe that four expressions, i.e., anger, joy, disgust and neutral, are classified with more than 90% accuracy, while the other three expressions, sad, surprise and fear, are discriminated with relatively low accuracy (less than 90%).

Fig. 6 Person-dependent recognition results versus reduced dimension (x-axis: reduced dimension; y-axis: recognition accuracy / %; curves: LDA, LPP, PCA, LFDA)

Table 1 The best recognition accuracy in the person-dependent case for different methods with the corresponding reduced dimension
Methods     Dimension   Accuracy (%)
Baseline    2478        87.3
LDA         5           86.33
LPP         4           6.6
PCA         9           84.4
LFDA                    90.70

Compared with previously reported results [8-11] in the person-dependent case on the JAFFE database, the best recognition accuracy of 90.7% with reduced features obtained in our work based on LBP and LFDA is highly competitive. In [8], with experimental settings similar to ours, the best accuracy of 81% was obtained with SVM and LBP features. In [9], local texture information was extracted by applying LBP to facial feature points, and shape information was also considered in the form of pair directions. In addition, LBP was applied to the entire image to obtain global texture information. Combining these three types of features with the nearest neighbour classifier, an accuracy of 83% was reported. In [10], a recognition accuracy of 85.57% was achieved by using SVM and LBP features, but 10-fold cross-validation was not performed. In [11], by using LBP features and the linear programming technique, an accuracy of 93.8% was reported. Nevertheless, the images there were preprocessed with the CSU Face Identification Evaluation System [19] to exclude the non-face area with an elliptical mask.

Table 2 Confusion matrix of 7-class facial expression recognition obtained by LFDA in the person-dependent case (columns in order: Anger, Joy, Sad, Surprise, Disgust, Fear, Neutral)
Anger     9.38 0 5.76 0.08 0.6 0.3 .39
Joy       0 96. 0.6 0.55 0. 0.9 .57
Sad       0.43 84.5 0.8 .4 .57 .43
Surprise  0.6 .54 0.3 88.3 0.8 6.04
Disgust   .38 0.4 4.63 0 90.76 3.09 0
Fear      0.9 0 5.69 .8 . 87.47 3.5
Neutral   0 0.45 0.89 0.03 95.63

7.3 Person-independent Experiments
To evaluate the performance of the algorithms for person-independent facial expression recognition on the JAFFE database, we first split the whole set of 213 facial images into ten groups according to the persons the JAFFE database contains, with each group including all seven expressions of one distinct person. Then the so-called leave-one-person-out cross-validation strategy is used in the experiments. That is, each time, the facial expression images of one person are used for testing and all the images of the remaining persons are used for training. The process is repeated for each person, and the average is the final recognition rate.

Fig. 7 gives the person-independent recognition results of the different dimensionality reduction methods. Table 3 presents the best accuracy of the different methods with the corresponding reduced dimension. As shown in Fig. 7 and Table 3, LFDA still performs best among all the methods used for person-independent facial expression recognition. In detail, LFDA gives the highest accuracy of 65.9% (see Table 3 for the corresponding reduced dimension).
In addition, comparing the person-independent recognition results in Fig. 7 with the person-dependent recognition results in Fig. 6, we can observe that the recognition accuracy in the person-independent case is much lower than in the person-dependent case. More precisely, the best accuracy is about 90% for person-dependent facial expression recognition, but only about 65% for person-independent facial expression recognition. An accuracy of about 65% in the person-independent case is reasonable, since human beings themselves can normally recognize expressions with an accuracy of only about 60% [20].

Table 4 presents the confusion matrix of the 7-class expression recognition results in the person-independent case when LFDA obtains its best performance. As shown in Table 4, neutral is identified best, with an accuracy of 86.43%, whereas the other six expressions are classified with less than 80% accuracy.

Fig. 7 Person-independent recognition results versus reduced dimension (x-axis: reduced dimension; y-axis: recognition accuracy / %; curves: LDA, LPP, PCA, LFDA)
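The two evaluation protocols used in the person-dependent and person-independent experiments can be sketched as index splits in plain Python. This is an illustrative sketch only: the function names, the round-robin way of dealing class members into folds, and the toy labels are assumptions, and the held-out person is taken to be the test set, as the name leave-one-person-out suggests.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train, test) index lists; each fold keeps the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the shuffled members of each class round-robin into the k folds.
        for pos, idx in enumerate(members):
            folds[(pos + offset) % k].append(idx)
        offset += len(members)

    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        yield train, sorted(test)


def leave_one_person_out(person_ids):
    """Yield (train, test) index lists; all images of one person are held out per round."""
    for person in sorted(set(person_ids)):
        test = [i for i, p in enumerate(person_ids) if p == person]
        train = [i for i, p in enumerate(person_ids) if p != person]
        yield train, test


# Hypothetical toy labels: 7 expressions x 30 images, spread over 10 persons.
expressions = [e for e in range(7) for _ in range(30)]
persons = [i % 10 for i in range(len(expressions))]

cv_splits = list(stratified_k_fold(expressions, k=10))
lopo_splits = list(leave_one_person_out(persons))
print(len(cv_splits), len(lopo_splits))  # prints: 10 10
```

In either protocol, a recognition rate would be computed on each test part and then averaged over the rounds.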

Table 3 The best recognition accuracy in the person-independent case for different methods with the corresponding reduced dimension
Methods     Dimension   Accuracy (%)
Baseline    2478        6.8
LDA         6           60.9
LPP         6           36.6
PCA         7           59.4
LFDA        0           65.9

Compared with the previously reported results [21-23] in the person-independent case on the JAFFE database, the recognition accuracy of about 65% in our experiments is still comparable. In [21], the method of PCA+LDA was used to extract statistical features, and the difference of statistical features was obtained for every expression. Finally, by using the difference of statistical features and the nearest neighbour classifier, the highest accuracy of 6.78% was reported. In [22], based on two-dimensional facial expression feature extraction methods, including two-dimensional principal component analysis (2DPCA), two-dimensional linear discriminant analysis (2DLDA) and generalized low rank approximation of matrices (GLRAM), recognition accuracies of 63.%, 60.5% and 6.4% were achieved with an SVM classifier for 2DPCA, 2DLDA and GLRAM, respectively. In [23], based on the 2nd-order gray-level raw pixels and the encoded 3rd-order tensor-formed Gabor features of facial expression images, the orthogonal tensor neighbourhood preserving embedding (OTNPE) algorithm was employed for dimensionality reduction, and about 50% accuracy was obtained with the nearest neighbour classifier.

Table 4 Confusion matrix of 7-class facial expression recognition obtained by LFDA in the person-independent case (columns in order: Anger, Joy, Sad, Surprise, Disgust, Fear, Neutral)
Anger     64.65 0.48 0.3 8.04 .87 .83
Joy       0.4 6.76 5.04 0.07 0. 0.78
Sad       3. .97 66.3 7.68 8.4 5.45 6.
Surprise  0.06 5.3 .34 7.4 0.05 .7 7.84
Disgust   4.97 .03 8.94 0.3 56.8 .79 5.4
Fear      .09 0.98 9.85 0.4 3.6 5.97 .6
Neutral   0 0.7 7.6 4. 0.09 .3 86.43

8 Conclusions
Facial expression recognition has attracted more and more attention due to its important applications in a wide range of areas.
One key step in facial expression recognition is to extract low-dimensional discriminative features before the feature data are fed into a classifier. In this paper, we presented a new method of facial expression recognition based on LBP and LFDA. The experimental results on the popular JAFFE facial expression database indicate that LFDA performs better than PCA, LDA and LPP, and obtains a promising accuracy of 90.7% with reduced features. This is attributed to the fact that LFDA is better able than PCA, LDA and LPP to extract low-dimensional discriminative embedded data representations for facial expression recognition. In the future, it would be interesting to employ LFDA to construct a real-time facial expression recognition system for natural human-computer interaction.

Acknowledgments
This work is supported by Zhejiang Provincial Natural Science Foundation of China (Grant No.Z0048, No. Y058).

References:
[1] Y Tian, T Kanade, and J Cohn, Facial expression analysis, Handbook of face recognition, Springer, 2005.
[2] M A Turk, and A P Pentland, Face recognition using eigenfaces, Proc. IEEE Conference on Computer Vision and Pattern Recognition, 1991, pp. 586-591.
[3] P N Belhumeur, J P Hespanha, and D J Kriegman, Eigenfaces vs. fisherfaces: Recognition using class specific linear projection, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, 1997, pp. 711-720.
[4] M J Lyons, J Budynek, and S Akamatsu, Automatic classification of single facial images, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 12, 1999, pp. 1357-1362.
[5] T Ojala, M Pietikäinen, and T Mäenpää, Multiresolution gray scale and rotation invariant texture analysis with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 7, 2002, pp. 971-987.
[6] T Ojala, M Pietikäinen, and D Harwood, A comparative study of texture measures with classification based on featured distributions, Pattern Recognition, Vol. 29, No. 1, 1996, pp. 51-59.
[7] C Shan, S Gong, and P McOwan, Robust facial expression recognition using local binary patterns, Proc. IEEE International Conference on Image Processing, 2005, pp. 370-373.
[8] C Shan, S Gong, and P McOwan, Facial expression recognition based on Local Binary Patterns: A comprehensive study, Image and Vision Computing, Vol. 27, No. 6, 2009, pp. 803-816.
[9] X Feng, B Lv, Z Li, et al., A Novel Feature Extraction Method for Facial Expression Recognition, Proc. Joint Conference on Information Sciences, 2006.
[10] S Liao, W Fan, A Chung, et al., Facial expression recognition using advanced local binary patterns, Tsallis entropies and global appearance features, Proc. IEEE International Conference on Image Processing, 2006, pp. 665-668.
[11] X Feng, M Pietikäinen, and A Hadid, Facial expression recognition with local binary patterns and linear programming, Pattern Recognition and Image Analysis, Vol. 15, No. 2, 2005, pp. 546-548.
[12] M Sugiyama, T Idé, S Nakajima, et al., Semi-supervised local Fisher discriminant analysis for dimensionality reduction, Machine Learning, Vol. 78, No. 1, 2010, pp. 35-61.
[13] X He, and P Niyogi, Locality preserving projections, Advances in neural information processing systems (NIPS), MIT Press, 2003.
[14] M Lyons, S Akamatsu, M Kamachi, et al., Coding facial expressions with Gabor wavelets, Proc. Third IEEE International Conference on Automatic Face and Gesture Recognition, 1998, pp. 200-205.
[15] Y Tian, Evaluation of face resolution for expression analysis, Proc. First IEEE Workshop on Face Processing in Video, 2004, pp. 82-82.
[16] P Viola, and M Jones, Robust real-time face detection, International Journal of Computer Vision, Vol. 57, No. 2, 2004, pp. 137-154.
[17] V Vapnik, The nature of statistical learning theory, Springer, 2000.
[18] Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[19] D Bolme, M Teixeira, J Beveridge, and B Draper, The CSU Face Identification Evaluation System User's Guide: Its Purpose, Feature and Structure, Proc. 3rd International Conference on Computer Vision Systems, 2003, pp. 304-313.
[20] T Jinghai, Y Zilu, and Z Youwei, The contrast analysis of facial expression recognition by human and computer, Proc. 8th International Conference on Signal Processing, 2006, pp. 649-653.
[21] G Xue, and Z Youwei, Facial Expression Recognition Based on the Difference of Statistical Features, Proc. 8th International Conference on Signal Processing, 2006, pp. 6-0.
[22] Y Zilu, L Jingwen, and Z Youwei, Facial expression recognition based on two dimensional feature extraction, Proc. 9th International Conference on Signal Processing, 2008, pp. 440-444.
[23] S Liu, and Q Ruan, Orthogonal tensor neighborhood preserving embedding for facial expression recognition, Pattern Recognition, Vol. 44, 2011, pp. 1497-1513.