Single Sample Face Recognition via Learning Deep Supervised Auto-Encoders Shenghua Gao, Yuting Zhang, Kui Jia, Jiwen Lu, Yingying Zhang


Abstract—This paper targets learning robust image representation for single training sample per person face recognition. Motivated by the success of deep learning in image representation, we propose a supervised auto-encoder, which is a new type of building block for deep architectures. Two features distinguish our supervised auto-encoder from the standard auto-encoder. First, we enforce the faces with variants to be mapped to the canonical face of the person, for example, the frontal face with neutral expression and normal illumination; second, we enforce features corresponding to the same person to be similar. As a result, our supervised auto-encoder extracts features which are robust to variances in illumination, expression, occlusion, and pose, and facilitates face recognition. We stack such supervised auto-encoders to get the deep architecture and use it for extracting features in image representation. Experimental results on the AR, Extended Yale B, CMU-PIE, and Multi-PIE datasets demonstrate that, by coupling with the commonly used sparse representation based classification, our stacked supervised auto-encoders based face representation significantly outperforms the commonly used image representations in single sample per person face recognition, and it achieves higher recognition accuracy compared with other deep learning models, including the deep Lambertian network, in spite of much less training data and without any domain information. Moreover, the supervised auto-encoder can also be used for face verification, which further demonstrates its effectiveness for face representation.

Index Terms—Single training sample per person; Face recognition; Supervised Auto-encoder; Deep architecture

I. INTRODUCTION

Single sample per person (SSPP) face recognition [1], [2]^1 is a very important research topic in computer vision because of its potential applications in many realistic scenarios like passport identification, gate ID identification, video surveillance, etc. However, as shown in Fig. 1, there is only one training image (gallery image) per person in SSPP, and the faces to be recognized may contain lots of variances in, for example, illumination, expression, occlusion, pose, etc. Therefore, SSPP face recognition is an extremely challenging task. Because of the presence of such challenging intra-class variances, a robust face representation which can overcome the effect of these variances is extremely desirable, and will greatly facilitate SSPP face recognition. However, restricted by having only one training sample for each person, the commonly used subspace analysis based face representation methods, like Eigenfaces [4] and Fisherfaces [5], are no longer suitable nor applicable to SSPP.

Shenghua Gao and Yingying Zhang are with ShanghaiTech University, Shanghai, China. Yuting Zhang is with Zhejiang University, Zhejiang province, China. Kui Jia is with the University of Macau, Macau SAR, China. Jiwen Lu is with the Advanced Digital Sciences Center, Singapore. E-mail: gaoshh@shanghaitech.edu.cn. Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

^1 SSPP is one specific task of one-shot learning [3].

Fig. 1. Samples of gallery images (the first column) and probe images (the rest of the faces) on (a) the AR dataset, (b) the Extended Yale B dataset, (c) the CMU-PIE dataset (poses C09, C29, C27, C07, C05), and (d) the Multi-PIE dataset (0°, -15°, 15°).
To seek a good SSPP image representation, lots of endeavors have been made, which achieve good performance under certain settings. For example, with the help of manually generated virtual faces [6] or an external dataset [1], traditional face representation methods can be extended to the SSPP scenario. However, the performance of these methods is still not satisfactory for challenging real data. After representing each face with a feature vector, a classification technique, like Nearest Neighbor or sparse representation based classification (SRC) [7], can be used to predict the labels of the probe images. Deep neural networks have demonstrated their great successes in image representation [8], [9], and their fundamental ingredient is the training of a nonlinear feature extractor at each layer [10], [11], [12]. After the layer-wise training of each building block and the building of a deep architecture, the output of the network is used as the image representation in the subsequent task. As a typical building block in deep neural networks, the Denoising Auto-Encoder [11] extracts features through a deterministic nonlinear mapping, and it is robust to the noises of the data. Image representations based on the denoising auto-encoder have shown good performance in many tasks, like object recognition, digit recognition, etc. Motivated by the success of denoising auto-encoder based deep neural networks, and driven by the SSPP face recognition task, we propose a supervised auto-encoder to build the deep neural network.

We first treat the faces with all types of variants (for example, illumination, expression, occlusion, or poses) as images contaminated by noises. With a supervised auto-encoder, we can recover the face without the variant; meanwhile, such a supervised auto-encoder also extracts robust features for image representation in the SSPP scenario, i.e., the features corresponding to the same person should be the same (ideally) or similar, and such features should be stable to the intra-class variances which commonly exist in the SSPP scenario. The contributions of this paper are two-fold. Firstly, we propose a new type of building block, the supervised auto-encoder, for building the deep neural network. Different from the standard auto-encoder, on the one hand, we enforce all the faces with variances to be mapped to the canonical face of the person, for example, the frontal face with neutral expression and normal illumination. Such a strategy helps remove the variances in face recognition. On the other hand, by imposing similarity preservation constraints on the extracted features, the supervised auto-encoder makes the features corresponding to the same person similar, and therefore it extracts more robust features for face representation. Secondly, by leveraging the supervised auto-encoder, robust features can be extracted for image representation in SSPP face recognition, which therefore improves the recognition accuracy of SSPP. The rest of the paper is organized as follows: we review the related work, including the commonly used image representation methods for the SSPP scenario as well as the auto-encoder and its variants, in Section II. In Section III, we introduce our supervised auto-encoder and discuss its application in SSPP. We experimentally evaluate the proposed technique and its parameters in Section IV, and conclude our work in Section V.

II. RELATED WORK

I: Work Related to SSPP Face Representation. Though subspace analysis based methods are usually adopted for effective and efficient face representation in general face recognition, they are no longer suitable nor applicable to the SSPP scenario. On the one hand, because of the limited number of gallery images and the uncertainty of the variances between the probe images and the gallery image, it is not easy to estimate the data distribution and get the proper projection matrix for unsupervised methods, like Eigenfaces [4], 2DPCA [13], etc. To make these unsupervised methods more suitable for SSPP, by taking advantage of virtual faces generated by different methods, Projection-Combined Principal Component Analysis ((PC)^2A) [14], Enhanced (PC)^2A (E(PC)^2A) [15], etc., have been proposed. On the other hand, to make FLDA [5] based image representation usable in the SSPP case, where the intra-class variance cannot be estimated directly because only one training sample is provided for each person, virtual samples are usually generated by using small perturbations [16], transformations [17], SVD decomposition [6], or subimages obtained by dividing each image into small patches [18]. Moreover, the intra-class variances can also be estimated from a generic dataset [1], [19], where each subject contains images with variances in pose, expression, illumination, occlusion, etc. Recently, with the emergence of deep learning techniques, Tang et al. proposed a Deep Lambertian Network (DLN) [20] which combines Deep Belief Nets [10] with the Lambertian reflectance assumption. Therefore DLN extracts illumination invariant features for face representation and it also shows good performance under the SSPP setting, but it is not able to handle other variances, like expressions, poses, occlusions, etc., which restricts its application in real world problems.
II: Work Related to the Auto-Encoder. The auto-encoder, also termed the autoassociator or Diabolo network, is one of the commonly used building blocks in deep neural networks. It contains two modules. (i) An encoder maps the input x to the hidden nodes through some deterministic mapping function f: h = f(x). (ii) A decoder maps the hidden nodes back to the original input space through another deterministic mapping function g: x̂ = g(h). For real-valued input, by minimizing the reconstruction error ||x - g(f(x))||_2^2, the parameters of the encoder and decoder can be learnt. Then the output of the hidden layer is used as the feature for image representation. It has been shown that such a nonlinear auto-encoder is different from PCA [21], and it has been proven that training an auto-encoder to minimize reconstruction error amounts to maximizing a lower bound on the mutual information between the input and the learnt representation [11]. To further boost the ability of the auto-encoder for image representation in building deep networks, Vincent et al. [11] proposed the denoising auto-encoder, which enhances its generalization by training with locally corrupted inputs. Rifai et al. enhance the robustness of the auto-encoder to noises by adding the Jacobian [22], or the Jacobian and Hessian at the same time [23], into the objective of the basic auto-encoder. Zou et al. [24] use a pooling operation after the Reconstruction Independent Component Analysis [25] encoding process, and enforce the pooled features to be similar for instances with the same class label. These extensions improve the performance of auto-encoder based neural networks for image representation in object and digit recognition.

III: Deep Learning based Face Verification Systems. In [26], a Siamese Network (SN) is proposed for face verification, and it is based on a Convolutional Neural Network. In our experiments, we tried different settings and use the one with the best performance. As the SN is proposed for face verification, to do face recognition we run face verification over all probe-gallery face pairs and choose the pair that is most similar. In this way, we can predict the label of the probe. Recently, Taigman et al. also proposed a Convolutional Neural Network based face verification system [27], which significantly outperforms existing hand-crafted features based systems on the Labeled Faces in the Wild (LFW) database. Similarly, Fan et al. use another variant of convolutional neural network, termed Pyramid CNN [41], for face verification, and also achieve similar performance on the LFW database. But these works are proposed for face verification, where a pair of faces is given and the algorithm identifies whether the two faces belong to the same person or not (a random guess is correct 50% of the time). Our task solves face recognition in which, given a test image,

it tries to identify precisely the correct person among many; a random guess gives one over the number of subjects. Besides the difference in the tasks to be solved, the network architecture of our paper is also different from all these existing works. Moreover, our work is also closely related to the work of Zhu et al. [28], [29]. In [28], supervised training is deployed to train a network consisting of an encoding layer and a pooling layer. In [29], multiple CNNs trained on different regions of the face and a regression layer are trained to solve the face verification task. Similar to these works, our work also makes faces of the same person be represented similarly, but we are based on different architectures. Moreover, different from [28], [29], the features learnt by our method are not designed for one specific task. As shown later, our model can be used for face recognition or face verification.

III. SUPERVISED AUTO-ENCODER FOR SSPP FACE REPRESENTATION

The motivation of our supervised auto-encoder for SSPP face representation comes from the denoising auto-encoder [11]. In this section, we will first revisit the denoising auto-encoder. Then we will propose our supervised auto-encoder, its formulation, its optimization, its differences with the denoising auto-encoder, and its application in SSPP face representation.

A. A Revisit of the Denoising Auto-Encoder

A denoising auto-encoder tries to reconstruct the clean input data from a manually corrupted version of it. Mathematically, denote the input as x. In the denoising auto-encoder, the input x is first corrupted by some predefined noise, for example, additive Gaussian noise (x̃ | x ~ N(x, σ²I)), masking noise (a fraction of x is forced to 0), or salt-and-pepper noise (a fraction of x is forced to be 0 or 1). Such a corrupted x̃ is used as the input of the encoder: h = f(x̃) = s_f(W x̃ + b_f). Then the output of the encoder h is fed into the decoder: x̂ = g(h) = s_g(W' h + b_g). Here s_f and s_g are predefined activation functions of the encoder and decoder respectively, which can be sigmoid functions, hyperbolic tangent functions, or rectifier functions [30], etc. W ∈ R^{d_h × d_x} and b_f ∈ R^{d_h} are the parameters of the encoder, and W' ∈ R^{d_x × d_h} and b_g ∈ R^{d_x} are the parameters of the decoder. d_x and d_h are the dimensionality of the input data and the number of hidden nodes respectively. Based on the above definitions, the objective of the denoising auto-encoder is given as follows:

\min_{W, W', b_f, b_g} \sum_{x \in X} L(x, \hat{x})    (1)

Here L is the reconstruction error, typically the squared error L(x, x̂) = ||x - x̂||² for real-valued inputs. After learning f and g, the output of the encoder on the clean input (f(x)) is used as the input of the next layer. By training such denoising auto-encoders layer by layer, stacked denoising auto-encoders are built. Experimental results show that stacked denoising auto-encoders greatly improve the generalization performance of the neural network. Even when the fraction of corrupted pixels (corrupted by zero masking noise) reaches up to 55%, the recognition accuracy is still better than or comparable with that of a network trained without corruption.

B. Supervised Auto-Encoder

In real applications, like passport or gate ID identification, the only training sample (gallery image) for each person is usually a frontal face with frontal/uniform lighting, neutral expression, and no occlusion. However, the test faces (probe images) are usually accompanied by variances in illumination, occlusion, expression, pose, etc. Compared to the denoising auto-encoder, these gallery images can be seen as clean data and these probe images can be seen as corrupted data. For robust face recognition, we desire to learn features which are robust to these variances.
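As a concrete reference point for the denoising auto-encoder just revisited, the following is a minimal NumPy sketch of one DAE forward pass and its reconstruction loss (equation (1)). The masking rate, layer sizes, tanh activations, and tied weights are illustrative assumptions, not values prescribed by [11] or by this paper.

```python
# Sketch of one denoising auto-encoder forward pass (Section III-A), under assumed settings.
import numpy as np

rng = np.random.default_rng(0)
d_x, d_h, n = 1024, 1024, 8            # input dim (e.g., 32x32 faces), hidden dim, batch size

W   = rng.uniform(-np.sqrt(6.0 / (d_h + d_x)), np.sqrt(6.0 / (d_h + d_x)), (d_h, d_x))
b_f = np.zeros(d_h)                    # encoder bias
b_g = np.zeros(d_x)                    # decoder bias (tied weights: decoder uses W.T)

def corrupt(x, mask_rate=0.10):
    """Zero-masking noise: force a random fraction of the pixels to 0."""
    mask = rng.random(x.shape) >= mask_rate
    return x * mask

def encode(x):
    return np.tanh(x @ W.T + b_f)      # h = f(x)

def decode(h):
    return np.tanh(h @ W + b_g)        # x_hat = g(h), tied weights W' = W^T

X       = rng.standard_normal((n, d_x))   # stand-in for vectorized face images
X_tilde = corrupt(X)                      # corrupted input fed to the encoder
X_hat   = decode(encode(X_tilde))         # reconstruction of the clean input

recon_loss = np.mean(np.sum((X - X_hat) ** 2, axis=1))   # squared error of equation (1)
print("reconstruction loss:", recon_loss)
```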
The success of the denoising auto-encoder convinces us of the possibility of learning such features. Then the problem becomes: how do we learn a mapping function which captures the discriminative structures of the faces of different persons, while staying robust to the possible variances of these faces? Once such a function is learnt, robust features can be extracted for image representation, with an expected improvement to the performance of SSPP face recognition. Given a set of data which contains the gallery images (clean data), the probe images (corrupted data), as well as their labels, we use them to train a deep neural network for feature extraction. We denote each probe image in this dataset as x̃_i, and its corresponding gallery image as x_i (i = 1, ..., N). It is desirable that x_i and x̃_i should be represented similarly. Therefore the following formulation is proposed (following the work [11], [22], only the tied weights case is explored in this paper, i.e., W' = W^T):

\min_{W, b_f, b_g} \frac{1}{N} \sum_i \left( \| x_i - g(f(\tilde{x}_i)) \|_2^2 + \lambda \| f(x_i) - f(\tilde{x}_i) \|_2^2 \right) + \alpha \left( \mathrm{KL}(\rho_x \| \rho_0) + \mathrm{KL}(\rho_{\tilde{x}} \| \rho_0) \right)    (2)

where

\rho_x = \frac{1}{N} \sum_i \frac{1}{2}\left( f(x_i) + 1 \right), \quad \rho_{\tilde{x}} = \frac{1}{N} \sum_i \frac{1}{2}\left( f(\tilde{x}_i) + 1 \right),
\mathrm{KL}(\rho \| \rho_0) = \sum_j \left( \rho_0 \log\frac{\rho_0}{\rho_j} + (1 - \rho_0) \log\frac{1 - \rho_0}{1 - \rho_j} \right).    (3)

In this paper, the activation functions used are the hyperbolic tangent, i.e., h = f(x) = tanh(W x + b_f) and g(h) = tanh(W^T h + b_g). We list the properties of the supervised auto-encoder as follows: 1) The first term in equation (2) is the reconstruction error. It means that though the probe images contain some variances, after passing through the encoder and the decoder, they will be repaired. In this way, our learnt model is robust to the variances of expression, occlusion, pose, etc., which are quite different from the noises in the denoising auto-encoder.

Fig. 2. Architecture of the Stacked Supervised Auto-Encoders. The left figure: the basic supervised auto-encoder, which is comprised of the clean/corrupted faces, their features (hidden layer), as well as the clean face reconstructed from the corrupted face. The middle figure: the output of the previous hidden layer is used as the input to train the next supervised auto-encoder. We repeat such training several times until the desired number of hidden layers is reached. In this paper, only two hidden layers are used. The right figure: once the network is trained, given any input face, the output of the last hidden layer is used as the feature for image representation.

2) The second term in equation (2) is the similarity preservation term. Since the output of the hidden layer is used as the feature, f(x_i) and f(x̃_i) correspond to the features of the same person. It is desirable that they should be the same (ideally) or similar. Such a constraint enforces the learning of a nonlinear mapping robust to the variances that commonly appear in SSPP. 3) The third and the fourth terms, the Kullback-Leibler divergence (KL divergence) terms in equation (2), introduce sparsity in the hidden layer. Work from biological studies shows that the percentage of neurons of the human brain activated at the same time is around 1% to 4% [31]. Therefore the sparsity constraint on the activation of the hidden layer is commonly used in auto-encoder based neural networks, and results show that a sparse auto-encoder often achieves better performance [32] than one trained without the sparsity constraint. Since the activation function we use is the hyperbolic tangent, its output is between -1 and 1, where the value of -1 is regarded as non-activated [33]. Therefore we first map the output of the encoder to the range (0, 1) in equation (3). Here ρ_x and ρ_x̃ are the mapped average activations of the clean data and the corrupted data respectively. By choosing a small ρ_0, the KL divergence regularizer enforces that only a small fraction of the neurons are activated [34], [33]. Following the work [34], [33], we set ρ_0 to the same small value used there. 4) Weighting the first and the second terms with 1/N (N is the total number of training samples) helps balance the contributions of the first two terms and the last terms in the optimization. Otherwise we might need to tune α on different datasets because the number of training samples for different datasets may differ. 5) We aim at learning a feature extractor that represents faces corresponding to the same person similarly, whereas [35] and [26] focus on learning a distance metric in the last layer of their respective DNN architectures. Our work is different from [35] and [26] in terms of network architecture and application. 6) Since the labels of the faces are used while training this building block, we term our proposed formulation the Supervised Auto-Encoder (SAE). We illustrate the idea of such a supervised auto-encoder in Fig. 2 (the left figure).

C. Optimization of the Supervised Auto-Encoder

The optimization method is very important for the good performance of deep neural networks. Following the work [36], [37], the entries of W are randomly sampled from the uniform distribution on [-\sqrt{6}/\sqrt{d_h + d_x}, \sqrt{6}/\sqrt{d_h + d_x}], and b_f and b_g are initialized with zero vectors. It is worth noting that such normalized initialization is very important for the good performance of auto-encoder based methods, presumably because the layer-to-layer transformations maintain the magnitudes of activations (flowing upward) and gradients (flowing backward) [36].
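To make the objective of equations (2)-(3) concrete, here is a short NumPy sketch of the SAE loss with the normalized initialization just described. The batch size, the values of λ, α, and ρ_0, and the synthetic inputs are illustrative assumptions; the paper's actual settings are given in Section IV-A.

```python
# Sketch of the supervised auto-encoder loss, equations (2)-(3), under assumed parameter values.
import numpy as np

rng = np.random.default_rng(0)
d_x, d_h, N = 1024, 1024, 16

# Normalized initialization of Section III-C: W ~ U[-sqrt(6/(d_h+d_x)), +sqrt(6/(d_h+d_x))]
bound = np.sqrt(6.0 / (d_h + d_x))
W   = rng.uniform(-bound, bound, (d_h, d_x))
b_f = np.zeros(d_h)
b_g = np.zeros(d_x)

def f(x):                                # encoder with tanh activation
    return np.tanh(x @ W.T + b_f)

def g(h):                                # decoder with tied weights W' = W^T
    return np.tanh(h @ W + b_g)

def kl(rho, rho0):
    """KL divergence regularizer of equation (3), summed over hidden units."""
    rho = np.clip(rho, 1e-6, 1 - 1e-6)   # numerical safety only, not part of eq. (3)
    return np.sum(rho0 * np.log(rho0 / rho) +
                  (1 - rho0) * np.log((1 - rho0) / (1 - rho)))

def sae_loss(X, X_tilde, lam=5.0, alpha=1e-3, rho0=0.05):
    """Objective of equation (2).  X holds gallery (clean) faces, X_tilde the
    row-aligned probe (corrupted) faces of the same persons; lam, alpha, rho0
    are illustrative values."""
    H_clean, H_corr = f(X), f(X_tilde)
    recon = np.sum((X - g(H_corr)) ** 2, axis=1)        # ||x_i - g(f(x~_i))||^2
    sim   = np.sum((H_clean - H_corr) ** 2, axis=1)     # ||f(x_i) - f(x~_i)||^2
    # tanh outputs lie in (-1, 1); map the average activations into (0, 1) as in eq. (3)
    rho_clean = 0.5 * (H_clean.mean(axis=0) + 1)
    rho_corr  = 0.5 * (H_corr.mean(axis=0) + 1)
    return (recon + lam * sim).mean() + alpha * (kl(rho_clean, rho0) + kl(rho_corr, rho0))

X       = rng.standard_normal((N, d_x))                 # stand-in for gallery features
X_tilde = X + 0.1 * rng.standard_normal((N, d_x))       # stand-in for probe images
print("SAE loss:", sae_loss(X, X_tilde))
```

In practice this scalar loss (and its gradient with respect to W, b_f, b_g) would be handed to an L-BFGS routine, as described next.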
Then the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm is used for learning the parameters because of its faster convergence rate and better performance compared to stochastic gradient descent and conjugate gradient [38] methods. As the computational cost of the similarity preservation term (the 2nd term in equation (2)) and of the calculation of its gradient with respect to the unknown parameters is almost the same as that of the reconstruction error term, the overall computational complexity is still O(d_x d_h) [22] in our supervised auto-encoder.

D. Stacking Supervised Auto-Encoders to Build a Deep Architecture

Deep neural networks demonstrate better performance for image representation than shallow neural networks [32]. Therefore we also train the supervised auto-encoder in a layer-wise manner to get the deep architecture. We term such an architecture the Stacked Supervised Auto-Encoders (SSAE). After learning the activation function of the encoder in the previous layer, it is applied to the clean data and the corrupted data respectively, and the outputs serve as the clean input data and corrupted input data to train the supervised auto-encoder in the next layer. Once the whole network is trained, the output

of the highest layer is used as the feature for image representation. We illustrate this learning strategy in Fig. 2.

E. The Differences Between Stacked Supervised Auto-Encoders and Stacked Denoising Auto-Encoders

Our stacked supervised auto-encoders differ from the stacked denoising auto-encoders in the following three aspects: 1. Input. The denoising auto-encoder is an unsupervised feature learning method, and the corrupted data are generated by manually corrupting the clean data with predefined noises. Therefore the trained model is robust to these predefined noises. However, in our supervised auto-encoder, the corrupted data are photos taken under different settings, and they are physically meaningful data. By enforcing the reconstructed probe image to be similar to its gallery image, we can overcome the possible variances which appear in SSPP face recognition, and which are quite different from the noises in the denoising auto-encoder. This is why our supervised auto-encoder needs the labels of the data to train the network. 2. Formulation. The similarity preservation term is used to further emphasize the desired property of SSPP image representation. Though we enforce the reconstructed probe image to be similar to the gallery image, it cannot be guaranteed that their extracted features are similar, because the variances in pose, expression, illumination, etc. can make the probe and the gallery quite different. But in SSPP face recognition, it is desirable that the images of the same person have similar representations. To this end, the similarity preservation term is added to the objective of SAE in equation (2). Such a term further makes the extracted features robust to the variances in face recognition. As shown in Section IV-D, this term greatly improves the recognition accuracy, especially for the cases with large intra-class variances. 3. Building a deep architecture. In stacked denoising auto-encoders, once the activation function of the encoder in the previous layer is trained, it maps the clean data to the hidden layer, and the outputs serve as the clean input data of the next-layer DAE. Then noises generated in the same way as in the previous layer are added to the clean data to serve as the corrupted data. As claimed in [34], since the features in the hidden layer are no longer in the same feature space as the input data, the meaning of applying noises generated in the same way as in the previous layer to the output of the hidden layer is no longer clear. Similar to [34], we apply the activation function of the encoder to both the clean data and the corrupted data, and their outputs serve as the clean and corrupted input data of the next layer. Hence this way of stacking the basic supervised auto-encoders is more natural for building a deep architecture.

IV. EXPERIMENTS

In this section, we experimentally evaluate the stacked supervised auto-encoder for extracting features in the SSPP face recognition task on the Extended Yale B, CMU-PIE, AR, and Multi-PIE datasets. Important parameters are also experimentally evaluated. Besides face recognition, we also use the stacked supervised auto-encoder for face verification on the LFW dataset.

A. Experimental Setup

We use the Extended Yale B, AR, CMU-PIE, and Multi-PIE datasets to evaluate the effectiveness of the proposed SSAE model. All the images are gray-scale images and are manually aligned.^2 The image size is 32 x 32, thus the dimensionality of the input vector is 1024. Each input feature is normalized with the l_2 normalization. We set λ = λ_0 d_x / d_h. Further, we set λ_0 = 10 on the CMU-PIE and Multi-PIE datasets, and set λ_0 = 5 on the AR and the Extended Yale B datasets. The reason for this is that the variances are more significant on the CMU-PIE and Multi-PIE datasets than on the AR or the Extended Yale B datasets. The weight corresponding to the KL divergence term (α) is fixed across all datasets. Following the work [11], we fix the number of hidden nodes to be the same in all the hidden layers. In our experiments, we found that two hidden layers already give a sufficiently good performance.

^2 In all our experiments, faces are cropped based on manually labeled landmark points, and aligned based on landmark points. We also tested our work with faces cropped by the OpenCV face detector on the AR dataset, but the performance of our work on such data is less than 50% on AR. One possible reason for such poor performance is that we do not have enough data to train a deep network that is robust to the misalignment. The combination of our work with an automatic face alignment method is our future work along this direction.
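The following sketch illustrates the layer-wise stacking rule of Section III-D together with the feature-extraction step configured above (l_2-normalized features from the last hidden layer). The routine train_sae below is a placeholder for the actual SAE training of equation (2); its random tanh encoder, the layer sizes, and the synthetic data are assumptions for illustration only.

```python
# Sketch of stacking supervised auto-encoders (Section III-D) and extracting SSAE features.
import numpy as np

def train_sae(X_clean, X_corrupt, d_h):
    """Placeholder for training one supervised auto-encoder with the loss of equation (2)
    (e.g., via L-BFGS); here it returns a randomly initialized tanh encoder for illustration."""
    rng = np.random.default_rng(0)
    bound = np.sqrt(6.0 / (d_h + X_clean.shape[1]))
    W = rng.uniform(-bound, bound, (d_h, X_clean.shape[1]))
    b = np.zeros(d_h)
    return lambda X: np.tanh(X @ W.T + b)

def train_ssae(X_clean, X_corrupt, layer_sizes):
    """Stack SAEs layer by layer: the trained encoder is applied to BOTH the clean and the
    corrupted data, and its outputs become the clean/corrupted inputs of the next layer."""
    encoders = []
    for d_h in layer_sizes:
        f = train_sae(X_clean, X_corrupt, d_h)
        encoders.append(f)
        X_clean, X_corrupt = f(X_clean), f(X_corrupt)
    return encoders

def extract_features(X, encoders):
    """At test time, the output of the last hidden layer is the face representation,
    l2-normalized before sparse-representation-based classification."""
    for f in encoders:
        X = f(X)
    return X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)

rng = np.random.default_rng(1)
gallery = rng.standard_normal((20, 1024))                 # stand-in for clean training faces
probes  = gallery + 0.1 * rng.standard_normal(gallery.shape)
encoders = train_ssae(gallery, probes, layer_sizes=[1024, 1024])   # two hidden layers
features = extract_features(probes, encoders)
print(features.shape)
```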
After extracting features with the stacked supervised auto-encoders for face representation, following the work [7], sparse representation based classification is used for face recognition. The features are normalized to make their l_2 norms equal to 1 before sparse coding.

Baselines. We compare our Stacked Supervised Auto-Encoders (SSAE) with the following works because of their close relationships. It is also worth noting that all the comparisons are based on the same training/test split, and on the same generic data if they are used. 1) Sparse Representation based Classification (SRC) [7] with raw pixels; 2) the LBP feature followed by SRC; 3) Collaborative Representation based Classification (CRC) [39]; 4) AGL [1]; 5) the One-Shot Similarity Kernel (OSS) [40], where the generic data are used as the negative data, i.e., we use the same generic/training/testing split for both OSS and our SAE; 6) the Denoising Auto-Encoder (DAE) [11] with 10% masking noise followed by SRC; 7) the Modified Denoising Auto-Encoder (MDAE), in which we use the reconstruction error term only in equation (2) (λ = α = 0);

8) the Siamese Network (SN) [26].^3 Among these baseline methods, 1)-3) are the very popular sparse coding related methods, 4)-5) are specially designed for SSPP, and 6)-8) are the most closely related deep learning methods.^4

B. Dataset Description

The CMU-PIE dataset [42] contains 41,368 images of 68 subjects. For each subject, the images are taken under 13 different poses, 4 different illumination conditions, and 4 different expressions. For each subject, we use the face images taken with the frontal pose, neutral expression, and normal lighting condition as the galleries, and use the rest of the images taken with the poses C27, C29, C07, C05, and C09 as probes. We use the images of 20 subjects to learn the SSAE, and use the remaining 48 subjects for evaluation.

The AR dataset [43] contains over 4,000 frontal faces taken from 126 subjects (70 men and 56 women) in two different sessions, and the images contain variances in occlusion (sunglasses or scarves), expression (neutral expression, smile, anger, scream), and illumination. Some images contain both occlusion and illumination variances. In our experiments, 20 subjects from session 1 are used as the generic set for training the SSAE, and the other subjects, also from session 1, are used for evaluation.

The Extended Yale B dataset [44] contains 38 categories. For each subject, we use the frontal faces whose light source direction with respect to the camera axis is 0 degrees azimuth ("A+000") and 0 degrees elevation ("E+00") as gallery images, and use the rest of the images with different lighting conditions as the probe images. Following the work on deep Lambertian networks (DLN) [20], 28 categories are used to train the SSAE and the remaining 10 categories, which are from the original Yale B dataset, are used for evaluation.

The Multi-PIE dataset [45] contains images of 337 persons taken in four sessions over a span of 5 months. For each person, images are taken under 15 different view angles and 19 different illuminations while displaying different facial expressions. In our experiments, only the images with the neutral expression, near frontal poses (0°, 15°, -15°), and different illuminations are used. For each person, the frontal face with neutral expression and frontal illumination is used as the gallery, and all the other images are used as the probes.

^3 We tried several different network architectures in order to get the best experimental result for the Siamese network. Surprisingly, one of the best performers is the two-layer network which takes a fully connected layer (with 500 hidden units and the tanh non-linearity) as the first layer and the Siamese layer as the second one. The original deep architecture proposed in [26] (i.e., the basic architecture C1-S2-C3-S4-C5-F6) does not work as well as this simpler one (with the architecture listed in [26], the accuracy of the Siamese Network on AR is below 60%). Probably, the limited size of our generic training set makes it difficult to train a good convolutional neural network with a deep architecture. Also, as the images we use are well aligned, which is different from [26], the convolutional layers in the original architecture might become redundant.

^4 Because the codes of DeepFace [27] and Pyramid CNN [41] are not available, and their implementation and preprocessing involve lots of tricks, say a very sophisticated alignment method and lots of outside data required for network training, we do not compare our method with these works here. Another reason for not comparing with [27], [41] is that the task solved by these methods is face verification, whereas our work solves the face recognition task.
The 249 persons in session 1 are used for evaluation, and the first-appearance images of the other 88 persons, which only appear in sessions 2-4, are used to train the network.

C. Performance Evaluation

The performance of the different methods on the AR, Extended Yale B, CMU-PIE, and Multi-PIE datasets is listed in Table I, Table II, and Table III. We can see that our SSAE outperforms all the rest of the methods, including those specially designed for SSPP image representation, and it achieves the best performance, which proves the effectiveness of SSAE for extracting robust features for face representation. Needless to say, SSAE based face representation can also be combined with ESRC if another set of labeled data is provided.

I. SAE vs. hand-crafted features. The improvement of the features learnt by our method over the popular LBP features is between about 10% (C27) and 31% (C29) on the CMU-PIE dataset, with larger improvements for subsets with larger poses. On the AR dataset and the Extended Yale B dataset, where there is little pose variation, the improvement of our method over LBP is still around 5%. Our method also outperforms ESRC, which introduces a pre-trained intra-class variance dictionary to extend the SRC method to the SSPP case, and the improvement is very evident for cases of large pose variance (e.g., CMU-PIE C07, C09, C29, and Multi-PIE 15°, -15°). Compared with ESRC, another advantage of our method is speed. ESRC relies on the intra-class variance dictionary to achieve good performance, and the size of such a dictionary is usually big. Such a big dictionary increases the computational cost, but in our method the dictionary size in the sparse coding is the same as the number of gallery faces. Thus our method is significantly faster than ESRC in testing. For example, ESRC costs about 4.63 seconds to solve the sparse coding for each probe face on Extended Yale B, while our method solves the sparse coding for each probe face in a small fraction of that time. Here the number of atoms in the intra-class variance dictionary is 1746 on the Extended Yale B, therefore the total number of atoms in the dictionary is 1756 for ESRC; in our method, the number of atoms in the dictionary of SRC is only 10. All the methods are based on Matlab implementations, and run on a Windows Server (64-bit) with a 2.13GHz CPU and 16GB RAM. It is worth noting that, as the intra-class variance dictionary is very large on Multi-PIE, the optimization is very slow; therefore we do not include its performance on the Multi-PIE dataset.

II. SAE vs. other DNNs. Compared with DAE, the improvement of our method is over 30% on all the datasets. The reason for the poor performance of DAE for SSPP face representation is that the training of DAE is unsupervised, and zero masking noise is used to train the DAE. Therefore it is natural that the trained DAE cannot handle the variances in poses, expressions, etc. Different from DAE, MDAE enforces the reconstructed faces of the probe images, which contain the variances in expression, pose, etc., to be well aligned with their gallery images (frontal faces); therefore its performance is better than that of DAE, but it is still inferior to that of the stacked

SAE, especially for the cases with variances in pose and expression on AR and CMU-PIE. In contrast, besides using a more appropriate reconstruction error term, our SAE also enforces the extracted features of the same person to be similar. Therefore, SAE can overcome these variances in SSPP and is more suitable for extracting features for face recognition. Moreover, on the Extended Yale B dataset, the performance of our method (83.97% when the number of hidden nodes is 1024, and 82.22% when the number of hidden nodes is 2048) is better than the 81% obtained by the Deep Lambertian Network (DLN) [20], which is specially designed for removing the illumination effect in SSPP face recognition [20]. In addition to the 28 categories from the Extended Yale B, DLN also uses the Toronto Face Database, which is a very large dataset, to train the network. Although we use much less data to train the SAE and it does not use any domain information about the illumination model as in DLN, the performance of SAE is still better.^5 In addition, our SAE also outperforms SN on all the datasets. These experiments clearly demonstrate the superiority of SAE for recognition tasks over other DNN building blocks such as DAE and DLN.

III. More observations. We notice that as the poses of the probe images change, the performance drops notably on CMU-PIE. For example, the recognition accuracy drops by at least 10.83% (C09) compared with that of the frontal pose (C27). Interestingly, we also notice that the performance for C07 (look up) and C09 (look down) is slightly better than that for C05 (look right) and C29 (look left). A possible reason is that the missing parts of the face are larger for C05 and C29 than for C07 and C09 (please refer to Fig. 1). Similar observations can also be made on the Multi-PIE dataset in Table III, where the recognition accuracy also drops by around 30% when the poses are non-frontal. Moreover, some probe images and their reconstructed images on the AR dataset are shown in Fig. 3. We can see that our method can remove the illumination effect and recover the neutral face from faces with different expressions. For the faces with occlusion, our method can also simulate the faces without occlusion, but compared with illumination and expression, it is more difficult to recover the face from the occluded face because too much information is lost. Such results are natural because a human can infer the faces with normal illumination and neutral expression from experience (for the deep neural networks, the experience is learnt from the generic set). But it is also almost impossible for a human to infer the occluded face parts because too much information is missing.

^5 Because the results on the AR and CMU-PIE datasets and the codes for DLN are not available for comparison, we do not report its performance on the AR and CMU-PIE datasets. But it is unlikely that DLN works well on the AR and CMU-PIE as it is not designed to handle other variations in the data such as poses, occlusions, expressions, etc. Moreover, the state-of-the-art performance on Extended Yale B has reached 93.6% in terms of accuracy under the SSPP setting in [46], but the experimental setup in [46] is a little different from that in our paper.
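For concreteness, the test-time pipeline discussed above can be summarized as follows: the SSAE gallery features form the SRC dictionary (one atom per subject in SSPP), and each probe is assigned to the class with the smallest reconstruction residual [7]. The sketch below is illustrative only; the Lasso solver and its regularization weight are our own stand-ins, not the solver used in the paper, and the feature vectors are synthetic.

```python
# Sketch of SRC classification [7] on SSAE features, with an assumed l1 solver and alpha.
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(gallery_feats, gallery_labels, probe_feat, alpha=0.01):
    """gallery_feats: (n_gallery, d) l2-normalized features, one per subject (SSPP);
    probe_feat: (d,) l2-normalized feature of the probe face."""
    D = gallery_feats.T                              # dictionary: d x n_gallery
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    lasso.fit(D, probe_feat)                         # sparse coding of the probe
    codes = lasso.coef_
    residuals = {}
    for label in np.unique(gallery_labels):
        mask = (gallery_labels == label)
        delta = np.zeros_like(codes)
        delta[mask] = codes[mask]                    # keep only this class's coefficients
        residuals[label] = np.linalg.norm(probe_feat - D @ delta)
    return min(residuals, key=residuals.get)         # class with the smallest residual

rng = np.random.default_rng(0)
gallery = rng.standard_normal((10, 128))             # stand-in for SSAE gallery features
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
labels = np.arange(10)                               # one gallery face per person
probe = gallery[3] + 0.05 * rng.standard_normal(128)
probe /= np.linalg.norm(probe)
print("predicted subject:", src_classify(gallery, labels, probe))
```

With only one gallery atom per subject, the dictionary stays small, which is the source of the speed advantage over ESRC noted above.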
[Table I. Performance comparison between different methods on the Extended Yale B and the AR datasets (%). Methods compared: DAE, MDAE, DLN (NA on AR, 81 on Extended Yale B), SN, AGL, CRC, SRC, OSS, LBP, ESRC, and SSAE; the remaining numeric entries are not recoverable here.]

[Table II. Performance comparison between different methods (DAE, MDAE, SN, AGL, CRC, SRC, OSS, LBP, ESRC, and SSAE) on the CMU-PIE dataset (%), for poses C27, C05, C07, C09, and C29; numeric entries not recoverable here.]

[Table III. Performance comparison between different methods (DAE, MDAE, AGL, CRC, SRC, LBP, and SSAE) on the Multi-PIE dataset (%); numeric entries not recoverable here.]

D. Evaluation of the Similarity Preservation Term

The similarity preservation term, the second term in equation (2), is very important in the formulation of SAE. Here we report the performance of the networks trained with and without this term in Fig. 4.

Fig. 3. A comparison between the original images and the reconstructed ones on the AR dataset: (a) corrupted faces used for training the network; (b) corrupted faces used for evaluation; (c) reconstructed faces of (a); (d) reconstructed faces of (b).

Fig. 4. The effect of the similarity preservation term on the CMU-PIE dataset (%): recognition accuracy of SSAE with and without the similarity preservation term for the poses C27, C05, C07, C09, and C29.

Fig. 5. The effect of SSAE with different depths (1 layer vs. 2 layers) on the Multi-PIE dataset.

It can be seen that this term improves the recognition accuracy by 3% for the benign frontal face case (C27), by about 10% for C09 and C07, and by about 29% for C05 and C29. We can see that the improvement in performance increases as the pose angle increases (the misalignment of faces is larger for C05 and C29 than for C07 and C09). This validates that the similarity preservation term indeed enhances the robustness of the face representation, especially for pose variations.

E. The Effect of the Deep Architecture

As an example, we show the performance of the SSAE with different numbers of layers on the Multi-PIE dataset in Fig. 5. We can see that the 2-layer SSAE network outperforms the single-layer network, which demonstrates the effectiveness of the deep architecture.

F. Parameter Evaluation

a. Number of Hidden Nodes in the Hidden Layers. We change the number of hidden nodes from 512 to 4096 on the Extended Yale B, and show the performance in Fig. 6 (top left). The results show that the recognition accuracy is higher when the number of hidden nodes is 1024 or 2048, which is the same as or larger than the dimensionality of the data. The reason for this is that more hidden nodes usually increase the expressiveness of the auto-encoder [32]. We also note that too many hidden nodes actually decrease the accuracy. A possible reason is that the data used for learning the network in our setting are not enough (only images of 28 subjects are used to learn the parameters), which may limit the stability of the learnt network.

b. Weight of the KL Terms (α). We plot the performance with different α in Fig. 6 (top right). The poor performance when α is too small (10^-6) proves the importance of the sparsity term. But if we impose too large a weight α (10^-2), more hidden nodes hibernate for a given input, which affects the sensitivity of the model to the input and may make faces of different persons be represented similarly. This may be the reason that, for a value of 10^-2, our model has the lowest performance. So properly setting α is a requirement for the good performance of our model. A similar phenomenon also

happens in sparse representation based face recognition [47]: too small or too large a sparsity will both reduce the recognition accuracy.

c. Weight of the Similarity Preservation Term (λ). We plot the performance of SSAE with different λ in Fig. 6 (bottom). The poor performance with a smaller λ_0 (0.1 or 1) also demonstrates the importance of the similarity preservation term. But if λ is too big, the learnt network might become less discriminative for different subjects, as it enforces too strongly the similarity of the learnt features for diverse samples. That leads to a drop in performance for very large λ. Moreover, from the plot, we see that good performance can be obtained for a fairly large range of λ.

Fig. 6. The effect of different parameters in the supervised auto-encoder on the Extended Yale B dataset: the number of hidden nodes (top left), α (top right), and λ_0 (bottom), each plotted against the recognition accuracy (%).

G. The Comparison of Different Activation Functions

Besides the hyperbolic tangent, we also evaluate the performance of SSAE with other activation functions, including the sigmoid and Rectified Linear Units (ReLU) [48]. To achieve sparsity in the hidden layer, different strategies are used for different activation functions. Specifically, if the activation function is the sigmoid, the objective function is rewritten as follows:

\min_{W, b_f, b_g} \frac{1}{N} \sum_i \left( \| x_i - g(f(\tilde{x}_i)) \|_2^2 + \lambda \| f(x_i) - f(\tilde{x}_i) \|_2^2 \right) + \alpha \left( \mathrm{KL}(\rho_x \| \rho_0) + \mathrm{KL}(\rho_{\tilde{x}} \| \rho_0) \right)    (4)

where

\rho_x = \frac{1}{N} \sum_i f(x_i), \quad \rho_{\tilde{x}} = \frac{1}{N} \sum_i f(\tilde{x}_i),
\mathrm{KL}(\rho \| \rho_0) = \sum_j \left( \rho_0 \log\frac{\rho_0}{\rho_j} + (1 - \rho_0) \log\frac{1 - \rho_0}{1 - \rho_j} \right).    (5)

If the activation function is ReLU, the objective function is rewritten as follows:

\min_{W, b_f, b_g} \frac{1}{N} \sum_i \left( \| x_i - g(f(\tilde{x}_i)) \|_2^2 + \lambda \| f(x_i) - f(\tilde{x}_i) \|_2^2 \right) + \alpha \sum_i \left( \| f(x_i) \|_1 + \| f(\tilde{x}_i) \|_1 \right)    (6)

We list the performance of our SAE based on the different activations on the AR dataset in Table IV. We can see that the hyperbolic tangent usually achieves the best performance. It is worth noting that ReLU is usually used in Convolutional Neural Networks (CNN) and demonstrates good performance for image classification. But many tricks, including momentum, weight decay, and early stopping, are used to optimize the objective function in CNNs. In our objective function optimization, we simply use L-BFGS. Many existing works [49], [38] have shown that different optimization methods can greatly affect the performance of deep neural networks. Perhaps a more advanced optimization method and more tricks would help improve the performance of ReLU.

[Table IV. Performance comparison with different activation functions (sigmoid, ReLU, tanh) on the AR and Extended Yale B datasets (%); numeric entries not recoverable here.]

H. Extension: SSAE for Face Verification

Besides face recognition, our model can also be used for face verification. Specifically, we tested our work on the LFW dataset under the constrained, without-outside-data protocol. The performance of the different methods under such a protocol is listed in Table V. Interestingly, though our work is designed for learning features that make faces of the same person be represented similarly, the performance of our model for face verification is not bad. By learning a more sophisticated distance metric [50] together with our face representation simultaneously, the performance of our method on LFW could probably be further boosted.

[Table V. Performance comparison (AUC, %) between different methods on the LFW dataset: V1-like/MKL, funneled [51]; APEM (fusion), funneled [52]; MRF-MLBP [53]; Fisher vector faces [54]; Eigen-PEP [55]; MRF-Fusion-CSKDA [56]; SSAE. Numeric entries not recoverable here.]
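Before concluding, the following sketch illustrates how the sparsity regularizer of the SAE objective changes with the activation function, as discussed in Section IV-G: a KL penalty on the shifted mean activation for tanh (equations (2)-(3)), a KL penalty on the raw mean activation for sigmoid (equations (4)-(5)), and an l_1 penalty for ReLU (equation (6)). The target ρ_0 and the synthetic activations are illustrative assumptions.

```python
# Sketch of the activation-dependent sparsity penalties of equations (2)-(6), under assumed values.
import numpy as np

def kl_to_target(rho, rho0=0.05):
    """KL(rho0 || rho_j) summed over hidden units (equations (3)/(5)); rho0 is illustrative."""
    rho = np.clip(rho, 1e-6, 1 - 1e-6)
    return np.sum(rho0 * np.log(rho0 / rho) + (1 - rho0) * np.log((1 - rho0) / (1 - rho)))

def sparsity_penalty(H_clean, H_corrupt, activation):
    """Sparsity term of the SAE objective for each activation choice (Section IV-G)."""
    if activation == "tanh":       # outputs in (-1, 1): map the means into (0, 1), eq. (2)-(3)
        return (kl_to_target(0.5 * (H_clean.mean(axis=0) + 1)) +
                kl_to_target(0.5 * (H_corrupt.mean(axis=0) + 1)))
    if activation == "sigmoid":    # outputs already in (0, 1): use the means directly, eq. (4)-(5)
        return kl_to_target(H_clean.mean(axis=0)) + kl_to_target(H_corrupt.mean(axis=0))
    if activation == "relu":       # unbounded non-negative outputs: l1 penalty, eq. (6)
        return np.sum(np.abs(H_clean)) + np.sum(np.abs(H_corrupt))
    raise ValueError(activation)

rng = np.random.default_rng(0)
H1, H2 = rng.random((16, 64)), rng.random((16, 64))       # stand-in hidden activations in (0, 1)
print("sigmoid:", sparsity_penalty(H1, H2, "sigmoid"))
print("relu:   ", sparsity_penalty(H1, H2, "relu"))
print("tanh:   ", sparsity_penalty(2 * H1 - 1, 2 * H2 - 1, "tanh"))   # rescale into (-1, 1)
```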

V. CONCLUSION AND FUTURE WORK

In this paper, we propose a supervised auto-encoder, and use it to build a deep neural network architecture for extracting robust features for SSPP face representation. By introducing a similarity preservation term, our supervised auto-encoder enforces faces corresponding to the same person to be represented similarly. Experimental results on the AR, Extended Yale B, and CMU-PIE datasets demonstrate the clear superiority of this module over other conventional modules such as DAE or DLN. In view of the size of the images and training sets, we restrict the image size to be 32 x 32, and only images of a handful of subjects are used to train the network. For example, only 20 subjects are used on the CMU-PIE and AR datasets, and only 28 subjects on the Extended Yale B dataset. Obviously more training samples will improve the stability of the learnt network, and larger images will improve the face recognition accuracy [57].

REFERENCES

[1] Y. Su, S. Shan, X. Chen, and W. Gao, "Adaptive generic learning for face recognition from a single sample per person," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[2] X. Tan, S. Chen, Z.-H. Zhou, and F. Zhang, "Face recognition from a single image per person: A survey," Pattern Recognition, vol. 39, no. 9.
[3] L. Fei-Fei, R. Fergus, and P. Perona, "One-shot learning of object categories," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4.
[4] M. Turk and A. Pentland, "Eigenfaces for recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[5] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7.
[6] Q. Gao, L. Zhang, and D. Zhang, "Face recognition using FLDA with single training image per person," Applied Mathematics and Computation.
[7] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2.
[8] Q. V. Le, M. Ranzato, R. Monga, M. Devin, K. Chen, G. S. Corrado, J. Dean, and A. Y. Ng, "Building high-level features using large scale unsupervised learning," in Proceedings of the International Conference on Machine Learning.
[9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the Conference on Neural Information Processing Systems.
[10] G. E. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7.
[11] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," Journal of Machine Learning Research, vol. 11, no. 5.
[12] D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio, "Why does unsupervised pre-training help deep learning?" Journal of Machine Learning Research, vol. 11.
[13] J. Yang, D. Zhang, A. F. Frangi, and J.-Y. Yang, "Two-dimensional PCA: A new approach to appearance-based face representation and recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 1.
[14] J. Wu and Z.-H. Zhou, "Face recognition with one training image per person," Pattern Recognition Letters, vol. 23.
[15] S. Chen, D. Zhang, and Z.-H. Zhou, "Enhanced (PC)^2A for face recognition with one training image per person," Pattern Recognition Letters, vol. 25.
[16] A. M.
Martinez, "Recognizing imprecisely localized, partially occluded, and expression variant faces from a single sample per class," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 6.
[17] S. Shan, B. Cao, W. Gao, and D. Zhao, "Extended Fisherface for face recognition from a single example image per person," in IEEE International Symposium on Circuits and Systems.
[18] S. Chen, J. Liu, and Z.-H. Zhou, "Making FLDA applicable to face recognition with one sample per person," Pattern Recognition.
[19] T.-K. Kim and J. Kittler, "Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image," IEEE Transactions on Pattern Analysis and Machine Intelligence.
[20] Y. Tang, R. Salakhutdinov, and G. Hinton, "Deep Lambertian networks," in Proceedings of the International Conference on Machine Learning.
[21] N. Japkowicz, S. J. Hanson, and M. A. Gluck, "Nonlinear autoassociation is not equivalent to PCA," Neural Computation, vol. 12, no. 3.
[22] S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y. Bengio, "Contractive auto-encoders: Explicit invariance during feature extraction," in Proceedings of the International Conference on Machine Learning.
[23] S. Rifai, G. Mesnil, P. Vincent, X. Muller, Y. Bengio, Y. Dauphin, and X. Glorot, "Higher order contractive auto-encoder," in European Conference on Machine Learning.
[24] W. Y. Zou, A. Y. Ng, S. Zhu, and K. Yu, "Deep learning of invariant features via tracked video sequences," in Proceedings of the Conference on Neural Information Processing Systems.
[25] Q. V. Le, A. Karpenko, J. Ngiam, and A. Y. Ng, "ICA with reconstruction cost for efficient overcomplete feature learning," in Proceedings of the Conference on Neural Information Processing Systems.
[26] S. Chopra, R. Hadsell, and Y. LeCun, "Learning a similarity metric discriminatively, with application to face verification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005.
[27] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, "DeepFace: Closing the gap to human-level performance in face verification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[28] Z. Zhu, P. Luo, X. Wang, and X. Tang, "Deep learning identity-preserving face space," in IEEE International Conference on Computer Vision (ICCV), 2013.
[29] Z. Zhu, P. Luo, X. Wang, and X. Tang, "Recover canonical-view faces in the wild with deep neural networks," arXiv preprint.
[30] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the International Conference on Machine Learning, 2010.
[31] P. Lennie, "The cost of cortical computation," Current Biology, vol. 13.
[32] Y. Bengio, "Learning deep architectures for AI," Foundations and Trends in Machine Learning, vol. 2, no. 1.
[33] A. Ng, "Sparse autoencoder," CS294A Lecture Notes, Stanford University.
[34] J. Xie, L. Xu, and E. Chen, "Image denoising and inpainting with deep neural networks," in Proceedings of the Conference on Neural Information Processing Systems.
[35] R. Salakhutdinov and G. E. Hinton, "Learning a nonlinear embedding by preserving class neighbourhood structure," in Proceedings of the International Conference on Artificial Intelligence and Statistics, 2007.
[36] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of the International Conference on Artificial Intelligence and Statistics.
[37] Y. Bengio, "Practical recommendations for gradient-based training of deep architectures," in Neural Networks: Tricks of the Trade. Springer, 2012.
[38] Q. V. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, and A. Y.
Ng, "On optimization methods for deep learning," in Proceedings of the International Conference on Machine Learning.
[39] L. Zhang and X. Feng, "Sparse representation or collaborative representation: Which helps face recognition?" in Proceedings of the International Conference on Computer Vision.
[40] L. Wolf, T. Hassner, and Y. Taigman, "The one-shot similarity kernel," in IEEE International Conference on Computer Vision, 2009.
[41] H. Fan, Z. Cao, Y. Jiang, Q. Yin, and C. Doudou, "Learning deep face representation," arXiv preprint, 2014.

[42] T. Sim, S. Baker, and M. Bsat, "The CMU Pose, Illumination, and Expression (PIE) database," in FG.
[43] A. Martinez and R. Benavente, "The AR face database," CVC Technical Report no. 24, June 1998.
[44] A. Georghiades, P. Belhumeur, and D. Kriegman, "From few to many: Illumination cone models for face recognition under variable lighting and pose," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6.
[45] R. Gross, I. Matthews, J. Cohn, T. Kanade, and S. Baker, "Multi-PIE," Image and Vision Computing, vol. 28, no. 5.
[46] Y. Tang, R. Salakhutdinov, and G. Hinton, "Tensor analyzers," in Proceedings of the International Conference on Machine Learning.
[47] S. Gao, I. W.-H. Tsang, and L.-T. Chia, "Sparse representation with kernels," IEEE Transactions on Image Processing, vol. 22, no. 2.
[48] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proceedings of the International Conference on Artificial Intelligence and Statistics.
[49] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786.
[50] J. Hu, J. Lu, and Y.-P. Tan, "Discriminative deep metric learning for face verification in the wild," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[51] N. Pinto, J. J. DiCarlo, and D. D. Cox, "How far can you get with a modern face recognition test set using only simple features?" in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009.
[52] H. Li, G. Hua, Z. Lin, J. Brandt, and J. Yang, "Probabilistic elastic matching for pose variant face verification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013.
[53] S. R. Arashloo and J. Kittler, "Efficient processing of MRFs for unconstrained-pose face recognition," in IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS), 2013.
[54] K. Simonyan, O. M. Parkhi, A. Vedaldi, and A. Zisserman, "Fisher vector faces in the wild," in British Machine Vision Conference, 2013.
[55] H. Li, G. Hua, X. Shen, Z. Lin, and J. Brandt, "Eigen-PEP for video face recognition," submitted to ECCV and under review, 2014.
[56] S. Rahimizadeh Arashloo and J. Kittler, "Class-specific kernel fusion of multiple descriptors for face verification using multiscale binarised statistical image features."
[57] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in European Conference on Computer Vision. Springer, 2014.

Shenghua Gao is an assistant professor at ShanghaiTech University, China. He received the B.E. degree from the University of Science and Technology of China in 2008 (outstanding graduate), and received the Ph.D. degree from Nanyang Technological University. From June 2012 to August 2014, he worked as a research scientist at the Advanced Digital Sciences Center, Singapore. His research interests include computer vision and machine learning. He has published more than 20 papers on object and face recognition related topics in international conferences and journals, including IEEE T-PAMI, IJCV, IEEE TIP, IEEE TNNLS, IEEE TMM, IEEE TCSVT, CVPR, ECCV, etc. He was awarded the Microsoft Research Fellowship in 2010, and was named an ACM Shanghai Young Scientist.

Yuting Zhang is a Ph.D. candidate in the Department of Computer Science at Zhejiang University, P.R. China, advised by Gang Pan. He is currently a visiting student in the Department of Electrical Engineering and Computer Science at the University of Michigan, USA. He was also a junior research assistant in the Advanced Digital Sciences Center (Singapore), University of Illinois at Urbana-Champaign. He received his B.E.
degree in Computer Science from Zhejiang University. His research interests include machine learning and computer vision, and particularly the application of deep graphical models.

Kui Jia received the B.Eng. degree in marine engineering from Northwestern Polytechnical University, China, in 2001, the M.Eng. degree in electrical and computer engineering from the National University of Singapore in 2003, and the Ph.D. degree in computer science from Queen Mary, University of London, London, U.K. He is currently a Visiting Assistant Professor at the University of Macau, Macau SAR, China. He also holds a Research Scientist position at the Advanced Digital Sciences Center, Singapore. His research interests are in computer vision, machine learning, and image processing.

Jiwen Lu (S'10-M'11) received the B.Eng. degree in mechanical engineering and the M.Eng. degree in electrical engineering from the Xi'an University of Technology, Xi'an, China, and the Ph.D. degree in electrical engineering from Nanyang Technological University, Singapore. He is currently a Research Scientist at the Advanced Digital Sciences Center (ADSC), Singapore. His research interests include computer vision, pattern recognition, and machine learning. He has authored/co-authored over 100 scientific papers in these areas, of which 23 are published in IEEE Transactions journals (TPAMI/TIP/TIFS/TCSVT/TMM) and 12 in the top-tier computer vision conferences (ICCV/CVPR/ECCV). He serves as an Area Chair for ICME 2015 and ICB 2015, and a Special Session Chair for VCIP. Dr. Lu was a recipient of the First-Prize National Scholarship and the National Outstanding Student Award from the Ministry of Education of China in 2002 and 2003, the Best Student Paper Award from PREMIA of Singapore in 2012, and the Top 10% Best Paper Award from MMSP 2014, respectively. Recently, he has given tutorials at conferences such as CVPR 2015, FG 2015, ACCV 2014, ICME 2014, and IJCB.

Yingying Zhang received the B.E. degree from Xi'an Jiaotong University, China. He is currently a master's candidate at ShanghaiTech University, China. His research interests are computer vision and machine learning.


More information

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch Deep learnng s a good steganalyss tool when embeddng key s reused for dfferent mages, even f there s a cover source-msmatch Lonel PIBRE 2,3, Jérôme PASQUET 2,3, Dno IENCO 2,3, Marc CHAUMONT 1,2,3 (1) Unversty

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Learning a Class-Specific Dictionary for Facial Expression Recognition

Learning a Class-Specific Dictionary for Facial Expression Recognition BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 4 Sofa 016 Prnt ISSN: 1311-970; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-016-0067 Learnng a Class-Specfc Dctonary for

More information

Competitive Sparse Representation Classification for Face Recognition

Competitive Sparse Representation Classification for Face Recognition Vol. 6, No. 8, 05 Compettve Sparse Representaton Classfcaton for Face Recognton Yng Lu Chongqng Key Laboratory of Computatonal Intellgence Chongqng Unversty of Posts and elecommuncatons Chongqng, Chna

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Face Recognition by Fusing Binary Edge Feature and Second-order Mutual Information

Face Recognition by Fusing Binary Edge Feature and Second-order Mutual Information Face Recognton by Fusng Bnary Edge Feature and Second-order Mutual Informaton Jatao Song, Bejng Chen, We Wang, Xaobo Ren School of Electronc and Informaton Engneerng, Nngbo Unversty of Technology Nngbo,

More information

Robust Dictionary Learning with Capped l 1 -Norm

Robust Dictionary Learning with Capped l 1 -Norm Proceedngs of the Twenty-Fourth Internatonal Jont Conference on Artfcal Intellgence (IJCAI 205) Robust Dctonary Learnng wth Capped l -Norm Wenhao Jang, Fepng Ne, Heng Huang Unversty of Texas at Arlngton

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Modular PCA Face Recognition Based on Weighted Average

Modular PCA Face Recognition Based on Weighted Average odern Appled Scence odular PCA Face Recognton Based on Weghted Average Chengmao Han (Correspondng author) Department of athematcs, Lny Normal Unversty Lny 76005, Chna E-mal: hanchengmao@163.com Abstract

More information

A Bilinear Model for Sparse Coding

A Bilinear Model for Sparse Coding A Blnear Model for Sparse Codng Davd B. Grmes and Rajesh P. N. Rao Department of Computer Scence and Engneerng Unversty of Washngton Seattle, WA 98195-2350, U.S.A. grmes,rao @cs.washngton.edu Abstract

More information

Tone-Aware Sparse Representation for Face Recognition

Tone-Aware Sparse Representation for Face Recognition Tone-Aware Sparse Representaton for Face Recognton Lngfeng Wang, Huayu Wu and Chunhong Pan Abstract It s stll a very challengng task to recognze a face n a real world scenaro, snce the face may be corrupted

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

General Regression and Representation Model for Face Recognition

General Regression and Representation Model for Face Recognition 013 IEEE Conference on Computer Vson and Pattern Recognton Workshops General Regresson and Representaton Model for Face Recognton Janjun Qan, Jan Yang School of Computer Scence and Engneerng Nanjng Unversty

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

Computer Aided Drafting, Design and Manufacturing Volume 25, Number 2, June 2015, Page 14

Computer Aided Drafting, Design and Manufacturing Volume 25, Number 2, June 2015, Page 14 Computer Aded Draftng, Desgn and Manufacturng Volume 5, Number, June 015, Page 14 CADDM Face Recognton Algorthm Fusng Monogenc Bnary Codng and Collaboratve Representaton FU Yu-xan, PENG Lang-yu College

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Robust Kernel Representation with Statistical Local Features. for Face Recognition

Robust Kernel Representation with Statistical Local Features. for Face Recognition Robust Kernel Representaton wth Statstcal Local Features for Face Recognton Meng Yang, Student Member, IEEE, Le Zhang 1, Member, IEEE Smon C. K. Shu, Member, IEEE, and Davd Zhang, Fellow, IEEE Dept. of

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Integrated Expression-Invariant Face Recognition with Constrained Optical Flow

Integrated Expression-Invariant Face Recognition with Constrained Optical Flow Integrated Expresson-Invarant Face Recognton wth Constraned Optcal Flow Chao-Kue Hseh, Shang-Hong La 2, and Yung-Chang Chen Department of Electrcal Engneerng, Natonal Tsng Hua Unversty, Tawan 2 Department

More information

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS P.G. Demdov Yaroslavl State Unversty Anatoly Ntn, Vladmr Khryashchev, Olga Stepanova, Igor Kostern EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS Yaroslavl, 2015 Eye

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Classification / Regression Support Vector Machines

Classification / Regression Support Vector Machines Classfcaton / Regresson Support Vector Machnes Jeff Howbert Introducton to Machne Learnng Wnter 04 Topcs SVM classfers for lnearly separable classes SVM classfers for non-lnearly separable classes SVM

More information

Kernel Collaborative Representation Classification Based on Adaptive Dictionary Learning

Kernel Collaborative Representation Classification Based on Adaptive Dictionary Learning Internatonal Journal of Intellgent Informaton Systems 2018; 7(2): 15-22 http://www.scencepublshnggroup.com/j/js do: 10.11648/j.js.20180702.11 ISSN: 2328-7675 (Prnt); ISSN: 2328-7683 (Onlne) Kernel Collaboratve

More information

Research of Image Recognition Algorithm Based on Depth Learning

Research of Image Recognition Algorithm Based on Depth Learning 208 4th World Conference on Control, Electroncs and Computer Engneerng (WCCECE 208) Research of Image Recognton Algorthm Based on Depth Learnng Zhang Jan, J Xnhao Zhejang Busness College, Hangzhou, Chna,

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM Classfcaton of Face Images Based on Gender usng Dmensonalty Reducton Technques and SVM Fahm Mannan 260 266 294 School of Computer Scence McGll Unversty Abstract Ths report presents gender classfcaton based

More information

Research Article A High-Order CFS Algorithm for Clustering Big Data

Research Article A High-Order CFS Algorithm for Clustering Big Data Moble Informaton Systems Volume 26, Artcle ID 435627, 8 pages http://dx.do.org/.55/26/435627 Research Artcle A Hgh-Order Algorthm for Clusterng Bg Data Fanyu Bu,,2 Zhku Chen, Peng L, Tong Tang, 3 andyngzhang

More information

Face Detection with Deep Learning

Face Detection with Deep Learning Face Detecton wth Deep Learnng Yu Shen Yus122@ucsd.edu A13227146 Kuan-We Chen kuc010@ucsd.edu A99045121 Yzhou Hao y3hao@ucsd.edu A98017773 Mn Hsuan Wu mhwu@ucsd.edu A92424998 Abstract The project here

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY SSDH: Sem-supervsed Deep Hashng for Large Scale Image Retreval Jan Zhang, and Yuxn Peng arxv:607.08477v2 [cs.cv] 8 Jun 207 Abstract Hashng

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

The Discriminate Analysis and Dimension Reduction Methods of High Dimension

The Discriminate Analysis and Dimension Reduction Methods of High Dimension Open Journal of Socal Scences, 015, 3, 7-13 Publshed Onlne March 015 n ScRes. http://www.scrp.org/journal/jss http://dx.do.org/10.436/jss.015.3300 The Dscrmnate Analyss and Dmenson Reducton Methods of

More information

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING.

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING. WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING Tao Ma 1, Yuexan Zou 1 *, Zhqang Xang 1, Le L 1 and Y L 1 ADSPLAB/ELIP, School of ECE, Pekng Unversty, Shenzhen 518055, Chna

More information

Infrared face recognition using texture descriptors

Infrared face recognition using texture descriptors Infrared face recognton usng texture descrptors Moulay A. Akhlouf*, Abdelhakm Bendada Computer Vson and Systems Laboratory, Laval Unversty, Quebec, QC, Canada G1V0A6 ABSTRACT Face recognton s an area of

More information

Human Face Recognition Using Generalized. Kernel Fisher Discriminant

Human Face Recognition Using Generalized. Kernel Fisher Discriminant Human Face Recognton Usng Generalzed Kernel Fsher Dscrmnant ng-yu Sun,2 De-Shuang Huang Ln Guo. Insttute of Intellgent Machnes, Chnese Academy of Scences, P.O.ox 30, Hefe, Anhu, Chna. 2. Department of

More information

Combination of Local Multiple Patterns and Exponential Discriminant Analysis for Facial Recognition

Combination of Local Multiple Patterns and Exponential Discriminant Analysis for Facial Recognition Sensors & ransducers 203 by IFSA http://.sensorsportal.com Combnaton of Local Multple Patterns and Exponental Dscrmnant Analyss for Facal Recognton, 2 Lfang Zhou, 2 Bn Fang, 3 Wesheng L, 3 Ldou Wang College

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Feature Extraction Based on Maximum Nearest Subspace Margin Criterion

Feature Extraction Based on Maximum Nearest Subspace Margin Criterion Neural Process Lett DOI 10.7/s11063-012-9252-y Feature Extracton Based on Maxmum Nearest Subspace Margn Crteron Y Chen Zhenzhen L Zhong Jn Sprnger Scence+Busness Meda New York 2012 Abstract Based on the

More information

Comparing Image Representations for Training a Convolutional Neural Network to Classify Gender

Comparing Image Representations for Training a Convolutional Neural Network to Classify Gender 2013 Frst Internatonal Conference on Artfcal Intellgence, Modellng & Smulaton Comparng Image Representatons for Tranng a Convolutonal Neural Network to Classfy Gender Choon-Boon Ng, Yong-Haur Tay, Bok-Mn

More information

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25 Transformaton Networks for Target-Orented Sentment Classfcaton 1 Xn L 1, Ldong Bng 2, Wa Lam 1, Be Sh 1 1 The Chnese Unversty of Hong Kong 2 Tencent AI Lab ACL 2018 1 Jont work wth Tencent AI Lab Transformaton

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information

Lecture 4: Principal components

Lecture 4: Principal components /3/6 Lecture 4: Prncpal components 3..6 Multvarate lnear regresson MLR s optmal for the estmaton data...but poor for handlng collnear data Covarance matrx s not nvertble (large condton number) Robustness

More information

2. Related Work Hand-crafted Features Based Trajectory Prediction Deep Neural Networks Based Trajectory Prediction

2. Related Work Hand-crafted Features Based Trajectory Prediction Deep Neural Networks Based Trajectory Prediction Encodng Crowd Interacton wth Deep Neural Network for Pedestran Trajectory Predcton Yanyu Xu ShanghaTech Unversty xuyy2@shanghatech.edu.cn Zhxn Pao ShanghaTech Unversty paozhx@shanghatech.edu.cn Shenghua

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Comparison Study of Textural Descriptors for Training Neural Network Classifiers

Comparison Study of Textural Descriptors for Training Neural Network Classifiers Comparson Study of Textural Descrptors for Tranng Neural Network Classfers G.D. MAGOULAS (1) S.A. KARKANIS (1) D.A. KARRAS () and M.N. VRAHATIS (3) (1) Department of Informatcs Unversty of Athens GR-157.84

More information

A Hybrid Collaborative Filtering Model with Deep Structure for Recommender Systems

A Hybrid Collaborative Filtering Model with Deep Structure for Recommender Systems Proceedngs of the Thrty-Frst AAAI Conference on Artfcal Intellgence (AAAI-17) A Hybrd Collaboratve Flterng Model wth Deep Structure for Recommender Systems Xn Dong, Le Yu, Zhonghuo Wu, Yuxa Sun, Lngfeng

More information

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like: Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A

More information

Gender Classification using Interlaced Derivative Patterns

Gender Classification using Interlaced Derivative Patterns Gender Classfcaton usng Interlaced Dervatve Patterns Author Shobernejad, Ameneh, Gao, Yongsheng Publshed 2 Conference Ttle Proceedngs of the 2th Internatonal Conference on Pattern Recognton (ICPR 2) DOI

More information

Two-Dimensional Supervised Discriminant Projection Method For Feature Extraction

Two-Dimensional Supervised Discriminant Projection Method For Feature Extraction Appl. Math. Inf. c. 6 No. pp. 8-85 (0) Appled Mathematcs & Informaton cences An Internatonal Journal @ 0 NP Natural cences Publshng Cor. wo-dmensonal upervsed Dscrmnant Proecton Method For Feature Extracton

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Scale Selective Extended Local Binary Pattern For Texture Classification

Scale Selective Extended Local Binary Pattern For Texture Classification Scale Selectve Extended Local Bnary Pattern For Texture Classfcaton Yutng Hu, Zhlng Long, and Ghassan AlRegb Multmeda & Sensors Lab (MSL) Georga Insttute of Technology 03/09/017 Outlne Texture Representaton

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

(a) Input data X n. (b) VersNet. (c) Output data Y n. (d) Supervsed data D n. Fg. 2 Illustraton of tranng for proposed CNN. 2. Related Work In segment

(a) Input data X n. (b) VersNet. (c) Output data Y n. (d) Supervsed data D n. Fg. 2 Illustraton of tranng for proposed CNN. 2. Related Work In segment 一般社団法人電子情報通信学会 THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS 信学技報 IEICE Techncal Report SANE2017-92 (2018-01) Deep Learnng for End-to-End Automatc Target Recognton from Synthetc

More information

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM

PERFORMANCE EVALUATION FOR SCENE MATCHING ALGORITHMS BY SVM PERFORMACE EVALUAIO FOR SCEE MACHIG ALGORIHMS BY SVM Zhaohu Yang a, b, *, Yngyng Chen a, Shaomng Zhang a a he Research Center of Remote Sensng and Geomatc, ongj Unversty, Shangha 200092, Chna - yzhac@63.com

More information

Understanding the difficulty of training deep feedforward neural networks

Understanding the difficulty of training deep feedforward neural networks Understandng the dffculty of tranng deep feedforward neural networks Xaver Glorot Yoshua Bengo DIRO, Unversté de Montréal, Montréal, Québec, Canada Abstract Whereas before 2006 t appears that deep multlayer

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Face Recognition using 3D Directional Corner Points

Face Recognition using 3D Directional Corner Points 2014 22nd Internatonal Conference on Pattern Recognton Face Recognton usng 3D Drectonal Corner Ponts Xun Yu, Yongsheng Gao School of Engneerng Grffth Unversty Nathan, QLD, Australa xun.yu@grffthun.edu.au,

More information

Robust Low-Rank Regularized Regression for Face Recognition with Occlusion

Robust Low-Rank Regularized Regression for Face Recognition with Occlusion Robust Low-Rank Regularzed Regresson for ace Recognton wth Occluson Janjun Qan, Jan Yang, anlong Zhang and Zhouchen Ln School of Computer Scence and ngneerng, Nanjng Unversty of Scence and echnology Key

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

Dropout: A Simple Way to Prevent Neural Networks from Overfitting Journal of Machne Learnng Research 15 (2014) 1929-1958 Submtted 11/13; Publshed 6/14 Dropout: A Smple Way to Prevent Neural Networks from Overfttng Ntsh Srvastava Geoffrey Hnton Alex Krzhevsky Ilya Sutskever

More information

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION

ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION ALEXNET FEATURE EXTRACTION AND MULTI-KERNEL LEARNING FOR OBJECT- ORIENTED CLASSIFICATION Lng Dng 1, Hongy L 2, *, Changmao Hu 2, We Zhang 2, Shumn Wang 1 1 Insttute of Earthquake Forecastng, Chna Earthquake

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet Hyperspectral Image Classfcaton Based on Local Bnary Patterns and PCANet Huzhen Yang a, Feng Gao a, Junyu Dong a, Yang Yang b a Ocean Unversty of Chna, Department of Computer Scence and Technology b Ocean

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

GSLM Operations Research II Fall 13/14

GSLM Operations Research II Fall 13/14 GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information