Capturing Large Intra-class Variations of Biometric Data by Template Co-updating

Ajita Rattani
University of Cagliari, Piazza d'armi, Cagliari, Italy

Gian Luca Marcialis
University of Cagliari, Piazza d'armi, Cagliari, Italy

Fabio Roli
University of Cagliari, Piazza d'armi, Cagliari, Italy

Abstract

The representativeness of a biometric template gallery with respect to novel data has recently been addressed by template update algorithms, which update the enrolled templates in order to capture, and better represent, the subject's intra-class variations. The majority of the proposed approaches adopt the self update technique, in which the system updates itself using its own knowledge. Recently, an approach named template co-update, which uses two complementary biometrics to co-update each other, has been introduced. In this paper, we investigate whether template co-update is able to capture intra-class variations better than state-of-the-art self update algorithms. Accordingly, experiments are conducted under two conditions, i.e., a controlled and an uncontrolled environment. Reported results show that co-update can outperform the self update technique when the initial enrolled templates are poorly representative of the novel data (uncontrolled environment), whilst almost similar performance is obtained when the initial enrolled templates represent the input data well (controlled environment).

1. Introduction

Template enrollment is one of the most crucial phases of biometric verification systems, in which the biometric data is acquired and processed, features are extracted, and identity labels are assigned in a supervised manner. In order to verify a subject's identity, the enrolled template must be representative of the input biometric data. Intra-class variations in the input biometric data may be (a) temporary, such as, for the face, changes in environment, expression, pose, etc., or (b) permanent, such as, for the face, aging or accidental cuts or scars that change the subject's appearance.
These variations make the initial enrolled templates non-representative of the input data, resulting in poor system performance. Initial attempts to deal with the problem of representativeness enrolled multiple instances representing temporary variations, and repeated the enrolment process over time to capture permanent variations in the biometric data [1]. However, these approaches are supervised, since a supervisor assigns identity labels to the data used for updating. This makes the process very expensive, time consuming and inefficient, as multiple enrolment sessions must be repeated over time and the subject's cooperation plays a crucial role [1].

To address the limitations of supervised methods, semi-supervised methods for template update have been proposed [2-5]. These methods use both the initial set of enrolled, labelled templates and, iteratively, a set of unlabelled data acquired during the online operation of the biometric system. The rationale is that unlabelled samples classified (i.e., pseudo-labelled) with high reliability can be used to update the template set in order to improve its representativeness. It is worth noting that the majority of the state-of-the-art semi-supervised methods can be termed self update, as they incrementally adapt themselves to the variations in the input biometric data of an individual using their own knowledge. The update criterion is based on the matching score between template and input data. In particular, the matching score is required to exceed a very high threshold in order to guarantee that the novel sample is highly genuine, that is, that its probability of being an impostor is very low. However, because they operate at a high threshold, it is reasonable to believe that these methods have limited capability to exploit intra-class variations larger than those represented by the enrolled template, unless they operate at a lower threshold.
But relaxing the operating threshold may cause the introduction of impostors into the gallery set, thus weakening the system's performance and security. In Figure 1, the initial template is representative of the unlabelled samples marked as set A, so they may be exploited by self update techniques. But the samples marked as set B may not be utilized unless the self update algorithm operates at a relaxed threshold in order to increase the tolerance to illumination and expression variations, which at the same time increases the risk of using an impostor pattern to update templates.

Figure 1. A hypothetical example using faces. The initial template is representative of unlabelled set A, but is not representative of unlabelled set B. As a consequence, set A instances may be exploited by self updating, whilst set B instances may remain unexploited.

Recently, a novel algorithm named template co-update has been introduced in [6] for personal identification. The mutual, complementary help of the two matchers is used to update the system; specifically, one matcher operating at high confidence helps the other to identify difficult patterns. Experiments reported in [6] on a chimerical data set (AR and FVC) were promising but very preliminary. This paper differs from [6] in the following aspects:
(a) Co-update is investigated in personal verification mode, an application for which template update can be considered more useful than identification.
(b) The co-update algorithm is modified to better study and represent the captured variations in the updated template galleries.
(c) The comparison of template co-update with state-of-the-art self update systems is carried out experimentally in two environmental conditions.
(d) A different and larger bi-modal chimerical data set, representing significant intra-class variations per class, is adopted.

Reported experimental results show that template co-update can outperform self update techniques when the initial templates are non-representative, as can happen in an uncontrolled environment, where large intra-class variations are present in the input data. Moreover, it is experimentally shown that self update has to operate at a relaxed threshold in order to capture these variations, which makes it counter-productive due to the introduction of impostors into the clients' galleries. On the other hand, self update and co-update performances are almost comparable in a controlled environment.
The paper is organized as follows. Section 2 describes the proposed template co-update approach. Section 3 reports experimental results, and conclusions are drawn in Section 4.

2. A template co-update algorithm

In a template co-update algorithm, two matchers are trained with an initial template for each client. During the on-line operation of the biometric system, a batch of unlabelled data is acquired. Each matcher is applied to this unlabelled set individually. Unlabelled samples considered highly genuine by one matcher, together with the corresponding complementary sample of the other trait (in this paper, face and fingerprint are intended to carry complementary information), are used to update the template set of the claimed identity. Both matchers are re-trained with this augmented template set, and the process is repeated until a specific stop criterion is met [6].

The gallery set is usually updated with one of two techniques commonly adopted by the template update methods reported in the literature: (a) adding the novel sample as another instance in the gallery set of the respective client [1][5], or (b) fusing the sample with the template to form a super template, thus embedding the information into a single template [2-4]. In the co-update algorithm proposed in [6], samples recognized as highly genuine by one of the available matchers, together with the corresponding sample of the complementary trait (i.e., both samples of the multi-modal input data), are always fused with the multi-modal template set of the respective client, as is usually done in the self update technique. On the other hand, the hypothesis about co-update is that it can capture large intra-class variations, thanks to the complementary biometric. Therefore, we have modified the template co-update reported in [6] as follows: a selected sample is fused only if it is a highly genuine sample; otherwise, it is inserted as a separate instance in the gallery set. This is done independently for both matchers.
The rationale is that the fusion process may not adequately embed the information relating to intra-class variations in the super template, resulting in information loss, so intra-class variations are kept separate.

The proposed algorithm is reported in pseudo-code form in Figure 2. For the $b$-th biometric of the $c$-th client, an initial template set $T_{c,b}$, consisting of a single template, is given, with $c = 1, \dots, C$ and $b = 1, 2$. The set $D_u$ indicates the batch of unlabelled samples collected during the system's operation, i.e., several couples of biometric samples. Each sample $x \in D_u$ has two feature sets, one for each biometric: $x = \{x_1, x_2\}$. In this work, $x_1$ is the fingerprint feature set, whilst $x_2$ is the face feature set. For the $b$-th biometric and related matcher ($b = 1, 2$), the following functions are defined:

1) $s_b(x_b, t_b)$ outputs the matching score between the $b$-th feature set of $x$, namely $x_b$, and the corresponding template feature set $t_b$. $s_1(\cdot,\cdot)$ is implemented by the String minutiae matching algorithm [7] and $s_2(\cdot,\cdot)$ by the SIFT features-based matching algorithm [8-9].
2) $fuse_b(x_b, t_b)$ combines the feature sets $x_b$ and $t_b$ and outputs an updated feature set. $fuse_1(\cdot,\cdot)$ is implemented by the minutiae merging algorithm proposed in [2], and $fuse_2(\cdot,\cdot)$ by the fusion process of SIFT features [9].

The other terms used in Figure 2 are as follows. $threshold_b$ is the threshold value used for the $b$-th biometric in order to pseudo-label a biometric sample $x$ as highly genuine. $T_b = \bigcup_c T_{c,b}$ is the union of all the clients' template sets; $threshold_b$ must be evaluated on $T_b$. $HG$ is the set of highly genuine samples exploited at each iteration.

The algorithm (Figure 2) consists of three loops: the external loop (Repeat-Until) defines the iterations; the second one (For $b$) selects the current pseudo-labelling matcher $b$ (note: in this work, the fingerprint matcher follows the face matcher, but operating in the reverse order will not affect the results, as reported in [6]); the third one (For $c$) is the co-updating core. The set $X_c$ of samples claiming identity $c$ is first extracted from $D_u$. The set $X_{hg}$ of highly genuine samples for the $b$-th matcher is then extracted from $X_c$, by comparing the related matching score with $threshold_b$. Then, the sample exhibiting the maximum matching score, $x = \{x_1, x_2\}$, from $X_{hg}$ and the template $t_b$ from $T_{c,b}$ nearest to $x_b$ are selected. $t_b$ is updated by fusion with $x_b$; since $x_b$ is very near to an existing template, the fusion process will not result in information loss. Co-updating is performed in the following step: if the complementary matcher $b'$ agrees with the $b$-th matcher in classifying $x$ as highly genuine (by $threshold_{b'}$), the related nearest template $t_{b'}$ is updated by fusion with $x_{b'}$. Otherwise, $x_{b'}$ is simply added to the template set $T_{c,b'}$, as it can be considered a large intra-class variation for that matcher, identified with the help of the complementary matcher; fusing this sample may lessen its impact and information level in the fused template, so it is kept separate.
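The co-updating core described above (the body of the "For c" loop for a single client) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: each feature set is reduced to a single number, and `toy_score`, `toy_fuse`, `best_match` and `co_update_client` are hypothetical names standing in for the minutiae and SIFT matchers and fusion procedures of [2], [7-9].

```python
from typing import Dict, List, Optional, Tuple

Sample = Dict[int, float]         # toy sample: one scalar "feature set" per biometric b in {1, 2}
Gallery = Dict[int, List[float]]  # toy gallery of one client: list of templates per biometric


def toy_score(x: float, t: float) -> float:
    """Hypothetical similarity in (0, 1]; stands in for s_b."""
    return 1.0 / (1.0 + abs(x - t))


def toy_fuse(x: float, t: float) -> float:
    """Hypothetical fusion (plain average); stands in for fuse_b."""
    return 0.5 * (x + t)


def best_match(x: float, templates: List[float]) -> Tuple[int, float]:
    """Index and score of the template nearest to x (highest matching score)."""
    scores = [toy_score(x, t) for t in templates]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]


def co_update_client(gallery: Gallery, claimed: List[Sample],
                     thresholds: Dict[int, float], b: int) -> Optional[Sample]:
    """One pass of the co-updating core for one client and pseudo-labelling matcher b.

    `claimed` plays the role of X_c and is modified in place.
    Returns the exploited sample, or None if no sample is highly genuine.
    """
    b_comp = (b % 2) + 1  # complementary biometric b'
    # X_hg: samples whose best score under matcher b exceeds threshold_b.
    hg = [x for x in claimed if best_match(x[b], gallery[b])[1] > thresholds[b]]
    if not hg:
        return None
    # Sample with the maximum matching score under matcher b.
    x = max(hg, key=lambda y: best_match(y[b], gallery[b])[1])
    # Matcher b: fuse x_b into its nearest template.
    i, _ = best_match(x[b], gallery[b])
    gallery[b][i] = toy_fuse(x[b], gallery[b][i])
    # Matcher b': fuse if it agrees, otherwise keep x_b' as a separate template.
    j, s_comp = best_match(x[b_comp], gallery[b_comp])
    if s_comp > thresholds[b_comp]:
        gallery[b_comp][j] = toy_fuse(x[b_comp], gallery[b_comp][j])
    else:
        gallery[b_comp].append(x[b_comp])
    claimed.remove(x)
    return x
```

With toy values, a sample that is close to the template for matcher 1 but far from it for matcher 2 is fused into the first gallery and stored as a new, separate instance in the second one, which is the behaviour the modified algorithm relies on to retain large intra-class variations.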
In this way, the templates of the $b'$-th biometric are updated through the support of the $b$-th biometric. $x$ is finally removed from $D_u$ and added to $HG$. At the end of each iteration, the thresholds are re-estimated for both matchers. To account for the improvement due to the updated templates, the classification performance of the system is evaluated on the test set after each iteration. Classification is done by matching the test sample to the template set of the respective client and comparing the final score, calculated using the sum of scores technique, against a certain threshold. The iterations continue until no more unlabelled samples can be exploited for any client, i.e., $HG$ is empty at the end of an iteration, and the algorithm stops.

Repeat
    $HG = \emptyset$;
    Estimate $threshold_1$ on $T_1$ and $threshold_2$ on $T_2$;
    For $b = 1, 2$
        $b' = (b \bmod 2) + 1$; ($b'$ is the complementary biometric)
        For $c = 1, \dots, C$
            $X_c = \{ x \in D_u \mid$ claimed identity of $x$ is $c \}$;
            $X_{hg} = \{ x \in X_c \mid \max_{t \in T_{c,b}} s_b(x_b, t) > threshold_b \}$;
            If $X_{hg} \neq \emptyset$
                $x = \arg\max_{y \in X_{hg}} \max_{t \in T_{c,b}} s_b(y_b, t)$;
                $t_b = \arg\max_{t \in T_{c,b}} s_b(x_b, t)$;
                $t_b = fuse_b(x_b, t_b)$;
                If $\max_{t \in T_{c,b'}} s_{b'}(x_{b'}, t) > threshold_{b'}$
                    $t_{b'} = \arg\max_{t \in T_{c,b'}} s_{b'}(x_{b'}, t)$;
                    $t_{b'} = fuse_{b'}(x_{b'}, t_{b'})$;
                Else
                    $T_{c,b'} = T_{c,b'} \cup x_{b'}$;
                End If
                $HG = HG \cup x$;
                $D_u = D_u \setminus x$;
            End If
        End For
    End For
Until $HG = \emptyset$;

Figure 2. The proposed template co-update algorithm. For the sake of space, details about functions and variables are given in Section 2.

3. Experimental results

3.1. Data sets and protocol

The data set used for testing consists of 42 individuals, with 20 face and 20 fingerprint images for each individual, keeping in mind the independence of the face and fingerprint traits. The collection of both data sets spans over one year. Frontal face images of forty-two persons, with 20 instances per person representing significant illumination changes and variations in facial expression, were taken from the Equinox Corporation database [10]. The fingerprint data set has been collected by the authors using the Biometrika Fx2000 optical sensor.
The images are acquired with variations in pressure, moisture and time interval to represent large intra-class variations. The results are computed on five random couplings of the face and fingerprint data sets and are averaged.

In a typical personal verification system, a different batch of unlabelled data, owing to different access attempts, is collected for each client over a period of time. In order to respect this simple evidence, the following protocol has been adopted:

(1) 42 template couples (fingerprint-face) are selected, one for each client. This is the initial template set, made up of only one couple of face and fingerprint images per client. $threshold_b$ ($b = 1, 2$) is always evaluated on this set, as it is the only set available in real environments (Figure 2). Threshold values are obtained by comparing each template to the templates of all the other clients, thus estimating the impostor distribution, and selecting the threshold at 1%FAR. These stringent starting conditions simulate a real environment, where very little labelled data is available to set the system parameters. A threshold at 1%FAR ensures that the probability of accepting an impostor is low and that the rejection of genuine clients is not too high, thus allowing the template set to be updated.
(2) The remaining client images are subdivided into two sets made up of nine couples (unlabelled set $D_u$) and ten couples (test set).
(3) The whole data set (except for the templates) is then randomly partitioned into 42 sets, such that the $c$-th partition does not contain images of the $c$-th client. Each of these partitions, consisting of 19 images, represents the impostor set for the $c$-th client: five images are added to the unlabelled set $D_u$ and fifteen to the test set.

The test set is used to evaluate the actual improvement in performance reached by the template co-update algorithm with respect to template self update. The performance of the self update and co-update systems is evaluated at the operating threshold $threshold_b$ (Figure 2) selected at 1%FAR.
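The threshold selection in step (1) can be illustrated with a short sketch. It assumes, as in Figure 2, that a sample is accepted when its score is strictly greater than the threshold; the function names `impostor_scores` and `threshold_at_far`, and the choice of returning an observed score as the threshold, are assumptions made for illustration only.

```python
import math
from typing import Callable, List, Sequence, TypeVar

F = TypeVar("F")  # type of a feature set (template)


def impostor_scores(templates: Sequence[F], score: Callable[[F, F], float]) -> List[float]:
    """Match each client's template against the templates of all the other clients."""
    return [score(templates[i], templates[j])
            for i in range(len(templates))
            for j in range(len(templates)) if i != j]


def threshold_at_far(scores: Sequence[float], target_far: float = 0.01) -> float:
    """Return an observed impostor score s such that the fraction of impostor
    scores strictly greater than s does not exceed target_far."""
    if not scores or not 0.0 <= target_far < 1.0:
        raise ValueError("need at least one score and 0 <= target_far < 1")
    ordered = sorted(scores)
    n = len(ordered)
    # Number of impostor scores allowed above the threshold (small epsilon
    # guards against floating-point error in the product).
    allowed = math.floor(target_far * n + 1e-9)
    return ordered[n - allowed - 1]
```

With 42 single-template clients this yields 42 x 41 = 1722 impostor comparisons per matcher, so at 1%FAR at most 17 of them may score above the selected threshold.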
It should be noted that this protocol follows the guidelines proposed in [3] for benchmarking template update algorithms: an image different from all the templates is always a query, i.e., an input sample acquired during system operation, and a template image never becomes a query image, even as an impostor for other clients. We are aware that the adopted database size may not be fully appropriate for the task, but it matches, on average, the size adopted in other template update works reported in the literature [2-6].

3.2. Results: experiment 1

The goal of this experiment is to compare the performance of template co-update with that of self update systems in an uncontrolled environment, when the initial enrolled templates are not representative of the variations in the unlabelled and test sets. The characteristics of the initial templates, unlabelled and test sets are described in Table 1. The conditions assumed are realistic and the environment is partially uncontrolled.

Set | Characteristics
Initial templates | Neutral face expression and good quality fingerprint image
Unlabelled set | Expression and lighting variations for face, and variations such as rotation, pressure, moisture and partial print for fingerprint images
Test set | The same as the unlabelled set

Table 1. Characteristics of templates, unlabelled and test sets used in Experiment 1.

The adopted self update algorithm is an offline process, in which the system operates on a batch of unlabelled data collected during normal system operation: iteratively, unlabelled data recognized with high confidence is fused with the template of the related client, and the threshold is re-estimated after each iteration. More details can be found in [5]. The experimental protocol is the same as described in Section 3.1. The performance of the face verification system is always better than that of the fingerprint system, due to the difficult fingerprint data set adopted (DIEE-Extreme).
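For comparison with co-update, the offline self update loop described above can be sketched as follows for a single client and a single matcher. This is a simplified illustration, not the algorithm of [5]: `score`, `fuse` and `estimate_threshold` are hypothetical callables supplied by the caller, and the template is treated as one super template.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

F = TypeVar("F")  # type of a feature set


def self_update(template: F,
                unlabelled: Sequence[F],
                score: Callable[[F, F], float],
                fuse: Callable[[F, F], F],
                estimate_threshold: Callable[[F], float]) -> Tuple[F, List[F]]:
    """Offline self update of one client's template on a batch of unlabelled data.

    Returns the updated template and the list of exploited samples.
    """
    remaining = list(unlabelled)
    exploited: List[F] = []
    while True:
        threshold = estimate_threshold(template)  # re-estimated at every iteration
        # Samples recognized with high confidence by the matcher itself.
        confident = [x for x in remaining if score(x, template) > threshold]
        if not confident:
            break  # nothing more can be exploited
        for x in confident:
            template = fuse(x, template)
            remaining.remove(x)
            exploited.append(x)
    return template, exploited
```

In such a loop a sample far from the current template is only exploited if earlier fusions have moved the template close enough to it, which is the limitation that the comparison with co-update is meant to expose.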
Figure 3 shows the performance of the self update and co-update techniques, measured in terms of Equal Error Rate (EER) with the threshold selected at 1%FAR, as a function of the cumulative amount of unlabelled data added to the systems at each iteration (Repeat-Until cycle of Figure 2). It can be noticed that the final EER of self update is higher than that of the co-update technique. The unlabelled set exploited by co-update is larger than that of self update, as shown by the self update curves, which end before those of co-update. This means that the self update process yields only a limited enhancement in performance, due to the non-exploitation of many useful samples. On the contrary, the co-update system, using the complementary information of face and fingerprint together, exploits these samples, which supports the hypothesis made in Section 1. The unevenly spaced points on the x axis reflect the uneven amount of unlabelled data exploited by the self update and co-update techniques.

Figures 4 and 5 show the updated template sets after the last iteration of the co-update and self update algorithms, respectively, for a randomly chosen client. It can be seen that in the self update template sets the majority of samples are very similar to the initial template, and some difficult patterns may be inserted at the later iterations. In the co-update template sets, many samples far from the template (intra-class variations) are inserted in the early iterations, thanks to the complementary biometric matcher, thus improving the template representativeness and hence the system's generalization capability.

Figure 3. EER on the test set as a function of the amount of unlabelled data exploited by template update algorithms in Experiment 1. The curve of self update is shorter due to the non-exploitation of much unlabelled data. (Axes: EER (%) versus number of unlabelled data added; curves: Face Self-Update, Finger Self-Update, Face Co-update, Finger Co-update.)

Figure 4. The template set generated by co-updating face and fingerprint systems. The first face and fingerprint couple represent the initial template.

Figure 5. The template set generated by self updating face and fingerprint systems. The first face and fingerprint couple represent the initial templates.

Figure 6. Unlabelled face samples of Figure 4 that are fused (a) and added separately (b) using the proposed co-update method. For the sake of clarity, only samples of the face gallery are reported.

According to the adopted algorithms, the samples in Figure 5 are progressively "fused" by self update. Instead, the proposed co-update algorithm fuses only very similar samples. Figure 6(a-b) shows the same samples as Figure 4, organized according to the proposed algorithm. In particular, very similar samples are fused (Figure 6(a)), whilst significant intra-class variations are kept separate (Figure 6(b)), thus improving the expressive power (representativeness) of the obtained gallery.

As mentioned in Section 1, it can be argued that operating at lower thresholds (e.g., at 2%FAR, 5%FAR, etc.) could allow self update to capture larger intra-class variations. But this could easily lead to the introduction of classification errors into the template set, making self update counter-productive. This is experimentally shown in Figure 7. For the sake of clarity, Figure 7 refers to the face matcher only, and evaluates the effect of relaxing the operating threshold of self update from 0%FAR to 5%FAR. The initial EER of the system is 17% (denoted by *). At the 0%FAR threshold, there is a limited improvement in performance, due to the stringent threshold conditions and the non-exploitation of much data. From 0%FAR to 1%FAR, the performance improves due to the exploitation of more data. But beyond 1%FAR, the EER increases due to the inclusion of impostors and, at 5%FAR, it is even higher than the initial EER.

In general, any template update technique [2-6] suffers from the risk of introducing impostors. No analytical study has been proposed so far to model this effect, so it is difficult to predict its impact. However, a simple study of Figures 3 and 7 allows us to conclude that co-update suffers much less from this drawback: a significant increase in the galleries' representativeness can be achieved even by adopting a stringent operating threshold (1%FAR), which at the same time minimizes the probability of introducing impostors.

Figure 7. EER values obtained for the face matcher as a function of the acceptance threshold, selected in the range from 0%FAR to 5%FAR, for exploiting large intra-class variations of the unlabelled data. The straight line marked by the * symbol represents the EER obtained by using only the initial templates. (Axes: EER (%) versus %FAR used for selecting the threshold for unlabelled data; legend: initial accuracy, face self-update at varying threshold.)

3.3. Results: experiment 2

The aim of this experiment is to compare the performance of template co-update and self update when the initial template set represents the unlabelled and test data well. This is the case of a highly controlled environment, where negligible intra-class variations are present.

In this experiment, the data set used in experiment 1 has been modified to ensure that the genuine data of the unlabelled and test sets consist of images well represented by the initial template. As visual inspection of the images does not help us decide whether they are representative of the template gallery set, the selection of highly representative images has been done on the basis of a high matching score with respect to the initial template. The number of selected client images is such that the lowest genuine matching score is higher than the highest impostor score, which ensures a clear separation between client and impostor trials in addition to similarity to the initial template. This simulated data set helps to compare self update and co-update when the environment is very controlled. Table 2 shows the characteristics of the initial templates, unlabelled and test sets.

Set | Characteristics
Initial templates | Neutral face expressions and good quality fingerprint
Unlabelled set | Expressions near to neutral, no lighting variations for face, and negligible variations for fingerprint images
Test set | The same as the unlabelled set

Table 2. Characteristics of templates, unlabelled and test sets used in Experiment 2.

Figure 8. EER on the test set as a function of the amount of unlabelled data exploited by template update algorithms in Experiment 2. It can be noticed that the self update and co-update curves exploit an almost similar number of unlabelled data. (Axes: EER (%) versus number of unlabelled data added; curves: face self-update, face co-update, finger self-update, finger co-update.)

Figure 8 shows the EER on the test set as a function of the amount of unlabelled data used for updating. It may be noted that the amount of unlabelled data exploited is more or less the same for both techniques, and the differences among the EERs of self update and co-update are not very significant. Moreover, at the cross-over points, self update is better than co-update. This may be due to some good samples captured by self update which could not be captured by co-update.
Accordingly, if the initial templates are representative enough and the environment is controlled, co-update and self update techniques perform almost equally well. This is the case of very controlled environments, where the user is trained and cooperative.

4. Conclusions

In this paper, we investigated the performance of the recently introduced co-update against the state-of-the-art self update technique with respect to the representativeness of the initial enrolled templates. Reported results pointed out that, when the initial templates cannot be considered well representative of novel input samples, template co-update is able to capture large intra-class variations thanks to the complementary nature of the adopted biometrics. On the other hand, self update needs to operate at a lower threshold to exploit such variations, but at the expense of introducing impostors that compromise the overall system reliability and performance. However, in a well controlled environment, when very few variations are admitted for clients, self update works almost as well as co-update, thus enabling a reduction of the system costs and invasiveness (as only a single biometric can be used).

References

[1] U. Uludag, A. Ross, and A. Jain, Biometric template selection and update: a case study in fingerprints, Pattern Recognition, 37(7).
[2] X. Jiang and W. Ser, Online fingerprint template improvement, IEEE Trans. PAMI, 24(8).
[3] C. Ryu, K. Hakil, A. Jain, Template adaptation based fingerprint verification, Proc. of International Conference on Pattern Recognition (ICPR), vol. 4.
[4] X. Liu, T. Chen, S.M. Thornton, Eigenspace updating for non-stationary process and its application to face recognition, Pattern Recognition.
[5] F. Roli, G.L. Marcialis, Semi-supervised PCA-based face recognition using self training, Proc. Joint IAPR Int. Work. on Structural and Syntactical Pattern Recognition and Statistical Techniques in Pattern Recognition S+SSPR06, Springer LNCS 4109.
[6] F. Roli, L. Didaci, and G.L. Marcialis, Template co-update in multimodal biometric systems, IEEE/IAPR 2nd International Conference on Biometrics ICB 2007, Springer LNCS 4642.
[7] A. Jain, L. Hong, R. Bolle, On-line fingerprint verification, IEEE Transactions on PAMI, 19(4).
[8] D.G. Lowe, Object recognition from local scale invariant features, International Conference on Computer Vision, Corfu, Greece, September.
[9] A. Rattani, D. Kisku, A. Lagorio, M. Tistarelli, Facial template synthesis based on SIFT features, IEEE Int. Workshop on Automatic Identification Advanced Technologies AutoID 2007.
[10]

