A New Gabor Phase Difference Pattern for Face and Ear Recognition

Yimo Guo 1,2, Guoying Zhao 1, Jie Chen 1, Matti Pietikäinen 1 and Zhengguang Xu 2

1 Machine Vision Group, Department of Electrical and Information Engineering, University of Oulu, PO Box 4500, 90014, Finland
2 School of Information Engineering, University of Science and Technology Beijing, Beijing, 100083, China

Abstract. A new local feature based image representation method is proposed. It is derived from the local Gabor phase difference pattern (LGPDP). The method represents images by exploiting the relationships between the Gabor phase of a pixel and those of its neighbors. There are two main contributions: 1) a novel phase difference measure is defined; 2) new encoding rules that reflect Gabor phase difference information are designed. As a result, the method describes the Gabor phase difference more precisely than the conventional LGPDP. Moreover, it discards the useless and redundant information produced near the quadrant boundary, which commonly exists in the LGPDP. It is shown that the proposed method brings higher discriminative ability to Gabor phase based patterns. Experiments are conducted on the FRGC version 2.0 and USTB Ear Database to evaluate its validity and generalizability. The proposed method is also compared with several state-of-the-art approaches, and it achieves the highest recognition rates among them.

1 Introduction

Over the last decades, biometrics has gained increasing attention because of its broad applications, ranging from identification to security. As one of its main research topics, face recognition has been developing rapidly. Numerous face recognition methods have been put forward and adopted in real-life technologies. Meanwhile, ear recognition has also attracted interest in the research and commercial communities, since the human ear is one of the representative human identifiers and ear recognition does not encounter facial expression and aging problems [1].
The main task of vision-based biometrics is to extract compact descriptions from images that are subsequently used to confirm identity [2]. For face recognition, the key problem is to represent objects effectively and improve recognition performance, and the same holds for ear recognition. In this paper, a new image representation method is presented based on the Local Gabor Phase Difference Pattern (LGPDP) [8]. It captures the Gabor phase difference in a novel way to represent images. According to the Gabor function and the new phase difference definition, we design encoding rules for effective feature extraction. The proposed encoding rules have the following characteristics: 1) they divide the quadrants precisely and encode them by a 2-bit number; and 2) they avoid useless patterns that might be produced near the quadrant boundary, which is an unsolved problem in the LGPDP. From the results of experiments conducted on the FRGC ver 2.0 database for face recognition and the USTB ear database for ear recognition, the proposed method is observed to further improve the capability of capturing information from the Gabor phase. The extension of its application from face recognition to ear recognition also achieves impressive results, which demonstrates its ability as a general image representation for biometrics. To the best of our knowledge, this is the first use of the Gabor phase difference in ear image representation.

2 Background of Methodology

Various image representation methods have been proposed for vision-based biometrics. For face recognition, these methods can generally be divided into two categories: holistic matching methods and local matching methods. The local binary pattern (LBP) [16], Gabor features and their related methods [3]-[6] have been considered promising ways to achieve high recognition rates in face recognition. One of the influential local approaches is the histogram of Gabor phase patterns (HGPP) [7]. It uses global and local Gabor phase patterns for representation, taking advantage of the fact that the Gabor phase can provide useful information as well as the Gabor magnitude. To avoid the sensitivity of the Gabor phase to location variations, the Local Gabor Phase Difference Pattern (LGPDP) was put forward later [8]. Unlike the HGPP, which exploits Gabor phase relationships between neighbors, the LGPDP encodes discriminative information in an elaborate way to achieve a better result. However, its encoding rules result in a loose quadrant division. This might produce useless and redundant patterns near the quadrant boundary, which bring confounding effects and reduce efficiency. For ear recognition, research involves 2D ear recognition, 3D ear recognition, earprint recognition and so on.
Although many approaches have been proposed, such as PCA for ear recognition [9], Linear Discriminant Analysis and their kernel based methods [10][11][12][13], 2D ear recognition remains a challenging task in real world applications, as most of these methods are based on statistical learning theory. This later inspired the use of local approaches [14]. It has been confirmed that the Gabor phase can provide discriminative information for classification and that the Gabor phase difference has sufficient discriminative ability [6][8]. However, the Gabor phase difference should be exploited carefully to avoid useless information. Therefore, we are motivated to give a new phase difference definition and design new encoding rules in order to extract features from Gabor phase differences effectively.

3 Gabor Phase based Image Representation Method

3.1 Gabor Function as Image Descriptors

In this section, we present the new local feature based image representation method. Like the LGPDP, it captures Gabor phase differences between the reference pixel and its neighboring pixels at each scale and orientation, but it uses a new phase difference definition and new encoding rules. Gabor wavelets are biologically motivated convolution kernels in the shape of plane waves, restricted by a Gaussian envelope function [15]. The general form of a 2D Gabor wavelet is defined as:
$$\Psi_{\mu,v}(z) = \frac{\|k_{\mu,v}\|^2}{\sigma^2}\, e^{-\|k_{\mu,v}\|^2 \|z\|^2 / (2\sigma^2)} \left[ e^{i k_{\mu,v} z} - e^{-\sigma^2/2} \right]. \qquad (1)$$

In this equation, $z = (x, y)$ is the variable in a complex spatial domain, $\|\cdot\|$ denotes the norm operator, and $\sigma$ is the standard deviation of the Gaussian envelope, which determines the number of oscillations. The wave vector $k_{\mu,v}$ is defined as $k_{\mu,v} = k_v e^{i\phi_\mu}$, where $k_v = k_{max}/f^v$ and $\phi_\mu = \pi\mu/8$; $k_{max}$ is the maximum frequency of interest, and $f$ is the spacing factor between kernels in the frequency domain.

3.2 The Novel Local Gabor Phase Difference Pattern

Because of their biological relevance to human vision, Gabor wavelets can enhance visual properties that are useful for image understanding and recognition [4]. Thus, Gabor filter based representations are expected to be robust to unfavorable factors such as illumination. The idea of using the Gabor function as an image descriptor is to enhance discriminative information by convolving the original image with a set of Gabor kernels at different scales and orientations. A Gabor wavelet kernel is the product of an elliptical Gaussian envelope and a complex plane wave. The Gabor kernels in Equation (1) are all self-similar, since they can be generated by scaling and rotation via the wave vector $k_{\mu,v}$. We choose eight orientations $\mu \in \{0, 1, \ldots, 7\}$ and five scales $v \in \{0, 1, \ldots, 4\}$, which gives a total of 40 Gabor kernels. The values of the other parameters follow the setting in [4]: $\sigma = 2\pi$, $k_{max} = \pi/2$, $f = \sqrt{2}$. The Gabor-based feature is obtained by the convolution of the original image $I(z)$ with each Gabor filter $\Psi_{\mu,v}(z)$:

$$O_{\mu,v}(z) = I(z) * \Psi_{\mu,v}(z). \qquad (2)$$

$O_{\mu,v}(z)$ is the convolution result corresponding to the Gabor kernel at orientation $\mu$ and scale $v$. The magnitude and phase spectra of the 40 $O_{\mu,v}(z)$ are shown in Fig. 1. The magnitude spectrum of $O_{\mu,v}(z)$ is defined as:

$$M_{O_{\mu,v}}(z) = \sqrt{\mathrm{Re}(O_{\mu,v}(z))^2 + \mathrm{Im}(O_{\mu,v}(z))^2}, \qquad (3)$$

where $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ denote the real and imaginary parts of the Gabor transformed image, respectively.
Usually, the magnitude part $M_{O_{\mu,v}}(z)$ of $O_{\mu,v}(z)$ is adopted in feature selection [4][17]. In our case, we choose the phase part of $O_{\mu,v}(z)$ to utilize the discriminative power of the Gabor phase, which was confirmed in [18]. The phase spectrum of $O_{\mu,v}(z)$ is defined as:

$$\phi_{O_{\mu,v}}(z) = \arctan\frac{\mathrm{Im}(O_{\mu,v}(z))}{\mathrm{Re}(O_{\mu,v}(z))}. \qquad (4)$$

Fig. 1. Illustrative Gabor magnitude spectrum (left) and Gabor phase spectrum (right).
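The filtering stage of Equations (1) to (4) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the kernel window size (31 pixels), the FFT-based "same" convolution, and the function names `gabor_kernel` and `gabor_responses` are assumptions introduced here. The phase is evaluated with the four-quadrant arctangent (`np.angle`), which is one way to obtain a phase over the full circle from Equation (4).

```python
import numpy as np

def gabor_kernel(mu, v, size=31, sigma=2 * np.pi, k_max=np.pi / 2, f=np.sqrt(2)):
    """Sample the Gabor wavelet of Eq. (1) on a size x size grid.
    The window size is an assumption; sigma, k_max and f follow the text."""
    k = (k_max / f ** v) * np.exp(1j * np.pi * mu / 8)    # wave vector k_{mu,v}
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k2 = np.abs(k) ** 2                                   # ||k_{mu,v}||^2
    z2 = x ** 2 + y ** 2                                  # ||z||^2
    envelope = (k2 / sigma ** 2) * np.exp(-k2 * z2 / (2 * sigma ** 2))
    # Plane wave minus the DC-compensation term exp(-sigma^2 / 2).
    carrier = np.exp(1j * (k.real * x + k.imag * y)) - np.exp(-sigma ** 2 / 2)
    return envelope * carrier

def gabor_responses(image, orientations=8, scales=5, size=31):
    """Eq. (2): convolve the image with every kernel using zero-padded FFTs.
    Returns a complex array of shape (scales, orientations, H, W)."""
    h, w = image.shape
    shape = (h + size - 1, w + size - 1)                  # full linear convolution
    img_f = np.fft.fft2(image, s=shape)
    half = size // 2
    out = np.empty((scales, orientations, h, w), dtype=complex)
    for v in range(scales):
        for mu in range(orientations):
            ker_f = np.fft.fft2(gabor_kernel(mu, v, size), s=shape)
            full = np.fft.ifft2(img_f * ker_f)
            out[v, mu] = full[half:half + h, half:half + w]  # crop to image size
    return out

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # stand-in for a normalized face or ear image
O = gabor_responses(image)            # 5 scales x 8 orientations = 40 responses
magnitude = np.abs(O)                 # Eq. (3)
phase = np.angle(O)                   # Eq. (4) via arctan2, values in (-pi, pi]
```

Only `phase` is needed for the pattern described next; `magnitude` is computed here merely to mirror Equation (3).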
Our method is based on the local Gabor phase difference pattern, which captures discriminative information from the Gabor phase for image representation [8]. In the LGPDP, the absolute values of the Gabor phase differences, which range from 0 to $2\pi$, are calculated for each pixel in the image. They are then reformulated to a 1-bit number: 1 denotes phase differences from 0 to $\pi/2$, and 0 denotes phase differences from $\pi/2$ to $2\pi$. This pattern has two shortcomings: 1) the division of the quadrant is loose, so 3/4 of the quadrant is encoded as a single 1-bit number; 2) as phase differences are taken as $|\Delta\theta_{\mu,v}(z)|$, Gabor phase differences near $2\pi$ are almost useless. To increase the efficiency of the Gabor phase based pattern, in our method we define the phase difference as $\min\{|\Delta\theta_{\mu,v}(z)|,\ 2\pi - |\Delta\theta_{\mu,v}(z)|\}$. Thus, the values of the Gabor phase differences lie in $[0, \pi]$ and are reformulated to a 2-bit number by the encoding rules, where $C_p$ denotes the new coding. In this way, the range of the Gabor phase difference is concentrated from $[0, 2\pi]$ to $[0, \pi]$. Each $\pi/4$ of the half quadrant can therefore be encoded as a 2-bit number, which is more precise than the LGPDP, which divides the quadrant into two unequal parts. Meanwhile, the new definition of the phase difference discards useless information near the quadrant boundary. The codes of the eight neighbors are then combined into a 16-bit binary string for each pixel and converted to a decimal number in the range [0, 255]. This process is described in Fig. 2. The eight 2-bit numbers are concatenated into a 16-bit number without weight, so that the histogram is not strongly dependent on the ordering of neighbors (clockwise in the LGPDP). Each of these values represents a mode of how the Gabor phase of the reference pixel differs from that of its neighbors and what the range between them is. Fig. 2 gives an example of the pattern. The visualizations of the new pattern and the LGPDP are illustrated in Fig. 3.
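The phase difference measure and the 2-bit quantization described above can be sketched as follows. Several details are assumptions made only for illustration, because the text does not fix them: the assignment of the codes 0 to 3 to the four $\pi/4$-wide intervals in ascending order, the clockwise neighbor order starting at the top-left pixel, the edge padding at the image border, and the rescaling of the 16-bit value to [0, 255] by integer division by 257. The function names are hypothetical.

```python
import numpy as np

TWO_PI = 2 * np.pi

def phase_difference(theta_ref, theta_nb):
    """New measure min{|dtheta|, 2*pi - |dtheta|}; result lies in [0, pi].
    Assumes both phases lie in (-pi, pi], so |dtheta| < 2*pi."""
    d = np.abs(theta_ref - theta_nb)
    return np.minimum(d, TWO_PI - d)

def two_bit_code(diff):
    """Quantize a difference in [0, pi] into four pi/4-wide intervals (0..3).
    ASSUMPTION: codes are assigned in ascending order of the interval."""
    return np.minimum((diff // (np.pi / 4)).astype(int), 3)

def pattern_map(phase):
    """Encode a 2-D phase map into per-pixel pattern values in [0, 255]."""
    h, w = phase.shape
    padded = np.pad(phase, 1, mode="edge")        # ASSUMPTION: replicate borders
    # ASSUMPTION: clockwise neighbor order starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    word = np.zeros((h, w), dtype=np.int64)
    for dy, dx in offsets:
        nb = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        # Append the neighbor's 2-bit code to the 16-bit string.
        word = (word << 2) | two_bit_code(phase_difference(phase, nb))
    # ASSUMPTION: map the 16-bit value to [0, 255] (65535 // 257 == 255).
    return word // 257

rng = np.random.default_rng(0)
demo_phase = rng.uniform(-np.pi, np.pi, size=(16, 16))
demo_pattern = pattern_map(demo_phase)
```

Note that two phases of 3.0 and -3.0 radians have an absolute difference of 6.0, but the new measure returns $2\pi - 6.0 \approx 0.28$, which reflects that the two angles are in fact close on the circle.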
The values of μ and v are selected randomly. The histograms (256 bins) of Gabor phase differences at different scales and orientations are calculated and concatenated to form the image representation. As a single histogram loses spatial structure information, images are decomposed into sub-regions, from which local features are extracted. To capture both global and local information, these histograms are concatenated into an extended histogram for each scale and orientation. The discriminative capability of this pattern can be observed from the results of the histogram distance comparison (μ = 90, v = 5.47), listed in Tables 1 and 2. S1x (x = 1, 2) and S2y (y = 1, 2) are four images of two subjects.

Table 1. The histogram distances of four images for two subjects using the proposed pattern.

Subjects   S11    S12    S21    S22
S11        0      556    3986   5144
S12        --     0      370    5308
S21        --     --     0      86
S22        --     --     --     0
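The region-wise histogram construction described before Table 1 can be sketched as below. This is a minimal sketch under stated assumptions: the 8 x 8 uniform grid (64 sub-regions, as used for the face experiments), non-overlapping regions, and the function names `regional_histogram` and `image_descriptor` are choices made here for illustration, and the pattern maps are assumed to hold integers in [0, 255].

```python
import numpy as np

def regional_histogram(pattern, grid=(8, 8), bins=256):
    """Concatenate the 256-bin histograms of all sub-regions of one pattern map.
    ASSUMPTION: a uniform, non-overlapping grid of sub-regions."""
    rows = np.array_split(np.arange(pattern.shape[0]), grid[0])
    cols = np.array_split(np.arange(pattern.shape[1]), grid[1])
    hists = []
    for r in rows:
        for c in cols:
            block = pattern[r[0]:r[-1] + 1, c[0]:c[-1] + 1]
            hists.append(np.bincount(block.ravel(), minlength=bins)[:bins])
    return np.concatenate(hists)

def image_descriptor(pattern_maps, grid=(8, 8)):
    """Concatenate regional histograms over all (scale, orientation) maps."""
    return np.concatenate([regional_histogram(p, grid) for p in pattern_maps])

rng = np.random.default_rng(0)
# Three random maps stand in for the 40 pattern maps of one image.
maps = [rng.integers(0, 256, size=(120, 120)) for _ in range(3)]
descriptor = image_descriptor(maps)
```

With 40 pattern maps and 64 sub-regions, the descriptor would have 40 x 64 x 256 entries, so in practice the grid size trades spatial detail against descriptor length.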
Table 2. The histogram distances of four images for two subjects using the LGPDP.

Subjects   S11    S12    S21    S22
S11        0      316    3630   4166
S12        --     0      3300   3788
S21        --     --     0      93
S22        --     --     --     0

Fig. 2. Quadrant bit coding and an example of the new Gabor phase difference pattern. The first 8-neighborhood records the coefficients of π that describe the neighborhood Gabor phases. The second one records the coefficients of π that describe the neighborhood Gabor phase differences using the new definition. The third one records the corresponding binary numbers according to the new encoding rules. The binary string is 1011010010111000. The decimal number corresponding to this string is then normalized to the range [0, 255].

Fig. 3. Illustrative results of convolving an image with Gabor filters (v = 5.47, 8.0) using the proposed pattern (left) and the LGPDP (right).

4 Experiments

The proposed method is tested on the FRGC ver 2.0 database [19] and the USTB ear database [13] for face recognition and ear recognition, respectively. The classifier is the simplest classification scheme: a nearest neighbour classifier in the image space with the Chi square statistic as the similarity measure.

4.1 Experiments on the FRGC ver 2.0 Database

To evaluate the performance of the proposed method in face recognition, we conduct experiments on the FRGC version 2.0 database, which is one of the most challenging face databases [19]. The images are normalized and cropped to the size of 10 10 using the provided eye coordinates. Some samples are shown in Fig. 4.

Fig. 4. Face images from the FRGC 2.0 database.

In the FRGC 2.0 database, there are 12776 images taken from 222 subjects in the training set and 16028 images in the target set. We follow the Experiment 1 and Experiment 4 protocols to evaluate the performance of different approaches. In Experiment 1, there are 16028 query images taken under the controlled illumination condition. The goal of Experiment 1 is to test the basic recognition ability of the approaches. In Experiment 4, there are 8014 query images taken under the uncontrolled illumination condition. Experiment 4 is the most challenging protocol in FRGC, because large uncontrolled illumination variations make it difficult to achieve a high recognition rate. The experimental results on the FRGC 2.0 database in Experiments 1 and 4 are evaluated by Receiver Operating Characteristic (ROC) curves, which plot the face verification rate (FVR) against the false accept rate (FAR). Tables 3 and 4 list the FVR of different approaches at a FAR of 0.1% in Experiments 1 and 4. In both tables, the proposed method achieves the best performance, which demonstrates its basic ability in face recognition. Table 5 shows the comparison with some well-known approaches. The images are uniformly divided into 64 sub-regions in the local methods. The database used in the experiments for Gabor + Fisher Linear Discriminant Analysis (FLDA) and Local Gabor Binary Patterns (LGBP) is reported to be a subset of FRGC 2.0 [20], while the whole database is used for the others. It is observed that our pattern has high discriminative ability and improves face recognition performance.

Table 3. The FVR values in Experiment 1 of the FRGC 2.0 database (FVR at FAR = 0.1%, in %).

Methods              ROC 1    ROC 2    ROC 3
BEE Baseline [19]    77.63    75.13    70.88
LBP [16]             86.4     83.84    79.7
Our method           98.38    95.14    93.05

Table 4. The FVR values in Experiment 4 of the FRGC 2.0 database (FVR at FAR = 0.1%, in %).

Methods              ROC 1    ROC 2    ROC 3
BEE Baseline [19]    17.13    15.      13.98
LBP [16]             58.49    54.18    5.17
Our method           8.8      80.74    78.36

Table 5. The ROC 3 results on the FRGC 2.0 database in Experiment 4 (FVR at FAR = 0.1%, in %).

Methods              ROC 3
BEE Baseline [19]    13.98
Gabor + FLDA [20]    48.84
LBP [16]             5.17
LGBP [20]            5.88
LGPDP [8]            69.9
Our method           78.36

4.2 Experiments on the USTB Ear Database

For ear recognition, experiments are conducted on a subset (40 subjects) of the USTB ear database, which contains 308 images of 77 subjects [13]. These images are taken under 3 viewing conditions (azimuth α ∈ {−30°, 0°, 30°}) and different illumination conditions. The original images, with a resolution of 300 × 400, are cropped to grayscale images with a resolution of 270 × 360. Sample images for two subjects are shown in Fig. 5. In this experiment, three images of each subject are taken as the training set and the remaining one serves as the testing set. Considering that complex information is contained in the ear print area, we divide the images into sub-regions with 5-pixel overlapping.
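The nearest neighbour matching with the Chi square statistic used throughout the experiments can be sketched as follows. The exact form of the statistic is not given in the text; the symmetric form $\sum_i (a_i - b_i)^2 / (a_i + b_i)$, which is common for histogram features, is assumed here, and the function names and the small constant `eps` are illustrative choices.

```python
import numpy as np

def chi_square(h1, h2, eps=1e-10):
    """Chi square distance between two histograms.
    ASSUMPTION: symmetric form sum((a - b)^2 / (a + b)); eps avoids 0/0."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def nearest_neighbour(query, gallery, labels):
    """Return the label of the gallery histogram closest to the query."""
    distances = [chi_square(query, g) for g in gallery]
    return labels[int(np.argmin(distances))]

# Toy example: two training histograms per subject, one query.
gallery = [np.array([10, 0, 5, 1]), np.array([9, 1, 5, 1]),
           np.array([0, 12, 1, 3]), np.array([1, 11, 2, 2])]
labels = ["subject_A", "subject_A", "subject_B", "subject_B"]
query = np.array([1, 10, 2, 3])
predicted = nearest_neighbour(query, gallery, labels)
```

In the ear experiment, the gallery would hold the concatenated histograms of the three training images per subject, and the query would be the histogram of the remaining image.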
Fig. 5. Ear images from the USTB ear database for two subjects.

As in other local feature based methods, recognition performance can be improved by image division. Here we divide the images uniformly into nine sub-regions with small overlapping. The spatially enhanced histogram is defined as the combination of the features extracted from each sub-region. In this way, the texture of the image is locally encoded by micro-patterns, and the ear shape can be recovered by the construction of the feature histograms. To evaluate the performance of the proposed method in ear recognition, it is compared with some widely used methods: Principal Components Analysis (PCA) for ear recognition, Fisher Discriminant Analysis (FDA), the rotation invariant descriptor, the local binary pattern (LBP) and the LGPDP. The average recognition rates (in %) using cross validation are listed in Table 6. From the experimental results, we observe that the proposed pattern performs well in ear recognition, which demonstrates its efficiency and generalizability as an image representation for biometrics.

Table 6. Experimental results of ear recognition.

Methods                              Recognition rate (in %)
PCA [9]                              78.68
FDA [10]                             85.71
Rotation invariant descriptor [12]   88.3
LBP [16]                             89.79
LGPDP [8]                            89.53
Our method                           9.45

5 Conclusions

In this paper, we propose a new Gabor phase based image representation method, which is based on the local Gabor phase difference pattern (LGPDP). The conventional LGPDP has two disadvantages: 1) 3/4 of the quadrant is encoded as a single 1-bit number because of its loose quadrant division; and 2) Gabor phase difference patterns near $2\pi$ are almost useless, because the phase difference is defined as the absolute value of the phase distance between neighbors, which might bring confounding effects to the image representation. Therefore, we propose a new local feature for effective image representation, which discards useless information by defining the phase difference measure in a novel way.
Moreover, new encoding rules are designed to provide a more precise quadrant division than the LGPDP. By virtue of these two contributions, the discriminative ability of the Gabor phase based pattern is significantly improved. The method is evaluated on both the FRGC version 2.0 database and the USTB ear database. It is also compared with several state-of-the-art approaches and achieves the highest recognition rates among them. The experimental results demonstrate its capability and generalizability as an image representation.

Acknowledgments. The authors would like to thank the Academy of Finland for their support of this work.
References

1. Iannarelli, A.: Ear Identification. Forensic Identification Series. Paramount Publishing Company, Fremont, California (1989)
2. Burge, M., Burger, W.: Ear Recognition. In: Jain, A. K., Bolle, R., Pankanti, S. (eds.) Biometrics: Personal Identification in Networked Society, pp. 73--86, Kluwer Academic Publishing (1998)
3. Lyons, M.J., Budynek, J., Plante, A., Akamatsu, S.: Classifying Facial Attributes using a 2-d Gabor Wavelet Representation and Discriminant Analysis. In: IEEE International Conference on Automatic Face and Gesture Recognition, pp. 1357--1362 (2000)
4. Liu, C., Wechsler, H.: Gabor Feature based Classification using the Enhanced Fisher Linear Discriminant Model for Face Recognition. IEEE Transactions on Image Processing, vol. 4, pp. 467--476 (2002)
5. Zhang, W., Shan, S., Gao, W., Chen, X., Zhang, H.: Local Gabor Binary Pattern Histogram Sequence (LGBPHS): A Novel Non-Statistical Model for Face Representation and Recognition. In: International Conference on Computer Vision, pp. 786--791 (2005)
6. Zhang, W., Shan, S., Chen, X., Gao, W.: Are Gabor Phases Really Useless for Face Recognition? In: International Conference on Pattern Recognition, pp. 606--609 (2006)
7. Zhang, B., Shan, S., Chen, X., Gao, W.: Histogram of Gabor Phase Pattern (HGPP): A Novel Object Representation Approach for Face Recognition. IEEE Transactions on Image Processing, vol. 1, pp. 57--68 (2007)
8. Guo, Y., Xu, Z.: Local Gabor Phase Difference Pattern for Face Recognition. In: International Conference on Pattern Recognition, pp. 1--4 (2008)
9. Chang, K., Bowyer, K., Sarkar, S., Victor, B.: Comparison and Combination of Ear and Face Images in Appearance-Based Biometrics. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 1160--1165 (2003)
10. Liu, Y., Mu, Z., Yuan, L.: Application of Kernel Function Based Fisher Discriminant Analysis Algorithm in Ear Recognition. Measurements and Control, pp. 304--306 (2006)
11. Shailaja, D., Gupta, P.: A Simple Geometric Approach for Ear Recognition. In: International Conference on Information Technology, pp. 164--167 (2006)
12. Fabate, A., Nappi, M., Riccio, D., Ricciardi, S.: Ear Recognition by means of a Rotation Invariant Descriptor. In: International Conference on Pattern Recognition (2006)
13. Yuan, L., Mu, Z.: Ear Recognition based on 2D Images. In: IEEE International Conference on Biometrics: Theory, Applications, and Systems, pp. 1--5 (2007)
14. Guo, Y., Xu, Z.: Ear Recognition using a New Local Matching Approach. In: IEEE International Conference on Image Processing, pp. 89--9 (2008)
15. Wiskott, L., Fellous, J.-M., Kruger, N., Malsburg, C.v.d.: Face Recognition by Elastic Bunch Graph Matching. Intelligent Biometric Techniques in Fingerprint and Face Recognition, Chapter 11, pp. 355--396 (1999)
16. Ahonen, T., Hadid, A., Pietikäinen, M.: Face Description with Local Binary Pattern. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, pp. 2037--2041 (2006)
17. Tao, D., Li, X., Wu, X., Maybank, S. J.: General Tensor Discriminant Analysis and Gabor Features for Gait Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, pp. 1700--1715 (2007)
18. Qing, L., Shan, S., Chen, X., Gao, W.: Face Recognition under Varying Lighting based on the Probabilistic Model of Gabor Phase. In: International Conference on Pattern Recognition, vol. 3, pp. 1139--1142 (2006)
19. Phillips, P. J., Flynn, P. J., Scruggs, T., Bowyer, K. W., Chang, J., Hoffman, K., Marques, J., Min, J., Worek, W.: Overview of the Face Recognition Grand Challenge. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 947--954 (2005)
20. Lei, Z., Liao, S., He, R., Pietikäinen, M., Li, Stan Z.: Gabor Volume based Local Binary Pattern for Face Representation and Recognition. In: IEEE Conference on Automatic Face and Gesture Recognition (2008)