Image Processing and Image Representations for Face Recognition
1 Introduction

Face recognition is an active area of research in image processing and pattern recognition. Since the general topic of face recognition is a large and complex area of active research, it is not feasible to address it fully in a semester project. Thus we must restrict our attention to a specific part or application of the face recognition problem. The application we envision is face recognition for biometric security. Consider the problem of granting access to a restricted-access section of an office building. Suppose a face recognition system is installed near the area entrance, with access to a library of known company employees. The problem is that we need to correctly identify these workers to grant access even when captured images are not identical to the library images. Alternatively, we could consider a situation where we wish to deny individuals access to a location. For example, consider automated recognition of passport photos at ports of entry. In that case, we hope to identify individuals belonging to a database of people whom we would like to exclude from entering the United States (e.g. known terrorists). Since passport photos conform to certain formatting restrictions (image size, magnification, etc.), they represent an attractive data set for analysis. However, since the database likely does not contain the actual (and probably forged) passport photos, the problem is still difficult and interesting enough to merit attention.

1.1 Project overview

We propose to compare the performance of face recognition methods based on different face representations.
In particular, we intend to investigate the robustness of these methods to variations in face images, such as additive noise or changes in illumination conditions, and to see how their performance can be improved by preparing the images with various image processing techniques such as contrast enhancement or edge detection. The end-to-end face recognition process is summarized as a block diagram in Figure 1. Suppose we have a database of K individuals, and we select an image of the face of the kth individual in this database, which we denote f_k. We apply some combination of preprocessing algorithms to the image, and then use the processed image as the input to a face recognition procedure. The output of the face recognition procedure is an estimate k̂ of which individual's face was processed.

Figure 1: Block diagram of the face recognition process
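The pipeline of Figure 1 is simply the composition of a preprocessing stage and a recognition stage. As an illustrative sketch only (Python with numpy rather than MATLAB; the toy mean-intensity "recognizer" and all names here are ours, not taken from any library):

```python
import numpy as np

def identify(image, preprocess, recognize):
    """End-to-end pipeline of Figure 1: preprocess the input face
    image, then estimate the identity k-hat of its owner."""
    return recognize(preprocess(image))

# Toy stand-ins: "recognize" by nearest mean pixel intensity against
# a hypothetical library of K = 2 individuals.
library_means = {0: 10.0, 1: 200.0}

def preprocess(image):
    return image.astype(float)

def recognize(image):
    return min(library_means, key=lambda k: abs(image.mean() - library_means[k]))

print(identify(np.full((4, 4), 12), preprocess, recognize))  # -> 0 (the dark face)
```

In our project the `recognize` stage will instead compare PCA coefficients, as described in Section 3.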
Figure 2: Block diagram of the face recognition process with an empirically-derived image representation basis. (a) Basis set construction (training). (b) Recognition.

We are particularly interested in investigating face recognition procedures that convert the two-dimensional array representation of a face image into some other image representation (for example, a principal component representation), then use some metric (for example, minimum distance) to identify the face. Many of these representations express faces as a linear combination of basis images b_l. The end-to-end face recognition process when images are represented using an empirically-derived basis is summarized as a block diagram in Figure 2. We would like to investigate how these sorts of face recognition methods perform when they encounter variation in face images of a particular individual. Variations that can be accounted for with two-dimensional image processing techniques alone include additive noise, changes in illumination, and even disguises. Variations that would require additional techniques, such as three-dimensional head modelling or elastic deformation, include severe changes in facial expression or head orientation.

1.2 Proposal outline

We begin by reviewing the literature on face recognition. Next we describe the methods that we propose to use for this project, along with extensions that we would like to explore if time permits. Finally, we describe an approximate timetable for completing this project.

2 Literature review

A number of review articles have been published on the subject of face recognition. Zhang et al. [1] primarily focus on the eigenface (principal component analysis) representation for face recognition, along with neural networks and elastic deformation.
Older review articles include Samal and Iyengar [2] and Chellappa et al. [3]. Sirovich and Kirby were the first to suggest the Karhunen-Loève decomposition (also known as principal component analysis, or PCA) for the representation of faces [4]. The term eigenface for the PCA basis images was coined by Turk and Pentland [5, 6]. Bartlett et al. describe face recognition using an empirical basis derived using a method
known as independent component analysis (ICA) [7]. They chose this basis in the hope of improving the ability of recognition algorithms to distinguish individuals. Belhumeur et al. describe face recognition using linear discriminant analysis (LDA), with the aim of making the recognition algorithm insensitive to variations in illumination and facial expression [8]. They call their basis images Fisherfaces, since their criterion for generating basis images is Fisher's linear discriminant.

3 Materials and methods

In this section we describe the materials and methods that we propose to use for this project. In general, we assume that we will be using recognition algorithms based on image representation by PCA (eigenfaces), but if time allows we might also experiment with image representation by ICA or LDA (Fisherfaces).

3.1 Image databases

Because of the time limitations for completing this project, we hope to utilize whatever external resources are available. In particular, we would like to use existing face databases for our experiments. A number of databases are available, such as the AR Face Database [9] and the Yale Face Database B [10]. The AR Face Database contains 4000 images of 126 subjects (70 men and 56 women). The images are frontal views and feature variations in illumination, facial expression, and disguise. Access to the database must be requested via email. The Yale Face Database B contains images of 10 subjects under all 576 combinations of 9 head orientations and 64 illumination conditions, and is publicly accessible on the Internet at http://cvc.yale.edu/projects/yalefacesb/yalefacesb.html

3.2 Basis set construction

The basic block diagram for our experiments is shown in Figure 1. We will start with a training set of faces and perform PCA on this set, resulting in a basis set of eigenfaces. The training faces will be obtained from one of the pre-existing face databases described in Section 3.1; the exact choice is yet to be determined.
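A minimal sketch of basis set construction, and of the truncation discussed in Section 3.3, assuming the training faces are vectorized into the rows of a matrix. We plan to work in MATLAB, but the idea is shown here in Python with numpy, and the 95% variance target is only illustrative:

```python
import numpy as np

def eigenfaces(training):
    """PCA basis from a K x (M*N) matrix whose rows are vectorized
    training faces. Rows of the returned basis are the eigenfaces,
    ordered by decreasing eigenvalue magnitude."""
    mean_face = training.mean(axis=0)
    centered = training - mean_face
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt, s ** 2        # eigenvalues up to a 1/K factor

def truncate(eigenvalues, fraction=0.95):
    """Number of leading eigenfaces whose (descending) eigenvalues
    capture at least `fraction` of the total variance."""
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    return int(np.searchsorted(cumulative, fraction) + 1)

rng = np.random.default_rng(0)
faces = rng.random((5, 64))             # K = 5 random 8x8 "faces"
mean_face, basis, eigvals = eigenfaces(faces)
print(basis.shape)                      # -> (5, 64)
print(truncate(np.array([100.0, 50.0, 10.0, 5.0, 1.0])))  # -> 3
```

The SVD is used here rather than an explicit covariance eigendecomposition because it avoids forming the MN x MN covariance matrix and returns the components already sorted.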
A training set of K images, each containing M × N pixels, will formally yield a full basis of MN eigenfaces, of which at most K − 1 have nonzero eigenvalues after mean subtraction. This basis set will be computed once and used as a standard for all subsequent experiments.

3.3 Truncating the basis set

In practice, it is inefficient to use all MN basis images. To compress the data and make the recognition process more efficient, we will truncate the basis set. By ordering the basis set by the magnitude of the eigenvalues,
we can determine which basis images contribute the least information to the decomposition. Discarding these basis images compresses the data and gives a more efficient way to store and process large numbers of images.

3.4 Variations in face images

After creating a basis by performing PCA on the training set of faces, we will build a library of the PCA coefficients of all the faces we have obtained. We will then test the robustness of the coefficients for identifying altered versions of the faces in the library. These alterations can be viewed as a type of disturbance or corruption of the original image. The alterations we propose to investigate are additive white Gaussian noise (AWGN) and changes in illumination conditions. To test the effect of AWGN, we will simulate zero-mean noise with different variances in MATLAB and add it to the original face. To test the effect of illumination, we will use face images taken under different illumination conditions than the training images.

3.5 Nonlinear preprocessing techniques

We will also investigate the performance of our face recognition system with three different nonlinear preprocessing techniques: edge detection, median filtering, and contrast enhancement. We will apply the preprocessing to the training set and to all images stored in the library. We will then evaluate the performance of the system on test images that have undergone the same preprocessing. We hypothesize that edge detection will improve the performance of the system when the images are altered by illumination, since illumination variations can be modelled as low-frequency multiplicative noise. We hypothesize that median filtering will improve the performance of the system when the images are corrupted by AWGN. We hypothesize that regional contrast enhancement could improve the performance of the system under changes in illumination.
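The AWGN corruption of Section 3.4 and the median filtering of Section 3.5 can be sketched as follows (in Python with numpy rather than MATLAB's imnoise/medfilt2, which we expect to use in practice; the 3x3 window and the variance value are illustrative choices):

```python
import numpy as np

def add_awgn(image, variance, rng):
    """Add zero-mean white Gaussian noise of the given variance."""
    return image + rng.normal(0.0, np.sqrt(variance), image.shape)

def median_filter3(image):
    """3x3 median filter with edge replication (numpy only).
    Stacks the nine shifted views of the padded image and takes
    a per-pixel median across them."""
    padded = np.pad(image, 1, mode="edge")
    stacked = np.stack([padded[i:i + image.shape[0], j:j + image.shape[1]]
                        for i in range(3) for j in range(3)])
    return np.median(stacked, axis=0)

rng = np.random.default_rng(0)
face = np.full((5, 5), 100.0)                 # constant test "face"
noisy = add_awgn(face, variance=25.0, rng=rng)
denoised = median_filter3(noisy)
# The median filter pulls pixel values back toward the clean face:
print(abs(denoised - face).mean() < abs(noisy - face).mean())
```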
3.6 Face recognition algorithm

After applying the appropriate preprocessing, if any, we will use a face recognition algorithm to determine whether the face belongs to anyone in our library. The library will consist of the PCA coefficients of each face, along with the eigenfaces generated from the training set. The input face image will be decomposed using the eigenfaces, yielding a set of PCA coefficients. We will then compute the squared distance between the input coefficients and each set of coefficients in the library. The library entry with the smallest squared distance to the input coefficients will be the estimate of the identity. If this smallest squared distance is below a certain threshold, we conclude that the test image has been identified. Otherwise, we conclude that the test image does not match any identity stored in the library.
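The identification rule above can be sketched as follows (the coefficient vectors and the threshold value are illustrative, and `None` stands for "no match in the library"):

```python
import numpy as np

def identify(coeffs, library, threshold):
    """Return the library key whose PCA coefficients have the
    smallest squared distance to the input coefficients, or None
    if that distance exceeds the rejection threshold."""
    distances = {k: float(np.sum((coeffs - v) ** 2)) for k, v in library.items()}
    best = min(distances, key=distances.get)
    return best if distances[best] <= threshold else None

library = {"alice": np.array([1.0, 0.0]), "bob": np.array([0.0, 1.0])}
print(identify(np.array([0.9, 0.1]), library, threshold=0.5))  # -> alice
print(identify(np.array([5.0, 5.0]), library, threshold=0.5))  # -> None
```

The threshold trades off false acceptances against false rejections; we expect to tune it empirically.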
3.7 Performance evaluation

The primary metric for evaluating the performance of our processing and recognition algorithms will be the identification error rate. For example, if we train the algorithm on images where the subjects were illuminated head-on, then attempt to recognize images where the subjects were illuminated from their right at a 45° angle, we expect that the recognition algorithm might misidentify some individuals. We can then adjust the preprocessing algorithms to try to reduce the error rate. For this case, we hope to devise a preprocessing strategy that achieves a low error rate over a wide range of illumination angles.

In addition to the identification error rate, we plan to use several statistical metrics for evaluating the impact of image preprocessing on the performance of PCA/ICA/LDA. Bartlett et al. [7] claim that sparse coefficient distributions are advantageous for coding visual features. Sparse distributions are distinguished by high probabilities near the mean and relatively rare occurrences of large positive or negative values. Because outliers are so rare, their significance is increased simply by their rarity. Bartlett et al. argue that if sparseness is maximized without loss of information, then higher-order statistical relationships also become increasingly rare, leading to a system with a higher signal-to-noise ratio and better tolerance of partial information. The natural measure of sparseness is kurtosis. MathWorld (http://www.mathworld.com) defines kurtosis as the ratio of the fourth central moment of the distribution to the square of the variance. By convention, kurtosis is normalized to zero for the Gaussian distribution by subtracting 3:

    kurtosis = (1/N) Σ_i (x_i - µ)^4 / σ^4 - 3

By comparing the increase or decrease in kurtosis between the outputs for unprocessed test images and preprocessed test images, we can measure the impact of the preprocessing.
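The excess kurtosis defined above can be computed directly; a sketch in Python with numpy (`scipy.stats.kurtosis` computes the same Fisher-normalized quantity):

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth central moment over the squared variance, minus 3
    so that a Gaussian distribution scores zero."""
    x = np.asarray(x, dtype=float)
    mu, sigma2 = x.mean(), x.var()
    return ((x - mu) ** 4).mean() / sigma2 ** 2 - 3.0

rng = np.random.default_rng(0)
gaussian = rng.normal(size=100_000)    # excess kurtosis near 0
laplacian = rng.laplace(size=100_000)  # sparse distribution: near +3
print(excess_kurtosis(gaussian), excess_kurtosis(laplacian))
```

The Laplacian example illustrates why kurtosis serves as a sparseness measure: its density is more peaked at the mean and heavier-tailed than a Gaussian of the same variance.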
Preprocessing steps that increase the kurtosis should therefore lead to better recognition systems. The other metric we plan to use is an MSE measurement on noise-corrupted training images. Basis decomposition of the noisy images should yield basis coefficients that are perturbed from their ideal values. By defining the weighted distance between the original coefficients and the perturbed coefficients as the MSE, we can measure the error reduction achieved by each preprocessing method.

4 Extensions

If time permits, we might extend our work to cover more ambitious aspects of the face recognition problem. We could investigate how to compensate for some of the more difficult variations in face images, such as changes in facial expression, disguises, and changes in head orientation. To test the effect of different expressions, we can test against
images of the face while the subject is smiling, screaming, or angry. To test the effect of different disguises, we can test against images of the face while the subject is wearing sunglasses or a scarf. Finally, to test the effect of orientation, we can test against images in which the subject is not looking directly at the camera. We could also investigate decompositions other than PCA, such as ICA or LDA.

Figure 3: Project timetable (weekly schedule, late March through April, covering: develop project proposal; obtain face database; investigate algorithms in MATLAB; create basis images; initial testing with corrupted images; algorithm refinement; generate final results; prepare oral presentation; write project report)

5 Timetable

We have sketched out a timetable for our project in Figure 3. Within the next week, we plan to decide on a face database and begin investigating the algorithms for basis generation and recognition. By the beginning of April, we plan to begin investigating the effect of variations in face images on the performance of the recognition algorithms, and to explore and refine preprocessing techniques for improving recognition performance. We plan to generate our final results by mid-April and write them up for the project presentation and report.
Bibliography

[1] J. Zhang, Y. Yan, and M. Lades, "Face recognition: eigenface, elastic matching, and neural nets," Proc. IEEE, vol. 85, no. 9, pp. 1423-1435, September 1997.

[2] A. Samal and P. A. Iyengar, "Automatic recognition and analysis of human faces and facial expressions: a survey," Pattern Recognition, vol. 25, pp. 65-77, January 1992.

[3] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proc. IEEE, vol. 83, pp. 705-741, May 1995.

[4] L. Sirovich and M. Kirby, "Low-dimensional procedure for the characterization of human faces," J. Opt. Soc. Am. A, vol. 4, no. 3, pp. 519-524, March 1987.

[5] M. A. Turk and A. P. Pentland, "Face recognition using eigenfaces," in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, Maui, Hawaii, 3-6 June 1991, pp. 586-591. [Online]. Available: http://www.face-rec.org/algorithms/pca/mturk-cvpr91.pdf

[6] M. A. Turk and A. P. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, March 1991. [Online]. Available: http://www.face-rec.org/algorithms/pca/jcn.pdf

[7] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski, "Face recognition by independent component analysis," IEEE Trans. Neural Networks, vol. 13, no. 6, pp. 1450-1464, November 2002.

[8] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection," IEEE Trans. Pattern Anal. Machine Intell., vol. 19, no. 7, pp. 711-720, July 1997.

[9] A. Martinez and R. Benavente, "The AR face database," Purdue University School of Electrical and Computer Engineering, CVC Technical Report 24, June 1998. [Online]. Available: http://rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html

[10] A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman, "From few to many: illumination cone models for face recognition under variable lighting and pose," IEEE Trans. Pattern Anal. Machine Intell., vol. 23, no. 6, pp. 643-660, 2001.