Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong)

References:
[1] http://homepages.inf.ed.ac.uk/rbf/hipr2/index.htm
[2] http://www.cs.wisc.edu/~dyer/cs540/notes/vision.html

w04-p. R. Biometrics - Summer 2006
Image Processing & Pattern Recognition

Image Enhancement Techniques
- Convolution filter
- Gaussian filter
- Laplacian / Laplacian of Gaussian filter
- Unsharp filter
- Contrast Stretching
- Histogram Equalization

Feature Extraction Techniques
- Roberts Cross
- Sobel Edge Detector
- Canny Edge Detector
- Binarization

Pattern Recognition Techniques
- Eigenspace Representation of Images
- PCA (Principal Component Analysis)
- Example Face Recognition Algorithm
Image Enhancement Techniques - Convolution Filter [1]

Convolution is a simple and fundamental mathematical operation underlying many common image processing operators:
- a way of `multiplying together' two arrays of numbers, of different sizes but the same dimensionality
- used in image processing to implement operators whose output pixel values are simple linear combinations of certain input pixel values
- performed by sliding the kernel over the image, generally starting at the top left corner, moving the kernel through all the positions where it fits entirely within the boundaries of the image
Image Enhancement Techniques - Convolution Filter [1]

The convolution output is calculated by multiplying the kernel value and the underlying image pixel value for each of the cells in the kernel, and then adding all these products together.

Mathematically, for an image $I$ and an $m \times n$ kernel $K$, the output at position $(x, y)$ is

$O(x, y) = \sum_{j=1}^{m} \sum_{k=1}^{n} I(x + j - 1,\, y + k - 1)\, K(j, k)$
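The sliding-and-summing computation above can be sketched in NumPy. This is a minimal illustration (not code from the slides); strictly speaking it computes correlation (no kernel flip), which coincides with convolution for the symmetric kernels used throughout this material:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position, multiply
    kernel values with the underlying pixels and sum the products.
    Only positions where the kernel fits entirely inside the image
    are computed, as described in the slides."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2)) / 4.0          # 2x2 mean (box) kernel
```

Convolving a 4x4 image with a 2x2 kernel gives a 3x3 output, each value the average of the 2x2 patch beneath the kernel.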
Image Enhancement Techniques - Gaussian Filter [1]

The Gaussian filter is a 2-D smoothing convolution operator:
- used to blur images and remove detail and noise
- assumes the distribution has a mean of zero, i.e. is centred on the line x = 0

1-D form:

$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-x^2 / (2\sigma^2)}$

where sigma is the standard deviation of the distribution.
Image Enhancement Techniques - Gaussian Filter [1]

The idea of Gaussian smoothing is to use this 2-D distribution as a `point-spread' function.

2-D form:

$G(x, y) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2) / (2\sigma^2)}$

In practice the filter kernel is a discrete approximation to the Gaussian function (e.g. with sigma = 1.0).
Image Enhancement Techniques - Gaussian Filter [1]

Example: original image, smoothed with sigma = 1.0, smoothed with sigma = 2.0
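A discrete Gaussian kernel like the one in the slides can be built by sampling the 2-D form on a grid and renormalising. A minimal sketch (illustrative, not from the slides):

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Discrete approximation to the 2-D Gaussian
    G(x, y) = 1/(2*pi*sigma^2) * exp(-(x^2 + y^2)/(2*sigma^2)),
    sampled on a size-by-size grid centred on the mean (0, 0) and
    renormalised so the weights sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

k = gaussian_kernel(5, 1.0)   # 5x5 kernel, sigma = 1.0
```

Renormalising keeps the overall image brightness unchanged after smoothing; the kernel peaks at its centre and is symmetric.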
Image Enhancement Techniques - Laplacian / Laplacian of Gaussian Filter [1]

The Laplacian is a 2-D isotropic measure of the 2nd spatial derivative of an image:
- highlights regions of rapid intensity change
- often used for edge detection
- takes a graylevel image as input and produces another graylevel image as output

The Laplacian L(x, y) of an image with pixel intensity values I(x, y) is

$L(x, y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}$
Image Enhancement Techniques - Laplacian / Laplacian of Gaussian Filter [1]

The Laplacian can be calculated using a convolution filter. Two commonly used discrete approximations to the Laplacian kernel (up to an overall sign) are:

 0  -1   0        -1  -1  -1
-1   4  -1   and  -1   8  -1
 0  -1   0        -1  -1  -1
Image Enhancement Techniques - Laplacian / Laplacian of Gaussian Filter [1]

Convolution kernels approximating a second derivative measurement on the image are very sensitive to noise. The image is therefore often Gaussian smoothed before applying the Laplacian filter, which reduces the high frequency noise components prior to the differentiation step.

Laplacian of Gaussian (LoG):
- since the convolution operation is associative, we can convolve the Gaussian smoothing filter with the Laplacian filter first, then convolve this hybrid filter with the image to achieve the required result
- advantage: requires far fewer arithmetic operations
- the LoG kernel can be precalculated in advance, so only one convolution needs to be performed on the image
Image Enhancement Techniques - Laplacian / Laplacian of Gaussian Filter [1]

The 2-D LoG function centred on zero and with Gaussian standard deviation sigma has the form

$LoG(x, y) = -\frac{1}{\pi\sigma^4} \left[ 1 - \frac{x^2 + y^2}{2\sigma^2} \right] e^{-(x^2 + y^2) / (2\sigma^2)}$
Image Enhancement Techniques - Laplacian / Laplacian of Gaussian Filter [1]

The response of the LoG to a step edge is a zero crossing: the output is negative on one side of the edge, positive on the other, and zero at the edge itself.
Image Enhancement Techniques - Laplacian / Laplacian of Gaussian Filter [1]

Example: LoG filter with Gaussian sigma = 1.0 and a 7x7 kernel
- the result image contains negative and non-integer values
- the output image is normalized to the range [0, 255] for display purposes
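The precalculated LoG kernel mentioned above can be obtained by sampling the LoG formula directly. A minimal sketch (illustrative, matching the sigma = 1.0, 7x7 parameters of the example):

```python
import numpy as np

def log_kernel(size, sigma):
    """Sample the 2-D Laplacian of Gaussian
    LoG(x, y) = -1/(pi*sigma^4) * (1 - (x^2+y^2)/(2*sigma^2))
                * exp(-(x^2+y^2)/(2*sigma^2))
    on a size-by-size grid centred on zero."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    return (-1.0 / (np.pi * sigma ** 4)) \
        * (1.0 - r2 / (2.0 * sigma ** 2)) \
        * np.exp(-r2 / (2.0 * sigma ** 2))

k = log_kernel(7, 1.0)
```

With this sign convention the kernel is negative at the centre and positive in the surround (the classic "Mexican hat" shape, inverted); convolving it with an image in one pass is equivalent to Gaussian smoothing followed by the Laplacian.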
Image Enhancement Techniques - Unsharp Filter [1]

The unsharp filter is a simple sharpening operator for edge enhancement:
- subtracts an unsharp, or smoothed, version of an image from the original image
- produces an edge image g(x, y) from an original image f(x, y):

$g(x, y) = f(x, y) - f_{smooth}(x, y)$

where $f_{smooth}(x, y)$ is a smoothed version of f(x, y).
Image Enhancement Techniques - Unsharp Filter [1]

The operator can be understood by examining its frequency response characteristics: smoothing suppresses high spatial frequencies, so subtracting the smoothed image from the original leaves mostly the high-frequency (edge) components. The edge image computed by the unsharp filter is then added back to the original image.
Image Enhancement Techniques - Unsharp Filter [1]

The complete unsharp sharpening operator is

$f_{sharp}(x, y) = f(x, y) + k \, g(x, y)$

where k is a scaling constant. k typically varies between 0.2 and 0.7, with larger values providing increasing amounts of sharpening.
Image Enhancement Techniques - Unsharp Filter [1]

Example of the unsharp filter: original image, edge image, sharpened image
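The complete operator can be sketched as follows (a minimal illustration, using a 3x3 box blur as the smoothing function; the slides' examples use Gaussian smoothing, but any smoothing function fits the definition):

```python
import numpy as np

def unsharp(image, k=0.5):
    """Unsharp masking: subtract a smoothed (here 3x3 box-blurred)
    copy from the original to get the edge image g = f - f_smooth,
    then add k times that edge image back (k between 0.2 and 0.7
    per the slides)."""
    padded = np.pad(image, 1, mode='edge')
    smooth = np.zeros_like(image, dtype=float)
    for dy in range(3):
        for dx in range(3):
            smooth += padded[dy:dy + image.shape[0],
                             dx:dx + image.shape[1]]
    smooth /= 9.0
    edge = image - smooth
    return image + k * edge

img = np.zeros((5, 5))
img[:, 2:] = 100.0          # vertical step edge
```

On a step edge the result shows the characteristic overshoot: flat regions pass through unchanged, while pixels just on either side of the edge are pushed beyond the original values, which is what makes the edge look sharper.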
Image Enhancement Techniques - Contrast Stretching [1]

Contrast stretching (often called normalization) improves the contrast in an image by `stretching' the range of intensity values it contains to span a desired range of values:

$P_{out} = (P_{in} - c) \left( \frac{b - a}{d - c} \right) + a$

where
- a and b are the lower and upper limits of the image type, e.g. (0, 255) for an 8-bit grayscale image
- c and d are the lowest and highest pixel values in the original image
- P_in is a pixel value of the original image and P_out is the corresponding output pixel value
Image Enhancement Techniques - Contrast Stretching [1]

Example:
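The stretching formula translates directly into code. A minimal sketch (illustrative, not from the slides):

```python
import numpy as np

def contrast_stretch(image, a=0, b=255):
    """P_out = (P_in - c) * (b - a) / (d - c) + a, where c and d
    are the lowest and highest values actually present in the
    image, and (a, b) are the limits of the image type."""
    c, d = image.min(), image.max()
    return (image - c) * (b - a) / float(d - c) + a

img = np.array([[60.0, 100.0],
                [120.0, 140.0]])
```

After stretching, the darkest input pixel maps to a and the brightest to b, with everything in between scaled linearly.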
Image Enhancement Techniques - Histogram Equalization [1]

Histogram equalization provides a sophisticated method for modifying the dynamic range and contrast of an image by altering the image such that its intensity histogram has a desired shape:
- employs a monotonic, non-linear mapping which re-assigns the intensity values of pixels in the input image such that the output image contains a uniform distribution of intensities
- used in image comparison processes
- usually introduced using continuous, rather than discrete, process functions:
  - suppose that the images of interest contain continuous intensity levels (in the interval [0, 1])
  - suppose the transformation function f which maps an input image A(x, y) onto an output image B(x, y) is continuous within this interval
Image Enhancement Techniques - Histogram Equalization [1]

It is assumed that the transfer law, which may also be written in terms of intensity density levels as

$D_B = f(D_A)$

is single-valued and monotonically increasing, so that its inverse

$D_A = f^{-1}(D_B)$

exists.
Image Enhancement Techniques - Histogram Equalization [1]

All pixels in the input image with densities in the region $D_A$ to $D_A + dD_A$ will have their pixel values replaced by output density values in the region $D_B$ to $D_B + dD_B$. The surface areas $h_A(D_A)\,dD_A$ and $h_B(D_B)\,dD_B$ will therefore be equal, yielding

$h_B(D_B) = \frac{h_A(D_A)}{dD_B / dD_A}$

where $D_A = f^{-1}(D_B)$.
Image Enhancement Techniques - Histogram Equalization [1]

This result can be written in the language of probability theory if the histogram h is regarded as a continuous probability density function p describing the distribution of the (assumed random) intensity levels:

$p_B(D_B) = \frac{p_A(D_A)}{dD_B / dD_A}$

For histogram equalization, the output probability densities should all be an equal fraction of the maximum number of intensity levels $D_M$ in the input image (where the minimum level considered is 0). The transfer function (or point operator) necessary to achieve this result is simply

$f(D_A) = D_M \int_0^{D_A} p_A(u)\, du$
Image Enhancement Techniques - Histogram Equalization [1]

Therefore $f(D_A) = D_M F_A(D_A)$, where

$F_A(D_A) = \int_0^{D_A} p_A(u)\, du$

is simply the cumulative probability distribution (i.e. cumulative histogram) of the original image. Thus, an image which is transformed using its cumulative histogram yields an output histogram which is flat!

A digital implementation of histogram equalization is usually performed by defining a transfer function of the form

$f(D_A) = \frac{D_M}{N}\, n_k$

where N is the number of image pixels and $n_k$ is the number of pixels at intensity level k or less.
Image Enhancement Techniques - Histogram Equalization [1]

Example:
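The digital transfer function above can be sketched directly with a cumulative histogram. A minimal illustration (not code from the slides), shown on a tiny image with four intensity levels:

```python
import numpy as np

def equalize(image, levels=256):
    """Digital histogram equalisation: map each level k to
    (levels - 1) * n_k / N, where n_k is the number of pixels at
    intensity k or less (the cumulative histogram) and N is the
    total number of pixels."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = np.cumsum(hist)                        # n_k for each level
    transfer = (levels - 1) * cdf / float(image.size)
    return transfer[image].astype(np.uint8)

img = np.array([[0, 0, 1],
                [1, 1, 2],
                [2, 3, 3]], dtype=np.uint8)
```

With 9 pixels and cumulative counts n = (2, 5, 7, 9) for levels 0..3, the levels map to roughly 255 * (2/9, 5/9, 7/9, 9/9), spreading the histogram across the full output range.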
Feature Extraction Techniques - Roberts Cross [1]

The Roberts Cross operator performs a simple, quick to compute, 2-D spatial gradient measurement on an image:
- highlights regions of high spatial frequency, which often correspond to edges
- consists of a pair of 2x2 convolution kernels:

Gx = +1  0     Gy =  0  +1
      0 -1          -1   0
Feature Extraction Techniques - Roberts Cross [1]

The kernels are designed to respond maximally to edges running at 45 degrees to the pixel grid, one kernel for each of the two perpendicular orientations.

The gradient magnitude is given by

$|G| = \sqrt{G_x^2 + G_y^2}$

An approximate magnitude is computed using

$|G| \approx |G_x| + |G_y|$

The angle of orientation of the edge giving rise to the spatial gradient is given by

$\theta = \arctan(G_y / G_x) - 3\pi/4$

Using pseudo-convolution kernels applied directly to the 2x2 neighbourhood of pixel values P1..P4, the approximate magnitude is given by

$|G| \approx |P_1 - P_4| + |P_2 - P_3|$
Feature Extraction Techniques - Roberts Cross [1]

Example:
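Because the kernels are only 2x2, the Roberts Cross reduces to two diagonal pixel differences. A minimal sketch (illustrative, not from the slides), using the approximate magnitude |Gx| + |Gy|:

```python
import numpy as np

def roberts(image):
    """Apply the two 2x2 Roberts Cross kernels
    Gx = [[+1, 0], [0, -1]] and Gy = [[0, +1], [-1, 0]]
    and return the approximate magnitude |Gx| + |Gy|."""
    gx = image[:-1, :-1] - image[1:, 1:]    # main-diagonal difference
    gy = image[:-1, 1:] - image[1:, :-1]    # anti-diagonal difference
    return np.abs(gx) + np.abs(gy)

img = np.zeros((4, 4))
img[:, 2:] = 10.0            # vertical step edge
```

The response is strong only where a kernel straddles the intensity step and zero in the flat regions on either side.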
Feature Extraction Techniques - Sobel Edge Detector [1]

The Sobel operator performs a 2-D spatial gradient measurement on an image:
- emphasizes regions of high spatial frequency that correspond to edges
- approximates the absolute gradient magnitude at each point in an input grayscale image
- consists of a pair of 3x3 convolution kernels:

Gx = -1  0  +1     Gy = +1  +2  +1
     -2  0  +2           0   0   0
     -1  0  +1          -1  -2  -1
Feature Extraction Techniques - Sobel Edge Detector [1]

The kernels are designed to respond maximally to edges running vertically and horizontally relative to the pixel grid, one kernel for each of the two perpendicular orientations. They can be applied separately to the input image to produce separate measurements of the gradient component in each orientation (call these Gx and Gy).

The gradient magnitude, its approximation, and the angle of orientation of the edge are given by

$|G| = \sqrt{G_x^2 + G_y^2}, \qquad |G| \approx |G_x| + |G_y|, \qquad \theta = \arctan(G_y / G_x)$

Using pseudo-convolution kernels applied directly to the 3x3 neighbourhood of pixel values P1..P9, the approximate magnitude is

$|G| \approx |(P_1 + 2P_2 + P_3) - (P_7 + 2P_8 + P_9)| + |(P_3 + 2P_6 + P_9) - (P_1 + 2P_4 + P_7)|$
Feature Extraction Techniques - Sobel Edge Detector [1]

Example:
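Applying the two kernels separately and combining them as above can be sketched as follows (a minimal illustration, not code from the slides):

```python
import numpy as np

def sobel(image):
    """Convolve the image with the two 3x3 Sobel kernels and
    return the approximate magnitude |Gx| + |Gy|."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)
    ky = kx.T                         # transpose gives the Gy kernel
    h, w = image.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros_like(gx)
    for y in range(h - 2):
        for x in range(w - 2):
            patch = image[y:y + 3, x:x + 3]
            gx[y, x] = np.sum(patch * kx)
            gy[y, x] = np.sum(patch * ky)
    return np.abs(gx) + np.abs(gy)

img = np.zeros((5, 5))
img[:, 3:] = 10.0            # vertical step edge
```

On a vertical step edge only Gx responds (Gy is zero because the rows are identical), and the response peaks where the kernel straddles the step.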
Feature Extraction Techniques - Canny Edge Detector [1]

The Canny operator was designed to be an optimal edge detector (according to particular criteria). It works in a multi-stage process:
- the image is first smoothed by Gaussian convolution
- a simple 2-D first derivative operator (somewhat like the Roberts Cross) is applied to the smoothed image to highlight regions of the image with high first spatial derivatives
- edges give rise to ridges in the gradient magnitude image
- the algorithm then tracks along the top of these ridges and sets to zero all pixels that are not on the ridge top, giving a thin line in the output, a process known as non-maximal suppression
Feature Extraction Techniques - Canny Edge Detector [1]

The tracking process exhibits hysteresis, controlled by two thresholds T1 and T2, with T1 > T2:
- tracking can only begin at a point on a ridge higher than T1
- tracking then continues in both directions out from that point until the height of the ridge falls below T2
- this hysteresis ensures that noisy edges are not broken up into multiple edge fragments
Feature Extraction Techniques - Canny Edge Detector [1]

Example:
- sigma = 1.0, upper threshold = 255, lower threshold = 1
- sigma = 1.0, upper threshold = 255, lower threshold = 200
- sigma = 1.0, upper threshold = 128, lower threshold = 1
- sigma = 2.0, upper threshold = 128, lower threshold = 1
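The hysteresis-tracking rule from the previous slide can be illustrated in isolation on a 1-D ridge profile (a simplified sketch of just this stage, not the full multi-stage Canny algorithm):

```python
import numpy as np

def hysteresis(ridge, t1, t2):
    """Canny-style hysteresis thresholding on a 1-D ridge profile
    (t1 > t2): tracking starts only at points above t1, then
    extends in both directions while the ridge stays above t2."""
    keep = np.zeros(len(ridge), dtype=bool)
    for seed in np.flatnonzero(ridge > t1):
        keep[seed] = True
        for step in (-1, 1):                 # grow left, then right
            i = seed + step
            while 0 <= i < len(ridge) and ridge[i] > t2:
                keep[i] = True
                i += step
    return keep

ridge = np.array([1, 5, 30, 80, 40, 20, 3, 60, 10])
```

With t1 = 50 and t2 = 15, the value 30 survives even though it is below t1, because it is connected to the strong seed 80; the isolated 60 is kept as a seed but does not extend, since its neighbours fall below t2.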
Feature Extraction Techniques - Binarization [1]

Binarization is used to extract features from a feature magnitude map and to label features:
- converts an intensity map to binary values, such as 0 or 255
- the color of the object (usually white) is referred to as the foreground color; the rest (usually black) is referred to as the background color
- produced by thresholding:
  - global threshold: apply one threshold to the whole image
  - local/adaptive threshold: apply thresholds locally
Feature Extraction Techniques - Binarization [1]

Example: original image, edge map computed by Sobel, binarization with threshold value 150
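Global thresholding, as used in the example, is a one-liner. A minimal sketch (illustrative, not from the slides):

```python
import numpy as np

def binarize(image, threshold):
    """Global thresholding: pixels above the threshold become
    foreground (255, usually white), the rest become
    background (0, usually black)."""
    return np.where(image > threshold, 255, 0).astype(np.uint8)

img = np.array([[10, 200],
                [149, 151]], dtype=np.uint8)
```

A local/adaptive variant would compute a separate threshold per neighbourhood (e.g. from the local mean) instead of one global value.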
Pattern Recognition Techniques - Eigenspace Representation of Images [2]

Image representation in N^2 dimensions:
- an N x N image can be "represented" as a point in an N^2-dimensional image space
- each dimension is associated with one of the pixels in the image, and the possible values in each dimension are the possible gray levels of each pixel
- e.g. for a 512 x 512 image where each pixel is an integer in the range 0, ..., 255 (i.e., a pixel is stored in one byte), image space is a 262,144-dimensional space and each dimension has 256 possible values
Pattern Recognition Techniques - Eigenspace Representation of Images [2]

Example: the case of M training face images.
- suppose we represent our M training images as M points in image space
- one way of recognizing the person in a new test image would be to find its nearest-neighbor training image in image space
- however, this approach would be very slow, since the size of image space is so large, and it does not exploit the fact that, since all of our images are of faces, they will likely be clustered relatively near one another in image space
- so let's represent each image in a lower-dimensional feature space, called face space or eigenspace
Pattern Recognition Techniques - Eigenspace Representation of Images [2]

Suppose we have M' images E_1, E_2, ..., E_M', called eigenfaces or eigenvectors. These images define a basis set, so that each face image can be described in terms of how similar it is to each of these basis images, i.e. we can represent an arbitrary image I as a weighted (linear) combination of these eigenvectors as follows:

1. Compute the average image A from all of the training images I_1, I_2, ..., I_M:

$A = \frac{1}{M} \sum_{i=1}^{M} I_i$

2. For k = 1, ..., M', compute a real-valued weight w_k indicating the similarity between the input image I and the kth eigenvector E_k:

$w_k = E_k^T \cdot (I - A)$

where I is a given image represented as a column vector of length N^2, E_k is the kth eigenface image and is a column vector of length N^2, A is a column vector of length N^2, $\cdot$ is the dot product operation, and $-$ is pixel-by-pixel subtraction. Thus w_k is a real-valued scalar.
Pattern Recognition Techniques - Eigenspace Representation of Images [2]

3. W = [w_1, w_2, ..., w_M']^T is a column vector of weights that indicates the contribution of each eigenface image in representing image I:
- instead of representing image I in image space, we represent it as a point W in the M'-dimensional weight space that we call face space or eigenspace
- each image is projected from a point in the high-dimensional image space down to a point in the much lower-dimensional eigenspace
- in terms of compression, each image is represented by M' real numbers, which means that for a typical value of M' = 10 and 32 bits per weight, we need only 320 bits per image to encode it in face space. (Of course, we must also store the M' eigenface images, which are each N^2 pixels, but this cost is amortized over all of the training images, so it can be considered a small additional cost.)
Pattern Recognition Techniques - Eigenspace Representation of Images [2]

Notice that image I can be approximately reconstructed from W as follows:

$I \approx A + \sum_{i=1}^{M'} w_i E_i$

This reconstruction is exact if M' = min(M, N^2). Hence, representing an image in eigenspace will generally not be exact, in that the image won't be perfectly reconstructible, but it will be a pretty good approximation that is sufficient for differentiating between faces.

Question: how do we select a value for M' and then determine the M' "best" eigenvector images (i.e., eigenfaces)?
Answer: use the statistical technique called Principal Components Analysis (also called the Karhunen-Loeve transform in communications theory). Intuitively, this technique selects the M' images that maximize the information content in the compressed (i.e., eigenspace) representation.
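The projection (step 2) and reconstruction above can be sketched on a toy example. The 4-pixel "images" and the two orthonormal basis vectors standing in for the eigenfaces are made-up illustrative values, not computed from real training data:

```python
import numpy as np

# Toy 4-pixel images flattened to vectors; the rows of E stand in
# for the eigenfaces E_1, E_2 (chosen orthonormal by hand).
A = np.array([10.0, 10.0, 10.0, 10.0])          # average image
E = np.array([[0.5, 0.5, 0.5, 0.5],
              [0.5, 0.5, -0.5, -0.5]])
I = np.array([14.0, 14.0, 6.0, 6.0])            # input image

W = E @ (I - A)          # w_k = E_k^T (I - A), the eigenspace point
I_rec = A + W @ E        # I ~ A + sum_i w_i E_i, the reconstruction
```

Here I - A happens to lie entirely in the span of the two basis images, so the reconstruction is exact; for a real image and M' < min(M, N^2), I_rec would only approximate I.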
Pattern Recognition Techniques - PCA (Principal Component Analysis) [2]

The best M' eigenface images can be computed as follows:

1. For each training image I_i, normalize it by subtracting the mean (i.e., the "average image"):

$Y_i = I_i - A$

2. Compute the N^2 x N^2 covariance matrix:

$C = \frac{1}{M} \sum_{i=1}^{M} Y_i Y_i^T$

3. Find the eigenvectors of C that are associated with the M' largest eigenvalues. Call the eigenvectors E_1, E_2, ..., E_M'. These are the eigenface images used by the algorithm given above.
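The three steps above can be sketched directly with NumPy's symmetric eigendecomposition. The training data here is random, purely to exercise the computation (real use would flatten actual face images into the rows of I_train):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N2 = 6, 4                       # 6 training "images", 4 pixels each
I_train = rng.random((M, N2)) * 255

A = I_train.mean(axis=0)           # step 1: average image
Y = I_train - A                    #         Y_i = I_i - A
C = (Y.T @ Y) / M                  # step 2: N^2 x N^2 covariance matrix

vals, vecs = np.linalg.eigh(C)     # step 3: eigendecomposition (C is
order = np.argsort(vals)[::-1]     #         symmetric), sorted by
M_prime = 2                        #         decreasing eigenvalue
E = vecs[:, order[:M_prime]].T     # rows are the eigenfaces E_1..E_M'
```

For real face images N^2 is huge, so in practice the eigenvectors are usually obtained from the much smaller M x M matrix Y Y^T instead (the Turk-Pentland trick), but the N^2 x N^2 form above is the one the slides state.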
Pattern Recognition Techniques - Face Recognition Algorithm [2]

Example: face recognition

1. Given a training set of face images, compute the M' largest eigenvectors E_1, E_2, ..., E_M'. M' = 10 or 20 is a typical value. Notice that this step is done once, offline.

2. For each different person in the training set, compute the point associated with that person in eigenspace, that is, compute W = [w_1, ..., w_M']. Note that this step is also done once, offline.

3. Given a test image I_test, project it to the M'-dimensional eigenspace by computing the point W_test.

4. Find the closest training face to the given test face:

$d = \min_k \| W_{test} - W_k \|$

where W_k is the point in eigenspace associated with the kth person in the training set, and $\| X \|$ denotes the Euclidean norm, defined as $(x_1^2 + x_2^2 + \dots + x_n^2)^{1/2}$ for $X = [x_1, x_2, \dots, x_n]$.
Pattern Recognition Techniques - Face Recognition Algorithm [2]

5. Find the distance of the test image from eigenspace, i.e. compute the projection distance (so that we can estimate the likelihood that the image contains a face):

$dffs = \| Y - Y_f \|$

where $Y = I_{test} - A$ and $Y_f = \sum_{i=1}^{M'} w_{test,i} E_i$ is the projection of Y onto eigenspace.

6. If dffs < Threshold1    ; test image is "close enough" to the eigenspace
                           ; associated with all of the training faces to
                           ; believe that this test image is likely to be some
                           ; face (and not a house or a tree or something
                           ; other than a face)
   then if d < Threshold2
        then classify I_test as containing the face of person k, where k is the closest face in eigenspace to W_test, the projection of I_test to eigenspace
        else classify I_test as an unknown person
   else classify I_test as not containing a face
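The decision logic of steps 4-6 can be sketched as a small function. The eigenspace points and thresholds below are made-up illustrative values (in a real system W_test, dffs and train_W would come from the projection steps above):

```python
import numpy as np

def classify(W_test, dffs, train_W, thr1, thr2):
    """Steps 4-6 of the slides: if the test image is close enough
    to face space (dffs < thr1), find the nearest training person
    in eigenspace; accept only if that Euclidean distance is
    below thr2, otherwise report an unknown person."""
    if dffs >= thr1:
        return "not a face"
    dists = np.linalg.norm(train_W - W_test, axis=1)
    k = int(np.argmin(dists))
    return f"person {k}" if dists[k] < thr2 else "unknown person"

# Two training people as points in a 2-D eigenspace (illustrative).
train_W = np.array([[1.0, 0.0],
                    [0.0, 5.0]])
```

A test point near person 0 with a small distance-from-face-space is accepted; a large dffs rejects the image as a non-face before any nearest-neighbour comparison is made.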