Lab # 2 - ACS I Part I - DATA COMPRESSION in IMAGE PROCESSING using SVD

Goals

The goal of the first part of this lab is to demonstrate how the SVD can be used to remove redundancies in data; in this example we will be compressing image data. We will see that for a matrix of rank r, the SVD can give a rank p < r approximation to the matrix and that this approximation is the one that minimizes the Frobenius norm of the error.

Introduction

Any image from a digital camera, scanner or in a computer is a digital image. The real-world color image is digitized by converting the image to numerical data. A pixel is the smallest element of the digital image. For example, a 3 megapixel camera has a grid of 2048 × 1536 = 3,145,728 pixels. Since the size of a digitized image is dimensioned in pixels, say m rows and n columns, it is easy for us to think of the image as an m × n matrix. However, each pixel of a color image has RGB values (red, green, blue), which are represented by three numbers. The composite of the three RGB values creates the final color for the single pixel. So we can think of each entry in the m × n matrix as having three numerical values stored in that location, i.e., an m × n × 3 array.

Now suppose we have a digital image taken with a 3 megapixel camera and each color pixel is determined by a 24-bit number (8 bits each for the intensities of red, green and blue). Then the information we have is roughly 3 × 10^6 × 24 bits. However, when we print the picture suppose we only use 8-bit color, giving 2^8 = 256 colors. We are still using 3 million pixels, but the information used to describe the image has been reduced to 3 × 10^6 × 8 bits, i.e., to one-third of the original. This is an example of image compression.

In the figure below we give a grayscale image of the moon's surface (the figure on the left) along with two different compressed images. Clearly the center image is not acceptable, but the compressed image on the right retains most of the critical information. We want to investigate using the SVD for doing data compression in image processing.

Figure 1: The image on the left is the original image while the other two images represent the results of data compression.

Understanding the SVD

Recall from the notes that the SVD is related to the familiar result that any n × n real symmetric matrix can be made orthogonally similar to a diagonal matrix, which gives us the decomposition A = QΛQ^T, where Q is orthogonal and Λ is a diagonal matrix containing the eigenvalues of A.
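As a quick illustration of this factorization (not part of the lab exercises), the following MATLAB lines check it for a small symmetric matrix of our own choosing:

    % Illustration only: spectral decomposition of an arbitrary symmetric matrix
    A = [4 1 2; 1 3 0; 2 0 5];          % any real symmetric matrix will do
    [Q, Lambda] = eig(A);               % Q orthogonal, Lambda diagonal (eigenvalues)
    norm(A - Q*Lambda*Q', 'fro')        % should be near machine precision
    norm(Q'*Q - eye(3), 'fro')          % confirms Q is orthogonal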

The SVD provides an analogous result for a general nonsymmetric, rectangular m × n matrix A. For the general result two different orthogonal matrices are needed for the decomposition. We have that A can be written as A = UΣV^T, where Σ is an m × n diagonal matrix and U, V are orthogonal.

Recall from the notes that the SVD of an m × n matrix A can also be viewed as writing A as a sum of rank-one matrices. Recall that for any nonzero vectors x, y the outer product x y^T is a rank-one matrix since each row of the resulting matrix is a multiple of the other rows. The product of two matrices can be written in terms of outer products. In particular, the product B C^T of two matrices can be written as the sum of the outer products Σ_i b_i c_i^T, where b_i, c_i denote the ith columns of B and C respectively. Hence if we let u_i denote the ith column of U and v_i the ith column of V, then writing the decomposition A = UΣV^T as a sum of outer products we get

    A = σ_1 u_1 v_1^T + σ_2 u_2 v_2^T + ... + σ_r u_r v_r^T = Σ_{i=1}^r σ_i u_i v_i^T,    (1)

where r denotes the rank of A and σ_i is the ith singular value of A. Thus our expression for A is a sum of rank-one matrices. If we truncate this series after p terms, then we have an approximation to A which has rank p. What is amazing is that it can be shown that this rank p matrix is the best rank p approximation to A measured in the Frobenius norm. Recall that the Frobenius norm is just the matrix analogue of the standard Euclidean length, i.e.,

    ||A||_F = ( Σ_{i,j} A_{ij}^2 )^{1/2},

where A_{ij} denotes the (i, j) entry of the matrix A. However, it is NOT an induced matrix norm.

We can get a result for the error that is made by approximating a rank r matrix by a rank p approximation. Clearly the error E_p is given by

    E_p = A - Σ_{i=1}^p σ_i u_i v_i^T = Σ_{i=p+1}^r σ_i u_i v_i^T.    (2)

Due to the orthogonality of U and V we can write

    ||E_p||_F^2 = Σ_{i=p+1}^r σ_i^2,

and so a relative error measure can be computed from

    [ Σ_{i=p+1}^r σ_i^2 / Σ_{i=1}^r σ_i^2 ]^{1/2}.
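As a sanity check on (2), a short MATLAB sketch along the following lines (the test matrix and the rank p are arbitrary choices of ours) compares the directly computed error with the one obtained from the trailing singular values:

    % Verify ||A - A_p||_F equals the square root of the sum of the trailing singular values squared
    A = rand(60, 40);                          % arbitrary test matrix
    p = 5;                                     % rank of the truncated approximation
    [U, S, V] = svd(A);
    s = diag(S);
    Ap = U(:,1:p) * S(1:p,1:p) * V(:,1:p)';    % best rank-p approximation
    err_direct  = norm(A - Ap, 'fro');
    err_singval = sqrt(sum(s(p+1:end).^2));
    rel_err     = err_singval / sqrt(sum(s.^2));
    fprintf('%g  %g  %g\n', err_direct, err_singval, rel_err)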

Data compression using the SVD

How can the SVD help us with our data compression problem? Remember that we can view our digitized image as an array of mn values and we want to find an approximation that captures the most significant features of the data. Recall that the rank of an m × n matrix A tells us the number of linearly independent columns of A; this is essentially a measure of its non-redundancy. If the rank of A is small compared with n, then there is a lot of redundancy in the information. We would expect that an image with large-scale features would possess redundancy in the columns or rows of pixels, and so we would expect to be able to represent it with less information.

For example, suppose we have the simple case where the rank of A is one; i.e., every column of A is a multiple of a single basis vector, say u. Then if the ith column of A is denoted a_i and a_i = k_i u, then A = u k^T, i.e., the outer product of the two vectors u and k, where k = (k_1, k_2, ..., k_n)^T. If we can represent A by a rank-one matrix then all we need to specify are the vectors u and k; that is, m + n entries as opposed to mn entries for the full matrix A. Of all the rank-one matrices we want to choose the one which best approximates A in some sense; if we choose the Frobenius norm to measure the error, then the SVD of A will give us the desired result.

Of course in most applications the original matrix A is of higher rank than one, so a rank-one approximation would be very crude. In general we seek a rank p approximation to A such that

    || A - Σ_{i=1}^p σ_i u_i v_i^T ||_F

is minimized. It is important to remember that the singular values given in Σ are ordered so that they are nonincreasing. Consequently, if the singular values decrease rapidly, then we would expect that fewer terms in the expansion of A in terms of rank-one matrices would be needed.

Computational Algorithms

In this lab, we will treat the software for the SVD as a black box and assume that the results are accurate. You can either use the LAPACK SVD routine dgesvd or the MATLAB commands svd and svds. The LAPACK algorithm can be downloaded from netlib (www.netlib.org). The interested student is referred to Golub and Van Loan's book for a description of the algorithm used to obtain the SVD. Test image libraries for use in image compression are maintained by several institutions. Here we use the ones from the University of Southern California.

In addition to the SVD algorithm, we will need routines to generate the image chart (i.e., our matrix) from an image and to generate an image from our approximation. There are various ways to do this. One of the simplest approaches is to use the MATLAB commands

    imread  - reads an image from a graphics file
    imwrite - writes an image to a graphics file
    imshow  - displays an image

Specifics of the image processing commands can be found in Matlab's technical documentation, such as http://www.mathworks.com/help/images/image_import_and_export.html, or via the online help command.

When you use the imread command the output is a matrix in unsigned integer format, i.e., uint8. You should convert this to double precision (in Matlab, double) before writing to a file or using commands such as svds. However, the imshow command wants the uint8 format. You should learn the difference between the Matlab commands svd and svds.
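A minimal sketch of how these commands fit together is given below; the file name is a placeholder for whichever image you download, and the rank p is an arbitrary choice:

    % Read an image, convert to double, and compute a partial SVD
    Auint = imread('image.tiff');          % placeholder: replace with your downloaded file
    A = double(Auint);                      % svds wants double precision
    p = 32;                                 % rank of the desired approximation
    [U, S, V] = svds(A, p);                 % p largest singular triplets
    Ap = U * S * V';                        % rank-p approximation
    figure; imshow(uint8(Ap));              % imshow wants uint8
    figure; semilogy(diag(S), 'o');         % decay of the leading singular values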

Exercises

1. The purpose of this problem is to make sure you are using the SVD algorithm (either from Matlab or netlib) correctly. First, make sure that you get the SVD of

       A = [  1   2   3
              4   5   6
              7   8   9
             10  11  12 ]

   as A = UΣV^T with (to four digits; the signs of corresponding columns of U and V may be flipped, depending on the SVD routine used)

       U = [ 0.1409   0.8247   0.5418   0.0803
             0.3439   0.4263  -0.6626  -0.5109
             0.5470   0.0278  -0.3003   0.7809
             0.7501  -0.3706   0.4211  -0.3503 ]

       Σ = [ 25.46   0       0
              0      1.291   0
              0      0       0
              0      0       0 ]

       V = [ 0.5045  -0.7608   0.4082
             0.5745  -0.0571  -0.8165
             0.6445   0.6465   0.4082 ]

   Note that Matlab gives you V although the decomposition uses the transpose of V. Next compute rank 1 and rank 2 approximations to A and determine the error in the Frobenius norm by (i) calculating the difference of A and its approximation and then computing its Frobenius norm and (ii) using the singular values.

2. In this problem we want to download an image from the USC database at http://sipi.usc.edu/database/database.php?volume=misc. Choose image 5.2.10, "stream and bridge". Create an integer matrix chart representing this image. Your image chart should be a 512 × 512 matrix with entries between 0 and 255. View your image (for example with imshow) to make sure you have read it in correctly.

   a. Write code to create the matrix chart for an image and to do a partial SVD decomposition; you should read in the rank of the approximation you are using as well as the image file name. Use your code to determine and plot the first 150 singular values for the SVD of this image. What do the singular values imply about the number of terms we need to approximate the image?

   b. Modify your code to determine approximations to the image using rank 8, 16, 32, 64 and 128 approximations. Display your results as images along with the original image and discuss the relative quality of the images.

   c. Now suppose we want to determine a reduced-rank approximation of our image so that the relative error (measured in the Frobenius norm) is no more than 0.5%. Determine the rank of such an approximation and display your approximation. Compare the storage required for this data compression with the full image storage.

3. In this problem we will use the color image mandrill from the same USC site. Now each pixel in the image is represented by three RGB values and so the output of imread is a three-dimensional array.

   a. Plot the first 150 singular values and discuss the implications.

   b. Obtain rank 8, 16, 32, 64 and 128 approximations to your image. Display and compare your results.

   c. What is the lowest rank approximation to your image that you feel is an adequate representation in the "eyeball norm"? How does this compare with your interpretation of your results in (a)?

   d. Repeat (a)-(c) with your favorite image from the USC website or one of your own.
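For exercise 3 the output of imread is an m × n × 3 array rather than a matrix. One possible way to handle this (the lab leaves the choice to you) is to compress each color plane separately, as in the following sketch; the file name follows the one used in Part II:

    % Sketch: rank-p approximation of an RGB image, one color plane at a time
    rgb = double(imread('mandrill.tiff'));      % m x n x 3 array (file name assumed)
    p = 32;
    approx = zeros(size(rgb));
    for c = 1:3
        [U, S, V] = svds(rgb(:,:,c), p);        % partial SVD of one color plane
        approx(:,:,c) = U * S * V';
    end
    figure; imshow(uint8(approx));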

Part II - DATA COMPRESSION in IMAGE PROCESSING USING CLUSTERING

Goals

To investigate a clustering algorithm and apply it to image compression. This will expose you to an approach for data compression other than the SVD approach used in the first part of this lab.

Introduction

Although we haven't studied clustering methods, you have probably heard talks where Centroidal Voronoi Tessellations (CVTs) or K-Means are used. K-Means is a well-known algorithm for clustering objects. When we have a discrete set of data we can also view CVTs as a clustering algorithm, which is equivalent to K-Means in this case. A Voronoi tessellation {V_i}_{i=1}^K of a region associated with a set of points (or generators) {z_i}_{i=1}^K is the decomposition of the region into a set of subregions with the property that all points in V_i are closer to z_i than to any other generator. A CVT is a Voronoi tessellation where the generator z_i is also the center of mass of V_i with respect to the given density function.

Lloyd's Method is an iterative method for constructing CVTs; however, as described below, it is computationally costly. The method is outlined in the following steps:

Lloyd's Method

Given a set of initial points {z_i}_{i=1}^K, a density function ρ, and a metric or distance function:

1. Construct the Voronoi tessellation {V_i}_{i=1}^K associated with the points {z_i}_{i=1}^K;
2. Determine the center of mass of each Voronoi region V_i; set these points to be the new generators;
3. If convergence has not been achieved, return to (1).

Determining the centers of mass is easy; however, the construction of the Voronoi tessellation is quite costly. An alternative to Lloyd's Method is to take a probabilistic viewpoint. Instead of actually constructing the Voronoi tessellation, we sample the region with a random point w and then determine which generator z_i is closest to that point. After sampling with many points, instead of a Voronoi region we have a set of points belonging to each Voronoi region. We then take as the new generators the average of the points in each cluster, or alternatively a weighted average of the old generators and the corresponding cluster averages. If we sample with enough points, then this should be a reasonable approximation to the Voronoi regions. Note that this method can easily be parallelized.
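Before the precise statement of the algorithm, the following MATLAB fragment sketches the brute-force assignment step just described; the variable names and the test box are our own choices, and the standard Euclidean distance is assumed:

    % Brute-force assignment: find the generator closest to each sample point
    K = 100; N = 1000; d = 2;
    z = 2*rand(K, d);                                  % generators in the box (0,2) x (0,2)
    w = 2*rand(N, d);                                  % random sample points
    nearest = zeros(N, 1);
    for i = 1:N
        d2 = sum((repmat(w(i,:), K, 1) - z).^2, 2);    % squared distances to all K generators
        [~, nearest(i)] = min(d2);                     % index of the closest generator
    end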

Specifically, the probabilistic Lloyd's method is given by the following steps.

Probabilistic Lloyd's Method

Given a set of initial generators {z_i}_{i=1}^K, a density function ρ, a metric or distance function, and a number of sampling points N:

1. For i = 1, ..., N sample with a random point w_i in the domain; determine k such that z_k is closest to w_i; adjust the kth cluster to include the point w_i and increment the counter for the number of points in that cluster;
2. Determine the average of each discrete cluster; these points are the new generators;
3. If convergence has not been achieved, return to (1).

The most time-consuming part of this algorithm is determining the generator which is closest to the random point w_i. You can implement a brute-force approach to do this or a more sophisticated one. For the stopping criterion to determine whether convergence has been reached, there are various choices. Since at convergence the generators no longer move, we can simply check

    (1/K) Σ_{i=1}^K ||z_i^{n+1} - z_i^n|| ≤ tolerance,    (1)

where z_i^n denotes the nth iterate of the generator z_i and ||·|| represents the metric we are using to determine the generator nearest a point. To display a Voronoi region in two dimensions, various software is available. For this lab, it is probably easiest to use the MATLAB command voronoi.
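One possible form of the stopping test (1), written as a small MATLAB helper function (the function name and signature are ours, not required by the lab):

    function converged = cvt_converged(znew, zold, tol)
    % Stopping test (1): average movement of the generators below a tolerance
    % znew, zold : K x d arrays of the current and previous generators
        K = size(znew, 1);
        move = 0;
        for i = 1:K
            move = move + norm(znew(i,:) - zold(i,:));   % Euclidean metric assumed
        end
        converged = (move / K <= tol);
    end

In two dimensions the tessellation of the current generators z can then be displayed with voronoi(z(:,1), z(:,2)).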
Using Clustering for Image Compression

If we have a color image, we know that each pixel is represented by three RGB values, creating a myriad of colors. Our strategy now is to choose just a few colors to represent the picture. An obvious application of this data compression is when you print an image using a color printer with many fewer colors than are available on your computer. After we choose these colors, the image chart for the picture must be modified so that each color is replaced by the new color that is closest to it in color space. We can use K-Means, or equivalently a discrete CVT, to accomplish this image compression.

For example, suppose we have a grayscale image and decide that we want to represent it with 32 shades of gray. Our job is to find which 32 colors best represent the image. We then initiate our probabilistic Lloyd's algorithm with 32 generators which are numbers between 0 and 255; we can simply choose the generators randomly. In Lloyd's algorithm we need to sample the space, so in our application this means sampling the image, i.e., sampling a random pixel. If the image is not too large, then we can simply sample every pixel in the image. We then proceed with the algorithm until convergence is attained. After convergence is achieved we know the best 32 colors to represent our image, so our final step is to replace each color in our original matrix representation of the image with the converged centroid of the cluster it is in. For this application we will just use a constant density function and the standard Euclidean distance for our metric.

Exercises

1. In this problem you will generate a CVT and plot it so you can make sure your algorithm is working correctly before we proceed to the image compression. Write a code to implement the probabilistic Lloyd's Method for a region which is an n-dimensional box and calculate the cluster variance. Test your code by generating a CVT diagram in the region (0, 2) × (0, 2) using 100 generators. Use (i) 10 sampling points per generator, (ii) 100 sampling points per generator, and (iii) 1000 sampling points per generator; for each case display your tessellation using, for example, the MATLAB command voronoi. Use a maximum of 300 iterations and the stopping criterion described above with a tolerance of 0.005. Tabulate the number of iterations required for each case. Plot the cluster variance for each iteration. What conclusions can you draw from your results?

2. Use the color image (mandrill.tiff) from Part I of this lab and modify your algorithm from #1 to obtain approximations to the image using 4, 8, 16, 32, and 64 colors. Display your results along with the original image. As generators you will choose, e.g., 8 random points in the RGB color space, and because there are only 512^2 pixels you can sample the image by visiting each pixel to determine which of the 8 colors it is closest to; use the standard Euclidean length, treating each pixel as a three-dimensional vector. Use 10^{-2} as the tolerance in your stopping criterion.
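A rough sketch of the final recoloring step described above is given below; here centroids stands for your converged K × 3 array of RGB generators (the random values are only a placeholder so the fragment runs on its own):

    % Replace every pixel with the converged centroid of its cluster
    rgb = double(imread('mandrill.tiff'));         % original image (file name as in problem 2)
    centroids = 255*rand(8, 3);                    % placeholder for your converged generators
    [m, n, ~] = size(rgb);
    pixels = reshape(rgb, m*n, 3);                 % one RGB triple per row
    quantized = zeros(m*n, 3);
    for i = 1:m*n
        d2 = sum((repmat(pixels(i,:), size(centroids,1), 1) - centroids).^2, 2);
        [~, k] = min(d2);                          % nearest of the K colors
        quantized(i,:) = centroids(k,:);
    end
    imshow(uint8(reshape(quantized, m, n, 3)));    % display the K-color image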