JPEG Image Compression
an Snyder
December 14, 2009

Abstract

This paper outlines the process of JPEG image compression and the use of linear algebra as part of this process. It introduces the reasons for image compression and demonstrates each step of the process used by JPEG as a sample image is compressed.

In 1986 a committee known as the Joint Photographic Experts Group (JPEG) met to discuss and form a standard for image compression, since most computers simply weren't capable of handling the large files required for storing images. A universal standard was needed as well, since products from different manufacturers of electronics needed to be interoperable [11]. A paper by the chairman of the group, Gregory Wallace, on JPEG's compression process was subsequently published in 1991, outlining their proposed standard of image compression. Adopted in 1994, JPEG's standard file compression format has become so widespread that it is now one of the most common image file types on the web.
Figure 1: Generalized

This raises the question: why is JPEG's standard so successful? The compression technique employed by JPEG allows a large image file to be compressed down to a much smaller size while retaining a substantial amount of the integrity and quality of the image. The degree of compression in JPEG's process can be modified to suit the needs of the individual who is compressing the image, and it is possible to compress an image to one eighth or one ninth of its original size while retaining enough quality for a decent image [1].

File compression has one basic goal: take a large file and condense it into a more compact form for easier storage or transport. Image files are much larger than text files on average, and therefore have a much greater need for compression. Figure 1 briefly illustrates the idea behind compression of data for storage or transport and the resulting decompression.

Many image compression techniques have been developed along two main lines: lossless compression and lossy compression. Lossless image compression takes
image data and compacts it as well as it can while retaining all of the information. Lossy image compression, on the other hand, is just what it sounds like: information is lost during compression. This technique generally results in better compression since some information deemed unnecessary is discarded, resulting in an image with a drastically reduced file size. Since information is lost, this compression method would seem to result in an image of poorer quality, but the beauty of JPEG compression is that much of the visual information that is discarded is imperceptible to the human eye, so the resulting image can be of virtually the same quality.

JPEG image compression is suitable for any type of bitmap image. Bitmap images are basically $m \times n$ matrices (or layers of $m \times n$ matrices), with each entry in the matrix determining the color of one pixel in the image. Each entry, in turn, is represented by a given number of bits. If each pixel were represented by one bit, only two different colors could be represented, since a computer working in binary could assign the pixel a value of either 0 or 1. If two bits were used to represent each pixel, four colors could be represented. In general, a pixel stored in $n$ bits can take $2^n$ possible colors. Images with either eight bits per pixel (one byte) or twenty-four bits per pixel (three bytes) are most common [10].

Three different types of bitmap images are generally used. The first type is the intensity (grayscale) image, where each entry in the matrix has a value between zero and one, with zero being black and one being white. Next, there are 256-color images, where each entry in the matrix is an integer between 0 and 255 that corresponds to a distinct color. In this case each pixel requires eight bits, or one byte, of storage. The third image type is called truecolor.
In truecolor images three matrices are used: one for the shade of red, another for the shade of green, and a third for the shade of blue (see Figure 2). Truecolor images are also known as RGB (Red,
Green, Blue) images because of this. A truecolor image is made by layering these three $m \times n$ matrices on top of each other, producing the required color for each pixel [9].

The method employed by JPEG exploits the limited ability of the human eye to detect color and brightness changes in an image. When dealing with very bright or very dim colors, the human eye cannot easily detect changes in the color from pixel to pixel, especially if the change is rather large. JPEG's process takes advantage of this fact and discards imperceptible changes in the color (or chrominance) of an image as well as some changes in the brightness (or luminance), which the human eye is a little better at detecting. This basis for compression makes JPEG's method ideal for compressing photographs, which often have large variations in color and brightness from pixel to pixel.

There are four main steps in the process of JPEG compression: the Discrete Cosine Transform (DCT), quantization, reordering, and Huffman coding. The purpose of the Discrete Cosine Transform is to change the matrix of pixel color values into a matrix of values that describe the change of color from pixel to pixel, both by row and by column. The quantization process then discards picture detail that is either nearly or completely imperceptible to the human eye, and the quantized matrix is then reordered and Huffman encoded to compress the file even further.

All of these processes will be performed step by step as the Seal intensity image is compressed using JPEG's method. Since there is no color to deal with, the compression only has to deal with the brightness, or luminance, coefficients. See Figure 3.
Figure 2: Illustration of Layered Matrices in RGB Images
Figure 3: Seal Image
The Discrete Cosine Transform

The Discrete Cosine Transform (DCT) is the heart of JPEG's compression technique. While it performs no compression by itself, it changes the image matrix into a form that is more suitable for compression. The DCT represents an image as a linear combination of sinusoidal functions of varying frequencies in two dimensions, changing an image from its spatial domain to its frequency domain [5]. In essence, the matrix no longer represents the intensity of each pixel directly; instead it records the changes in intensity across the picture. Large changes in intensity correspond to higher DCT values, and smaller changes correspond to lower values. The two-dimensional DCT used in JPEG compression is defined as follows:

$$D_{uv} = \alpha_u \alpha_v \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} A_{mn} \cos\frac{\pi(2m+1)u}{2M} \cos\frac{\pi(2n+1)v}{2N},$$

where $0 \le u \le M-1$ and $0 \le v \le N-1$, and

$$\alpha_u = \begin{cases} 1/\sqrt{M}, & \text{if } u = 0, \\ \sqrt{2/M}, & \text{if } 1 \le u \le M-1, \end{cases} \qquad \alpha_v = \begin{cases} 1/\sqrt{N}, & \text{if } v = 0, \\ \sqrt{2/N}, & \text{if } 1 \le v \le N-1. \end{cases}$$

For a square block ($M = N$) this double sum can be written as a matrix product. Let $C$ be the DCT matrix with entries $C_{um} = \alpha_u \cos\frac{\pi(2m+1)u}{2M}$. The DCT of an image matrix is then

$$D = C A C^T,$$

where $A$ is the input image matrix and $D$ is the final DCT of the image. Now, it would require quite a bit of computing power to perform this transform directly on the $1152 \times 1536$ Seal image, so JPEG's method instead chops the image
matrix into $8 \times 8$ blocks. Each block is then transformed using the discrete cosine transform, which requires much less computing power. The DCT also has the property that it tends to concentrate the greater changes in intensity in the upper-left corner of the matrix, and this will prove to be vital in the next step of compression.

The $8 \times 8$ matrix in the top-left corner of the original image will be used as the example for JPEG's method. This matrix is shown directly below.

$$O = \begin{bmatrix}
182 & 181 & 176 & 178 & 176 & 177 & 173 & 175 \\
182 & 184 & 177 & 176 & 179 & 174 & 176 & 173 \\
176 & 176 & 176 & 180 & 178 & 174 & 171 & 168 \\
182 & 174 & 176 & 180 & 177 & 169 & 168 & 168 \\
180 & 172 & 181 & 178 & 171 & 170 & 173 & 172 \\
181 & 170 & 171 & 170 & 170 & 178 & 174 & 174 \\
176 & 174 & 170 & 170 & 171 & 182 & 179 & 166 \\
176 & 173 & 170 & 176 & 173 & 181 & 179 & 168
\end{bmatrix}$$

Generally the original $8 \times 8$ matrices are shifted so that the range of the entries is centered on 0 instead of 128. This simply requires that 128 be subtracted from every entry in the matrix. The computation is simple and is easily reversed when the image is reconstructed, so the shifted matrix is not shown here. The DCT can then be performed on the original matrix from the Seal image, and the result is shown below.
376.125 9.642 5.244.483 D= 1.125 3.622 2.037.345 13.589 9.971 6.027 2.142 3.511.352 3.090 9.409.991 10.167.375 1.753 7.701 6.326 2.058 7.203 7.092 8.938.529.659 1.915 4.799 4.375 2.448 3.881.0422 1.011.226 2.802 3.362 2.921 3.739 6.701 6.778 3.058.101 2.963.700.224 1.894.546 2.985 3.761 5.106.929.328 2.942.156 2.061 1.779 4.346.0516.199 1.942 2.403.241

As mentioned previously, the greatest changes in luminance are concentrated in the upper-left corner. Another crucial property of the DCT is that it is invertible: the inverse DCT will allow the image to eventually be reconstructed from its compressed form.

Quantization

Quantization is where compression really happens. As described before, when the image was sent through the discrete cosine transform, most of the visually important information was stored in the coefficients in the upper-left corner of the matrix. JPEG now gets rid of the excess information found in the bottom-right part of the matrix by employing a quantization matrix. Each coefficient in the transformed image matrix is divided by the corresponding coefficient of the quantization matrix and then rounded to the nearest integer value. This produces many zero coefficients in the quantized matrix.

$$R_{ij} = \operatorname{round}\left(\frac{D_{ij}}{Q_{ij}}\right)$$
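The divide-and-round rule can be tried on a handful of numbers. The following is a minimal sketch, not the program used for this paper: the helper names, the tie-breaking rule (halves rounded away from zero) and the small 2 × 2 inputs are assumptions made for illustration, with only the value 376.125 taken from the example block.

```python
import math

def round_half_away(x):
    """Round to the nearest integer, ties away from zero (assumed tie rule)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def quantize(D, Q):
    """Entrywise R[i][j] = round(D[i][j] / Q[i][j])."""
    return [[round_half_away(d / q) for d, q in zip(d_row, q_row)]
            for d_row, q_row in zip(D, Q)]

def dequantize(R, Q):
    """Approximate inverse: multiply each quantized entry by its step size."""
    return [[r * q for r, q in zip(r_row, q_row)]
            for r_row, q_row in zip(R, Q)]

# Hypothetical 2 x 2 corner of a transformed block and hypothetical step sizes.
D_corner = [[376.125, -13.6],
            [9.6, -2.5]]
Q_corner = [[16, 11],
            [12, 12]]

R_corner = quantize(D_corner, Q_corner)
restored = dequantize(R_corner, Q_corner)
print(R_corner)   # [[24, -1], [1, 0]]
print(restored)   # [[384, -11], [12, 0]]
```

The restored values differ from the inputs, and the smallest coefficient has become zero. That difference is the information discarded by quantization, and it cannot be recovered when the image is reconstructed.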
While there are many quantization matrices that are employed in JPEG compression, the one most commonly used for the luminance coefficients (and therefore most useful for the Seal intensity image) is the following:

$$Q = \begin{bmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{bmatrix}$$

This matrix can be scaled to achieve either a higher degree of compression or a higher quality image. When the DCT matrix $D$ of the Seal image is quantized, the following matrix is produced. Notice how sparse it is, with almost all of the lower-right part containing zero entries.

$$R = \begin{bmatrix}
24 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
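The steps up to this point (shift by 128, transform, quantize) can be sketched for the example block in plain Python. This is an illustrative sketch rather than the MATLAB program used for the paper. It assumes the DCT matrix has entries $C_{um} = \alpha_u \cos(\pi(2m+1)u/2M)$, so that $D = C A C^T$, and that halves are rounded away from zero. All function names are hypothetical, and the sketch does not attempt to reproduce every entry of the matrices printed above.

```python
import math

# Example block O and quantization matrix Q, copied from the text.
O = [
    [182, 181, 176, 178, 176, 177, 173, 175],
    [182, 184, 177, 176, 179, 174, 176, 173],
    [176, 176, 176, 180, 178, 174, 171, 168],
    [182, 174, 176, 180, 177, 169, 168, 168],
    [180, 172, 181, 178, 171, 170, 173, 172],
    [181, 170, 171, 170, 170, 178, 174, 174],
    [176, 174, 170, 170, 171, 182, 179, 166],
    [176, 173, 170, 176, 173, 181, 179, 168],
]
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def dct_matrix(n):
    """n x n DCT matrix: row u holds alpha_u * cos(pi * (2m + 1) * u / (2n))."""
    C = []
    for u in range(n):
        alpha = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        C.append([alpha * math.cos(math.pi * (2 * m + 1) * u / (2 * n))
                  for m in range(n)])
    return C

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def round_half_away(x):
    """Round to the nearest integer, ties away from zero (assumed tie rule)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

C = dct_matrix(8)
A = [[p - 128 for p in row] for row in O]        # shift entries toward 0
D = matmul(matmul(C, A), transpose(C))           # D = C A C^T
R = [[round_half_away(d / q) for d, q in zip(d_row, q_row)]
     for d_row, q_row in zip(D, Q)]
A_back = matmul(matmul(transpose(C), D), C)      # inverse DCT: A = C^T D C

print(round(D[0][0], 3), R[0][0])                # 376.125 24
```

The upper-left coefficient evaluates to 376.125 and quantizes to 24, in agreement with the leading entries of $D$ and $R$. Because $C$ is orthogonal, `A_back` reproduces the shifted block up to floating-point error, which is the invertibility property the reconstruction relies on.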
Figure 4: Zig-Zag Sequence

The $8 \times 8$ matrix has now been transformed by the DCT and quantized, resulting in the sparse matrix above. It can now be reordered into a $64 \times 1$ vector that concentrates all the non-zero entries from the matrix in the first few entries of the new vector. The string of zeros that follows the last non-zero entry can then be eliminated since it carries no information. This will decrease the file size drastically. The reordering pattern is shown in Figure 4. The zig-zag reordering pattern turns the $8 \times 8$ matrix into a long string of matrix
coefficients, which will be shown as a row vector.

$$V = [\,24 \;\; 1 \;\; 1 \;\; 0 \;\; 1 \;\; 0 \;\; 1 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; \ldots\,]$$

All the zeros following the final 1 entry in the vector are eliminated, and an end-of-block placeholder is added to signify that the string comes from one $8 \times 8$ matrix.

Now that the matrix has been quantized and reordered, and the excess string of zeros has been deleted, it is ready for Huffman coding. There is little to gain from applying Huffman coding to a single block of the compressed image matrix, but it is an important part of compression and will be briefly explained. Huffman coding is based on the frequency of the symbols being encoded and not on the symbols themselves, so it allows redundancies in the symbols to be compressed [3]. Codes are generated according to how often each symbol occurs in the input, with shorter codes representing the most common symbols. Huffman coding is also uniquely decodable, meaning that the code generated from a string cannot represent anything other than the original input when it is decoded.

Figure 5 is an example Huffman coding tree for characters in a string of typed words. The coding breaks down the symbols based on their frequency and generates a unique code for each. Coding pixel values uses essentially the same process. However, to use Huffman coding effectively, the entire image must be reduced to a string of pixel values, not just one $8 \times 8$ matrix.
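The reordering and truncation described at the start of this section can be sketched as follows. The traversal below walks the anti-diagonals of the block in alternating directions, which is assumed to match the pattern of Figure 4. The function names and the `"EOB"` string used as the end-of-block placeholder are hypothetical, and the entries of `R` are the quantized values as printed above.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col indexes a diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked upward (row decreasing), odd ones downward.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def zigzag(block):
    """Flatten a square block into a list using the zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

def truncate(vector, eob="EOB"):
    """Drop the zeros after the last non-zero entry and append a marker."""
    last = max((i for i, v in enumerate(vector) if v != 0), default=-1)
    return vector[:last + 1] + [eob]

# Quantized block as printed in the text: two non-trivial rows, six zero rows.
R = [
    [24, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
] + [[0] * 8 for _ in range(6)]

V = zigzag(R)
print(V[:12])        # [24, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0]
print(truncate(V))   # [24, 1, 1, 0, 1, 0, 1, 'EOB']
```

Of the 64 entries in the block, only seven coefficients and one marker remain to be stored.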
Figure 5: Example Tree
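A tree like the one in Figure 5 can be built with a priority queue. The sketch below is a generic Huffman construction applied to a made-up sentence. It is not the coding table used by JPEG or by the program behind this paper, and the function names and the sample string are assumptions.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Map each distinct symbol to a bit string; frequent symbols get short codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate input: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)      # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def encode(symbols, codes):
    return "".join(codes[s] for s in symbols)

def decode(bits, codes):
    """Read bits left to right; a prefix-free code makes every match final."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

message = "this is an example of a huffman tree"
codes = huffman_codes(message)
bits = encode(message, codes)
print(len(bits), "bits instead of", 8 * len(message))
```

Because no code is a prefix of another, the decoder never has to backtrack, which is what makes the output uniquely decodable. The same functions accept a list of integers, so a string of quantized coefficients could be passed in place of the sentence.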
Complete Process

So far only one $8 \times 8$ example matrix has been compressed using the methods of JPEG compression, but any given image will contain thousands of $8 \times 8$ block matrices, making one-by-one computation of the matrices as shown in the example impractical. A JPEG compression algorithm takes an image and processes it by $8 \times 8$ block matrices as shown in the example. The DCT is applied to each block, and the blocks are then quantized and finally reordered. Excess zeros are then removed, and the remaining entries are stored in a long string with an end-of-block marker inserted between the coefficients to separate the blocks. Huffman coding can then be applied to the string, resulting in another reduction of the file size, and the image is stored.

When the image is accessed, it can be reconstructed by reversing the Huffman coding, reordering the string into matrix blocks as designated by the end-of-block markers, and then applying the inverse quantization and the inverse DCT. A reconstructed image that has been compressed using the JPEG compression procedure will be an approximation of the original due to the loss of information in the compression procedure. However, the images (again, usually photographs) tend to retain a substantial amount of quality.

The final result of the compression of the Seal image is shown in Figure 6. The compression was done using JPEG's standard compression rate, and it has retained very good visual quality compared with the original Seal image shown in Figure 3. The compression program was written using MATLAB, and the file sizes for the original and compressed images were obtained from the MATLAB workspace. The file size of the original Seal image shown in Figure 3 is 1769 kilobytes, but the file size of the compressed image is a mere 158 kilobytes. This result includes implementing the Huffman coding procedure.
This is a good compression result, considering that the compressed image retains good quality and shows only minor artifacts.
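The reported sizes can be put in perspective with a little arithmetic. The sketch below assumes that the intensity image is stored at one byte per pixel and that a kilobyte means 1000 bytes. Neither assumption is stated in the text, although together they reproduce the reported original size. The variable names are illustrative.

```python
rows, cols = 1152, 1536            # Seal image dimensions from the text
bytes_per_pixel = 1                # assumption: 8-bit intensity image

raw_bytes = rows * cols * bytes_per_pixel
raw_kb = raw_bytes / 1000          # assumption: 1 kilobyte = 1000 bytes
blocks = (rows // 8) * (cols // 8) # number of 8 x 8 blocks to process

original_kb, compressed_kb = 1769, 158   # sizes reported in the text
ratio = original_kb / compressed_kb

print(raw_bytes, blocks, round(ratio, 1))   # 1769472 27648 11.2
```

Under these assumptions the compressed file is roughly one eleventh of the original size, and the image is made up of 27,648 separate blocks.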
Figure 6: Compressed Seal Image
Overall, JPEG's compression process is an extremely useful type of image compression. It is ideal for storing photographs, since photographs contain a good deal of visually unimportant information that can easily be discarded, and it has become one of the most common file types in use. This paper has demonstrated the ability of the JPEG compression process to retain visual quality in an image while drastically reducing the file size through the Discrete Cosine Transform, quantization, reordering, and Huffman coding.

As a final thought, I would like to thank my friend Ryan Wortman for supplying the digital image.

References

[1] Austin, David. Image Compression: Seeing What's Not There. American Mathematical Society. 2009. Accessed 24 Aug. 2009.
[2] Discrete Cosine Transform. Image Processing Toolbox. Mathworks. Accessed 4 Sept. 2009.
[3] Gonzalez, Rafael C., Richard E. Woods, and Steven L. Eddins. Digital Image Processing Using MATLAB. Gatesmark Publishing. 2009. Pages 283-323.
[4] Harris, Greg A. and Darrel Hankerson. Transform Methods and Image Compression. Web. January 1st, 1999. Accessed September 2, 2009.
[5] Holloway, Catherine. JPEG Image Compression: Transformation, Quantization, and Encoding. April 2008. Accessed 22 Sept. 2009.
[6] Image Types. Mathworks Helpdesk. Accessed 22 Sept. 2009.
[7] O'Hanen, Ben and Matthew Wisan. JPEG. Student Projects in Linear Algebra. Ed. David Arnold. 16 Dec. 2005. Accessed 2 Sept. 2009.
[8] Penfield, Paul. Chapter 3: Compression. Notes. 12 Feb. 2004. MIT. Accessed 6 Sept. 2009.
[9] Poynton, Charles. Digital Video and HDTV: Algorithms and Interfaces.
[10] Types of Bitmaps. MSDN. Microsoft Corporation. 2009. Accessed 2 Oct. 2009.
[11] Wallace, Gregory K. The JPEG Still Picture Compression Standard. Communications of the ACM. 1 April 1991: 30-44. Accessed 4 Sept. 2009.