Index

1. Motivation
2. Background
3. JPEG Compression
   The Discrete Cosine Transformation
   Quantization
   Coding
4. MPEG
5. Literature

Lossy Compression

Motivation

To meet a given target bit-rate for storage (and transmission).

Background

Desktop computers communicate information primarily via their screens, so graphics are a major concern for computer programmers and designers. Much research has gone into improving the way programs display data. One result of this research was a vast array of computers capable of displaying complex graphical images with a quality approaching that of television or magazines. Programs using complex graphics appear in many areas of computing, including games, education, desktop publishing, graphical design and, most recently, the World Wide Web. Although graphics do a great deal to enhance the usability and visual aesthetics of such applications, they consume prodigious amounts of disk storage. The answer to this problem is image compression.

By the late 1980s, lossy compression had been researched extensively, exploiting the known limitations of the human eye. Its algorithms play on the idea that slight modifications and loss of information during the compression/decompression process often do not affect the quality of the image as perceived by the human user. Two standardization groups (ISO and CCITT) formed a joint group to standardize this kind of compression and named it the Joint Photographic Experts Group (JPEG). The JPEG specification includes a specification for lossless encoding and for lossy encoding. The most interesting part is the lossy compression, which is detailed below.

JPEG Compression

The JPEG compression algorithm works in three successive stages:

1. DCT transformation
2. Coefficient quantization
3. Lossless compression

The Discrete Cosine Transform

The key to JPEG compression is a mathematical transformation known as the Discrete Cosine Transform (DCT). It belongs to a class of mathematical operations that also includes the well-known Fast Fourier Transform (FFT).
Its basic operation is to take a signal and transform it from one type of representation to another. In this case the signal is a graphical image. The transformation maps a set of points from the spatial domain into an equivalent representation in the frequency domain. This makes it possible to identify pieces of information that can be thrown away without seriously reducing the image's quality.

First, the source image is broken down into N x N blocks. In practice, N most often equals 8: a larger block would probably give better compression, but the DCT calculations then take a great deal more time, creating an unreasonable tradeoff. As a result, DCT implementations typically break the image down into more manageable 8 x 8 blocks. The discrete cosine transform is then applied to each block. The mathematical function for a two-dimensional DCT is:

$$
DCT(i,j) = \frac{2}{N}\, C(i)\, C(j) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} pixel(x,y)\, \cos\left[\frac{(2x+1)\, i\, \pi}{2N}\right] \cos\left[\frac{(2y+1)\, j\, \pi}{2N}\right]
$$

$$
C(x) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{if } x \text{ is } 0 \\ 1 & \text{if } x > 0 \end{cases}
$$

For N = 8 the leading factor 2/N equals 1/4. This function can be rewritten as a matrix multiplication problem, which is more efficient for computers, using the following matrix:

$$
M_{i,j} = \begin{cases} \dfrac{1}{\sqrt{N}} & \text{if } i = 0 \\ \sqrt{\dfrac{2}{N}}\, \cos\left[\dfrac{(2j+1)\, i\, \pi}{2N}\right] & \text{if } i > 0 \end{cases}
\qquad
DCT = M \cdot pixel \cdot M^{T}
$$

For decoding there is an Inverse DCT (IDCT):

$$
pixel(x,y) = \frac{2}{N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} C(i)\, C(j)\, DCT(i,j)\, \cos\left[\frac{(2x+1)\, i\, \pi}{2N}\right] \cos\left[\frac{(2y+1)\, j\, \pi}{2N}\right]
$$

with C(x) defined as above: 1/√2 if x is 0, else 1 if x > 0.

Let's see an example. The input 8 x 8 matrix from a grey-scale image consists of pixel values spread over roughly the 140 to 175 range. These values are fed to the DCT algorithm, creating the output matrix below.
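Before the numbers, a minimal Python sketch of the matrix form of the transform may help. It is not taken from the source: the function names (`dct_matrix`, `forward_dct`, `inverse_dct`) are chosen for illustration, and subtracting 128 from every pixel before the transform is an assumption. The text does not mention that shift, but with it the sketch reproduces the DC value 186 of the example (without it the DC term would be 1210).

```python
import math

def dct_matrix(n=8):
    """Cosine matrix M such that DCT = M * block * M^T (name is illustrative)."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == 0:
                m[i][j] = 1.0 / math.sqrt(n)
            else:
                m[i][j] = math.sqrt(2.0 / n) * math.cos((2 * j + 1) * i * math.pi / (2 * n))
    return m

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def forward_dct(block):
    m = dct_matrix(len(block))
    return matmul(matmul(m, block), transpose(m))

def inverse_dct(coeffs):
    m = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(m), coeffs), m)

# Input matrix of the worked example.
INPUT_BLOCK = [
    [140, 144, 147, 140, 140, 155, 179, 175],
    [144, 152, 140, 147, 140, 148, 167, 179],
    [152, 155, 136, 167, 163, 162, 152, 172],
    [168, 145, 156, 160, 152, 155, 136, 160],
    [162, 148, 156, 148, 140, 136, 147, 162],
    [147, 167, 140, 155, 155, 140, 136, 162],
    [136, 156, 123, 167, 162, 144, 140, 147],
    [148, 155, 136, 155, 152, 147, 147, 136],
]

# Assumption: pixels are shifted by 128 before the transform.
shifted = [[p - 128 for p in row] for row in INPUT_BLOCK]
coeffs = forward_dct(shifted)

# The inverse transform restores the pixels (up to floating-point rounding).
restored = [[round(v) + 128 for v in row] for row in inverse_dct(coeffs)]

print(round(coeffs[0][0]))  # 186
```

Without quantization the DCT loses nothing: `restored` equals `INPUT_BLOCK` again. The compression comes from the steps that follow.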
Input matrix

140 144 147 140 140 155 179 175
144 152 140 147 140 148 167 179
152 155 136 167 163 162 152 172
168 145 156 160 152 155 136 160
162 148 156 148 140 136 147 162
147 167 140 155 155 140 136 162
136 156 123 167 162 144 140 147
148 155 136 155 152 147 147 136

Output matrix

186 -18  15  -9  23  -9 -14  19
 21 -34  26  -9 -11  11  14   7
-10 -24  -2   6 -18   3 -20  -1
 -8  -5  14 -15  -8  -3  -3   8
 -3  10   8   1 -11  18  18  15
  4  -2 -18   8   8  -4   1  -7
  9   1  -3   4  -1  -7  -1  -2
  0  -8  -2   2   1   4  -6   0

The output matrix consists of DCT coefficients. They are ordered so that the coefficients carrying the most important information for representing the image are in the upper left of the matrix, and the coefficients carrying less useful information are in the lower right. The DC coefficient (186 above) is at position 0,0 in the upper left-hand corner of the matrix and is proportional to the average of all 64 input values. This step prepares the image for the next step, namely quantization.

Quantization

Up to this step in the JPEG compression, little actual image compression has occurred. Of all the steps, quantization is the essence of lossy compression. Quantization is the process of reducing the number of bits needed to store an integer value by reducing the precision of the integer. For every element in the DCT matrix, a corresponding value in the quantization matrix gives a quantum value. The quantum value sets the step size for that element; what is stored in the compressed image is the DCT coefficient divided by its quantum value. The formula for quantization is simple:

$$
\text{Quantized value}(i,j) = \frac{DCT(i,j)}{\text{Quantum}(i,j)}, \text{ rounded to an integer}
$$

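The division step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the source's implementation: the function names are invented here, the quantum values come from the quality-factor formula Q(i,j) = 1 + (1 + i + j) × quality factor that this section goes on to define, and truncating the quotient toward zero is inferred from the sample matrices later in the section (other implementations round to the nearest integer instead).

```python
import math

def quantization_matrix(quality, n=8):
    """Quantum values Q[i][j] = 1 + (1 + i + j) * quality."""
    return [[1 + (1 + i + j) * quality for j in range(n)] for i in range(n)]

def quantize(dct, quantum):
    """Divide each coefficient by its quantum value.

    Assumption: the quotient is truncated toward zero, which is what
    the sample matrices in the text imply.
    """
    return [[math.trunc(d / q) for d, q in zip(d_row, q_row)]
            for d_row, q_row in zip(dct, quantum)]

def dequantize(levels, quantum):
    """Decoder side: multiply each stored level back by its quantum value."""
    return [[v * q for v, q in zip(v_row, q_row)]
            for v_row, q_row in zip(levels, quantum)]

# "DCT matrix before quantization" from the sample later in the text.
SAMPLE_DCT = [
    [ 92,   3,  -9,  -7,   3,  -1,   0,   2],
    [-39, -58,  12,  17,  -2,   2,   4,   2],
    [-84,  62,   1, -18,   3,   4,  -5,   5],
    [-52, -36, -10,  14, -10,   4,  -2,   0],
    [-86, -40,  49,  -7,  17,  -6,  -2,   5],
    [-62,  65, -12,  -2,   3,  -8,  -2,   0],
    [-17,  14, -36,  17, -11,   3,   3,  -1],
    [-54,  32,  -9,  -9,  22,   0,   1,   3],
]

QUANTUM = quantization_matrix(quality=2)
levels = quantize(SAMPLE_DCT, QUANTUM)    # small integers that get stored
restored = dequantize(levels, QUANTUM)    # what the decoder reconstructs

print(levels[0][:4])    # [30, 0, -1, 0]
print(restored[0][:4])  # [90, 0, -7, 0]
```

Most entries of `levels` are zero, and the non-zero ones are much smaller than the original coefficients, which is what lets the later coding stage store the block compactly.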
From the formula, one can see that the smaller DCT coefficients of the high-frequency elements, divided by the larger quantum values, will most often be rounded down to zero.

The quantization matrix must be selected carefully. Although JPEG allows the use of any quantization matrix, ISO has done extensive testing and developed a standard set of quantization values that give impressive degrees of compression. To determine the quantum step sizes, the user inputs a single quality factor, which should range from one to about twenty-five. Values larger than twenty-five would work, but picture quality has degraded far enough at quality level 25 to make going any further an exercise in futility. The quality level sets the difference between adjoining bands of the same quantization level. These bands are oriented on diagonal lines across the matrix, so quantization levels of the same value are all roughly the same distance from the origin. The quantization matrix is calculated with this formula:

$$
Q_{i,j} = 1 + (1 + i + j) \cdot \text{quality factor}
$$

With a quality factor of two, the quantization matrix looks like this:

 3  5  7  9 11 13 15 17
 5  7  9 11 13 15 17 19
 7  9 11 13 15 17 19 21
 9 11 13 15 17 19 21 23
11 13 15 17 19 21 23 25
13 15 17 19 21 23 25 27
15 17 19 21 23 25 27 29
17 19 21 23 25 27 29 31

The sample matrices below show the effects of quantization on a DCT matrix. The second matrix shows each coefficient after it has been quantized and multiplied back by its quantum value, as the decoder sees it; in this sample the quotients are truncated toward zero.

DCT matrix before quantization

 92   3  -9  -7   3  -1   0   2
-39 -58  12  17  -2   2   4   2
-84  62   1 -18   3   4  -5   5
-52 -36 -10  14 -10   4  -2   0
-86 -40  49  -7  17  -6  -2   5
-62  65 -12  -2   3  -8  -2   0
-17  14 -36  17 -11   3   3  -1
-54  32  -9  -9  22   0   1   3

DCT matrix after quantization

 90   0  -7   0   0   0   0   0
-35 -56   9  11   0   0   0   0
-84  54   0 -13   0   0   0   0
-45 -33   0   0   0   0   0   0
-77 -39  45   0   0   0   0   0
-52  60   0   0   0   0   0   0
-15   0 -19   0   0   0   0   0
-51  19   0   0   0   0   0   0

The low-frequency elements near the DC coefficient have been modified, but only by small amounts. The high-frequency areas of the matrix have, for the most part, been reduced to zero, eliminating their effect on the decompressed image. In this sense, insignificant data has been discarded and the image information has been compressed.

Coding

The final step in the JPEG process is coding the quantized images. This is done in the three steps detailed below.

1. Convert the DC coefficient to a relative value

First, the absolute value of the DC coefficient at 0,0 is changed to a relative value. Since adjacent blocks in an image display a high degree of correlation, coding the DC element as the difference from the previous DC element typically produces a very small number.

2. Reorder the DCT block in a zig-zag sequence

Because so many coefficients in the DCT image are truncated to zero during the coefficient quantization stage, the zeros are handled differently from non-zero coefficients. They are coded using a Run-Length Encoding (RLE) algorithm. RLE gives a count of consecutive zero values in the image, and the longer the runs of zeros, the greater the compression. One way to increase the length of the runs is to reorder the coefficients in the zig-zag sequence shown in the diagram below.

[Diagram: zig-zag scan order, from the DC coefficient along successive diagonals to the lower right corner]

This way, the JPEG algorithm moves through the block selecting the highest-value elements first and eventually working its way to the lowest-value elements, which optimizes the effect of RLE.

3. Entropy encoding

Finally, the JPEG algorithm outputs the DCT block's elements using an entropy encoding mechanism that combines the principles of RLE and Huffman encoding. The output of the entropy encoder consists of a sequence of three tokens, repeated until the block is complete.
The three tokens are:

Run length: the number of consecutive zeros that precede the current non-zero element in the DCT output matrix.
Bit count: the number of bits used to encode the amplitude value that follows, as determined by the Huffman encoding scheme.
Amplitude: the value of the DCT coefficient.

MPEG

MPEG stands for Moving Picture Experts Group. It was developed by ISO/IEC and was established in 1988. In general it uses the same coding techniques as JPEG, and adds block-based motion-compensated prediction. Further differences between MPEG and JPEG are that MPEG has only one color space (4:2:0, YCbCr), one sample precision (8 bits) and one scanning mode (sequential).

There are several variations of MPEG, namely:

MPEG-1 standard, on which such products as VCD and MP3 are based
MPEG-2 standard, on which such products as digital television set-top boxes and DVD are based
MPEG-4 standard, for multimedia for the web and mobility
MPEG-7 standard, in progress

Literature

1. Nelson, Mark. The Data Compression Book, pp. 347-373. M&T Publishing, 1991.
2. Sikora, T. Digital Consumer Electronics Handbook. McGraw-Hill Book Company.