Robert Matthew Buckley
Nova Southeastern University
Dr. Laszlo
MCIS625 On Line
Module 2
Graphics File Format Essay
JPEG COMPRESSION METHOD

The Joint Photographic Experts Group (JPEG) format is the most commonly used format for storing photographic images (Miano, 2005). It defines four compression modes: Sequential, Progressive, Lossless and Hierarchical. In sequential mode, images are encoded from top to bottom, and sample resolutions of 8 or 12 bits are used with run length encoding (RLE) and Huffman coding. Progressive mode encodes the image in multiple scans and also provides resolutions of 8 or 12 bits with Huffman and RLE encoding. Lossless mode cannot match the compression achieved with the sequential or progressive modes. Hierarchical is a super progressive mode in which the image is broken down into sub-images (frames); it is an extremely complex implementation of the JPEG standard.

Since this paper is more of a narrative, the emphasis is on the schemes involved in achieving compression, the compression ratios that can be expected, and the tradeoffs in image clarity that result from the compression ratio selected. Since 12 bits of resolution is normally reserved for medical image processing, that resolution is not discussed here because of its limited use.

Red, Green and Blue (RGB) video memory can hold various image sizes, for example 1024 x 1024 picture elements (pixels). These pixels are divided into multiple 16 x 16 RGB boxes, or matrices, and each box is further divided into 4 (8x8) RGB matrices (4x64x3 bytes). At this point, JPEG compression begins by converting this RGB data into 4 (8x8) Luminance (Y) matrices, or 4x64x1 bytes, plus one 8x8 Chrominance blue (Cb) matrix and one 8x8 Chrominance red (Cr) matrix, or 2x64x1 bytes. The chrominance matrices are generated from every 2nd row and 2nd column of the RGB matrix, while the Luminance is processed using the complete 4x64x3 bytes. Figure 1 depicts these matrices with the Y, Cr and Cb information. Embedded in this figure is the calculation for Chrominance and Luminance prior to performing a discrete cosine transform (DCT).
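The color conversion and chrominance decimation described above can be sketched as follows. This is a minimal illustration, not the calculation embedded in Figure 1: it assumes the conversion weights commonly used with JFIF files, and the function names `rgb_to_ycbcr` and `split_macroblock` are hypothetical. Chrominance is decimated by keeping every second row and column, as the text describes.

```python
def clamp_byte(value):
    """Round to the nearest integer and limit the result to 0..255."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumption: the weights below are the ones commonly used with JFIF.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp_byte(y), clamp_byte(cb), clamp_byte(cr)


def split_macroblock(rgb):
    """Split a 16x16 grid of (r, g, b) tuples into Y, Cb and Cr blocks.

    Returns four 8x8 luminance blocks plus one 8x8 Cb block and one
    8x8 Cr block taken from every second row and column.
    """
    ycc = [[rgb_to_ycbcr(*pixel) for pixel in row] for row in rgb]
    y_blocks = [
        [[ycc[r0 + i][c0 + j][0] for j in range(8)] for i in range(8)]
        for r0 in (0, 8)
        for c0 in (0, 8)
    ]
    cb = [[ycc[2 * i][2 * j][1] for j in range(8)] for i in range(8)]
    cr = [[ycc[2 * i][2 * j][2] for j in range(8)] for i in range(8)]
    return y_blocks, cb, cr
```

For one 16 x 16 box the input is 4x64x3 = 768 bytes and the output is 4x64 + 2x64 = 384 bytes, so this step alone halves the data before any DCT processing.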
Figure 1 JPEG Compression Scheme ( http://www.fho-emden.de/~hoffmann/jpeg131200.pdf )

Before continuing with the JPEG processing methods, two concepts must be understood. First, when signals are sampled, they are treated as complex, meaning that they contain both sine and cosine components. To simplify processing, the sampled signal is mirrored, which leaves only cosine components (doubling their amplitude) while canceling the sine components. This simplifies Fourier processing, and in this case the transform can be performed using a direct discrete algorithm as opposed to a Fast Fourier algorithm, because the number of redundant samples is insignificant. The equation for the DCT is depicted in Figure 2.

Figure 2 DCT
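As a rough illustration of the transform, the sketch below computes a two-dimensional DCT of one 8x8 block directly from the double sum, without any fast algorithm. It assumes the DCT-II form and scaling commonly associated with JPEG, which may differ by a constant factor from the equation in Figure 2. The function name `dct_8x8` is hypothetical; `k1`, `k2`, `i` and `j` follow the notation used with Figure 2.

```python
import math

N = 8  # block size used by JPEG


def dct_8x8(block):
    """Direct 2-D DCT-II of an 8x8 block (list of 8 lists of 8 numbers).

    Assumed form:
      B(k1, k2) = (2/N) * C(k1) * C(k2) * sum over i, j of
                  block[i][j] * cos((2i+1) k1 pi / 2N) * cos((2j+1) k2 pi / 2N)
      with C(0) = 1/sqrt(2) and C(k) = 1 otherwise.
    """
    coeffs = [[0.0] * N for _ in range(N)]
    for k1 in range(N):
        for k2 in range(N):
            c1 = math.sqrt(0.5) if k1 == 0 else 1.0
            c2 = math.sqrt(0.5) if k2 == 0 else 1.0
            total = 0.0
            for i in range(N):
                for j in range(N):
                    total += (
                        block[i][j]
                        * math.cos((2 * i + 1) * k1 * math.pi / (2 * N))
                        * math.cos((2 * j + 1) * k2 * math.pi / (2 * N))
                    )
            coeffs[k1][k2] = (2.0 / N) * c1 * c2 * total
    return coeffs
```

A completely flat block illustrates why the transform helps compression: all of its energy lands in the single coefficient at (0, 0), and the other 63 coefficients are zero to within rounding error.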
B(k1, k2) represents the DCT coefficients, that is, the frequency-domain representation of the original digitized image. Note that the (i, j) matrix elements are the sampled, digitized signal levels that require DCT processing after the Y and Cb/Cr conversion.

Second, the cones of the eye are responsible for color response and respond only to mid to high intensities of light, while the rods respond well to low levels of light (black & white). In addition, extremely high-frequency visual changes in an image or scene are not easily resolved or perceived (the eye will not respond). The DCT is therefore a digital filter that transforms sampled video or image information into the frequency domain, representing horizontal and vertical picture elements as low- to high-frequency amplitude information, with the higher frequencies being zeroed.

An example of digitizing is as follows: if a 1 kHz signal is sampled at 10 kHz and 1000 samples are collected, each filter has a (3 dB) bandwidth of 10 Hz (10,000 Hz / 1000 samples), and the filters cover the frequency range from DC to 10 kHz.

Since the human eye's response peaks around green (about the middle of the 400 to 700 nanometer range), and since the high-frequency components can be ignored without a perceptible loss of image fidelity, this scheme provides data reduction without loss in visual fidelity. This is one method of achieving JPEG compression. Decimation, data turning or averaging provides additional compression for the image with little effect on fidelity. Finally, run length encoding and Huffman encoding are used to complete the compression scheme.

In processing the information, the DCT coefficients are scanned in a zigzag order so that the significant low-frequency components come first and the many zeros expected at the higher frequencies are grouped together at the end.
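The zigzag scan can be sketched as a walk along the anti-diagonals of the coefficient matrix, alternating direction on each diagonal. The helper names `zigzag_order` and `zigzag_scan` are hypothetical, and the sketch assumes the coefficient at row 0, column 0 is the lowest-frequency (DC) term.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            diagonal.reverse()
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block into a list, low frequencies first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

Applied to a block whose nonzero values sit near the top-left corner, `zigzag_scan` returns a list that starts with those values and ends in one long run of zeros, which is the form that run length encoding handles best.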
The runs of sequential zeros are compressed losslessly by Run Length Encoding (RLE); for example, the data 0,0,0,0,0 is encoded as the RLE pair 0,5. After RLE encoding, a Huffman code is applied, which uses entropy coding and probability to reduce the data representation even further by assigning the shortest codes to the most frequently used bit patterns.

With respect to compression, JPEG as described here is a lossy compression scheme that combines a lossy DCT stage with lossless RLE and Huffman encoding. While the DCT stage compresses an image, it depends upon human perception for the compressed image to be viewed as an acceptable or high-quality replica. Color images compress better while maintaining good fidelity but still produce larger files than black & white images, which do not compress as well as their color counterparts. Figure 3 depicts the compression ratios achieved.

Figure 3 JPEG Compression Ratio Table ( http://dynamo.ecn.purdue.edu/~ace/jpeg-tut/jpgimag1.html )
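The two lossless stages described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: `run_length_encode` uses the plain (value, count) pairs from the example in the text, whereas a real JPEG encoder pairs each zero-run length with the next nonzero coefficient, and `huffman_code_lengths` only derives code lengths from symbol counts rather than using the tables defined by the standard. Both function names are hypothetical.

```python
import heapq
from collections import Counter


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(run) for run in runs]


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth}); the unique
    # tie_breaker keeps the dictionaries from ever being compared.
    heap = [(w, n, {s: 0}) for n, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]
```

With these definitions, `run_length_encode([0, 0, 0, 0, 0])` returns `[(0, 5)]`, matching the example in the text, and `huffman_code_lengths("aaaabbc")` assigns 1 bit to the most frequent symbol `a` and 2 bits each to `b` and `c`, for 10 bits in total instead of 14 with a fixed 2-bit code.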
The subsequent pictures provide insight into quality factors and image clarity for both color and black & white images. They are shown without JPEG compression, then at a 20% quality factor, then at a 5% quality factor. Figure 4 depicts how image quality depends upon the quality factor. Also note that JPEG compression is poorly suited to text files, where run length codes or Huffman coding are more appropriate. A reduction from 1 Mbyte to 50 Kbytes, or a 1,000,000/50,000 = 20:1 compression ratio, without loss in perceived fidelity is realizable with JPEG compression.
Figure 4 Comparison JPEG Quality Factors ( http://dynamo.ecn.purdue.edu/~ace/jpeg-tut/jpgimag1.html )

Note that Figure 5 shows the zigzag scanning order, which allows faster processing of the more abundant low-frequency components of the image, and Figure 6 shows the DCT model.
Figure 5 Zigzag Scanning ( http://www.fho-emden.de/~hoffmann/jpeg131200.pdf )

Figure 6 JPEG Compression Block
References

http://css.engineering.uiowa.edu/~aip/papers/jpeg2000_still_image_compression01.pdf
http://www.fho-emden.de/~hoffmann/jpeg131200.pdf
http://dynamo.ecn.purdue.edu/~ace/jpeg-tut/jpgimag1.html
Miano, John. (2005). Compressed Image File Formats. Addison Wesley Publishing, Reading, MA.
Watkinson, John. (1999). MPEG-2. Focal Press Publishing, Boston, MA.