Course Presentation Multimedia Systems Image III (Image Compression, JPEG) Mahdi Amiri April 2011 Sharif University of Technology
Image Compression Basics
Large amount of data in digital images
  File size for a 14-megapixel color image:
    42 MB in uncompressed RGB, 24 bits/pixel (~24 images on a 1 GB memory card)
    ~1.5 MB in JPEG at 90% quality (~667 images on a 1 GB memory card)
Compression is crucial
A number of techniques are available: RLE, LZ, ADPCM, DCT
The choice depends on
  Type of image (B/W, grayscale, color, content)
  Application (entertainment, medical, real-time)
Page 1
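The storage figures on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; it assumes that 1 GB is taken as 1000 MB and that the quoted 1.5 MB JPEG size is representative, and the variable names are illustrative only.

```python
MEGAPIXELS = 14
BYTES_PER_PIXEL = 3      # 24-bit RGB: one byte each for R, G, B
CARD_MB = 1000           # assumption: 1 GB counted as 1000 MB
JPEG_MB = 1.5            # approximate JPEG size quoted on the slide

raw_mb = MEGAPIXELS * BYTES_PER_PIXEL   # 14 Mpixel * 3 bytes/pixel = 42 MB
raw_images = CARD_MB / raw_mb           # images per card, uncompressed
jpeg_images = CARD_MB / JPEG_MB         # images per card, JPEG
ratio = raw_mb / JPEG_MB                # compression ratio for this example

print(f"raw: {raw_mb} MB, about {raw_images:.0f} images per card")
print(f"JPEG: {JPEG_MB} MB, about {jpeg_images:.0f} images per card")
print(f"compression ratio about {ratio:.0f}:1")
```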
Image Compression JPEG
Most commonly used still-image compression method (image files, cameras, and the WWW)
Lossy compression (the standard also includes a lossless coding mode)
Adjustable degree of compression: a tradeoff between storage size and image quality
Typical compression ratio: 10:1 (with little perceptible loss in image quality)
Supports a maximum image size of 65535×65535 pixels
Sample image: original 178 KB; Q = 50: 37 KB; Q = 5: 16 KB; Q = 1: 13 KB
Page 2
Image Compression JPEG
Acronym for the Joint Photographic Experts Group
A subgroup of ISO/IEC (http://www.jpeg.org/)
The group was organized in 1986
First public release: JPEG Part 1 standard, 1992
Page 3
Image Compression JPEG
Pro: Works well on photographs and paintings of realistic scenes with smooth variations of tone and color.
Con: Not well suited to line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts. Lossy compression, as typically used, is also unsuitable for certain applications such as medical imaging.
Figures: House test image; Grass test image
Page 4
Image Compression JPEG Encoder Steps
Color space transformation (RGB to Y'CbCr): The representation of the colors in the image is converted from RGB to Y'CbCr, consisting of one luma component (Y'), representing brightness, and two chroma components (Cb and Cr), representing color. This step is sometimes skipped.
Chroma subsampling: The resolution of the chroma data is reduced, usually by a factor of 2. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details.
Block splitting and DCT: The image is split into blocks of 8×8 pixels. For each block, each of the Y, Cb, and Cr data undergoes a discrete cosine transform (DCT). A DCT is similar to a Fourier transform in the sense that it produces a kind of spatial frequency spectrum.
Quantization: The amplitudes of the frequency components are quantized. Human vision is much more sensitive to small variations in color or brightness over large areas than to the strength of high-frequency brightness variations. Therefore, the magnitudes of the high-frequency components are stored with a lower accuracy than the low-frequency components. The quality setting of the encoder (for example 50 or 95 on a scale of 0 to 100 in the Independent JPEG Group's library) affects to what extent the resolution of each frequency component is reduced. If an excessively low quality setting is used, the high-frequency components are discarded altogether.
Entropy coding: The resulting data for all 8×8 blocks is further compressed with a lossless algorithm, a variant of Huffman encoding.
Page 5
JPEG Codec Diagram, Scheme 1 Encoder Decoder Page 6
JPEG Encoder Diagram, Scheme 2 JPEG encoder diagram for a single block of 8 by 8 pixels Page 7
JPEG Encoder Diagram, Scheme 3 Baseline JPEG encoder block diagram Page 8
JPEG Color Space Transformation
RGB to YCbCr conversion concept: The human eye is less sensitive to fine color (chrominance) details than to fine brightness (luminance) details.
(Figure panels: Analog TV, Digital TV)
Cb = B - Y, Cr = R - Y (up to scaling and offset)
Page 9
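The color-difference idea on this slide can be sketched as follows. The coefficients are the full-range BT.601 values commonly used with JPEG/JFIF files, which is an assumption about which variant the slide has in mind; the function name `rgb_to_ycbcr` is illustrative, and the result is left as unrounded, unclamped floats.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range Y'CbCr (floats, not clamped)."""
    # Luma: weighted sum of R, G, B (BT.601 weights, assumed here).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chroma: scaled color differences, offset by 128 to fit an unsigned byte.
    cb = 128 + (b - y) / 1.772
    cr = 128 + (r - y) / 1.402
    return y, cb, cr

print(rgb_to_ycbcr(128, 128, 128))   # a neutral gray has no color difference
print(rgb_to_ycbcr(255, 0, 0))       # pure red pushes Cr up and Cb down
```

For a gray pixel both differences B - Y and R - Y vanish, so Cb and Cr sit at the 128 midpoint and all the information is carried by Y.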
JPEG, Chroma Subsampling Subsampling in YCbCr Page 10
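A minimal sketch of chroma subsampling by a factor of 2 in each direction (often written 4:2:0). Averaging each 2×2 neighbourhood is only one possible choice and is an assumption here, since the slide does not say how the reduced samples are computed; `subsample_420` is a hypothetical helper name, and even plane dimensions are assumed.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 neighbourhoods.

    Assumes the plane has an even number of rows and columns.
    """
    h, w = len(plane), len(plane[0])
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]

# Illustrative 4x4 Cb plane (made-up values).
cb = [
    [100, 102, 50, 54],
    [104, 106, 52, 56],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
small = subsample_420(cb)
print(small)
```

The luma plane is left at full resolution, so only the two chroma planes shrink to a quarter of their original sample count.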
JPEG, Block Splitting and DCT
Block splitting: The image is split into blocks of 8×8 pixels. Why this is done is discussed later.
Discrete cosine transform (DCT): Each 8×8 block of each component (Y, Cb, Cr) is converted to a frequency-domain representation using a normalized, two-dimensional type-II DCT.
Page 11
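JPEG, DCT Center Around Zero
The 8×8 sub-image shown in 8-bit grayscale. Before the DCT, 128 is subtracted from each 8-bit sample so that the values are centered around zero (range -128 to 127).
Page 12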
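The level shift and the 8×8 type-II DCT can be sketched directly from their definitions. This is a deliberately slow, unoptimized version intended only to show the arithmetic; real encoders use fast factorizations. The names `alpha` and `dct2_8x8` are illustrative, and 8-bit input samples are assumed.

```python
import math

N = 8

def alpha(k):
    """Normalization factor that makes the transform orthonormal."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct2_8x8(block):
    """Level-shift an 8x8 block of 8-bit samples by 128, then apply the 2D DCT-II."""
    shifted = [[block[y][x] - 128 for x in range(N)] for y in range(N)]
    coeffs = [[0.0] * N for _ in range(N)]
    for v in range(N):            # vertical frequency index
        for u in range(N):        # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (shifted[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * alpha(u) * alpha(v) * total
    return coeffs

# A flat block has no spatial variation, so only the DC term is non-zero.
flat = [[200] * N for _ in range(N)]
G = dct2_8x8(flat)
print(round(G[0][0], 2))
```

For the flat block every shifted sample is 72, and the DC coefficient is 8 times that mean, which illustrates why the top-left value tends to dominate.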
JPEG, DCT Fourier Coefficients square-wave synthesized using Fourier cosine coefficients and sine coefficients Page 13
DCT Basis Functions
The DCT transforms an 8×8 block of input values to a linear combination of these 64 patterns. The patterns are referred to as the two-dimensional DCT basis functions, and the output values are referred to as transform coefficients. The horizontal index is u and the vertical index is v.
(Figures: basis functions; the 8×8 sub-image)
Page 14
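The "linear combination of 64 patterns" view can be made concrete with a short sketch. It assumes the orthonormal DCT-II scaling; `basis` and `combine` are hypothetical helper names, and the two coefficient values are made up for illustration.

```python
import math

N = 8

def alpha(k):
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(u, v):
    """8x8 pattern for horizontal frequency u and vertical frequency v."""
    return [[0.25 * alpha(u) * alpha(v)
             * math.cos((2 * x + 1) * u * math.pi / 16)
             * math.cos((2 * y + 1) * v * math.pi / 16)
             for x in range(N)] for y in range(N)]

def combine(coeffs):
    """Weighted sum of basis patterns; coeffs[v][u] is the weight of basis(u, v)."""
    block = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            if coeffs[v][u] == 0:
                continue          # patterns with zero weight contribute nothing
            pattern = basis(u, v)
            for y in range(N):
                for x in range(N):
                    block[y][x] += coeffs[v][u] * pattern[y][x]
    return block

coeffs = [[0.0] * N for _ in range(N)]
coeffs[0][0] = 576.0   # DC term: sets the mean level of the block
coeffs[0][1] = 100.0   # lowest horizontal frequency: left-to-right ramp
block = combine(coeffs)
print([round(s, 1) for s in block[0]])
```

With only these two weights the block keeps a mean of 72 and varies smoothly from left to right while every row is identical, because no vertical frequency is present.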
JPEG, DCT Coefficients
DC coefficient (top-left corner, has large magnitude)
AC coefficients (the other 63 coefficients)
The DCT aggregates most of the signal in one corner: larger values in the top-left corner
DCT coefficients for our sample block (rounded to two decimal places)
Page 15
JPEG DCT Coefficients, Example The result of taking the DCT. The numbers in red are the coefficients that fall below the specified threshold of 10. Page 16
JPEG, DCT Histograms of DCT Coefficients Histograms of DCT coefficients of the image Lena, using blocks of 8×8 pixels Page 17
JPEG, Quantization Concept
The human eye is good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high-frequency brightness variation.
Small quantization step for low-frequency components (top-left corner of the DCT coefficient matrix)
Big quantization step for high-frequency components (bottom-right corner of the DCT coefficient matrix)
(Figures: DCT coefficients; sample images)
Page 18
JPEG, Quantization Matrix
A typical quantization matrix, as specified in the original JPEG standard
G: the unquantized DCT coefficients
Q: the quantization matrix
B: the quantized DCT coefficients
Page 19
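Quantization divides each coefficient in G by the matching entry of Q and rounds to the nearest integer to obtain B. The sketch below assumes that the matrix on the slide is the example luminance table from Annex K of the JPEG standard; the coefficient values in `G` are made up for illustration, and `quantize` / `dequantize` are hypothetical helper names.

```python
# Example luminance quantization table (JPEG standard, Annex K); assumed to be
# the "typical" matrix referred to on the slide.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(G, Q=Q_LUMA):
    """B[j][k] = round(G[j][k] / Q[j][k]). Python's round() sends exact .5 ties to even."""
    return [[round(G[j][k] / Q[j][k]) for k in range(8)] for j in range(8)]

def dequantize(B, Q=Q_LUMA):
    """Decoder side: multiply back. The rounding error is not recoverable."""
    return [[B[j][k] * Q[j][k] for k in range(8)] for j in range(8)]

G = [[0.0] * 8 for _ in range(8)]
G[0][0] = -415.38    # illustrative DC coefficient
G[0][1] = -30.19     # illustrative low-frequency AC coefficient
G[7][7] = 1.68       # illustrative high-frequency AC coefficient
B = quantize(G)
print(B[0][0], B[0][1], B[7][7])
```

The small high-frequency value is divided by a large step (99) and rounds to zero, while the low-frequency values survive with reduced precision. This step is where the loss in lossy JPEG occurs.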
JPEG, Quantization Sample Output Quantized DCT coefficients for our sample block. Many of the higher-frequency components are rounded to zero. Page 20
JPEG, Quantization Page 21
JPEG, Entropy Coding
DC coefficient: DPCM
AC coefficients: zigzag ordering, then run-length encoding (RLE)
Huffman coding is then applied to the whole sequence of numbers
Page 22
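The steps that precede Huffman coding can be sketched as follows. This is a simplified illustration: real JPEG also splits each value into a size category plus extra bits and uses a special code for runs longer than 15 zeros, neither of which is shown. All function names are hypothetical.

```python
def zigzag_indices(n=8):
    """(row, col) pairs in zigzag order, from the top-left to the bottom-right."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col of one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                   # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order

def dpcm_dc(dc_values):
    """Code each block's DC term as the difference from the previous block's DC."""
    prev, out = 0, []
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out

def rle_ac(ac):
    """(zero_run, value) pairs for the AC terms; 'EOB' marks a trailing run of zeros."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run > 0:
        pairs.append("EOB")
    return pairs

def scan_block(B):
    """Return the DC term and the run-length pairs of one quantized 8x8 block."""
    seq = [B[r][c] for r, c in zigzag_indices()]
    return seq[0], rle_ac(seq[1:])

B = [[0] * 8 for _ in range(8)]
B[0][0], B[0][1], B[2][0] = -26, -3, -3             # illustrative quantized block
print(scan_block(B))
print(dpcm_dc([-26, -24, -27]))
```

Zigzag ordering places the low-frequency coefficients first, so the zeros produced by quantization cluster at the end of the sequence and collapse into a single end-of-block marker.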
JPEG Encoder Example Page 23
JPEG Decoder Example Page 24
JPEG Compression Ratio Original JPEG Compressed Quality setting of 50 Difference (Darker means a larger difference) Page 25
JPEG Blocking Artifact Original JPEG Compressed Quality setting of 5 Page 26
JPEG, Block Splitting
Why Blocking? Blocks of 8 by 8 pixels
Neighboring pixels are more correlated
Lower computational complexity: the 2D DCT of an N by N image costs $O(N^2 \log_2 N)$, while the 2D DCT of all $(N/8)^2$ blocks of 8 by 8 pixels costs $\left(\frac{N}{8}\right)^2 \cdot O(8^2 \log_2 8) = O(N^2)$.
What about blocks of 16×16 pixels?
Padding: If the data for a channel does not represent an integer number of blocks, then the encoder must fill the remaining area of the incomplete blocks with some form of dummy data.
Page 27
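A minimal sketch of the padding step. The slide only says "some form of dummy data"; repeating the last column and the last row is one common choice and is an assumption here, as is the helper name `pad_to_blocks`.

```python
def pad_to_blocks(plane, block=8):
    """Extend a 2D plane so both dimensions are multiples of `block`.

    Edge samples are replicated (an assumed choice of dummy data), which avoids
    introducing an artificial sharp edge inside the incomplete blocks.
    """
    h, w = len(plane), len(plane[0])
    new_h = -(-h // block) * block        # ceiling to the next multiple of block
    new_w = -(-w // block) * block
    padded = [row + [row[-1]] * (new_w - w) for row in plane]   # repeat last column
    padded += [list(padded[-1]) for _ in range(new_h - h)]      # repeat last row
    return padded

plane = [[r * 10 + c for c in range(5)] for r in range(3)]      # 3 rows x 5 columns
padded = pad_to_blocks(plane)
print(len(padded), len(padded[0]))
```

The decoder knows the true image dimensions from the file header and simply crops the padded samples away after decoding.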
JPEG, Block Splitting
Larger Blocks
Pro: Less blocking artifact
Con: Less correlated data inside the block; higher computational complexity
Efficiency as a function of block size N×N, measured for 8-bit quantization in the original domain and equivalent quantization in the transform domain.
A block size of 8×8 is a good compromise between coding efficiency and complexity
Page 28
JPEG, Quantization Matrix Quality Factor
The quality setting of the encoder (for example 50 or 95 on a scale of 0 to 100 in the Independent JPEG Group's library) affects to what extent the resolution of each frequency component is reduced.
For a quality of 100%, the quantization tables should be set up such that all entries are one.
For a quality factor of 50%, the tables recommended by ITU/ISO are typically used, but any other choice is also valid.
For a quality between 50% and 100%, one may interpolate between the tables used for 50% and those for 100% (i.e. all entries 1.0).
Page 29
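One concrete way to turn a quality setting into a quantization table is the scaling rule used by the Independent JPEG Group's library, sketched here from memory of that library's behaviour rather than from the slide. It is linear between quality 50 (base table unchanged) and quality 100 (all entries 1), which matches the interpolation described above. `scale_quant_table` is a hypothetical name, and `BASE_ROW` is assumed to be the first row of the example luminance table.

```python
def scale_quant_table(base, quality):
    """Scale a base quantization table for a quality setting in 1..100."""
    quality = max(1, min(100, quality))          # settings outside 1..100 are clamped
    if quality < 50:
        scale = 5000 // quality                  # steps grow quickly at low quality
    else:
        scale = 200 - 2 * quality                # 100 at quality 50, 0 at quality 100
    # Integer rounding, then clamp each entry to 1..255 (baseline 8-bit tables).
    return [max(1, min(255, (q * scale + 50) // 100)) for q in base]

BASE_ROW = [16, 11, 10, 16, 24, 40, 51, 61]      # assumed first row of the luminance table

for quality in (25, 50, 75, 100):
    print(quality, scale_quant_table(BASE_ROW, quality))
```

Below quality 50 the steps become larger than the base table, which is why very low settings zero out most high-frequency coefficients.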
Multimedia Systems Image III (Compression, JPEG) Thank You Next Session: Video I FIND OUT MORE AT... 1. http://ce.sharif.edu/~m_amiri/ 2. http://www.dml.ir/ Page 30