Compression Part 2: Lossy Image Compression (JPEG)
2 General Compression Design Elements. (Diagram: application, application model, encoder with model for compression, decoder with model for decompression.) Models observe that sensors (image sensors, video streams, digital audio) generate information that is not perceived in typical applications. The application model describes the human visual system (for JPEG, MPEG) or the human auditory system (for MP3). That model is used to compress selectively, losing information that is not seen (JPEG, MPEG) or not heard (MP3). To capitalize on this and control what data is kept and lost, the spatial domain (JPEG, MPEG) or the time domain (MP3, MPEG) is transformed into the frequency domain. MPEG adds prediction based on past and future frames.
3 Standards. JPEG: Joint Photographic Experts Group; the Pink book is the standard. MPEG: Moving Picture Experts Group; MPEG 1, 2 (DVD), MPEG layer 3 (MP3) (DVD audio). Many other semi-standard and proprietary formats exist.
4 Joint Photographic Experts Group (JPEG): lossy compression modeled after the human visual system (HVS). The human eye has ~120 million rods and ~5-6 million cones.
5 Human Eye Spatial Frequency Response. Some spatial frequencies are less visible to the eye. The eye is less sensitive to high spatial frequencies in color than in luminance. JPEG takes advantage of this in choosing what information to lose.
6 Color and luminance are coded separately. The RGB color space has redundant information. YCrCb: color is mapped to two components whose information content can be reduced further with less perceptual impact; luminance is separate and can be treated differently. Low complexity: the color space transform is a simple linear conversion. Y = luminance (think monochrome); Cr and Cb are the two color channels.
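The linear conversion mentioned on this slide can be sketched as follows. This is a minimal illustration, assuming the full-range BT.601 weights commonly used with JPEG files (JFIF); the slide itself does not give the coefficients, and the function name `rgb_to_ycbcr` is made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range Y, Cb, Cr.

    Assumption: JFIF-style BT.601 weights; other standards use other weights.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # luminance
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b    # blue-difference
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b    # red-difference
    return y, cb, cr


# Neutral colors carry no chroma: Cb and Cr sit at the midpoint (128).
white = rgb_to_ycbcr(255, 255, 255)
gray = rgb_to_ycbcr(128, 128, 128)
red = rgb_to_ycbcr(255, 0, 0)
print(white, gray, red)
```

For any gray pixel the two color channels come out at 128, which is why a monochrome image can be carried by Y alone.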
7 Subsampling example
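As a rough sketch of what chroma subsampling does, the snippet below averages each 2x2 block of one color channel into a single value. Averaging, 2:1 reduction in both directions, and the name `subsample_2x2` are assumptions made for illustration; real encoders may use other filters and ratios.

```python
def subsample_2x2(channel):
    """Average each 2x2 block of a chroma channel (a list of equal-length rows).

    Assumption: width and height are even; a trailing odd row/column is dropped.
    """
    height, width = len(channel), len(channel[0])
    out = []
    for r in range(0, height - 1, 2):
        row = []
        for c in range(0, width - 1, 2):
            total = (channel[r][c] + channel[r][c + 1]
                     + channel[r + 1][c] + channel[r + 1][c + 1])
            row.append(total / 4.0)
        out.append(row)
    return out


cb = [[100, 102, 50, 52],
      [104, 106, 54, 56],
      [200, 200, 10, 10],
      [200, 200, 10, 10]]
small = subsample_2x2(cb)
print(small)  # [[103.0, 53.0], [200.0, 10.0]]
```

The 16 chroma samples shrink to 4, a 4:1 reduction for that channel before any transform coding happens.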
8 JPEG Block Diagram. The key is the use of the Discrete Cosine Transform (DCT) to map the spatial image to a frequency domain. Frequencies that are less visible are coarsely quantized or removed, then the remainder is losslessly coded via Huffman coding. A color transform is included, and the color channels are then downsampled to take advantage of the eye's lower spatial frequency response to color.
9 Discrete Cosine Transform (DCT): basis functions and sample reconstruction. The term marked * is the DC term; the rest are called AC terms.
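To make the DC/AC distinction concrete, here is a small, unoptimized sketch of the 8x8 forward DCT (type II, orthonormal scaling, which matches the scaling of the JPEG formula). The function name `dct_2d` is hypothetical, and the level shift by 128 that JPEG applies before the transform is left out for simplicity.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_2d(block):
    """Forward 8x8 DCT-II with orthonormal scaling (direct O(N^4) form)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


# A perfectly flat block has only a DC term; every AC term is (numerically) zero.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 800.0
```

With this scaling the DC term is 8 times the block average, and the AC terms measure how much each basis pattern is present in the block.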
10 JPEG Quantization Tables. Based on psychovisual threshold experiments. Luminance is not subsampled and gets lighter quantization; chrominance is subsampled 2:1 and gets heavier quantization. (Tables shown: luminance quantization table, chrominance quantization table.) Larger numbers quantize the DCT coefficient more heavily.
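The effect of "larger numbers quantize more heavily" can be sketched in a few lines. The table below is the widely reproduced example luminance table from the informative annex of the JPEG standard; whether it is the exact table shown on the slide is an assumption, and the coefficient values are invented for illustration.

```python
# Example luminance quantization table (JPEG standard, informative annex).
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[int(round(c / q)) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def dequantize(levels, table):
    """Decoder side: multiply the stored integers back by the table entries."""
    return [[lv * q for lv, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]


# Hypothetical coefficients: a DC term plus two AC terms of equal size.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 800.0  # DC
coeffs[0][1] = 30.0   # low-frequency AC
coeffs[7][7] = 30.0   # highest-frequency AC

levels = quantize(coeffs, LUMA_Q)
restored = dequantize(levels, LUMA_Q)
print(levels[0][0], levels[0][1], levels[7][7])        # 50 3 0
print(restored[0][0], restored[0][1], restored[7][7])  # 800 33 0
```

The two AC terms start out equal, but the low-frequency one survives (with a small error) while the high-frequency one is rounded to zero, which is where both the compression and the loss come from.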
11 Problem with high-frequency term quantization. Losing the high-frequency terms results in ringing: the edges can no longer be reconstructed exactly. For the math, see the Gibbs phenomenon.
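A one-dimensional toy version of the ringing problem: take a sharp edge, transform it with an 8-point DCT, throw away the upper half of the coefficients, and reconstruct. This is a sketch under simplifying assumptions (coefficients are zeroed outright rather than quantized, and the helper names are made up), but it shows the overshoot and undershoot around the edge.

```python
import math

N = 8


def _alpha(u):
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)


def dct_1d(x):
    """8-point forward DCT-II, orthonormal scaling."""
    return [_alpha(u) * sum(x[n] * math.cos((2 * n + 1) * u * math.pi / (2 * N))
                            for n in range(N))
            for u in range(N)]


def idct_1d(coeffs):
    """Inverse of dct_1d."""
    return [sum(_alpha(u) * coeffs[u] * math.cos((2 * n + 1) * u * math.pi / (2 * N))
                for u in range(N))
            for n in range(N)]


edge = [0, 0, 0, 0, 255, 255, 255, 255]   # a sharp black-to-white edge
coeffs = dct_1d(edge)
kept = coeffs[:4] + [0.0] * 4             # drop the four highest frequencies
ringing = idct_1d(kept)
print([round(v, 1) for v in ringing])
```

The reconstructed values dip below 0 on the dark side and rise above 255 on the bright side, and the edge itself is smeared over several samples. In an image this shows up as ripples next to sharp edges such as text.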
12 Lossy compression: quality? How do you compare different algorithms, or determine whether the loss is acceptable for your application? For lossless compression, just compare compression ratios. For lossy compression, you want to lose only information that is not important to the application. One method: the mathematical error between the original and the processed image, pixel by pixel, i.e. mean squared error (MSE). Sorry, wrong answer: MSE does not account for the perceptual side of the human visual system (HVS). Use a model of the perceptual part of the HVS? That has been tried, but we do not understand the HVS well enough to make a good model. That leaves human psycho-visual experiments: select images, process them, print or view original vs. processed, and rank with as many observers as you can get. Drawbacks: expensive labor and a controlled lab setup; population bias (researchers are very critical, others not critical enough); image type bias.
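The weakness of MSE can be illustrated with a tiny sketch: two very different distortions of the same flat patch produce exactly the same score. The pixel values and the name `mse` are invented for this example; which distortion a viewer would find more objectionable is a matter for observer experiments, and that is precisely what MSE cannot say.

```python
def mse(original, processed):
    """Mean squared error between two equal-length lists of pixel values."""
    return sum((a - b) ** 2 for a, b in zip(original, processed)) / len(original)


original = [100] * 16            # a flat 4x4 patch, stored row by row

spread = [102] * 16              # every pixel slightly brighter
spike = list(original)
spike[5] = 108                   # one pixel clearly wrong, the rest untouched

mse_spread = mse(original, spread)
mse_spike = mse(original, spike)
print(mse_spread, mse_spike)     # 4.0 4.0
```

A uniform brightness shift of 2 levels and a single isolated speck get the same MSE, even though they are very different errors visually.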
13 Test image example. Processing, printing or viewing, and ranking images all have a per-image cost: select the image, process it, set up printing or viewing, schedule observers, process the observations, and monitor and check each step. Test images therefore often contain a lot of content, to stress the system and cover a wide range of possible errors in a single image. You also need to use customer/market images with customers.
14 Test image example, compressed 48:1: original vs. compressed.
15 Comparison at ~48:1: original vs. compressed.
16 How to drive a high-quality, 150 ppm digital color press (iGen 150). Diagram labels: Free Flow Print Server; iGen 150; CMYK raster images & job control; iGen5. Press: up to 3.5M prints per month; ~$750k-$1M+ price range; 2400x2400 spi digital addressability. Free Flow Print Server: ~1 GByte/second raster generation and print; CMYK color space (600x600x48 bits to print engine). Need to achieve 10:1 guaranteed compression.
Xerox Multi-Mode Compression (XM2): an adaptive compression technology, with one compression technology for graphics and text and another for pictures.
19 XM2 Technology. Diagram: CMYK image file, 8x8 blocks, block analysis, then JPEG High, JPEG Low, or LZ77, then format. Each separation {C, M, Y, K} is processed independently (think of 4 luminance files). The block analyzer determines how to compress each block to meet image quality and compression constraints. Optimized Q tables: two levels for JPEG, selected by Huffman table tag. Selector map: 1 bit/block (LZ77).
21 Reading Material
22 Another aspect of the HVS: night adaptation. What colors do you see in dim light? Why are instruments lit with red lights at night? Examples shown: ship bridge in night mode; astronomy software in day mode vs. night mode.
23 MPEG Video Compression
24 MPEG Frame Encoding. I = DCT-encoded reference frame; no other frames are used. P = uses only previous frames for prediction. B = uses both previous and future frames for prediction.
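One consequence of B frames using future frames is that frames cannot be transmitted in display order: the later reference frame has to arrive before the B frames that depend on it. The sketch below reorders a display-order sequence accordingly. The frame pattern and the function name `coding_order` are illustrative assumptions, not something taken from the slide.

```python
def coding_order(display_types):
    """Return display indices in the order they must be coded/decoded.

    display_types: string of 'I', 'P', 'B' in display order.
    Each reference frame (I or P) is emitted before the B frames that
    precede it in display order, since those B frames predict from it.
    """
    order = []
    pending_b = []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)      # wait for the next reference frame
        else:
            order.append(index)          # reference frame goes first
            order.extend(pending_b)      # then the B frames it unlocks
            pending_b = []
    order.extend(pending_b)              # trailing B frames, if any
    return order


display = "IBBPBBP"
order = coding_order(display)
print(order)                             # [0, 3, 1, 2, 6, 4, 5]
print("".join(display[i] for i in order))  # IPBBPBB
```

The decoder therefore has to buffer reference frames and put pictures back into display order, which is part of the cost of the extra compression that B frames provide.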
25 Moving Picture Experts Group Layer III, aka MP3. Main path: audio signal, filter bank (32 sub-bands), MDCT (512 samples), non-uniform quantizer, Huffman encoding, side band encoding, stream formatting. Analysis path: FFT and psychoacoustic model, providing analysis and modeling for the quantizer. MDCT: modified discrete cosine transform, applied to the sub-band samples. FFT: fast Fourier transform, maps to the frequency domain for analysis. Raw CD audio is ~10 MB/minute; MP3 compression is typically ~10-11:1, reducing this to ~1 MB/minute.
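The data-rate figures on this slide can be checked with a few lines of arithmetic. CD audio parameters (44.1 kHz, 16-bit, stereo) are standard; the 128 kbit/s MP3 bitrate is an assumption chosen as a typical setting, since the slide does not name one.

```python
SAMPLE_RATE_HZ = 44_100      # CD audio sampling rate
CHANNELS = 2                 # stereo
BYTES_PER_SAMPLE = 2         # 16-bit samples
MP3_BITRATE_BPS = 128_000    # assumed typical MP3 bitrate (not from the slide)

raw_bytes_per_min = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * 60
mp3_bytes_per_min = MP3_BITRATE_BPS // 8 * 60
ratio = raw_bytes_per_min / mp3_bytes_per_min

print(raw_bytes_per_min)     # 10584000
print(mp3_bytes_per_min)     # 960000
print(round(ratio, 3))       # 11.025
```

That is about 10.6 MB per minute raw against just under 1 MB per minute compressed, consistent with the ~10-11:1 figure quoted above.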
26 Progressive vs. sequential modes: DC without LSB (0.088 b/p). Image is 512x512 = 262,144 bytes. Compressed 5.8:1.
27 Progressive vs. sequential modes: + AC 1-63 without 3 LSBs (0.283 b/p). Image is 512x512.
28 Progressive vs. sequential modes: + AC 1-63, 3rd LSB added (0.482 b/p). Image is 512x512.
29 Progressive vs. sequential modes: + AC 1-63, 2nd LSB added (0.482 b/p). Image is 512x512.
30 Progressive vs. sequential modes: + AC 1-63, LSB added (0.482 b/p). Image is 512x512.
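The bits-per-pixel (b/p) figures quoted for the progressive scans can be turned into byte counts and compression ratios with simple arithmetic. The sketch assumes an 8-bit-per-pixel source image, which is consistent with 512x512 = 262,144 bytes; the helper names are made up for this example.

```python
WIDTH = 512
HEIGHT = 512
RAW_BITS_PER_PIXEL = 8   # assumption: 8-bit image, so 512x512 = 262,144 bytes

PIXELS = WIDTH * HEIGHT


def compressed_bytes(bits_per_pixel):
    """Size in bytes of the image data sent so far at a given b/p."""
    return bits_per_pixel * PIXELS / 8


def compression_ratio(bits_per_pixel):
    """Raw size divided by compressed size."""
    return RAW_BITS_PER_PIXEL / bits_per_pixel


# Figures quoted on the slides for the first two progressive scans.
for bpp in (0.088, 0.283):
    print(bpp, round(compressed_bytes(bpp)), round(compression_ratio(bpp), 1))

# Going the other way: a 5.8:1 ratio corresponds to this many bits per pixel.
bpp_at_5_8 = RAW_BITS_PER_PIXEL / 5.8
print(round(bpp_at_5_8, 2))
```

The first scan (DC only, at 0.088 b/p) is under 3 KB yet already gives a recognizable preview, which is the point of the progressive mode.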