CS 335 Graphics and Multimedia. Image Compression


CCITT Image Storage and Compression
  - Group 3: Huffman-type encoding for binary (bilevel) data (fax)
  - Group 4: entropy encoding without the error checks of Group 3

The Joint Photographic Experts Group (JPEG)
  - The International Organization for Standardization (ISO) and a subgroup of CCITT combined in 1987

Why Compress?
  - Save storage space
  - Reduce transmission time
  - Process data to provide error checking in transmission
  - Re-represent image data to provide progressive transmission capability

Characteristics of Compression
  - Lossy vs. lossless compression
      Lossy: the decompressed image differs from the original (data is permanently lost)
      Lossless: the decompressed image is always exactly the same as the original
  - Compression ratio
      Ratio of original data size to compressed size, e.g. 14:1
  - Symmetric vs. asymmetric compression
      Symmetric: compression and decompression take comparable effort
      Asymmetric: one direction (usually compression) is much more expensive than the other
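The lossless property and the compression ratio defined above can be checked directly. The following is a minimal sketch that uses Python's standard `zlib` module as a stand-in lossless codec; the 64x64 test image and the helper name `compression_ratio` are assumptions made for illustration, not part of the lecture.

```python
import zlib

def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Ratio of original size to compressed size (14.0 means 14:1)."""
    return len(original) / len(compressed)

# Hypothetical 64x64 8-bit image: flat gray background with one bright row.
pixels = bytearray([128] * (64 * 64))
pixels[64 * 10 : 64 * 11] = bytes([255] * 64)
original = bytes(pixels)

compressed = zlib.compress(original)   # lossless (DEFLATE) coding
restored = zlib.decompress(compressed)

# Lossless: the round trip reproduces the input exactly.
assert restored == original

ratio = compression_ratio(original, compressed)
print(f"{len(original)} bytes -> {len(compressed)} bytes, about {ratio:.0f}:1")
```

A highly regular image like this one compresses very well; a noisy photograph would give a much lower ratio with the same codec.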

Image Formats and Compression
  - Most image storage formats are based on some form of compression
  - Storing an image requires that an application understand the format
  - Codec (coder/decoder): the component that encodes and decodes a given format
  - Codecs are chosen based on the application and the desired amount of compression

Image Format Components
  - Header
      image size (spatial dimensions)
      image intensity range
      color representation (color space)
      compression scheme parameters
  - Body
      image data in the format indicated by the header
  - Raw image data with no header is ambiguous!
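To make the header/body split concrete, here is a small sketch that writes and reads binary PGM (Netpbm "P5"), a simple uncompressed greyscale format whose header carries the dimensions and intensity range. The function names `write_pgm` and `read_pgm` are illustrative, and the reader is deliberately simplified: it assumes a well-formed file and ignores PGM comment lines.

```python
def write_pgm(width: int, height: int, pixels: bytes, maxval: int = 255) -> bytes:
    """Header (magic, size, intensity range) followed by the raw body."""
    if len(pixels) != width * height:
        raise ValueError("body size does not match header dimensions")
    header = f"P5\n{width} {height}\n{maxval}\n".encode("ascii")
    return header + pixels

def read_pgm(data: bytes):
    """Parse the four whitespace-separated header tokens, then the body."""
    tokens, pos = [], 0
    while len(tokens) < 4:
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1                      # skip whitespace between tokens
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1                      # consume one token
        tokens.append(data[start:pos])
    magic, width, height, maxval = tokens
    if magic != b"P5":
        raise ValueError("not a binary PGM file")
    width, height, maxval = int(width), int(height), int(maxval)
    body = data[pos + 1 : pos + 1 + width * height]  # one separator byte, then pixels
    return width, height, maxval, body

body = bytes([0, 64, 128, 192, 255, 32])
print(read_pgm(write_pgm(3, 2, body)))
```

Without the header, the same six bytes could equally be a 3x2, 2x3, 6x1, or 1x6 image, which is the ambiguity the slide warns about.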

Image Compression Taxonomy
  Entropy coding (ignores data semantics)
      Run-length coding
      Huffman coding
      Arithmetic coding
      Lempel-Ziv-Welch (LZW) coding
  Source coding (exploits data semantics)
      Prediction
          Lossless Predictive Coding (LPC)
          Differential Coding (DC)
          Delta Modulation (DM)
          Differential Pulse Code Modulation (DPCM)
      Transformation
          Fast Fourier Transform (FFT)
          Discrete Cosine Transform (DCT)
          Wavelet Transform
      Layered
          Bit plane coding
          Subband coding
          Subsampling
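As an illustration of the prediction family in the taxonomy above, the sketch below shows the simplest lossless case: differential coding, where each sample is predicted to equal its predecessor and only the prediction error is stored. The function names and the sample scan line are assumptions for illustration.

```python
def delta_encode(samples):
    """Store each sample as its difference from the previous one."""
    previous, deltas = 0, []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    total, samples = 0, []
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples

row = [100, 101, 103, 103, 102, 104]   # hypothetical scan line
deltas = delta_encode(row)
print(deltas)                          # [100, 1, 2, 0, -1, 2]
assert delta_decode(deltas) == row     # lossless round trip
```

Differential coding does not shrink the data by itself. It exploits the fact that neighboring pixels are similar, so the differences cluster near zero and a later entropy coder can represent them with short codes.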

Image Compression Taxonomy (continued)
  Hybrid coding (methods that use combined techniques)
      JPEG
      MPEG
      H.261...
  Most compression strategies have a lossy and a lossless form

Entropy Coding
  - Key idea: reduce coding redundancy by representing the most common symbols with the fewest bits
  - Huffman code (Huffman, 1951): construct uniquely decodable codes that represent frequent symbols with short bit patterns and infrequent symbols with long bit patterns
  - The resulting code is of variable length

Huffman Coding: source reduction

  Original source          Source reduction
  Symbol  Probability      1      2      3      4
  A2      0.4              0.4    0.4    0.4    0.6
  A6      0.3              0.3    0.3    0.3    0.4
  A1      0.1              0.1    0.2    0.3
  A4      0.1              0.1    0.1
  A3      0.06             0.1
  A5      0.04

Huffman Coding: code assignment

  Original source                 Source reduction
  Symbol  Prob.  Code      1            2           3          4
  A2      0.4    1         0.4  1       0.4  1      0.4  1     0.6  0
  A6      0.3    00        0.3  00      0.3  00     0.3  00    0.4  1
  A1      0.1    011       0.1  011     0.2  010    0.3  01
  A4      0.1    0100      0.1  0100    0.1  011
  A3      0.06   01010     0.1  0101
  A5      0.04   01011
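The reduction-and-assignment procedure in the table above can be sketched with a priority queue: repeatedly merge the two least probable entries, prefixing one group's codes with 0 and the other's with 1. This is a minimal sketch, not the lecture's own implementation. Ties are broken by insertion order here, so the individual bit patterns may differ from the table, but every Huffman code for this source has the same average length.

```python
import heapq
import math
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: bit string} built by repeated source reduction."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # two least probable entries
        p1, _, group1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

source = {"A2": 0.4, "A6": 0.3, "A1": 0.1, "A4": 0.1, "A3": 0.06, "A5": 0.04}
codes = huffman_code(source)

average_length = sum(source[s] * len(codes[s]) for s in source)
entropy = -sum(p * math.log2(p) for p in source.values())

for symbol in source:
    print(symbol, source[symbol], codes[symbol])
print(f"average length {average_length:.2f} bits/symbol, entropy {entropy:.2f} bits/symbol")
```

For this source the average code length is 2.2 bits per symbol, compared with 3 bits for a fixed-length code over six symbols and an entropy of about 2.14 bits per symbol.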

The Fourier Transform Compute Forward Transform

The Fourier Transform

Hybrid Compression Schemes
  - Combine a number of different techniques in sequence
  - The algorithm (the order in which the techniques are combined) is very important
  - Tuned to particular kinds of data

The JPEG Compression Algorithm
  - The basis of the algorithm is transform coding based on the Discrete Cosine Transform (DCT)
  - Quantization of the transformed data introduces permanent loss
  - The amount of loss is controlled by the quantizer step size applied to the transform values

The JPEG Compression Algorithm

JPEG Compression
  Step 1: Color space
      Can encode color channels separately
      Can transform to a luminance/chrominance color space better suited to compression (YUV; JFIF files use the closely related YCbCr)
      Greyscale images need only encode 1 band
  Step 2: Partition image into blocks based on blocking factor
      Standard blocking factor is 8x8
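Steps 1 and 2 above can be sketched as follows. The conversion uses the full-range YCbCr equations from the JFIF file format, which is one common choice rather than something the slide specifies. The helper names are illustrative, and the block splitter assumes the image dimensions are multiples of the blocking factor (real encoders pad the edges).

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range JFIF conversion for 8-bit samples (results are floats)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def split_into_blocks(channel, n=8):
    """Split a 2D list (rows of samples) into n x n blocks, row-major order."""
    height, width = len(channel), len(channel[0])
    if height % n or width % n:
        raise ValueError("this sketch assumes dimensions are multiples of n")
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, n):
            blocks.append([row[left:left + n] for row in channel[top:top + n]])
    return blocks

# Hypothetical 16x16 luminance channel: a simple diagonal gradient.
channel = [[(x + y) * 8 for x in range(16)] for y in range(16)]
blocks = split_into_blocks(channel)
print(len(blocks), "blocks of", len(blocks[0]), "x", len(blocks[0][0]))
print(rgb_to_ycbcr(255, 0, 0))
```

Separating luminance from chrominance matters because the chrominance channels can usually be quantized or subsampled more aggressively with little visible effect.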

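JPEG Compression
  Step 3: Intensity value shift
      For a particular channel, shift the range so that it is centered around 0
  Step 4: The Discrete Cosine Transform
      Apply the DCT to the current 8x8 pixel block:
      $F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$, where $C(0) = 1/\sqrt{2}$ and $C(k) = 1$ for $k > 0$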
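Steps 3 and 4 above can be sketched directly from the standard 8x8 forward DCT definition. This is a straightforward, unoptimized evaluation of the double sum, assuming 8-bit samples so that the level shift is 128. Production codecs use fast factorizations instead.

```python
import math

N = 8

def level_shift(block, bits=8):
    """Center the sample range around 0 (subtract 128 for 8-bit data)."""
    offset = 1 << (bits - 1)
    return [[sample - offset for sample in row] for row in block]

def dct_2d(block):
    """Direct 8x8 forward DCT: F(u,v) = 1/4 C(u) C(v) sum f(x,y) cos(.) cos(.)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs

# A flat block has all of its energy in the DC coefficient F(0,0).
flat = [[138] * N for _ in range(N)]
coeffs = dct_2d(level_shift(flat))
print(round(coeffs[0][0], 6))   # 80.0, i.e. 8 * (138 - 128)
```

Smooth image blocks behave similarly: most of the energy lands in a few low-frequency coefficients near the top-left corner, which is what makes the later quantization step effective.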

JPEG Compression
  Step 5: Apply quantizer
      The 8x8 block is quantized with an 8x8 table of quantization values
      These values are a function of the desired quality factor
  Step 6: Value reordering
      Reorder the quantized values to achieve greater compression ratios in the subsequent steps
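Steps 5 and 6 above can be sketched as follows. The quantization table built by `make_table` is a hypothetical one invented for this example (step sizes that grow with frequency), not a table from the JPEG standard. The reordering shown is the zigzag scan used by baseline JPEG, which visits coefficients from low to high frequency so that the zeros produced by quantization end up in long runs.

```python
def make_table(step):
    """Hypothetical quantization table: coarser steps at higher frequencies."""
    return [[step * (1 + u + v) for v in range(8)] for u in range(8)]

def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round (the lossy step)."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)] for u in range(8)]

def dequantize(levels, table):
    """Decoder side: multiply back; the rounding error is not recoverable."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]

def zigzag_order(n=8):
    """(row, col) pairs along anti-diagonals, alternating direction."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )

def zigzag(block):
    return [block[r][c] for r, c in zigzag_order(len(block))]

# Hypothetical DCT output: a strong DC term, two low-frequency terms, small noise.
coeffs = [[3.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0] = 80.0, -33.0, 12.0

table = make_table(step=10)
levels = quantize(coeffs, table)
print(zigzag(levels)[:8])            # [8, -2, 1, 0, 0, 0, 0, 0]
print(dequantize(levels, table)[0][:2])  # [80, -40]: -33 came back as -40
```

A larger `step` (lower quality) zeroes more coefficients and lengthens the zero runs, at the cost of a larger reconstruction error.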

JPEG Compression
  Step 7: Run-length coding
      Run-length code the block (compresses the 0 values that occur after the quantizer is applied)
  Step 8: Entropy/Huffman coding
      Entropy code the run-length coded stream
  Decompression simply inverts these steps; loss occurs at the quantization step and depends on the 8x8 table of quantization values
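Step 7 above can be sketched with a simplified run-length scheme: each nonzero value is stored with the number of zeros that precede it, and an end-of-block marker replaces the trailing zeros. This is a simplification for illustration. Baseline JPEG codes AC coefficients as (run, size) symbols followed by amplitude bits and handles the DC coefficient separately; the `EOB` marker name and the sample values here are assumptions.

```python
EOB = "EOB"  # illustrative end-of-block marker

def rle_encode(values):
    """Emit (zero_run, value) for each nonzero value, then an EOB marker."""
    pairs, run = [], 0
    for value in values:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append(EOB)          # trailing zeros are implied by the marker
    return pairs

def rle_decode(pairs, length=64):
    """Expand the pairs and pad with zeros up to the block length."""
    values = []
    for item in pairs:
        if item == EOB:
            break
        run, value = item
        values.extend([0] * run)
        values.append(value)
    values.extend([0] * (length - len(values)))
    return values

# Hypothetical reordered block: a few low-frequency values, then zeros.
scan = [8, -2, 1, 0, 0, 3] + [0] * 58
encoded = rle_encode(scan)
print(encoded)                        # [(0, 8), (0, -2), (0, 1), (2, 3), 'EOB']
assert rle_decode(encoded) == scan    # this step is lossless
```

The 64 values collapse to four pairs and a marker, and those symbols are what the final entropy coder (Step 8) assigns variable-length codes to.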

Examples of JPEG Compression

MPEG: Video
  - Exploit frame-to-frame coherence
  - Encode index frames (I frames, i.e. intra-coded frames) similarly to JPEG (the blocking factor is potentially different)
  - The coder has control over I frame coding, I frame frequency, and the B/P (bidirectional/predicted) frame differencing algorithm
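The idea of exploiting frame-to-frame coherence can be illustrated with a toy scheme: store a complete frame at a fixed interval and store every other frame as its difference from the previous frame. This is only a sketch of the differencing idea under simplifying assumptions. It has no motion compensation, no bidirectional (B) frames, and no DCT or quantization, all of which real MPEG coders use; the names and the `i_frame_interval` parameter are illustrative.

```python
def encode_sequence(frames, i_frame_interval=4):
    """Tag each frame as 'I' (stored whole) or 'P' (difference from previous)."""
    encoded = []
    for index, frame in enumerate(frames):
        if index % i_frame_interval == 0:
            encoded.append(("I", list(frame)))
        else:
            previous = frames[index - 1]
            encoded.append(("P", [a - b for a, b in zip(frame, previous)]))
    return encoded

def decode_sequence(encoded):
    """Rebuild frames by adding each difference to the previously decoded frame."""
    frames = []
    for kind, data in encoded:
        if kind == "I":
            frames.append(list(data))
        else:
            frames.append([p + d for p, d in zip(frames[-1], data)])
    return frames

# Hypothetical 1D "frames": a bright pixel moving right over a flat background.
frames = [
    [10, 200, 10, 10, 10],
    [10, 10, 200, 10, 10],
    [10, 10, 10, 200, 10],
]
encoded = encode_sequence(frames)
print(encoded[1])                      # ('P', [0, -190, 190, 0, 0])
assert decode_sequence(encoded) == frames
```

Most entries in a difference frame are zero when little changes between frames, so the difference compresses far better than the frame itself. A shorter I frame interval costs more bits but limits error propagation and makes random access easier.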