Compression and Standards
Gail Reynard

In this lecture...
- introduction to compression
- text compression
  - Huffman
  - LZW
- image compression
  - GIF
  - TIFF
  - JPEG

The Need for Compression
- multimedia data volume > available bandwidth
- cost
- need reduced volume
- need reduced bandwidth

Compression Principles
- source encoders and destination decoders
  - software or hardware
- lossless (reversible) and lossy compression

Entropy Encoding
- lossless
- independent of the type of information
- just concerned with how information is represented
- examples:
  - Run-length
  - Statistical
- can be used separately or together

Run-length Encoding
- applications: source information consisting of long substrings of the same character or binary digit
- information transmitted as a set of codewords indicating:
  - character or bit
  - number of occurrences
- e.g. 000000011111111110000011 is sent as
  - 0, 7, 1, 10, 0, 5, 1, 2
  - OR, if the string is always taken to start with zeros: 7, 10, 5, 2
- destination must know the set of codewords used!
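
The run-length example above can be sketched in a few lines of Python. This is only an illustration: the function names (`rle_pairs`, `rle_counts`, `rle_decode_counts`) are made up for this sketch, and the convention of emitting a zero-length first run when the string starts with 1 is an assumption, not something stated on the slide.

```python
from itertools import groupby


def rle_pairs(bits):
    """Encode a bit string as (bit, run length) pairs."""
    return [(int(bit), len(list(run))) for bit, run in groupby(bits)]


def rle_counts(bits):
    """Encode as run lengths only, assuming the first run is always zeros.

    Assumption for this sketch: if the string starts with 1, a run of
    zero 0s is emitted first so the decoder stays in step.
    """
    counts = [length for _, length in rle_pairs(bits)]
    if bits.startswith("1"):
        counts.insert(0, 0)
    return counts


def rle_decode_counts(counts):
    """Rebuild the bit string from run lengths, starting with zeros."""
    bit = "0"
    out = []
    for count in counts:
        out.append(bit * count)
        bit = "1" if bit == "0" else "0"
    return "".join(out)


# The bit string from the slide: 7 zeros, 10 ones, 5 zeros, 2 ones.
bits = "0" * 7 + "1" * 10 + "0" * 5 + "1" * 2
print(rle_pairs(bits))   # [(0, 7), (1, 10), (0, 5), (1, 2)]
print(rle_counts(bits))  # [7, 10, 5, 2]
```

Both forms only work if the destination knows which convention is in use, which is the point of the last bullet on the slide.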
Statistical Encoding
- symbols often do not occur with the same frequency
- uses a set of variable-length codewords
  - shortest codewords represent the most frequently occurring symbols
  - destination must know the set of codewords used
  - to ensure clear codeword boundaries, a prefix property is used

Source Encoding
- exploits a particular property of the source information
- examples:
  - Differential
  - Transform

Differential Encoding
- used where the amplitude of a value or signal covers a large range BUT the difference between successive values/signals is relatively small
- uses a set of smaller codewords
  - each indicates only the difference in amplitude between the current and previous value/signal
- can be lossless or lossy

Transform Encoding
- involves transforming the source information from one form into another
  - new form lends itself more readily to compression
- consider: continuous tone monochromatic image
  - produces a 2-D matrix of pixel values
    - level of grey
    - position of pixel

[Figure: Example pixel patterns. Pixel amplitude against horizontal position along a line, for a low spatial frequency pattern and a high spatial frequency pattern.]

Transform Encoding (2)
- transforms the original spatial form of representation into an equivalent representation involving spatial frequency components
  - the eye is less sensitive to higher spatial frequency components, so these can be eliminated later
- Discrete Cosine Transform (DCT)
  - a mathematical transformation of a 2-D matrix of pixel values into an equivalent matrix of spatial frequency components
  - lossless
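
To make the DCT bullets above concrete, here is a minimal pure-Python sketch of a forward and inverse 2-D DCT on a square block. The function names and the orthonormal scaling (the form usually quoted for 8x8 JPEG blocks) are choices made for this sketch; the slides do not give a formula. It is written for clarity, not speed.

```python
import math


def _scale(k):
    """Scaling factor: 1/sqrt(2) for the zero-frequency term, else 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def _basis(i, k, n):
    """Cosine basis value for sample index i and frequency index k."""
    return math.cos((2 * i + 1) * k * math.pi / (2 * n))


def dct2(block):
    """Forward 2-D DCT of an n x n matrix of pixel values."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += block[x][y] * _basis(x, u, n) * _basis(y, v, n)
            out[u][v] = (2 / n) * _scale(u) * _scale(v) * total
    return out


def idct2(coeffs):
    """Inverse 2-D DCT: recovers the pixel values from the coefficients."""
    n = len(coeffs)
    out = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (_scale(u) * _scale(v) * coeffs[u][v]
                              * _basis(x, u, n) * _basis(y, v, n))
            out[x][y] = (2 / n) * total
    return out


# A flat grey block has no spatial variation, so only the zero-frequency
# coefficient (top-left) is non-zero; here it is approximately 800.
flat = [[100] * 8 for _ in range(8)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 6))
```

Applying `idct2` to the output of `dct2` returns the original block up to floating-point rounding, which is the sense in which the transform itself is lossless. The loss in JPEG comes from the later quantization stage.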
Text Compression
- must be lossless
- entropy encoding used
  - statistical in practice
- Huffman and arithmetic coding algorithms use single characters for deriving an optimum set of codewords
- Lempel-Ziv (LZ) algorithm uses variable-length strings of characters

Static Huffman Coding
- character string analysed
  - character types and frequency determined
- unbalanced tree created: the Huffman code tree
- set of codewords associated with tree
- both transmitter and receiver must know codewords

Dynamic Huffman Coding
- transmitter and receiver build the tree (and codeword table) dynamically (during transmission)
- if character present in tree
  - codeword determined and sent
- if character not present in tree
  - character transmitted in uncompressed form
- encoder updates its tree
- receiver can carry out same modifications to its tree

Lempel-Ziv Coding
- uses strings of characters for the coding operation
- table of all possible character strings held at both encoder and decoder
- encoder sends index of where word is stored
- known as a dictionary-based compression algorithm

Lempel-Ziv-Welch (LZW) Coding
- encoder and decoder build the dictionary contents dynamically
  - initially, dictionary contains only the character set used to create the text
  - remaining entries built dynamically

Image Compression
- two basic types of image:
  - Digitized
    - displayed as a 2-D matrix of pixels
    - stored as a 2-D matrix of pixels
  - Computer generated (graphical)
    - displayed as a 2-D matrix of pixels
    - stored in the form of a program
    - requires considerably less memory and bandwidth
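
The LZW slide above says that encoder and decoder both start from the character set and build the rest of the dictionary as they go. A minimal sketch of that idea follows. The function names, the `alphabet` parameter, and the choice to let the dictionary grow without limit are assumptions for illustration; real LZW variants cap the dictionary size and pack codes into bits.

```python
def lzw_encode(text, alphabet):
    """Return a list of dictionary indices for text.

    Assumes every character of text appears in alphabet.
    """
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the matched string
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)  # new entry, built dynamically
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes, alphabet):
    """Rebuild the text, growing the same dictionary as the encoder did."""
    dictionary = {i: ch for i, ch in enumerate(alphabet)}
    if not codes:
        return ""
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # The code refers to the entry the encoder had only just created.
            entry = previous + previous[0]
        out.append(entry)
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return "".join(out)


codes = lzw_encode("ABABABA", "AB")
print(codes)                    # [0, 1, 2, 4]
print(lzw_decode(codes, "AB"))  # ABABABA
```

Only the indices are sent. The decoder never receives the dictionary itself, because it can reconstruct each new entry from the codes it has already seen.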
Image Compression (2)
- graphical images
  - if transferred in program form: lossless compression only
  - if transferred in bitmap form: lossy compression can be used
- digitized images
  - different compression algorithms used
    - combination of run-length and statistical: lossless, used for digitized documents
    - combination of transform, differential and run-length

Graphics Interchange Format (GIF)
- used with graphical images
- images comprising 24-bit pixels are supported
- but closest 256 colours chosen
- table of colours used
  - 256 entries of 24-bit colour values
  - whole image: global colour table
  - part image: local colour table
- compression ratio :1

Tagged Image File Format (TIFF)
- supports pixel resolutions of up to 48 bits
- for images and digitized documents
  - therefore different formats can be used!
    - Uncompressed (code number 1)
    - LZW (code number 5)
    - code numbers 2, 3, 4 are for digitized documents

Joint Photographic Experts Group (JPEG)
- used with digitized pictures
- a standard which defines a range of different compression modes
- e.g. lossy sequential mode (a.k.a. baseline mode)
  - intended for compression of monochromatic and colour digitized images
- typical compression ratios: between 10:1 and 20:1

JPEG Lossy Sequential Mode
- five main stages:
  - Image/block preparation
    - source image might be represented by one matrix, e.g. a 2-D matrix for a continuous tone monochrome image
    - or more than one matrix, e.g. 3 matrices to store the R, G and B values of an image
    - matrix is split into blocks (block preparation)
  - Forward DCT
    - each block transformed by DCT

JPEG Lossy Sequential Mode (2)
  - Quantization
    - transformed matrix is taken
    - spatial frequency coefficients whose amplitudes are below a defined threshold are dropped
    - eye less sensitive to these, but they are lost forever!
    - threshold values vary for each coefficient and are held in a 2-D matrix: the quantization table
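
The quantization stage above can be sketched as follows. In baseline JPEG the thresholding is normally realised by dividing each DCT coefficient by the matching quantization table entry and rounding, so a coefficient smaller than about half its table entry becomes zero. The coefficient and table values below are invented for illustration (and use a 4x4 block to keep the example short); they are not taken from the slides or from any standard table.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Decoder side: multiply back. Coefficients rounded to 0 stay lost."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


# Hypothetical DCT coefficients: large values top-left (low frequencies).
coeffs = [[800, 35, -7, 3],
          [-40, 12, 4, -2],
          [9, -5, 2, 1],
          [4, 2, -1, 0]]

# Hypothetical quantization table: larger entries for higher frequencies.
table = [[16, 11, 10, 16],
         [12, 12, 14, 19],
         [14, 13, 16, 24],
         [14, 17, 22, 29]]

levels = quantize(coeffs, table)
print(levels)
# [[50, 3, -1, 0], [-3, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
print(dequantize(levels, table))
# [[800, 33, -10, 0], [-36, 12, 0, 0], [14, 0, 0, 0], [0, 0, 0, 0]]
```

Comparing the last output with `coeffs` shows where the loss occurs: the small high-frequency coefficients have gone to zero and cannot be recovered by the decoder.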
JPEG Lossy Sequential Mode (3)
  - Entropy encoding
    - four steps
      - vectoring: convert 2-D matrix to 1-D vector
      - differential
      - run-length
      - Huffman
  - Frame building
    - encapsulate all information relating to an encoded image/picture in a defined format

JPEG Encoder
[Figure: JPEG encoder block diagram. Source image/picture -> image/block preparation (image preparation, block preparation) -> forward DCT -> quantization (quantizer) -> entropy encoding (vectoring, differential, run-length, Huffman) -> frame builder -> encoded bitstream.]

JPEG Encoder Output Bitstream Format
[Figure: output bitstream format drawn at three levels (Level 1, Level 2, Level 3). Level 1 spans start-of-frame, the frame contents and end-of-frame. At the lowest level each block is a set of Huffman codewords: a DC value, then (skip, value) pairs, then end of block.]

JPEG Decoding
[Figure: JPEG decoder block diagram. Encoded bitstream -> frame decoder -> Huffman decoding -> differential and run-length decoding -> dequantizer -> inverse DCT -> image builder -> memory or video RAM.]

Summary
- compression principles
- lossless and lossy compression
- some compression techniques
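
As an illustration of the vectoring and run-length steps of JPEG entropy encoding described earlier, the sketch below scans a quantized block in zigzag order and then turns the resulting vector into a DC value plus (skip, value) pairs. The function names are invented for this sketch, the example vector is made up, and using (0, 0) as the end-of-block marker is the usual JPEG convention rather than something stated on the slides. The differential and Huffman steps are left out.

```python
def zigzag(block):
    """Convert an n x n matrix to a 1-D vector in zigzag order.

    Entries are visited one anti-diagonal at a time, alternating
    direction, so low-frequency coefficients come first.
    """
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )
    return [block[r][c] for r, c in order]


def run_length_ac(vector):
    """Split a zigzag vector into its DC value and (skip, value) pairs.

    skip counts the zeros preceding each non-zero coefficient. Trailing
    zeros are replaced by a single end-of-block pair, assumed here to
    be (0, 0).
    """
    dc = vector[0]
    pairs = []
    skip = 0
    for value in vector[1:]:
        if value == 0:
            skip += 1
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append((0, 0))  # end of block
    return dc, pairs


small = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(zigzag(small))  # [1, 2, 4, 7, 5, 3, 6, 8, 9]

vector = [12, 6, 0, 0, 3, 0, 0, 0, 0]  # hypothetical quantized coefficients
print(run_length_ac(vector))  # (12, [(0, 6), (2, 3), (0, 0)])
```

The zigzag order matters because quantization leaves most non-zero values near the top-left of the block, so the vector tends to end in a long run of zeros that collapses into the single end-of-block marker.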