Lecture 7: Huffman Code

Lossless Image Compression: Huffman Code Application
A simple application of Huffman coding to image compression:
- Generate a Huffman code for the set of values that any pixel may take.
- For monochrome images, this set usually consists of the integers from 0 to 255.
Huffman Code Application
Steps for lossless image compression:
1. Generate a Huffman code for the set of pixel values in the image.
2. Encode the image using this Huffman code.
3. Store the encoded image (together with the Huffman code) in a file.
4. Determine the compression ratio = number of bytes (uncompressed) / number of bytes (compressed).
The original (uncompressed) image representation uses 8 bits/pixel. The image consists of 256 rows of 256 pixels, so the uncompressed representation uses 65,536 bytes.
Notes:
- The number of bytes in the compressed representation includes the number of bytes needed to store the Huffman code.
- The compression ratio is different for different images.
/8/06 LECTURES
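The steps above can be sketched in code. This is a minimal, hypothetical illustration (the tiny "image" and its pixel values are made up, not one of the test images from the tables): it builds Huffman codeword lengths from the pixel-value frequencies and estimates the compression ratio from those lengths.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Build a Huffman code over symbols with the given frequencies.
    Returns {symbol: codeword length in bits}."""
    # Heap entries: (weight, unique tiebreaker, {symbol: depth so far})
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:  # degenerate one-symbol image
        return {s: 1 for s in freqs}
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)   # two least probable subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}  # one level deeper
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Toy 4x4 "image" of 8-bit pixel values (hypothetical data)
pixels = [12, 12, 12, 13, 12, 12, 13, 14,
          12, 13, 14, 14, 12, 12, 12, 200]
freqs = Counter(pixels)
lengths = huffman_code_lengths(freqs)
compressed_bits = sum(lengths[p] for p in pixels)
uncompressed_bits = 8 * len(pixels)
ratio = uncompressed_bits / compressed_bits
```

On this toy data the dominant value 12 gets a 1-bit codeword, so the estimated ratio is large; on real images the ratios are the modest values shown in the tables that follow.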
Huffman Code Application

Huffman (lossless JPEG) compression based on pixel values:
Image Name | Bits/Pixel | Total Size (bytes) | Compression Ratio
Sena   | 7.01 | 57,504 | 1.14
Sensin | 7.49 | 61,430 | 1.07
Earth  | 4.94 | 40,534 | 1.62
Omaha  | 7.12 | 58,374 | 1.12

Huffman compression based on the difference value between a pixel and its neighbor:
Image Name | Bits/Pixel | Total Size (bytes) | Compression Ratio
Sena   | 4.02 | 32,968 | 1.99
Sensin | 4.70 | 38,541 | 1.70
Earth  | 4.13 | 33,880 | 1.93
Omaha  | 6.42 | 52,643 | 1.24
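The gain in the second table comes from Huffman-coding prediction residuals rather than raw pixel values. A minimal sketch of one such residual, assuming the simple convention that each pixel is predicted by its left neighbor (the first pixel of each row predicted from 0):

```python
def row_differences(image):
    """Replace each pixel by its difference from the left neighbor.
    The residual distribution is usually far more peaked around 0
    than the raw pixel values, so its Huffman code is shorter on average."""
    out = []
    for row in image:
        prev = 0  # first pixel in a row predicted from 0 (one common convention)
        drow = []
        for p in row:
            drow.append(p - prev)
            prev = p
        out.append(drow)
    return out

image = [[100, 101, 103, 103],
         [101, 102, 102, 104]]
diffs = row_differences(image)
# diffs == [[100, 1, 2, 0], [101, 1, 0, 2]]
```

Most residuals land in a small range around 0, so a Huffman code over differences assigns short codewords to the common small values.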
Image Name | Bits/Pixel | Total Size (bytes) | Compression Ratio
Sena   | 3.93 | 32,261 | 2.03
Sensin | 4.63 | 37,896 | 1.73
Earth  | 4.82 | 39,504 | 1.66
Omaha  | 6.39 | 52,321 | 1.25
Huffman compression based on pixel difference values and an adaptive model.

- Notice that there is little difference between the performance of the adaptive Huffman coder and the static Huffman coder.
- The adaptive Huffman coder can be used as an online or real-time coder, which makes it a more attractive option in many applications.
- However, the adaptive Huffman coder is more susceptible to errors and may also be more difficult to implement.
- In the end, the particular application will determine which approach is more suitable.
Text Compression
- Text compression seems natural for Huffman coding. In text, we have a discrete alphabet that, in a given class, has relatively stationary probabilities.
- The probabilities in the left table are the probabilities of the 26 letters obtained for the U.S. Constitution and are representative of English text.
- The probabilities in the right table were obtained by counting the frequency of occurrence of letters in an earlier version of this chapter.
- While the two documents are substantially different, the two sets of probabilities are very much alike.
Audio Compression
- Another class of data that is very suitable for compression is CD-quality audio data. The audio signal for each stereo channel is sampled at 44.1 kHz, and each sample is represented by 16 bits.
- The three segments used in this example represent a wide variety of audio material, including symphonic pieces.

File Name | Original File Size (bytes) | Entropy (bits) | Estimated Compressed File Size (bytes) | Compression Ratio
Mozart | 939,862 | 12.8 | 725,420 | 1.30
Cohn   | 402,442 | 13.8 | 349,300 | 1.15
Mir    | 884,020 | 13.7 | 759,540 | 1.16
Lecture 7: Tunstall Code

Tunstall Code
- It is clear that the Huffman code encodes letters from the source alphabet using codewords with varying numbers of bits: codewords with fewer bits for letters that occur more frequently, and codewords with more bits for letters that occur less frequently. It is a fixed-to-variable mapping.
- On the other hand, errors in codewords propagate: an error in one codeword will cause a series of errors to occur.
- The Tunstall code encodes letters such that each group of letters from the source is encoded into a codeword of equal length. It is a variable-to-fixed mapping.
Tunstall Code Algorithm
Given: an alphabet of size N.
Required: a Tunstall code of L bits for a given pmf.
- Start with the N letters of the source alphabet.
- Remove the entry with the highest probability.
- Add the N strings obtained by concatenating this letter with every letter in the alphabet (including itself); this increases the size from N to N + (N - 1).
- Calculate the probabilities of the new entries.
- Select the entry with the highest probability and repeat; the step is repeated k times until N + k(N - 1) <= 2^L.

Example: S = {A, B, C}, P(A) = 0.6, P(B) = 0.3, P(C) = 0.1, L = 3 bits (at most 2^3 = 8 codewords).

k = 0: A 0.6, B 0.3, C 0.1
k = 1 (split A): B 0.3, C 0.1, AA 0.36, AB 0.18, AC 0.06
k = 2 (split AA): B 0.3, C 0.1, AB 0.18, AC 0.06, AAA 0.216, AAB 0.108, AAC 0.036
(7 entries; another split would exceed 8, so stop)

Arranged with 3-bit codewords:
AAC 000, AC 001, C 010, AAB 011, AB 100, AAA 101, B 110
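The construction above can be sketched directly. This is an illustrative implementation under the stated stopping rule (grow while another split still fits in 2^L entries); the codeword assignment order is arbitrary, so it need not match the slide's table:

```python
def tunstall(probs, L):
    """Construct a Tunstall codebook of at most 2**L phrases.
    probs: {letter: probability}. Returns {phrase: L-bit codeword string}."""
    n = len(probs)
    entries = dict(probs)  # phrase -> probability
    # Each split removes 1 entry and adds n, i.e. grows the table by n - 1
    while len(entries) + (n - 1) <= 2 ** L:
        best = max(entries, key=entries.get)      # most probable phrase
        p = entries.pop(best)
        for letter, q in probs.items():           # extend it by every letter
            entries[best + letter] = p * q
    phrases = sorted(entries)  # any fixed order gives a valid codebook
    return {ph: format(i, f"0{L}b") for i, ph in enumerate(phrases)}

code = tunstall({"A": 0.6, "B": 0.3, "C": 0.1}, L=3)
# 7 phrases (AAA, AAB, AAC, AB, AC, B, C), each with a distinct 3-bit codeword
```

Running it on the slide's source reproduces the 7-phrase dictionary from the example.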
- That is, the Tunstall code encodes into binary codewords of fixed length L; we make the 2^L source phrases as nearly equally probable as we can.
- If the final source phrases are not nearly equally probable, a Huffman code can be applied to the phrases instead of fixed-length codewords: a Tunstall/Huffman code.

Huffman coding the 7 phrases (B 0.3, AAA 0.216, AB 0.18, AAB 0.108, C 0.1, AC 0.06, AAC 0.036), the successive merges are:
0.036 + 0.06 = 0.096; 0.096 + 0.1 = 0.196; 0.108 + 0.18 = 0.288; 0.196 + 0.216 = 0.412; 0.288 + 0.3 = 0.588; 0.412 + 0.588 = 1.0

One resulting code: B 00, AAA 10, AB 010, AAB 011, C 110, AC 1110, AAC 1111

Average length:
L = 0.036x4 + 0.06x4 + 0.1x3 + 0.108x3 + 0.18x3 + 0.216x2 + 0.3x2 = 2.58 bits < Tunstall code length (3)
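The average-length claim is easy to check numerically. A small sketch using the phrase probabilities from the example and Huffman codeword lengths of 2, 2, 3, 3, 3, 4, 4 bits:

```python
# Probabilities of the 7 Tunstall phrases from the example
probs = {"B": 0.3, "AAA": 0.216, "AB": 0.18, "AAB": 0.108,
         "C": 0.1, "AC": 0.06, "AAC": 0.036}
# Huffman codeword lengths for those phrases
lengths = {"B": 2, "AAA": 2, "AB": 3, "AAB": 3,
           "C": 3, "AC": 4, "AAC": 4}

avg = sum(probs[s] * lengths[s] for s in probs)
# avg == 2.58 bits per phrase, versus 3 bits for the plain Tunstall code

kraft = sum(2 ** -l for l in lengths.values())
# kraft == 1.0: the lengths fill a complete binary tree (Kraft equality)
```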
Lecture 7: Golomb Code

Golomb-Rice Code
- Golomb-Rice codes belong to a family of codes designed to encode integers with the assumption that the larger an integer, the lower its probability of occurrence.
- The simplest code for this situation is the unary code. The unary code for a positive integer n is n 1s followed by a 0: the code for 4 is 11110, the code for 7 is 11111110.
- The unary code is the same as the Huffman code for the semi-infinite alphabet {1, 2, 3, ...} with probability model P(k) = 1/2^k. Both are optimal for their probability models.
- One step higher in complexity than the unary code is a code that splits the integer into two parts, representing one part with a unary code and the other part with a different code.
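The unary code as defined on this slide is one line of code:

```python
def unary(n):
    """Unary code for a positive integer n: n ones followed by a zero."""
    return "1" * n + "0"

# unary(4) == "11110", unary(7) == "11111110"
```

Note that the codeword length grows linearly with n, which is why unary alone is only optimal for the steeply decaying model P(k) = 1/2^k.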
Golomb-Rice Code
The Golomb code is parameterized by an integer m > 0 such that an integer n to be encoded is represented by two numbers, the quotient q and the remainder r:
- q = floor(n/m); q can take the values 0, 1, 2, 3, ...; q is coded with the unary code.
- r = n - qm; r can take the values 0, 1, 2, ..., m - 1; r is coded by:
  - the floor(log2 m)-bit binary representation of r, if m is a power of 2;
  - otherwise: the floor(log2 m)-bit binary representation of r for the first 2^ceil(log2 m) - m values of r, and the ceil(log2 m)-bit binary representation of r + 2^ceil(log2 m) - m for the remaining values of r.
Example: For m = 5, the Golomb codes for the integers {0, 1, 2, ..., 5}:
floor(log2 5) = 2 and ceil(log2 5) = 3, with 2^3 - 5 = 3, so remainders r = 0, 1, 2 are coded in 2 bits (00, 01, 10) and r = 3, 4 are coded in 3 bits as r + 3 (110, 111).
n = 0: 0 00; n = 1: 0 01; n = 2: 0 10; n = 3: 0 110; n = 4: 0 111; n = 5: 10 00
Exercise: Find the Golomb codes for the integers {3, 4, 5, 8} for the cases m = 7 and m = 8.
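A sketch of the encoder described above (unary quotient plus truncated-binary remainder), which can also be used to check the exercise for m = 7 and m = 8:

```python
from math import ceil, log2

def golomb(n, m):
    """Golomb code for a non-negative integer n with parameter m > 0:
    unary(q) followed by a truncated-binary code for r = n - q*m."""
    q, r = divmod(n, m)
    prefix = "1" * q + "0"          # unary code for the quotient
    b = ceil(log2(m))               # at most ceil(log2 m) bits for r
    cutoff = (1 << b) - m           # first `cutoff` remainders use b-1 bits
    if r < cutoff:
        suffix = format(r, f"0{b - 1}b") if b > 1 else ""
    else:
        suffix = format(r + cutoff, f"0{b}b")
    return prefix + suffix

# m = 5 reproduces the worked example:
# golomb(0, 5) == "000", golomb(3, 5) == "0110", golomb(5, 5) == "1000"
```

When m is a power of 2 (the Rice case), cutoff is 0 and every remainder is coded with plain log2(m)-bit binary.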