Lecture 17: Image Coding and Compression
GW Chapter 8.1-8.3.1, 8.4-8.4.3, 8.5.1-8.5.2, 8.6
Suggested problem (own problem): Calculate the Huffman code of this image. Show all steps in the coding procedure, and calculate Lavg.
Magnus Gedda, magnusg@cb.uu.se, 2005-05-19
Data and information
Data is not the same thing as information. Data is the means by which information is expressed. The amount of data can be much larger than the amount of information. Data that provide no relevant information are called redundant data, or redundancy. Image coding, or compression, has as its goal to reduce the amount of data by reducing the amount of redundancy.
Definitions
n1 = amount of data in the original image
n2 = amount of data after compression
Compression ratio: CR = n1 / n2
Relative redundancy: RD = 1 - 1/CR = (n1 - n2) / n1
Images can contain three types of redundancy
1. Coding Redundancy (CR): some gray levels are more common than others
2. Interpixel Redundancy (IR): the same gray level covers large areas
3. Psycho-Visual Redundancy (PVR): the eye can only resolve about 32 gray levels locally
Image compression and decompression
[Figure: the original image is compressed into compact information (for storage or transmission); decompression then yields either the original image exactly (lossless compression) or an approximation of the original image (lossy compression).]
Image compression can be
Reversible (lossless): no loss of information. The new image is identical to the original image (after decoding). Necessary in most image analysis. Compression ratio typically 2-10x.
Non-reversible (lossy): loss of some information. Often used in image communication, video, the WWW. Important: the image must look visually pleasing. Compression ratio typically 10-30x.
Objective measures of image quality
Error: $e(x,y) = f_{approx}(x,y) - f_{original}(x,y)$
Total error: $e_{tot} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ f_{approx}(x,y) - f_{original}(x,y) \right]$
Root-mean-square error: $e_{rms} = \sqrt{ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ f_{approx}(x,y) - f_{original}(x,y) \right]^2 }$
Mean-square signal-to-noise ratio: $SNR_{MS} = \frac{ \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f_{approx}(x,y)^2 }{ \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ f_{approx}(x,y) - f_{original}(x,y) \right]^2 }$
Subjective measures of image quality: let a number of test persons grade the images as bad/ok/good etc.
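The three objective measures above can be computed directly; a small Python/NumPy sketch (the function name is my own, not from the lecture):

```python
import numpy as np

def error_metrics(f_orig, f_approx):
    """Objective quality measures between an original image and its
    (lossy) approximation, both MxN arrays, as defined on the slide."""
    f_orig = f_orig.astype(float)
    f_approx = f_approx.astype(float)
    e = f_approx - f_orig                            # pointwise error e(x, y)
    e_tot = e.sum()                                  # total error
    e_rms = np.sqrt((e ** 2).mean())                 # root-mean-square error
    snr_ms = (f_approx ** 2).sum() / (e ** 2).sum()  # mean-square SNR
    return e_tot, e_rms, snr_ms
```

Note that e_tot can be small even for a bad approximation, since positive and negative errors cancel; e_rms and SNR do not have that problem.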
How much information is present in the image?
If P(E) is the probability of an event E, then $I(E) = -\log P(E)$ is a measure of the information that the event provides. The average information per event is called the entropy (Shannon entropy): $H = -\sum_k p(r_k) \log_2 p(r_k)$ bits.
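The entropy of an image's gray-level histogram gives a lower bound, in bits per pixel, for any code that treats pixels independently. A minimal sketch (function name my own):

```python
import numpy as np

def shannon_entropy(image):
    """Entropy H = -sum p(r_k) * log2 p(r_k) of the gray-level
    histogram of an image (any integer array)."""
    values, counts = np.unique(image, return_counts=True)
    p = counts / counts.sum()          # estimated probabilities p(r_k)
    return -np.sum(p * np.log2(p))
```

A constant image has entropy 0 (no information); an image with two equally common gray levels has entropy 1 bit/pixel.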
1. Coding redundancy
Basic idea: different gray levels occur with different probability (non-uniform histogram). Use shorter code words for the more common gray levels and longer code words for the less common gray levels. This is called Variable Length Coding.
The amount of data in an MxN image with L gray levels is $M \times N \times L_{avg}$, where $L_{avg} = \sum_{k=0}^{L-1} l(r_k) \, p(r_k)$; $l(r_k)$ is the number of bits used to represent gray level $r_k$, and $p(r_k)$ is the probability of gray level $r_k$ in the image.
Example, 3-bit image:

gray level r_k   probability p(r_k)   source code   variable-length code
0                0.10                 000           01
1                0.40                 001           0
2                0.03                 010           11
3                0.05                 011           10
4                0.30                 100           1
5                0.10                 101           00
6                0.01                 110           111
7                0.01                 111           000

$L_{avg} = \sum_{k=0}^{L-1} l(r_k) \, p(r_k)$
Source code: l(r_k) = 3 for all k, so L_avg = 3.
Variable-length code: L_avg = 0.1*2 + 0.4*1 + ... = 1.32.
This will however NOT work, since the code is not unambiguous. What does, for example, the code 010 mean? Use Huffman coding!
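The L_avg calculation for the table above can be verified in a few lines of Python:

```python
# Probabilities and variable-length code words from the example table.
p = {0: 0.10, 1: 0.40, 2: 0.03, 3: 0.05, 4: 0.30, 5: 0.10, 6: 0.01, 7: 0.01}
code = {0: '01', 1: '0', 2: '11', 3: '10', 4: '1', 5: '00', 6: '111', 7: '000'}

# Average code length L_avg = sum over k of l(r_k) * p(r_k).
L_avg = sum(p[k] * len(code[k]) for k in p)
print(L_avg)   # about 1.32 bits/pixel, versus 3 for the fixed-length code

# But the code is not uniquely decodable: the bit string '010' could be
# gray level 0 ('01' + '0'), gray levels 1,4,1 ('0'+'1'+'0'), 1,3 ('0'+'10'), ...
```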
The Huffman code yields the smallest possible number of code symbols per source symbol.
Step 1:
1. Sort the gray levels by decreasing probability.
2. Add the two smallest probabilities.
3. Sort the new value into the list.
4. Repeat until only two probabilities remain.
Step 2:
1. Give the code 0 to the highest probability and the code 1 to the lowest probability in the present node.
2. Go backwards through the tree and add 0 to the highest and 1 to the lowest probability in each node, until all gray levels have a unique code.
Example of Huffman coding
Gray levels sorted by decreasing probability: r_1 (0.4), r_4 (0.3), r_0 (0.1), r_5 (0.1), r_3 (0.05), r_2 (0.03), r_6 (0.01), r_7 (0.01).
[Figure: Huffman tree built by repeatedly merging the two smallest probabilities (nodes 1-6), then assigning 0/1 backwards through the tree.]
With the fixed 3-bit code, L_avg = 3; the Huffman code gives L_avg = 2.27, so
CR = n1/n2 = 3/2.27 ≈ 1.32
RD = 1 - 1/CR = (n1 - n2)/n1 ≈ 0.24
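The two steps above can be sketched in Python using a priority queue (a minimal implementation, not the lecture's; tie-breaking may differ from the blackboard tree, but L_avg is the same for any valid Huffman code):

```python
import heapq

def huffman_code(probabilities):
    """Build a Huffman code from a {symbol: probability} dict.
    Returns {symbol: bit string}."""
    # Step 1: heap of (probability, tiebreak, symbols-in-subtree).
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    code = {sym: '' for sym in probabilities}
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # smallest probability -> bit '1'
        p2, _, syms2 = heapq.heappop(heap)   # next smallest        -> bit '0'
        # Step 2 (done incrementally): prepend the bit for this merge.
        for s in syms1:
            code[s] = '1' + code[s]
        for s in syms2:
            code[s] = '0' + code[s]
        heapq.heappush(heap, (p1 + p2, tiebreak, syms1 + syms2))
        tiebreak += 1
    return code
```

Running it on the example probabilities gives L_avg = 2.27, and the resulting code is prefix-free, so it is uniquely decodable.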
The Huffman code (continued)
The Huffman code results in an unambiguous code, i.e. no code word can be created by combining other code words. The code is reversible without loss. The table for the translation of the code has to be stored together with the coded image. The Huffman code does not take correlation between adjacent pixels into consideration.
2. Interpixel Redundancy (also called spatial or geometric redundancy)
There is often correlation between adjacent pixels, i.e. the values of the neighbours of an observed pixel can often be predicted from the value of the observed pixel. Coding methods: run-length coding and difference coding.
Run-length coding
Every code word is made up of a pair (g, l), where g is the gray level and l is the number of consecutive pixels with that gray level (the length of the "run").
Ex: 56 56 56 82 82 82 83 80 80 80 80 56 56 56 56 56
gives the run-length code (56,3) (82,3) (83,1) (80,4) (56,5).
The code is calculated row by row. Very efficient coding for binary data. The position is important to know, so the image dimensions must be stored with the coded image. Used in most fax machines.
Difference coding
f(x_i) = x_i if i = 0; f(x_i) = x_i - x_{i-1} if i > 0
Ex original:  56 56 56 82 82 82 83 80 80 80 80 56
code f(x_i):  56  0  0 26  0  0  1 -3  0  0  0 -24
The code is calculated row by row. Both run-length coding and difference coding are reversible, and can be combined with, for example, Huffman coding.
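The definition of f(x_i) above translates directly to code (a small sketch, names my own):

```python
def difference_encode(row):
    """f(x_0) = x_0, f(x_i) = x_i - x_(i-1) for i > 0.
    Correlated pixels give differences clustered near 0,
    which a Huffman code can then exploit."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def difference_decode(code):
    """Invert by cumulative summation."""
    row = [code[0]]
    for d in code[1:]:
        row.append(row[-1] + d)
    return row
```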
Example of combined difference and Huffman coding
[Figure: original image and its difference image.]
Huffman code of the original image: L_avg = 3.1
Huffman code of the difference image: L_avg = 2
Bitplane coding
Divide the grayscale/color image into a series of binary images (one image per bit). Code each image separately using the methods described above. An 8-bit image will be represented by 8 coded binary images.
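Splitting into bit planes and reassembling is a pair of shift-and-mask operations; a NumPy sketch (function names my own):

```python
import numpy as np

def bit_planes(image, bits=8):
    """Split a grayscale image into `bits` binary images,
    from the least significant bit (b = 0) to the most significant."""
    image = image.astype(np.uint8)
    return [(image >> b) & 1 for b in range(bits)]

def from_bit_planes(planes):
    """Reassemble the image by shifting each plane back into place."""
    return sum(p.astype(np.uint8) << b for b, p in enumerate(planes))
```

Each binary plane can then be run-length coded; the decomposition itself loses no information.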
3. Psycho-Visual Redundancy
If the image will only be used for visual observation (e.g. illustrations on the web), a lot of the information is usually psycho-visually redundant. It can be removed without changing the visual quality of the image. This kind of compression is usually irreversible.
[Figure: example image compressed from 0.5 kB to 0.05 kB.]
Psycho-visual redundancy is often reduced by quantization.
Example: uniform quantization of gray levels. Remove the least significant bits of the data. This causes edge effects (false contours).
The edge effects can be reduced by "Improved Gray Scale" (IGS) quantization: remove the least significant bits, but first add a pseudo-random number, taken from the least significant bits of the sum computed for the previous pixel. Special case: if the gray level of a pixel in an 8-bit image is 1111 xxxx, add 0000. IGS reduces edge effects, but will at the same time unsharpen true edges.
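A sketch of IGS quantization of one 8-bit image row, following the description above (my own reading of the procedure; the function name is not from the lecture):

```python
def igs_quantize(row, keep_bits=4):
    """Improved Gray Scale quantization of one row of 8-bit pixels.
    The low bits of the previous running sum act as a pseudo-random
    dither that breaks up false contours."""
    drop = 8 - keep_bits
    low_mask = (1 << drop) - 1
    out, prev_sum = [], 0
    for pixel in row:
        if pixel >> drop == (1 << keep_bits) - 1:
            s = pixel                        # special case 1111 xxxx: add 0000
        else:
            s = pixel + (prev_sum & low_mask)  # add low bits of previous sum
        out.append(s >> drop)                # keep only the high bits
        prev_sum = s
    return out
```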
[Figure: IGS quantization example.]
More quantization methods: motion pictures
Method 1:
1. Transfer the first image to the observer.
2. Find the changes from the previous image.
3. Transfer only the changes.
Method 2:
1. Transfer the most important information (e.g. the lowest frequencies) first.
2. Send the less important information later.
Transform coding
1. Divide the image into n x n subimages.
2. Transform each subimage using a reversible transform (e.g. the Hotelling transform, the Discrete Fourier Transform (DFT), or the Discrete Cosine Transform (DCT)).
3. Quantize, i.e. truncate, the transformed image (for example, with the DFT and DCT, frequencies with small amplitude can be removed without much information loss). The quantization can be either image dependent (IDP) or image independent (IIP).
4. Code the resulting data, normally using some kind of variable-length coding, for example a Huffman code.
The coding is not reversible (unless step 3 is skipped).
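Steps 2-3 can be sketched with an orthonormal DCT and a crude image-dependent quantizer that keeps only the largest coefficients (a minimal illustration, not the JPEG quantization scheme; function names my own):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so that C @ block @ C.T
    transforms an n x n block and C.T @ coeffs @ C inverts exactly."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C *= np.sqrt(2.0 / n)
    C[0, :] = np.sqrt(1.0 / n)      # DC row gets the smaller scale factor
    return C

def compress_block(block, keep=10):
    """Transform a block, zero all but the `keep` largest-magnitude
    coefficients (step 3), and reconstruct the approximation."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T
    thresh = np.sort(np.abs(coeffs).ravel())[-keep]
    coeffs[np.abs(coeffs) < thresh] = 0.0    # truncate small frequencies
    return C.T @ coeffs @ C
```

With keep = n*n no coefficient is dropped and the block is recovered exactly, which is the "step 3 skipped" reversible case.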
Some common image formats
JPEG (Joint Photographic Experts Group): exists in many different versions, but is always some kind of transform coding. JPEG is not reversible, due to the quantization.
MPEG (Moving Picture Experts Group): similar to JPEG, but the motion relative to the previous image is calculated and used in the compression.
Example: JPEG compression
[Figure: the same image at JPEG quality 75% (27 kB), 50% (17 kB), 25% (11 kB), and 10% (6 kB).]
Some more common image formats
LZW coding (Lempel-Ziv-Welch): a "word-based" code. The data is represented by pointers to a library of symbols (cf. the Huffman code table). LZW compression is lossless and can often be chosen when TIFF (Tagged Image File Format) images are stored. The result is a smaller file, which usually takes a bit longer to decode. An Image File Directory (set of symbols) is included in the header.
GIF (Graphics Interchange Format): codes color images via a palette, so that each color is represented by only a few bits. GIF also uses LZW compression for storage and transfer. GIF is fully reversible (lossless) if fewer than 256 colors are present in the original image.
Remember that the TIME used for coding and decoding is important when choosing a coding method!
Choice of image format
Images to be used for image analysis should always be saved in a lossless format!
Images for the WWW have to be GIF, JPEG or PNG (PNG was created because of the license issues of GIF).
Choose GIF for graphs and hand-drawn figures with few color shades (JPEG's transform coding and truncation can cause artefacts around sharp edges).
Choose JPEG for photos and figures with many colors and smooth transitions between colors (GIF reduces the number of colors to 256).