Repetition of the 1st lecture: Human Senses in Relation to Technical Parameters
Multimedia - what is it?
Human senses (overview)
Historical remarks
Color models: RGB; Y, Cb, Cr
Data rates: text, graphics, picture, audio, video
http://www.tu-ilmenau.de/ihs
Overview Focus MuMeSy
History
1980: Media and computer technology were separate
Record - Transmit - Play
Computer 1 - Network - Computer 2
History (5)
2000: Completely digital processing
Computer 1 - Network - Computer 2
=> Technical basis for multimedia communication
Color spectrum Red, Blue, Green (Spectrum)
Colors
RGB matching curves to generate all colors (figure; curve values near ~500 nm)
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Color models (RGB)
Color cube (diagonal: complementary colors)
R: (FF,00,00), complement (difference to (FF,FF,FF)): (00,FF,FF)
G: (00,FF,00), complement: (FF,00,FF)
B: (00,00,FF), complement: (FF,FF,00)
3 x 8 = 24 bit => 2^24 = 16,777,216 colors ("True Color")
(just to remember: humans can distinguish about 350,000!)
=> 3 x 6 = 18 bit => 2^18 = 262,144 colors
Color models (RGB, CMY)
Monitor: addition (RGB) - to a dark area light is added; light colors (prism)
Printer: subtraction (CMY) - from a white area light is subtracted; body colors (also in photography)
Transformation RGB => YCbCr
Y = 0.299 R + 0.587 G + 0.114 B
U = α (B - Y)        Cb = (B - Y) / 1.772 + 0.5
V = β (R - Y)        Cr = (R - Y) / 1.402 + 0.5
YUV is used in PAL (α, β depend on the implementation)
CbCr is used in JPEG and MPEG
H.-D. Wuttke 2007
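A minimal sketch of this transformation in Python (the function name and the normalized 0..1 value range are assumptions, not part of the slide):

```python
def rgb_to_ycbcr(r, g, b):
    """Convert normalized RGB values (0..1) to Y, Cb, Cr per the formulas above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) / 1.772 + 0.5  # blue-difference chrominance
    cr = (r - y) / 1.402 + 0.5  # red-difference chrominance
    return y, cb, cr
```

White (1,1,1) maps to Y = 1 with both chrominance components at the neutral value 0.5, since the Y weights sum to 1.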
Examples of Graphic Formats
Vector graphics - PS, EPS, WMF: geometrical figures, exact calculation at any needed resolution
Raster graphics - BMP, GIF, JPEG, PNG: fixed number of pixels (ppi), reproduction via interpolation (dpi)
Problem: Mac design (vector) => PC (GIF)
Summary
Color: overlapping of colors - RGB additive, CMY subtractive; Y Cb Cr: luminance + 2 chrominance components, a compromise between quality and required memory
Resolution: ppi - picture in memory; dpi - in-/output device
Information paths
CPU - Memory - Keyboard/Mouse - Bus system - Network - external memory and interfaces - Graphics / Video / Audio
Continuous Media
Audio and movie encoding: MPEG-1 (Moving Picture Experts Group)
Audio rate: between 32 ... 448 kbit/s
Constant data rate: 1,856,000 bit/s = 1.77 Mbit/s
Remark:
M relative to Hz: 1,000,000 (10^6)
M relative to Byte: 1,048,576 (2^20)
M relative to Bit: 1,048,576
k relative to Hz: 1,000 (10^3)
k relative to Byte: 1,024 (2^10)
k relative to Bit: 1,024
M^3: Multi-Media-Mogelfaktor ("multimedia cheat factor")
Day 2: October 3, 16:00-19:15: Compression Methods
Lossless compression
Entropy encoding
Source encoding
Hybrid compression
Lossy compression
JPEG encoding
Compression classes
Source: Steinmetz, Ralf: Multimedia-Technologie: Einführung und Grundlagen, Springer-Verlag
Entropy encoding compression algorithms RLC Huffman Adaptive Huffman encoding Arithmetic encoding LZW
Entropy vs. Source encoding
Entropy encoding:
- ignores the kind of data
- removes repetitions
- statistical basis
- lossless, exactly reproducible
- low compression
Source encoding:
- properties of the source and/or drain are important (e.g. human senses as drain)
- lossless possible
- lossy if high compression (MP3)
RLC: Run Length Code
A special sign (in the example: #) outside of the alphabet shows that the next sign is a number, followed by the sign that has to be repeated.
e.g. eaaaabaaabb (11 signs) => e#4abaaabb (10 signs)
Makes sense only for many equal signs.
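A possible sketch of this scheme in Python. The threshold of 4 equal signs (below which encoding would not save space) and the function name are assumptions; the sketch also assumes the marker is not in the alphabet and run lengths stay below 10:

```python
def rle_encode(s, marker="#"):
    """Run-length encode: runs of 4+ equal signs become marker+count+sign.
    Shorter runs stay literal, since encoding them would not shorten the text."""
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1  # extend the run of equal signs
        run = j - i
        if run >= 4:
            out.append(f"{marker}{run}{s[i]}")
        else:
            out.append(s[i] * run)
        i = j
    return "".join(out)
```

On the slide's example this reproduces the shown result: the run of four a's becomes #4a, while the run of three a's stays literal (#3a would be just as long).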
Huffman encoding
The information of a sign (entropy H) is inversely proportional to the probability of its occurrence, and determines its code length.
Seldom signs: higher information => longer code words (their absence is more difficult to reconstruct)
=> entropy H ~ length of the code => algorithm?
Huffman encoding
Formula (Shannon): n signs with probabilities p_i
Information source S generates the signs S_i
p_i: probability that S_i occurs in S
Entropy: H(S) = - Σ_{i=1..n} p_i · ld(p_i)
Huffman encoding
Entropy: H(S) = - Σ_{i=1..n} p_i · ld(p_i)
e.g. a picture with a homogeneous grey part in 8-bit code: n = 256 grey values, each grey value has the same probability p_i = 1/256
=> ld(p_i) = ld(1/256) = ld(1) - ld(256) = -8
H(S) = - (1/256 · (-8) + ... + 1/256 · (-8)) = - 256 · 1/256 · (-8) = 8
=> ld(1/p_i) is the ideal number of bits for encoding
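The entropy formula can be checked directly in Python (the function name is an assumption):

```python
import math

def entropy(probs):
    """H(S) = -sum p_i * ld(p_i): the ideal average number of bits per sign."""
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

For the 256 equally likely grey values this yields exactly 8 bits, matching the calculation above.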
Shannon-Fano algorithm (Top-Down)
S = AEABADDABADACADABADACABABABECADABCECECE
1. Sort by occurrence:
Symbol:      A  B  C  D  E
Occurrence: 15  7  6  6  5
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Shannon-Fano algorithm
2. Divide iteratively into two sets so that the sums in each of them are nearly the same:
Symbol:      A  B | C  D  E
Occurrence: 15  7 | 6  6  5
1st bit (0|1): {A,B} = 22 vs. {C,D,E} = 17
2nd bit (0|1): A | B  and  C | {D,E} = 11
3rd bit (0|1): D = 6 | E = 5
3. Encoding: A = 00, B = 01, C = 10, D = 110, E = 111
Calculation of the needed bits
Symbol  Number  ld(1/p_i)  Code  Bits
A       15      1.38*)     00    30
B        7      2.48       01    14
C        6      2.70       10    12
D        6      2.70       110   18
E        5      2.96       111   15
Total: 39 symbols, 89 bits
*) p(A) = 15/39 => ld(39/15) = 1.38
Relative to a 3-bit encoding: 3 bit/symbol x 39 symbols = 117 bit; 117 bit - 89 bit = 28 bits saved
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
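The top-down split can be sketched recursively in Python (function name and the "minimize the sum difference" split rule are assumptions consistent with step 2):

```python
def shannon_fano(freqs):
    """Shannon-Fano: sort symbols by frequency, then recursively split the
    list into two halves with nearly equal frequency sums, prefixing 0/1."""
    codes = {}

    def split(items, prefix):
        if len(items) == 1:
            codes[items[0][0]] = prefix or "0"
            return
        total = sum(f for _, f in items)
        run, best_i, best_diff = 0, 1, float("inf")
        for i in range(1, len(items)):
            run += items[i - 1][1]
            diff = abs(run - (total - run))  # imbalance of this split point
            if diff < best_diff:
                best_i, best_diff = i, diff
        split(items[:best_i], prefix + "0")
        split(items[best_i:], prefix + "1")

    split(sorted(freqs.items(), key=lambda kv: -kv[1]), "")
    return codes
```

On the slide's frequencies this reproduces the codes 00, 01, 10, 110, 111 and the total of 89 bits.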
Huffman algorithm (Bottom-Up)
1. Start with an open list [A,B,C,D,E]; keep it always sorted by the number of occurrences:
Symbol:                 A  B  C  D  E
Number of occurrences: 15  7  6  6  5
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Huffman algorithm
2. Always take the two elements with the lowest occurrence, calculate their sum s, and replace them by a new element (node) Wi(s) in the list, then sort again:
[A(15),B(7),C(6),D(6),E(5)] => s = 11 => [A(15),B(7),C(6),W1(11)] => sorted: [A(15),W1(11),B(7),C(6)]
further:
[A(15),W1(11),B(7),C(6)] => [A(15),W1(11),W2(13)] => [A(15),W2(13),W1(11)]
=> [A(15),W3(24)] => [W3(24),A(15)] => [W4(39)]
until 1 element is left over => binary tree:
W4(39): 0 -> A(15), 1 -> W3(24); W3(24): 0 -> W2(13) {B(7), C(6)}, 1 -> W1(11) {D(6), E(5)}
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Calculation of the needed bits, entropy
Symbol  Number  log2(1/p_i)  Code  Bits
A       15      1.38         0     15
B        7      2.48         100   21
C        6      2.70         101   18
D        6      2.70         110   18
E        5      2.96         111   15
Total: 39 symbols, 87 bits
Relative to a 3-bit encoding: 87 vs. 3 x 39 = 117: 30 bits saved
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Huffman encoding
Entropy: H(S) = - Σ_{i=1..n} p_i · ld(p_i)
Entropy of the sequence of signs (p(A) = 15/39, p(B) = 7/39, ...):
(15 x 1.38 + 7 x 2.48 + 6 x 2.70 + 6 x 2.70 + 5 x 2.96) / 39 = 85.26 / 39 = 2.19 (ideal value for encoding)
Bits/sign for the Huffman encoding: 87 / 39 = 2.23 ~ entropy 2.19!
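The bottom-up merging can be sketched with a priority queue (the heap-based implementation and function name are assumptions; the tie-breaking counter only makes the merge order deterministic):

```python
import heapq

def huffman_codes(freqs):
    """Huffman: repeatedly merge the two lowest-frequency nodes, prefixing
    0 to the codes of one subtree and 1 to the other."""
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # lowest frequency
        f2, _, c2 = heapq.heappop(heap)  # second lowest
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]
```

For the slide's frequencies the code lengths come out as 1 bit for A and 3 bits for B, C, D, E: 87 bits in total, matching the table above.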
Encoding properties
+ Optimal encoding (entropy)
+ Prefix-unique => no separator necessary
- Code table must be known => has to be transmitted together with the message (overhead)
- Not suitable for live video or audio
=> Adaptive Huffman encoding
Adaptive Huffman encoding
Idea: the same initial table and the same actualization routine for encoder and decoder.
Actualization routine ("update model"):
- count the occurrences
- update the Huffman tree if necessary (if the Huffman tree is no longer valid => swapping)
=> the encoding of a sign changes over time!
Adaptive Huffman encoding
ENCODER
initialize_model();
while ((c = getc(input)) != eof) {
    encode(c, output);
    update_model(c);
}
DECODER
initialize_model();
while ((c = decode(input)) != eof) {
    putc(c, output);
    update_model(c);
}
Adaptive Huffman encoding
After 17 signs: A(1), B(2), C(2), D(2), E(10); encoding of A: 000 (tree figure)
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Adaptive Huffman encoding
After the next 2 A's: A(3), B(2), C(2), D(2), E(10); encoding of A: 011 (tree figure)
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Adaptive Huffman encoding
After the next two A's: A(5), B(2), C(2), D(2), E(10); encoding of A: 10 (tree figure)
Encoding properties
+ Optimal encoding (entropy)
+ Prefix-unique => no separator sign needed, even if the codes of the signs have different lengths
+ Probabilities need not be known in advance
- Huge effort in encoding/decoding, increasing with the number of signs
- Exact synchronism between encoder and decoder is needed
Arithmetic encoding
1. Occurrences are normalized into the interval 0...1.
2. The first sign of the sequence selects its sub-interval, which again is divided proportionally to the occurrences of the signs.
3. The second sign selects a part within the previously defined interval, and so on.
4. Stop at the end of the sequence of signs.
5. The number within the last interval that has the fewest digits is the encoding.
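The interval narrowing of steps 1-4 can be sketched with exact rational arithmetic (the use of Fraction to sidestep floating-point limits, and the function names, are assumptions; real coders work with scaled integers):

```python
from fractions import Fraction

def arith_encode(msg, probs):
    """Narrow [low, low+width) once per sign, proportionally to the sign
    probabilities; any number in the final interval encodes the message."""
    cum, start = {}, Fraction(0)
    for s, p in probs.items():  # cumulative interval start per sign
        cum[s] = start
        start += p
    low, width = Fraction(0), Fraction(1)
    for s in msg:
        low += cum[s] * width
        width *= probs[s]
    return low, low + width  # final interval [low, high)

def arith_decode(x, probs, n):
    """Recover n signs from a number x inside the final interval."""
    out = []
    for _ in range(n):
        start = Fraction(0)
        for s, p in probs.items():
            if start <= x < start + p:
                out.append(s)
                x = (x - start) / p  # rescale into the chosen sub-interval
                break
            start += p
    return "".join(out)
```

Encoding "abc" with p(a) = 1/2, p(b) = p(c) = 1/4 narrows the interval to [11/32, 12/32); any number inside it, e.g. the midpoint, decodes back to "abc".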
Arithmetic encoding (figures: step-by-step interval subdivision example, interval bounds 0.1, 0.5, 0.7)
Encoding properties
+ Only one number for the whole string
- Occurrences must be known => have to be sent with the message (overhead)
- Not possible for live video or audio
- Limited accuracy of floating-point numbers in computers => limited length
Lempel-Ziv-Welch algorithm (LZW)
Sources: LZ77, LZ78; Terry A. Welch: "A Technique for High Performance Data Compression", IEEE Computer, Vol. 17, No. 6, 1984, pp. 8-19.
Used in the Unix compress command and in the picture compression format TIFF
Idea: successively build a dictionary
Lempel-Ziv-Welch algorithm (LZW)
w  = currently processed word
k  = current sign
wk = concatenation of w and k
Initial dictionary: 8-bit ASCII, codes 0...255 => first free code: 256
Lempel-Ziv-Welch algorithm (LZW) w = NIL; while ( read a character k ) { if wk exists in the dictionary w = wk; else add wk to the dictionary; output the code for w; w = k; }
Lempel-Ziv-Welch algorithm (LZW)
Input: ^WED^WE^WEE^WEB^WET
w     k     Output  Index  Symbol
NIL   ^
^     W     ^       256    ^W
W     E     W       257    WE
E     D     E       258    ED
D     ^     D       259    D^
^     W
^W    E     <256>   260    ^WE
E     ^     E       261    E^
^     W
^W    E
^WE   E     <260>   262    ^WEE
E     ^
E^    W     <261>   263    E^W
W     E
WE    B     <257>   264    WEB
B     ^     B       265    B^
^     W
^W    E
^WE   T     <260>   266    ^WET
T     EOF   T
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
LZW Decompression Algorithm
read a character k;
output k;
w = k;
while ( read a character k ) {   /* k may be a plain character or a code */
    entry = dictionary entry for k;
    /* special case: if k is not yet in the dictionary (k = next free code),
       then entry = w + w[0] */
    output entry;
    add w + entry[0] to the dictionary;
    w = entry;
}
LZW Decompression Algorithm
w      k      Output  Index  Symbol
       ^      ^
^      W      W       256    ^W
W      E      E       257    WE
E      D      D       258    ED
D      <256>  ^W      259    D^
<256>  E      E       260    ^WE
E      <260>  ^WE     261    E^
<260>  <261>  E^      262    ^WEE
<261>  <257>  WE      263    E^W
<257>  B      B       264    WEB
B      <260>  ^WE     265    B^
<260>  T      T       266    ^WET
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
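The two pseudocode algorithms above can be sketched as a working Python pair (function names are assumptions; the else branch in the decoder handles the corner case where a code arrives before it is in the decoder's dictionary):

```python
def lzw_compress(s):
    """LZW compression: grow w while w+k is known, else emit code(w)."""
    dictionary = {chr(i): i for i in range(256)}  # initial 8-bit alphabet
    next_code, w, out = 256, "", []
    for k in s:
        wk = w + k
        if wk in dictionary:
            w = wk
        else:
            out.append(dictionary[w])
            dictionary[wk] = next_code  # new dictionary entry
            next_code += 1
            w = k
    if w:
        out.append(dictionary[w])
    return out

def lzw_decompress(codes):
    """LZW decompression: rebuild the same dictionary from the code stream."""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    w = chr(codes[0])
    out = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        else:                    # code not yet known: must be w + w[0]
            entry = w + w[0]
        out.append(entry)
        dictionary[next_code] = w + entry[0]
        next_code += 1
        w = entry
    return "".join(out)
```

On the slide's input the compressor emits exactly the code sequence of the trace table (^, W, E, D, <256>, E, <260>, <261>, <257>, B, <260>, T), and the decompressor recovers the original string.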
Compression classes
Source: Steinmetz, Ralf: Multimedia-Technologie: Einführung und Grundlagen, Springer-Verlag
Source encoding compression algorithms
Where does information get lost?
Why can we accept this?
Which side effects occur?
Which advantages do transformations bring?
What are asymmetric algorithms?
DPCM: Differential Pulse Code Modulation
Prediction coding
High compression rate by optimally exploiting the properties of the source / drain
Specialized for each class of information (audio, video, picture, text)
DPCM source: http://spemaus
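A minimal sketch of the prediction idea, assuming the simplest predictor (the previous sample) and hypothetical function names:

```python
def dpcm_encode(samples):
    """Encode each sample as the difference to the predicted (= previous)
    sample; the differences are typically small and cheap to entropy-code."""
    prev, diffs = 0, []
    for s in samples:
        diffs.append(s - prev)
        prev = s
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the transmitted differences."""
    prev, out = 0, []
    for d in diffs:
        prev += d
        out.append(prev)
    return out
```

Real DPCM coders use better predictors and quantize the differences (which makes them lossy); this lossless sketch only shows the prediction/difference structure.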
Transformation Examples
DCT
Discrete Cosine Transform (DCT) for an 8x8 block:
F(u,v) = (1/4) C(u) C(v) Σ_{x=0..7} Σ_{y=0..7} f(x,y) cos((2x+1)uπ/16) cos((2y+1)vπ/16)
Inverse Discrete Cosine Transform (IDCT):
f(x,y) = (1/4) Σ_{u=0..7} Σ_{v=0..7} C(u) C(v) F(u,v) cos((2x+1)uπ/16) cos((2y+1)vπ/16)
with C(w) = 1/√2 for w = 0, else 1
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
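A naive (unoptimized) Python sketch of the forward 8x8 DCT formula (the function name is an assumption; real codecs use factorized fast versions, as a later slide notes):

```python
import math

def dct_8x8(block):
    """F(u,v) = 1/4 C(u)C(v) sum_x sum_y f(x,y) cos((2x+1)u*pi/16) cos((2y+1)v*pi/16)."""
    def c(w):
        return 1 / math.sqrt(2) if w == 0 else 1.0
    F = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            F[u][v] = 0.25 * c(u) * c(v) * s
    return F
```

A constant block produces a single DC coefficient F(0,0) with all AC coefficients zero, which is exactly why the DCT concentrates the energy of smooth image areas.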
FFT vs. DCT
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Wavelet Examples
Source: Amara Graps, An Introduction to Wavelets
Sub sampling
JPEG -Steps 1. Color model transformation 2. Discrete Cosine Transformation (DCT) 3. Quantization 4. Zig-zag-scan 5. DPCM, RLE 6. Huffman
JPEG - Overview source: Ze-Nian Li : Script Multimedia Systems, Simon Fraser University, Canada
Transformation RGB => YCbCr
Y = 0.299 R + 0.587 G + 0.114 B
U = α (B - Y)        Cb = (B - Y) / 1.772 + 0.5
V = β (R - Y)        Cr = (R - Y) / 1.402 + 0.5
YUV is used in PAL (α, β depend on the implementation)
CbCr is used in JPEG and MPEG
Components Y U V source: Ze-Nian Li : Script Multimedia Systems, Simon Fraser University, Canada
Discrete Cosine Transform (DCT)
Basis functions for 8x8 blocks:
- constant
- 1/2 vertical cosine period
- 7/2 vertical cosine periods
- 1/2 horizontal cosine period
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Discrete Cosine Transformation (DCT) source: Ze-Nian Li : Script Multimedia Systems, Simon Fraser University, Canada
DCT Factorized Discrete Cosine Transform (DCT): Inverse Discrete Cosine Transform (IDCT): source: Ze-Nian Li : Script Multimedia Systems, Simon Fraser University, Canada
Vertical Line
Horizontal line
Corner
Quantization
Luminance quantization table q(u,v):
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99
Chrominance quantization table q(u,v):
17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
Scalable for different levels of quality and compression rate
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
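The quantization step itself is a per-coefficient division and rounding (the function name is an assumption; this rounding is the lossy step of JPEG):

```python
def quantize(F, q):
    """Divide each DCT coefficient by its table entry and round; larger
    q(u,v) entries discard more high-frequency detail."""
    return [[round(F[u][v] / q[u][v]) for v in range(len(F[0]))]
            for u in range(len(F))]
```

Small coefficients with large table entries round to 0, which is what later makes the run-length step so effective.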
Zig-zag scan
Most important values first; most reduced values (many set to 0) at the end of the sequence.
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
DPCM / RLC
The DC component is large and varied, but often close to the value of the previous block => encode the difference between successive 8x8 blocks (DPCM).
The AC vector has lots of zeros in it => RLC for those values; special method: runs of zeros are skipped and replaced by the number of zeros before each non-zero sign.
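The special zero-run method for the AC vector can be sketched as (zero-run, value) pairs. This is a simplification of JPEG's actual (run, size)+amplitude coding; the function name and the (0, 0) end-of-block marker follow JPEG's EOB convention but are assumptions here:

```python
def rlc_ac(ac):
    """Encode the AC sequence as (zero_run, value) pairs: each non-zero
    value is preceded by the number of zeros skipped before it."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append((0, 0))  # end-of-block marker: only zeros remain
    return pairs
```

For example, the zig-zag sequence 5, 0, 0, -2, 0, 0, 0, 1 shrinks to four pairs.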
Huffman
Entropy encoding of the whole result of the former steps finishes the algorithm.
source: Ze-Nian Li: Script Multimedia Systems, Simon Fraser University, Canada
Summary JPEG source: Ze-Nian Li : Script Multimedia Systems, Simon Fraser University, Canada
Compression classes
Source: Steinmetz, Ralf: Multimedia-Technologie: Einführung und Grundlagen, Springer-Verlag
Day 2: October 3, 16:00-19:15: Compression Methods
Lossless compression
Entropy encoding
Source encoding
Hybrid compression
Lossy compression
JPEG encoding
Thanks again for your attention! Hope to see you next Thursday, Oct. 9th, Room VI-201