Lecture Information Multimedia Video Coding & Architectures

Multimedia Video Coding & Architectures (5LSE0), Module 01 Introduction to coding aspects 1 Lecture Information Lecturer Prof.dr.ir. Peter H.N. de With Faculty Electrical Engineering, University Technology Eindhoven SPS Video Coding & Architectures (VCA) Address: Flux 5.092, Tel: 040-247.25.40, Email: P.H.N.de.With@tue.nl 2 Peter H.N. de With (p.h.n.de.with@tue.nl ) slides version 1.0 Information: http://vca.ele.tue.nl/courses/5lse0 Manuscript Lecture slides Software Practical Exercises Examination: Oral Examn (70%) + Practical exercises (30%) Mod 01 Part 1: The Need for Compression 3 What is Digital Signal Coding? Digital signals: audio, image, video (multimedia) Will not discuss document compression No detailed lectures on still picture coding (not yet) Coding/Compression = compact representation: Less space required on storage media (hard disks, CD, DVD) Smaller transmission bandwidth required (Internet, terrestrial broadcast, satellite links, mobile phones) Allows for faster uploading/downloading of multimedia files (Internet, authoring environment) 4 1

Why Digital Signal Coding? (1) 5 Why Digital Signal Coding? (2) 6 Digital camera technology 2002 (approx. 400 Euro) Digital (MiniDV) camcorder technology 1995-2010 (1000 Euro) 1200 lines x 1600 pixels per line RGB, 24 bit (3 bytes) per color pixel Total uncompressed (raw, bmp) size is 5.8 Mbyte 36 photo s film: 200 Mbyte (about 1/3 of a CD-ROM) Download time: 1.5 hours (GSM); 12 minutes (ISDN/56k); Multimedia 46 seconds Video Coding (ADSL); & A / 55LSE0 seconds / (slow (40% of original size) IntraNet); <1 second (fast IntraNet) 576 lines x 720 pixels per line YUV, 16 bit (2 bytes) per color pixel Total uncompressed (raw, bmp) size per frame is 830 kbyte 1 hour of video 75 GByte Download time: 16 hours (slow IntraNet); 2 hours (fast IntraNet) What is Digital Signal Coding? (1) 7 What is Digital Signal Coding? (2) 8 Digital signals: audio, image, video (multimedia) Will not discuss document compression in detail Coding/Compression = compact representation: Less space required on storage media Smaller transmission bandwidth required Allows for faster uploading/downloading of multimedia files / communication Examples of products using compression: Digital audio: MP3 Digital image and video cameras: JPEG, MPEG Internet movie downloading/streaming DivX, Qtime, Real players Speech compression for mobile phones: GSM Objectives of this course: Fundamental limitations of compression: Information Theory Techniques by which compression can be achieved: Signal Processing, the key steps What is inside the MPEG standard, DVB, DV etc. Provide an introduction to architectural trade-offs Practical experiments with compression factor and quality: Software package 2

Compression is now an embedded system 9 Lossless vs. Lossy Compression (1) 10 Signal to be coded x(t), x(t,s) Identical A/D RF/IF modulate Compress Baseband modulation Channel symbols Secure FEC Packetization and Multiplexing Coding includes Compression, Security, and Error Control techniques This course focuses on Compression Compressed photo 001010010110 001011010101 Compress Decompress x ( i, y( i, Lossless or Reversible Compression y(i, = x(i, Not very effective for multimedia Used for document compression (Zip) Lossless vs. Lossy Compression (2) x( i, Not Identical Compressed photo 001010010110 001011010101 Compress Decompress Lossy or Non-reversible Compression y(i, x(i, y( i, Examples: JPEG, MPEG, MP3, DivX Can trade compression for amount of difference (quality) 11 Some Ballpark Figures (1) MP3 Audio/Music» CD: 44.1 khz at 16 bits/sample = 700 kbit/sec (mono); Stereo 1.44 Mbit/sec» MP3: 128 kbit/sec» Bit rate : 1.5 bit/sample (bps)» Compression factor : 11 GSM Speech» 8 khz at 8 bit/sample = 64 kbit/sec» GSM: approx. 13 kbit/sec» Bit rate : 1.6 bit/sample (bps)» Compression factor : 4.9 12 3

Some Ballpark Figures (2) JPEG Images» Digital camera: 1600x1200 pixels at 24 bit/color pixel = 46 Mbit/file = 5.8 MByte/file» Good quality JPEG: 600 kbyte» Bit rate : 2.5 bit / color pixel (bpp)» Compression factor : 9.6 13 Compression is not for free (1) 14 MPEG Video» Broadcast TV: 720x576 at 16 bit/color pixel at 25 pictures/second = 166 Mbit/sec = 20 MByte/sec» Commercial tv station MPEG: 4.5 MBit/sec» Bit rate : 0.43 bit / pixel (bpp)» Compression factor : 36.8 lossy Compression is not for free (2) 15 Compression is not for free (3) 16 0.6 bit per pixel (Cf=13) 0.3 bit per pixel (Cf=26) 4

Compression is not for free (4) 17 Compression is not for free (5) 18 Same holds for video High quality MPEG Low quality MPEG And Audio High quality MP3 Low quality MP3 0.2 bit per pixel (Cf=40) Quality of Compressed Signals (1) 19 Quality of Compressed Signals (2) 20 Visual evaluation (always important) Numerical evaluation: PSNR Original signal X Coder bit rate R Visual evaluation PSNR Decoder Compressed signal Y Number Score Quality Scale Impairment Scale 5 Excellent Imperceptible 4 Good (Just) perceptible but not annoying 3 Fair (Perceptible and) slightly annoying 2 Poor Annoying (but not objectionable) 1 Unsatisfactory (Bad) Very annoying (Objectionable) 5

PSNR (db) Quality of Compressed Signals PSNR = 10log 55 50 45 40 35 30 25 20 15 10 Peak Signal - to - Noise 255 2 [ X ( i, Y ( i, ] 7 6 5 4 3 2 1 Number of bits per pixel Ratio = (db = de 2 Visually: Excellent-good Poor very bad 21 Good-Moderate -Poor Mod 01 Part 2: Obvious Ways to Compress Signals 22 PCM: Simple Compression 23 Reduce the #Amplitudes (1) 24 Two obvious ways to compress signals 1. Reduce the sampling rate (subsampling) Fewer audio samples per second Fewer pixels per picture line and fewer lines (images) Fewer pictures per second (video) 2. Reduce the No. of amplitude levels per signal sample: Pulse coded modulation or PCM Obviously we will develop far more advanced techniques in this course 24 bits (16777200 different colors) 6

25 26 Reduce the #Amplitudes (2) Reduce the #Amplitudes (3) 8 bits (256 different colors) Compression factor 3 6 bits (64 different colors) Compression factor 4 27 28 Reduce the #Amplitudes (4) Reduce the #Amplitudes (5) 4 bits (16 different colors) Compression factor 6 1 bit (2 different colors) Compression factor 24 7

29 30 Reduce the #Amplitudes (6) Reduce the #Amplitudes (7) 8 bits (256 different gray values) Compression factor 3 4 bits (16 different gray values) Compression factor 6 31 32 Reduce the #Amplitudes (8) Reduce the #Amplitudes (9) 3 bits (8 different gray values) Compression factor 8 2 bits (4 different gray values) Compression factor 12 8

33 34 But real coding is always better! Mod 01 Part 3: Roadmap Compression JPEG Compression factor 15 Questions to be Answered 35 Questions to be Answered 36 Three questions will play a central role: Three questions will play a central role: 1. Why can signals be compressed? 2. How much can signals be compressed? 3. Which signal processing/information theory algorithms are most efficient in reaching the maximum compression? 1. Why can signals be compressed? 2. How much can signals be compressed? 3. Which signal processing/information theory algorithms are most efficient in reaching the maximum compression? (For lossless and lossy compression) (For lossless and lossy compression) 9

Why can Signals be Compressed? (1) Because signal amplitudes are statistically redundant Example: Signal has integer amplitudes in range [0,7] What is the required number of bits per signal sample (the bit rate) to represent the signal amplitudes? 37 Why can Signals be Compressed? (2) Because signal amplitudes are statistically redundant Example: Signal has integer amplitudes in range [0,7] 3 bit per signal sample (3 bps or bpp) Signal value Codeword 0 000 1 001 2 010 3 011 4 100 5 101 6 110 7 111 38 Why can Signals be Compressed? (3) Because signal amplitudes are statistically redundant Example: Signal has integer amplitudes in range [0,7] Signal amplitudes have following probabilities of occurrence 0,6 0,5 0,4 0,3 0,2 0,1 0 0 1 2 3 4 5 6 7 39 Why can Signals be Compressed? (4) Because signal amplitudes are statistically redundant Example: Signal has integer amplitudes in range [0,7] 3 bit per signal sample (3 bps or bpp) Signal value Probability Codeword 0 0.125 000 1 0 001 2 0.5 010 3 0 011 4 0.125 100 5 0.125 101 6 0 110 7 0.125 111 40 10

Why can Signals be Compressed? (5) Because signal amplitudes are statistically redundant Entropy Coding or Variable Length Coding Example: Signal has integer amplitudes in range [0,7] 2 bits per signal sample Signal value Probability Codeword 0 0.125 100 1 0 2 0.5 0 3 0 4 0.125 101 5 0.125 110 6 0 7 0.125 111 41 Why can Signals be Compressed? (6) Because signal amplitudes are statistically redundant Question 1: What is the shortest average codeword length that one can achieve for a given signal (or source )? Question 2: 42 (Shannon Information Theory) How did you design/derive those code words? (Construction Recipes) Why can Signals be Compressed? (7) Because infinite accuracy of signal amplitudes is (perceptually) irrelevant 43 Why can Signals be Compressed? (8) Because infinite accuracy of signal amplitudes is (perceptually) irrelevant 44 x (i) Represent amplitudes coarser ( PCM ) 24 bit (16777200 different colors) The difference between these two pictures is perceptually irrelevant because our visual system is insensitive for such small differences. But the files differ a factor of 3 in size! 8 bit (256 different colors) Compression factor 3 - Numerical difference: SNR Perceptual difference 11

Why can Signals be Compressed? (9) 45 Example of 8-Level Quantizer 46 Because infinite accuracy of signal amplitudes is (perceptually) irrelevant Typically implemented by a Quantizer (intelligent rounding ) x (i) Q x( i) Z or R 255 x (i) Q 0 255 x(i) Why can Signals be Compressed? (10) Because infinite accuracy of signal amplitudes is (perceptually) irrelevant Usually in combination with variable length encoding 110 0 x(i) 47 Why can Signals be Compressed? (11) Because infinite accuracy of signal amplitudes is (perceptually) irrelevant By choosing the number of representation levels, we trade off rate versus distortion Number of representation levels ~ average codeword length L x (i) 010010 Q VLC 48 111010 x (i) 010010 Q VLC Distortion = Difference -x(i) 12

Why can Signals be Compressed? (12) Because infinite accuracy of signal amplitudes is (perceptually) irrelevant Question 1: What is the best possible trade-off between required bit rate and resulting distortion? (Rate-Distortion Theory) Question 2: How do we implement a system that gives us that best possible trade-off? (Scalar and Vector Quantization Theory) 49 Why can Signals be Compressed? (13) 250 200 150 100 50 0 2000 4000 6000 8000 1 10 4 Any signal that carries information contains a certain structure or stochastic dependencies 50 Why can Signals be Compressed? (14) 250 51 Why can Signals be Compressed? (15) 52 200 150 100 50 0 500 1000 1500 2000 2500 3000 3500 4000 4500 S 200 150 100 0 1000 2000 3000 4000 5000 O one image line 13

Why can Signals be Compressed? (16) Example of signal with no dependencies between successive amplitudes (Gaussian uncorrelated noise) Indeed noise compresses badly 53 Why can Signals be Compressed? (17) Exploit the stochastic dependencies (~ correlation): Predicting the next value of the basis of the current one xˆ ( i, = x( i 1, Calculate the amplitude of prediction difference x( i, = x( i, xˆ( i, = x( i, x( i 1, This difference has (on the average) a much smaller amplitude range (~smaller variance). Quantize prediction difference in few representation levels 54 Why can Signals be Compressed? (18) 55 Why can Signals be Compressed? (19) 56 Positive value Zero! Negative value x (i) T Q VLC 010010 14

Why can Signals be Compressed? (20) Dependency is commonly expressed by the autocorrelation function: (k) R X R X ( k ) = E[ X ( n) X ( n k)] 1 R X ( k ) = N N n= 0 X ( n) X ( n k) 57 Why can Signals be Compressed? (21) For images: 2-D autocorrelation function R X ( k, l) R X 58 ( k, l) = E[ X ( n, m) X ( n k, m l)] k k l Why can Signals be Compressed? (22) Question 1: What is the best possible exploitation of the correlation (dependencies) in natural signals? (Rate-Distortion Theory) Question 2: How do we implement a system that exploits the correlation in natural signals? (Compression algorithms: - DPCM - Subband/wavelet - Transform/DCT - Motion compensation) 59 Analog capturing device (camera, microphone) Sampling Summarizing Fine Quantization A/D CONVERTER Transform Quantizer VLC encoding COMPRESSION/SOURCE CODING Encipher Error protect. CHANNEL CODING PCM encoded or raw signal ( wav, bmp, ) 60 Compressed bit stream (mp3, jpg, ) Channel bit stream 15