1 Introduction to Video Coding
o Motivation & Fundamentals
o Principles of Video Coding
o Coding Standards
Special thanks to Hans L. Cycon from FHTW Berlin for providing first-hand knowledge and much of the material!

2 Video Data: the Problem
o PAL uncompressed
- 768 x 576 pixels per frame
- x 3 bytes per pixel (24 bit colour)
- x 25 frames per second
- approx. 32 MB per second, approx. 1.9 GB per minute
Raw video data is not device compliant! Even cameras need immediate compression.
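The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: the constant names are chosen here for illustration, and the results are reported in binary units (MiB, GiB), which is an assumption about how the slide rounds.

```python
# Raw PAL data rate, using the figures from the slide.
WIDTH, HEIGHT = 768, 576      # pixels per frame
BYTES_PER_PIXEL = 3           # 24 bit colour
FPS = 25                      # frames per second

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
mib_per_second = bytes_per_second / 2**20
gib_per_minute = bytes_per_second * 60 / 2**30

print(f"{bytes_per_second} bytes/s")
print(f"{mib_per_second:.1f} MiB/s")
print(f"{gib_per_minute:.2f} GiB/min")
```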

3 Signal Transmission Scheme
Coder (saving of bit rate) -> channel -> Decoder (reconstruction of signal)

4 Fundamentals
Why don't we just use *zip?
o Suppose our video pixels attain N values i with probability p_i
o and we know nothing else about them (just iid random)
o Then (Shannon): the entropy
$$H = -\sum_{i=1}^{N} p_i \log_2(p_i)$$
is the lower bound for the data needed per pixel (mean information)
o For individually encoded pixels this results in optimal compression rates of only around 1.33!
Image and video pixels are not iid random, but highly correlated! These correlations are hidden at the level of individual pixels.
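As a rough illustration of the entropy bound, the following sketch estimates H from the relative frequencies of a symbol sequence. The function name `entropy_bits` and both toy sequences are hypothetical and not taken from the lecture.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy H = -sum(p_i * log2(p_i)) in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical toy "pixel" sequences with four possible values each.
uniform = [0, 1, 2, 3] * 4          # all values equally likely
skewed = [0] * 13 + [1, 2, 3]       # one value dominates

print(entropy_bits(uniform))   # 2 bits per symbol: no gain over a 2-bit fixed code
print(entropy_bits(skewed))    # below 2 bits: an entropy coder can gain here
```

The more uneven the value distribution, the lower the entropy and the higher the achievable compression for individually coded symbols.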

5 Image Compression Concepts
o lossless, by removing redundancies
- spatial redundancies
- temporal redundancies
- spatial-temporal correlations
- statistical redundancies
o lossy, by removing (visually) irrelevant information
- reduction of accuracy in colors, contours and motion

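6 Image Quality Measure
$$Q_{\mathrm{PSNR}} = 20 \log_{10} \frac{255}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( f^{\mathrm{org}}_i - f^{\mathrm{cmp}}_i \right)^2}} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}$$
with MSE = (1/N) sum_i (f_org_i - f_cmp_i)^2. Higher PSNR means higher performance.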
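A minimal sketch of the PSNR measure for 8-bit samples follows. The function name `psnr` and the two sample rows are illustrative assumptions; images are treated as flat lists of samples.

```python
import math

def psnr(original, compressed, peak=255):
    """PSNR in dB between two equally sized sequences of 8-bit samples."""
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf          # identical images
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical sample rows: four of eight samples are off by one, so MSE = 0.5.
org = [52, 55, 61, 66, 70, 61, 64, 73]
cmp_ = [53, 55, 60, 66, 71, 61, 64, 72]
print(f"{psnr(org, cmp_):.2f} dB")
```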

7 The Idea of Transformation
o Mathematically, an image can be considered as a matrix in some high-dimensional space
o Transformations rotate this matrix into an advantageous position (of sparse population)
o This results in a compaction of energy: most of the coefficients will be (nearly) zero
o This simplifies the separation of irrelevant information

8 Transform Coding
Compressor: Initial Image -> T -> Q -> PC/C -> Bitstream
o Transformation (T): De-correlation, compaction of energy, reversible
o Quantisation (Q): Elimination of psycho-visually irrelevant information, not reversible
o Pre-Coder (PC): Pre-processing for additional elimination of statistical redundancies, reversible
o Coder (C): Generation of variable length codes, reversible
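Why quantisation is the one irreversible stage can be seen in a small round trip. The sketch below assumes a uniform quantiser with a single step size; real codecs use per-coefficient quantisation tables, and the coefficient values are made up for illustration.

```python
def quantise(coeffs, step):
    """Map each coefficient to an integer level (the lossy step)."""
    return [round(c / step) for c in coeffs]

def dequantise(levels, step):
    """Reconstruct approximate coefficients from the integer levels."""
    return [q * step for q in levels]

# Hypothetical transform coefficients, large to small.
coeffs = [-415.4, 30.2, -61.2, 27.2, 5.6, -2.1, 0.5, -0.3]
levels = quantise(coeffs, 16)
restored = dequantise(levels, 16)

print(levels)     # small coefficients collapse to 0
print(restored)   # close to, but not equal to, the input
```

The reconstruction error per coefficient is at most half the step size, and the zeros produced here are what the later coding stages exploit.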

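9 Spatial Decorrelation: Discrete Cosine Transform - DCT
Transformation of spatial into frequency coordinates
$$F(u,v) = \frac{\Lambda(u)\,\Lambda(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos\frac{(2i+1)\,u\pi}{16} \, \cos\frac{(2j+1)\,v\pi}{16} \, f(i,j)$$
$$\Lambda(\xi) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } \xi = 0 \\ 1 & \text{otherwise} \end{cases}$$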
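The DCT formula can be transcribed directly into code. The following is a deliberately naive sketch (four nested loops, no fast algorithm); the function name `dct_8x8` and the flat test block are assumptions for illustration.

```python
import math

def dct_8x8(block):
    """Forward 8x8 DCT, a direct transcription of F(u, v) from the slide."""
    def lam(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for i in range(8):
                for j in range(8):
                    s += (math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16)
                          * block[i][j])
            out[u][v] = lam(u) * lam(v) / 4 * s
    return out

# A completely flat block: all energy should land in the DC coefficient F(0, 0).
flat = [[100] * 8 for _ in range(8)]
F = dct_8x8(flat)
print(round(F[0][0], 6))
```

For the flat block the only non-zero coefficient is F(0,0) = 800, which is the extreme case of energy compaction.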

10 Concept of conventional DCT coding (JPEG, MPEG, H.26x)
block scanning -> DCT -> quantisation -> zig-zag scanning -> VLC -> channel
o Input block: 8 x 8 x 8 bit = 512 bit
o After DCT: 8 x 8 x 10 bit = 640 bit
o After quantisation: 8 x 8 x 4 bit = 256 bit
o After zig-zag scanning: ..., 70, 10, 20, 10, 10, 30, 10, 10, 0, 0, 0, ...
o After VLC: 01, 00111, 01, 01, 01, 01, 010, 01, ... EOB -> 26 bit
o Compression factor = 512/26 ≈ 20
Source: Schäfer HHI [W2]
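The zig-zag scan and the end-of-block cut can be sketched as follows. The helper names, the string marker "EOB" and the small 4 x 4 block are assumptions for illustration; the slide's example uses 8 x 8 blocks and a real VLC table, which is not reproduced here.

```python
def zigzag_order(n=8):
    """Return (row, col) index pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col of each anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # odd diagonals run from top-right to bottom-left, even ones the other way
        order.extend(diag if s % 2 else reversed(diag))
    return order

def scan_with_eob(block):
    """Zig-zag scan a quantised block and replace the trailing zeros by 'EOB'."""
    seq = [block[r][c] for r, c in zigzag_order(len(block))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ["EOB"]

# Hypothetical quantised 4 x 4 block: energy sits in the top-left corner.
block = [
    [70, 10, 10, 0],
    [20, 10, 0, 0],
    [10, 0, 0, 0],
    [0, 0, 0, 0],
]
print(scan_with_eob(block))
```

Because the scan visits low frequencies first, the zeros of the high frequencies end up in one long run that a single end-of-block symbol can replace.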

14 Transformed Representation
o Concentration of information in few spectral coefficients (decorrelation)
[Figure: three panels, each labelled "... of 64 coefficients"]
Source: Schäfer HHI [W2]

15 Problem of DCT: Blocking Artefacts
[Figure: panels "Original" and "DCT 1:64"]

16 Alternative Transformation: DWT
[Figure: panels "Original", "DCT 1:64" and "WLT 1:64"]

17 Transform Coding / Decoding (DCT- or Wavelet-based)
Coding: Image -> T (lossless decorrelation) -> Q (lossy quantizer) -> C (entropy coder) -> compressed bitstream
Decoding: compressed bitstream -> IC -> IQ -> IT -> Rec. Image

18 Temporal Decorrelation: Difference Coding
In slow-moving scenes many subsequent images are nearly alike: temporal redundancy is eliminated by coding only the difference between subsequent images (Inter-Frames).
To limit accumulating errors, full images (Intra-Frames) are coded regularly (about one in 50 frames).
I = Intra, P = Inter
Frame sequence over time t: I P P P P I P P P (a GOP runs from one I frame up to the next)
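A toy version of difference coding with regular intra frames might look like this. Frames are flat lists of pixel values, the function names and `gop_size` parameter are hypothetical, and the differences are stored losslessly. A real coder would quantise them and predict from the reconstructed previous frame, which is omitted here.

```python
def encode_gop(frames, gop_size):
    """Code every gop_size-th frame as intra ('I'), the rest as differences ('P')."""
    coded = []
    for k, frame in enumerate(frames):
        if k % gop_size == 0:
            coded.append(("I", list(frame)))
        else:
            prev = frames[k - 1]
            coded.append(("P", [a - b for a, b in zip(frame, prev)]))
    return coded

def decode_gop(coded):
    """Rebuild the frames by adding each difference to the previous frame."""
    frames = []
    for kind, data in coded:
        if kind == "I":
            frames.append(list(data))
        else:
            frames.append([p + d for p, d in zip(frames[-1], data)])
    return frames

# Hypothetical 4-pixel frames of a slowly changing scene, then a scene cut.
frames = [
    [10, 10, 10, 10],
    [10, 11, 10, 10],
    [10, 11, 12, 10],
    [10, 11, 12, 10],
    [50, 50, 50, 50],
]
coded = encode_gop(frames, gop_size=4)
print([kind for kind, _ in coded])
print(coded[3])
```

In the slowly changing part the P frames consist almost entirely of zeros, which is exactly what the subsequent entropy coder compresses well.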

19 Hybrid Decorrelation: Difference Coding with Motion Prediction Source: Schäfer HHI [W2]

20 Block Motion Compensation
Prediction by block matching (frame k-1 -> frame k)
o Decomposition of the previous picture into blocks
o Move & match blocks on top of the next picture
o Simplify by motion vector discretisation
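Block matching can be sketched as an exhaustive search over integer displacements. Note the assumptions: this sketch uses the common variant that takes a block of frame k and searches for it in frame k-1 (the slide describes the matching from the previous picture's side), it uses the sum of absolute differences (SAD) as the cost, and all names and test frames are made up for illustration.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between a block in prev and a block in cur."""
    return sum(abs(prev[py + y][px + x] - cur[cy + y][cx + x])
               for y in range(size) for x in range(size))

def best_motion_vector(prev, cur, cx, cy, size, search):
    """Full search: return (cost, dx, dy) such that the block of cur at (cx, cy)
    is best predicted by the block of prev at (cx + dx, cy + dy)."""
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            if 0 <= px <= w - size and 0 <= py <= h - size:
                cost = sad(prev, cur, px, py, cx, cy, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best

def make_frame(x0, y0):
    """Hypothetical 6 x 6 frame: black background with one 2 x 2 patch."""
    frame = [[0] * 6 for _ in range(6)]
    patch = [[9, 8], [7, 6]]
    for y in range(2):
        for x in range(2):
            frame[y0 + y][x0 + x] = patch[y][x]
    return frame

prev = make_frame(1, 1)   # frame k-1: patch at x=1, y=1
cur = make_frame(3, 2)    # frame k:   patch moved to x=3, y=2
print(best_motion_vector(prev, cur, cx=3, cy=2, size=2, search=2))
```

Restricting dx and dy to integers within a small search window is one simple form of the motion vector discretisation mentioned on the slide.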

26 Bidirectional Prediction Coding
Frame sequence: I B B P B B P
o I frames - Intracoding (JPEG)
o P frames - Uni-directional predictive coding
o B frames - Bi-directional predictive coding
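Since a B frame needs both its past and its future reference, frames cannot be coded in display order. The sketch below shows one possible reordering, assuming each B frame depends on the nearest I or P frame on either side; the function name `coding_order` is hypothetical.

```python
def coding_order(display):
    """Reorder frame types from display order to a possible coding order:
    each reference frame (I or P) is sent before the B frames that precede it."""
    out, pending_b = [], []
    for idx, kind in enumerate(display):
        if kind == "B":
            pending_b.append((idx, kind))      # wait for the next reference frame
        else:
            out.append((idx, kind))
            out.extend(pending_b)
            pending_b = []
    return out + pending_b                     # trailing B frames, if any

order = coding_order(list("IBBPBBP"))
print(" ".join(f"{kind}{idx}" for idx, kind in order))
```

The price of bi-directional prediction is therefore extra delay and frame buffering at both the coder and the decoder.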

27 Bi-directional Prediction

28 Statistical Coding Principles / Entropy Coding
o Huffman Coder (variable length symbolic coder)
- Assign to every fixed word a variable length code word
- Frequent words get short code words, rare words long code words
- Improvement: Arithmetic Coder. Map entire sequences of symbols onto [0,1] (also binary mapping)
o Run-Length Coder
- abbbbbbbbcc -> a7b!cc
o Pattern Substitution: Dictionary Coding
- Represent repeating sequences of symbols by pointers
o Context Modelling (Pre-Coding)
- Determine local conditional probabilities for symbols, instead of global frequencies
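The Huffman principle (frequent symbols get short code words) can be sketched with a priority queue. The function name `huffman_code` is an assumption; the input string is the run-length example from the slide, reused here only as sample data.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code table {symbol: bitstring} from a symbol sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:                       # degenerate case: only one symbol
        return {next(iter(counts)): "0"}
    # heap entries: (weight, unique tie breaker, {symbol: partial code word})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, t0 = heapq.heappop(heap)        # the two least frequent subtrees
        w1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

text = "abbbbbbbbcc"
code = huffman_code(text)
encoded = "".join(code[s] for s in text)
print(code)
print(len(encoded), "bits instead of", 2 * len(text), "with a fixed 2-bit code")
```

The frequent symbol b receives a 1-bit code word, while the rare symbols a and c receive 2-bit code words.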

29 Layered Coding
Scalability and adaptability to varying play-out scenarios may be achieved through coding layers:
o Spatial layers: range of (pixel) resolutions
o Data partitioning layers: high and low priority data
o SNR layers: range of visual resolutions
o Temporal layers: range of frame rates

30 Video Coding Standards
Video coding standards are defined in ranges of applicability (image resolution, bandwidth, computational complexity, power consumption, ...), initially for specific target groups:
o ISO Moving Picture Experts Group (MPEG)
- MPEG-1 (1989): CD-ROM applications at 1.5 Mb/s
- MPEG-2 (1991): High quality coding at 2 to 50 Mb/s
- MPEG-4 (1998): Scalable, 64 kb/s to 4 Mb/s, 100 Mb/s (V3)
o ITU-T
- H.261 (1991): Video telephony, video conferencing, 64 kb/s to 1 Mb/s
- H.263 (1996): Low bit rate coding (ISDN), 8 kb/s to 1 Mb/s
- H.26L (2001): Low bit rate, low complexity
- H.264/AVC (2003): Joint with ISO, doubled compression compared to MPEG-4, 8 kb/s to 100 Mb/s

31 Milestones in Video Compression
[Figure: PSNR [dB] over bit rate [kbps] for the test sequence Foreman (10 Hz, QCIF, 133 frames encoded); curve labels include DCT (Motion JPEG) (1985), H.120, MPEG1/..., MPEG4/H... and H.26L (2001); annotations: visual gain 10 dB, bit rate reduction 85%]

32 MPEG-2
o Aiming at TV quality (interlacing), but generic picture format: the DVD standard
o Discrete Cosine Transform (8 x 8 blocks)
o Motion compensation and prediction (I, P, B frames)
o Supports coding layers
o Error resilience by interpolation
o Supports multiple audio and video flows

33 H.263
o Aiming at telecommunication: CIF + QCIF formats. The old video conferencing standard
o Discrete Cosine Transform (8 x 8 blocks)
o Improved motion compensation (precision, variable block size, overlapping blocks)
o Prediction with PB-frame (interpolated B component)
o Advanced negotiability
o Arithmetic coding

34 MPEG-4
o Ambitious standard to encode multimedia streams (including interactivity)
o Focus of interest on video compression, based on a collection of profiles: Simple, Advanced Simple, ...
o Content based compression, motion prediction, scaling
o Concept of Video Object Planes (I/P/B-VOPs)
- Motion estimation and compensation
- Shape coding
- Texture coding (DCT, but also wavelet based)
- Sprite coding
o Adaptive techniques (motion comp., arithmetic coding, error resilience, ...)

35 MPEG4 Generic Coding Scheme

36 MPEG4 System Model

37 H.264/AVC
o Aiming at full scalability: from 3GPP to HDTV
o Approval May 2003 (Editor T. Wiegand, HHI)
o New 4x4 integer transform (of DCT kind)
o Many modes:
- Adaptive block size for transform
- Adaptive blocking for motion compensation
- Adaptive Intra prediction
- Two VL entropy codings: CAVLC + CABAC (D. Marpe, HHI)
o Content adaptive deblocking filters
o Complexity:
- ... times MPEG-2 for encoding
- 3 times MPEG-2 for decoding

38 H.264: Structure
[Figure: coder block diagram with embedded decoder; blocks: Coder Control, Transform/Quantizer, Deq./Inv. Transform, Intra/Inter switch, Motion-Compensated Predictor, Motion Estimator, Entropy Coding; signals: Control Data, Quant. Transf. coeffs, Motion Data]

39 Deblocking Filter Source: Schäfer HHI [W2]

40 What else?
o MPEG-7: Multimedia Content Description Interface
- Meta data standard
- Goal: describe multimedia data for search, retrieval and (combined/synchronized) play out
o MPEG-21: Multimedia Framework (just finishing)
- Meta data standard for multimedia applications
o Proprietary codecs:
- RealNetworks: Helix
- Microsoft: VC-1
- a few more
- (at most) similar performance, similar ideas visible
- pay per???

41 References
o Hans L. Cycon: Digitale Audio- und Videotechnik, Vorlesungsskript
o Ralf Schäfer, HHI
o W. Effelsberg, R. Steinmetz: Video Compression Techniques, dpunkt.verlag
o Y. Shi, H. Sun: Image and Video Compression for Multimedia Engineering, CRC Press, Boca Raton
o N. Chapman, J. Chapman: Digital Multimedia, 2nd edition, Wiley, Chichester, GB
o Detlev Marpe, Thomas Wiegand, and Gary J. Sullivan: The H.264/MPEG4-AVC Standard and its Fidelity Range Extensions, IEEE Communications Magazine, September 2005

