Video Codecs. National Chiao Tung University Chun-Jen Tsai 1/5/2015

Similar documents
MPEG-4: Simple Profile (SP)

Video Compression Standards (II) A/Prof. Jian Zhang

Interframe coding A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames pri

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46

High Efficiency Video Coding: The Next Gen Codec. Matthew Goldman Senior Vice President TV Compression Technology Ericsson

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France

Week 14. Video Compression. Ref: Fundamentals of Multimedia

Advanced Video Coding: The new H.264 video compression standard

High Efficiency Video Coding. Li Li 2016/10/18

H.264 / AVC (Advanced Video Coding)

Video Compression An Introduction

Selected coding methods in H.265/HEVC

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications:

Welcome Back to Fundamentals of Multimedia (MR412) Fall, 2012 Chapter 10 ZHU Yongxin, Winson

HEVC The Next Generation Video Coding. 1 ELEG5502 Video Coding Technology

The Scope of Picture and Video Coding Standardization

Digital Video Processing

RATE DISTORTION OPTIMIZATION FOR INTERPREDICTION IN H.264/AVC VIDEO CODING

PREFACE...XIII ACKNOWLEDGEMENTS...XV

VIDEO COMPRESSION STANDARDS

H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression

Introduction to Video Encoding

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

ECE 417 Guest Lecture Video Compression in MPEG-1/2/4. Min-Hsuan Tsai Apr 02, 2013

10.2 Video Compression with Motion Compensation 10.4 H H.263

Video coding. Concepts and notations.

THE H.264 ADVANCED VIDEO COMPRESSION STANDARD

VHDL Implementation of H.264 Video Coding Standard

Video Coding in H.26L

EE Low Complexity H.264 encoder for mobile applications

ITU-T DRAFT H.263 VIDEO CODING FOR LOW BITRATE COMMUNICATION LINE TRANSMISSION OF NON-TELEPHONE SIGNALS. DRAFT ITU-T Recommendation H.

Lecture 13 Video Coding H.264 / MPEG4 AVC

Chapter 10. Basic Video Compression Techniques Introduction to Video Compression 10.2 Video Compression with Motion Compensation

Video Coding Standards. Yao Wang Polytechnic University, Brooklyn, NY11201 http: //eeweb.poly.edu/~yao

NEW CAVLC ENCODING ALGORITHM FOR LOSSLESS INTRA CODING IN H.264/AVC. Jin Heo, Seung-Hwan Kim, and Yo-Sung Ho

A COMPARISON OF CABAC THROUGHPUT FOR HEVC/H.265 VS. AVC/H.264. Massachusetts Institute of Technology Texas Instruments

4G WIRELESS VIDEO COMMUNICATIONS

EE 5359 H.264 to VC 1 Transcoding

H.264 / AVC Context Adaptive Binary Arithmetic Coding (CABAC)

An Efficient Table Prediction Scheme for CAVLC

INTERNATIONAL TELECOMMUNICATION UNION ITU-T H.263 (03/96) TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU TRANSMISSION OF NON-TELEPHONE SIGNALS VIDEO

The VC-1 and H.264 Video Compression Standards for Broadband Video Services

Introduction of Video Codec

Using animation to motivate motion

Standard Codecs. Image compression to advanced video coding. Mohammed Ghanbari. 3rd Edition. The Institution of Engineering and Technology

Video Coding Using Spatially Varying Transform

Video Coding Standards

White paper: Video Coding A Timeline

Emerging H.26L Standard:

Lecture 5: Video Compression Standards (Part2) Tutorial 3 : Introduction to Histogram

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

An Improved H.26L Coder Using Lagrangian Coder Control. Summary

Transcoding from H.264/AVC to High Efficiency Video Coding (HEVC)

FPGA based High Performance CAVLC Implementation for H.264 Video Coding

A comparison of CABAC throughput for HEVC/H.265 VS. AVC/H.264

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

Encoding Video for the Highest Quality and Performance

MPEG-4 Part 10 AVC (H.264) Video Encoding

COMPARATIVE ANALYSIS OF DIRAC PRO-VC-2, H.264 AVC AND AVS CHINA-P7

Heinrich Hertz Institute (HHI), Einsteinufer 37, D Berlin, Germany Tel: , Fax: ,

Video Compression MPEG-4. Market s requirements for Video compression standard

Upcoming Video Standards. Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc.

The Basics of Video Compression

H.264 STANDARD BASED SIDE INFORMATION GENERATION IN WYNER-ZIV CODING

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS

Lecture 4: Video Compression Standards (Part1) Tutorial 2 : Image/video Coding Techniques. Basic Transform coding Tutorial 2

Advanced Encoding Features of the Sencore TXS Transcoder

CMPT 365 Multimedia Systems. Media Compression - Video

Video Coding Standards: H.261, H.263 and H.26L

Introduction to Video Coding

Xin-Fu Wang et al.: Performance Comparison of AVS and H.264/AVC 311 prediction mode and four directional prediction modes are shown in Fig.1. Intra ch

Introduction to Video Compression

Anatomy of a Video Codec

PERFORMANCE ANALYSIS AND IMPLEMENTATION OF MODE DEPENDENT DCT/DST IN H.264/AVC PRIYADARSHINI ANJANAPPA

High Efficiency Video Coding (HEVC) test model HM vs. HM- 16.6: objective and subjective performance analysis

DigiPoints Volume 1. Student Workbook. Module 8 Digital Compression

MPEG-2. ISO/IEC (or ITU-T H.262)

Digital Image Representation Image Compression

Title page to be provided by ITU-T ISO/IEC TABLE OF CONTENTS

CMPT 365 Multimedia Systems. Media Compression - Image

Wireless Communication

Chapter 2 Joint MPEG-2 and H.264/AVC Decoder

Mali GPU acceleration of HEVC and VP9 Decoder

Mark Kogan CTO Video Delivery Technologies Bluebird TV

COMPARISON OF HIGH EFFICIENCY VIDEO CODING (HEVC) PERFORMANCE WITH H.264 ADVANCED VIDEO CODING (AVC)

In the name of Allah. the compassionate, the merciful

Reducing/eliminating visual artifacts in HEVC by the deblocking filter.

Lecture 6: Compression II. This Week s Schedule

Entropy Coding in HEVC

Low power context adaptive variable length encoder in H.264

Ch. 4: Video Compression Multimedia Systems

IMAGE COMPRESSION. Image Compression. Why? Reducing transportation times Reducing file size. A two way event - compression and decompression

PERFORMANCE ANALYSIS AND COMPARISON OF DIRAC VIDEO CODEC WITH H.264 / MPEG-4 PART 10 AVC ARUNA RAVI

Multimedia Decoder Using the Nios II Processor

H.264/AVC BASED NEAR LOSSLESS INTRA CODEC USING LINE-BASED PREDICTION AND MODIFIED CABAC. Jung-Ah Choi, Jin Heo, and Yo-Sung Ho

A 4-way parallel CAVLC design for H.264/AVC 4 Kx2 K 60 fps encoder

VIDEO AND IMAGE PROCESSING USING DSP AND PFGA. Chapter 3: Video Processing

Lecture Coding Theory. Source Coding. Image and Video Compression. Images: Wikipedia

Transcription:

Video Codecs National Chiao Tung University Chun-Jen Tsai 1/5/2015

Video Systems A complete end-to-end video system: A/D color conversion encoder decoder color conversion D/A bitstream YC B C R format RGB Question: how do we compress video data for common usage? MPEG-1 video data rate: 1 ~ 2 mbps MPEG-2 video data rate: 2 ~ 20 mbps MPEG-4 video data rate: Simple Profile, 64kbps ~ 1.5 mbps AVC/H.264, 32 kbps ~ 20 mbps 2/42

Video Frame Representation Video frame are represented in YC B C R 4:2:0 format RGB format is well-known, but not suitable for video coding YC B C R is used for video coding because: Color components can be subsampled easily Human eyes are less sensitive to color gradient Some people refer to YC B C R space as YUV color space 4:2:0 stands for color subsampling one luma sample two chroma samples An *.yuv file stores video data frame-by-frame. Each frame stores complete luma sample before chroma samples. Y: luma; C B, C R : chroma For more info., see Charles Poynton s website: http://www.poynton.com/ 3/42

Color Space Conversion RGB YC B C R : Y = α red Red + α green Green + α blue Blue C B = (Blue - Y) / (2 2 α blue ) C R = (Red - Y) / (2 2 α red ) YC B C R RGB: Red = C R (2 2 α red ) + Y Green = (α blue Blue - α red Red) / α green Blue = C B (2 2 α blue ) + Y Coefficients table: Recommendation α red α green α blue BT-601 0.2990 0.5870 0.1140 BT-709 0.2126 0.7152 0.0722 4/42

History of Video Standards : developed by ITU-T/VCEG : developed by ISO/MPEG : official joint work by ISO & ITU-T ITU-T Standards ITU-T H.261 (video telephony) H.263/H.263+/H.263++ (video telephony, MMS) Joint Standards MPEG 2 Part 2 (a.k.a. H.262) AVC: MPEG-4 Part 10 / H.264 (ed.1 ~ 3) HEVC: MPEG-H Part 2 / H.265 (ed. 1) ISO/IEC Standards MPEG 1 Part 2 MPEG 4 Part 2 (ver.1 ~ ver.3) 1984 1986 1988 1990 1992 1994 1996 1998 2000 2002 2004 2007 2008 2009... 2013 5/42

From Video Frame to Bitstream video sequence I P P I B P time I: Intra coding (spatially-compensated) P: Inter coding (motion-compensated) B: Bi-inter coding (motion-compensated) video frame slices macroblocks Each pixel is composed of a luma (Y) and two chroma (C B, C R ) components Simply put, each block is composed of 386 numbers from the range of [0, 255] Picture header Slice Data Slice Data Slice Data Compressed bitstream Video header Picture header... Slice Data Note: In the past, video experts tries to break a video frame into objects instead of square blocks before coding, but the idea didn t fly! 6/42

Components of MPEG Video Codecs Popular video codecs today are all block-based motion-compensated transform codecs, which composed of four modules: Predictive coder (loss-less) Transform coder (loss-less, theoretically) Quantizer (lossy) Entropy coder (loss-less) YC B C R Prediction T Q Entropy Packing bitstream encoder decoder Prediction 1 T 1 Q 1 Entropy 1 Unpacking * T transform; Q Quantization 7/42

Predictive Coder Predictive coders perform the Guess Work in a video codec Two types of predictions are available: Spatial prediction (a.k.a. intra-prediction) Temporal prediction (a.k.a. inter-prediction) 8/42

Spatial Prediction Pixels are predicted using their neighboring pixels Assume similar pixels extend in this direction Prediction model (predictor) The predicted model may not match the real pixels exactly, therefore, the errors between the model and the real pixels must be recorded 9/42

Temporal (Motion) Prediction The most important prediction coding technique for video is called motion-compensated prediction: motion vector R M m n blocks reference frame current frame The block M can be represented by the motion vector plus the differences between R and M 10/42

How Big Should a Block Be? It would be ideal if the spatio-temporal prediction is based on the natural object boundary Engineers try this idea and failed Today, most successful codecs use variable block sizes for prediction Example: in H.264, two-level quadtree partition is used Level one 16x16 16x8 8x16 8x8 0 0 1 0 0 1 1 2 3 Level two 0 0 1 0 1 8x8 8x4 4x8 0 1 2 3 4x4 11/42

Advantages of Small Prediction Blocks The smaller the block size is, the better your model fits video data 16 16 macroblocks reference frame current frame 12/42

½-Pixel Motion Compensation Since MV resolution is half-a-pixel, when the MV is, say, (-10.5, 4.5), we must make up the predictor block R by interpolation: A p B q r Integer pixel position C D Half pixel position p = (A + B + 1 δ)/2 q = (A + C + 1 δ)/2 r = (A + B + C + D + 2 δ)/4 Note: δ = 0 or 1, is a rounding control parameter. δ is data-dependent and determined by the encoder. 13/42

Motion Vector Coding Motion vectors are also predictively coded The predictor is the median of MV1, MV2, and MV3 14/42

Transform Coder (DCT) 2D Discrete Cosine Transform (DCT) is used: Forward transform (for encoder): Backward transform (for decoder): Due to rounding issues, DCT can not be computed precisely by a computer In MPEG-2 & MPEG-4 part 2, codec modifies the last coefficient of each block by ±1 to reduce the mismatch effect = = + + = 1 0 1 0 2 1) (2 cos 2 1) (2 )cos, ( ) ( ) ( ), ( N x N y N v y N u x y x f v C u C v u F π π = = + + = 1 0 1 0 2 1) (2 cos 2 1) (2 )cos, ( ) ( ) ( ), ( N u N v N v y N u x v u F v C u C y x f π π = = 0, 0, ) ( 2 1 t t t C N N Note: In both equations, N is the size of block, and 15/42

Quantization Module The transformed coefficients are then quantized using a staircase function (with stepsize i ): Output Levels y M y i i y i = x / i b i-1 b i Input Levels x How do we choose the right i? 16/42

MPEG-4 Part 2 DC Prediction Pick DC predictor based on gradients of the DC s: If ( DC A DC B < DC B DC C ) DC X = DC C else DC X = DC A B C D or or A X Y Block (8x8) Macroblock (16x16) 17/42

MPEG-4 Part 2 AC Prediction Coefficients are predicted from previous coded blocks. The best direction is chosen based on the DC prediction. B or C or D A X Y M macroblock 18/42

Rate-Distortion (R-D) Trade-off The R-D function depends on the codec model as well as the video content (segment) quality Good operating points! bit-budget range bit-budget range rate 19/42

Rate-Distortion Optimization Theoretical R-D function is a characteristics of the video content Operational R-D function is determined by the codec Distortion codec #1 codec #2 theoretical limit from information theory Rate 20/42

DCT Coefficients Entropy Coding The transform coefficients are coded using an entropy coder Today, many popular video codecs perform entropy coding in three steps: Convert 2-D data to 1-D array of coefficients (zigzag scan) Convert 1-D array of coefficients to 1-D array of symbols (run-length coding) Variable-length coding (VLC) of the run-length symbols 21/42

Zigzag Scan Zigzag scan is used to map the 2-D array of DCT coefficients to an 1-D array: 0 1 5 6 14 15 27 28 2 4 7 13 16 26 39 42 3 8 12 17 25 30 41 43 9 11 18 24 31 40 44 53 10 19 23 32 39 45 52 54 20 22 33 38 46 51 55 60 21 34 37 47 50 56 59 61 35 36 48 49 57 58 62 63 22/42

VLC-based Entropy Coding Take MPEG-1/2/4 for example, each nonzero coefficient in the 1-D array is converted to a (Last, Run, Level) symbol: Last: is this the last non-zero coefficient? Run: the number of zeros precede this coefficient Level: the value of this coefficient These symbols are coded using variable length codes (VLC) 23/42

Entropy Coding Example 78 0 0 0 0 0 0 0-9 0 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0-9 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 zig-zag scan 78, 0, -9, 0, 0, 0, 0, 22, 0, 0,, 0, 0, -9 convert to symbols (0, 0, 78), (0, 1, -9), (0, 4, 22), (0, 14, 8), (0, 0, 3), (1, 21, -9) VLC coding Convert symbols to VLC codes by table-lookup. Last Run Level Bits VLC Code 0 0 1 3 10s 0 0 2 5 1111s 0 0 3 7 0101 01s 1 0 1 5 0111s 1 0 2 10 0000 1100 1s 24/40

Packing the Compressed Data In previous slides, we covered the core technologies inside a video codec. However, the compressed data has to be arranged into a bitstream, alone with some system information The packing mechanism is critical to the application scenario (e.g. video-over-ip) 25/42

MPEG-4 Simple Profile Bitstreams VOL Header* Elementary Stream Short Video Header Mode (H.263 Baseline) Combined Mode Data-Partitioned (DP) Mode Video Packet (VP) Mode Non-video Packet Mode *Note: VOL header for short video header mode is an empty header 26/40

Combined Mode with VP Syntax TIMESTAMP, PTYPE, QUANT VOL header Picture header Slice Data Slice Data Picture header PROFILE, WIDTH, HEIGHT Slice header MB 0 MB 1 MB N MB#, MBTYPE, DQUANT, MV, CBP, Coeff. Data VLC is used to code these symbols 27/40

Video Packets (Slices) A video packet (slice) is a set of consecutive macroblocks in scan order: Resync Marker VP Info. MB Data MB# QP HEC (width, height, etc.) 28/42

Macroblock Syntax Each macroblock data contains the following information COD MCBPC CBPY DQUANT TCOEF INTRADC MVD 2-4 MVD * MCBPC combines MB Prediction Type and chroma coded block pattern (CBP) into one symbol; INTRADC is coded using 8-bit FLC, they are present for every blocks in an Intra MB 29/42

Generic Encoder Architecture YC B C R Frame mode, model (dir, mv, ref_id, etc) Partition slice info Pred Model current MB - + T/Q Pred Transform Scan Entropy slice info Q -1 /T -1 Mux Pred Spatial residual Mode Select mode, model Model Compensation + bitstream reference MB(s) Ref 1 Ref 2 In-Loop Deblocking Filter Ref 3 30/42

Brief History of H.264 In 2000, Real Network, Nokia, and HHI proposed to MPEG to adopt their new coding technologies MPEG issued a CfP in 2001 The Joint Video Team (JVT) of ISO and ITU-T were established afterwards H.26L was selected as the starting point, however, during the course of development, most modules in H.26L were replaced H.264 became an International Standard in May 2003, 2nd ed. is released in March 2004; 3rd ed. Is released in Apr. 2005 Official name: Advanced Video Coding (AVC), also referred to as MPEG-4 Part 10 or H.264 Documents and reference software: http://iphome.hhi.de/suehring/tml/ 31/42

H.264 Key Features H.264 is still a block-based motion-compensated transform codec; it is a technical evolution from MPEG-1/2/4, not a technical revolution Key features: Predictive coding Spatial prediction: 9 directional prediction patterns plus a gradient prediction pattern (i.e. plane mode) Multiple references and 4 4 to 16 16 variable-size blocks for inter predictions 16-bit integer combined transform/quantization Exact forward-inverse transform pair is used Transform block size is 4 4 or 8 8 Two different entropy coding methods Universal VLC plus Context Adaptive VLC Context Adaptive Binary Arithmetic Coding In-loop filter 32/42

Multiple Reference Frames Good prediction H.264 Motion Prediction Good prediction MPEG-1/2/4 Motion Prediction Bad prediction Bad prediction Bad prediction 33/42

DCT-like Integer Transform Starting with a 4x4 general transform: Y T 21 22 23 24 = AXA = a a a a x31 x32 x33 x 34 a c a b One can reach: T ( ) Y = CXC E = a a a a x11 x12 x13 x14 a b a c b c c b x x x x a c a b c b b c x41 x42 x43 x44 a b a c d 1 1 d x x x x 1 1 1 d cos( π / 8) cos(3π /8) 2 2 1 1 1 1 x11 x12 x13 x14 1 1 1 d a ab a ab 2 2 1 d d 1 x21 x22 x23 x 24 1 d 1 1 ab b ab b 1 1 1 1 x 2 2 31 x32 x33 x 34 1 d 1 1 a ab a ab 41 42 43 44 2 2 ab b ab b a = b = c = 1 2 1 2 1 2 d = 2 1 = 0.414213... approximated by d = ½ 34/42

Entropy Coding of H.264 Two types of entropy coders are supported in H.264 1. Variable Length Coder: Universal VLC (UVLC) for syntax elements Context Adaptive VLC (CAVLC) for transform coefficients 2. Arithmetic coder (not for Baseline): Context Adaptive Binary Arithmetic Coding (CABAC) 35/42

UVLC for Semantic Elements UVLC uses Exp-Golomb codewords: codewords 0 1 Bit patterns 1 ~ 2 0 1 x 0 3 ~ 6 0 0 1 x 1 x 0 7 ~ 14 0 0 0 1 x 2 x 1 x 0 15 ~ 30 0 0 0 0 1 x 3 x 2 x 1 x 0...... where each x n equals 0 or 1 In original design, UVLC is used for transform coefficients as well, but the performance was bad 36/42

CAVLC for Transform Coefficients CAVLC Process: Uses the number of non-zero coefficients of neighboring blocks to select different VLC tables For example, if scanned coefficients are: 0, 3, 0, 1, -1, -1, 0, 1 The coded symbols are: number of non-zeros = 5 number of trailing ones = 3 sings of trailing ones = +, -, - level symbols = 1, 3 number of zeros = 3 runs of zeros = 1, 0, 0, 1, 1 37/42

Context Adaptive Binary AC CABAC in AVC is composed of three parts: Binarization: convert syntax values to binary (0/1) strings Context modeling: estimate probability of 0/1 occurrence Arithmetic coding: only two subdivisions for each subinterval bin value for context model update non-binary valued syntax element syntax element Binarizer binary valued syntax element bin string loop over bins bins regular bypass Context modeler bin value, context model Regular Coding Engine Bypass Coding Engine regular bypass coded bits coded bits bitstream Binary Arithmetic Coder 38/42

Quality Comparison Ref.: T. Wiegand et. al, Rate-Constrained Coder Control and Comparison of Video Coding Standards, IEEE T-CSVT, July, 2003 39/42

High Efficiency Video Coding (HEVC) HEVC is a joint video standard developed by ITU-T and ISO by the team JCT-VC The effort starts around 2008 and version 1 is officially released on April 13, 2013. HEVC is designed for high resolution videos such as 4K and 8K videos. However, it can achieve over 50% coding efficiency gains for videos larger than SD (720 480) resolutions The architecture of HEVC is similar to that of AVC, with extensions to support superblocks (64 64 pixel blocks) Documents and reference software: http://hevc.hhi.fraunhofer.de/ 40/42

HEVC Major Features Intra prediction now consider 33 different directions The block-based prediction unit (a.k.a. coding tree unit) can be as large as 64 64; The transform coding unit size can be as large as 32 32 Improved motion vector prediction and interpolation filter (now 7 or 8 taps) The loop filters now includes a deblocking filter and a sample adaptive offset (SAO) filter that can remove banding and ringing artifacts Interlaced video supported at metadata layer, not video coding layer 41/42

Other Video Codec Standards Society of Motion Picture and Television Engineers (SMPTE) has standardized three codecs since 2005 SMPTE VC-1 (a.k.a. Microsoft Video Codec) SMPTE VC-2 (a.k.a. BBC Mezzanine Video Codec) SMPTE VC-3 (a.k.a. Avid DNxHDt Video Codec) Google has its own royalty free codec: VP9 VP9 was officially released on June, 2013 VP9 is supported by major web browsers and ffmpeg/libav VP9 has 64 64 superblock coding structure similar to HEVC VP9 will be the standard codec for YouTube 4K video 42/42