Fraunhofer Institut für Nachrichtentechnik, Heinrich-Hertz-Institut
Ralf Schäfer, schaefer@hhi.de, http://bs.hhi.de
H.264/AVC and MPEG-4 SVC: the next generations of video compression
Outline:
- Introduction
- H.264/AVC: A step forward in compression technology
- Adaptive streaming of H.264/AVC
- Scalable extension of H.264/AVC
- Conclusions
[Figure: timeline of image and video coding standards from 1990 to 2002 (ITU H.261, JPEG, MPEG-1, MPEG-2, ITU H.263, MPEG-4 with Vers. 1, Vers. 2 and Vers. 3, JPEG-2000, ITU/MPEG (JVT) H.264/AVC), plotted against bit rate (8 kbit/s, 64 kbit/s, 1 Mbit/s, 20 Mbit/s, 100 Mbit/s) and application (mobile radio, video phone, video conf., CD-ROM, Digital TV / DVD, HDTV, TV / HDTV production, D-Cinema).]
[Figure: H.264/AVC layer structure. The Video Coding Layer (coded macroblock) and Data Partitioning (coded slice/partition), together with control data, sit on top of the Network Abstraction Layer, which feeds H.320, MP4FF, H.323/IP, MPEG-2, etc.]

The video coding layer is based on hybrid video coding and is similar in spirit to other standards, but with important differences.
Some new key aspects are:
- Enhanced motion compensation
- Small blocks for transform coding
- Improved de-blocking filter
- Enhanced entropy coding
Result: substantial bit-rate savings relative to other standards for the same quality.
[Figure: block diagram of the H.264/AVC coder. The input video signal is split into macroblocks of 16x16 pixels. Blocks: Coder Control, Transform/Scal./Quant., Scaling & Inv. Transform, Deblocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter switch, Motion Estimation, Entropy Coding. Signals: Control Data, Quant. Transf. coeffs, Motion Data. The embedded decoder produces the output video signal.]
- Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
- Input: association of luma and chroma and conventional subsampling of chroma (4:2:0)
- Block motion displacement
- Motion vectors over picture boundaries
- Variable block-size motion
- Block transforms
- Scalar quantization
- I, P, and B coding types
[Figure: H.264/AVC coder block diagram with the partition shapes used for motion compensation.]
- Macroblock (MB) types: 16x16, 16x8, 8x16, 8x8
- 8x8 types: 8x8, 8x4, 4x8, 4x4
- Motion vector accuracy: 1/4 sample (6-tap filter)
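The quarter-sample motion vector accuracy noted on this slide can be illustrated with a minimal one-dimensional sketch. It assumes the tap values (1, -5, 20, 20, -5, 1)/32 of the H.264/AVC luma half-sample filter, which the slide does not list, as well as 8-bit samples and edge clamping; the function names are hypothetical.

```python
# Assumed tap values of the 6-tap half-sample filter (sum = 32).
TAPS = (1, -5, 20, 20, -5, 1)


def half_sample(row, i):
    """Interpolate the half-sample position between row[i] and row[i + 1]."""
    n = len(row)
    acc = 0
    for k, tap in enumerate(TAPS):
        # The filter window covers row[i - 2] .. row[i + 3]; indices outside
        # the row are clamped (an assumption standing in for picture padding).
        j = min(max(i - 2 + k, 0), n - 1)
        acc += tap * row[j]
    # Round, divide by 32, and clip to the assumed 8-bit sample range.
    return min(max((acc + 16) >> 5, 0), 255)


def quarter_sample(row, i):
    """Average the integer sample row[i] with its right half-sample neighbour."""
    return (row[i] + half_sample(row, i) + 1) >> 1
```

On the ramp `[0, 10, 20, 30, 40, 50]`, `half_sample(row, 2)` lands midway between 20 and 30, and `quarter_sample(row, 2)` lies between the integer sample and that half-sample value.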
[Figure: H.264/AVC coder block diagram with the motion estimation and motion compensation stages.]
- Multiple reference frames
- Generalized B frames
- Weighted prediction
[Figure: H.264/AVC coder block diagram with the intra-frame prediction stage.]
Directional spatial prediction (9 types for luma, chroma).
Sample layout for a 4x4 block (Q, A-H and I-L are neighbouring samples, a-p are the samples to be predicted):
Q A B C D E F G H
I a b c d
J e f g h
K i j k l
L m n o p
E.g., Mode 3: diagonal down/right prediction; a, f, k, p are predicted by (A + 2Q + I + 2) >> 2.
[Figure: numbered prediction directions for the 9 modes.]
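The diagonal down/right example on this slide can be extended to the whole 4x4 block. The sketch below assumes that every predicted sample uses the same three-tap (1, 2, 1) rule along its diagonal that the slide gives for a, f, k, p; the function name and argument layout are hypothetical.

```python
def predict_diagonal_down_right(top, left, corner):
    """Predict a 4x4 block along the down/right diagonal.

    top    -- [A, B, C, D], the four samples above the block
    left   -- [I, J, K, L], the four samples to the left of the block
    corner -- Q, the sample above and to the left of the block
    """
    # Lay the neighbours out along one line: L K J I Q A B C D.
    edge = list(reversed(left)) + [corner] + list(top)
    centre = 4  # index of Q in `edge`
    block = []
    for y in range(4):
        row = []
        for x in range(4):
            # Samples on the same diagonal (same x - y) share one predictor.
            k = centre + x - y
            row.append((edge[k - 1] + 2 * edge[k] + edge[k + 1] + 2) >> 2)
        block.append(row)
    return block
```

For the main diagonal (x equal to y) the expression reduces to (I + 2Q + A + 2) >> 2, which matches the formula quoted on the slide for a, f, k, p.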
[Figure: H.264/AVC coder block diagram with the transform stage.]
- 4x4 block integer transform with matrix $H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}$
- Repeated transform of DC coeffs for 8x8 chroma and some 16x16 Intra luma blocks
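The 4x4 integer transform named on this slide can be sketched in a few lines. The matrix rows (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1), (1, -2, 2, -1) are assumed to be the H.264/AVC core transform; the scaling that the standard folds into quantization is left out, and the helper names are hypothetical.

```python
# Assumed core transform matrix of the 4x4 integer transform.
H = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def forward_transform(block):
    """Compute Y = H * X * H^T using integer arithmetic only."""
    return matmul(matmul(H, block), transpose(H))
```

A flat block puts all of its energy into the single DC coefficient at position (0, 0), which is the coefficient the slide's repeated DC transform operates on.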
[Figure: highly compressed decoded inter picture, shown (1) without filter and (2) with H.264/AVC deblocking.]
Two entropy coding schemes, depending on profile:
- Context-adaptive VLC (CAVLC)
- Context-based Adaptive Binary Arithmetic Coding (CABAC): 10-15% gain over CAVLC
Average rate savings of each codec (row) relative to each reference codec (column); B pictures used when in profile:

Codec        MPEG-4 ASP   H.263 HLP   MPEG-2
H.264/AVC    39%          49%         64%
MPEG-4 ASP   -            17%         43%
H.263 HLP    -            -           31%
Comparison of MPEG-2 and H.264/AVC (CIF, 30 Hz)
[Figure: side-by-side decoded pictures, H.264/AVC @ 340 kbit/s versus MPEG-2 @ 1024 kbit/s.]
- H.264/AVC encoding is 8-10 times as complex as MPEG-2
- H.264/AVC decoding is about 3 times as complex as MPEG-2
- However, the computing power of semiconductors has increased by a factor of 100 since the infancy of MPEG-2
- A number of real-time encoder and decoder chips for TV/HDTV applications are currently under development (IBC)
- Encoder and decoder chips for mobile devices are still a technological challenge
- Software decoding of 720p (1280 x 720 @ 24 Hz) is already possible!
Wireless:
- Adopted by 3GPP as optional codec in Release 6
- Adopted by DVB (AVC) for DVB-H
- Adopted for the Korean DMB system
- Adopted by the Japanese 1-Segment ISDB-T system
Broadcast:
- Adopted by DVB (DVB-AVC)
- To be adopted by ATSC
- Adopted in Japan and Korea
- Used for HDTV services via satellite (DirecTV, EchoStar, BSkyB, Premiere, ...)
Storage:
- Adopted as mandatory codec for HD-DVD
- Adopted as mandatory codec for BluRay-DVD
Facing the scenario of heterogeneous media delivery:
- Different users
- Different needs
- Different displays
- Different links
Flexible source coding, i.e. scalability, is needed:
- Simple adaptation to different bit rates, frame rates or spatial resolutions of the video content on a bit-stream level
- Realization of a fully scalable video coding scheme as an extension of H.264/AVC
HHI proposed an SVC scheme that incorporates the concept of Motion Compensated Temporal Filtering (MCTF) into the H.264/AVC framework. This approach outperforms by far all competing approaches, which used spatial wavelets.
HHI's proposal has been selected as the basis for MPEG-4 SVC.
Realization as an extension of H.264/AVC:
- Most components of H.264/AVC can be used as specified
  - Variable block size motion-compensated prediction with multiple-frame and multi-hypothesis capabilities
  - Spatial intra prediction
  - Transform coding including adaptive transforms and CABAC
- H.264/AVC can be used as base layer
Basic codec components:
- Temporal dependency between pictures is coded using an open-loop approach
  - Block-based motion-compensated temporal filtering (MCTF)
- Open-loop structure: efficient incorporation of scalability
- Block-based transform coding of temporally filtered pictures
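[Figure: lifting structure of motion-compensated temporal filtering for a pair of pictures. Prediction step: original picture B minus a motion-compensated prediction (MCP) from original picture A, using motion field M_P, gives the high-pass picture H. Update step: original picture A plus ½ times a motion-compensated update (MCP) from H, using motion field M_U, gives the low-pass picture L.]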
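The prediction and update steps of this lifting structure can be sketched for one picture pair. The sketch assumes zero motion (so the motion fields M_P and M_U reduce to the identity) and treats a picture as a flat list of samples; both simplifications and the function names are assumptions made for illustration only.

```python
def mctf_analysis(a, b):
    """Split pictures A and B into a low-pass and a high-pass picture."""
    # Prediction step: H = B - P(A); with zero motion, P(A) is A itself.
    h = [bv - av for av, bv in zip(a, b)]
    # Update step: L = A + 1/2 * U(H); with zero motion, U(H) is H itself.
    l = [av + 0.5 * hv for av, hv in zip(a, h)]
    return l, h


def mctf_synthesis(l, h):
    """Invert the lifting steps in reverse order to recover A and B."""
    a = [lv - 0.5 * hv for lv, hv in zip(l, h)]
    b = [hv + av for av, hv in zip(a, h)]
    return a, b
```

Because the synthesis simply undoes each lifting step, the original pictures are recovered exactly as long as L and H are not quantized, whatever prediction and update operators are used.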
[Figure: picture structure with the GOP boundary marked.]
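[Figure: scalable encoder structure. The input video feeds an MCTF (AVC based) stage with intra and base-layer (BL) prediction, followed by an AVC transform and several quantization (Q) stages. A decimation filter provides the input of a second MCTF (AVC based) stage with intra prediction, AVC transform and quantization stages; this lower layer can be replaced by a single-layer H.264/AVC coder. A decoder (inverse transform + inverse MCTF) and an up-conversion filter connect the lower layer to the higher one. Entropy coding and a multiplexer produce the embedded bitstream.]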
- Layered representation of subband pictures
- Residual coding of the quantization error between the original subband pictures and their base layer reconstruction
- Efficient for coarse-grained SNR-scalable layers
[Figure: SNR-scalable coding of a group of pictures. The group of pictures is decomposed by motion-compensated temporal filtering (MCTF) into temporal subband pictures. SNR base layer (Layer 0): MC / intra prediction, transform, scal. / quant., entropy coding, with inv. scaling and inv. transform for reconstruction. SNR enhancement layers (Layer 1 to Layer N-1): transform, scal. / quant. and entropy coding of the remaining error.]
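The residual coding idea on this slide, in which each enhancement layer codes the quantization error left by the layers below it, can be sketched with plain scalar quantizers. The step sizes, the rounding rule and the function names are assumptions chosen for illustration, and transform and entropy coding are left out.

```python
def encode_snr_layers(coeffs, steps):
    """Quantize `coeffs` into one list of levels per layer.

    steps -- quantizer step sizes, coarse (base layer) to fine.
    """
    layers = []
    recon = [0] * len(coeffs)
    for step in steps:
        # Each layer codes what the previous layers failed to represent.
        residual = [c - r for c, r in zip(coeffs, recon)]
        levels = [round(v / step) for v in residual]
        layers.append(levels)
        recon = [r + lv * step for r, lv in zip(recon, levels)]
    return layers


def decode_snr_layers(layers, steps):
    """Reconstruct from the received layers (a prefix of all layers)."""
    recon = [0] * len(layers[0])
    for levels, step in zip(layers, steps):
        recon = [r + lv * step for r, lv in zip(recon, levels)]
    return recon
```

Dropping trailing layers before decoding gives a coarser reconstruction without re-encoding, which is the bit-stream-level adaptation the scalable scheme aims for.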
Example of a layered bit-stream:
- Layer 0: QCIF, 7.5 Hz, 64 kbit/s
- Layer 1: QCIF, 15 Hz, 128 kbit/s
- Layer 2: CIF, 15 Hz, 256 kbit/s
- Layer 3: CIF, 15 Hz, 512 kbit/s
- Layer 4: CIF, 30 Hz, 1024 kbit/s
- Layer 5: CIF, 30 Hz, 2048 kbit/s
[Figure: picture structure of the layers. The QCIF layers use I, B and P pictures; spatial upsampling and the motion fields {M_P} link them to the CIF layers, which consist of low-pass (L) and high-pass (H) subband pictures.]
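Adaptation on a bit-stream level amounts to picking the highest layer a receiver can handle and discarding everything above it. The sketch below uses the six layers of this slide, assumes the stated bit rates are cumulative, and takes QCIF as 176x144 and CIF as 352x288; the `Layer` type and `select_layer` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    index: int
    width: int
    height: int
    frame_rate: float   # Hz
    bit_rate_kbps: int  # assumed cumulative, including all lower layers


LAYERS = [
    Layer(0, 176, 144, 7.5, 64),
    Layer(1, 176, 144, 15.0, 128),
    Layer(2, 352, 288, 15.0, 256),
    Layer(3, 352, 288, 15.0, 512),
    Layer(4, 352, 288, 30.0, 1024),
    Layer(5, 352, 288, 30.0, 2048),
]


def select_layer(max_kbps, max_width, max_frame_rate):
    """Return the highest layer that fits the constraints, or None."""
    candidates = [
        layer for layer in LAYERS
        if layer.bit_rate_kbps <= max_kbps
        and layer.width <= max_width
        and layer.frame_rate <= max_frame_rate
    ]
    return candidates[-1] if candidates else None
```

A receiver on a 300 kbit/s link with a CIF display would be served Layer 2, while a QCIF device on a fast link would stop at Layer 1.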
[Figure: coding efficiency for the "City" sequence. Y-PSNR [dB] (axis from 30 to 39) of H.264/AVC versus the scalable MCTF extension at the operating points QCIF 15 Hz 64 kbit/s, QCIF 15 Hz 128 kbit/s, CIF 30 Hz 256 kbit/s, CIF 30 Hz 512 kbit/s, 4CIF 30 Hz 1024 kbit/s and 4CIF 60 Hz 2048 kbit/s.]
- Image and video coding are key technologies for multimedia and communication.
- The H.264/AVC video coding standard is based on hybrid video coding, with important new tools that increase compression efficiency by a factor of 2-3 compared with MPEG-2.
- H.264/AVC has been adopted for different fields of application, ranging from low bit rate video up to HDTV.
- MPEG-4 SVC is the first scalable coding scheme providing almost the same efficiency as single-layer coding.
- MPEG-4 SVC is based on an extension of H.264/AVC.
- FhG/HHI plays a very prominent role in the standardisation of both coding schemes.
Thank you