An Improved H.26L Coder Using Lagrangian Coder Control. Summary

Similar documents
Fraunhofer Institute for Telecommunications - Heinrich Hertz Institute (HHI)

New Techniques for Improved Video Coding

Reduced Frame Quantization in Video Coding

Advanced Video Coding: The new H.264 video compression standard

Rate Distortion Optimization in Video Compression

Overview: motion-compensated coding

Digital Video Processing

Decoded. Frame. Decoded. Frame. Warped. Frame. Warped. Frame. current frame

Wavelet-Based Video Compression Using Long-Term Memory Motion-Compensated Prediction and Context-Based Adaptive Arithmetic Coding

Title Adaptive Lagrange Multiplier for Low Bit Rates in H.264.

IBM Research Report. Inter Mode Selection for H.264/AVC Using Time-Efficient Learning-Theoretic Algorithms

IN the early 1980 s, video compression made the leap from

Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain

AVC/H.264 Generalized B Pictures

An Efficient Mode Selection Algorithm for H.264

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46

Video Coding Using Spatially Varying Transform

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France

JPEG 2000 vs. JPEG in MPEG Encoding

The Scope of Picture and Video Coding Standardization

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

LONG-TERM MEMORY PREDICTION USING AFFINE MOTION COMPENSATION

Video Codecs. National Chiao Tung University Chun-Jen Tsai 1/5/2015

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

Complexity Reduced Mode Selection of H.264/AVC Intra Coding

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

ARTICLE IN PRESS. Signal Processing: Image Communication

Optimum Quantization Parameters for Mode Decision in Scalable Extension of H.264/AVC Video Codec

Cross Layer Protocol Design

Lecture 13 Video Coding H.264 / MPEG4 AVC

A COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames

Video Compression Standards (II) A/Prof. Jian Zhang

An Efficient Table Prediction Scheme for CAVLC

Video Coding Standards: H.261, H.263 and H.26L

Video Compression An Introduction

Model-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding

Scalable Video Coding

H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression

Model-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding

MCTF and Scalability Extension of H.264/AVC and its Application to Video Transmission, Storage, and Surveillance

Reduced 4x4 Block Intra Prediction Modes using Directional Similarity in H.264/AVC

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

Chapter 10. Basic Video Compression Techniques Introduction to Video Compression 10.2 Video Compression with Motion Compensation

THE H.264 ADVANCED VIDEO COMPRESSION STANDARD

H.264/AVC BASED NEAR LOSSLESS INTRA CODEC USING LINE-BASED PREDICTION AND MODIFIED CABAC. Jung-Ah Choi, Jin Heo, and Yo-Sung Ho

STACK ROBUST FINE GRANULARITY SCALABLE VIDEO CODING

Multimedia Standards

OF QUADTREE VIDEO CODING SCHEMES

Scalable Video Coding in H.264/AVC

IMPROVED CONTEXT-ADAPTIVE ARITHMETIC CODING IN H.264/AVC

H.264/AVC Baseline Profile to MPEG-4 Visual Simple Profile Transcoding to Reduce the Spatial Resolution

H.264 to MPEG-4 Transcoding Using Block Type Information

An Efficient Intra Prediction Algorithm for H.264/AVC High Profile

Introduction to Video Encoding

VHDL Implementation of H.264 Video Coding Standard

CAMED: Complexity Adaptive Motion Estimation & Mode Decision for H.264 Video

High Efficiency Video Coding: The Next Gen Codec. Matthew Goldman Senior Vice President TV Compression Technology Ericsson

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation

Coding of Coefficients of two-dimensional non-separable Adaptive Wiener Interpolation Filter

Xin-Fu Wang et al.: Performance Comparison of AVS and H.264/AVC 311 prediction mode and four directional prediction modes are shown in Fig.1. Intra ch

Performance and Complexity Co-evaluation of the Advanced Video Coding Standard for Cost-Effective Multimedia Communications

High Efficiency Video Coding (HEVC) test model HM vs. HM- 16.6: objective and subjective performance analysis

Week 14. Video Compression. Ref: Fundamentals of Multimedia

Editorial Manager(tm) for Journal of Real-Time Image Processing Manuscript Draft

Rate-Distortion Optimized Layered Coding with Unequal Error Protection for Robust Internet Video

Introduction to Video Coding

Modeling and Simulation of H.26L Encoder. Literature Survey. For. EE382C Embedded Software Systems. Prof. B.L. Evans

Interframe coding A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames pri

ISO/IEC JTC1/SC29/WG11 N4065 March 2001, Singapore. Deadline for formal registration Deadline for submission of coded test material.

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

A NOVEL SCANNING SCHEME FOR DIRECTIONAL SPATIAL PREDICTION OF AVS INTRA CODING

Video Quality Analysis for H.264 Based on Human Visual System

Very Low Complexity MPEG-2 to H.264 Transcoding Using Machine Learning

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS

Compression of Stereo Images using a Huffman-Zip Scheme

COMPARATIVE ANALYSIS OF DIRAC PRO-VC-2, H.264 AVC AND AVS CHINA-P7

Video coding. Concepts and notations.

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding

Video Coding. Video Coding (esp. ITU & ISO/IEC Standards) Standardization Organizations. The Scope of Picture and Video Coding Standardization

ARCHITECTURES OF INCORPORATING MPEG-4 AVC INTO THREE-DIMENSIONAL WAVELET VIDEO CODING

H.264/AVC Video Watermarking Algorithm Against Recoding

Investigation of the GoP Structure for H.26L Video Streams

Video Coding Standards. Yao Wang Polytechnic University, Brooklyn, NY11201 http: //eeweb.poly.edu/~yao

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

Overview, implementation and comparison of Audio Video Standard (AVS) China and H.264/MPEG -4 part 10 or Advanced Video Coding Standard

Introduction to Video Compression

Stereo Image Compression

EE 5359 H.264 to VC 1 Transcoding

Scalable video coding with robust mode selection

Evaluation of H.264/AVC Coding Elements and New Improved Scalability/Adaptation Algorithm/Methods

Emerging H.26L Standard:

Video Coding in H.26L

NEW CAVLC ENCODING ALGORITHM FOR LOSSLESS INTRA CODING IN H.264/AVC. Jin Heo, Seung-Hwan Kim, and Yo-Sung Ho

(Invited Paper) /$ IEEE

Introduction to Video Encoding

Video Compression. Learning Objectives. Contents (Cont.) Contents. Dr. Y. H. Chan. Standards : Background & History

EE Low Complexity H.264 encoder for mobile applications

HEVC The Next Generation Video Coding. 1 ELEG5502 Video Coding Technology

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications:

Transcription:

UIT - Secteur de la normalisation des télécommunications ITU - Telecommunication Standardization Sector UIT - Sector de Normalización de las Telecomunicaciones Study Period 2001-2004 Commission d' études Study Group Comisión de Estudio 16 Contribution tardive Delayed Contribution Contribución tardía D.146 Porto Seguro,. May - 8. June 2001 Texte disponible seulement en Text available only in Texto disponible solamente en E Question(s): 6 SOURCE*: Heiko Schwarz and Thomas Wiegand Image Processing Department Heinrich-Hertz-Institute Einsteinufer 10587 Berlin Germany TITLE: An Improved H.L Coder Using Lagrangian Coder Control Summary This document describes an H.L coder whose operational mode is rate-distortion optimized using Lagrangian techniques. A similar method has been proposed to ITU-T VCEG by the second author in Q15-D-13 1 for the application to the test model of the ITU-T Recommendation H.3, version 2. The proposal in Q15-D-13 led to the creation of a new encoder recommendation (TMN-10) and is up to now the core of the high-complexity mode of the latest TMN. The coder generates an H.L bit-stream compliant with the H.L Test Model Long Term Number 7 (TML-7) 2 where the context-based adaptive binary arithmetic coding (CABAC) 3 is used for 1 ITU-T/SG 16/Q15-D-13 can be obtained via anonymous ftp to standard.pictel.com/video-site/9804_tam/q15d13.doc. 2 ITU-T/SG 16/Q.6/VCEG-M-81, H.L Test Model Long Term Number 7 (TML-7) draft0, May 2001. * Contact: Tel: +49 002 206 / 617 Fax: +49 2 72 00 E-mail: hschwarz@hhi.de, wiegand@hhi.de

- 2 - entropy coding. Comparisons are made against the ITU-T H.L codec version 6.2. The simulations are run for the set of test sequences and conditions as specified in the MPEG coding efficiency CfP4. All bit-streams have been successfully decoded and an accompanying excel document contains results on R-D performance. For motion estimation, reference frame selection, and intra 4x4-mode decision, we use bit-rates in our experiments that would have been obtained for the UVLC, while the actual entropy coding uses CABAC. The macroblock mode decision is based on the actual CABAC bit-rates. We have observed that for all sequences tested, the improved encoding strategy provides PSNR gains of between 0.2 and 0.3 db or, correspondingly, bit-rate savings of about 3-6% comparing against the TML-7 encoding strategy using CABAC. 1 Motion Estimation and Mode Selection The problem of optimum bit allocation to the motion vectors and the residual coding in any hybrid video coder is a non-separable problem requiring a high amount of computation. To circumvent this joint optimization, we split the problem of macroblock encoding into two parts: motion estimation and mode decision. For that, motion estimation for the various modes (16x16, 16x8, 8x16, 8x8, etc.) is conducted first, and then given these motion vectors, the overall rate-distortion costs for all macroblock modes are computed for rate-constrained mode decision. In the remainder of this section, our approaches to motion estimation and mode decision are described. In the next section, the complete algorithm is given. 1.1 Rate-Constrained Motion Estimation For each block or macroblock the motion vector is determined by full search on integer-pixel positions followed by quarter-pixel refinement. The integer-pixel search for the 16x16 block and the most recent decoded frame is conducted over the range [-16 16]x[-16 16] pixels relative to the motion vector prediction for the regarded block. For smaller blocks or older pictures the search range is reduced according to the TML-7. The search is conducted given the predictor of the block motion vector. We view motion-compensated prediction as a source coding problem with a fidelity criterion. For bit-allocation, we use a Lagrangian formulation wherein distortion is weighted against rate using a Lagrange multiplier. More precisely, our integer-pixel motion search as well as our sub-pixel refinement returns the motion vector that minimizes J( m λmotion ) = SAD( s, c( m)) + λmotion R( m p ) with m = ( m, ) T x my being the motion vector, p = ( p, ) T x py being the prediction for the motion vector, and λ MOTION being the Lagrange multiplier. The rate term R( m-p ) represents the motion 3 ITU-T/SG 16/Q.6/VCEG-L-13, Adaptive Codes for H.L, January 2001. 4 ISO/IEC JTC 1/SC /WG 11, Call For Proposals On New Tools For Video Compression Technology, Doc. N65, March 2001, Singapore

- 3 - information only and is computed by a table-lookup. The rate is estimated by using the universal variable length code (UVLC) table of the TML-7. The SAD is computed as B, B SAD(, s c( m )) = s[, x y] c[ x m, y m ], B = 16, 8 or 4. x= 1, y= 1 x y with s being the original video signal and c being the coded video signal. The choice of λ MOTION has a rather small impact on the result of the 16x16 block motion estimation. But the search result for smaller blocks is strongly affected by λ MOTION. In our coder, we choose λ = 2^ ( QP / 6 1/ 2), MOTION where QP is the macroblock quantization parameter. The current H.L test model (TML-7) already defines this concept of rate-constrained motion estimation. We have only adapted the Lagrangian multiplier λ MOTION. Another difference is that for motion prediction of a 16x16 block with zero vector components no bias is subtracted from the motion vector cost J ( m λmotion ) in our approach. There is no necessity to favour the skip mode during motion estimation since it will be treated separately in the macroblock mode decision (see section 1.2). 1.2 Rate-Constrained Mode Decision In the proposed coder, the current macroblock mode is chosen given the mode decisions made for the past macroblocks. Rate-constrained mode decision refers to the minimization of the following Lagrangian functional J (,, s c QP, λ ) = SSD(,, s c QP) + λ R(,, s c QP) where QP is the macroblock quantizer, λ is the Lagrange multiplier for mode decision, and indicates a mode chosen from the set of potential prediction: I-frame: { INTRA4x4, INTRA16x16}, P-frame: B-frame: INTRA4x4, INTRA16x16, SKIP,, 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4 INTRA4x4, INTRA16x16, BIDIRECT, DIRECT, FWD16x16, FWD16x8, FWD8x16, FWD8x8, FWD8x4,. FWD 4x8, FWD 4x4, BAK16x16, BAK 16x8, BAK 8x16, BAK 8x8, BAK 8x4, BAK 4x8, BAK 4x4

- 4 - Note that the SKIP mode refers to the 16x16 mode where no motion and residual information is encoded. SSD is the sum of the squared differences between the original block s and its reconstruction c being given as 16,16 2 ( ), SSD( s, QP) = s[ x, y] c[ x, y, QP] x= 1, y= 1 and R(,, s c QP ) is the number of bits associated with choosing and QP including the bits for the macroblock header, the motion, and all DCT blocks. c[ x, y, QP ] represents the reconstructed luminance values corresponding to s[ xy., ] We choose λ = 2^ ( QP / 3 1), where QP is the macroblock quantization parameter. The determination of the reference frame REF and the associated motion vectors for the NxM inter modes in P-frames and the FWD NxM modes in B-frames is done similar to the definition in the TML-7 by minimizing J ( REF λ ) = SAD( s, c( REF, m( REF))) + λ ( R( m( REF) p( REF)) R( REF)). MOTION MOTION + The rate term R (REF) represents the number of bits associated with choosing REF and is computed by table-lookup using UVLC. The reference frame and block sizes for the bi-directional mode are chosen as combination of the best forward and backward mode. The best INTER16x16 prediction mode is chosen according to the TML-7. For the INTRA4x4 prediction, the mode decision for each 4x4 block is performed similar to the macroblock mode decision by minimizing J ( s, I QP, λ ) = SSD( s, I QP) + λ R( s, I QP) where QP is the macroblock quantizer, λ is the Lagrange multiplier for mode decision, and I indicates an intra prediction mode: { DC, HOR, VERT, DIAG, DIAG _ RL, DIAG _ LR} I. SSD is the sum of the squared differences between the original 4x4 block s and its reconstruction c and R ( s, I QP) represents the number of bits associated with choosing I. It includes the bits for the intra prediction mode and the DCT-coefficients for the 4x4 luminance block. The rate term is computed using the UVLC entropy coding since the state of the arithmetic coder for coding the intra prediction modes and the DCT-coefficients is not known during the process of intra mode selection (CBP is unknown for most of the 4x4 blocks at this stage).

- 5-2 The Algorithm for Rate-Constrained Encoding The procedure to encode one macroblock s in a I-, P- or B-frame in our video codec is summarized as follows. 1. Given the last decoded frames, λ, λ MOTION, and the macroblock quantizer QP 2. Choose intra prediction modes for the INTRA 4x4 macroblock mode by minimizing J ( s, I QP, λ ) = SSD( s, I QP) + λ R( s, I QP) with I { DC, HOR, VERT, DIAG, DIAG _ RL, DIAG _ LR}. 3. Determine the best INTRA 16x16 prediction mode according to the TML-7. 4. Perform motion estimation and reference frame selection by minimizing J ( REF, m( REF) λ ) = SAD( s, c( REF, m( REF))) + λ ( R( m( REF) p( REF)) R( REF)) MOTION MOTION + for each reference frame and motion vector of a possible macroblock mode. 5. Choose the macroblock prediction mode by minimizing J (,, s c QP, λ ) = SSD(,, s c QP) + λ R(,, s c QP), given QP and λ when varying. indicates a mode out of the set of potential macroblock modes: I-frame: { INTRA4x4, INTRA16x16}, P-frame: B-frame: INTRA4x4, INTRA16x16, SKIP,, 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4 INTRA4x4, INTRA16x16, BIDIRECT, DIRECT, FWD16x16, FWD16x8, FWD8x16, FWD8x8, FWD8x4,. FWD 4x8, FWD 4x4, BAK16x16, BAK 16x8, BAK 8x16, BAK 8x8, BAK 8x4, BAK 4x8, BAK 4x4 The computation of J( s, SKIP QP, λ ) and J ( s, DIRECT QP, λ ) is simple. The costs for the other macroblock modes are computed using the intra prediction modes or motion vectors and reference frames, which have been estimated in steps 2-4.

- 6-3 Remaining Coder Parts All remaining coder parts are operated as described in the H.L Test Model Long Term Number 7 (TML-7) draft0 where CABAC is used for entropy coding. 4 Experimental Results For illustration purpose of effectiveness of the proposed encoding strategy in comparison to the TML-7 using CABAC for entropy coding, parameterized rate-distortion curves are plotted. These curves show PSNR of the luminance component versus bit-rate measured of the complete bitstream. PSNR is measured as the arithmetic mean of the PSNR values for each frame. The bit-rate is averaged over the complete sequence. The plots have 0.5 db grid lines on the PSNR axis. For the following sequences specified in the coding efficiency CfP rate-distortion curves are depicted: Name Format Frame Rate # of frames Foreman QCIF 10 fps 100 News QCIF 10 fps 100 Container Ship QCIF 10 fps 100 Tempete QCIF 10 fps 87 Foreman CIF 15 fps 150 News CIF 15 fps 150 Container CIF 15 fps 150 Tempete CIF 15 fps 1 Bus CIF 15 fps 75 Mobile & Calendar CIF 15 fps 125 Flowers & Garden CIF 15 fps 125 The rate-distortion curves are generated by varying the value of the macroblock quantizer QP that is fixed for a sequence. The parameters given below have been used for the original ITU-T H.L coder version 6.2 coder as well as for the rate-distortion optimized coder. QP I (I-frame) and QP P (P-frame),, 25, 23, 21, 18, 16, 14, 12, 10, 8 QP B (B-frame) QP P +2 M (distance P-P) 3 (i.e., 2 B-frames inserted) Number of reference frames 5 Use of hadamard transform enabled

- 7 - FOREMAN (QCIF, 10fps),5 43,5,5,5,5,5,5,5,5 0 50 100 150 200 250 NEWS (QCIF, 10fps) 43,5 44,5 43,5,5,5,5,5,5,5,5 0 20 60 80 100 120 1

- 8 - CONTAINER (QCIF, 10fps),5 43,5,5,5,5,5,5,5,5 0 10 20 50 60 70 80 90 100 TEMPETE (QCIF, 10fps),5,5,5,5,5,5,5,5 25,5 24,5 25 24 0 50 100 150 200 250 0 0 0

- 9 - FOREMAN (CIF, 15fps) 43,5,5,5,5,5,5,5 0 200 0 600 800 1000 1200 NEWS (CIF, 15fps) 44 43,5 43,5,5,5,5,5,5,5 0 50 100 150 200 250 0 0 0 450 500

- 10 - CONTAINER (CIF, 15fps),5 43,5,5,5,5,5,5,5,5 0 100 200 0 0 500 600 700 TEMPETE (CIF, 15fps),5,5,5,5,5,5,5,5 25,5 25 0 500 1000 1500 2000 2500

- 11 - BUS (CIF, 15fps),5,5,5,5,5,5,5,5 25,5 25 0 500 1000 1500 2000 2500 MOBILE & CALENDAR (CIF, 15fps),5,5,5,5,5,5,5 25,5 24,5 25 23,5 24 23 0 500 1000 1500 2000 2500

- 12 - FLOWERS & GARDEN (CIF, 15fps),5,5,5,5,5,5,5,5 25,5 24,5 25 23,5 24 23 0 500 1000 1500 2000 2500 00