Rate Distortion Optimization in Video Compression

Xue Tu
Dept. of Electrical and Computer Engineering, State University of New York at Stony Brook

1. Introduction

From Shannon's classic rate distortion theory, we know that the main task of source coding, or compression, is to represent a source with the fewest bits possible for a given reproduction quality. Compression can be achieved with lossless techniques, where the decompressed data is an exact copy of the original. However, this requirement limits compression performance, especially for modern video compression, where the amount of source information is extremely large. As an example, terrestrial broadcast (at about 20 Mb/s) of HDTV (raw bit rate of over 1 Gb/s) requires a compression ratio exceeding 50:1, which is at least an order of magnitude beyond the capacity of the best lossless image compression methods. In such situations, lossy compression is called for. Higher compression ratios are possible at the cost of imperfect source representation. The appropriate trade-off between source fidelity and coding rate is exactly the rate distortion optimization problem.

In lossy compression, the decoded images are not exact copies of the originals. However, if the properties of the human visual system are correctly exploited, the differences are almost indistinguishable. This raises the question: how much fidelity in the representation are we going to give up in order to reduce the number of bits required to transmit the data? In the following sections, we give an overview of how the rate-distortion trade-off is taken into account in practical video coders, and introduce the Lagrange multiplier optimization method in detail.

2. Video Compression Basics

First, let us go over the basic scheme of modern video compression, which is also called hybrid video coding. Its basic structure is shown in Figure 1.
One way of compressing video content is to compress each frame independently, which is known as INTRA-frame coding. Each picture is coded without reference to other pictures in the video sequence. Usually, the frame is broken up into blocks of different sizes, the blocks are transformed by the Discrete Cosine Transform (DCT), and the DCT coefficients are quantized and transmitted using variable-length codes. INTRA coding alone is inefficient because it ignores the correlation between consecutive frames.

Improved compression performance can be achieved by taking advantage of the large amount of temporal redundancy. This technique is referred to as INTER-frame coding. Usually, much of the depicted scene is essentially repeated in picture after picture without significant change; signaling that an area is simply repeated is known as SKIP mode. The video is then represented more efficiently by coding only the changes in the video content.

Fig. 1 Typical motion-compensated DCT video coder

However, simply coding the difference has a shortcoming: it cannot refine the approximation. Often the content of an area of a previous picture is a good approximation of the new picture and needs only a minor alteration. Motion-compensated prediction (MCP) addresses this case. Most changes in video content are due to the motion of objects in the depicted scene relative to the image plane, so displacing an area of the prior picture by a few pixels in spatial location can significantly reduce the amount of information that needs to be sent as a frame difference approximation. The use of spatial displacement to form an approximation is known as motion compensation, and the encoder's search for the best spatial displacement approximation is known as motion estimation. The coding of the resulting difference signal for the refinement of the MCP signal is known as displaced frame difference (DFD) coding.

Such video compression designs are called hybrid codecs because they use a hybrid of motion-handling and picture-coding techniques. Their design and operation involve the optimization of a set of coding parameters, including:

1. How to segment each frame into areas.
2. Whether or not to replace each area with completely new INTRA-frame content.
3. If not replacing an area with new INTRA content:
   (a) How to do motion estimation, i.e., how to select the spatial shifting displacement.
   (b) How to do DFD coding, i.e., how to select the approximation to use as a refinement of the INTER prediction.

At this point, we have introduced a problem for designers of video coding systems: what part of the image should be coded using what method? This is the optimization task: to choose the most efficient coding parameters (block size, prediction modes, motion vectors, quantization levels, etc.) in the rate-distortion sense.

3. Lagrange Optimization Techniques

As mentioned in the previous section, the various coding methods increase overall compression performance, but different coding modes have different rate-distortion characteristics. The goal of the encoder is therefore to optimize its overall fidelity: minimize the distortion $D$ subject to a constraint $R_c$ on the rate $R$. This constrained problem can be expressed as

$$\min\{D\}, \quad \text{subject to } R < R_c \qquad (1)$$

The optimization task in Eq. (1) can be elegantly solved using Lagrangian optimization, where a distortion term is weighted against a rate term. The Lagrangian formulation of the minimization problem is

$$\min\{J\}, \quad \text{where } J = D + \lambda R \qquad (2)$$

where $J$ is minimized for a particular value of the Lagrange multiplier $\lambda$. Each solution to Eq. (2) for a given value of $\lambda$ corresponds to an optimal solution to Eq. (1) for a particular value of $R_c$. This technique has gained importance due to its effectiveness, its conceptual simplicity, and its ability to evaluate a large number of possible coding choices in an optimized fashion. In practice, a number of interactions between coding decisions must be neglected in video coding optimization. The main use of this technique is in motion estimation and prediction mode decisions.
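To make the relationship between Eq. (1) and Eq. (2) concrete, the following minimal Python sketch selects among a few coding options by minimizing $J = D + \lambda R$. The option names and the (rate, distortion) numbers are hypothetical and chosen only for illustration; they are not measurements from any coder.

```python
# Hypothetical operating points for one coding unit: name -> (rate in bits,
# distortion as SSD). The numbers are illustrative assumptions, not measured.
CANDIDATES = {
    "coarse": (100, 900.0),
    "medium": (250, 400.0),
    "fine": (600, 150.0),
    "finest": (1200, 60.0),
}


def lagrangian_choice(candidates, lam):
    """Return the option that minimizes J = D + lam * R, as in Eq. (2)."""
    return min(
        candidates,
        key=lambda name: candidates[name][1] + lam * candidates[name][0],
    )


def constrained_choice(candidates, rate_budget):
    """Return the option that minimizes D subject to R < rate_budget, Eq. (1)."""
    feasible = {n: rd for n, rd in candidates.items() if rd[0] < rate_budget}
    return min(feasible, key=lambda name: feasible[name][1])


if __name__ == "__main__":
    # Sweeping lambda walks along the rate-distortion trade-off:
    # small lambda favors fidelity, large lambda favors a low bit rate.
    for lam in (0.05, 0.5, 2.0, 5.0):
        name = lagrangian_choice(CANDIDATES, lam)
        rate, dist = CANDIDATES[name]
        print(f"lambda={lam}: {name} (R={rate}, D={dist})")
```

With these assumed points, the choice made for $\lambda = 0.5$ is the same one the constrained search returns for a budget of 700 bits, which illustrates how each value of $\lambda$ corresponds to some rate constraint $R_c$.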
In motion estimation, the search for the best motion vector can be viewed as the minimization of the Lagrangian cost function

$$J_{\text{MOTION}} = D_{\text{DFD}} + \lambda_{\text{MOTION}} R_{\text{MOTION}} \qquad (3)$$

in which the distortion $D_{\text{DFD}}$, representing the prediction error measured as SAD (sum of absolute differences), is weighted against the number of bits $R_{\text{MOTION}}$ associated with the motion vectors (MVs) using a Lagrange multiplier $\lambda_{\text{MOTION}}$. The Lagrange multiplier imposes the rate constraint and directly controls the rate-distortion trade-off: small values of $\lambda_{\text{MOTION}}$ correspond to high fidelities and bit rates, while large values correspond to lower fidelities and bit rates.

Another task of the encoder is to choose the appropriate prediction mode. From the viewpoint of bit-allocation strategies, the various prediction modes relate to various bit-rate partitions. Considering the H.263 modes INTRA, SKIP, INTER, and INTER+4V, Table 1 gives typical values for the bit-rate partition of motion and DFD texture coding for typical sequences.

Table 1

If we assume for the moment that the bit rate and distortion of the residual coding stage are controlled by the selection of a quantizer step size $Q$, then rate-distortion optimized mode decision refers to the minimization of the following Lagrangian function

$$J_{\text{MODE}}(A, M, Q) = D_{\text{REC}}(A, M, Q) + \lambda_{\text{MODE}} R_{\text{REC}}(A, M, Q) \qquad (4)$$

independently for each macroblock $A$, where $M \in \{\text{INTRA}, \text{SKIP}, \text{INTER}, \text{INTER+4V}\}$ and $Q$ is the selected quantizer step size. $D_{\text{REC}}(A, M, Q)$ is the SSD (sum of squared differences) between the original macroblock $A$ and its reconstruction, and $R_{\text{REC}}(A, M, Q)$ is the number of bits resulting from run-level variable-length coding.

4. Determination of the best Lagrange Multiplier settings

The Lagrange multiplier $\lambda_{\text{MODE}}$ controls the macroblock mode decision when evaluating Eq. (4). For the INTER modes, the Lagrangian cost function in Eq. (4) depends on the motion-compensated prediction (MCP) signal and the DFD coding. The MCP signal is obtained by minimizing Eq. (3), which depends on the choice of $\lambda_{\text{MOTION}}$, while the DFD coding is controlled by the DCT quantizer value $Q$. Hence, for a fixed value of $\lambda_{\text{MODE}}$, a particular setting of $\lambda_{\text{MOTION}}$ and $Q$ yields a minimum Lagrangian cost function in Eq. (4). One approach to finding those values of $\lambda_{\text{MOTION}}$ and $Q$ is to evaluate the product space of these two parameters. However, this approach requires a prohibitive amount of computation. Therefore, the relationship between $\lambda_{\text{MODE}}$ and $Q$ is considered first, while $\lambda_{\text{MOTION}}$ is set to $\lambda_{\text{MOTION}} = \lambda_{\text{MODE}}$ when considering the SSD distortion measure in Eq. (3).

To obtain a relationship between $Q$ and $\lambda_{\text{MODE}}$, the minimization of the Lagrangian cost function in Eq. (4) is extended by the macroblock mode type INTER+Q, which permits changing $Q$ by a small amount when sending an INTER macroblock. More precisely, the macroblock mode decision is conducted by minimizing Eq. (4) over the set of macroblock modes {INTRA, SKIP, INTER, INTER+4V, INTER+Q(-2), INTER+Q(-1), INTER+Q(1), INTER+Q(2)}, where, for example, INTER+Q(-2) stands for the INTER macroblock mode being coded with the DCT quantizer value reduced by two relative to the previous macroblock. Hence, the $Q$ value selected by the minimization routine becomes dependent on $\lambda_{\text{MODE}}$. Otherwise, the algorithm for running the rate-distortion optimized video coder remains unchanged.

Fig. 2 shows the relative occurrence of macroblock QUANT values (with QUANT as defined in H.263, $Q = 2 \cdot \text{QUANT}$) for several Lagrange parameter settings. The Lagrange multiplier $\lambda_{\text{MODE}}$ is varied over seven values: 4, 25, 100, 250, 400, 730, and 1000, producing seven normalized histograms that are depicted in the plots in Fig. 2. The macroblock QUANT values in Fig. 2 are gathered while coding 100 frames of the video sequences Foreman, Mobile-Calendar, Mother-Daughter, and News. As can be seen from the histograms, the peaks are very similar among the four sequences and depend only on the choice of $\lambda_{\text{MODE}}$. Fig. 3 shows the average macroblock QUANT obtained when coding the complete sequences Foreman, Mobile-Calendar, Mother-Daughter, and News. The bold curve in Fig. 3 depicts the function

$$\lambda_{\text{MODE}} = 0.85 \cdot (\text{QUANT})^2 \qquad (5)$$

which approximates the functional relationship between the macroblock QUANT and the Lagrange parameter $\lambda_{\text{MODE}}$ up to QUANT values of 25; H.263 allows only a choice of $\text{QUANT} \in \{1, 2, \ldots, 31\}$. Particularly remarkable is the strong dependency between $\lambda_{\text{MODE}}$ and QUANT, even for sequences with widely varying content.
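As a rough illustration of how Eq. (4) and Eq. (5) work together, the sketch below derives $\lambda_{\text{MODE}}$ from QUANT and then picks the macroblock mode with the smallest Lagrangian cost. The per-mode distortion and rate values are hypothetical, and for simplicity they are held fixed across QUANT; in an actual coder both would be re-measured for every mode and quantizer value.

```python
def lambda_mode(quant):
    """Eq. (5): lambda_MODE = 0.85 * QUANT^2 for H.263 QUANT in 1..31."""
    if not 1 <= quant <= 31:
        raise ValueError("H.263 QUANT must lie in 1..31")
    return 0.85 * quant ** 2


def choose_mode(mode_costs, quant):
    """Minimize J = D_REC + lambda_MODE * R_REC (Eq. 4) over the given modes.

    mode_costs maps a mode name to (D_REC as SSD, R_REC in bits).
    """
    lam = lambda_mode(quant)
    return min(
        mode_costs,
        key=lambda mode: mode_costs[mode][0] + lam * mode_costs[mode][1],
    )


# Hypothetical measurements for one macroblock: mode -> (SSD, bits).
# These values are assumptions made for illustration only.
EXAMPLE_MB = {
    "INTRA": (900.0, 310),
    "SKIP": (5200.0, 1),
    "INTER": (1100.0, 95),
    "INTER+4V": (950.0, 140),
}

if __name__ == "__main__":
    for quant in (1, 4, 10):
        print(quant, lambda_mode(quant), choose_mode(EXAMPLE_MB, quant))
```

With these assumed numbers, the decision moves from INTER+4V through INTER to SKIP as QUANT grows, because a larger $\lambda_{\text{MODE}}$ penalizes the modes that spend more bits.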

Fig. 2 Relative occurrence vs. macroblock QUANT for various Lagrange parameter settings

Fig. 3 Lagrange parameter vs. average macroblock QUANT

As a further justification of our simple approximation of the relationship between $\lambda_{\text{MODE}}$ and QUANT, let us assume that a typical high-rate approximation curve for entropy-constrained scalar quantization can be written as

$$R(D) = a \log_2\left(\frac{b}{D}\right) \qquad (6)$$

where $a$ and $b$ depend on the source pdf. The minimization of Eq. (4) can then be accomplished by setting the derivative of $J$ with respect to $R$ equal to zero, i.e.,

$$\frac{dJ}{dR} = \frac{dD}{dR} + \lambda_{\text{MODE}} = 0 \qquad (7)$$

Since Eq. (6) gives $dR/dD = -a / (D \ln 2)$, this yields

$$\lambda_{\text{MODE}} = -\frac{dD}{dR} = \frac{\ln 2}{a} D \qquad (8)$$

For the distortion-to-quantizer relation, it is assumed that at sufficiently high rate the source probability distribution can be approximated as uniform within each quantization interval, yielding

$$D = \frac{(2 \cdot \text{QUANT})^2}{12} = \frac{(\text{QUANT})^2}{3} \qquad (9)$$

Then we get

$$\lambda_{\text{MODE}} = -\frac{dD(\text{QUANT})}{dR(\text{QUANT})} = c \cdot (\text{QUANT})^2 \qquad (10)$$

where $c = \ln 2 / (3a)$. Although our assumptions may not be completely realistic, the derivation reveals at least the qualitative insight that it is reasonable for the Lagrange multiplier $\lambda_{\text{MODE}}$ to be proportional to the square of the quantization parameter. As shown above, 0.85 appears to be a reasonable value for the constant $c$.

To confirm the relationship in Eq. (10), an experiment has been conducted to measure the rate-distortion slopes $dD(\text{QUANT})/dR(\text{QUANT})$ for a given value of QUANT. The experiment consists of the following steps:

1. The hybrid video coder is run employing quantizer values $Q_{\text{REF}} \in \{4, 5, 7, 10, 11, 15, 25\}$. The resulting bit-streams are decoded, and the reconstructed frames are employed as reference frames in the next step.
2. Given the coded reference frames, the MCP signal is computed for a fixed value of $\lambda_{\text{MOTION}} = 0.85 \cdot Q_{\text{REF}}^2$ when employing the SSD distortion measure in the minimization of Eq. (3). Here, only 16×16 blocks are utilized for motion compensation. The MCP signal is subtracted from the original signal, providing the DFD signal that is further processed in the next step.
3. The DFD signal is encoded for each frame while varying the value of the DCT quantizer in the range $Q \in \{1, 2, \ldots, 31\}$ for the INTER macroblock mode. The other macroblock modes have been excluded to avoid the macroblock mode decision that involves Lagrangian optimization using $\lambda_{\text{MODE}}$.
4. For each sequence and each $Q_{\text{REF}}$, the distortion and rate values per frame, including the motion vector bit rate, are averaged, and the slopes are computed numerically.

Via this procedure, the relationship between the DCT quantizer value $Q$ and the negative slope of the distortion-rate curve has been obtained, as shown in Fig. 4. This experiment shows that the relationship in Eq. (10) can be measured using the rate-distortion curve for the DFD coding part, and it is also employed to establish Eq. (5).

Fig. 4 Measured slopes vs. average macroblock QUANT

This ties together two of the three optimization parameters, QUANT and $\lambda_{\text{MODE}}$. For the third, $\lambda_{\text{MOTION}}$, we adjust the relationship to allow use of the SAD measure rather than the SSD measure in that stage of encoding. Experimentally, we have found that an effective method is to measure distortion during motion estimation using SAD and simply adjust $\lambda$ for the lack of the squaring operation, as given by

$$\lambda_{\text{MOTION}} = \sqrt{\lambda_{\text{MODE}}} \qquad (11)$$

Using this adjustment, experiments show that both distortion measures, SSD and SAD, provide very similar results.

5. Conclusions

We have described the structure of typical video coders and shown that their design and operation require a keen understanding and analysis of the trade-offs between bit rate and distortion. The single powerful principle of $D + \lambda R$ Lagrange multiplier optimization has emerged as the weapon of choice in the optimization operation. In practical systems, due to the strong dependency between $Q$ and $\lambda$, we can employ a simple deterministic relationship to set the Lagrange multiplier $\lambda$, and hence decide the best coding modes and motion vectors in the rate-distortion sense.
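The rate-constrained motion search of Eq. (3), with $\lambda_{\text{MOTION}}$ set from QUANT through Eq. (5) and Eq. (11), can be sketched as a full search over a small window. This is a minimal illustration under stated assumptions: the motion-vector bit model `mv_bits` is hypothetical (a real coder would consult its variable-length code tables), and the two frames are synthetic intensity ramps built so that the true displacement is known.

```python
import math


def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of ref
    displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def mv_bits(dx, dy):
    """Hypothetical rate model (an assumption): longer vectors cost more bits."""
    return 2 + 2 * (abs(dx) + abs(dy))


def motion_search(cur, ref, bx, by, n, search_range, quant):
    """Full search minimizing J = SAD + lambda_MOTION * R_MOTION (Eq. 3)."""
    lam_motion = math.sqrt(0.85 * quant ** 2)  # Eq. (11) applied to Eq. (5)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates whose reference block leaves the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n) + lam_motion * mv_bits(dx, dy)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


# Synthetic 12 x 12 frames: ref is an intensity ramp, and cur shows the same
# content displaced so that the exact match lies at (dx, dy) = (2, 1).
SIZE = 12
REF = [[10 * x + 3 * y for x in range(SIZE)] for y in range(SIZE)]
CUR = [[10 * (x + 2) + 3 * (y + 1) for x in range(SIZE)] for y in range(SIZE)]

if __name__ == "__main__":
    for quant in (1, 31):
        print(quant, motion_search(CUR, REF, 4, 4, 4, 3, quant))
```

On this synthetic input, a small QUANT recovers the exact displacement, while the largest QUANT settles for a cheaper vector with a small nonzero SAD, which is the rate bias that $\lambda_{\text{MOTION}}$ is meant to introduce.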
