Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain

Similar documents
Complexity Reduced Mode Selection of H.264/AVC Intra Coding

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46

Homogeneous Transcoding of HEVC for bit rate reduction

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

Efficient Dictionary Based Video Coding with Reduced Side Information

ARTICLE IN PRESS. Signal Processing: Image Communication

Advanced Video Coding: The new H.264 video compression standard

Extensions of H.264/AVC for Multiview Video Compression

An Improved H.26L Coder Using Lagrangian Coder Control. Summary

FRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS

Next-Generation 3D Formats with Depth Map Support

Improved H.264/AVC Requantization Transcoding using Low-Complexity Interpolation Filters for 1/4-Pixel Motion Compensation

Depth Estimation for View Synthesis in Multiview Video Coding

View Synthesis Prediction for Rate-Overhead Reduction in FTV

Video Coding Using Spatially Varying Transform

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France

H.264 to MPEG-4 Transcoding Using Block Type Information

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

View Synthesis for Multiview Video Compression

Block-based Watermarking Using Random Position Key

Reduced Frame Quantization in Video Coding

Title Adaptive Lagrange Multiplier for Low Bit Rates in H.264.

An Efficient Mode Selection Algorithm for H.264

Performance Comparison between DWT-based and DCT-based Encoders

Digital Video Processing

A Novel Partial Prediction Algorithm for Fast 4x4 Intra Prediction Mode Decision in H.264/AVC

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation

Fast Mode Decision for H.264/AVC Using Mode Prediction

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding

EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.

Editorial Manager(tm) for Journal of Real-Time Image Processing Manuscript Draft

Implementation and analysis of Directional DCT in H.264

NEW CAVLC ENCODING ALGORITHM FOR LOSSLESS INTRA CODING IN H.264/AVC. Jin Heo, Seung-Hwan Kim, and Yo-Sung Ho

IBM Research Report. Inter Mode Selection for H.264/AVC Using Time-Efficient Learning-Theoretic Algorithms

STACK ROBUST FINE GRANULARITY SCALABLE VIDEO CODING

Introduction to Video Compression

Rate Distortion Optimization in Video Compression

Reduced 4x4 Block Intra Prediction Modes using Directional Similarity in H.264/AVC

Overview: motion-compensated coding

CAMED: Complexity Adaptive Motion Estimation & Mode Decision for H.264 Video

An Efficient Intra Prediction Algorithm for H.264/AVC High Profile

Video compression with 1-D directional transforms in H.264/AVC

Complexity Reduction Tools for MPEG-2 to H.264 Video Transcoding

Transcoding from H.264/AVC to High Efficiency Video Coding (HEVC)

H.264/AVC Baseline Profile to MPEG-4 Visual Simple Profile Transcoding to Reduce the Spatial Resolution

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

Performance Optimization of an MPEG-2 to MPEG-4 Video Transcoder

Transcoding from H.264/AVC to High Efficiency Video Coding (HEVC)

Fraunhofer Institute for Telecommunications - Heinrich Hertz Institute (HHI)

EFFICIENT DEISGN OF LOW AREA BASED H.264 COMPRESSOR AND DECOMPRESSOR WITH H.264 INTEGER TRANSFORM

COMPARATIVE ANALYSIS OF DIRAC PRO-VC-2, H.264 AVC AND AVS CHINA-P7

Multi-path Transport of FGS Video

H.264/AVC und MPEG-4 SVC - die nächsten Generationen der Videokompression

Performance analysis of Integer DCT of different block sizes.

New Techniques for Next Generation Video Coding

Fast Transcoding From H.264/AVC To High Efficiency Video Coding

MRT based Fixed Block size Transform Coding

High Efficient Intra Coding Algorithm for H.265/HVC

Upcoming Video Standards. Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc.

The Scope of Picture and Video Coding Standardization

Introduction to Video Encoding

Very Low Complexity MPEG-2 to H.264 Transcoding Using Machine Learning

Introduction to Video Encoding

SAD implementation and optimization for H.264/AVC encoder on TMS320C64 DSP

A Survey on Early Determination of Zero Quantized Coefficients in Video Coding

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

[30] Dong J., Lou j. and Yu L. (2003), Improved entropy coding method, Doc. AVS Working Group (M1214), Beijing, Chaina. CHAPTER 4

Video Transcoding Architectures and Techniques: An Overview. IEEE Signal Processing Magazine March 2003 Present by Chen-hsiu Huang

BLOCK MATCHING-BASED MOTION COMPENSATION WITH ARBITRARY ACCURACY USING ADAPTIVE INTERPOLATION FILTERS

An Efficient Motion Estimation Method for H.264-Based Video Transcoding with Arbitrary Spatial Resolution Conversion

VIDEO COMPRESSION STANDARDS

Motion Vector Coding Algorithm Based on Adaptive Template Matching

A NOVEL SCANNING SCHEME FOR DIRECTIONAL SPATIAL PREDICTION OF AVS INTRA CODING

Video Compression An Introduction

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC

A COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION

Video Codecs. National Chiao Tung University Chun-Jen Tsai 1/5/2015

Video Compression Standards (II) A/Prof. Jian Zhang

Smoooth Streaming over wireless Networks Sreya Chakraborty Final Report EE-5359 under the guidance of Dr. K.R.Rao

Optimal Estimation for Error Concealment in Scalable Video Coding

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS

Updates in MPEG-4 AVC (H.264) Standard to Improve Picture Quality and Usability

Quality Improvement and Overhead Reduction for Soft Video Delivery

View Synthesis for Multiview Video Compression

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS

Pattern based Residual Coding for H.264 Encoder *

Overview of Multiview Video Coding and Anti-Aliasing for 3D Displays

VHDL Implementation of H.264 Video Coding Standard

Optimum Quantization Parameters for Mode Decision in Scalable Extension of H.264/AVC Video Codec

VIDEO streaming applications over the Internet are gaining. Brief Papers

Performance and Complexity Co-evaluation of the Advanced Video Coding Standard for Cost-Effective Multimedia Communications

TRANSCODING OF H264 BITSTREAM TO MPEG 2 BITSTREAM. Dr. K.R.Rao Supervising Professor. Dr. Zhou Wang. Dr. Soontorn Oraintara

Region-Based Rate-Control for H.264/AVC. for Low Bit-Rate Applications

Simple intra prediction algorithms for heterogeneous MPEG-2/H.264 video transcoders

CODING METHOD FOR EMBEDDING AUDIO IN VIDEO STREAM. Harri Sorokin, Jari Koivusaari, Moncef Gabbouj, and Jarmo Takala

EE 5359 H.264 to VC 1 Transcoding

Advanced Encoding Features of the Sencore TXS Transcoder

Fast Intra Mode Decision in High Efficiency Video Coding

Transcription:

MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Efficient MPEG- to H.64/AVC Transcoding in Transform-domain Yeping Su, Jun Xin, Anthony Vetro, Huifang Sun TR005-039 May 005 Abstract In this paper, we propose a transform-domain MPEG- to H.64/AVC intro video transcoder. In the transcoder, the input DCT coefficients are first converted to H.64 transform () coefficients entirely in the transform-domain. Rate-distortion optimized macroblock mode decision is then performed based on the coefficients. Finally, the coefficients are coded using the selected modes to generate the output H.64 bitstream. The proposed transcoder is equivalent to the conventional pixel-domain implementation in terms of functionality, but its complexity is sifnificantly lower (on average over 0%). To further reduce its computational complexity, we propose a fast mode decision algorithm based on a simple cost function calculated in the -domain. This fast algorithm reduces the complexity requirement about 50%, while maintains virtually the same coding efficiency. IEEE International Symposium on Circuits and Systems (ISCAS) This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved. Copyright c Mitsubishi Electric Research Laboratories, Inc., 005 0 Broadway, Cambridge, Massachusetts 039

Efficient MPEG- to H.64/AVC Transcoding in Transform-domain Yeping Su University of Washington EE Department Seattle WA 9895, USA su@ee.washington.edu Jun Xin, Anthony Vetro, Huifang Sun Mitsubishi Electric Research Laboratories Cambridge MA 039, USA {jxin, avetro, hsun}@merl.com Abstract In this paper, we propose a transform-domain MPEG- to H.64/AVC intra video transcoder. In the transcoder, the input DCT coefficients are first converted to H.64 transform () coefficients entirely in the transformdomain. Rate-distortion optimized macroblock mode decision is then performed based on the coefficients. Finally, the coefficients are coded using the selected modes to generate the output H.64 bitstream. The proposed transcoder is equivalent to the conventional pixel-domain implementation in terms of functionality, but its complexity is significantly lower (on average over 0%). To further reduce its computational complexity, we propose a fast mode decision algorithm based on a simple cost function calculated in the -domain. This fast algorithm reduces the complexity requirement about 50%, while maintains virtually the same coding efficiency. I. INTRODUCTION The latest video-coding standard, known as H.64 (or MPEG-4 AVC) [], has been developed collaboratively by the Joint Video Team of ISO/IEC MPEG and ITU-T VCEG. At an equivalent perceptual quality, AVC provides a bit-rate saving of about 50% relative to prior standards such as MPEG-. Due to its superior compression efficiency, it is expected to have a wide range of applications, including mobile broadcasting and storage. With the deployment of H.64, e.g., for mobile broadcasting, there is a need to convert video in the MPEG- format to videos of the H.64 format. This would enable more efficient network transmission and storage. As shown in Figure, a transcoder may first decode an input MPEG video to reconstruct the image pixels, and then encodes them in the H.64 format. This is referred to as a -Domain Transcoder (PDT). Note that we discuss intratranscoding only in this paper. It is well known that transform domain techniques may be simpler since they eliminate the need of inverse transform and transform. However, in the case of MPEG- to H.64 transcoding, the transform-domain approach must efficiently solve the following two mismatch problems: Input MPEG- Bitstream Input MPEG- Bitstream VLD/ I IDCT ( domain) Transform mismatch: an efficient algorithm for DCT-to- coefficient conversion. It must be simpler than the trivial concatenation of IDCT-. mismatch: efficient H.64 mode decision in transform-domain (-domain). Figure illustrates our proposed transform-domain transcoder, with the problems highlighted. As will be explained in this paper, both mismatch problems are solved more efficiently in the transform-domain than in the pixeldomain. Note that the intra prediction consumes little decision Inverse / Inverse Buffer H.64 Entropy Coding Figure. -domain intra transcoding architecture VLD/ I DCT-to- conversion (S-Transfor (-domain) decision Inverse Inverse Buffer H.64 Entropy Coding Figure. Tranform-domain intra transcoding architecture

computation since it can reuse the result of the mode decision. In the remainder of this paper, we discuss how these two problems are solved in our proposed novel transform-domain transcoder. We first discuss in Section II the efficient DCTto- conversion. Then in Section III, we discuss the transform-domain mode decision and a fast algorithm for further speedup. Simulation results are provided in Section IV before Section V concludes this paper. II. EFFICIENT DCT-TO- CONVERSION The DCT-to- conversion can be shown to be a D linear transform [], hereinafter referred as S-Transform. Let X denote an 8x8 block of DCT coefficients, the corresponding coefficient block Y, consisting of four 4x4 blocks, is give by 4x4 luma block) is shown in Figure 4. The prediction residual is transformed, quantized and entropy encoded to determine the rate, R(, for a given mode m. Then, inverse (a) (b) Figure 3. (a) Neighboring samples A- are used for prediction of samples a-p. (b) mode directions (except DC_Pred). Compute rate R The kernel matrix S is Y T = S X S () Inverse Inverse Determine distortion Compute cost (J=D+λ R) D () Buffers Figure 4. -domain RD cost computation S-Transform Determine rate R The symmetry of the kernel matrix can be utilized to design fast implementations of the transform. In this work, we use the fast integer algorithm proposed in [], which saves about 30% of operations compared with IDCT followed by. For the proof and more details of the fast algorithms, please refer to []. III. TRANSFORM-DOMAIN MODE DECISION A. Conventional Decision Let us consider the conventional H.64 pixel-domain mode decision (as in JM76), and in particular, the Rate- Distortion Optimized (RDO) decision for the _4x4 modes. Figure 3(a) illustrates the candidate neighboring pixels A- used for prediction of current 4x4 block pixels a-p. Figure 3(b) illustrates the eight directional prediction modes. In addition, DC prediction (DC_Pred) can also be used. Consider the rate distortion calculation in a video encoder with RDO_on, the conventional calculation of the Lagrange cost for one coding module (in this case for one Buffer quantization, inverse transform are performed and then compensated with the prediction block to get the reconstructed signal. The distortion, denoted SSD REC (, is computed as the sum of squared distance between the original block, s, and the reconstructed block ~ s ( m ) : SSD REC Inverse Determine distortion (-domain) Figure 5. Transform-domain RD cost computation ( = s ~ s ( Compute cost (J=D+λ R) where is the Lp-norm. The Lagrange cost is p computed using the rate and distortion as follows: D (3)

Cost = SSD ( + λ R( ) (4) 4 x4 REC MODE m where λ MODE is a Lagrange multiplier, which may be calculated as a function of the quantization parameter. The optimal coding mode corresponds to the mode with the minimum cost. Besides this RDO mode selection, a low-complexity algorithm (i.e., with RDO_off) would only calculate the sum of absolute distance of the Hadamard-transformed prediction residual signal: SATD( = T ( s sˆ( ) (5) Accuracy 0.95 0.9 0.85 0.8 Fast Decision Algorithm Verification where s ˆ( is the prediction signal for the mode m. In this case, the cost function would then be given by Cost * = SATD( + λ 4 ( δ ( m )) (6) 4 x 4 MODE = m where m * is the most probable mode for the block. B. Transform-domain Decision The proposed Transform-domain mode decision calculates the Lagrange cost for each mode according to Figure 5, which is based on our previous work on H.64 encoding [3]. Compared to the pixel-domain approach, the transform-domain implementation has several major differences in terms of computation involved: It saves one inverse computation for each candidate prediction mode, since the distortion is determined using the reconstructed and original residual coefficients. Instead of operating on the prediction residual pixels, the now operates on the prediction signals. of some intra-prediction signals is very simple to compute [3]. For example, for DC_Pred mode, there is only one nonzero DC element in the transformed prediction signal ( Sˆ( DC _ Pred)). Therefore, additional computational saving is achieved. Note that the distortion is calculated in the -domain. is not an orthonormal transform; hence, it does not preserve the L norm (energy). However, the distortion can still be calculated with proper coefficient weighting [3]. Let ~ s = s + e~ denote the reconstructed signal, where e ~ is the reconstructed residual signal. The pixel-domain distortion, SSD REC (, is given by (3). The same distortion as can be computed in the -domain as [3]: D ~ ( = ( E E( (7) ) W = 0.75 3 4 5 6 7 8 4 4 0 0 4 6 Number Of s Figure 6. Number of Test s vs. Accuracy. 0 0 W = 64 6 0 6 0 0 5 0 5 6 0 6 0 0 5 0 5 where E and E ~ are transformed residual signal and reconstructed residual signal. Note that there are two scaling factor matrixes, W and W, to compensate for the different norms of the, the inverse and the H.64 quantization design. The operator represents a scalar multiplication or element-wise multiplication of two matrices. For further details, please refer to [3]. C. Fast Decision When operating at high-complexity mode, the H.64 coder utilizes Lagrange coder control to optimize mode decisions in the rate-distortion sense. At low-complexity mode, the SATD cost in (6) requires much simpler computation, but the coding performance is not as good because the cost function is only an approximation of the actual RD cost given by (4). The proposed fast intra mode decision algorithm is based on the following observation: although choosing the mode with the smallest SATD value often misses the best mode in the RD sense, the best mode usually contains smaller SATD cost. In other words, the mode rankings according to the two cost functions are highly correlated. The basic idea is to rank all candidate modes using the easy-to-compute SATD costs, and then evaluate Lagrange RD costs only for the few best modes decided by the ranking. Based on the input coefficients of prediction residual signal, the algorithm is described in the following.

First, we compute the -domain cost for all candidate modes based on normalized -domain residual coefficients: cost ( = λ MODE ( S Sˆ( ) + 4 ( δ ( m = m )) Then, we sort the modes according to cost in ascending order, putting the first k smallest modes in the test set T. Next, we add DC_Pred into T if it is not in the set already. For the modes in T, compute (8) ~ cost ( = ( E E( ) + λ * R( ) (9) MODE m We finally select the best mode according to cost (. Note that in calculating (6), instead of using Hadamard transform, the distortion SATD is defined as the SAD of coefficients since they are already available in the transformdomain transcoder. The parameter k controls the complexityquality trade-off. To verify the correlations between rankings using cost and cost, a simple experiment is performed. We collect the two costs for all luma 4x4 blocks in the first frame of all 5 CIF test sequences coded with P=8, and then count the percentage of times when the best mode according to cost is in the test set T. This is called the mode prediction accuracy. The results are plotted in Figure 6 as k vs. accuracy. The strong correlation between the two costs is evident in the high accuracies shown. In this work, k is set to be 3. IV. SIMULATION RESULTS The proposed transcoder is implemented based on MSSG MPEG- software codec [4] and H.64 JM7.6 reference code [5]. All simulations are performed using Windows XP, Intel P4 CPU, Intel C++ Compiler 7.0, 30 frames/s CIF test sequences (00 frames each), RDO on, and UVLC entropy coding. The sequences are all intra-encoded at 6Mbps. Three transcoders are evaluated: Domain Transcoder (PDT), Transform Domain Transcoder (TDT), and Transform Domain Transcoder with Fast intra mode decision (TDT-F). The performance is measured using PSNR between the output H.64 video and the corresponding raw video. The computational complexity is measured using the runtime of the entire transcoder. Tables - show the RD performances of the three transcoders. Table 3 shows the complexity reductions of TDT/TDT-F over PDT. TDT achieves virtually the same RD performance as PDT. The computational saving of TDT over PDT is typically more than 0%. When the fast intra mode decision algorithm is used, TDT-F saves around 50% of the computation, with a negligible PSNR degradation of less than 0.dB for all test cases. V. CONCLUSIONS We proposed a transform-domain MPEG- to H.64 intra video transcoder. The transform-domain architecture is equivalent to the conventional pixel-domain implementation in terms of functionality, but it has significantly lower complexity. We achieved the complexity reduction by taking advantage of direct DCT-to- coefficient conversion and transform-domain mode decision. We also presented a fast intra mode decision algorithm utilizing transform-domain features that can further reduce its computational complexity. Simulations show that we can achieve significant computational reduction without sacrificing video quality. TABLE I. RD PERFORMANCE COMPARISONS WITH P=30 Akiyo Foreman Container Mobile REFERENCES Stefan PDT Bit-rate 53.0 659.48 3.60 5608.50 3807.39 PSNR 38.63 35.84 34.77 9.3 33.6 TDT Bit-rate 58.04 654.6 07.9 5593.96 3795.78 PSNR 38.59 35.83 34.75 9.30 33.5 TDT-F Bit-rate 57.7 656.48 08.46 5587.70 3789.77 PSNR 38.59 35.8 34.7 9.6 33.9 TABLE II. (Bit-rate unit: kbps, PSNR unit: db.) RD PERFORMANCE COMPARISONS WITH P=45 Akiyo Foreman Container Mobile Stefan PDT Bit-rate 495.55 558.8 60.09 34.6 909.97 PSNR 8.00 7.0 5.57.97 3.35 TDT Bit-rate 495.3 559.60 60.07 33.5 90.60 PSNR 7.93 7.04 5.54.95 3.35 TDT-F Bit-rate 496.4 56.50 60.77 37.36 94.96 PSNR 7.83 7.03 5.57.96 3.37 TABLE III. (Bit-rate unit: kbps, PSNR unit: db.) COMPLEXITY REDUCTIONS (%) RELATIVE TO PDT Akiyo Foreman Container Mobile Stefan TDT P=30.3%.5% 9.3% 7.9% 0.4% P=45 4.0% 4.6% 9.% 3.%.5% TDT-F P=30 49.8% 49.8% 49.9% 50.9% 5.5% P=45 5.7% 5.5% 49.4% 48.4% 49.9% [] ITU-T Rec. H.64 ISO/IEC 4496-0, Advanced Video Coding, 003. [] J. Xin, A. Vetro, and H. Sun, Converting DCT Coefficients to H.64/AVC Transform Coefficients, Pacific-rim Conference on Multimedia (PCM) 004. [3] J. Xin, A. Vetro, and H. Sun, Efficient Macroblock Coding Decision for H.64/AVC Video Coding, Picture Coding Symposium (PCS) 004. [4] MPEG- video codec v, available online at http://www.mpeg.org/mpeg/mssg. [5] H.64/AVC Reference Software JM7.6, available online at http://bs.hhi.de/~suehring/tml/download/.