Implementation and analysis of Directional DCT in H.264

Similar documents
EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.

Video compression with 1-D directional transforms in H.264/AVC

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

LIST OF TABLES. Table 5.1 Specification of mapping of idx to cij for zig-zag scan 46. Table 5.2 Macroblock types 46

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France

Homogeneous Transcoding of HEVC for bit rate reduction

EE 5359 Low Complexity H.264 encoder for mobile applications. Thejaswini Purushotham Student I.D.: Date: February 18,2010

Objective: Introduction: To: Dr. K. R. Rao. From: Kaustubh V. Dhonsale (UTA id: ) Date: 04/24/2012

EE Low Complexity H.264 encoder for mobile applications

Advanced Video Coding: The new H.264 video compression standard

Digital Video Processing

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

NEW CAVLC ENCODING ALGORITHM FOR LOSSLESS INTRA CODING IN H.264/AVC. Jin Heo, Seung-Hwan Kim, and Yo-Sung Ho

A NOVEL SCANNING SCHEME FOR DIRECTIONAL SPATIAL PREDICTION OF AVS INTRA CODING

Transforms for the Motion Compensation Residual

Overview, implementation and comparison of Audio Video Standard (AVS) China and H.264/MPEG -4 part 10 or Advanced Video Coding Standard

MPEG-4: Simple Profile (SP)

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation

H.264 STANDARD BASED SIDE INFORMATION GENERATION IN WYNER-ZIV CODING

An Efficient Mode Selection Algorithm for H.264

An Optimized Template Matching Approach to Intra Coding in Video/Image Compression

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding

VHDL Implementation of H.264 Video Coding Standard

Reduced 4x4 Block Intra Prediction Modes using Directional Similarity in H.264/AVC

Performance Comparison between DWT-based and DCT-based Encoders

Optimizing the Deblocking Algorithm for. H.264 Decoder Implementation

Interframe coding A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames pri

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

STUDY AND PERFORMANCE COMPARISON OF HEVC AND H.264 VIDEO CODECS

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding.

H.264 to MPEG-4 Transcoding Using Block Type Information

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

1740 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 19, NO. 7, JULY /$ IEEE

HEVC The Next Generation Video Coding. 1 ELEG5502 Video Coding Technology

H.264 / AVC (Advanced Video Coding)

PERFORMANCE ANALYSIS OF INTEGER DCT OF DIFFERENT BLOCK SIZES USED IN H.264, AVS CHINA AND WMV9.

IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.264 FOR BASELINE PROFILE SHREYANKA SUBBARAYAPPA

Video Coding Using Spatially Varying Transform

EE5359:MULTIMEDIA PROCESSING

THE H.264 ADVANCED VIDEO COMPRESSION STANDARD

Reducing/eliminating visual artifacts in HEVC by the deblocking filter.

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications:

Performance analysis of Integer DCT of different block sizes.

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

H.264/AVC Baseline Profile to MPEG-4 Visual Simple Profile Transcoding to Reduce the Spatial Resolution

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

Complexity Reduction Tools for MPEG-2 to H.264 Video Transcoding

Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain

A COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION

AUDIOVISUAL COMMUNICATION

EE5359:MULTIMEDIA PROCESSING

Professor, CSE Department, Nirma University, Ahmedabad, India

H.264/AVC BASED NEAR LOSSLESS INTRA CODEC USING LINE-BASED PREDICTION AND MODIFIED CABAC. Jung-Ah Choi, Jin Heo, and Yo-Sung Ho

STUDY AND PERFORMANCE COMPARISON OF HEVC AND H.264 VIDEO ENCODERS

An Efficient Intra Prediction Algorithm for H.264/AVC High Profile

Enhanced Hexagon with Early Termination Algorithm for Motion estimation

Department of Electrical Engineering

Reduced Frame Quantization in Video Coding

Zonal MPEG-2. Cheng-Hsiung Hsieh *, Chen-Wei Fu and Wei-Lung Hung

VIDEO AND IMAGE PROCESSING USING DSP AND PFGA. Chapter 3: Video Processing

CMPT 365 Multimedia Systems. Media Compression - Image

Fast Mode Decision for H.264/AVC Using Mode Prediction

A Novel Partial Prediction Algorithm for Fast 4x4 Intra Prediction Mode Decision in H.264/AVC

Stereo Image Compression

Video Quality Analysis for H.264 Based on Human Visual System

PERFORMANCE ANALYSIS AND IMPLEMENTATION OF MODE DEPENDENT DCT/DST IN H.264/AVC PRIYADARSHINI ANJANAPPA

Combined Copyright Protection and Error Detection Scheme for H.264/AVC

Intra Prediction Efficiency and Performance Comparison of HEVC and VP9

ISSN (ONLINE): , VOLUME-3, ISSUE-1,

LECTURE VIII: BASIC VIDEO COMPRESSION TECHNIQUE DR. OUIEM BCHIR

Video Compression An Introduction

Introduction to Video Coding

Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding

Week 14. Video Compression. Ref: Fundamentals of Multimedia

EFFICIENT DEISGN OF LOW AREA BASED H.264 COMPRESSOR AND DECOMPRESSOR WITH H.264 INTEGER TRANSFORM

Adaptive Quantization for Video Compression in Frequency Domain

VIDEO COMPRESSION STANDARDS

COMPLEXITY REDUCTION FOR VP6 TO H.264 TRANSCODER USING MOTION VECTOR REUSE JAY R PADIA. Presented to the Faculty of the Graduate School of

VC 12/13 T16 Video Compression

A Study on Structural Similarity Based Interframe Video Coding

Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology

Final report on coding algorithms for mobile 3DTV. Gerhard Tech Karsten Müller Philipp Merkle Heribert Brust Lina Jin

Intra Prediction Efficiency and Performance Comparison of HEVC and VP9

Variable Temporal-Length 3-D Discrete Cosine Transform Coding

High Efficiency Video Coding. Li Li 2016/10/18

Fast Implementation of VC-1 with Modified Motion Estimation and Adaptive Block Transform

Vidhya.N.S. Murthy Student I.D Project report for Multimedia Processing course (EE5359) under Dr. K.R. Rao

Chapter 10. Basic Video Compression Techniques Introduction to Video Compression 10.2 Video Compression with Motion Compensation

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING

PERFORMANCE ANALYSIS OF AVS-M AND ITS APPLICATION IN MOBILE ENVIRONMENT

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

Professor Laurence S. Dooley. School of Computing and Communications Milton Keynes, UK

arxiv: v1 [cs.mm] 9 Aug 2017

ECE 417 Guest Lecture Video Compression in MPEG-1/2/4. Min-Hsuan Tsai Apr 02, 2013

Video Compression Standards (II) A/Prof. Jian Zhang

Reduction of Blocking artifacts in Compressed Medical Images

Rate Distortion Optimization in Video Compression

Block-Matching based image compression

Mesh Based Interpolative Coding (MBIC)

Transcription:

Implementation and analysis of Directional DCT in H.264 EE 5359 Multimedia Processing Guidance: Dr K R Rao Priyadarshini Anjanappa UTA ID: 1000730236 priyadarshini.anjanappa@mavs.uta.edu

Introduction A popular scenario in image blocks is the occurrence of directional edges. By recognizing such characteristics, the video coding standard H.264/advanced video coding (AVC) [2] has developed a number of directional predictions in the coding of all intra blocks- called intra predictions. But it is still the conventional discrete cosine transform (DCT) [3] that is used after each intra prediction. The transform used by H.264/AVC to process both intra and inter prediction residuals is related to an integer 2-D DCT, implemented using horizontal and vertical transforms. It has been found that the coding efficiency can be improved by using directional transforms [1][6], since the residuals often contain textures that exhibit directional features.

Conventional DCT [3] The 2D DCT of a square or a rectangular block is used for almost all block-based transform schemes for image and video coding. Forward 2D DCT (NXM) Inverse 2D DCT (NXM)

The conventional 2D DCT is implemented separately through two 1D transforms, one along the vertical direction and the other along the horizontal direction, as shown in Fig.1. These two processes can be interchanged, as the 2D DCT is a separable transform. + Fig. 1. 2D DCT implementation: A combination of 1D DCTs along horizontal and vertical directions. The conventional DCT seems to be the best choice for image blocks in which vertical and/or horizontal edges are dominating.

H.264 Encoder Fig.2. Basic coding structure for H.264/AVC for a macroblock [2]

H.264 Profiles Fig.3. Illustration of H.264 profiles [14]

Intracoding in H.264 [8] Intra coding predicts the image content based on the values of previously decoded pixels. It has 9 prediction modes for 4x4 blocks 9 prediction modes for 8x8 blocks 4 prediction modes for 16x16 blocks. For each intra prediction mode, an intra prediction algorithm is used to predict the image content in the current block based on decoded neighbors. The intra prediction errors are transformed using an integer DCT. An additional 2x2 Hadamard transform is applied to the four DC coefficients of each chroma component. If a macroblock is coded in intra- 16x16 mode, a 4x4 Hadamard transform is performed for the 4x4 DC coefficients of the luma signal as shown in Fig. 5a.[2].

16x16 luma intra prediction modes Mode 0 (vertical): extrapolation from upper samples (H). Mode 1 (horizontal): extrapolation from left samples (V). Mode 2 (DC): mean of upper and left-hand samples (H+V). Mode 3 (Plane): a linear plane function is fitted to the upper and left-hand samples H and V. This works well in areas of smoothly-varying luminance. Fig. 4. 16x16 luma intra prediction modes [5]

Intra prediction modes in H.264 Fig. 5. 4x4 luma intra prediction modes [5] A-H -> they are the previously coded pixels of the upper macroblock and are available both at encoder/decoder. I-L -> they are the previously coded pixels of the left macroblock and are available both at encoder/decoder. M -> it is the previously coded pixel of the upper left macroblock and is available both at encoder/decoder.

Fig. 5a. 4x4 DC coefficients for intra 16x16 mode

Directional DCT (DDCT) A new DDCT framework [1] has been developed which provides a remarkable coding gain as compared to the conventional DCT. In H.264, the intra prediction errors are transformed using an integer DCT. In the new framework, DDCT is used to replace the AVC transforms by taking into account the intra prediction mode of the current block.

Modes in DDCT Fig. 6. Six directional modes in DDCT defined in a similar way as in H.264 for the block size 8x8. [1] (The vertical and horizontal modes are not included here)

DDCT implementation DDCT provides 9 transforms for 4x4, 9 transforms for 8x8, and 4 transforms for 16x16 [4][5][8]. For each intra prediction mode, DDCT consists of two stages: Stage 1 along the prediction direction: Pixels that align along the prediction direction are grouped together and 1-D DCT is applied. In cases of prediction modes that are neither horizontal nor vertical, the DCT transforms used are of different sizes. Fig. 7. NXN image block in which 1D DCT is applied along diagonal down left direction (mode 3) [1]

Stage 2 across the prediction direction: Another stage of 1-D DCT is applied to the transform coefficients resulted in the first stage. Again, the DCTs may be of different sizes. Fig. 8. Arrangement of coefficients after the first 1-D DCT, followed by rearrangement of coefficients after the second DCT as well as the modified zig zag scanning [1]

STEP 1 STEP 2 STEP 3 (a,b,, p) = 4x4 block of pixels in the 2D spatial domain (A, B,, P) = 1D DCT coefficients DCTs are of lengths L= 1, 2, 3, 4, 2 and 1 along the dominant direction in mode 3 Rearrangement of coefficients after the first 1D DCT STEP 4 STEP 5 = 1D DCT coefficients Horizontal DCTs of lengths L= 7, 5, 3, 2 and 1 Fig. 9. Implementation of mode 3 DDCT Rearrangement of coefficients after the second 1D DCT for zig zag scanning

After step 3 in Fig. 9., for each basis image repeat step 4 in Fig. 9. by replacing the corresponding coefficient with 1 and the remaining coefficients with 0. For a 4x4 block, 16 basis images are obtained. The same procedure is applied for all the other DDCT modes. Fig. 10. Procedure to obtain basis images

Basis Images Fig. 11. Basis Images for mode 3 DDCT, 4x4 block Fig. 12. Basis images for mode 0/1 DDCT, mode 3 DDCT and mode 5 DDCT [1]

Fig. 13. Getting mode 4 from mode 3 [8]

Fig. 14. Getting mode 6 from mode 5 [8]

Fig. 15. Getting mode 7 from mode 5 [8]

Fig. 16. Getting mode 8 from mode 5 [8]

Compression efficiency of DDCT [8] To make the transform sizes more balanced, the DDCTs group pixels in the corners together in order to use DCT of longer size, hence more efficient in terms of compression. Properties of DDCT [8] Adaptivity DDCT assigns a different transform and scanning pattern to each intra prediction mode unlike the integer DCT. Directionality By first applying the transform along the prediction direction, DDCT has the potential to minimize the artifacts around the object boundaries.

Symmetry Although there are 22 DDCTs for 22 intra prediction modes (9 modes for 4x4, 9 modes for 8x8, and 4 modes for 16x16), these transforms can be derived, using simple operators such as rotation and/ or reflection, from only 7 different core transforms: 16x16: one transform for all 16x16 modes 8x8 and 4x4: Modes 0, 1: The same transform similar to AVC, DCT is used, first horizontally, then vertically Modes 3 and 4: The DDCT for mode 4 can be obtained from the transform for mode 3 using a reflection on the vertical line at the center of the block Modes 5 to 8: The DDCT for modes 6-8 can be derived from mode 5 using reflection and rotation.

OBJECTIVE The objective of the project is to implement DDCT in place of Integer DCT in the transform block of the encoder in the H.264/AVC Reference Software JM17.2 [7]. A single intra prediction frame will be considered for the DDCT implementation. The coding will be done using Microsoft Visual Studio 2008. Coding simulations will be performed on various sets of test images and also on video formats like QCIF (Quarter Common Intermediate Format). The coding performance, with different quality assessment metrics like MSE, PSNR, SSIM and MS-SSIM will be observed. This project considers the main profile in H.264/AVC.

ENCODER CONFIGURATION: FramesToBeEncoded = 1 # Number of frames to be coded ProfileIDC = 77 # Profile IDC (66=baseline, 77=main, 88=extended;FREXT Profiles: 100=High, 110=High 10, 122=High 4:2:2, 244=High 4:4:4, 44=CAVLC 4:4:4 Intra, 118=Multiview High Profile, 128=Stereo High Profile) IntraProfile = 0 # Activate Intra Profile for FRExt (0: false, 1: true) # (e.g. ProfileIDC=110, IntraProfile=1 => High 10 Intra Profile) Transform8x8Mode = 0 # (0: only 4x4 transform, 1: allow using 8x8 transform additionally, 2: only 8x8 transform) Input YUV file: foreman_part_qcif.yuv Output H.264 bitstream: test.264 Output YUV file: test_rec.yuv YUV format: YUV 4:2:0 Frames to be encoded: 1 Frequency used for encoded bitstream: 30.00 fps Fig. 17. QCIF sequence used DistortionSSIM = 1 # Compute SSIM distortion. (0: disabled/default, 1: enabled) DistortionMS_SSIM = 1 # Compute Multiscale SSIM distortion. (0: disabled/default, 1: enabled)

Fig. 18. MSE versus bitrate for integer DCT in H.264

Fig. 19. Y-PSNR versus bitrate for integer DCT in H.264

Fig. 20. SSIM vs bitrate for integer DCT in H.264

Fig. 21. MS-SSIM vs bitrate for integer DCT in H.264

QP (I frame) Y PSNR (db) Y MSE Y SSIM Y MS-SSIM Total encoding time (s) Bit rate (kbit/s) 0 79.159 0.00079 1 1 10.876 5590.56 4 68.825 0.00852 1 1 10.032 5232.48 8 57.204 0.12378 0.9995 0.9999 9.183 3891.84 12 52.28 0.38463 0.9983 0.9997 8.23 2838.72 16 48.86 0.84553 0.9965 0.9996 7.292 2168.16 20 45.131 1.99499 0.9921 0.9992 6.433 1560.72 24 41.803 4.29364 0.9846 0.9985 5.666 1088.88 28 38.892 8.39157 0.9735 0.9973 4.968 745.68 32 35.862 16.8695 0.9566 0.9953 4.391 496.8 36 33.173 31.32035 0.9342 0.9919 4.067 331.92 40 30.279 60.97929 0.8955 0.984 3.731 221.76 44 27.981 103.5018 0.8388 0.9705 3.484 152.64 48 25.293 192.227 0.7532 0.935 3.321 104.88 51 23.335 301.693 0.6703 0.8967 3.073 72 Table 1. Variation of image metrics for Y component (Foreman QCIF sequence) in Integer DCT implementation of H.264

Prediction residuals and directional transforms in H.264 A transform is used to transform not only the image intensities but also prediction residuals of image intensities. The prediction residuals have different spatial characteristics from image intensities [6]. Unlike an image, the energy of the prediction residual is not uniformly distributed in the spatial domain. Even within a block, many pixels may have zero intensity and the energy may often be concentrated in a region of the block [15].

Fig. 22. Visual comparison between original video and prediction residual [10]

The residuals have more locally 1-D anisotropic features in horizontal/vertical and other directions as compared to the original video In this case, the conventional 2-D DCT may not be the best choice to de-correlate the residual blocks. There exists a high correlation along the anisotropic edges in the prediction residual. This spatial redundancy can be better exploited by use of directional DCT. A 2-D DDCT [1] can improve the coding gain for original images; however this may not be the optimal solution for 1-D structures in the residuals. For prediction residuals, 1-D DDCT along the direction with high correlated pixels can perform better energy compaction than 2D-DDCT. If 1D-DDCT used in neighboring blocks are with the same size and direction mode, the spatial redundancy can be further exploited by using 1D- conventional DCT for the corresponding coefficients among these blocks. The difference between 2D-DDCT and 1D-DDCT is that after 1D DCT is applied along the dominant direction in the block, in 1D-DDCT, the second 1D-DCT is not applied along the rows in the rearranged pattern, since the correlation of residuals in such a direction is relatively low.

Also, after 1D-DDCT instead of zigzag scanning, a different scanning pattern can be used as shown in Fig. 23. (a) (b) (c) Fig. 23. Comparison of scanning orders of DDCT coefficients: (a) mode 4 DDCT (b) scanning order after 2-D DDCT (c) scanning order after 1-D DDCT [10].

Mode Selection Selection of the best intra prediction mode can be done by using sum of absolute differences (SAD) or mean absolute differences (MAD). Another method used [10] is Gabor filtering. Before performing the DDCTs, spatial Gabor filters with different frequencies and orientation of 1-D structures in each inter-block. Fig. 24 denotes the filter kernels with eight directions: 0, By using such filters, each residual pixel has an output response with respect to the filter kernel, the largest response value will be chosen as the representative orientation, the number of pixels in each orientation is calculated. The orientation with the largest number of pixels is considered as the statistically major direction and the mode corresponding to the dominant direction is thus selected for directional transform [10].

Fig. 24. Spatial Gabor filter kernels with eight directions [10]

FUTURE WORK To implement a suitable method for mode decision of the intra prediction residuals based on the various approaches investigated. To integrate DDCT into H.264/AVC Reference Software. To compare image quality metrics MSE, PSNR, SSIM and MS- SSIM for DDCT with those of integer DCT in H.264. To compare performances of 1-D DDCT and 2-D DCCT.

REFERENCES [1] B. Zeng and J. Fu, Directional discrete cosine transforms A new framework for image coding, IEEE Trans. on Circuits and Systems for Video Technology, vol. 18, no. 3, pp. 305-313, Mar. 2008. [2] T. Wiegand et al, Overview of the H.264/AVC video coding standard, IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, pp. 560-576, Jul. 2003. [3] K. R. Rao and P. Yip, Discrete cosine transform- algorithms, advantages, applications, London, U.K.: Academic, 1990. *4+ I. Richardson, The H.264 advanced video compression standard, Wiley, 2 nd edition, 2010. [5] Intra prediction modes in H.264. Website:http://www.vcodex.com/files/h264_intrapred.pdf [6] F. Kamisli and J. S. Lim, Video compression with 1-d directional transforms in H.264/AVC, IEEE ICASSP, pp. 738-741, Mar. 2010.

[7] H.264/AVC reference software. Website: http://iphome.hhi.de/suehring/tml/download [8] Intra coding with directional DCT and directional DWT, Document: JCTVC-B107_r1 Website: http://wftp3.itu.int/av-arch/jctvc-site/2010_07_b_geneva/jctvc-b107.zip [9] B.Chen, H.Wang and L.Cheng, Fast directional discrete cosine transform for image compression, Opt. Eng., vol. 49, issue 2, article 020101, Feb. 2010. *10+ C. Deng et al, Performance analysis, parameter selection and extensions to H.264/AVC FRExt for high resolution video coding, J. Vis. Commun. Image R., vol. 22 (In Press), Available online, Feb. 2011. [11] Z.Wang et al, Image quality assessment: From error visibility to structural similarity, IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004. [12] W.Zhao, J.Fan and A.Davari, H.264-based wireless surveillance sensors in application to target identification and tracking, i-manager s Journal on Software Engineering, vol.4, no. 2, Oct. 2009. Website: http://web.eng.fiu.edu/fanj/pdf/j5_i-manager09h264_camera.pdf

[13] W.Zhao et al, H.264-based architecture of digital surveillance network in application to computer visualization, i-manager s Journal on Software Engineering, vol.4, no. 4, Apr. 2010. Website: http://web.eng.fiu.edu/fanj/pdf/j8_i-mgr10architecture_camera.pdf [14] D. Marpe, T. Wiegand and G. J. Sullivan, The H.264/MPEG-4 AVC standard and its applications, IEEE Communications Magazine, vol. 44, pp. 134-143, Aug. 2006. Website: http://iphome.hhi.de/wiegand/assets/pdfs/h264-avc-standard.pdf [15] F. Kamisli and J. S. Lim, Transforms for motion compensation residual, IEEE ICASSP, pp. 789-792, Apr. 2009. [16] Z.Wang, E.P.Simoncelli and A.C.Bovik, Multi-scale structural similarity for image quality assessment, Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 2, Nov. 2003.

Thank you!