A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation


2009 Third International Conference on Multimedia and Ubiquitous Engineering

Yuan Li, Ning Han, Chen Chen
Department of Automation, School of Technology, Beijing Forestry University, Beijing, China 100083
e-mail: bjfu.yuan.li@gmail.cn, hn217@bjfu.edu.cn, chen_chen09@yahoo.cn

Abstract

Because of its computational complexity, the H.264 deblocking filter is difficult to apply on low-end terminals. Although several optimizations have been proposed, the complexity is still too high for real-time implementation. The deblocking filter is applied to all vertical and horizontal edges of 4x4 blocks, but only some of these edges actually need to be filtered. In this paper, based on information already produced during encoding or decoding, we propose a novel deblocking filter algorithm that reduces the complexity of existing algorithms by skipping a large number of edges that do not need filtering. The experimental results show that the proposed algorithm performs better than the JM 10.2 algorithm in both subjective video quality and time efficiency.

Keywords- H.264/AVC; deblocking filter; loop filter

I. INTRODUCTION

The H.264/AVC standard provides higher compression efficiency than previous video coding standards such as MPEG-4 and H.263, mainly because of the compression tools added to the standard [1][2], for example the block-based discrete cosine transform (DCT) and block-based motion compensated prediction (MCP). Because of this coding efficiency, the use of H.264 has been growing in consumer electronic products for applications such as video telephony and video surveillance. However, block-based processing adds visible blocking artifacts to the reconstructed frames. To overcome this problem, an in-loop deblocking filter is used to remove the blocking artifacts.
Although the deblocking filter improves the quality of the output video frames, the computational complexity of H.264 is two to three times higher than that of H.263 and MPEG-4. In [3], the time spent in the H.264 decoder is reported as follows: loop filtering (33%), interpolation (25%), bit-stream parsing and entropy decoding (13%), and reconstruction (13%). Loop filtering therefore has the largest computational cost.

To achieve high-performance implementations for real-time applications, many efficient hardware architectures for the H.264 deblocking filter have been proposed. A deblocking architecture with an extremely low gate count for mobile applications was presented in [4], and various fast memory access techniques have been proposed [5][6] to satisfy real-time constraints. None of these solutions considers algorithmic enhancements for overall speedup, so the complexity cannot be reduced significantly because the flow of the algorithm itself is unchanged.

On the other hand, several new deblocking algorithms in the literature achieve better video quality. For instance, a post deblocking filter has been added to the original algorithm [7], which improves performance at the cost of additional time and memory. In [8], a projection onto convex sets (POCS) approach has been proposed. The idea behind POCS is to iteratively impose different sets of constraints to reduce the blocking artifacts; a set of constraints can be applied as long as it is closed and convex. Chia-Hung Yeh, Kai-Lin Huang and Shiunn-Jang Chern [9] have introduced a color analysis post-processing deblocking filter that reduces blocking effects based on human color vision and color psychology analysis. However, few of the methods above report their algorithmic complexity.
To obtain sufficient gains to meet real-time requirements, this paper introduces two optimization steps that fully utilize encoding or decoding information. The rest of this paper is organized as follows. The H.264/AVC deblocking filtering algorithm is reviewed in Section II. Our proposed algorithm is described in Section III. In Section IV, simulation results are presented and compared with the reference model JM 10.2 software. Finally, the paper ends with the conclusion in Section V.

II. THE H.264/AVC DEBLOCKING ALGORITHM

In the H.264/AVC standard, the deblocking filter treats every decoded 16x16 luminance macroblock as a filtering unit. As depicted in Figure 1, the boundary strength (BS) is calculated first for the vertical edges (a-d) and then for the horizontal edges (e-h). For each edge, the data is processed as 16 horizontal or vertical lines of pixels (LOP), numbered 0-15. The rules that determine the BS for each LOP are shown in Table I. Processing one 16x16 macroblock therefore requires 8 edges x 16 LOP = 128 separate operations. The adaptive filter is applied to a LOP only if condition (1) is true:

$$ BS \neq 0, \quad |p_0 - q_0| < \alpha, \quad |p_1 - p_0| < \beta, \quad |q_1 - q_0| < \beta \qquad (1) $$

Here p_0, p_1 and q_0, q_1 are the pixels on either side of the edge, with p_0 and q_0 adjacent to it, and α and β are thresholds that depend on the quantization parameter.

In our complexity analysis experiment, BS computation and detection took up about 90% of the overall deblocking time. Our optimization therefore reuses information that is already available before the deblocking filter runs, in order to avoid the expensive BS operations.

978-0-7695-3658-3/09 $25.00 © 2009 IEEE. DOI 10.1109/MUE.2009.97
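Condition (1) can be illustrated with a short Python sketch. This is only a minimal illustration, not the JM implementation: the function name `should_filter` is hypothetical, and α and β are passed in directly as assumed example values rather than being looked up from the QP-dependent tables of the standard.

```python
def should_filter(bs, p1, p0, q0, q1, alpha, beta):
    """Return True when condition (1) holds for one line of pixels (LOP).

    p1, p0 lie on one side of the edge and q0, q1 on the other;
    p0 and q0 are the two pixels adjacent to the edge.
    alpha and beta are assumed to be supplied by the caller.
    """
    return (
        bs != 0
        and abs(p0 - q0) < alpha
        and abs(p1 - p0) < beta
        and abs(q1 - q0) < beta
    )


# Illustrative values only: a small step across the edge is filtered ...
small_step = should_filter(bs=2, p1=100, p0=102, q0=106, q1=107, alpha=10, beta=4)
# ... while a large step is treated as a real image edge and left alone.
large_step = should_filter(bs=2, p1=100, p0=102, q0=140, q1=141, alpha=10, beta=4)
print(small_step, large_step)
```

The check is cheap compared with deriving BS itself, which is why the optimizations in this paper target the BS computation rather than this threshold test.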

Figure 1. The order of computing BS in a 16x16 macroblock.

Figure 2. Block size modes used for inter prediction in H.264.

TABLE I. DETERMINING BS FOR LOP.

BS   Condition
4    p or q is intra coded and the boundary is a macroblock boundary
3    p or q is intra coded and the boundary is not a macroblock boundary
2    neither p nor q is intra coded; p or q contains coded coefficients
1    neither p nor q is intra coded; neither p nor q contains coded coefficients; p and q have different reference frames, a different number of reference frames, or different motion vector values
0    otherwise

III. OUR PROPOSED OPTIMIZATION ALGORITHM

Since the DCT, intra/inter prediction and MCP all operate on units no smaller than a 4x4 block, each 4x4 block is an independent unit with a single set of coded coefficients, a single intra/inter mode and a single motion vector. According to Table I, the four LOP of a vertical or horizontal edge that belong to one 4x4 block therefore have the same BS. Our first optimization step is to calculate the BS values only for LOP 0, 4, 8 and 12 in Figure 1, and then copy the BS value of LOP 0 to LOP 1, 2 and 3; of LOP 4 to LOP 5, 6 and 7; of LOP 8 to LOP 9, 10 and 11; and of LOP 12 to LOP 13, 14 and 15. Because only 4 of the 16 BS values per edge are computed, this removes nearly 3/4 of the BS computation without any loss of video quality.

In H.264, seven block sizes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4) plus the skip mode can be used in a macroblock for inter prediction, as shown in Figure 2. We encoded three sequences (foreman, stefan and coastguard) at different quantization parameter (QP) values and collected statistics on the block modes used; the results are shown in Figure 3. As depicted in Figure 3, the skip, 16x16, 16x8 and 8x16 modes together account for over 60% of all block size modes. Moreover, the share of skip mode grows quickly as QP increases: at QP = 40, skip mode alone reaches approximately 60% of all block size modes. These block modes therefore consume most of the deblocking time.

Figure 3. Block size modes used in the three sequences.

H.264 uses the large block size modes (16x16, 16x8 and 8x16) for areas whose pixels change less than those covered by the small block size modes, and a separate motion vector is required for each partition, as shown in Figure 4. According to Table I, we can conclude that in the 16x16, 16x8 and 8x16 modes the pixels inside a partition share the same motion vector and all belong to an inter macroblock, so almost no blocking artifacts appear in the interior of these block size modes.

Figure 4. Different block modes and motion vectors assigned in the residual of stefan.
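The BS rules of Table I and the first optimization step (compute BS once per 4x4 block, then copy it to the block's four LOP) can be sketched as follows. This is a minimal sketch under stated assumptions: the `Block4x4` record, its field names and the helper names are hypothetical, and motion vectors are compared for plain inequality as Table I words it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block4x4:
    """Hypothetical record of the coding information of one 4x4 block."""
    intra: bool        # block is intra coded
    has_coeffs: bool   # block contains coded coefficients
    refs: tuple        # reference frame indices used by the block
    mv: tuple          # motion vector (x, y)


def boundary_strength(p, q, on_mb_boundary):
    """BS for the edge between blocks p and q, following Table I."""
    if p.intra or q.intra:
        return 4 if on_mb_boundary else 3
    if p.has_coeffs or q.has_coeffs:
        return 2
    # Comparing the tuples covers both different frames and a
    # different number of reference frames.
    if p.refs != q.refs or p.mv != q.mv:
        return 1
    return 0


def edge_bs(p_blocks, q_blocks, on_mb_boundary):
    """BS for all 16 LOP of one edge, computed only at LOP 0, 4, 8, 12."""
    bs = []
    for p, q in zip(p_blocks, q_blocks):          # four 4x4 block pairs
        value = boundary_strength(p, q, on_mb_boundary)
        bs.extend([value] * 4)                    # copy to the other three LOP
    return bs


inter = Block4x4(intra=False, has_coeffs=False, refs=(0,), mv=(1, 0))
p_side = [
    Block4x4(True, False, (), (0, 0)),     # intra block
    Block4x4(False, True, (0,), (1, 0)),   # coded coefficients
    Block4x4(False, False, (1,), (1, 0)),  # different reference frame
    inter,                                 # identical to its neighbour
]
result = edge_bs(p_side, [inter] * 4, on_mb_boundary=True)
print(result)
```

With this layout `boundary_strength` runs 4 times per edge instead of 16, which is the source of the roughly 3/4 saving in BS computation.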

In inter prediction, skip mode is a very special case. To be coded in skip mode, a macroblock must meet all four of the following conditions: (i) the block size mode is 16x16; (ii) the reference frame is the immediately preceding frame; (iii) the motion vector value is zero; (iv) all transform coefficients are quantized to zero. These conditions indicate that no blocking artifacts are generated on the boundary of a skip mode macroblock. Consequently, applying the deblocking filter to the numerous skip mode macroblocks wastes a considerable amount of time.

On the basis of this analysis, we propose our second optimization step. Before applying the deblocking filter to a macroblock, we first obtain the block size mode from the preceding inter prediction process, which requires no extra computation. If the current macroblock is in skip mode, we skip it directly instead of running any blocking artifact detection on it. Because skip mode macroblocks make up such a large proportion of all macroblocks, this avoids a great deal of unnecessary work in the deblocking filter process.

For the 16x16, 16x8 and 8x16 modes, we apply the deblocking filter only to the boundaries where blocking artifacts occur most frequently. Referring to Figure 1, we filter only vertical edge a and horizontal edge e in a 16x16 mode macroblock; vertical edge a and horizontal edges e and h in an inter 16x8 mode macroblock; and vertical edges a and c and horizontal edge e in an 8x16 mode macroblock. We thus avoid filtering some of the internal edges of the 16x16, 16x8 and 8x16 modes, which saves a considerable amount of time. Our two-step optimized deblocking algorithm is illustrated in Figure 5.
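The mode-dependent edge selection described above can also be written as a small runnable sketch. The mode labels, the table name `EDGES_BY_MODE` and the helper functions are hypothetical; the edge letters follow the text exactly as stated (including edges e and h for the 16x8 mode), and the count assumes 4 BS computations per filtered edge after the first optimization step.

```python
# Edges filtered per macroblock mode, as listed in the text (labels a-h
# refer to Figure 1: a-d vertical, e-h horizontal).
EDGES_BY_MODE = {
    "skip": (),
    "16x16": ("a", "e"),
    "16x8": ("a", "e", "h"),
    "8x16": ("a", "c", "e"),
}
ALL_EDGES = ("a", "b", "c", "d", "e", "f", "g", "h")


def edges_to_filter(mode):
    """Return the edges to process; any other mode filters all eight."""
    return EDGES_BY_MODE.get(mode, ALL_EDGES)


def bs_computations(modes):
    """BS computations needed with both optimization steps applied."""
    return sum(4 * len(edges_to_filter(mode)) for mode in modes)


def baseline_bs_computations(modes):
    """BS computations without optimization: 8 edges x 16 LOP each."""
    return 128 * len(modes)


# Hypothetical mix of four macroblocks.
macroblocks = ["skip", "skip", "16x16", "8x8"]
optimized = bs_computations(macroblocks)
baseline = baseline_bs_computations(macroblocks)
print(optimized, baseline)
```

For this made-up mix the count drops from 512 to 40 BS computations; real savings depend on the mode distribution of the sequence, which is why the statistics in Figure 3 matter.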
    Deblocking filter start
        If macroblock is skip mode
            break;
        Else if macroblock is 16x16 mode
            Deblockedge(a,e);
        Else if macroblock is 16x8 mode
            Deblockedge(a,e,h);
        Else if macroblock is 8x16 mode
            Deblockedge(a,c,e);
        Else
            Deblockedge(a,b,c,d,e,f,g,h);

        Deblockedge() {
            compute BS for LOP 0 then copy it to LOP 1,2,3;
            compute BS for LOP 4 then copy it to LOP 5,6,7;
            compute BS for LOP 8 then copy it to LOP 9,10,11;
            compute BS for LOP 12 then copy it to LOP 13,14,15;
            apply adaptive filter on LOP;
        }
    Deblocking filter end

Figure 5. Our two-step optimized deblocking algorithm.

IV. EXPERIMENTAL RESULTS

To measure the performance of the proposed deblocking filter, the JM 10.2 software is used as the reference in our simulations. The test sequence parameters are shown in Table II. We use the commonly used tool PSNRCalculator to measure objective video quality as Peak Signal to Noise Ratio (PSNR). As Table III shows, the PSNR of the proposed algorithm is only slightly lower than that of JM 10.2, by at most 0.03 dB on average.

Intel VTune Performance Analyzer is used to measure the time reduction relative to JM 10.2. Our test platform is an Intel 686 processor with 1024 MB of memory. Since the absolute time for each sequence varies with the test platform, we measure time efficiency as the processing time of our proposed deblocking algorithm expressed as a percentage of the processing time of the JM 10.2 deblocking algorithm. In Figure 6, we test the filtering time of the fast activity sequence (stefan) at QP = 22, 28, 32, 36 and 40; the proposed algorithm saves 43.34%, 48.72%, 57.1%, 71.7% and 68.9% of the time, respectively, compared with JM 10.2. The results show that the proposed algorithm clearly reduces the filtering time at all QP values, and that the saving generally grows as QP increases. Since the bit rate falls as QP rises, our approach performs better in low bit rate coding.

TABLE II. PARAMETERS FOR TEST SEQUENCE.

Sequence name                     foreman, stefan, coastguard
Frame size                        CIF (352x288 pixels)
Encoded frames                    50
Reference frames                  5
Video format                      30 fps, IPPP sequence
Fast motion estimation            enabled
Fast mode decision                enabled
Motion estimation search range    16

TABLE III. Δ PSNR FOR DIFFERENT TEST SEQUENCE.

QP         foreman      stefan       coastguard
22         -0.02 dB     -0.02 dB      0.02 dB
28         -0.01 dB     -0.02 dB     -0.02 dB
32         -0.03 dB      0.01 dB     -0.02 dB
36         -0.06 dB     -0.01 dB     -0.02 dB
40         -0.04 dB      0.01 dB      0.02 dB
average    -0.03 dB     -0.006 dB    -0.004 dB

(Δ PSNR = PSNR(our proposed algorithm) - PSNR(JM))
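As a quick arithmetic check, the average row of Table III can be recomputed from the per-QP values. The snippet below is only an illustrative sketch; the dictionary layout and variable names are assumptions, and the numbers are the Δ PSNR values reported in the table for QP = 22, 28, 32, 36 and 40.

```python
# Delta PSNR in dB per sequence, ordered by QP = 22, 28, 32, 36, 40.
delta_psnr = {
    "foreman":    [-0.02, -0.01, -0.03, -0.06, -0.04],
    "stefan":     [-0.02, -0.02,  0.01, -0.01,  0.01],
    "coastguard": [ 0.02, -0.02, -0.02, -0.02,  0.02],
}

# Arithmetic mean of the five values for each sequence.
averages = {name: sum(values) / len(values) for name, values in delta_psnr.items()}

for name, value in averages.items():
    print(f"{name}: {value:.3f} dB")
```

The foreman mean works out to -0.032 dB, which the table rounds to -0.03 dB; the stefan and coastguard means match the reported -0.006 dB and -0.004 dB.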

Figure 6. Processing time of our proposed deblocking algorithm as a percentage of the processing time of the JM 10.2 deblocking algorithm.

Figure 7(a). Original 27th frame.

Figure 7(b). Reconstructed 27th frame, QP=40, without deblocking filter.

Figure 7(c). Reconstructed 27th frame, QP=40, with JM 10.2 deblocking algorithm.

Figure 7(d). Reconstructed 27th frame, QP=40, with our proposed deblocking algorithm.

Figure 6 also compares the low activity sequence (coastguard), the medium activity sequence (foreman) and the fast activity sequence (stefan). Our method runs faster on foreman and coastguard because lower activity sequences contain more large block size modes.

Figures 7(a)-(d) are provided to assess subjective video quality visually. In Figure 7(b), a large number of blocking artifacts are visible. With the JM 10.2 deblocking algorithm in Figure 7(c), the blocking artifacts have been removed, but the whole picture is blurred compared with the result of our proposed algorithm in Figure 7(d).

V. CONCLUSION

In this paper, a novel deblocking filter algorithm was proposed for real-time video encoding or decoding in H.264. The first optimization step copies the BS value of one LOP to the other three LOP in the same 4x4 block in order to avoid repeated calculations. The second optimization step skips edges that do not need filtering, so that no time is spent detecting BS values on them. The results verify that the computational cost can be decreased considerably, with better subjective video quality, especially in low bit rate coding or for low activity sequences.

ACKNOWLEDGMENT

This research was supported by the Introduction of Advanced International Forestry Science and Technology Program from the State Forestry Administration, China (Grant No. 2005-4-04) and the College Students Creative Research Program from the Ministry of Education of China (Grant No. GCS08016).

REFERENCES

[1] ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC, "Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification," JVT Doc. JVT-G050, 2003.
[2] T. Wiegand, G. J. Sullivan, G. Bjøntegaard, A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, Jul. 2003.
[3] M. Horowitz, A. Joch, F. Kossentini, et al., "H.264/AVC Baseline Profile Decoder Complexity Analysis," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 704-716, Jul. 2003.
[4] V. Ciric, I. Milentijevic, "Area-Time Tradeoffs in H.264/AVC Deblocking Filter Design for Mobile Devices," 9th International Symposium on Signal Processing and Its Applications, pp. 1-4, Feb. 2007.
[5] S.-C. Chang, W.-H. Peng, S.-H. Wang, T. Chiang, "A Platform Based Bus-interleaved Architecture for De-blocking Filter in H.264/MPEG-4 AVC," IEEE Trans. on Consumer Electronics, vol. 51, no. 1, pp. 249-255, Feb. 2005.
[6] B. Sheng, W. Gao, D. Wu, "An Implemented Architecture of Deblocking Filter for H.264/AVC," IEEE International Conference on Image Processing, vol. 1, pp. 665-668, Oct. 2004.
[7] Y.-M. Huang, J.-J. Leou, M.-H. Cheng, "A Post Deblocking Filter for H.264 Video," Computer Communications and Networks, ICCCN 2007, pp. 1137-1142, Aug. 2007.
[8] J. Zou, H. Yan, "A Deblocking Method for BDCT Compressed Images Based on Adaptive Projections," IEEE Trans. on Circuits and Systems for Video Technology, vol. 15, no. 3, pp. 430-435, Mar. 2005.
[9] C.-H. Yeh, K.-L. Huang, S.-J. Chern, "Deblocking Filter by Color Psychology Analysis for H.264/AVC Video Coders," Intelligent Information Hiding and Multimedia Signal Processing, 2008, pp. 802-805, Aug. 2008.