Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE 5359 Gaurav Hansda 1000721849 gaurav.hansda@mavs.uta.edu
Outline Introduction to H.264 Current algorithms for intra prediction Proposed algorithms Implementation results Conclusions
Overview of H.264 H.264 is an industry standard. It defines a format for compressed video data. It provides a set of tools that can be used in a variety of ways to compress and communicate visual information. Purpose of a standard Define a coded representation (or syntax) that describes visual data in a compressed form and method of decoding the syntax to reconstruct visual information. compliant encoders and decoders can successfully interoperate with each other.
H.264 Profiles Main Profile Baseline Profile Extended Profile SP frames SI frames FMO Redundant slices I frames P frames CAVLC B frames Interlace CABAC Fig. 1. H.264 profiles [2]
Applications Broadcast television Streaming video Video storage and playback Videoconferencing Mobile video Studio distribution
Coding Process The image is divided into macroblocks (16x16 pixels). The macroblocks are grouped into slice groups, which are divided into slices. Each slice is coded either as an I, P or B slice (There are also types called SI and SP). In an I slice, all blocks are coded as I blocks. In a P slice, blocks are coded as I or P blocks. In a B slice, blocks are coded as I, P or B blocks.
Fig. 2. Typical H.264 encoder [18]
Fig.3. Typical H.264 decoder [18]
Mode decision of H.264 encoder Fig. 4. Mode decision hierarchy of an H.264 compliant encoder. [4]
Implication of Hierarchical Structure 1. To ensure the correctness of the decision at upper layer. 2. To ensure early termination is executed accurately and as early as possible. Most fast mode decision algorithms developed so far, only deal with a single stage of the mode decision hierarchy [5]-[14] and fail to achieve the best possible complexity reduction.
Intra-Prediction There are 3 macroblock (MB) modes for intra prediction of luma pixels: intra4x4 (I4MB), intra8x8 (I8MB), and intra16x16 (I16MB). Intra4MB and Intra8MB have 9 prediction modes as shown in Fig. 5(a). Intra16MB has only 4 prediction modes as shown in Fig. 5(b). Fig. 5. Prediction modes for (a) Intra4MB and (b) Intra16MB. [4]
Fig. 6. Prediction flow diagram [18] Fig. 7. Intra-prediction [18]
Mode Decision To achieve a better tradeoff between bit-rate and distortion, H.264 encoder adopts the rate-distortion (R-D) optimization framework and the Lagrangian technique for mode decision [2]. For intra frames, the best prediction mode of a block is defined as the mode that, among all prediction modes of the block, gives rise to the minimum R-D cost. The R-D cost of an MB mode is the sum of the minimum R- D cost of each individual block.
Proposed Algorithm for Block Size Decision Block size is highly correlated with texture complexity. Variance of a block corresponds to the total energy of the AC coefficients of the block, hence it is good measurement of the texture complexity. Thus variance based classification of texture complexity is used [16]. If variance is above the threshold, Intra4MB and Intra8MB is selected; otherwise, Intra8MB and Intra16MB is chosen. This is simple way to skip the examination of Intra4MB mode.
Fig. 8. Variance-based MB mode decision [4]
Improved Prediction Mode Decision Earlier algorithms only consider the edge information of the current block. The correlation between blocks was not considered. Hence the Most Probable Mode(MPM) is used. The MPM, which takes advantage of the spatial correlation of the prediction modes between the neighboring blocks and the current block for coding, is defined as the prediction mode of the left or the upper neighbor, whichever has the smaller prediction mode number.
Improved Prediction Mode Decision Each original image block is evenly divided into four subblocks first. Each subblock is represented by the average pixel magnitude of its pixels. Fig. 9. Formation of subsampled block for a block of (a) Intra4MB, (b) Intra8MB, and (c) Intra16MB [4]
Apply the following filters: Fig.10. Five sets of filter coefficients for dominant edge detection. [4] Determine the dominant edge.
Input 2x2 subsampled block Pass through the filters separately Determine the dominant edge Choose the candidate modes Fig. 11. Prediction mode decision [4]
Proposed algorithm for Intra Block Decision Intra block decision, for inter frames, occupies a considerable percentage of the total computations of inter-frame coding. Intra16MB takes much less computation time than the other modes. Hence, scaled R-D cost is used [4]. An MB is less probable to be intra coded if the R-D cost difference between best inter mode and Intra16MB is small. Denoting the scaled R-D cost differences between Intra16MB and the inter MB mode by dˆj, and based on the above observation, if dˆj is small both I4MB and I8MB can be skipped.
Joint Model (JM) Reference implementation standardized in ISO/IEC JTC1/SC29/WG11 Decoder implements almost all the features Encoder Exercises most of the important coding tools Provides an elaborate list of control parameters Offers a rate distortion optimized implementation Offers several fast computation options Serves as a reference for what is best quality possible using H.264 Good description of the reference algorithms exists Currently at v17.2 Can be downloaded from http://iphome.hhi.de/suehring/tml/download/
Sequences used Hall (CIF and QCIF) Container (CIF and QCIF) Mobile (CIF)
JM software implementation The JM reference software version 17.2 is implemented [17]. The conditions of the experiment are as follows. 1. Run on a PC with Intel Core i3 2.27GHz processor and 3.00 GB RAM. 2. Set the QP values to 16, 20, 24, and 28. 3. Number of frames to be coded=100 4. Enable the R-D optimization. 5. Choose context-adaptive binary arithmetic coding (CABAC) as the entropy coding method.
Performance QP PSNR(dB) Bit rate (kbits/s) Time (s) SSIM 28 38.685 697.64 20.022 0.9766 24 41.378 961.59 30.875 0.9831 20 44.118 1359.14 34.094 0.9883 16 47.085 1930.5 37.767 0.9934 Table I. Performance of Hall (QCIF) in JM reference software version 17.2
Variance-based block size decision Sequence Decrease in PSNR (db) Increase in bitrate (%) Container (QCIF) 0.099 0.49 23.7 Hall (QCIF) 0.051 0.27 11.5 Container (CIF) 0.045 0.26 25.4 Hall (CIF) 0.054 0.43 24.2 Mobile (CIF) 0.019 0.07 7.41 Decrease in encoding time (%) Table II. Performance of variance based block size decision in JM reference software version 17.2 compared to using JM v17.2 directly.
Prediction mode decision [8] Sequence Decrease in PSNR (db) Increase in bitrate (%) Container (QCIF) -0.186 0.95-31.0 Hall (QCIF) -0.181 1.01-28.6 Container (CIF) -0.150 0.83-29.1 Hall (CIF) -0.203 1.60-32.0 Mobile (CIF) -0.174 0.64-25.4 Decrease in encoding time (%) Table III. Performance of Prediction mode decision [8] in JM reference software version 17.2 compared to using JM v17.2 directly.
Improved Prediction Mode Decision Sequence Decrease in PSNR (db) Increase in bitrate (%) Container (QCIF) -0.092 0.47-31.0 Hall (QCIF) -0.118 0.65-29.1 Container (CIF) -0.075 0.41-29.1 Hall (CIF) -0.091 0.70-32.4 Mobile (CIF) -0.136 0.50-25.9 Decrease in encoding time (%) Table IV. Performance of Improved prediction mode decision in JM reference software version 17.2 compared to using JM v17.2 directly.
Two-stage approach Sequence Decrease in PSNR (db) Increase in bitrate (%) Container (QCIF) -0.191 0.97-48.5 Hall (QCIF) -0.161 0.90-36.7 Container (CIF) -0.123 0.68-50.5 Hall (CIF) -0.145 1.12-52.6 Mobile (CIF) -0.154 0.57-35.5 Decrease in encoding time (%) Table V. Performance of two-stage approach incorporating variance based block size decision and improved prediction mode decision in JM reference software version 17.2 compared to using JM v17.2 directly.
PSNR (db) Bitrate vs PSNR for Hall (QCIF) 49 47 45 43 JM 17.2 Variance based block size decision Improved Prediction mode or MPM 41 Two-Stage approach 39 37 500 700 900 1100 1300 1500 1700 1900 2100 Bitrate (kbits/s) Fig. 12. Performance plot for Hall (QCIF) : Bit rate vs PSNR.
Conclusions The variance-based block size decision achieves on the average 19% (maximum 25.4%) time saving while maintaining almost the same R-D performance as that of the full-search scheme employed in the JM reference software. For Mobile sequence that has more texture regions, less time is saved due to the computationally expensive operations for examining I4MB. As the texture of an MB becomes homogeneous as the frame size increases, more time is saved for high resolution sequences. Note that the bit-rate increase for all sequences in this test is below 0.5%, which is negligible. Up to 52.0% complexity reduction over the JM reference software is achieved, and both the drop in PSNR and the increase in bitrate are very small. The proposed algorithms provide significant improvement of computational efficiency over previous methods without noticeable R-D performance loss.
References [1] Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification, document JVT-G050.doc, ITU-T Rec. H.264 and ISO/IEC 14496-10 AVC, 2003. [2] T. Wiegand et al, Overview of H.264 video coding standard, IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560 576, Jul. 2003. [3] G. J. Sullivan and T. Wiegand, Rate-distortion optimization for video compression, IEEE Signal Process. Mag., vol. 15, no. 6, pp. 74 90, Nov. 1998. [4] Y.-W. Huang, T. Ou, and H. Chen, Fast decision of block size, prediction mode and intra block for H.264 intra prediction, IEEE Trans. Circuits Syst. Video Technol., vol. 20, no.8, pp. 1122-1132, Aug. 2010. [5] Y.-W. Huang et al, Analysis, fast algorithm, and VLSI architecture design for H.264/AVC intra frame coder, IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 3, pp. 378 401, Mar. 2005. [6] F. Pan et al, Fast mode decision algorithm for intra prediction in H.264/AVC video coding, IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 813 822, Jul. 2005.
[7] C. Tseng, H. Wang, and J. Yang, Enhanced intra-4x4 mode decision for H.264/AVC coder, IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 8, pp. 1027 1032, Aug. 2006. [8] J.-C. Wang, J.-F. Wang, and J.-T. Chen, A fast mode decision algorithm and its VLSI design for H.264/AVC intra-prediction, IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 10, pp. 1414 1422, Oct. 2007. [9] A.-C. Tsai et al, Intensity gradient technique for efficient intra-prediction in H.264/AVC, IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 5, pp. 694 698, May 2008. [10] H. Li, K. Ngan, and Z. Wei, Fast and efficient method for block edge classification and its application in H.264/AVC video coding, IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 6, pp. 756 768, Jun. 2008. [11] A.-C. Tsai et al, Effective subblock- based and pixel-based fast direction detections for H.264 intra prediction, IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 7, pp. 975 982, Jul. 2008. [12] K. Bharanitharan et al, A low complexity detection of discrete cross differences for fast H.264/AVC intra prediction, IEEE Trans. Multimedia, vol. 10, no. 7, pp. 1250 1260, Nov. 2008.
[13] I. Choi, J. Lee, and B. Jeon, Fast coding mode selection with rate-distortion optimization for MPEG-4. Part 10: AVC/H.264, IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 12, pp. 1557 1561, Dec. 2006. [14] C. Kim and C. Kuo, Feature-based intra/inter coding mode selection for H.264/AVC, IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 4, pp. 441 453, Apr. 2007. [15] B. Kim, Fast selective intra-mode search algorithm based on adaptive thresholding scheme for H.264/AVC encoding, IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 1, pp. 127 133, Jan. 2008. [16] A. Yu, G. Martin, and H. Park, Fast inter-mode selection in the H.264/AVC standard using a hierarchical decision process, IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 2, pp. 186 195, Feb. 2008. [17] JM software http://iphome.hhi.de/suehring/tml/ This software is a product of Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG. The latest version of JM Software is 18.0 [18] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley publications, 2010.