EE 5359
Low Complexity H.264 encoder for mobile applications
Thejaswini Purushotham
Student I.D.: 1000-616 811
Date: February 18, 2010
Fig. 1: Basic coding structure of H.264/AVC for a macroblock [1]
The high computational complexity of H.264 and the real-time requirements of video systems are the main challenges to overcome in developing efficient encoder solutions.
Fig. 2: Various block sizes in H.264 for motion estimation/compensation [1]
However, among the H.264 compression modules, the most computationally expensive process is motion estimation (ME). For example, assuming full search (FS) with M block types, N reference frames, and a search range of +/- W for each reference frame and block type, the encoder must examine N x M x (2W + 1)^2 positions, compared to only (2W + 1)^2 positions for a single reference frame and block type. After the initial FS, a sub-pixel search may also be performed.
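To make the cost of FS concrete, the following is a minimal sketch, not the JM implementation. It counts the positions given by the formula above and runs an integer-pixel full search using the sum of absolute differences (SAD) as the matching cost. The function names, the use of SAD, and the example values N = 5 and W = 16 are assumptions chosen for illustration; M = 7 corresponds to the block sizes shown in Fig. 2.

```python
def search_positions(num_ref_frames, num_block_types, search_range):
    """Candidate positions examined by FS: N x M x (2W + 1)^2."""
    per_search = (2 * search_range + 1) ** 2
    return num_ref_frames * num_block_types * per_search


def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and its displaced candidate."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Test every integer displacement within +/- search_range.

    Returns (best_cost, dx, dy), or None if no candidate fits in the frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Assumed example values: W = 16, N = 5 reference frames, M = 7 block types.
single = search_positions(1, 1, 16)
combined = search_positions(5, 7, 16)
```

With these assumed values the combined search examines 35 times as many positions as a single reference frame and block type, before any sub-pixel refinement.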
In the end, the combination of all mode evaluations, cost comparisons, and the exhaustive search inside ME accounts for a large share of the encoder's running time. In other words, complex and exhaustive ME evaluation is the key to the good performance achieved by H.264, but it comes at the cost of encoding time.
Evaluating all Intra / Inter modes to select the best coding mode among the possible combinations yields the smallest distortion at the given bit rate, rather than minimizing the bit rate or the distortion alone. However, this better rate-distortion performance comes with increased complexity, making H.264/AVC difficult to apply directly to low-complexity devices, especially in wireless network environments.
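One common way to express this kind of trade-off is a Lagrangian cost J = D + lambda x R, where D is the distortion, R the rate in bits, and lambda a weighting factor. The sketch below assumes that form. The mode names, the distortion and rate numbers, and the lambda value are hypothetical and serve only to show how an exhaustive mode decision compares candidates.

```python
def select_mode(candidates, lagrange_multiplier):
    """Return (mode, cost) minimizing J = D + lambda * R.

    candidates maps a mode name to a (distortion, rate_bits) pair.
    """
    best_mode, best_cost = None, float("inf")
    for mode, (distortion, rate_bits) in candidates.items():
        cost = distortion + lagrange_multiplier * rate_bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Hypothetical measurements for one macroblock (illustrative numbers only).
candidates = {
    "Intra16x16": (1200, 40),
    "Inter16x16": (900, 60),
    "Inter8x8": (700, 140),
}
chosen_mode, chosen_cost = select_mode(candidates, lagrange_multiplier=5)
```

Each candidate requires a full encode of the macroblock in that mode to obtain D and R, which is why evaluating every mode is so expensive.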
Machine learning
What to do? Implement the decision tree as a set of if-else statements. This should take less encoding time than the FS (full search) method.
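A minimal sketch of what a tree written as if-else statements could look like follows. The features (macroblock mean and variance), the thresholds, and the partition labels are all hypothetical. In practice they would come from a tree trained on encoder data, which this proposal does not specify.

```python
def mb_features(block):
    """Compute mean and variance of the pixel values in a macroblock."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean, variance


def predict_partition(block):
    """Hypothetical decision tree expressed as nested if-else statements.

    The thresholds are illustrative placeholders, not trained values.
    """
    mean, variance = mb_features(block)
    if variance < 25.0:
        # Flat content: assume one large partition is sufficient.
        return "16x16"
    else:
        if variance < 400.0:
            return "16x8"
        else:
            # Highly textured content: assume smaller partitions.
            return "8x8"


flat_block = [[100] * 16 for _ in range(16)]
checkerboard = [[255 * ((x + y) % 2) for x in range(16)] for y in range(16)]
flat_choice = predict_partition(flat_block)
textured_choice = predict_partition(checkerboard)
```

The prediction costs a handful of comparisons per macroblock, so the encoder can restrict ME to the predicted partition instead of searching every block type.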
References
[1] T. Wiegand et al., "Overview of the H.264/AVC video coding standard," IEEE Trans. CSVT, vol. 13, pp. 560-576, July 2003.
[2] S. K. Kwon, A. Tamhankar and K. R. Rao, "An overview of H.264/MPEG-4 Part 10," Special issue of Journal of Visual Communication and Image Representation, vol. 17, pp. 186-216, April 2006.
[3] G. Sullivan, P. Topiwala and A. Luthra, "The H.264/AVC video coding standard: overview and introduction to the fidelity range extensions," SPIE Conference on Applications of Digital Image Processing XXVII, vol. 5558, pp. 53-74, Aug. 2004.
[4] T. Wiegand et al., "Introduction to the Special Issue on Scalable Video Coding Standardization and Beyond," IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, pp. 1034, Sept. 2007.
[5] H. Kalva and L. Christodoulou, "Using machine learning for fast intra MB coding in H.264," Proc. of VCIP 2007, Jan. 2007.
REFERENCE BOOKS:
K. Sayood, Introduction to Data Compression, 3rd edition, Morgan Kaufmann publishers, 2006.
I. E. G. Richardson, H.264 and MPEG-4 video compression: video coding for next-generation multimedia, Wiley, 2003.
K. R. Rao and P. C. Yip, The transform and data compression handbook, Boca Raton, FL: CRC Press, 2001.
K. R. Rao and J. J. Hwang, Techniques and standards for image, video, and audio coding, Prentice Hall, 1996.
I. Richardson, The H.264 advanced video compression standard, Hoboken, NJ: May 2010.
REFERENCE WEBSITES:
JM software: http://iphome.hhi.de/suehring/tml/n
Introduction to Machine learning: http://ai.stanford.edu/~nilsson/mldraftbook/MLBOOK.pdf
ACRONYMS:
ASO: Arbitrary Slice Ordering
AVC: Advanced Video Coding
B MB: Bi-predicted MB
DCT: Discrete Cosine Transform
DSP: Digital Signal Processing
DVD: Digital Versatile Disc
FMO: Flexible Macroblock Ordering
FRExt: Fidelity Range Extensions
FS: Full Search
GOP: Group Of Pictures
I MB: Intra Predicted MB
IEC: International Electrotechnical Commission
ISO: International Organization for Standardization
ITU-T: International Telecommunication Union, Telecommunication Standardization Sector
JVT: Joint Video Team
P MB: Inter Predicted MB
IDCT: Inverse Discrete Cosine Transform
IQ: Inverse Quantizer
MB: Macroblock
MBAFF: Macroblock level Adaptive Frame/Field
PicAFF: Picture level Adaptive Frame/Field
ME: Motion Estimation
MC: Motion Compensation
MV: Motion Vector
MPEG: Moving Picture Experts Group
MSE: Mean Square Error
PSNR: Peak Signal to Noise Ratio
Q: Quantizer
R-D: Rate Distortion
RS: Redundant Slice
SP/SI: Switched P / Switched I
SMPTE: Society of Motion Picture and Television Engineers
SSIM: Structural Similarity Index Measure
SVC: Scalable Video Coding
VCEG: Video Coding Experts Group
VLC: Variable Length Coding
VLD: Variable Length Decoder
YUV: Y (Luminance) and UV (Chrominance)