Performance analysis of AAC audio codec and comparison of Dirac video codec with AVS-China
Under the guidance of Dr. K. R. Rao
Submitted by Ashwini S Urs
Outline
Overview of Dirac
Overview of AVS-China
Overview of AAC
Performance of Dirac codec
Performance of AAC codec
References
Overview of Dirac
Dirac [1] is a hybrid video codec developed by the British Broadcasting Corporation (BBC). It is called hybrid because it combines transform coding with motion compensation. Its key feature is that it is an open technology: it can be used without payment of licensing fees. Dirac uses modern techniques such as the wavelet transform and arithmetic coding for entropy coding. Owing to this flexibility, its applications range from high definition television (HDTV) to web streaming.
Fig.1 Block Diagram of Dirac Encoder [1]
Overview of AVS-China
The Audio Video coding Standard (AVS) working group was established in China in 2002 to develop audio and video coding standards. To serve the wide range of video applications, AVS-China video defines four profiles [16]: Jizhun (base) profile, Jiben (basic) profile, Shenzhan (extended) profile and Jiaqiang (enhanced) profile, each targeting different applications (Table.1) [16]. AVS-China is also divided into parts that cover different categories. AVS Part 2 defines the Jizhun profile, which is preferable for high coding efficiency on higher-resolution video sequences, at the expense of moderate computational complexity.
Profile             Key applications
Jizhun profile      Television broadcasting, HDTV, etc.
Jiben profile       Mobility applications, etc.
Shenzhan profile    Video surveillance, etc.
Jiaqiang profile    Multimedia entertainment, etc.
Table.1 Application based profiles of AVS [16]
Part    Category
1       System
2       Video
3       Audio
4       Conformance test
5       Reference software
6       Digital media rights management
7       Mobile video
8       Transmit AVS via IP network
9       AVS file format
10      Mobile speech and audio coding
Table.2 Different parts of AVS-China
Fig.3 Typical AVS-China video coding Chain [16]
Fig.4 Block diagram of AVS-china encoder [15]
Overview of AAC
AAC [2] consists of three profiles: main profile, low-complexity profile and scalable sampling rate (SSR) profile. The low-complexity profile omits the prediction tool and reduces the complexity of the temporal noise shaping (TNS) tool. AAC supports sampling frequencies from 8 kHz to 96 kHz (official MP3: 16 kHz to 48 kHz) [4]. It offers superior performance both at bit rates above 64 kbps and at bit rates as low as 16 kbps. Its improved compression provides higher-quality audio at lower bit rates. AAC can accommodate two basic bit stream formats: audio data interchange format (ADIF) and audio data transport stream (ADTS) [10].
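As an illustration of the ADTS format, the sketch below decodes the main fields of a 7-byte ADTS frame header. It assumes the commonly documented ADTS bit layout (12-bit syncword, 2-bit profile, 4-bit sampling frequency index, 3-bit channel configuration, 13-bit frame length). The function name parse_adts_header and the example header bytes are hypothetical and are not taken from the test audio used in this work.

```python
# Sampling-frequency index table carried in ADTS headers (index -> Hz).
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000]

# MPEG-2 AAC profile codes for the three profiles listed above.
PROFILES = {0: "Main", 1: "Low complexity", 2: "SSR"}


def parse_adts_header(header: bytes) -> dict:
    """Decode the main fields of a 7-byte ADTS header (illustrative helper)."""
    if len(header) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    # Every ADTS frame starts with a 12-bit syncword of all ones (0xFFF).
    if header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS syncword 0xFFF not found")

    profile = (header[2] >> 6) & 0x03
    sf_index = (header[2] >> 2) & 0x0F
    if sf_index >= len(SAMPLE_RATES):
        raise ValueError("reserved sampling frequency index")
    # Channel configuration spans the last bit of byte 2 and top two bits of byte 3.
    channels = ((header[2] & 0x01) << 2) | (header[3] >> 6)
    # Frame length in bytes (header included) spans bytes 3, 4 and 5.
    frame_length = ((header[3] & 0x03) << 11) | (header[4] << 3) | (header[5] >> 5)

    return {
        "profile": PROFILES.get(profile, "reserved"),
        "sample_rate_hz": SAMPLE_RATES[sf_index],
        "channels": channels,
        "frame_length": frame_length,
    }


# Made-up example header: low-complexity profile, 44.1 kHz, stereo.
example = bytes([0xFF, 0xF1, 0x50, 0x80, 0x2E, 0x7F, 0xFC])
info = parse_adts_header(example)
print(info)
```

Because each ADTS frame carries its own header, a decoder can resynchronize at any frame boundary, which is why ADTS suits streaming while ADIF, with a single header at the start of the file, suits file interchange.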
Fig.5 Block Diagram AAC Encoder [7]
Performance analysis of Dirac video codec
QCIF sequence: Akiyo
Total number of frames: 300. Width: 176. Height: 144. Frame rate: 30 fps.

QF        Original file size (KB)   Compressed file size (KB)   Compression ratio   Y-MSE      Y-PSNR (dB)   Y-SSIM
0         5569                      26                          215:1               115.4475   27.5070       0.7958
1         5569                      27                          206:1               85.4363    28.8144       0.8390
2         5569                      29                          192:1               68.7197    29.7599       0.8691
3         5569                      31                          180:1               52.1033    30.9622       0.8940
4         5569                      35                          159:1               33.2061    32.9186       0.9235
5         5569                      42                          133:1               21.3228    34.8424       0.9471
6         5569                      52                          107:1               13.7721    36.7408       0.9655
7         5569                      66                          84:1                9.7199     38.2542       0.9750
8         5569                      86                          65:1                7.1746     39.5728       0.9831
9         5569                      119                         47:1                6.2528     40.1700       0.9869
10        5569                      182                         31:1                5.7789     40.5123       0.9882
Lossless  5569                      1276                        4:1                 4.5678     41.5337       0.9950

Table.3 Performance of Dirac for Akiyo test sequence (150 frames)
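The Y-PSNR column can be cross-checked against the Y-MSE column with the standard PSNR definition. The sketch below assumes 8-bit luma samples (peak value 255). The function name psnr_from_mse is illustrative and is not part of the Dirac software or the measurement tool.

```python
import math


def psnr_from_mse(mse: float, peak: float = 255.0) -> float:
    """Return PSNR in dB for a given mean square error (8-bit peak assumed)."""
    if mse <= 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Y-MSE values taken from Table.3 (QF 0, 5 and 10).
for qf, mse in [(0, 115.4475), (5, 21.3228), (10, 5.7789)]:
    print(f"QF={qf}: Y-PSNR = {psnr_from_mse(mse):.4f} dB")
```

Under the 8-bit assumption, the computed values should agree with the tabulated Y-PSNR entries up to rounding.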
CIF sequence: Tempete
Total number of frames: 260. Width: 352. Height: 288. Frame rate: 30 fps.

QF        Original file size (KB)   Compressed file size (KB)   Compression ratio   Y-MSE      Y-PSNR (dB)   Y-SSIM
0         13365                     66                          203:1               445.7747   21.6397       0.5845
1         13365                     77                          174:1               359.2327   22.5771       0.6636
2         13365                     93                          144:1               274.7563   23.7413       0.7433
3         13365                     118                         113:1               223.5672   24.6367       0.7974
4         13365                     158                         85:1                191.7917   25.3025       0.8389
5         13365                     224                         60:1                172.5263   25.7623       0.8671
6         13365                     333                         40:1                164.1558   25.9782       0.8836
7         13365                     505                         26:1                162.0681   26.0338       0.8917
8         13365                     794                         17:1                161.2051   26.0570       0.8959
9         13365                     1181                        11:1                160.8626   26.0663       0.8979
10        13365                     1767                        8:1                 160.7939   26.0681       0.8990
Lossless  13365                     7859                        2:1                 160.8231   26.0673       0.9002

Table.4 Performance of Dirac for Tempete test sequence (90 frames)
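The original file sizes and compression ratios in Table.3 and Table.4 can be reproduced with a short calculation. The sketch below assumes raw 8-bit YUV 4:2:0 input (1.5 bytes per pixel) and 1 KB = 1024 bytes. Neither assumption is stated in the tables, and the function names are illustrative.

```python
def raw_yuv420_kb(width: int, height: int, frames: int) -> float:
    """Size in KB of raw 8-bit YUV 4:2:0 video (assumed input format)."""
    # One luma byte per pixel plus two chroma planes at quarter resolution.
    bytes_per_frame = width * height * 3 // 2
    return bytes_per_frame * frames / 1024


def compression_ratio(original_kb: float, compressed_kb: float) -> float:
    """Ratio of original size to compressed size."""
    return original_kb / compressed_kb


# Frame counts taken from the table captions (150 for Akiyo, 90 for Tempete).
akiyo_kb = raw_yuv420_kb(176, 144, 150)
tempete_kb = raw_yuv420_kb(352, 288, 90)
print(f"Akiyo:   {akiyo_kb:.0f} KB")
print(f"Tempete: {tempete_kb:.0f} KB")

# Compression ratio for Tempete at QF=10 (compressed size from Table.4).
print(f"Tempete QF=10: {compression_ratio(13365, 1767):.1f}:1")
```

Under these assumptions, the tabulated original sizes (5569 KB and 13365 KB) correspond to the frame counts given in the table captions.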
Figure panels: original sequence; reconstructed image at QF=0; reconstructed image at QF=10; reconstructed image in lossless mode.
Figure panels: original sequence; reconstructed sequence at QF=0; reconstructed sequence at QF=10; reconstructed sequence in lossless mode.
Performance analysis of the AAC codec
Results:
Length of audio sequence = 2.13 minutes.
Bitrate of the encoded sequence = (2.01 * 8) / (2.13 * 60) = 0.126 Mbps
Bitrate of the decoded sequence = (24.4 * 8) / (2.13 * 60) = 1.527 Mbps

File format   No. of frames in sequence   Encoding time (s)   Decoding time (s)   Original size (MB)   Compressed size (MB)   Compression ratio
ADTS          6257                        7.1                 1.16                24.4                 2.01                   12:1
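The bitrate figures follow from file size and duration. The sketch below mirrors that arithmetic, taking the size in megabytes and the duration in minutes as reported above. The function name bitrate_mbps is illustrative.

```python
def bitrate_mbps(size_mb: float, duration_min: float) -> float:
    """Average bitrate in Mbps from file size (MB) and duration (minutes)."""
    bits_in_megabits = size_mb * 8      # megabytes -> megabits
    seconds = duration_min * 60         # minutes -> seconds
    return bits_in_megabits / seconds


encoded = bitrate_mbps(2.01, 2.13)   # compressed ADTS file
decoded = bitrate_mbps(24.4, 2.13)   # decoded (original-size) audio
print(f"Encoded: {encoded:.3f} Mbps")
print(f"Decoded: {decoded:.3f} Mbps")
print(f"Compression ratio: {24.4 / 2.01:.1f}:1")
```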
Acronyms
AAC      Advanced audio coding.
ADIF     Audio data interchange format.
ADTS     Audio data transport stream.
AES      Audio engineering society.
AFC      Adaptation field control.
AVC      Advanced video coding.
AVS      Audio video coding standard.
BBC      British broadcasting corporation.
CIF      Common intermediate format.
HDTV     High definition television.
ISDB-T   Integrated services digital broadcasting terrestrial.
MPEG     Moving picture experts group.
MSE      Mean square error.
M/S      Mid/Side.
PSNR     Peak signal to noise ratio.
QCIF     Quarter common intermediate format.
SSIM     Structural similarity index measurement.
SSR      Scalable sampling rate.
TNS      Temporal noise shaping.
References
[1] T. Borer and T. Davies, "Dirac video compression using open technology," BBC EBU Technical Review, July 2005.
[2] MPEG-2 Advanced audio coding, AAC. International Standard IS 13818-7, ISO/IEC JTC1/SC29 WG11, 1997.
[3] MPEG. Information technology - Generic coding of moving pictures and associated audio information, part 4: Conformance testing. International Standard IS 13818-4, ISO/IEC JTC1/SC29 WG11, 1998.
[4] M. Bosi and M. Goldberg, "Introduction to digital audio coding and standards," Boston: Kluwer Academic Publishers, 2003.
[5] A. Puri, X. Chen and A. Luthra, "Video coding using the H.264/MPEG-4 AVC compression standard," Signal Processing: Image Communication, vol. 19, issue 9, pp. 793-849, Oct. 2004.
[6] K. Brandenburg, "MP3 and AAC explained," AES 17th International Conference, Florence, Italy, Sep. 1999.
[7] P. A. Sarginson, "MPEG-2: Overview of systems layer," BBC RD 1996/2.
[8] Dirac software download and source code: http://diracvideo.org/download/dirac-research/
[9] AVS-China software download: ftp://159.226.42.57/public/avs_doc/avs_software
[10] H. Murugan, "Multiplexing H264 video bit-stream with AAC audio bit-stream, demultiplexing and achieving lip sync during playback," M.S.E.E. Thesis, University of Texas at Arlington, TX, May 2007.
[11] AVS-China official website: http://www.avs.org.cn
[12] M. Uehara, "Application of MPEG-2 systems to terrestrial ISDB (ISDB-T)," Proceedings of the IEEE, vol. 94, pp. 261-268, Jan. 2006.
[13] MSU Video Quality measurement tool: http://compression.ru/video/quality_measure/vqmt_download_en.html#start
[14] A. Ravi and K. R. Rao, "Performance analysis and comparison of the Dirac video codec with H.264/MPEG-4 Part 10 AVC," submitted to Journal of VCIR, Sept. 2009.
[15] L. Fan, "Mobile multimedia broadcasting standards," ISBN: 978-0-387-78263-8, Springer US, 2009.
[16] L. Yu, S. Chen and J. Wang, "Overview of AVS-video coding standards," special issue on AVS, SPIC, vol. 24, pp. 247-262, April 2009.
[17] Dirac video codec - A programmer's guide: http://dirac.sourceforge.net/documentation/code/programmers_guide/toc.htm
[18] Digital audio compression standard (AC-3, E-AC-3), revision B, ATSC Document A/52B, Advanced Television Systems Committee, Washington, D.C., Jun. 14, 2005.
[19] Video test sequences, QCIF and CIF sequences: http://trace.eas.asu.edu/yuv/index.html
[20] Z. Wang, et al., "Image quality assessment: From error visibility to structural similarity," IEEE Trans. on Image Processing, vol. 13, pp. 600-612, Apr. 2004. http://www.ece.uwaterloo.ca/~z70wang/
[21] L. Yu, et al., "Overview of AVS-Video: Tools, performance and complexity," SPIE VCIP, vol. 5960, pp. 596021-1 to 596021-12, Beijing, China, July 2005.
[22] C. C. Todd, et al., "AC-3: perceptual coding for audio transmission and storage," presented at the 96th Conv. Audio Engineering Soc., 1994, Preprint 3796.
[23] PowerPoint slides by L. Yu, chair of AVS video: http://www-ee.uta.edu/dip/courses/ee5351/ispacsavs.pdf