Advances of MPEG Scalable Video Coding Standard

Wen-Hsiao Peng, Chia-Yang Tsai, Tihao Chiang, and Hsueh-Ming Hang

National Chiao-Tung University, 1001 Ta-Hsueh Rd., HsinChu 30010, Taiwan
pawn@mail.si2lab.org, cytsai.ee90g@nctu.edu.tw, {tchiang,hmhang}@mail.nctu.edu.tw

Abstract. To support clients with diverse capabilities, the MPEG committee is defining a novel scalable video coding (SVC) framework that can simultaneously support multiple spatial, temporal and SNR resolutions under the constraints of low complexity and low delay. To fulfill these requirements, two major approaches have been considered as potential technologies. One is the wavelet-based scheme and the other is the scalable extension of MPEG-4 AVC/H.264. This paper gives a brief overview of the latest advances in these technologies.

1 Introduction

Scalable video coding has attracted wide attention with the rapid growth of multimedia applications over the Internet and wireless channels. In such applications, the video may be transmitted in a heterogeneous environment. To support clients with diverse capabilities in complexity, bandwidth, power and display resolution, the MPEG committee is defining a scalable video coding (SVC) framework that can simultaneously support multiple spatial, temporal and SNR resolutions under the constraints of low complexity and low delay.

To fulfill the requirements of SVC, more than 20 proposals were submitted in response to the call for proposals [1] in February 2004. According to the spatial transform, these proposals can be roughly classified into two categories: the wavelet-based scheme and the AVC/H.264-based approach [2]. In addition, depending on the transform order in the spatio-temporal domain, the wavelet-based scheme has two variations, the 2D+t and t+2D structures [2]. To distinguish the differences, Fig. 1 shows a comparison at the architectural level.
As shown, to achieve temporal scalability, both the wavelet-based scheme and the AVC-based approach adopt motion compensated temporal filtering (MCTF). To achieve SNR scalability with fine granularity, the AVC-based scheme uses context-adaptive bit-plane coding [3][4], whereas the wavelet-based scheme employs zero-tree coding [5][6][7] for the same purpose. As for spatial scalability, the wavelet-based scheme takes advantage of the multi-resolution property of the wavelet transform, while the AVC-based scheme exploits the layered coding concept used in MPEG-2, H.263, and MPEG-4.

R. Khosla et al. (Eds.): KES 2005, LNAI 3684, pp. 889-895, 2005. © Springer-Verlag Berlin Heidelberg 2005

Fig. 1. An architectural level comparison for various SVC algorithms [2]: (a) an AVC/H.264-based approach (also DCT-based); (b) a wavelet-based approach with t+2D structure; (c) a wavelet-based approach with 2D+t structure.

Subjective tests have been conducted to compare these technologies [8]. Because it showed better subjective quality in different scenarios, the AVC-based scheme was adopted as the working draft of SVC. The MPEG committee also established an ad-hoc group to further study the wavelet-based technologies for future video coding applications.

In this paper, we give a brief overview of the latest advances in these technologies. The rest of this paper is organized as follows: Section 2 elaborates on each dimension of scalability in the AVC-based approach and Section 3 describes the corresponding algorithms in the wavelet-based scheme. Lastly, Section 4 summarizes the latest activities in SVC.

2 Scalable Extension of AVC/H.264

To simultaneously support spatial, temporal and SNR scalability, a scalable extension of AVC/H.264 was proposed [3]. Fig. 1 (a) shows the encoder structure of the AVC-based scheme. To facilitate spatial scalability, the input video is decimated into various spatial resolutions and the sequence at each spatial resolution is coded in a separate layer using AVC/H.264. Within each spatial layer, motion compensated temporal filtering (MCTF) is applied to every group of pictures (GOP) to provide temporal scalability. In addition, to remove the redundancy among different spatial layers, a large degree of inter-layer prediction is incorporated. The residual frames after the inter-layer prediction are then transformed and successively quantized for SNR scalability. In the following subsections, we elaborate on each dimension of scalability.

2.1 Temporal Scalability

In each spatial layer, temporal scalability is achieved by the motion compensated temporal filtering (MCTF) technique, which performs the wavelet decomposition/reconstruction along the motion trajectory. In particular, the MCTF is mostly restricted to the short-length (5, 3) wavelet, which can be implemented by a lifting scheme with only one prediction/update step. In this special case, the prediction and update can be realized using bidirectional prediction as shown in Fig. 2, where {L_n} stands for the low-pass frames of level n and {H_n} denotes the associated high-pass frames.
Inside the MCTF, an odd-indexed frame is predicted from the adjacent even-indexed frames to produce a high-pass frame. Accordingly, an even-indexed frame is updated using a combination of the adjacent high-pass frames to generate a low-pass frame. To remove temporal redundancy, motion compensation is conducted before the prediction and update. By using n decomposition stages, up to n levels of temporal scalability can be provided. Specifically, video at a lower frame rate can be obtained from the low-pass frames at a higher level.

2.2 Spatial Scalability and Inter-layer Prediction

For spatial scalability, sequences of different spatial resolutions are coded in separate layers. To remove the redundancy among different spatial layers, the residues and motion vectors of an enhancement layer are predicted from those of the subordinate layer.

Fig. 2. MCTF structure for the (5, 3) wavelet

In the prediction process, the residues and motion vectors of the subordinate layer are first interpolated if the subordinate layer has a lower resolution. In addition, the partition of an inter MB can be derived from the relevant sub-blocks at the subordinate layer, and the motion vectors can be obtained by scaling and refining those of the corresponding sub-blocks. For an intra MB, on the other hand, inter-layer prediction is allowed only if the corresponding 8x8 block of the subordinate layer lies within an intra-coded MB.

2.3 SNR Scalability

For SNR scalability, the residues after the inter-layer prediction are transformed with the 4x4 integer transform. The transform coefficients are then successively quantized into multiple quality layers. The coefficients in a quality layer are coded by a hybrid approach of bit-plane and (Run, Level) coding.

For scalability with fine granularity, the bit-planes in a quality layer are coded using cyclical block coding [4]. The coding step is partitioned into a significance pass and a refinement pass. The significance pass first encodes the insignificant coefficients, that is, those that have zero values in the subordinate layers. Then, the refinement pass refines the remaining significant coefficients, with refinement values ranging from -1 to +1. During the significance pass, the transform blocks are coded in a cyclical and block-interleaved manner. The coding of the refinement pass, on the other hand, is conducted subband-by-subband. To further reduce the bit rate, a context-adaptive binary arithmetic coder is employed.
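The split into significance and refinement passes can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the working-draft algorithm: the quantizer truncates toward zero, each quality layer halves the step size of the previous one, and the cyclical block ordering, (Run, Level) coding and arithmetic coding are all omitted. The function names `trunc_div`, `encode_quality_layers` and `decode_quality_layers` are hypothetical.

```python
def trunc_div(value, step):
    """Integer division that truncates toward zero (a simple quantizer)."""
    magnitude = abs(value) // step
    return magnitude if value >= 0 else -magnitude


def encode_quality_layers(coeffs, base_step, num_layers):
    """Quantize integer coefficients into a base layer plus quality layers.

    Assumption: every quality layer halves the previous step size, so
    base_step should be divisible by 2 ** num_layers.
    """
    base = [trunc_div(c, base_step) * base_step for c in coeffs]
    recon = list(base)
    layers = []
    step = base_step
    for _ in range(num_layers):
        step //= 2
        significance, refinement = {}, {}
        for i, c in enumerate(coeffs):
            # With truncation and halved steps the symbol is -1, 0 or +1.
            symbol = trunc_div(c - recon[i], step)
            if recon[i] == 0:
                significance[i] = symbol   # still zero in the lower layers
            else:
                refinement[i] = symbol     # already significant
        for i, symbol in list(significance.items()) + list(refinement.items()):
            recon[i] += symbol * step
        layers.append((step, significance, refinement))
    return base, layers


def decode_quality_layers(base, layers, num_used):
    """Reconstruct from the base layer and the first num_used quality layers."""
    recon = list(base)
    for step, significance, refinement in layers[:num_used]:
        for i, symbol in list(significance.items()) + list(refinement.items()):
            recon[i] += symbol * step
    return recon


if __name__ == "__main__":
    coeffs = [37, -5, 0, 12, -21]
    base, layers = encode_quality_layers(coeffs, base_step=16, num_layers=3)
    for used in range(len(layers) + 1):
        print(used, decode_quality_layers(base, layers, used))
```

Truncating the list of quality layers at any point still yields a valid reconstruction, and each extra layer halves the bound on the reconstruction error, which is the behaviour that SNR scalability relies on.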

3 Scalable Approach Using Inter-frame Wavelet

In contrast to the aforementioned AVC-based scheme, which exploits the hybrid coding structure, a wavelet-based scheme using the t+2D structure was proposed in [5][9]. Like the AVC-based approach, the wavelet-based scheme can produce a fully embedded bit-stream that simultaneously supports spatial, temporal, and SNR scalability. However, a major difference between the two is that all the predictions in the wavelet-based scheme are conducted in an open-loop manner. Open-loop prediction provides more flexibility in bit-stream extraction and is more robust to transmission errors.

Fig. 1 (b) shows the framework of the wavelet-based scheme using the t+2D structure. Within each GOP, MCTF is used for temporal decomposition. In particular, the temporal low-pass frames come from the motion compensation of the original images. The low-pass frames are used for temporal scalability. For spatial scalability, a 2-D wavelet decomposition is conducted on each frame after the MCTF, and spatial scalability follows from the multi-resolution property. Furthermore, after the spatial decomposition, the wavelet coefficients are coded by a zero-tree entropy coder (or another arithmetic coder) to generate an embedded bit-stream for SNR scalability. By means of proper extraction, a single bit-stream with scalable spatial, temporal and SNR parameters can be produced. Similar wavelet video coding schemes with specific features were later suggested by several other researchers [10][11]. In the following subsections, we give a brief overview of each dimension of scalability in the wavelet-based scheme.

3.1 Temporal Scalability

To provide temporal scalability, the wavelet-based scheme also adopts the MCTF as described in Section 2.1. In fact, the MCTF concept was first proposed for wavelet coding [9]. Typically, the Haar or (5, 3) wavelet is used.
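As a rough illustration of the prediction/update structure described in Sections 2.1 and 3.1, the following Python sketch applies one level of (5, 3) temporal lifting to a GOP and inverts it. It makes several assumptions that are not in the text: motion compensation is omitted entirely (pixels are filtered at co-located positions), frames are flat lists of pixel values, the GOP boundary is handled by mirroring the nearest available frame, and the lifting weights are the usual 1/2 (prediction) and 1/4 (update). The function names are hypothetical.

```python
def mctf53_forward(frames):
    """One temporal decomposition level; returns (low_pass, high_pass) frames."""
    if len(frames) < 2 or len(frames) % 2:
        raise ValueError("expected an even number of frames (at least 2)")
    even, odd = frames[0::2], frames[1::2]
    n = len(odd)

    # Prediction step: each odd frame minus the average of its even neighbours.
    high = []
    for k in range(n):
        right = even[k + 1] if k + 1 < n else even[k]  # mirror at the GOP end
        high.append([o - 0.5 * (a + b)
                     for o, a, b in zip(odd[k], even[k], right)])

    # Update step: each even frame plus a quarter of the adjacent high-pass frames.
    low = []
    for k in range(n):
        left_h = high[k - 1] if k > 0 else high[k]     # mirror at the GOP start
        low.append([e + 0.25 * (p + q)
                    for e, p, q in zip(even[k], left_h, high[k])])
    return low, high


def mctf53_inverse(low, high):
    """Undo mctf53_forward by reversing the update and prediction steps."""
    n = len(low)
    even = []
    for k in range(n):
        left_h = high[k - 1] if k > 0 else high[k]
        even.append([l - 0.25 * (p + q)
                     for l, p, q in zip(low[k], left_h, high[k])])
    frames = []
    for k in range(n):
        right = even[k + 1] if k + 1 < n else even[k]
        odd = [h + 0.5 * (a + b) for h, a, b in zip(high[k], even[k], right)]
        frames.extend([even[k], odd])
    return frames


if __name__ == "__main__":
    gop = [[float(10 * t + p) for p in range(3)] for t in range(8)]
    low1, high1 = mctf53_forward(gop)    # low1: half frame rate
    low2, high2 = mctf53_forward(low1)   # low2: quarter frame rate
    print(len(gop), len(low1), len(low2))
```

Applying the forward step again to the low-pass frames gives the next decomposition level, so a decoder that keeps only the low-pass frames of level n obtains the sequence at 1/2^n of the original frame rate.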
To improve the accuracy of the motion field and thereby achieve better coding performance, several techniques for better motion estimation have been proposed [12][13]. In particular, [13] presents a novel structure known as Barbell-Lifting. The basic idea is to use a barbell function to generate the prediction/update values for the lifting scheme. Specifically, for each pixel in the high-pass frame, the prediction value is obtained by using a set of pixels as the input to the barbell function. Prediction using the barbell function has been shown to outperform prediction from a single pixel. Moreover, it often reduces the motion mismatch between the prediction and update steps. However, due to the coupling of different spatial subbands, this scheme may result in inaccurate prediction/update when extracting video at a lower resolution. To solve this problem, an in-band MCTF can be used instead [10].

3.2 Spatial and SNR Scalability

To achieve spatial scalability, a separable 2-D wavelet transform is applied to both low-pass and high-pass frames. As in JPEG2000, the lower subbands are used to reconstruct the lower-resolution images.

To provide SNR scalability with fine granularity, embedded zerotree coding as in [6] is used after the spatial decomposition. In addition to [6], other methods such as embedded zero-block coding (EZBC) [5] and embedded block coding with optimized truncation (EBCOT) [7] are also commonly used for coding the wavelet transform coefficients. In particular, to reduce the temporal redundancy among successive frames, these coding techniques can be further extended to a 3-D structure, such as the 3-D EBCOT in [11].

3.3 Motion Scalability

The compressed bit-stream includes both texture and motion information. The bit-stream for the texture part can be arbitrarily truncated, but the one for the motion information cannot be easily partitioned. In low bit rate applications, the motion information can therefore consume a large percentage of the transmitted data. This leaves few bits for transmitting texture data and results in poor subjective quality.

To solve this problem, motion information should be represented in a scalable manner. In [14], a scalable representation of motion information is proposed for the MC-EZBC. Further, in [15], the motion information is partitioned into multiple motion layers. Each layer records the motion vectors with a specified accuracy. The lowest layer gives a rough representation of the motion vectors and the higher layers refine the accuracy. Different layers are coded independently so that the motion information can be truncated at a layer boundary.
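The idea of accuracy layers can be sketched in Python as follows. This is only an illustration, not the representation used in [15]: it assumes motion vectors stored as integers in quarter-pel units, a base layer at integer-pel accuracy, and one refinement bit per component for each finer accuracy level. The function names are hypothetical.

```python
def split_motion_layers(mvs, num_layers):
    """Split integer motion vectors into a coarse base layer and refinement bits.

    With num_layers = 3 and quarter-pel input, the base layer is integer-pel
    and the two refinement layers add half-pel and quarter-pel accuracy.
    """
    top = num_layers - 1
    # Arithmetic right shift floors toward minus infinity, also for negatives.
    base = [(x >> top, y >> top) for x, y in mvs]
    refinements = []
    for level in range(top - 1, -1, -1):
        refinements.append([((x >> level) & 1, (y >> level) & 1)
                            for x, y in mvs])
    return base, refinements


def merge_motion_layers(base, refinements, used):
    """Rebuild vectors from the base layer and the first `used` refinements."""
    mvs = list(base)
    for layer in refinements[:used]:
        mvs = [((x << 1) | bx, (y << 1) | by)
               for (x, y), (bx, by) in zip(mvs, layer)]
    missing = len(refinements) - used
    # Scale back to the original units; dropped layers read as zero bits.
    return [(x << missing, y << missing) for x, y in mvs]


if __name__ == "__main__":
    mvs = [(5, -3), (0, 7), (-8, 2)]              # quarter-pel units
    base, refinements = split_motion_layers(mvs, 3)
    for used in range(len(refinements) + 1):
        print(used, merge_motion_layers(base, refinements, used))
```

Dropping refinement layers at extraction time leaves a coarser but still decodable motion field, which frees bits for texture at low bit rates at the cost of the mismatch discussed next.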
Due to the mismatch between the truncated motion information and the residual data, schemes with scalable motion information may suffer from drifting error. As a result, a linear model is proposed in [12] to provide a better trade-off between the scalable representation and the rate-distortion performance.

4 Conclusion

In this paper, we have reviewed the fundamentals of SVC and its latest development in the MPEG standard. Both the AVC-based approach and the wavelet-based scheme are capable of offering a fully scalable bit-stream. Although the AVC-based approach has been selected as the working draft in MPEG, the wavelet-based scheme has potential for future video coding applications and standards. Research activities on the wavelet-based technology have been growing rapidly in the past few years, and the MPEG committee continues to keep an ad-hoc group working on this subject. At the moment, a number of core experiments are set up for the AVC-based SVC in MPEG. The target date for completing the MPEG SVC standard is 2006.

References

1. Ohm, J.R.: Registered Responses to the Call for Proposals on Scalable Video Coding. ISO/IEC JTC1/SC29/WG11, M10569 (2004)
2. Reichel, J., Hanke, K., Popescu, B.: Scalable Video Model V1.0. ISO/IEC JTC1/SC29/WG11, N6372 (2004)
3. Reichel, J., Wien, M., Schwarz, H.: Scalable Video Model 3. ISO/IEC JTC1/SC29/WG11, N6716 (2004)
4. Ridge, J., Bao, Y., Karczewicz, M., Wang, X.: Cyclical Block Coding for FGS. ISO/IEC JTC1/SC29/WG11, M11509 (2005)
5. Hsiang, S.T., Woods, J.W.: Embedded Image Coding Using Zeroblocks of Subband-Wavelet Coefficients and Context Modeling. In: IEEE ISCAS. (2000)
6. Shapiro, J.M.: Embedded Image Coding Using Zerotrees of Wavelet Coefficients. IEEE Transactions on Signal Processing 41 (1992) 657-660
7. Taubman, D.: High Performance Scalable Image Compression with EBCOT. IEEE Transactions on Image Processing 9 (2000) 1158-1170
8. MPEG: Subjective Test Results for the CfP on Scalable Video Coding Technology. ISO/IEC JTC1/SC29/WG11, N6383 (2004)
9. Ohm, J.R.: Three-dimensional Subband Coding with Motion Compensation. IEEE Trans. on Image Processing 3 (1994) 559-571
10. Andreopoulos, Y., Munteanu, A., Barbarien, J., van der Schaar, M., Cornelis, J., Schelkens, P.: In-Band Motion Compensated Temporal Filtering. Signal Processing: Image Communication 19 (2004) 653-673
11. Taubman, D., Mehrseresht, N., Leung, R.: SVC Technical Contribution: Overview of Recent Technology Developments at UNSW. ISO/IEC JTC1/SC29/WG11, M10868 (2004)
12. Secker, A., Taubman, D.: Highly Scalable Video Compression with Scalable Motion Coding. IEEE Trans. Circuits Syst. Video Technol. 13 (2004) 1029-1041
13. Xu, J., Xiong, R., Feng, B., Sullivan, G., Lee, M.C., Wu, F., Li, S.: 3D Sub-band Video Coding Using Barbell Lifting. ISO/IEC JTC1/SC29/WG11, M10569 (2004)
14. Hang, H.M., Tsai, S.S., Chiang, T.: Motion Information Scalability for MC-EZBC. ISO/IEC JTC1/SC29/WG11, M9756 (2003)
15. Tsai, C.Y., Hsu, H.K., Hang, H.M., Chiang, T.: Response to Cfp on Scalable Video Coding Technology: Proposal S08 A Scalable Video Coding Scheme Based on Interframe Wavelet Technique. ISO/IEC JTC1/SC29/WG11, M10569 (2004)