
Smooth Streaming over Wireless Networks
Sreya Chakraborty
Final Report, EE-5359, under the guidance of Dr. K. R. Rao
28th April 2011

LIST OF ACRONYMS AND ABBREVIATIONS
AVC: Advanced Video Coding
DVD: Digital Video Disc
GOP: Group of Pictures
DCT: Discrete Cosine Transform
VLC: Variable Length Coding
VLD: Variable Length Decoding
IEC: International Electrotechnical Commission
ISO: International Organization for Standardization
ITU: International Telecommunication Union
JVT: Joint Video Team
MPEG: Moving Picture Experts Group
SVC: Scalable Video Coding
JM: Joint Model
JSVM: Joint Scalable Video Model

Abstract: Smooth streaming is a difficult problem because bandwidth is a limited resource. This project examines the implications of video traffic smoothing on the number of statistically multiplexed H.264 SVC (Scalable Video Coding) [1], H.264/AVC (Advanced Video Coding) [1], and MPEG-4 Part 2 streams, on the bandwidth requirements for streaming, and on the introduced delay. SVC enables the transmission and decoding of partial bit streams to provide video services with lower temporal or spatial resolutions or reduced fidelity, while retaining a reconstruction quality that is high relative to the rate of the partial bit streams.

Introduction: Smooth streaming is a challenge wherever bandwidth is low or limited. For streaming video and audio data, UDP is in most cases found more useful than TCP, because TCP introduces various delays: it waits for the receipt of acknowledgements, which delays frame arrival. Some loss of data is acceptable, but the resulting delay is not.

Modern video transmission and storage systems are based on RTP/IP (Real-time Transport Protocol / Internet Protocol) [9] for real-time services. Most RTP/IP access networks are characterized by a wide range of connection qualities and receiving devices. The varying connection quality results from the adaptive resource sharing mechanisms of these networks. Traditional digital video transmission and storage systems are based on H.222.0 [7] for broadcasting services over satellite, cable, and terrestrial transmission channels and for DVD (Digital Video Disc) storage, and on H.320 for conversational video conferencing services. The international video coding standards H.262, H.263 and MPEG-4 Part 2 [17,18] already include several tools by which the most important scalability modes can be supported. However, in traditional video transmission systems the quality scalability features of these standards came with a significant loss in coding efficiency as well as a large increase in decoder complexity.

Simulcast provides functionality similar to that of a scalable bit stream. In a system that encodes video into separate quality layers, different subsets of the layers yield distinct picture qualities. This matters because network systems may differ in display resolution, caching or intermediate storage resources, bandwidth, loss rate, and best-effort or QoS capabilities. To achieve a desired quality level, the decoder selects a subset of layers to process. Without the layering scheme, all existing video content may need to be transcoded to obtain the required quality level. A scalable video scheme therefore enhances the whole system's extensibility and flexibility [10].

In general, a compression system is composed of the three key building blocks shown in Fig. 1. The first issue to consider is the quality-compression performance: the aim is to provide the best decoded video quality with the minimal number of bits.
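The layer selection mentioned in the introduction, where a decoder processes only a subset of the available layers, can be illustrated with a minimal sketch. It assumes that layers must be taken in order (base layer first) and that each layer has a known bit rate; the function name `select_layers` and the example rates are hypothetical and are not taken from any codec or from this report.

```python
def select_layers(layer_rates_kbps, available_kbps):
    """Greedy sketch: count how many layers, base layer first, fit the link.

    layer_rates_kbps lists the bit rate of each layer in dependency order.
    Returns (number_of_layers_used, total_rate_kbps).
    """
    total = 0.0
    chosen = 0
    for rate in layer_rates_kbps:
        if total + rate > available_kbps:
            break  # a higher layer is useless without the ones below it
        total += rate
        chosen += 1
    return chosen, total


# Hypothetical stream: a base layer plus two enhancement layers.
layers = [128.0, 256.0, 512.0]
print(select_layers(layers, 500.0))  # (2, 384.0)
```

With a 500 kbps link only the base layer and the first enhancement layer fit, so the client would decode at the intermediate quality without any transcoding on the server side.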

Fig. 1: Basic compression system [10]
Representation: concentrates important information into a few parameters
Quantization: discretizes the parameters
Binary encoding: exploits non-uniform statistics of the quantized parameters and creates the bitstream for transmission

Fig. 2: Typical coding system [10]

As seen in Fig. 2, at the encoder the raw video is transformed by the discrete cosine transform (DCT), quantized, and coded by variable length coding (VLC). The compressed video stream is then transmitted to the decoder through the network. At the decoder, the received compressed video stream is first decoded by variable length decoding (VLD), then inversely quantized (IQ), and finally inverse DCT (IDCT) transformed.
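The transform and quantization stages of this chain can be sketched in a few lines. The example below is an illustration only: it uses a one-dimensional orthonormal DCT-II on a single row of samples and a uniform quantizer with a hypothetical step size, and it leaves out the VLC/VLD stage entirely. It is not the integer transform that H.264 actually specifies.

```python
import math


def dct(samples):
    """Orthonormal 1-D DCT-II."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(samples[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        coeffs.append(scale * acc)
    return coeffs


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        acc = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            acc += scale * coeffs[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(acc)
    return samples


def quantize(coeffs, step):
    """Uniform quantizer: map each coefficient to an integer level."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Inverse quantization (IQ): scale the levels back."""
    return [level * step for level in levels]


row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical luma samples
step = 4.0                              # hypothetical quantizer step size
levels = quantize(dct(row), step)       # encoder side: DCT, then Q
decoded = idct(dequantize(levels, step))  # decoder side: IQ, then IDCT
print(levels)
print([round(v) for v in decoded])
```

A larger step size produces more zero levels (fewer bits after entropy coding) at the cost of a larger reconstruction error, which is the trade-off the quantization parameter controls in the experiments later in the report.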

Fig. 3: Different profiles in H.264 [14]

Fig. 3 shows the different profiles in H.264.
Baseline Profile: I/P slices; multiple reference frames; in-loop deblocking; CAVLC entropy coding
Main Profile: the Baseline Profile features mentioned above; B slices; CABAC entropy coding; interlaced coding; weighted prediction
High Profile: the Main Profile features mentioned above; 8x8 transform option; custom quantisation matrices

Fig. 4: Block diagram of H.264 [14]

The block diagram for H.264 coding is shown in Fig. 4. The encoder may select between intra- and inter-coding for block-shaped regions of each picture.

Intra-prediction: H.264 predicts intra-coded macroblocks from neighbouring samples so that only the prediction residual, rather than the original input signal itself, has to be coded, which reduces the number of bits. To encode a block or macroblock in intra-coded mode, a prediction block is formed from previously reconstructed blocks (before deblocking filtering). The residual signal between the current block and the prediction is then encoded. For the luma samples, the prediction block may be formed for each 4x4 subblock, with one of a total of 9 prediction modes selected for each 4x4 block (see Fig. 5).

Fig. 5: 4x4 luma prediction (intra-prediction) modes in H.264 [15]
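As a rough illustration of the 4x4 intra-prediction described above, the sketch below implements only three of the nine luma modes (vertical, horizontal and DC, which carry mode numbers 0, 1 and 2 in H.264) and picks the one with the smallest sum of absolute differences (SAD). Using SAD as the selection criterion is an assumption made for brevity; a real encoder such as JM typically uses a rate-distortion cost. The function names are hypothetical.

```python
def predict_4x4(top, left, mode):
    """Build a 4x4 prediction from the 4 samples above and the 4 to the left."""
    if mode == 0:  # vertical: copy the row above downwards
        return [list(top) for _ in range(4)]
    if mode == 1:  # horizontal: copy the left column across
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:  # DC: rounded mean of the eight neighbouring samples
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0, 1 and 2 are sketched here")


def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(block[r][c] - prediction[r][c])
               for r in range(4) for c in range(4))


def best_mode(block, top, left):
    """Return (sad, mode) for the cheapest of the three sketched modes."""
    return min((sad(block, predict_4x4(top, left, m)), m) for m in (0, 1, 2))


# A block with vertical stripes is predicted perfectly by the vertical mode.
top = [10, 20, 30, 40]
left = [100, 100, 100, 100]
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_mode(block, top, left))  # (0, 0)
```

Only the residual (here all zeros) and the chosen mode would then need to be coded, which is where the bit saving comes from.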

Inter-prediction: Inter-coding predicts a given block from previously decoded pictures. Its aim is to reduce temporal redundancy by making use of motion vectors. In H.264, the current picture can be partitioned into macroblocks or smaller blocks. A macroblock of 16x16 luma samples can be partitioned into smaller blocks down to 4x4. Smaller block sizes require more bits to signal the motion vectors and the partition type, but they can reduce the motion-compensated residual data. The choice of partition size therefore depends on the characteristics of the input video (see Fig. 6).

Fig. 6: Macroblock partitioning in H.264 for inter prediction [1]: (a) (L-R) 16x16, 8x16, 16x8, 8x8 blocks; (b) (L-R) 8x8, 4x8, 8x4, 4x4 blocks [15]

JM Software [12]: This software is a product of the Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG. At the time of this report, the latest version of the JM software is 17.2. It supports both planar and interleaved/packed raw image data (viz., yuv, rgb). The input is a configuration file (text file), and some of the parameters passed in that file are:
Input file
Number of frames to be encoded
Frame rate
Output frame width and height
Profile, level selection
GOP size
Bit rate control

JSVM software [13]: This software is also a product of the Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG. The input is a configuration file (text file), and some of the parameters passed in that file are:
Input file
Number of frames to be encoded
Frame rate
Output frame width and height
Profile, level selection
GOP size
Bit rate control

JM (17.2) Performance Analysis

JM Performance in Baseline Profile
Video sequence: akiyo_qcif (Fig. 7)
Number of frames encoded: 25
GOP: IBPBPBPBPB
Quantization parameter: 25, 30, 35, 40
Number of reference frames: 3
Video sequence used (Baseline): file size 3713 KB, QCIF format (176 x 144), YUV 4:2:0

Fig. 7: akiyo_qcif

Quantization parameter   PSNR (dB)   Total encoding time (s)   Bit rate (kbps)
10                       51.967      160.654                   4333.35
20                       44.576      149.769                   1547.8
25                       41.26       141.364                   884.57
30                       37.606      131.664                   455.68
35                       34.968      120.128                   241.37
40                       32.957      107.499                   132.43
Table 1: JM performance in Baseline profile for akiyo_qcif (PSNR: peak signal-to-noise ratio)

Fig. 8: PSNR vs. QP for Baseline profile for akiyo_qcif
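For reference, the PSNR values reported in Table 1 follow the usual definition PSNR = 10 log10(peak^2 / MSE), where MSE is the mean squared error between the original and decoded samples and peak is 255 for 8-bit video. The sketch below applies that definition to two short sample lists; the sample values are made up for illustration and have nothing to do with the akiyo sequence.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally long sequences of samples."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(peak * peak / mse)


# Every sample is off by one, so the MSE is 1 and the PSNR is about 48.13 dB.
print(round(psnr([100, 100, 100, 100], [101, 99, 101, 99]), 2))  # 48.13
```

Since the MSE grows with the quantization step size, this definition is consistent with the falling PSNR column in Table 1 as the quantization parameter increases.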

Fig. 9: Encoding time vs. QP for Baseline profile for akiyo_qcif

Need for SVC

Networked visual communication has become an active research area in recent years. One of the most challenging problems in implementing a video communication system is that the available network bandwidth is usually insufficient for delivering the voluminous video data. To solve this problem, considerable effort has been applied over the last three decades to the development of video compression techniques. These efforts have resulted in video coding standards such as H.261 [17], H.263 [17], MPEG-1, MPEG-2 and MPEG-4 Part 2 [18].

The SVC extension provides temporal scalability; SNR (quality) scalability with coarse (CGS), medium (MGS), and fine (FGS) granularity; spatial scalability; and combined spatio-temporal-SNR scalability, in which a restricted set of spatio-temporal-SNR points can be extracted from a global scalable bit stream [8]. With its hierarchical B frames, the SVC extension of H.264/AVC can also compress single-layer video. H.264/AVC and H.264 SVC video encoding are expected to be widely adopted for wired and wireless network video transport because of their increased compression efficiency compared to MPEG-4 Part 2 and their widespread inclusion in application standards. For a given video quality, the lower the compressed bit rate, the more efficient the compression. The improvements in rate-distortion (RD) compression efficiency with H.264 SVC and H.264/AVC come at the expense of significantly increased variability of the encoded frame sizes (in bits). The recently developed H.264/AVC video codec with the Scalable Video Coding (SVC) [1] extension compresses non-scalable (single-layer) and scalable video significantly more efficiently than MPEG-4 Part 2 [18].

Since the traffic characteristics of encoded video have a significant impact on its network transport, the bit rate-distortion and bit rate variability-distortion performance of single-layer video traffic of the H.264/AVC codec and SVC extension is examined using long CIF resolution videos. The traffic characteristics of hierarchical B frames (SVC) versus classical B frames are studied. In addition, the impact of frame size smoothing on the video traffic, which is used to mitigate the effect of bit rate variability, is examined. Compared to MPEG-4 Part 2, the H.264/AVC codec and SVC extension achieve lower average bit rates at the expense of significantly increased traffic variability that remains at a high level even with smoothing. Simulations are used to investigate the implications of this increase in rate variability on (i) frame losses when transmitting a single video, and (ii) the number of supported video streams in a bufferless statistical multiplexing scenario with restricted link capacity and information loss.

In general, video can be encoded (i) with fixed quantization scales, which results in nearly constant video quality at the expense of variable video traffic (bit rate), or (ii) with rate control, which adapts the quantization scales to keep the video bit rate nearly constant at the expense of variable video quality. To examine the fundamental traffic characteristics of the H.264/AVC standard, which does not specify a normative rate control mechanism, this study focuses primarily on encodings with fixed quantization scales. An additional motivation for focusing on variable bit rate video encoded with fixed quantization scales is that variable bit rate streams allow for statistical multiplexing gains, which have the potential to improve the efficiency of video transport over communication networks. The development of video network transport mechanisms that meet the strict playout deadlines of the video frames and efficiently accommodate the variability of the video traffic is a challenging problem.
A wide array of video transport mechanisms has been developed and evaluated, based primarily on the characteristics of MPEG-2 and MPEG-4 Part 2 encoded video. The widespread adoption of the new H.264/AVC video standard necessitates a careful study of the traffic characteristics of video coded with the new H.264/AVC codec and its extensions. It is therefore necessary to examine the new video encoder's statistical characteristics and compression performance from a communication network perspective. The study of the H.264 SVC extension analyzes the single-layer (non-scalable) video traffic characteristics of long CIF videos, i.e., spatial and quality scalability layers are not analyzed, although the H.264 SVC single-layer encoding supports temporal scalability. H.264/AVC and H.264 SVC single-layer video traffic is significantly more variable than MPEG-4 Part 2 traffic under similar encoding conditions. At the same time, the significant average bit rate savings are confirmed. The increased bit rate variability is observed over a wide range of average qualities of the encoded streams and for all tested video sequences. This makes the transport of H.264/AVC and H.264 SVC single-layer traffic more challenging than that of MPEG-4 Part 2 traffic.

In the following, the concept of hierarchical B frames is discussed in more detail, since the study refers to it repeatedly. SVC's temporal scalability is built on the hierarchical prediction concept for B frames.

Temporal Scalability with Hierarchical B Frames: The introduction of hierarchical B frames has allowed the H.264 SVC encoder to achieve temporal scalability while at the same time improving RD efficiency compared to the classical B frame prediction method employed by the older MPEG standards (MPEG-1, MPEG-2, MPEG-4 Part 2) and by default in H.264/AVC. Fig. 10 illustrates both concepts for predicting B frames.
Hierarchical B frames are an important concept that was first introduced in H.264/AVC through generalized B frames and was later found to be the best basis for building the Scalable Video Coding (SVC) extension. Hence, an H.264 SVC encoded single-layer stream is decodable by existing H.264/AVC codecs. The scalability modes do require new SVC capability, with the supported modes depending on the application or, equivalently, on the H.264 SVC profile.

Fig. 10(a) depicts the classical B frame prediction structure, where each B frame is predicted only from the preceding I or P frame and from the subsequent I or P frame. Other B frames are not referenced, since this is not allowed by the video standards preceding H.264/AVC. This restriction is lifted in the generalized B frame paradigm that was first introduced in the H.264/AVC standard. Fig. 10(b) depicts the hierarchical B frame structure, which uses B frames for the prediction of other B frames. The illustrated case is the dyadic hierarchy of B frames, meaning that the number of B frames n between the key pictures (I or P frames) equals n = 2^k - 1. The hierarchy with 3 B frames (I frame period of 16) is depicted in Fig. 10(b). In this example, the frame sequence is I0 B2 B1 B2 P0 B2 B1 B2 P0 B2 B1 B2 P0 B2 B1 B2 I1, where the index of each B and P frame is its temporal layer number.

The coding efficiency of hierarchical B frames depends on the number of hierarchical B frames (temporal levels) and on the choice of quantization parameters for each B frame. H.264 SVC therefore introduces cascading quantizers, which assign a higher quantization parameter value (lower quality) to B frames belonging to higher temporal layers. This concept is based on the insight that the lowest temporal layer 0 requires higher quality than the next temporal layer, since all other predictions depend on it. The quality of each subsequent temporal layer can be gradually reduced, since fewer layers depend on it. The quality fluctuation that this introduces within a GoP is apparently not subjectively noticeable to human observers.
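The dyadic layer assignment and the cascading quantizers described above can be sketched as follows. The layer rule used here (a frame at display position i inside a GoP of size 2^k sits in layer k minus the number of trailing zero bits of i) reproduces the B2 B1 B2 pattern of the example sequence. The QP offset of one per temporal layer in `cascaded_qp` is purely an assumption for illustration and is not the cascading rule that JSVM implements.

```python
def temporal_layers(gop_size):
    """Temporal layer of each frame in one dyadic GoP, in display order.

    The GoP consists of gop_size - 1 hierarchical B frames followed by
    one key picture (I or P), which always belongs to layer 0.
    """
    levels = gop_size.bit_length() - 1  # k, where gop_size == 2**k
    if gop_size < 1 or gop_size != 1 << levels:
        raise ValueError("gop_size must be a power of two")
    layers = []
    for i in range(1, gop_size):
        trailing_zeros = (i & -i).bit_length() - 1
        layers.append(levels - trailing_zeros)
    layers.append(0)  # the key picture closes the GoP
    return layers


def cascaded_qp(base_qp, layer, delta=1):
    """Hypothetical cascading rule: raise QP by delta per temporal layer."""
    return base_qp + delta * layer


print(temporal_layers(4))  # [2, 1, 2, 0]
print([cascaded_qp(25, layer) for layer in temporal_layers(4)])  # [27, 26, 27, 25]
```

Dropping every frame whose layer exceeds a chosen threshold halves the frame rate per removed layer, which is how the hierarchy yields temporal scalability.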
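For a video sequence consisting of M frames encoded with a given quantization scale, let X_m (m = 1, ..., M) denote the sizes [bits] of the encoded frames. The mean frame size \bar{X} [bits] of the encoded video sequence is defined as [8]

\[ \bar{X} = \frac{1}{M} \sum_{m=1}^{M} X_m , \]

while the variance of the frame sizes (\sigma is the standard deviation [bits]) is defined as

\[ \sigma^2 = \frac{1}{M-1} \sum_{m=1}^{M} \left( X_m - \bar{X} \right)^2 . \]

The coefficient of variation of frame sizes [unit free] is defined as

\[ \mathrm{CoV} = \frac{\sigma}{\bar{X}} . \]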
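These frame size statistics are straightforward to compute from a frame size trace. The sketch below assumes the sample variance with 1/(M-1) normalization and also reports the peak-to-mean ratio that appears on the axes of the later JSVM plots; the three frame sizes are invented for the example.

```python
import math


def frame_size_stats(sizes_bits):
    """Mean, variance, coefficient of variation and peak-to-mean ratio."""
    m = len(sizes_bits)
    mean = sum(sizes_bits) / m
    variance = sum((x - mean) ** 2 for x in sizes_bits) / (m - 1)
    std = math.sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "cov": std / mean,                       # unit free
        "peak_to_mean": max(sizes_bits) / mean,  # unit free
    }


# Hypothetical trace of three frame sizes in bits.
print(frame_size_stats([1000, 2000, 3000]))
# {'mean': 2000.0, 'variance': 1000000.0, 'cov': 0.5, 'peak_to_mean': 1.5}
```

A higher CoV or peak-to-mean ratio at the same mean bit rate indicates burstier traffic, which is the property that makes H.264/AVC and SVC streams harder to transport than MPEG-4 Part 2 streams.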

Fig. 10: B frame prediction structures [8]

In the subsequent experiments, four different GoP structures are employed:
G16-B1: IBPBPBPBPBPBPBPB (16 frames, with 1 B frame per I/P frame)
G16-B3: IBBBPBBBPBBBPBBB (16 frames, with 3 B frames per I/P frame)
G16-B7: IBBBBBBBPBBBBBBB (16 frames, with 7 B frames per I/P frame)
G16-B15: IBBBBBBBBBBBBBBB (16 frames, with 15 B frames per I frame)
In the context of SVC, these four GoP structures are designated by their GoP size, which is the number of hierarchical B frames plus one key picture of type I or P. Hence, G16-B1 has GoP size 2, G16-B3 has GoP size 4, G16-B7 has GoP size 8, and G16-B15 has GoP size 16.

Basic concept for extending H.264/AVC toward a scalable video coding standard: Since SVC was developed as an extension of H.264/AVC and inherits all of its well-designed core coding tools, one of the design principles of SVC is that new tools should be added only if they are necessary for efficiently supporting the required types of scalability. Fig. 11 shows the types of scalability.

Fig. 11: Types of scalability

Temporal scalability: A bit stream provides temporal scalability when the set of corresponding access units can be partitioned into a temporal base layer and one or more temporal enhancement layers. The prior video coding standards MPEG-1, H.262/MPEG-2 Video, and H.263 all support temporal scalability to some degree. H.264/AVC provides significantly increased flexibility for temporal scalability because of its reference picture memory control. Hence, no changes to the design of H.264/AVC were required to support temporal scalability with a reasonable number of temporal layers. The only related change in SVC concerns the signaling of temporal layers. The coding order for hierarchical prediction structures has to be chosen so that reference pictures are coded before they are employed for motion-compensated prediction. This can be ensured by different strategies, which mostly differ in the associated decoding delay and memory requirement.

Spatial scalability: To support spatial scalable coding, SVC follows the conventional approach of multi-layer coding, which is also used in H.262/MPEG-2 Video and H.263. Each layer corresponds to a supported spatial resolution and is referred to by a spatial layer or dependency identifier D. The dependency identifier D of the base layer is equal to 0, and it is increased by 1 from one spatial layer to the next. Since the support of quality and spatial scalability usually comes with a loss in coding efficiency relative to single-layer coding, the trade-off between coding efficiency and the provided degree of scalability can be adjusted according to the needs of an application.

Combined scalability: The general concept for combining spatial, quality, and temporal scalability is illustrated in Fig. 12, which shows an example encoder structure with two spatial layers. The SVC coding structure is organized in dependency layers. A dependency layer usually represents a specific spatial resolution. In an extreme case, the spatial resolution of two dependency layers may be identical, in which case the different layers provide coarse-grain scalability (CGS) in terms of quality.

Fig. 12: SVC encoder structure example [1]
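The way a partial bit stream is obtained from a combined scalable stream can be sketched as a simple filter over coded units that are tagged with a dependency (spatial) identifier, a temporal identifier and a quality identifier. The `Unit` record and the function below are hypothetical simplifications: a real SVC extractor operates on NAL units and must also respect inter-layer prediction dependencies, which this sketch ignores.

```python
from collections import namedtuple

# Hypothetical record for one coded unit of a scalable stream.
Unit = namedtuple("Unit", "dependency_id temporal_id quality_id size_bits")


def extract_substream(units, max_d, max_t, max_q):
    """Keep only the units at or below the requested operating point."""
    return [u for u in units
            if u.dependency_id <= max_d
            and u.temporal_id <= max_t
            and u.quality_id <= max_q]


stream = [
    Unit(0, 0, 0, 8000),   # base layer key picture
    Unit(0, 1, 0, 3000),   # base layer, temporal enhancement
    Unit(1, 0, 0, 20000),  # second spatial layer key picture
    Unit(1, 1, 0, 9000),   # second spatial layer, temporal enhancement
]

# Base spatial resolution at full frame rate.
sub = extract_substream(stream, max_d=0, max_t=1, max_q=0)
print(sum(u.size_bits for u in sub))  # 11000
```

Because the filter only discards units, the operating point can be changed inside the network without re-encoding the video.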

SNR scalability: Fig. 13 and Fig. 14 show the SNR scalable coder and decoder.

Fig. 13: SNR scalable coder (base layer with a coarse quantizer, upper layer with a fine quantizer)

Fig. 14: Decoding process for SNR scalability

For SNR scalability, coarse-grain scalability (CGS) and fine-grain scalability (FGS) are distinguished [16].

Coarse-grain SNR scalability: Coarse-grain SNR scalable coding is achieved using the concepts for spatial scalability. The only difference is that for CGS the upsampling operations of the inter-layer prediction mechanisms are omitted. Note that the restricted inter-layer prediction that enables single-loop decoding is even more important for CGS than for spatial scalable coding.

Fine-grain SNR scalability: To support fine-granular SNR scalability, so-called progressive refinement (PR) slices have been introduced. Each PR slice represents a refinement of the residual signal that corresponds to a bisection of the quantization step size (a QP decrease of 6). These signals are represented in such a way that only a single inverse transform has to be performed for each transform block at the decoder side. The ordering of transform coefficient levels in PR slices allows the corresponding PR NAL units to be truncated at any arbitrary byte-aligned point, so that the quality of the SNR base layer can be refined in a fine-granular way.

Fig. 15: Motion-compensated prediction with FGS, showing FGS enhancement layer and SNR base layer key pictures [16]

The main reason for the low performance of FGS in MPEG-4 is that the motion-compensated prediction (MCP) is always done in the SNR base layer. In the SVC design, the highest quality reference available is employed for the MCP of temporal refinement pictures, as depicted in Fig. 15. This difference significantly improves the coding efficiency without increasing the complexity when hierarchical prediction structures are used. The MCP for key pictures uses only the base layer representation of the reference pictures. Thus, the key pictures serve as re-synchronization points, and the drift between encoder and decoder reconstruction is efficiently limited.

JSVM Performance Analysis [11]

JSVM Performance in Baseline Profile [11]
Video sequence: Die Hard (Fig. 16)
Number of frames encoded: 30
GOP: G16-B15
Quantization parameter: 25, 30, 35, 40
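To relate the quantization parameters used in these experiments to the step-size bisection mentioned for PR slices, the sketch below evaluates the commonly tabulated H.264 mapping from QP to quantization step size, in which the step size doubles for every increase of 6 in QP. The function name `qstep` is hypothetical, and the table of six base values is the one usually quoted for QP 0 to 5 in H.264 overviews.

```python
# Step sizes for QP 0..5; every further 6 QP steps double the value.
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp):
    """Quantization step size for an H.264 quantization parameter in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


for qp in (25, 30, 35, 40):
    print(qp, qstep(qp))

# Lowering QP by 6 halves the step size, i.e. one refinement step.
print(qstep(30) / qstep(24))  # 2.0
```

Each PR slice therefore moves the effective quantizer one "octave" finer, and truncating a PR NAL unit leaves the reconstruction somewhere between two such octaves.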

Fig. 16: Video sequence [11]

Fig. 17: Peak/mean of frame size vs. average quality (PSNR-Y) for Die Hard [11]

Fig. 18: Average quality (PSNR-Y) vs. average bit rate for Die Hard, MGS layer 0 [11]

Fig. 19: Trace preview for the video sequence Die Hard [11]

SVC reference encodings [11] for Die Hard (fps = 30, base layer 0, layer 4): panels for QP = 25, 30, 35, and 40

JSVM Performance in Baseline Profile [11]
Video sequence: Citizen Kane (Fig. 20)
Number of frames encoded: 30
GOP: G16-B15
Quantization parameter: 25, 30, 35, 40

Fig. 20: Video sequence [11]

Fig. 21: Peak/mean of frame size vs. average quality (PSNR-Y) for Citizen Kane [11]

Fig. 22: Average quality (PSNR-Y) vs. average bit rate for Citizen Kane [11]

Fig. 23: Trace preview for the video sequence Citizen Kane [11]

SVC reference encodings [11] for Citizen Kane (fps = 30, base layer 0, layer 4): panels for QP = 25, 30, 35, and 40

Advantages of SVC:

An SVC stream combines multiple sub-streams in a single stream for transmission and storage of video. It is not always necessary to provide every type of scalability for every video stream, so an SVC stream can be customized according to the needs of an application.

H.264-SVC has several advantages. The most obvious one is the ability to send a single video stream to multiple heterogeneous clients. The same result can be achieved with transcoding, which video conferencing systems use for rate matching, but transcoding places a relatively large load on the processor because the video stream sent to each client must be encoded individually. Transcoding also introduces latency of its own. H.264-SVC is useful for multicasting as well. An SVC video stream is only 10-20% larger than the largest stream it carries, so sending one SVC stream instead of multiple individual video streams saves considerable bandwidth and storage space. Moreover, the lower-quality base layer can be stored on its own, instead of storing all the layers, which may be useful for video surveillance.

Conclusions:

In comparison to the scalable profiles of prior video coding standards, the H.264/AVC extension for scalable video coding (SVC) provides various tools for reducing the loss in coding efficiency relative to single-layer coding. The most important differences are:

(1) The possibility to employ hierarchical prediction structures for providing temporal scalability with several layers while improving the coding efficiency and increasing the effectiveness of quality and spatial scalable coding.
(2) New methods for inter-layer prediction of motion and residual, improving the coding efficiency of spatial scalable and quality scalable coding.
(3) The concept of key pictures for efficiently controlling the drift for packet-based quality scalable coding with hierarchical prediction structures.
(4) Single motion compensation loop decoding for spatial and quality scalable coding, providing a decoder complexity close to that of single-layer coding.
(5) The support of a modified decoding process that allows a lossless and low-complexity rewriting of a quality scalable bit stream into a bit stream that conforms to a non-scalable H.264/AVC profile.

These new features provide SVC with a competitive rate-distortion performance while only requiring a single motion compensation loop at the decoder side. For the individual types of scalability:

(1) Temporal scalability: can typically be achieved without losses in rate-distortion performance.
(2) Spatial scalability: when applying an optimized SVC encoder control, the bit rate increase relative to non-scalable H.264/AVC coding at the same fidelity can be as low as 10% for dyadic spatial scalability. It should be noted that the results typically become worse as the spatial resolution of both layers decreases and improve as the spatial resolution increases.
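As a rough illustration of the 10-20% size overhead quoted under Advantages of SVC, the following Python sketch compares the total rate of sending several independently encoded streams with the rate of one SVC stream that carries all of them. The three stream rates and the function names are hypothetical values chosen for illustration only and are not taken from the encodings in this report. The only figure taken from the text is the overhead range.

```python
# Minimal sketch: total rate of separate single-layer streams versus one
# SVC stream. All stream rates below are hypothetical example values.

def simulcast_rate(stream_rates_kbps):
    """Total rate when every client class gets its own single-layer stream."""
    return sum(stream_rates_kbps)


def svc_rate(stream_rates_kbps, overhead):
    """Rate of one SVC stream: the largest carried stream plus the overhead.

    overhead is a fraction, e.g. 0.10 for the 10% case quoted in the text.
    """
    return max(stream_rates_kbps) * (1.0 + overhead)


def saving_fraction(stream_rates_kbps, overhead):
    """Fraction of the simulcast rate saved by sending one SVC stream."""
    total = simulcast_rate(stream_rates_kbps)
    return 1.0 - svc_rate(stream_rates_kbps, overhead) / total


if __name__ == "__main__":
    rates = [256, 768, 2000]  # hypothetical low, medium, high quality (kbit/s)
    for overhead in (0.10, 0.20):  # overhead range stated in the text
        print(
            f"overhead {overhead:.0%}: "
            f"simulcast {simulcast_rate(rates)} kbit/s, "
            f"SVC {svc_rate(rates, overhead):.0f} kbit/s, "
            f"saving {saving_fraction(rates, overhead):.1%}"
        )
```

The saving depends strongly on how many streams are replaced and how close their rates are. With only two streams of very different rates the gain is small, while it grows as more operating points are served from the same SVC stream.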

References:

[1] H. Schwarz, D. Marpe, and T. Wiegand, Overview of the scalable video coding extension of the H.264/AVC standard, IEEE Trans. Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1103-1120, Sep. 2007 (Introduction to the special issue on Scalable video coding-standardization and beyond, pp. 1099-1269).
[2] G. Van der Auwera and M. Reisslein, Implications of smooth streaming on statistical multiplexing of H.264/AVC and SVC video streams, IEEE Trans. Broadcasting, vol. 55, no. 3, pp. 541-558, Sep. 2009.
[3] M. Wien, H. Schwarz, and T. Oelbaum, Performance analysis of SVC, IEEE Trans. Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1194-1203, Sep. 2007 (Introduction to the special issue on Scalable video coding-standardization and beyond, pp. 1099-1269).
[4] G. Van der Auwera, P. T. David, and M. Reisslein, Traffic characteristics of H.264/AVC variable bit rate video, IEEE Communications Magazine, vol. 46, no. 11, pp. 164-174, Nov. 2008.
[5] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to algorithms, First Edition, MIT Press and McGraw-Hill, Cambridge, MA, USA, 1990.
[6] T. R. Rahman and M. Rahman, Compression algorithms for audio-video streaming, IEEE Conference on Intelligent systems, modeling and simulation, pp. 187-192, 2010.
[7] ITU-T and ISO/IEC JTC 1, Generic coding of moving pictures and associated audio information-part 1: Systems, ITU-T Recommendation H.222.0 and ISO/IEC 13818-1 (MPEG-2 Systems), Nov. 1994.
[8] G. Van der Auwera, P. T. David, and M. Reisslein, Traffic and quality characterization of single-layer video streams encoded with the H.264/MPEG-4 advanced video coding standard and scalable video coding extension, IEEE Trans. Broadcasting, vol. 54, no. 3, pp. 698-718, Aug. 2008.
[9] S. Rahim and S. F. Hassan, Performance evaluation of fast TCP and TCP Westwood+ for multimedia streaming in wireless environment, ICCIT '09, 12th International Conference, pp. 697-702, Dec. 2009.
[10] R. Mahalingam, RD-Optimized rate shaping for scalable coded streaming video, Masters Thesis, Technische Universität München, Oct. 2006.
[11] Video Trace Library - http://trace.eas.asu.edu
[12] JM software - http://iphome.hhi.de/suehring/tml/
[13] JSVM software - http://ip.hhi.de/imagecom_g1/savce/mpeg-verification-test/sbb2.htm
[14] S. Kwon, A. Tamhankar, and K. R. Rao, Overview of H.264 / MPEG-4 Part 10, J. Visual Communication and Image Representation, vol. 17, pp. 183-216, April 2006.
[15] I. E. Richardson, The H.264 advanced video compression standard, second edition, Aug. 2010.
[16] H. Schwarz, D. Marpe, and T. Wiegand, Overview of the scalable H.264/MPEG4-AVC extension, IEEE International Conference on Image Processing, pp. 161-164, Feb. 2007.
[17] B. Girod, E. Steinbach, and N. Färber, Comparison of the H.263 and H.261 Compression Standards, SPIE, Photonics East, Philadelphia, PA, vol. CR60, Oct. 1995.
[18] P. D. Symes, Video Compression: Fundamental Compression Techniques and Overview of the JPEG and MPEG Compression Systems, McGraw-Hill, New York, 1998.