ROUTING PROTOCOL ANALYSIS FOR SCALABLE VIDEO CODING (SVC) TRANSMISSION OVER MOBILE AD-HOC NETWORKS
EE 5359 SPRING 2015 MULTIMEDIA PROCESSING
A PROJECT PROPOSAL UNDER THE GUIDANCE OF K. R. RAO
PRAJWAL S SANKET
UNIVERSITY OF TEXAS AT ARLINGTON, ELECTRICAL ENGINEERING
UTA ID: 1000980854
TABLE OF CONTENTS
LIST OF ACRONYMS AND ABBREVIATIONS ... 3
ABSTRACT ... 3
INTRODUCTION ... 5
OVERVIEW OF VIDEO CODING ... 6
SCALABLE VIDEO CODING ... 7
TYPES OF SVC ... 8
TEMPORAL SCALABILITY ... 8
SPATIAL SCALABILITY ... 9
QUALITY SCALABILITY ... 10
SVC ENCODER ... 12
SVC DECODER ... 13
MANET ROUTING PROTOCOLS: PROTOCOLS OVERVIEW ... 14
AD-HOC ON-DEMAND DISTANCE VECTOR ROUTING ... 15
DESTINATION SEQUENCED DISTANCE VECTOR ... 15
DYNAMIC SOURCE ROUTING ... 15
AD-HOC ON-DEMAND MULTIPATH DISTANCE VECTOR ROUTING ... 15
IMPLEMENTATION ... 16
CONCLUSION ... 17
REFERENCES ... 19
LIST OF ACRONYMS AND ABBREVIATIONS
AODV: Ad-hoc On-demand Distance Vector routing
AOMDV: Ad-hoc On-demand Multipath Distance Vector routing
CH: Cluster Head
DSR: Dynamic Source Routing
DSDV: Destination Sequenced Distance Vector
GOP: Group of Pictures
LAR: Location Aided Routing
MANET: Mobile Ad-hoc Network
MPEG: Moving Picture Experts Group
OLSR: Optimized Link State Routing
PDR: Packet Delivery Ratio
QoS: Quality of Service
RREQ: Route Request message
RREP: Route Reply message
SVC: Scalable Video Coding
TCP: Transmission Control Protocol
TORA: Temporally Ordered Routing Algorithm
UDP: User Datagram Protocol
ABSTRACT
Video streaming is a multimedia service that has seen significant growth in recent years. The main problem to tackle is bandwidth fluctuation in the network. [1] The challenge for a wireless network is to provide the quality of service (QoS) and quality of experience that satisfy the customer, which requires better streaming capability from the network. This project focuses on the performance analysis of routing protocols over MANETs for scalable video streaming. The video codec under evaluation is H.264/SVC, and the routing protocols are DSR, AODV and AOMDV. The performance of ad-hoc networks such as IEEE 802.11 and 802.11e is also evaluated in this study. The evaluation is carried out using the NS-2 simulator configured to support ad-hoc networks. [1]
INTRODUCTION
Today, developments in telecommunication technology increasingly demand broadband data transmission at high speeds, especially for mobile video services, which need higher bandwidth and faster data transmission to meet market demand. The demands of users of this technology, together with 4G standards and even 5G preparation, have led to various schemes and innovations from many parties, mainly academia, research institutions and communication vendors. [1]
A Mobile Ad-hoc Network (MANET) is a collection of communication devices, or nodes, that wish to communicate without any fixed infrastructure or predetermined organization of available links. MANET performance depends heavily on the routing mechanisms and protocols used. Routing protocols may be classified into three categories: dynamic cluster-based, proactive and reactive. [1]
The main concept of cluster-based routing is dividing the network into interconnected substructures called clusters. Each cluster has a cluster head (CH) which acts as a coordinator inside the cluster and maintains contact with the other cluster heads. [1]
A reactive MANET protocol dynamically finds the route between nodes. It performs route discovery and route maintenance: route discovery is responsible for finding new routes, while route maintenance detects broken links and repairs existing routes. Examples: AODV, LAR, and DSR. [3]
A proactive, or table-driven, protocol is based on the exchange of control packets and continuous updates of the route information in the routing table; hence, a route is readily available. Examples: DSDV, OLSR. [3]
OVERVIEW OF VIDEO CODING
Non-scalable video coding: There are three basic types of Moving Picture Experts Group (MPEG) video frames: (1) the I-frame, or intra-coded frame, which is encoded independently of other frames and can be decoded by itself; (2) the P-frame, or predictive-coded frame, which is encoded using predictions from a preceding I- or P-frame in the video sequence; and (3) the B-frame, or bi-directionally predictive-coded frame, which is encoded using predictions from preceding and succeeding I- or P-frames. Generally, the video sequence is decomposed into smaller units that are coded together, called Groups of Pictures (GOP). A GOP pattern is characterized by two parameters, G(N, M): N is the I-to-I frame distance and M is the I-to-P frame distance. For example, as shown in Fig. 1, G(9, 3) means that the GOP includes one I-frame, two P-frames, and six B-frames. The second I-frame shown in Fig. 1 indicates the beginning of the next GOP. The arrows indicate that the decoded B-frames and P-frames depend on the preceding or succeeding I- or P-frames. [5]
Figure 1: An example of MPEG coding with GOP (N=9, M=3) [5]
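The G(N, M) pattern can be illustrated with a short sketch. The helper below is hypothetical (not part of any codec or standard tool); it simply derives the display-order frame types for one GOP from the two parameters:

```python
def gop_pattern(n, m):
    """Return display-order frame types for one GOP with
    I-to-I distance n and I-to-P (anchor) distance m."""
    frames = []
    for i in range(n):
        if i == 0:
            frames.append("I")   # first frame of the GOP is intra-coded
        elif i % m == 0:
            frames.append("P")   # anchor frames appear every m positions
        else:
            frames.append("B")   # bi-directionally predicted in between
    return frames

# G(9, 3): one I-frame, two P-frames, six B-frames, matching Fig. 1
print(gop_pattern(9, 3))  # ['I', 'B', 'B', 'P', 'B', 'B', 'P', 'B', 'B']
```

Counting the entries reproduces the 1 I / 2 P / 6 B split stated above for G(9, 3).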
SCALABLE VIDEO CODING
In scalable, or layered, video coding, the video is encoded hierarchically into a base layer and one or more enhancement layers. Decoding the base layer alone offers low but standard video quality, while decoding the base layer together with additional enhancement layers provides further refinement of the video quality. There are different forms of scalability, including temporal, spatial, and SNR scalability. Figure 2 shows an example of temporal scalable encoding: the I- and P-frames form the base layer, and the B-frames form the enhancement layer. The base layer provides the basic video quality at a lower frame rate; adding the enhancement layer to the base layer increases the smoothness of the video. H.264/SVC is the scalable extension of H.264/AVC, a standardization effort of the Joint Video Team (JVT). An encoded SVC bit stream consists of an H.264/AVC-compatible base layer and one or more scalable enhancement layers. Conceptually, the design of H.264/AVC covers a Video Coding Layer (VCL) and a Network Abstraction Layer (NAL). While the VCL creates a coded representation of the source content, the NAL formats these data and provides header information in a way that enables simple and effective customization of the VCL data for a wide variety of systems.
Figure 2: An example of temporal video coding [5]
TYPES OF SVC
Figure 3: Types of scalable video coding [13]
Temporal Scalability
A video bit stream is called temporally scalable when parts of the stream can be removed such that the resulting sub-stream forms another valid bit stream for some target decoder, and the sub-stream represents the source content with a frame rate smaller than that of the complete original bit stream. As depicted in Figure 3, temporal scalability can be achieved by partitioning the access units of a bit stream (each access unit corresponds to a video picture) into a temporal base layer and one or more temporal enhancement layers with the following property: let the temporal layers be identified by a temporal layer identifier T, which is 0 for the base layer and increases by 1 from one temporal layer to the next. Then, for each natural number k, the bit stream obtained by removing all access units with a temporal layer identifier T greater than k forms another valid bit stream for the given decoder. For hybrid video codecs, temporal scalability can generally be enabled by restricting motion-compensated prediction to reference pictures whose temporal layer identifier is less than or equal to that of the picture being predicted. The prior video coding standards MPEG-1, H.262/MPEG-2 Video, H.263, and MPEG-4 Visual all support temporal scalability to some degree. H.264/AVC provides significantly increased flexibility for temporal scalability
because of its reference picture memory control. It allows the coding of picture sequences with arbitrary temporal dependencies, restricted only by the maximum usable DPB (decoded picture buffer) size. Hence, no changes to the design of H.264/AVC were required to support temporal scalability with a reasonable number of temporal layers; the only related change in SVC concerns the signaling of temporal layers. A very efficient way of providing temporal scalability is the use of hierarchical prediction structures, proposed by the Image and Video Coding group. [13]
Spatial Scalability
A bit stream is called spatially scalable when parts of the stream can be removed such that the resulting sub-stream forms another valid bit stream for some target decoder, and the sub-stream represents the source content with a spatial resolution lower than that of the complete original bit stream. For spatially scalable coding, SVC follows the conventional approach of multi-layer coding, which is also used in H.262/MPEG-2 Video, H.263, and MPEG-4 Visual. Each layer corresponds to a supported spatial resolution and is referred to by a spatial layer or dependency identifier D. The dependency identifier D is 0 for the base layer and increases by 1 from one spatial layer to the next. In each spatial layer, motion-compensated prediction and intra prediction are employed as for single-layer coding. However, to improve coding efficiency compared to simulcasting different spatial resolutions in separate bit streams, additional so-called inter-layer prediction mechanisms are incorporated, as illustrated in Figure 4. The inter-layer prediction techniques, developed by the Image and Video Coding group, include:
- Inter-layer intra prediction
- Inter-layer macroblock mode and motion prediction
- Inter-layer residual prediction
As an important feature, the inter-layer prediction techniques are designed so that each spatial enhancement layer can be decoded with a single motion-compensation loop. [13]
Figure 4: Multi-layer structure with additional inter-layer prediction [13]
Quality Scalability
A video bit stream is called quality scalable when parts of the stream can be removed such that the resulting sub-stream forms another valid bit stream for some target decoder, and the sub-stream represents the source content with a reconstruction quality lower than that of the complete original bit stream. Quality scalability can be considered a special case of spatial scalability with identical picture sizes in the base and enhancement layers. The SVC approach supports the same inter-layer prediction mechanisms as for spatial scalability, but without the corresponding upsampling operations; furthermore, inter-layer intra and residual prediction are performed directly in the transform domain. This approach of re-using the concepts of spatial scalability is also referred to as coarse-grain quality scalability (CGS). When inter-layer prediction is used for quality scalability in SVC, a refinement of texture information is typically achieved by re-quantizing the residual texture signal in the enhancement layer with a smaller quantization step size than that used for the preceding quality layer. However, this multilayer concept only allows a few selected bit rates to be supported in a scalable bit stream: in general, the number of supported rate points equals the number of coded quality layers, and switching between quality layers is only possible at defined points in the bit stream. Furthermore, the multilayer concept for quality scalable coding becomes
less efficient when the relative rate difference between successive quality layers gets smaller. To increase the flexibility of bit stream adaptation and error robustness, and also to improve the coding efficiency of bit streams that have to provide a variety of bit rates, a variation of the CGS approach, referred to as medium-grain quality scalability (MGS), is included in the SVC design. Besides the modified high-level signaling, the following additional concepts are supported in medium-grain quality scalable coding:
- Key picture concept, for adjusting a suitable trade-off between drift and enhancement layer coding efficiency
- Transform coefficient partitioning, for increasing the granularity of quality scalable coding [13]
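The sub-stream extraction property shared by the scalability types above can be sketched for the temporal case, where all access units with temporal layer identifier T greater than k are removed. This is an illustrative model only — the frame/layer pairs below represent a hypothetical dyadic hierarchy, not an actual bit stream parser:

```python
def extract_temporal_substream(access_units, k):
    """Keep only access units whose temporal layer identifier T is <= k.
    Each access unit is modeled as a (frame_number, T) pair; by the
    property described above, the result is a valid sub-stream."""
    return [(frame, t) for frame, t in access_units if t <= k]

# Dyadic hierarchy over 8 frames: T=0 every 4th frame, T=1 halfway, T=2 for the rest
stream = [(0, 0), (1, 2), (2, 1), (3, 2), (4, 0), (5, 2), (6, 1), (7, 2)]
print(extract_temporal_substream(stream, 0))  # [(0, 0), (4, 0)] -> lowest frame rate
print(extract_temporal_substream(stream, 1))  # [(0, 0), (2, 1), (4, 0), (6, 1)] -> doubled rate
```

Each increment of k doubles the frame rate in this dyadic example, mirroring how a decoder picks a temporal operating point.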
SVC ENCODER
Figure 5: Block diagram of an H.264/SVC encoder for two spatial layers [3]
The architecture of the H.264/SVC standard is designed to increase the codec's capabilities while offering a flexible encoder that supports three different scalabilities: temporal, spatial and SNR (quality) [3]. Figure 5 illustrates the structure of an H.264/SVC encoder in a basic two-spatial-layer scalable configuration. In H.264/SVC, each spatial dependency layer requires its own prediction module to perform both motion-compensated prediction and intra prediction within the layer. In addition, an SNR refinement module provides the mechanisms for quality scalability within each layer. The dependency between subsequent spatial layers is managed by the inter-layer prediction module, which supports reusing motion vectors, intra texture or residual signals from lower layers to improve compression efficiency. Finally, the scalable H.264/SVC bitstream is assembled by the so-called multiplex, where the different temporal, spatial and SNR levels are integrated into a single scalable bitstream. [3]
SVC DECODER
Figure 6: Block diagram of an H.264/SVC decoder for two spatial layers [4]
The system demultiplexer in Figure 6 extracts two bitstreams and feeds them to the base- and enhancement-layer decoders. The output of the base-layer decoder can be shown standalone at half the temporal rate, or multiplexed with the enhancement-layer decoded frames and shown at the full temporal rate. [4]
MANET ROUTING PROTOCOLS: PROTOCOLS OVERVIEW
MANET routing protocols are classified into three classes: proactive, reactive and hybrid. Figure 7 illustrates this classification, with DSR and AODV as reactive protocols, OLSR and DSDV as proactive protocols, and TORA as a hybrid protocol.
Figure 7: MANET routing protocols [4]
Ad-hoc On-demand Distance Vector routing
In the AODV routing protocol each node works independently and does not carry information about other nodes. To find a route, the source node broadcasts a route request (RREQ) to all other nodes. When the destination node is found, it replies with a route reply (RREP) back to the source node. Each RREQ has a lifespan after which it times out; if the source node is unable to find the destination node within this lifespan, it extends the lifespan and sends another RREQ with a different sequence number. [4]
Destination Sequenced Distance Vector
DSDV is a proactive protocol designed around the Bellman-Ford algorithm. Each node maintains information about every other node: it keeps a list of all destinations and the number of hops to each, every entry is tagged with a sequence number, and incremental update packets are used to lower the traffic volume caused by network route updates. The main advantage of this protocol is that it prevents the creation of routing loops in networks containing mobile routers. [4]
Dynamic Source Routing
DSR is an on-demand routing protocol in which a route is calculated only when it is required. It is designed for use in multi-hop ad-hoc networks. DSR allows the network to be self-organizing and self-configuring without any central administration or network infrastructure. It does not use periodic routing messages like AODV, which reduces bandwidth overhead and conserves battery power. It only needs support from the MAC layer to identify link failures. [4]
Ad-hoc On-demand Multipath Distance Vector routing
AOMDV shares many characteristics with AODV. The main difference lies in the number of routes found in each route discovery. In AOMDV, the propagation of the RREQ from the source towards the destination establishes multiple reverse paths, both at intermediate nodes and at the destination. Multiple RREPs then traverse these reverse paths back to form multiple forward paths to the destination at the source and intermediate nodes.
We note that AOMDV also provides intermediate nodes with alternate paths, which are useful for reducing the route discovery frequency. [4]
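The AODV route discovery described above can be sketched with a simple flood model. This is an illustrative abstraction only (the nodes, links and helper name are hypothetical, and real AODV additionally maintains sequence numbers, timeouts and route tables), not the actual protocol implementation:

```python
from collections import deque

def aodv_route_discovery(links, source, dest):
    """Sketch of AODV route discovery: the source floods an RREQ;
    each node records the neighbor it first heard the RREQ from
    (the reverse path), and the destination sends an RREP back
    along that reverse path. 'links' maps node -> list of neighbors."""
    reverse = {source: None}          # reverse-path pointers set up by the RREQ flood
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == dest:
            break
        for nbr in links[node]:
            if nbr not in reverse:    # the first copy of the RREQ wins
                reverse[nbr] = node
                queue.append(nbr)
    if dest not in reverse:
        return None                   # RREQ lifespan expires; AODV would retry
    path = [dest]                     # RREP travels the reverse path to the source
    while reverse[path[-1]] is not None:
        path.append(reverse[path[-1]])
    return path[::-1]                 # forward route, source -> dest

links = {"S": ["A", "B"], "A": ["S", "C"], "B": ["S", "C"],
         "C": ["A", "B", "D"], "D": ["C"]}
print(aodv_route_discovery(links, "S", "D"))  # ['S', 'A', 'C', 'D']
```

AOMDV would differ here by keeping every loop-free reverse-path pointer instead of only the first, yielding multiple forward paths per discovery.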
IMPLEMENTATION
SVEF is used for the implementation. SVEF reproduces a distribution chain formed by three components: a streaming server, a middlebox and a receiver, all connected by an IP network. Figure 8 shows the structure of SVEF, with the interactions between individual tools and the data flows depicted as arrows; the software modules inherited from the JSVM package are represented in grey. The whole process, from the encoding of the original video source to the evaluation after streaming over a network, can be summarized in four steps:
1) A YUV video is encoded in H.264/SVC format by the JSVM encoder. The encoded video and its NALU trace are transferred to the streamer.
2) The encoded video is transmitted over the IP network by the streamer at a fixed frame rate.
3) If a middlebox is present, the video NALUs first enter a cross-layer scheduler and are then forwarded to the receiver (for example through a wireless link).
4) The receiver generates a trace of the received NALUs in real time. At the end of the streaming process, the received NALU trace is processed to produce a YUV file (the filtered-YUV video) characterized by missing frames due to transmission losses, unsatisfied decoding dependencies or excessive delay. The filtered-YUV video is then processed with a simple error concealment, obtaining a final-YUV video with the same number of frames as the original video. [5]
Figure 8 shows the implementation of the SVEF framework.
Figure 8: Implementation of the SVEF framework [5]
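The receiver-side filtering in step 4 can be sketched as follows. This is a simplified model under stated assumptions — the NALU fields, single-dimension layer dependency and the deadline check are illustrative, not SVEF's actual trace-file format or full SVC dependency handling:

```python
def filter_nalus(received, deadline_ms):
    """Sketch of the receiver-side filter: drop NALUs that arrived
    after their playout deadline, then drop NALUs whose decoding
    dependency (the next-lower layer of the same frame) was lost.
    Each NALU is a dict with frame, layer, and arrival delay in ms."""
    on_time = [n for n in received if n["delay_ms"] <= deadline_ms]
    kept = []
    decodable = set()                  # (frame, layer) pairs that survived
    for n in sorted(on_time, key=lambda n: (n["frame"], n["layer"])):
        # an enhancement NALU is decodable only if the layer below survived
        if n["layer"] == 0 or (n["frame"], n["layer"] - 1) in decodable:
            decodable.add((n["frame"], n["layer"]))
            kept.append(n)
    return kept

received = [
    {"frame": 0, "layer": 0, "delay_ms": 40},
    {"frame": 0, "layer": 1, "delay_ms": 60},
    {"frame": 1, "layer": 0, "delay_ms": 250},  # excessive delay: dropped
    {"frame": 1, "layer": 1, "delay_ms": 80},   # base layer lost: undecodable
]
print([(n["frame"], n["layer"]) for n in filter_nalus(received, 200)])  # [(0, 0), (0, 1)]
```

Frame 1 ends up missing entirely, which is exactly the kind of gap the subsequent error-concealment pass fills to keep the frame count equal to the original video's.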
CONCLUSION
The objective of this project is twofold: first, to integrate SVEF and NS-2 into the myevalsvc framework for the evaluation of H.264/SVC transmission in a simulated environment; second, to analyze the performance of routing protocols for SVC streaming over MANETs. Researchers who work on video coding can simulate the effects of a realistic network on video sequences produced by their coding schemes, while researchers who work on network technology can evaluate the impact of real video streams on a proposed network architecture or protocol. The evaluation starts from encoding the raw YUV video, parsing the video content, preparing the NS-2 traffic trace file, and performing the simulation. After the simulation, network-level performance metrics such as packet loss rate and end-to-end delay can be obtained with the aid of programs provided in myevalsvc. Moreover, the received video can be reconstructed by filtering out very late and undecodable NALUs and applying frame concealment. Lastly, the end-to-end application-level metric, PSNR, can be calculated by comparing the received final YUV video with the original raw YUV video. In addition, visual evaluation is possible with the help of a YUV viewer program. [7]
FUTURE WORK
Further work on this project can be done by implementing HEVC video sequences over IEEE 802.11 and 802.11e networks. Different routing protocols can also be tested for their performance.
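As a small illustration of the application-level metric mentioned in the conclusion, the frame-level PSNR computation can be sketched as follows. This is a generic sketch of the standard formula over 8-bit samples, not code taken from myevalsvc, and the sample values are hypothetical:

```python
import math

def psnr(original, received, max_val=255):
    """PSNR in dB between two equal-length 8-bit sample sequences
    (e.g. the luma plane of corresponding frames)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, received)) / len(original)
    if mse == 0:
        return float("inf")          # identical frames
    return 10 * math.log10(max_val ** 2 / mse)

orig = [52, 60, 61, 200, 128, 90]   # hypothetical original luma samples
recv = [50, 62, 61, 198, 130, 88]   # hypothetical received samples
print(round(psnr(orig, recv), 2))   # 42.9
```

In the actual evaluation this per-frame value is averaged over the whole sequence, comparing the final YUV video against the original raw YUV video.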
REFERENCES
[1] X. Lu, G. R. Martin, and X. Jin, "Performance comparison of the SVC, WSVC, and Motion JPEG 2000 advanced scalable video coding schemes," in Intelligent Signal Processing, vol. 1, London, Dec. 2013, pp. 1-6.
[2] A. Puri, L. Yan, and B. G. Haskell, "Temporal resolution scalable video coding," in Image Processing, vol. 2, Austin, TX, USA, Nov. 1994, pp. 947-951.
[3] H. Schwarz, D. Marpe, and T. Wiegand, "Overview of the scalable video coding extension of the H.264/AVC standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1103-1120, Sept. 2007.
[4] SVEF: Scalable Video-streaming Evaluation Framework. SVEF is a mixed online/offline open-source framework devised to evaluate the performance of H.264 SVC video streaming; it is written in C and Python and released under the GNU General Public License. [Online]. http://svef.netgroup.uniroma2.it/
[5] E. M. Royer and C.-K. Toh, "A review of current routing protocols for ad-hoc mobile wireless networks," IEEE Personal Communications, vol. 6, pp. 46-55, April 1999.
[6] I. Unanue et al., "A tutorial on H.264/SVC scalable video coding and its tradeoff between quality, coding efficiency and performance," in Recent Advances on Video Coding, J. Del Ser, Ed.: InTech, 5 July 2011, ch. 1, pp. 1-26. [Online]. http://www.intechopen.com/books/recent-advances-on-videocoding
[7] SVC extension of H.264/AVC, Fraunhofer HHI Image and Video Coding group. Gives an overview of the different SVC methods. [Online]. http://www.hhi.fraunhofer.de/de/kompetenzfelder/image-processing/researchgroups/image-video-coding/svc-extension-of-h264avc.html
[8] Tutorial for the network simulator, explaining how the network simulator works. [Online]. http://www.isi.edu/nsnam/ns/tutorial/
[9] C.-H. Ke. (2014) How to do H.264 SVC transmission simulations. This website
explains the simulation steps for the SVEF evaluation framework. [Online]. http://csie.nqu.edu.tw/smallko/ns2/svc.htm
[10] A. Detti et al., "SVEF: an open-source experimental evaluation framework for H.264 scalable video streaming," in Computers and Communications, vol. 5, Dec. 2009, pp. 45-50.
[11] C.-H. Ke, "myevalsvc: an integrated simulation framework for evaluation of H.264/SVC transmission," KSII Transactions on Internet and Information Systems, vol. 6, no. 1, pp. 379-394, Jan. 2012.
[12] N. I. Sarkar and R. McHaney, "Modeling and simulation of IEEE 802.11 WLAN: a case study of a network simulator," in Computer and Information Science Review, vol. 3, New Zealand, Sept. 2005, pp. 340-346.
[13] O. Ben Rhaiem and L. Chaari Fourati, "Routing protocols performance analysis for scalable video coding (SVC) transmission over mobile ad-hoc networks," in Signal and Image Processing Applications, vol. 3, Melaka, Oct. 2013, pp. 197-202.
[14] H. Yang and X. Jing, "Adaptive scalable video coding for wireless networks," in Microwave, Antenna, Propagation and EMC Technologies for Wireless Communications, vol. 4, Chengdu, Oct. 2013, pp. 496-499.