RDTC Optimized Streaming for Remote Browsing in Image-Based Scene Representations

Size: px
Start display at page:

Download "RDTC Optimized Streaming for Remote Browsing in Image-Based Scene Representations"

Transcription

1 TC Optimized Streaming for Remote Browsing in Image-Based Scene Representations Ingo Bauermann, Yang Peng and Eckehard Steinbach Institute of Communication Networks, Media Technology Group, Technische Universität München, Munich, Germany Abstract Remote navigation in compressed image-based scene representations requires random access to arbitrary parts of the reference image data to recompose virtual views. The degree of inter-frame dependencies exploited during compression has an impact on the effort needed to access reference images and delimits the rate distortion () trade-off that can be achieved. If, additionally, a given receiver hardware and a maximum available channel bitrate is taken into account, the traditional rate-distortion optimization is extended to an TC trade-off between rate (R), distortion (D), transmission data rate (T), and decoding complexity (C). In this work we introduce our TC optimized compression framework. In addition, an experimental testbed for streaming of these TC optimally compressed image-based scene representations is described and the impact of client side caching is investigated. Our results show a significant reduction in user perceived delay for TC optimized streams in such a remote browsing environment compared to optimized or independently encoded scene representations. *. Introduction Remote interactive viewing of photorealistic 3D scenes has many applications in virtual reality, gaming, virtual museums and E-commerce. Image-based scene representations like light fields [,2], concentric mosaics [3], panoramas, and others (see [4] for an overview) allow for a fast and easy acquisition and rendering compared to the traditional tedious and time consuming geometric modeling process. Based on sampling of the plenoptic function [5], the downside of image-based rendering approaches is the large amount of reference image data that has to be stored. With recent advances in image and video compression, efficient compression schemes for image-based scenes have emerged also (again, see e.g. [4]). This work was supported, in part, by Siemens AG. Compression Server (Source) Navigation Internet Partial view DSL WLAN UMTS Clients (Decoder) Figure. Considered scenario: An image-based scene representation is acquired, compressed, and stored on a server. Clients with different network access and computational resources are used to navigate in the acquired virtual scene. Rate-distortion optimization has been well studied for video compression in the recent years. But, while for video sequences sequential play out is dominant and therefore temporal dependency structures are known a priori, free real-time navigation in image-based scenes requires random access to arbitrary single frames or even image parts, and therefore does not allow for exploiting all dependencies that are present in image sequences. Additionally, in a remote navigation scenario and with heterogeneous computational capabilities of user devices and different bitrate access (see Figure ) there are strict requirements on the decoding complexity and transmission data rate. A low channel throughput of some few kilobits per second and at the same time a high computational power available at the client side allow for more dependencies to be exploited than a handheld device with the same available bitrate. In both scenarios a highly adapted stream would be preferable as, for instance, too many dependencies would force the handheld device to download too much data that has to be decoded with too much computational effort. Independent coding of reference images that are needed to generate virtual views might be the best solution here. In order to solve the challenges arising with such

2 heterogeneous systems a technique can be used that has been adopted from video coding: real-time scheduling (e.g. [7]). A priori information is used by the server to decide which part of a compressed representation should be sent to meet the scenario and the clients requirements. Unfortunately, for browsing in compressed image-based scene representations only little a priori information can be exploited as the motion trajectory and therefore the dependency structure within the reconstructed virtual views presented to the user are not known. Online scheduling would be very complex and so the most intuitive solution is to provide different, precoded datasets that are chosen according to the momentary system parameters like throughput or available computational resources. The goal of this work is to investigate a compression scheme that allows for maximizing reconstruction quality and minimizing user-perceived delay for a single stream representation taking into account the available computational resources at the decoder and the available channel throughput as well as the server side representation file size. Additionally, the impact of a client side cache is investigated. A streaming testbed is implemented to measure the performance of streams optimized subject to constraints on several parameters. The remainder of this paper is structured as follows. In Chapter 2 we discuss related work. In Chapter 3 we introduce TC optimization and the proposed coding framework. Chapter 4 discusses the influence of client side caching on the whole system while Chapter 5 describes the streaming testbed implemented. Chapter 6 shows experimental results followed by Chapter 7 concluding this paper. 2. Related Work There has been much work done on compression and streaming of image-based scenes. Vector quantization [2,3], MPEG-2 like encoding [6], geometry aided motion compensation [8], and many others (see [4] for an overview). All of them were trying to trade off random access complexity and coding efficiency, where coding efficiency is almost always measured in terms of server side file size rather than the actual number of bits transmitted per. Additionally, most of these schemes maximize PSNR subject to a rate constraint or a fixed dependency structure. Rate-distortion optimization for streaming of image-based scene representations even over error prone networks has been investigated in [9]. A major drawback of this scheme is that the moving trajectory is restricted to be on a hemisphere. This simplifies dependency structure exploitation but restricts the applicability to browsing of small objects rather than rooms or even outdoor scenes. The investigations on multiple representation coding for light fields in [] share the same restrictions. Both schemes show a very good performance but do not consider the client s computational resources at all. For multimedia content delivery a rate-distortioncomplexity (C) optimization framework has been proposed in []. Investigations on the C trade-off for a block-based coder for image-based scenes providing random access and evaluating operational rate-distortion-complexity curves have been done in [2]. In this paper, we introduce TC optimization [3] with a client side caching model and measure the performance of the TC streaming system using a testbed implementation. 3. TC Optimization Traditionally, image and video compression and streaming of video sequences have been studied within the rate-distortion theory framework. However, when encoding or decoding has to be done in real-time, algorithms have to be investigated with respect to computational complexity. Additionally, in the context of remote navigation a desired client-server round trip time allowing immersive user interaction severely constrains the dependency structure of the compressed data. 3.. Measures The measures used in this work to parameterize the TC space are defined as follows: - Rate (R). R is the mean number of bits required to store a s RGB values on the server (file size). - Distortion (D). D is defined as the mean of squared differences (MSE) of RGB intensity values measured using the difference between original intensity values and intensity values reconstructed from the compressed bitstream. - Transmission data rate (T). Transmission data rate T is defined as the mean number of bits that have to be signaled to completely decode a. The transmission data rate T is a measure for user-perceived delay in remote navigation and can be significantly larger than R as dependencies might have to be resolved. - Decoding complexity (C). The decoding complexity C of a given is the mean number of independently encoded s that have to be decoded to reconstruct the current completely. Though our definition of the decoding complexity is rather rough, the number of inverse DCT-transformations gives a good measure of the true complexity that the stream 2

3 implements. Investigations on the mapping from a specific hardware to the decoding complexity C is part of other work, e.g. [] The Coding Framework The basic building blocks of the codec investigated in this work are the discrete cosine transform (DCT) and motion compensated prediction (MCP) performed on 8x8 blocks. An example Group of Pictures (GOP) structure for a camera trajectory along a single path (e.g. an image sequence produced by a concentric mosaic setup) and some example block modes are shown in Figure 2. Although the presented concept applies to all image-based scene representations, we use concentric mosaics [3] in this paper as an example data representation. frame intra... Figure 2. D GOP structure with example block modes for an odd number (n) of images. The anchor frame is encoded independently. 3. Block Modes skip/inter intra anchor frame n /2 GOP of n images anchor-inter/ anchor-skip skip/inter The block modes used in our coding framework are adopted from [2] and based on common hybrid video codecs. TC measures for each block to be encoded are calculated as follows: R = [depends on content and q] [ D = [depends on content and q] T = R+ Dependencies + [signaling overhead] [ C = M + Dependencies [ ] The rate R and distortion D have to be determined by experiment and depend on the content of a specific block and on the quantization parameter q. The transmission data rate T differs from R as dependencies might have to be resolved (referenced blocks) and signaling overhead has to be taken into account. The decoding complexity C depends on the block mode decision (M= for, INTER, and ANCHOR-INTER, otherwise M=). Dependencies might be resolved which increases the decoding complexity of a specific block in a chosen mode. This recursive structure helps to control the amount of dependencies within a bitstream.... n 3.4. Lagrangian Optimization Once the rate, distortion, transmission data rate and decoding complexity of a given block in a given mode are known, the encoder can perform an TC optimization. While encoding, each block in each frame of the image data is encoded in all available modes and then the mode decision mechanism goes through each block and minimizes the Lagrangian cost function: J = D+λ R+ λ T +λ C () q 2 3 Here, λ, λ 2, λ 3 are parameters that are used to select the trade-off between rate R, distortion D, transmission data rate T, and decoding complexity C. The subscript q indicates that this cost is computed for a specific quantization parameter q. As illustrated in Figure 3, for a specific scenario this means that if the client s computational resources C max and the maximum available transmission data rate T max are not allowed to be exceeded then for a specific distortion D, a traditional optimized encoder would produce some uncontrolled TC trade-off minimizing the rate R (not shown in the Figure for easier presentation). Though the rate is minimal with respect to D the TC trade-off is not applicable because too many dependencies might have to be resolved for one random access to the bitstream. On the other hand independent encoding of all blocks () would not exploit all resources because available computational power is not used for decoding in this case. The lower bound for TC-trade-offs drawn in Figure 3 can be argued as follows: Exploiting redundancies between neighboring frames decreases the transmission data rate T but increases C as more and more dependencies are exploited. An experimentally determined curve for the TC lower bound will be shown in Chapter 6. Decoding Complexity C C max lower TC bound operational TC-points T max not applicable Transmission Data Rate T Not optimized (MSE = D ) Figure 3. Analysis of a common encoder in the TC space given a distortion D. T max and C max are maximum resources determined by the channel and the client hardware. Only uncontrolled TC trade-offs are produced by common coders. A lower TC bound can not be under-run for D. 3

4 Independent encoding of frames will lead to a feasible TC trade-off in a streaming scenario, though, wasting computational resources. Decoding Complexity C C max Figure 4. In order to optimally use the available resources, a coding scheme has to be adopted that allows for shifting the lower TC bound to intersect with both the maximum available transmission rate and the maximum allowed decoding complexity while achieving a minimum distortion D 2 < D of a reconstructed view. To optimally utilize the available resources the lower bound has to be moved so that it intersects with T max and C max while decreasing the distortion D as shown in Figure 4. For a minimum distortion D 2 < D the optimal working TC point can be reached. This is done by searching for a quadruple (q, λ, λ 2, λ 3 ) that - used as weights for a block-based minimization using () - produces a stream with the desired TC trade-off for a given scenario. 4. Caching For remote interactive navigation, caching parts of the bitstream or even of uncompressed reference blocks at the client side is very efficient. We distinguish three cases discussed in the following sections. 4.. No Cache If there is no cache present at the client side, TC optimization degrades to an T or C optimization as C and T are proportional in this case [3]. From the perspective of TC optimization there is no point in choosing something other than the -mode as otherwise dependencies have to be resolved Bitstream Caching T max Optimized (MSE = D 2 < D ) optimal TC-working-point lower TC bound not applicable Transmission Data Rate T If there is a cache present at the client side which holds the already transmitted and received bitstream, the TC optimization framework can be applied. If the bitstream of an infinite number of blocks can be cached and every block is requested exactly once as the user moves through the virtual environment, the mean transmission data rate is reduced compared to the case with no cache as no dependencies have to be considered (every block is transmitted exactly once). The same model applies if the mean recursion level for INTER- and SKIP-modes is less than or equal to the number of frames that can be cached. Figure 5 illustrates this for the case of maximum recursion level of 2. The TC measures in this case are updated as follows: R = [depends on content and q] [ D = [depends on content and q] T = R [ C = M + Dependencies [ ] Note that the decoding complexity is still computed as if there is no cache implemented because only the bitstream is cached. The resulting model is referred to as the B-Cache model in the remainder of this paper. time scene object dependencies S Figure 5. Caching of two frames and a mean recursion level of 2. A scene object moves through the blocks. No matter what access pattern is performed, if the cache can hold at least two frames, every block has to be transmitted only once. In the case of bitstream caching, the mean transmission data rate T does not depend on the access pattern (similar to video coding) while the momentary decoding complexity C and transmission data rate T for a single block access strongly depends on the access pattern. For domain caching both C and T are independent of the access pattern. 4. Pixel Domain Caching I previous frame (recursion level 2) previous frame (recursion level ) current frame If the transmitted and decoded images or blocks are cached at the client side, both the transmission data rate T and the decoding complexity C are reduced compared to the case with no cache or bitstream caching as fewer recursions in the computation of Lagrangian costs have to be considered. A simple example is given in Figure 5 to show how the domain caching influences the transmission data rate and decoding complexity for a specific bitstream. In our example the -block I is referenced exactly twice (in the previous frame with recursion level ). Its transmission data rate T I =R I and decoding complexity C I = are computed without any dependencies. Assuming S 2 S 3 S 4 4

5 SKIP-modes for all blocks except the -block in our example, we compute T S,S2,S3 =R I and C S,S2,S3 =C I =, T S4 =2 R I and C S4 =2 C I =2 (one for S and S 2 each). For a system with no cache the total decoding complexity and transmission data rate when decoding the two blocks in the current frame sum up to C sum =6 and T sum =6 R I. Coding all the blocks in -mode would be less expensive and therefore the optimal way of encoding for such a target system. When there is a cache at the client side with at least the capacity to store as many decoded RGB images as the mean number of recursions within the bitstream, then, whenever S 4 is requested by the client, S, S 2, and I are requested in turn. After transmission and decoding these blocks are available at the client side as long as they stay in cache. The same procedure would apply to all other dependent blocks. Now, assuming that each block is requested at least once during a decoding pass and each block is requested equally likely the mean decoding complexity and transmission data rate of each block reduce to C sum = and T sum = R I. The first block of this small group of blocks requested will have a decoding complexity of and a transmission data rate of R I. All succeeding requests do not force the server to transmit the -block anymore, vanishing the momentary decoding complexity and transmission data rate. An exact implementation of this model would mean to globally optimize the whole dataset using e.g. a markov random field model and is part of future work. In this work, to approximately reflect the behavior of the cache in an iterative frame-by-frame optimization, the TC measures are updated as follows: performs the TC optimized encoding of a dataset and provides a compressed single representation bitstream to the server. During operation of the testbed the user chooses a view using the keyboard for translational movement and the mouse for rotational movement. The chosen view is signaled by the client to the server which assembles the bitstream required to decode and display the requested view. The B-Index (for the client s bitstream-cache) and P-Index (for the client s -domain-cache tables) provide the information to the server that a block is already present in the client s cache or not. This prevents the server to transmit a compressed block twice if it is still present in the client s cache. A first-in-first-out replacement strategy is used in both caching systems. The assembled bitstream is transmitted to the client using the channel simulator. The client decodes the compressed data and its cache is updated. Finally the view is displayed to the user. Parameters that can be adjusted in the testbed include: Encoder: q, λ, λ 2, λ 3 ; Server: processing time (delay); Client: cache sizes; decoding capacity; Channel Simulator: maximum available bitrate, delay; Renderer: sample-rate, interpolation filter; The testbed can generate random views and rigid motion trajectories and also can capture motion performed by a user. These trajectories are then used to emulate user input to perform tests with varying system settings. R = [depends on content and q] [ D = [depends on content and q] T = R+ Dependencies [ KT C = M + Dependencies [ ] KC Here we weight the decoding complexity and transmission data rate dependencies using KT=KC=3 for a mean recursion level of approximately 2. In the example from Figure 5 C sum =2 and T sum =2 T I. In Chapter 6 we show results using this simplified model. This model will be referred to as P-Cache model in the remainder of this paper. 5. The Streaming Testbed To evaluate the performance of the proposed system a streaming testbed for concentric mosaics is implemented. The testbed consists of an encoder, a server and a client, a channel simulator, a renderer, and a caching system. Figure 6 illustrates the components of the system. The encoder Encoder Server B-Index P-Index compressed data blocks, columns, views,... Channel Simulator Renderer Client (decoder) B-Cache P-Cache Figure 6. Testbed for streaming of TC optimized streams. The encoder provides an TC optimized stream to the server. The client requests blocks, columns or views and decodes the assembled and transmitted compressed data. A caching system for RGB blocks (P-Cache) and the encoded bitstream (B-Cache) is implemented. The server uses index tables (B-Index and P-Index) to keep track of the clients cache state. A channel simulator allows to adjust the maximum available bitrate and a fixed delay. The client also 5

6 simulates the decoding capacity of a specific machine by limiting the decoding throughput. 6. Experimental Results In this section we evaluate the results of the TC optimization in the TC space using our caching model. We show results using an -only coded stream, an -optimized stream, and an TC optimized stream for remote navigation in a concentric mosaic dataset using our streaming testbed. The test dataset classroom used in our experiments consists of a normal concentric mosaic captured with a camera radius of meters. 525 frames at CIF resolution with a FOV of approximately 4 degrees are captured. Figure 7 shows an example frame of the dataset which is partitioned into GOPs of size 25 frames with an anchor frame (I-frame) in the middle (see Figure 2). Figure 7. One frame of the test sequence classroom. The concentric mosaic consists of 525 frames in CIF resolution. 6.. TC Analysis Actual TC trade-offs for different streams are shown in Figure 8. For a specific distortion D (PSNR=4dB), the resulting transmission data rate T and decoding complexity C, as well as rate R of different optimization strategies are illustrated in the TC space. 6.. TC Analysis Using the B-Cache Model While the traditional optimization gives the lowest rate and transmission data rate (assuming an infinitely large bitstream cache), its TC trade-off might not be applicable in some remote scenarios as the decoding complexity is too high. Same applies for -only encoded streams that might not be optimal when high computational power is available. Two TC trade-offs of TC-optimized streams are shown in Figure 8. TC 2 has a complexity of.ppp which is almost as low as for -only encoded streams but also shows rate savings of 3%. For TC there is a reduction of decoding complexity of a factor of 2 compared to optimized streams while the rate increases only about 2% TC Analysis Using the P-Cache Model Using the P-Cache model to analyze an TC optimized stream gives a decoding complexity of ppp for the optimized stream. Note that this model only holds when every block is decoded at least once and an infinitely large cache is used. For the TC stream a transmission data rate of.4bpp and a decoding complexity of.5ppp are determined by the P-Cache model. This working point will be used in Chapter 6.2. as we found it to reflect a feasible trade off for adaptation to random view requests (measured more accurately by the B-Cache Model with.8ppp) and dependent view requests like translational movement and rotation (modeled more accurately by the P-Cache Model) performed by a human operator navigating through a virtual environment. C [/] P-Cache Model.4. B-Cache Model Figure 8. Analysis in the TC space given a fixed distortion D (PSNR=4dB). For systems with no cache is the optimal choice. However, optimized streams produce the lowest rate R. The TC optimization proposed here allows for TC trade-offs between -only encoded bitstreams and an optimization for a better random access performance System Performance R=2bpp TC R=8bpp.5 TC 2 R=.7bpp R=.bpp..5 T [bit/] We investigate the performance of our streaming system with respect to different modes of movement and the available domain cache size. The bitstream cache is switched off. For our experiments with the streaming testbed the -only encoded and -optimized streams will be used as comparison to the TC optimized stream TC in Figure 8. A target system capable of decoding million s per second (Mpps) and a maximum channel bitrate of Mbps is chosen accordingly. This target system is approximately equivalent to a Pentium III 8MHz with DSLMbps Internet connection in the actual state of implementation. Four different scenarios are defined, and several motion trajectories for each of them are generated and used for the 6

7 performance evaluation. The four modes of motion are: Random Views, Rotation, Translation, and trajectories that would appear in a first person 3D game. The random views trajectory consists of 5 random positions and random viewing directions within the concentric mosaic testset. Five of these trajectories are generated. The mean delay introduced by interactively transmitting these 25 views is measured before we change the domain cache size and re-run the experiment. Figure 9 shows the result. Note that performs best when the domain cache is switched off. Already with a small cache TC optimized streaming outperforms encoding (with only a third of the overall file size). optimized encoding has always the worst performance. The same kind of experiments are performed for the remaining modes of motion and the results are shown in Figures -2. Delay [ms] Figure 9. Mean user-perceived delay for random views as a function of the domain cache size at 4dB PSNR, a maximum bitrate of Mbps, and a maximum available decoding complexity of Mpps. With almost no cache available performs best as expected. Already with a small cache size the TC optimized stream outperforms coding. 6. Impact of the Caching System TC [ b ytes ] 3 3 [ images ] With a cache size of about.2 times the image size a significant gain in terms of reduced delay can be observed for and TC. This is the cache size where the first blocks within the decoding process of a column can be reused. There is another significant gain when the cache size is larger than 3-4 images. This is when blocks between neighboring virtual views are shared via caching. Note that approximately the number of blocks equivalent to 3 images have to be decoded to display one virtual view. The reason for this is that in our concentric mosaic streaming system single columns are used for view interpolation but only blocks can be requested and transmitted. This overhead has a direct impact on the system performance. For a domain cache size of about MB or 3 images there is no decrease of the user-perceived delay anymore. A mean delay of less than 2ms for a gamer trajectory can be achieved. TC and share the same gains over (about 9ms). Looking at the worst case delay for the gamer trajectory (mainly due to an empty cache before streaming the first view of each trajectory) in Figure 3 we can see that TC optimized streams clearly outperform and encoding even for approximately the same mean delay for all three streams (compare to Figure 2). Finally Figure 4 shows the mean delay as a function of the maximum available channel bitrate and the maximum allowable decoding complexity of a system using a MB domain cache and a MB bitstream cache. For widely used machines (e.g. Pentium IV 2GHz) and bit rates of about.2mbps a delay less than ms introduced by the system can be achieved with our testset at a PSNR of 4dB (calculated in the RGB-space) [ bytes ] 3 3 [ images] Figure. Mean user-perceived delay for pure rotation as a function of the domain cache size. Delay [ms] Delay [ ms ] [ bytes ] 3 3 [ images ] Figure. Mean user-perceived delay for pure translation as a function of the domain cache size. Note the significant gains of and TC optimal streams compared to an coded stream for larger cache sizes. Delay [ ms ] TC TC TC [bytes] 3 3 [images] Figure 2. Mean user-perceived delay for a trajectory consisting of translation and rotation as it can be found in first person 3D games. 7

8 8. References Delay [ms] [] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The Lumigraph. in Proc. SIGGRAPH 96, pp , 996. [2] M. Levoy and P. Hanrahan. Light Field Rendering. in Proc. SIGGRAPH 96, pp. 3 42, 996. [3] H.-Y. Shum, L.-W. He. Rendering with Concentric Mosaics. ACM SIGGRAPH 99, Los Angeles, CA, Aug. 999, pp [4] H.-Y. Shum, S.B. Kang, and S.-C. Chan. Survey of Image-Based Representations and Compression Techniques. IEEE Transactions on Circuits and Systems for Video Technology, pp. 2 37, Volume: 3, Issue:, Nov. 23. [5] E. H. Adelson and J. Bergen. The Plenoptic Function and the Elements of Early Vision. Computational Models of Visual Processing, pp. 3 2, MIT Press, Cambridge, MA, 99. [6] C. Zhang and J. Li. On the Compression and Streaming of Concentric Mosaic Data for Free Wondering in a Realistic Environment over the Internet. IEEE Trans. on Multimedia. pp. 7-82, Volume:7 Issue: 6, Dec. 25 [7] P. Ramanathan and B. Girod. Receiver-Driven Rate-Distortion Optimized Streaming of Light Fields. IEEE International Conference on Image Processing, ICIP 25, Genoa, Italy, September 25. [8] M. Magnor, P. Ramanathan, and B. Girod. Multiview Coding for Image-Based Rendering using 3-D Scene Geometry. IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no., pp. 92-6, Nov. 23. [9] P. Ramanathan, M. Kalman, and B. Girod. Rate-Distortion Optimized Interactive Light Field Streaming. To appear in IEEE Transactions on Multimedia. [] P. Ramanathan and B. Girod. Random Access for Compressed Light Fields Using Multiple Representations. Proc. IEEE International Workshop on Multimedia Signal Processing, MMSP 24, Siena, Italy, September 24. [] M. van der Schaar and Y. Andreopoulos. Rate-Distortion-Complexity Modeling for Network and Receiver Aware Adaptation. IEEE Transactions on Multimedia, vol. 7, no. 3, pp , Jun. 25. [2] E. Vutukuri, I. Bauermann, and E. Steinbach. Decoding Complexity-Constrained Rate-Distortion Optimization for the Compression of Concentric Mosaics. Picture Coding Symposium, PCS 24, San Francisco, CA, December 24. [3] I. Bauermann, Y. Peng, and E. Steinbach. Receiver- and Channel-adaptive Compression for Remote Browsing of Image-Based Scene Representations. Picture Coding Symposium, PCS 26, Beijing, China, April TC [bytes ] 3 [ images] Figure 3. Maximum user-perceived delay for a trajectory consisting of translation and rotation as it can be found in first person 3D games. Delay [ms] Max. Decoding Complexity [kpps] Max. Available Bitrate [ kbps] Figure 4. Rate-Complexity-Delay performance of the proposed coding scheme using a MB cache in the and bitstream domain. 7. Conclusion In this paper we investigated a rate, distortion, transmission data rate, decoding complexity (TC) optimization for the encoding of image-based scene representations. A streaming testbed is implemented and used to measure the impact of a caching system on the overall performance. Results on TC optimized compression of concentric mosaics and domain caching for streaming of concentric mosaics show that an adaptation to both the client computational resources and the available transmission data rate can be used to significantly decrease the user perceived delay in a remote navigation scenario. 8

DECODING COMPLEXITY CONSTRAINED RATE-DISTORTION OPTIMIZATION FOR THE COMPRESSION OF CONCENTRIC MOSAICS

DECODING COMPLEXITY CONSTRAINED RATE-DISTORTION OPTIMIZATION FOR THE COMPRESSION OF CONCENTRIC MOSAICS DECODING COMPLEXITY CONSTRAINED RATE-DISTORTION OPTIMIZATION FOR THE COMPRESSION OF CONCENTRIC MOSAICS Eswar Kalyan Vutukuri, Ingo Bauerman and Eckehard Steinbach Institute of Communication Networks, Media

More information

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations Prashant Ramanathan and Bernd Girod Department of Electrical Engineering Stanford University Stanford CA 945

More information

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations Prashant Ramanathan and Bernd Girod Department of Electrical Engineering Stanford University Stanford CA 945

More information

View Synthesis for Multiview Video Compression

View Synthesis for Multiview Video Compression View Synthesis for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, and Anthony Vetro email:{martinian,jxin,avetro}@merl.com, behrens@tnt.uni-hannover.de Mitsubishi Electric Research

More information

Reconstruction PSNR [db]

Reconstruction PSNR [db] Proc. Vision, Modeling, and Visualization VMV-2000 Saarbrücken, Germany, pp. 199-203, November 2000 Progressive Compression and Rendering of Light Fields Marcus Magnor, Andreas Endmann Telecommunications

More information

View Synthesis for Multiview Video Compression

View Synthesis for Multiview Video Compression MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com View Synthesis for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, and Anthony Vetro TR2006-035 April 2006 Abstract

More information

Reduced Frame Quantization in Video Coding

Reduced Frame Quantization in Video Coding Reduced Frame Quantization in Video Coding Tuukka Toivonen and Janne Heikkilä Machine Vision Group Infotech Oulu and Department of Electrical and Information Engineering P. O. Box 500, FIN-900 University

More information

Joint Tracking and Multiview Video Compression

Joint Tracking and Multiview Video Compression Joint Tracking and Multiview Video Compression Cha Zhang and Dinei Florêncio Communication and Collaborations Systems Group Microsoft Research, Redmond, WA, USA 98052 {chazhang,dinei}@microsoft.com ABSTRACT

More information

Compression of Lumigraph with Multiple Reference Frame (MRF) Prediction and Just-in-time Rendering

Compression of Lumigraph with Multiple Reference Frame (MRF) Prediction and Just-in-time Rendering Compression of Lumigraph with Multiple Reference Frame (MRF) Prediction and Just-in-time Rendering Cha Zhang * and Jin Li Dept. of Electronic Engineering, Tsinghua University, Beijing 100084, China Microsoft

More information

Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction

Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction Yongying Gao and Hayder Radha Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48823 email:

More information

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

International Journal of Emerging Technology and Advanced Engineering Website:   (ISSN , Volume 2, Issue 4, April 2012) A Technical Analysis Towards Digital Video Compression Rutika Joshi 1, Rajesh Rai 2, Rajesh Nema 3 1 Student, Electronics and Communication Department, NIIST College, Bhopal, 2,3 Prof., Electronics and

More information

Rate Distortion Optimization in Video Compression

Rate Distortion Optimization in Video Compression Rate Distortion Optimization in Video Compression Xue Tu Dept. of Electrical and Computer Engineering State University of New York at Stony Brook 1. Introduction From Shannon s classic rate distortion

More information

Extensions of H.264/AVC for Multiview Video Compression

Extensions of H.264/AVC for Multiview Video Compression MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Extensions of H.264/AVC for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, Anthony Vetro, Huifang Sun TR2006-048 June

More information

JPEG 2000 vs. JPEG in MPEG Encoding

JPEG 2000 vs. JPEG in MPEG Encoding JPEG 2000 vs. JPEG in MPEG Encoding V.G. Ruiz, M.F. López, I. García and E.M.T. Hendrix Dept. Computer Architecture and Electronics University of Almería. 04120 Almería. Spain. E-mail: vruiz@ual.es, mflopez@ace.ual.es,

More information

Dimensional Imaging IWSNHC3DI'99, Santorini, Greece, September SYNTHETIC HYBRID OR NATURAL FIT?

Dimensional Imaging IWSNHC3DI'99, Santorini, Greece, September SYNTHETIC HYBRID OR NATURAL FIT? International Workshop on Synthetic Natural Hybrid Coding and Three Dimensional Imaging IWSNHC3DI'99, Santorini, Greece, September 1999. 3-D IMAGING AND COMPRESSION { SYNTHETIC HYBRID OR NATURAL FIT? Bernd

More information

A Review of Image- based Rendering Techniques Nisha 1, Vijaya Goel 2 1 Department of computer science, University of Delhi, Delhi, India

A Review of Image- based Rendering Techniques Nisha 1, Vijaya Goel 2 1 Department of computer science, University of Delhi, Delhi, India A Review of Image- based Rendering Techniques Nisha 1, Vijaya Goel 2 1 Department of computer science, University of Delhi, Delhi, India Keshav Mahavidyalaya, University of Delhi, Delhi, India Abstract

More information

The Plenoptic videos: Capturing, Rendering and Compression. Chan, SC; Ng, KT; Gan, ZF; Chan, KL; Shum, HY.

The Plenoptic videos: Capturing, Rendering and Compression. Chan, SC; Ng, KT; Gan, ZF; Chan, KL; Shum, HY. Title The Plenoptic videos: Capturing, Rendering and Compression Author(s) Chan, SC; Ng, KT; Gan, ZF; Chan, KL; Shum, HY Citation IEEE International Symposium on Circuits and Systems Proceedings, Vancouver,

More information

New Techniques for Improved Video Coding

New Techniques for Improved Video Coding New Techniques for Improved Video Coding Thomas Wiegand Fraunhofer Institute for Telecommunications Heinrich Hertz Institute Berlin, Germany wiegand@hhi.de Outline Inter-frame Encoder Optimization Texture

More information

Mesh Based Interpolative Coding (MBIC)

Mesh Based Interpolative Coding (MBIC) Mesh Based Interpolative Coding (MBIC) Eckhart Baum, Joachim Speidel Institut für Nachrichtenübertragung, University of Stuttgart An alternative method to H.6 encoding of moving images at bit rates below

More information

Real-time Generation and Presentation of View-dependent Binocular Stereo Images Using a Sequence of Omnidirectional Images

Real-time Generation and Presentation of View-dependent Binocular Stereo Images Using a Sequence of Omnidirectional Images Real-time Generation and Presentation of View-dependent Binocular Stereo Images Using a Sequence of Omnidirectional Images Abstract This paper presents a new method to generate and present arbitrarily

More information

But, vision technology falls short. and so does graphics. Image Based Rendering. Ray. Constant radiance. time is fixed. 3D position 2D direction

But, vision technology falls short. and so does graphics. Image Based Rendering. Ray. Constant radiance. time is fixed. 3D position 2D direction Computer Graphics -based rendering Output Michael F. Cohen Microsoft Research Synthetic Camera Model Computer Vision Combined Output Output Model Real Scene Synthetic Camera Model Real Cameras Real Scene

More information

Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding

Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding 2009 11th IEEE International Symposium on Multimedia Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding Ghazaleh R. Esmaili and Pamela C. Cosman Department of Electrical and

More information

IMPROVED CONTEXT-ADAPTIVE ARITHMETIC CODING IN H.264/AVC

IMPROVED CONTEXT-ADAPTIVE ARITHMETIC CODING IN H.264/AVC 17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 IMPROVED CONTEXT-ADAPTIVE ARITHMETIC CODING IN H.264/AVC Damian Karwowski, Marek Domański Poznań University

More information

Digital Video Processing

Digital Video Processing Video signal is basically any sequence of time varying images. In a digital video, the picture information is digitized both spatially and temporally and the resultant pixel intensities are quantized.

More information

Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform

Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform Torsten Palfner, Alexander Mali and Erika Müller Institute of Telecommunications and Information Technology, University of

More information

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding.

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Project Title: Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Midterm Report CS 584 Multimedia Communications Submitted by: Syed Jawwad Bukhari 2004-03-0028 About

More information

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

2014 Summer School on MPEG/VCEG Video. Video Coding Concept 2014 Summer School on MPEG/VCEG Video 1 Video Coding Concept Outline 2 Introduction Capture and representation of digital video Fundamentals of video coding Summary Outline 3 Introduction Capture and representation

More information

VIDEO COMPRESSION STANDARDS

VIDEO COMPRESSION STANDARDS VIDEO COMPRESSION STANDARDS Family of standards: the evolution of the coding model state of the art (and implementation technology support): H.261: videoconference x64 (1988) MPEG-1: CD storage (up to

More information

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE 5359 Gaurav Hansda 1000721849 gaurav.hansda@mavs.uta.edu Outline Introduction to H.264 Current algorithms for

More information

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS Xie Li and Wenjun Zhang Institute of Image Communication and Information Processing, Shanghai Jiaotong

More information

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC

SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC SINGLE PASS DEPENDENT BIT ALLOCATION FOR SPATIAL SCALABILITY CODING OF H.264/SVC Randa Atta, Rehab F. Abdel-Kader, and Amera Abd-AlRahem Electrical Engineering Department, Faculty of Engineering, Port

More information

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS Television services in Europe currently broadcast video at a frame rate of 25 Hz. Each frame consists of two interlaced fields, giving a field rate of 50

More information

Compression of Stereo Images using a Huffman-Zip Scheme

Compression of Stereo Images using a Huffman-Zip Scheme Compression of Stereo Images using a Huffman-Zip Scheme John Hamann, Vickey Yeh Department of Electrical Engineering, Stanford University Stanford, CA 94304 jhamann@stanford.edu, vickey@stanford.edu Abstract

More information

CODING METHOD FOR EMBEDDING AUDIO IN VIDEO STREAM. Harri Sorokin, Jari Koivusaari, Moncef Gabbouj, and Jarmo Takala

CODING METHOD FOR EMBEDDING AUDIO IN VIDEO STREAM. Harri Sorokin, Jari Koivusaari, Moncef Gabbouj, and Jarmo Takala CODING METHOD FOR EMBEDDING AUDIO IN VIDEO STREAM Harri Sorokin, Jari Koivusaari, Moncef Gabbouj, and Jarmo Takala Tampere University of Technology Korkeakoulunkatu 1, 720 Tampere, Finland ABSTRACT In

More information

FAST MOTION ESTIMATION WITH DUAL SEARCH WINDOW FOR STEREO 3D VIDEO ENCODING

FAST MOTION ESTIMATION WITH DUAL SEARCH WINDOW FOR STEREO 3D VIDEO ENCODING FAST MOTION ESTIMATION WITH DUAL SEARCH WINDOW FOR STEREO 3D VIDEO ENCODING 1 Michal Joachimiak, 2 Kemal Ugur 1 Dept. of Signal Processing, Tampere University of Technology, Tampere, Finland 2 Jani Lainema,

More information

VIDEO streaming applications over the Internet are gaining. Brief Papers

VIDEO streaming applications over the Internet are gaining. Brief Papers 412 IEEE TRANSACTIONS ON BROADCASTING, VOL. 54, NO. 3, SEPTEMBER 2008 Brief Papers Redundancy Reduction Technique for Dual-Bitstream MPEG Video Streaming With VCR Functionalities Tak-Piu Ip, Yui-Lam Chan,

More information

Pre- and Post-Processing for Video Compression

Pre- and Post-Processing for Video Compression Whitepaper submitted to Mozilla Research Pre- and Post-Processing for Video Compression Aggelos K. Katsaggelos AT&T Professor Department of Electrical Engineering and Computer Science Northwestern University

More information

Video Compression Method for On-Board Systems of Construction Robots

Video Compression Method for On-Board Systems of Construction Robots Video Compression Method for On-Board Systems of Construction Robots Andrei Petukhov, Michael Rachkov Moscow State Industrial University Department of Automatics, Informatics and Control Systems ul. Avtozavodskaya,

More information

View Generation for Free Viewpoint Video System

View Generation for Free Viewpoint Video System View Generation for Free Viewpoint Video System Gangyi JIANG 1, Liangzhong FAN 2, Mei YU 1, Feng Shao 1 1 Faculty of Information Science and Engineering, Ningbo University, Ningbo, 315211, China 2 Ningbo

More information

A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm

A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm International Journal of Engineering Research and General Science Volume 3, Issue 4, July-August, 15 ISSN 91-2730 A Image Comparative Study using DCT, Fast Fourier, Wavelet Transforms and Huffman Algorithm

More information

FRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS

FRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS FRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS Yen-Kuang Chen 1, Anthony Vetro 2, Huifang Sun 3, and S. Y. Kung 4 Intel Corp. 1, Mitsubishi Electric ITA 2 3, and Princeton University 1

More information

Advanced Video Coding: The new H.264 video compression standard

Advanced Video Coding: The new H.264 video compression standard Advanced Video Coding: The new H.264 video compression standard August 2003 1. Introduction Video compression ( video coding ), the process of compressing moving images to save storage space and transmission

More information

Compression of Light Field Images using Projective 2-D Warping method and Block matching

Compression of Light Field Images using Projective 2-D Warping method and Block matching Compression of Light Field Images using Projective 2-D Warping method and Block matching A project Report for EE 398A Anand Kamat Tarcar Electrical Engineering Stanford University, CA (anandkt@stanford.edu)

More information

ADAPTIVE PICTURE SLICING FOR DISTORTION-BASED CLASSIFICATION OF VIDEO PACKETS

ADAPTIVE PICTURE SLICING FOR DISTORTION-BASED CLASSIFICATION OF VIDEO PACKETS ADAPTIVE PICTURE SLICING FOR DISTORTION-BASED CLASSIFICATION OF VIDEO PACKETS E. Masala, D. Quaglia, J.C. De Martin Λ Dipartimento di Automatica e Informatica/ Λ IRITI-CNR Politecnico di Torino, Italy

More information

Depth Estimation for View Synthesis in Multiview Video Coding

Depth Estimation for View Synthesis in Multiview Video Coding MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Depth Estimation for View Synthesis in Multiview Video Coding Serdar Ince, Emin Martinian, Sehoon Yea, Anthony Vetro TR2007-025 June 2007 Abstract

More information

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework System Modeling and Implementation of MPEG-4 Encoder under Fine-Granular-Scalability Framework Literature Survey Embedded Software Systems Prof. B. L. Evans by Wei Li and Zhenxun Xiao March 25, 2002 Abstract

More information

Capturing and View-Dependent Rendering of Billboard Models

Capturing and View-Dependent Rendering of Billboard Models Capturing and View-Dependent Rendering of Billboard Models Oliver Le, Anusheel Bhushan, Pablo Diaz-Gutierrez and M. Gopi Computer Graphics Lab University of California, Irvine Abstract. In this paper,

More information

Homogeneous Transcoding of HEVC for bit rate reduction

Homogeneous Transcoding of HEVC for bit rate reduction Homogeneous of HEVC for bit rate reduction Ninad Gorey Dept. of Electrical Engineering University of Texas at Arlington Arlington 7619, United States ninad.gorey@mavs.uta.edu Dr. K. R. Rao Fellow, IEEE

More information

Image and Video Coding I: Fundamentals

Image and Video Coding I: Fundamentals Image and Video Coding I: Fundamentals Thomas Wiegand Technische Universität Berlin T. Wiegand (TU Berlin) Image and Video Coding Organization Vorlesung: Donnerstag 10:15-11:45 Raum EN-368 Material: http://www.ic.tu-berlin.de/menue/studium_und_lehre/

More information

Bit Allocation for Spatial Scalability in H.264/SVC

Bit Allocation for Spatial Scalability in H.264/SVC Bit Allocation for Spatial Scalability in H.264/SVC Jiaying Liu 1, Yongjin Cho 2, Zongming Guo 3, C.-C. Jay Kuo 4 Institute of Computer Science and Technology, Peking University, Beijing, P.R. China 100871

More information

Fast Mode Decision for H.264/AVC Using Mode Prediction

Fast Mode Decision for H.264/AVC Using Mode Prediction Fast Mode Decision for H.264/AVC Using Mode Prediction Song-Hak Ri and Joern Ostermann Institut fuer Informationsverarbeitung, Appelstr 9A, D-30167 Hannover, Germany ri@tnt.uni-hannover.de ostermann@tnt.uni-hannover.de

More information

Context based optimal shape coding

Context based optimal shape coding IEEE Signal Processing Society 1999 Workshop on Multimedia Signal Processing September 13-15, 1999, Copenhagen, Denmark Electronic Proceedings 1999 IEEE Context based optimal shape coding Gerry Melnikov,

More information

Quality versus Intelligibility: Evaluating the Coding Trade-offs for American Sign Language Video

Quality versus Intelligibility: Evaluating the Coding Trade-offs for American Sign Language Video Quality versus Intelligibility: Evaluating the Coding Trade-offs for American Sign Language Video Frank Ciaramello, Jung Ko, Sheila Hemami School of Electrical and Computer Engineering Cornell University,

More information

In the name of Allah. the compassionate, the merciful

In the name of Allah. the compassionate, the merciful In the name of Allah the compassionate, the merciful Digital Video Systems S. Kasaei Room: CE 315 Department of Computer Engineering Sharif University of Technology E-Mail: skasaei@sharif.edu Webpage:

More information

Distributed Rate Allocation for Video Streaming over Wireless Networks. Wireless Home Video Networking

Distributed Rate Allocation for Video Streaming over Wireless Networks. Wireless Home Video Networking Ph.D. Oral Defense Distributed Rate Allocation for Video Streaming over Wireless Networks Xiaoqing Zhu Tuesday, June, 8 Information Systems Laboratory Stanford University Wireless Home Video Networking

More information

A 3-D Virtual SPIHT for Scalable Very Low Bit-Rate Embedded Video Compression

A 3-D Virtual SPIHT for Scalable Very Low Bit-Rate Embedded Video Compression A 3-D Virtual SPIHT for Scalable Very Low Bit-Rate Embedded Video Compression Habibollah Danyali and Alfred Mertins University of Wollongong School of Electrical, Computer and Telecommunications Engineering

More information

Data Hiding in Video

Data Hiding in Video Data Hiding in Video J. J. Chae and B. S. Manjunath Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 9316-956 Email: chaejj, manj@iplab.ece.ucsb.edu Abstract

More information

View Synthesis Prediction for Rate-Overhead Reduction in FTV

View Synthesis Prediction for Rate-Overhead Reduction in FTV MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com View Synthesis Prediction for Rate-Overhead Reduction in FTV Sehoon Yea, Anthony Vetro TR2008-016 June 2008 Abstract This paper proposes the

More information

Model-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding

Model-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding 344 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 3, APRIL 2000 Model-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding Peter

More information

Reference Stream Selection for Multiple Depth Stream Encoding

Reference Stream Selection for Multiple Depth Stream Encoding Reference Stream Selection for Multiple Depth Stream Encoding Sang-Uok Kum Ketan Mayer-Patel kumsu@cs.unc.edu kmp@cs.unc.edu University of North Carolina at Chapel Hill CB #3175, Sitterson Hall Chapel

More information

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames Ki-Kit Lai, Yui-Lam Chan, and Wan-Chi Siu Centre for Signal Processing Department of Electronic and Information Engineering

More information

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC) STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC) EE 5359-Multimedia Processing Spring 2012 Dr. K.R Rao By: Sumedha Phatak(1000731131) OBJECTIVE A study, implementation and comparison

More information

Motion Estimation for Video Coding Standards

Motion Estimation for Video Coding Standards Motion Estimation for Video Coding Standards Prof. Ja-Ling Wu Department of Computer Science and Information Engineering National Taiwan University Introduction of Motion Estimation The goal of video compression

More information

Modified SPIHT Image Coder For Wireless Communication

Modified SPIHT Image Coder For Wireless Communication Modified SPIHT Image Coder For Wireless Communication M. B. I. REAZ, M. AKTER, F. MOHD-YASIN Faculty of Engineering Multimedia University 63100 Cyberjaya, Selangor Malaysia Abstract: - The Set Partitioning

More information

Week 14. Video Compression. Ref: Fundamentals of Multimedia

Week 14. Video Compression. Ref: Fundamentals of Multimedia Week 14 Video Compression Ref: Fundamentals of Multimedia Last lecture review Prediction from the previous frame is called forward prediction Prediction from the next frame is called forward prediction

More information

*Faculty of Electrical Engineering, University of Belgrade, Serbia and Montenegro ** Technical University of Lodz, Institute of Electronics, Poland

*Faculty of Electrical Engineering, University of Belgrade, Serbia and Montenegro ** Technical University of Lodz, Institute of Electronics, Poland Terrain-adaptive infrared line-scan coding: A new image segmentation scheme by D. Milovanovic *, A. Marincic * and B. Wiecek ** *Faculty of Electrical Engineering, University of Belgrade, Serbia and Montenegro

More information

Comparison of Shaping and Buffering for Video Transmission

Comparison of Shaping and Buffering for Video Transmission Comparison of Shaping and Buffering for Video Transmission György Dán and Viktória Fodor Royal Institute of Technology, Department of Microelectronics and Information Technology P.O.Box Electrum 229, SE-16440

More information

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING Dieison Silveira, Guilherme Povala,

More information

CS 260: Seminar in Computer Science: Multimedia Networking

CS 260: Seminar in Computer Science: Multimedia Networking CS 260: Seminar in Computer Science: Multimedia Networking Jiasi Chen Lectures: MWF 4:10-5pm in CHASS http://www.cs.ucr.edu/~jiasi/teaching/cs260_spring17/ Multimedia is User perception Content creation

More information

Receiver-based adaptation mechanisms for real-time media delivery. Outline

Receiver-based adaptation mechanisms for real-time media delivery. Outline Receiver-based adaptation mechanisms for real-time media delivery Prof. Dr.-Ing. Eckehard Steinbach Institute of Communication Networks Media Technology Group Technische Universität München Steinbach@ei.tum.de

More information

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 15, NO. 4, APRIL

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 15, NO. 4, APRIL IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 15, NO. 4, APRIL 2006 793 Light Field Compression Using Disparity-Compensated Lifting and Shape Adaptation Chuo-Ling Chang, Xiaoqing Zhu, Prashant Ramanathan,

More information

3-D Shape Reconstruction from Light Fields Using Voxel Back-Projection

3-D Shape Reconstruction from Light Fields Using Voxel Back-Projection 3-D Shape Reconstruction from Light Fields Using Voxel Back-Projection Peter Eisert, Eckehard Steinbach, and Bernd Girod Telecommunications Laboratory, University of Erlangen-Nuremberg Cauerstrasse 7,

More information

Video Quality Analysis for H.264 Based on Human Visual System

Video Quality Analysis for H.264 Based on Human Visual System IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021 ISSN (p): 2278-8719 Vol. 04 Issue 08 (August. 2014) V4 PP 01-07 www.iosrjen.org Subrahmanyam.Ch 1 Dr.D.Venkata Rao 2 Dr.N.Usha Rani 3 1 (Research

More information

Module 7 VIDEO CODING AND MOTION ESTIMATION

Module 7 VIDEO CODING AND MOTION ESTIMATION Module 7 VIDEO CODING AND MOTION ESTIMATION Lesson 20 Basic Building Blocks & Temporal Redundancy Instructional Objectives At the end of this lesson, the students should be able to: 1. Name at least five

More information

A Hybrid Temporal-SNR Fine-Granular Scalability for Internet Video

A Hybrid Temporal-SNR Fine-Granular Scalability for Internet Video 318 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 3, MARCH 2001 A Hybrid Temporal-SNR Fine-Granular Scalability for Internet Video Mihaela van der Schaar, Member, IEEE, and

More information

Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments

Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments Nikos Komodakis and Georgios Tziritas Computer Science Department, University of Crete E-mails: {komod,

More information

Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain

Efficient MPEG-2 to H.264/AVC Intra Transcoding in Transform-domain MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Efficient MPEG- to H.64/AVC Transcoding in Transform-domain Yeping Su, Jun Xin, Anthony Vetro, Huifang Sun TR005-039 May 005 Abstract In this

More information

Decoded. Frame. Decoded. Frame. Warped. Frame. Warped. Frame. current frame

Decoded. Frame. Decoded. Frame. Warped. Frame. Warped. Frame. current frame Wiegand, Steinbach, Girod: Multi-Frame Affine Motion-Compensated Prediction for Video Compression, DRAFT, Dec. 1999 1 Multi-Frame Affine Motion-Compensated Prediction for Video Compression Thomas Wiegand

More information

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 19, NO. 9, SEPTEMBER

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 19, NO. 9, SEPTEMBER IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 19, NO. 9, SEPTEER 2009 1389 Transactions Letters Robust Video Region-of-Interest Coding Based on Leaky Prediction Qian Chen, Xiaokang

More information

Compression Algorithms for Flexible Video Decoding

Compression Algorithms for Flexible Video Decoding Compression Algorithms for Flexible Video Decoding Ngai-Man Cheung and Antonio Ortega Signal and Image Processing Institute, Univ. of Southern California, Los Angeles, CA ABSTRACT We investigate compression

More information

Performance Comparison between DWT-based and DCT-based Encoders

Performance Comparison between DWT-based and DCT-based Encoders , pp.83-87 http://dx.doi.org/10.14257/astl.2014.75.19 Performance Comparison between DWT-based and DCT-based Encoders Xin Lu 1 and Xuesong Jin 2 * 1 School of Electronics and Information Engineering, Harbin

More information

5LSH0 Advanced Topics Video & Analysis

5LSH0 Advanced Topics Video & Analysis 1 Multiview 3D video / Outline 2 Advanced Topics Multimedia Video (5LSH0), Module 02 3D Geometry, 3D Multiview Video Coding & Rendering Peter H.N. de With, Sveta Zinger & Y. Morvan ( p.h.n.de.with@tue.nl

More information

Re-live the Movie Matrix : From Harry Nyquist to Image-Based Rendering. Tsuhan Chen Carnegie Mellon University Pittsburgh, USA

Re-live the Movie Matrix : From Harry Nyquist to Image-Based Rendering. Tsuhan Chen Carnegie Mellon University Pittsburgh, USA Re-live the Movie Matrix : From Harry Nyquist to Image-Based Rendering Tsuhan Chen tsuhan@cmu.edu Carnegie Mellon University Pittsburgh, USA Some History IEEE Multimedia Signal Processing (MMSP) Technical

More information

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation

A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation 2009 Third International Conference on Multimedia and Ubiquitous Engineering A Novel Deblocking Filter Algorithm In H.264 for Real Time Implementation Yuan Li, Ning Han, Chen Chen Department of Automation,

More information

Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014

Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014 Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014 1/26 ! Real-time Video Transmission! Challenges and Opportunities! Lessons Learned for Real-time Video! Mitigating Losses in Scalable

More information

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV

Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Comparative Study of Partial Closed-loop Versus Open-loop Motion Estimation for Coding of HDTV Jeffrey S. McVeigh 1 and Siu-Wai Wu 2 1 Carnegie Mellon University Department of Electrical and Computer Engineering

More information

ESTIMATION OF THE UTILITIES OF THE NAL UNITS IN H.264/AVC SCALABLE VIDEO BITSTREAMS. Bin Zhang, Mathias Wien and Jens-Rainer Ohm

ESTIMATION OF THE UTILITIES OF THE NAL UNITS IN H.264/AVC SCALABLE VIDEO BITSTREAMS. Bin Zhang, Mathias Wien and Jens-Rainer Ohm 19th European Signal Processing Conference (EUSIPCO 2011) Barcelona, Spain, August 29 - September 2, 2011 ESTIMATION OF THE UTILITIES OF THE NAL UNITS IN H.264/AVC SCALABLE VIDEO BITSTREAMS Bin Zhang,

More information

Fast Implementation of VC-1 with Modified Motion Estimation and Adaptive Block Transform

Fast Implementation of VC-1 with Modified Motion Estimation and Adaptive Block Transform Circuits and Systems, 2010, 1, 12-17 doi:10.4236/cs.2010.11003 Published Online July 2010 (http://www.scirp.org/journal/cs) Fast Implementation of VC-1 with Modified Motion Estimation and Adaptive Block

More information

WATERMARKING FOR LIGHT FIELD RENDERING 1

WATERMARKING FOR LIGHT FIELD RENDERING 1 ATERMARKING FOR LIGHT FIELD RENDERING 1 Alper Koz, Cevahir Çığla and A. Aydın Alatan Department of Electrical and Electronics Engineering, METU Balgat, 06531, Ankara, TURKEY. e-mail: koz@metu.edu.tr, cevahir@eee.metu.edu.tr,

More information

ARCHITECTURES OF INCORPORATING MPEG-4 AVC INTO THREE-DIMENSIONAL WAVELET VIDEO CODING

ARCHITECTURES OF INCORPORATING MPEG-4 AVC INTO THREE-DIMENSIONAL WAVELET VIDEO CODING ARCHITECTURES OF INCORPORATING MPEG-4 AVC INTO THREE-DIMENSIONAL WAVELET VIDEO CODING ABSTRACT Xiangyang Ji *1, Jizheng Xu 2, Debin Zhao 1, Feng Wu 2 1 Institute of Computing Technology, Chinese Academy

More information

Zonal MPEG-2. Cheng-Hsiung Hsieh *, Chen-Wei Fu and Wei-Lung Hung

Zonal MPEG-2. Cheng-Hsiung Hsieh *, Chen-Wei Fu and Wei-Lung Hung International Journal of Applied Science and Engineering 2007. 5, 2: 151-158 Zonal MPEG-2 Cheng-Hsiung Hsieh *, Chen-Wei Fu and Wei-Lung Hung Department of Computer Science and Information Engineering

More information

Image and Video Coding I: Fundamentals

Image and Video Coding I: Fundamentals Image and Video Coding I: Fundamentals Heiko Schwarz Freie Universität Berlin Fachbereich Mathematik und Informatik H. Schwarz (FU Berlin) Image and Video Coding Organization Vorlesung: Montag 14:15-15:45

More information

Optimized architectures of CABAC codec for IA-32-, DSP- and FPGAbased

Optimized architectures of CABAC codec for IA-32-, DSP- and FPGAbased Optimized architectures of CABAC codec for IA-32-, DSP- and FPGAbased platforms Damian Karwowski, Marek Domański Poznan University of Technology, Chair of Multimedia Telecommunications and Microelectronics

More information

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami to MPEG Prof. Pratikgiri Goswami Electronics & Communication Department, Shree Swami Atmanand Saraswati Institute of Technology, Surat. Outline of Topics 1 2 Coding 3 Video Object Representation Outline

More information

VIDEO AND IMAGE PROCESSING USING DSP AND PFGA. Chapter 3: Video Processing

VIDEO AND IMAGE PROCESSING USING DSP AND PFGA. Chapter 3: Video Processing ĐẠI HỌC QUỐC GIA TP.HỒ CHÍ MINH TRƯỜNG ĐẠI HỌC BÁCH KHOA KHOA ĐIỆN-ĐIỆN TỬ BỘ MÔN KỸ THUẬT ĐIỆN TỬ VIDEO AND IMAGE PROCESSING USING DSP AND PFGA Chapter 3: Video Processing 3.1 Video Formats 3.2 Video

More information

Professor, CSE Department, Nirma University, Ahmedabad, India

Professor, CSE Department, Nirma University, Ahmedabad, India Bandwidth Optimization for Real Time Video Streaming Sarthak Trivedi 1, Priyanka Sharma 2 1 M.Tech Scholar, CSE Department, Nirma University, Ahmedabad, India 2 Professor, CSE Department, Nirma University,

More information

Star Diamond-Diamond Search Block Matching Motion Estimation Algorithm for H.264/AVC Video Codec

Star Diamond-Diamond Search Block Matching Motion Estimation Algorithm for H.264/AVC Video Codec Star Diamond-Diamond Search Block Matching Motion Estimation Algorithm for H.264/AVC Video Codec Satish Kumar Sahu 1* and Dolley Shukla 2 Electronics Telecommunication Department, SSTC, SSGI, FET, Junwani,

More information

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework System Modeling and Implementation of MPEG-4 Encoder under Fine-Granular-Scalability Framework Final Report Embedded Software Systems Prof. B. L. Evans by Wei Li and Zhenxun Xiao May 8, 2002 Abstract Stream

More information

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000 Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000 EE5359 Multimedia Processing Project Proposal Spring 2013 The University of Texas at Arlington Department of Electrical

More information

Overview: motion-compensated coding

Overview: motion-compensated coding Overview: motion-compensated coding Motion-compensated prediction Motion-compensated hybrid coding Motion estimation by block-matching Motion estimation with sub-pixel accuracy Power spectral density of

More information