106 IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 4, NO. 1, MARCH 2014


Depth Map Coding for View Synthesis Based on Distortion Analyses

Feng Shao, Weisi Lin, Senior Member, IEEE, Gangyi Jiang, Member, IEEE, Mei Yu, and Qionghai Dai, Senior Member, IEEE

Abstract—In 3-D video, view synthesis with depth-image-based rendering is employed to generate any virtual view between available camera views. Distortions in the depth map induce geometry changes in the virtual views and thus degrade the performance of view synthesis. This paper proposes a depth map coding method to improve the performance of view synthesis based on distortion analyses. The major technical innovation of this paper is to formulate the maximum tolerable depth distortion (MTDD) and the depth disocclusion mask (DDM), since such depth sensitivity for view synthesis and inter-view redundancy can be well utilized in coding. To be more specific, we define two different encoders (i.e., a base encoder and a side encoder) for the depth maps in the left and right views, respectively. For base encoding, different types of coding units are extracted based on the distribution of MTDD and assigned different quantization parameters for coding. For side encoding, a warped-skip mode is designed to remove inter-view redundancy based on the distribution of DDM. The experimental results show that the proposed scheme not only achieves high view synthesis performance, but also reduces the computational complexity of encoding.

Index Terms—Depth disocclusion mask, depth map coding, maximum tolerable depth distortion, three-dimensional (3-D) video, view synthesis.

I. INTRODUCTION

WITH the advancement of 3-D related technologies [1], e.g., content creation, video coding, network transmission, and stereoscopic display, 3-D video applications have drawn increasing attention during recent years. Especially since 2009, the great success of Avatar has greatly promoted 3-D research and markets [2].
Since a 3-D video system requires an enormous amount of information captured by at least two cameras, efficient storage and transmission is the main challenge.

Manuscript received August 26, 2013; revised November 30, 2013; accepted December 29, 2013. Date of publication January 20, 2014; date of current version March 07, 2014. This work was supported in part by the Natural Science Foundation of China under Grant , Grant , Grant U , and Grant , and in part by the K. C. Wong Magna Fund in Ningbo University. This paper was recommended by Guest Editor B. Yan. F. Shao, G. Jiang, and M. Yu are with the Faculty of Information Science and Engineering, Ningbo University, Ningbo, China (e-mail: shaofeng@nbu.edu.cn; jianggangyi@nbu.edu.cn; yumei@nbu.edu.cn). W. Lin is with the Centre for Multimedia and Network Technology, School of Computer Engineering, Nanyang Technological University, Singapore (e-mail: wslin@ntu.edu.sg). Q. Dai is with the Broadband Networks and Digital Media Lab, Tsinghua University, Beijing, China (e-mail: qhdai@tsinghua.edu.cn). Color versions of one or more of the figures in this paper are available online. Digital Object Identifier /JETCAS

One promising solution is to encode only a limited number of views and synthesize the virtual views by a depth-image-based rendering (DIBR) technique [3]. Recently, the multi-view video plus depth (MVD) scene description format has been standardized by MPEG and ITU-T as an efficient data representation for 3-D systems [4]. For the past several years, dozens of works have concentrated on the design of various multi-view video coding (MVC) methods exploiting temporal and inter-view dependencies, which have been standardized in both the joint multi-view video model (JMVM) [5] and the joint multi-view video coding (JMVC) [6] software. For depth maps, in order to be backward compatible with the MVC standard, they are often treated as gray-scale image sequences, and can be compressed by the JMVM or JMVC reference software.
Since the characteristics of depth maps are very different from those of color texture video, several dedicated depth map coding approaches have been proposed. Morvan et al. proposed a quadtree decomposition scheme to model regions by exploiting the smoothness properties of depth [7]. Oh et al. proposed a depth boundary reconstruction filter to compress the depth map and utilized it as an in-loop filter [8]. Hidalgo et al. proposed a segmentation-based coding method by considering the smooth structure and sharp edges of depth maps [9]. Milani et al. employed over-segmentation to split the depth map into multiple regions, and merged these regions to create an object-based quality-scalable prediction for depth map coding [10]. Nguyen et al. proposed weighted mode filtering to suppress coding artifacts and reconstructed the depth map from a reduced spatial resolution [11]. Besides, it is possible to make use of the correlation between color texture and depth, and some joint depth/texture coding schemes have been proposed to improve coding efficiency [12], [13]. However, the correlation between color texture video and depth map is not as strong as expected, and more importantly, the effects of color texture and depth distortions on view synthesis should be taken into account in depth map coding. It is important to note that the unique property of depth maps is that they are not directly used for display, but only provide supplementary data (i.e., geometric information of the captured scene) for view synthesis. Therefore, in addition to conventional rate-distortion (R-D) criteria, the R-D property of the synthesized view should also be fully utilized in depth map coding. Kim et al. proposed a new R-D criterion to replace the conventional distortion function [14], in order to quantify the effect of depth coding distortion on the synthesized view. Oh et al.
proposed a view synthesis distortion function involving the co-located color texture information [15], and applied it to the optimal macroblock mode decision of depth map coding. Liu et al. proposed a linear distortion model to approximate the view synthesis distortion [16], and determined the optimal bitrate ratio between color texture and depth by minimizing the view synthesis distortion. Yuan et al. proposed a new virtual-view-oriented R-D criterion to replace the mean squared error (MSE) criterion during the R-D optimization process [17]. Tech et al. proposed a new distortion metric to account for the changes of the overall synthesized view distortion [18]. Zhang et al. proposed a regional bit allocation and rate-distortion optimization algorithm by applying different view synthesis distortion models [19]. Related works were also presented in [20], [21]. However, these view-synthesis-oriented R-D criteria cannot completely reflect the view synthesis process, so the resulting improvement of view synthesis is limited. From another perspective, it is advantageous to take the properties of depth sensitivity into account to enhance the performance of view synthesis as well as 3-D display. In addition to being used for view synthesis, depth maps can also assist 3-D displays (e.g., autostereoscopic displays) to enhance depth perception. De Silva et al. proposed the just noticeable depth difference (JNDD) model to represent the quantitative threshold below which the human visual system (HVS) cannot perceive a depth change in 3-D display [22]. Nguyen et al. derived a theoretical upper bound of the geometric error on the mean absolute error in the synthesized view [23]. Zhao et al. proposed the depth no-synthesis-error (D-NOSE) model to represent the allowable depth distortions in view synthesis without introducing any geometric changes [24]. Cheung et al. defined a range of depth values as the don't-care region (DCR), within which any depth value leads to insignificant synthesized view distortion [25].
However, the depth sensitivity for 3-D display (e.g., JNDD) only reflects the perceived depth distance of the scene and cannot be directly reused for view synthesis, while human visual perception redundancies were not considered in establishing the depth sensitivity for view synthesis (e.g., in the D-NOSE and DCR models). Moreover, the current hybrid coding framework (with motion- or disparity-compensated prediction) does not fully exploit the redundancies of 3-D data, and there is room for further improvement. For example, the locations of corresponding samples in different views can be determined by view warping; even though temporal and inter-view correlation has been adopted to determine the skipped blocks of depth maps [26], depth-based view warping can be an effective means to remove inter-view redundancy. Lee et al. proposed to skip some blocks of the depth image at an early stage based on the temporal and inter-view correlation between previously encoded color texture images [27]. Daribo et al. used the 3D-warping technique to produce the right view at the decoder, but the quality of the right view is largely decreased because of the resulting disoccluded regions [28]. Zamarin et al. used the 3D-warping approach to replace disparity compensated prediction in the encoding architecture to improve coding performance [29]. Jager et al. applied warped prediction only to key pictures and replaced intra-coded pictures of the enhancement views [30]. In order to handle the disocclusion in warping, Gautier et al. proposed a depth-based image completion algorithm to fill the disoccluded region [31]. However, directly applying 3D-warping to depth maps usually cannot obtain accurate warped positions, because of the low depth consistency across viewpoints.
In our previous work [32], [33], we focused on joint encoding of MVD data, in which different R-D models were applied to texture video and depth map coding by characterizing the relationship between coding distortion and view synthesis distortion, and optimal bitrates were allocated to texture and depth. However, these methods still have the following limitations: 1) the distortion analysis for depth maps is insufficient, because the characteristics of depth maps are completely different from those of texture video; 2) depth map coding should be specially devised in order to ensure optimal 3-D video coding performance. In this paper, we propose a depth map coding method to improve view synthesis performance based on distortion analyses. We concentrate on depth map coding in this work to improve the performance of view synthesis, since depth maps are only used for view synthesis as nonvisual data and their influence upon 3-D perception is not direct. The main contributions of this work are as follows. 1) A comprehensive analysis of the important factors affecting the quality of the synthesized view is presented; these factors are fully taken into account in depth map coding. 2) By considering depth sensitivity for view synthesis, we derive the maximum tolerable depth distortion (MTDD) model and design a base encoder for depth map coding according to the distribution of MTDD. 3) To eliminate the inter-view redundancy of depth maps as much as possible, we derive the depth disocclusion mask (DDM) model and design a side encoder for depth map coding. The rest of the paper is organized as follows. Problems in 3-D video coding systems are discussed in Section II. Sections III and IV present the definitions of MTDD and DDM, respectively. Then, the proposed method is introduced in Section V, and experimental results are analyzed in Section VI. Finally, conclusions are drawn in Section VII.

II. PROBLEM DESCRIPTION IN 3-D VIDEO SYSTEM

Fig.
1 illustrates a typical 3-D video coding system framework, in which color texture video and depth maps are independently or jointly encoded using different MVC encoders. At the client side, arbitrary virtual views are synthesized from the decoded color texture video and depth maps by DIBR. In this paper, we do not intend to study rate allocation between texture video and depth maps (see our previous work [32], [33]). Since virtual views are synthesized from the two adjacent views (i.e., the left view and the right view) [34], we only consider the two-view MVD format in this work; it can be easily extended to multiple views. For example, for the three-view MVD format (i.e., the cases of I-view, P-view, and B-view), the P-view and B-view can be synthesized from the I-view, in which the indexes for I, P, and B pictures represent hierarchical levels of the prediction structure. In the two-view MVD format, suppose that one view is regarded as the left view and the other as the right view; the typical prediction structure for two-view based 3-D video coding is shown in Fig. 2. Pictures in each view form a hierarchical B picture prediction structure [35]. The left view (as I-view) is encoded with temporal motion compensated prediction (MCP), and the right view (as P-view) is encoded with both temporal MCP and disparity compensated prediction (DCP) between views.

Fig. 1. Framework of 3-D video system.
Fig. 2. Typical prediction structure for MVC.

Since the depth maps can be treated as monochromatic video, they are also encoded with the same prediction structure as in Fig. 2. In DIBR-based view synthesis, 3D-warping is usually used to synthesize the virtual view, and it can be separated into two steps: projection of the reference image into the 3-D world coordinates, followed by projection of the 3-D scene into the target image plane. A pixel $(x, y)$ in the reference image is projected into the 3-D world coordinates [36] as

$P_w = R_r^{-1} \left( z \, K_r^{-1} [x, y, 1]^T - T_r \right)$   (1)

where $z$ is the depth value calculated from the pixel in the depth map, $K_r$ and $R_r$ are the intrinsic and rotation matrices of the reference camera, and $T_r$ is the translation vector of the reference camera. In the next step, the world coordinates are projected into the target camera plane via

$[x', y', z']^T = K_t \left( R_t P_w + T_t \right)$   (2)

where $[x', y', z']^T$ are the homogeneous coordinates of the target image plane, and $K_t$, $R_t$, and $T_t$ are the intrinsic matrix, rotation matrix, and translation vector of the target camera, respectively. The corresponding pixel location in the synthesized image of the target camera is $(x'/z', y'/z')$. The matrices $K_r$, $R_r$, $K_t$, and $R_t$, as well as the translation vectors, are known in advance for specific cameras. As illustrated in Fig. 3, the warped pixel position using a wrong depth value (the black point in the figure) will deviate from its actual position using the original depth (the gray point in the figure), so an error (usually a compression artifact) in the depth map leads to a geometric error in the synthesized virtual view.

Fig. 3. Pixel projection between two views using 3-D warping.
Fig. 4. Effect of compression artifact in homogeneous depth region: (a) uncompressed depth map; (b) synthesized image with uncompressed depth map; (c) compressed depth map; (d) synthesized image with compressed depth map.
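As a concreteness check, the two-step projection of (1) and (2) can be sketched in a few lines of Python with NumPy. This is a minimal sketch, not the reference implementation: the function and parameter names are illustrative, and it assumes the usual pinhole convention in which a rotation matrix satisfies $R^{-1} = R^T$.

```python
import numpy as np

def warp_pixel(x, y, z, K_ref, R_ref, T_ref, K_tgt, R_tgt, T_tgt):
    """Project pixel (x, y) with depth z from the reference camera into
    the target camera plane, following the two-step 3-D warping above.
    Intrinsic/rotation matrices are 3x3; translations are length-3."""
    # Step 1: back-project the reference pixel into 3-D world coordinates.
    p_ref = np.array([x, y, 1.0])
    world = R_ref.T @ (z * np.linalg.inv(K_ref) @ p_ref - T_ref)
    # Step 2: project the world point into the target camera plane.
    p_tgt = K_tgt @ (R_tgt @ world + T_tgt)
    # Homogeneous normalization yields the synthesized pixel location.
    return p_tgt[0] / p_tgt[2], p_tgt[1] / p_tgt[2]
```

With identical cameras the pixel maps to itself; translating the target camera along the baseline shifts the warped position horizontally by an amount inversely proportional to depth, which is the disparity behavior the paper's distortion analysis builds on.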
Moreover, the quality degradation of the synthesized view induced by the position deviation may differ from region to region. That is, the same geometric error results in different visual quality in different regions of the synthesized virtual view. We illustrate the effect of compression artifacts in homogeneous and discontinuous depth regions in Figs. 4 and 5, respectively. From the figures, we can see that, even though the compression artifact in the homogeneous depth region is evident, the resulting synthesized image does not show a significant quality change with respect to the original one, while the difference between the corresponding synthesized images in the discontinuous depth region is significant. Also, depth distortion affects the revealing of disoccluded regions (the locations of the black pixels in the figures are changed). Therefore, it is necessary to quantify the effect of depth distortion (i.e., depth sensitivity) and utilize this property in depth map coding.

Fig. 5. Effect of compression artifact in discontinuous depth region: (a) uncompressed depth map; (b) synthesized image with uncompressed depth map; (c) compressed depth map; (d) synthesized image with compressed depth map.

However, current depth estimation methods are based on stereo matching in essence. As shown in Fig. 6, the estimated depth maps are inconsistent across viewpoints, since inter-view correlation is not fully exploited in depth estimation [37]. As a consequence, depth maps can be locally erroneous with spot noise in some regions. This reduces both the coding performance for depth maps and the quality of the synthesized view. Currently, some depth map preprocessing algorithms have been proposed to enhance the inter-view consistency [37], [38]. In this work, considering that the inter-view correlation of color texture video is usually high, it can assist in measuring the inter-view correlation of depth maps using 3D-warping. In other words, we can skip these coherent regions in coding (i.e., with the proposed warped-skip mode design), instead of aiming to improve prediction accuracy by view synthesis prediction as done in [26]. More importantly, the low depth consistency across viewpoints can be relieved by applying the proposed warped-skip mode design.

Fig. 6. Inter-view inconsistency analysis of depth maps.

From the above analysis, on one hand, inter-view correlation can be well exploited by the 3D-warping process; if this correlation is appropriately utilized, the coded (transmitted) information can be largely reduced. On the other hand, the coding distortion of depth maps leads to geometric errors in the synthesized view. Besides, during this process, some pixel positions in the virtual view are not mapped from the reference view, because some areas exist in the reference view but are invisible in the virtual view, such as occluded/disoccluded regions. Therefore, in order to effectively describe the 3-D video (aiming at a lower transmitted bitrate and higher synthesized quality), the factors above should be taken into account in depth map coding. In this work, we derive characteristic description models for depth maps, and apply these models to depth map coding.

III. MAXIMUM TOLERABLE DEPTH DISTORTION

It is known that depth map distortion will lead to geometric errors in the synthesized view and will affect the quality of the synthesized view. In this work, in order to investigate how, and the extent to which, depth map distortion affects view synthesis, the distortion of the virtual view [measured using the squared differences (SD)] synthesized from the original left color texture image using the original depth map and the distorted depth map is defined as

$D_s(x) = \left[ \mathcal{W}(I_L, D_L)(x_w) - \mathcal{W}(I_L, \tilde{D}_L)(x_w + \Delta x) \right]^2$   (3)

where $\mathcal{W}(A, B)$ denotes the 3D-warping operation that warps A (color texture or depth) to the virtual view using the depth information B, $I_L$ is the left color texture image, $D_L$ ($\tilde{D}_L$) is the original (distorted) left depth map, $x_w$ is the warped pixel position for the synthesized view using the original depth map, and $\Delta x$ is the horizontal geometric error induced by the distorted depth map. A similar formulation to (3) can be obtained for the right view. It has been proven that a linear relationship holds between the geometric error and the distortion of the depth map [14], i.e.,

$\Delta x = \alpha \cdot \Delta d$   (4)

where $\Delta d$ is the depth map distortion and $\alpha$ is a coefficient determined by the following equation:

$\alpha = \frac{f \cdot B}{255} \left( \frac{1}{Z_{near}} - \frac{1}{Z_{far}} \right)$   (5)

where $f$ denotes the focal length of the camera in the horizontal direction, $B$ expresses the baseline distance between the current and the virtual views, and $Z_{near}$ and $Z_{far}$ are the values of the nearest and farthest depth of the scene, respectively. Equation (4) reveals the fact that the distortion of the depth map causes the warped position deviation. Since the parameters $f$, $Z_{near}$, and $Z_{far}$ are fixed for a specific imaging system, $\alpha$ is known once the virtual view is established. The ground-truth geometric error is the one that minimizes the distortion of the synthesized view:

$\Delta x^{*} = \arg\min_{\Delta x} D_s(x)$   (6)

However, it would be impractical to directly calculate the geometric error from the above equations, because: 1) it needs to calculate the warping for each pixel at different position deviations, which requires enormous computation; 2) the ground-truth geometric error does not necessarily exist, because the minimum distortion may be the one where the geometric error equals zero; 3) disocclusion in the synthesized virtual view is hard to measure because there is no point of reference for comparison.
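Under the linear model of (4) and (5), the warping error implied by a given depth distortion is a one-line computation. The sketch below uses illustrative names and assumes 8-bit depth maps (hence the 255 normalization in (5)):

```python
def geometric_error(delta_d, f, baseline, z_near, z_far):
    """Horizontal geometric error delta_x = alpha * delta_d, Eqs. (4)-(5).

    f             : horizontal focal length (pixels)
    baseline      : distance between the current and virtual views
    z_near, z_far : nearest / farthest scene depth
    """
    alpha = (f * baseline / 255.0) * (1.0 / z_near - 1.0 / z_far)
    return alpha * delta_d
```

For example, with a focal length of 1000 pixels, a 5 cm baseline, and a scene spanning 1 m to 10 m, a full-scale depth error of 255 maps to a 45-pixel warping error, illustrating how strongly depth errors in near-camera regions can displace synthesized pixels.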
Considering that the virtual view is synthesized from the left view, and that a certain amount of distortion in the synthesized view can be tolerated by considering human visual perception (e.g., background regions can tolerate more distortion), the distortion of the synthesized view in (3) is redefined by requiring

$D_s(x, \Delta x) \le T, \quad \forall \, \Delta x \in [\Delta x_l(x), \Delta x_u(x)]$   (7)

where $\Delta x_l(x)$ and $\Delta x_u(x)$ are the lower and upper bounds of the geometric error for which the resultant distortion within this range remains lower than a given threshold $T$. In the experiments, the maximum search range for establishing the lower and upper bounds is set to the maximum disparity range between the left and right views. Considering that the synthesized virtual view is eventually perceived by a human, the factors that affect human visual perception, e.g., the contrast sensitivity function, luminance adaptation, and contrast masking, are taken into account in determining the threshold $T$. It is well known that the visual masking effect [e.g., just-noticeable difference (JND)] plays an important role in HVS-oriented signal processing. In this work, the JND threshold in [39] is selected, i.e., $T = T_{JND}$. Only the monocular visual characteristic is considered in the JND threshold, because we only apply the model to the left view in the depth map coding that follows. For the actual binocular visual response process, the binocular JND (BJND) [40] would be a good choice for an efficient coding framework design. Finally, by finding the lower and upper bounds of the geometric error, the MTDD is defined via (4) as

$\mathrm{MTDD}(x) = \left[ \frac{\Delta x_l(x)}{\alpha}, \ \frac{\Delta x_u(x)}{\alpha} \right]$   (8)

In the experiments, according to the definition in (5), $\alpha$ is calculated with the largest baseline (i.e., the distance between the left and right-most cameras), in order to cover the whole range of virtual views. Thus, the MTDD for each view can be obtained by implementing the above operation independently. Besides, the definition of MTDD is per pixel, and this provides useful information about how much distortion can be tolerated in depth map coding.

IV. DEPTH DISOCCLUSION MASK

The above MTDD depicts the depth sensitivity property (i.e., the internal characteristic of depth maps) for view synthesis. For depth map coding, the external characteristic (i.e., inter-view redundancy) should also be exploited. However, as analyzed in the previous section, the inter-view correlation of depth maps is usually weak due to the limitations of available depth estimation methods. In this work, we measure the inter-view correlation of depth maps from the co-located color texture video. The synthesized right color texture image can be obtained from the left color texture image using the original depth map by

$\hat{I}_R = \mathcal{W}(I_L, D_L)$   (9)

However, one prominent issue in the warping process is that holes and disocclusions (defined together as the disoccluded region in this work) inevitably occur due to the viewing-angle difference. As shown in Fig. 7, the background behind the foreground in the original view becomes disoccluded in the virtual view. That is, the original view does not provide any information about this background, so disocclusion gaps occur in the synthesized virtual view.

Fig. 7. Disocclusion problem description.

In this work, in order to determine which pixels will be disoccluded in the synthesized right view, we compare the difference between the original and the synthesized right views. If a pixel is successfully warped, its probability value is set to 1; otherwise, it is set to 0:

$p(x) = \begin{cases} 1, & |I_R(x) - \hat{I}_R(x)| \le T_c \\ 0, & \text{otherwise} \end{cases}$   (10)

where $I_R$ is the original right color texture image and $T_c$ is a threshold controlling the difference strength. The disoccluded pixels are marked as black (value equal to 255) in the synthesized view, so that these pixels can be correctly differentiated by comparing the difference between the original and synthesized views (the wrongly warped pixels can also be differentiated). In the experiments, $T_c$ is set to 10. Since the above definition of probability is per pixel, to be compatible with the block-based coding system, the probability of each coding unit (CU) is calculated by

$P_{CU} = \frac{1}{N \times N} \sum_{x \in CU} p(x)$   (11)

where $N \times N$ denotes the size of the CU. Besides, considering that edges in the depth map have a great impact on view synthesis [41], we first extract the edge CUs of the left view by performing Canny edge detection on the MTDD map. Then, these edge CUs are warped to the right view and marked as edge CUs, and the remaining CUs in the right view are marked as non-edge CUs. Finally, the DDM of the right view is defined as

$\mathrm{DDM}(CU) = \begin{cases} 0 \ (\text{disoccluded}), & P_{CU} < T_p \ \text{or} \ CU \ \text{is an edge CU} \\ 1 \ (\text{warped}), & \text{otherwise} \end{cases}$   (12)

In the experiments, $T_p$ is set to 0.5, which means that if more than half of the pixels in the CU are disoccluded, the CU belongs to the disoccluded region; otherwise, it belongs to the warped region. In the proposed scheme, based on the derived DDM, only the disoccluded region needs to be encoded; the warped region can be skipped in coding and then synthesized at the decoder.
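The per-pixel test (10), block pooling (11), and thresholding (12) can be sketched as follows. This is a schematic reading of the text rather than the reference implementation: the function name, the default CU size, and the treatment of edge CUs (forced into the to-be-coded class) are assumptions.

```python
import numpy as np

def build_ddm(orig_right, synth_right, cu_size=16, t_c=10, t_p=0.5,
              edge_cus=None):
    """Block-level depth disocclusion mask in the spirit of Eqs. (10)-(12).

    A pixel counts as successfully warped when the synthesized right view
    matches the original within t_c; a CU whose warped fraction falls below
    t_p (or that contains warped depth edges) is marked 0 (disoccluded,
    must be encoded), otherwise 1 (warped, can be skipped)."""
    h, w = orig_right.shape
    # Eq. (10): per-pixel warped/not-warped indicator (int32 avoids wraparound).
    p = np.abs(orig_right.astype(np.int32) - synth_right.astype(np.int32)) <= t_c
    ddm = np.ones((h // cu_size, w // cu_size), dtype=np.uint8)
    for by in range(h // cu_size):
        for bx in range(w // cu_size):
            block = p[by*cu_size:(by+1)*cu_size, bx*cu_size:(bx+1)*cu_size]
            frac = block.mean()  # Eq. (11): per-CU warped-pixel fraction
            if frac < t_p or (edge_cus is not None and edge_cus[by, bx]):
                ddm[by, bx] = 0  # Eq. (12): disoccluded region
    return ddm
```

Marking disoccluded pixels with the sentinel value 255 in the synthesized view, as the text describes, makes them fail the $T_c$ test against any plausible original value, which is what drives the classification above.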
Since the DDM of the right view is dependent on the left view, a similar DDM of the left view can be obtained by inversely warping the right view.

V. PROPOSED DEPTH MAP CODING METHOD

To effectively reduce the transmitted bitstream of the compressed depth maps while maintaining high view synthesis quality, a depth map coding method is proposed based on the above characteristic models (i.e., MTDD and DDM). In the proposed method, we define two different encoders (i.e., a base encoder and a side encoder) for the left and right views, respectively. Specifically, for the base encoder, different types of CUs are extracted according to the distribution of the MTDD map and assigned different quantization parameters (QPs). For the side encoder, a warped-skip mode is designed to remove the inter-view redundancy according to the distribution of the DDM map. At the decoder, view reconstruction is performed to reconstruct the right view. Therefore, how to effectively encode the left and right views is the key challenge for the success of the method.

A. Base Encoder for Left View Coding

As discussed above, the depth sensitivity for view synthesis is space-variant. Therefore, different CUs can be represented with different flexibility. To effectively reduce the transmitted bitstream of the depth map, we propose QP selection for coding depth maps according to the distribution of MTDD. In order to facilitate the following process, the MTDD values are mapped to [0, 255]. Firstly, since the edges of the depth map have a great impact on view synthesis [41], the edge CUs are extracted by performing Canny edge detection on the MTDD map. For the remaining non-edge regions, the mean $m$ and variance $v$ of the non-edge CUs are first calculated, and different types of CUs (defined as A1, A2, A3, and A4) in the non-edge regions are obtained by comparing the mean and variance with predefined thresholds. The specific steps are as follows. When $m > T_m$ and $v < T_v$, the CU is defined as type A1, where $T_m$ and $T_v$ are the thresholds of the mean and variance, respectively. When $m > T_m$, the CU can tolerate larger distortion; when $v < T_v$, the CU is relatively smooth. The CU is set to type A2, A3, or A4 for the remaining combinations of the mean and variance comparisons, respectively. Then, different QPs are assigned to the CUs by

$QP_{A_i} = QP_b + \delta_i, \quad i = 1, 2, 3$   (13)

where $QP_b$ is the base QP for coding, and $\delta_i$ controls the QP offset (to be analyzed in the next subsection); for type A4, $QP_b$ is directly used. Fig. 8 describes the flowchart of the proposed coding scheme for the left view. According to the statistical analysis of the mean and variance on the Leaving Laptop and Lovebird1 test sequences, we found the optimum thresholds $T_m$ and $T_v$ for all the possible distributions of the types. As a consequence, $T_m$ and $T_v$ are set to 6.5 and 746 in our experiments. Fig. 9 shows examples of the block types for Leaving Laptop and Lovebird1. It is obvious that type A1 (black regions in the figure) usually corresponds to smooth regions of the color texture image and depth map; thus, depth distortion in these regions does not affect the synthesized virtual view significantly. Type A2 (dark gray regions in the figure) is mainly concentrated in regions with relatively small depth variations and relatively smooth texture. Types A3 (light gray regions in the figure) and A4 (white regions in the figure) are mainly distributed in regions with large depth variations (e.g., depth discontinuities) and complex texture. The proposed base encoder design improves the view synthesis performance, as demonstrated in Section VI-C.

Fig. 8. Flowchart of the proposed left view coding.
Fig. 9. Examples of the block types of the left depth map. (a) Leaving Laptop. (b) Lovebird1.

B. Side Encoder for Right View Coding

For the right view, only the disoccluded region is encoded, so that the transmitted bitrate can be largely saved; the disoccluded information can be reconstructed from the left view at the decoder. This idea has been demonstrated in layered depth images, where it is possible to detect occlusion in rendering [42]. Of course, the quality of the reconstructed right view will suffer some degree of degradation, because the depth map information used in 3D-warping is not the original one; as analyzed in the previous section, depth map distortion leads to geometric distortion in the synthesized view. In the implementation, the warped region does not need to be encoded (i.e., it is skipped in coding).
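A minimal sketch of the base encoder's CU classification and QP assignment is given below. The mean/variance thresholds 6.5 and 746 come from the text, but the exact comparison pattern for types A2/A3 and the offset values $\delta_i$ did not survive extraction, so those are illustrative assumptions here (larger offsets, i.e., coarser quantization, for more distortion-tolerant types).

```python
import numpy as np

def assign_qp(mtdd_block, qp_base, t_m=6.5, t_v=746.0, offsets=(4, 3, 2)):
    """QP selection for a non-edge CU from its MTDD statistics, in the
    spirit of the type A1-A4 rule and Eq. (13). The offsets and the
    A2/A3 comparison order are illustrative placeholders; t_m and t_v
    are the thresholds reported in the text."""
    m, v = float(np.mean(mtdd_block)), float(np.var(mtdd_block))
    if m > t_m and v < t_v:   # A1: high tolerance, smooth -> coarsest QP
        return qp_base + offsets[0]
    if m > t_m:               # A2: high tolerance, non-smooth
        return qp_base + offsets[1]
    if v < t_v:               # A3: low tolerance, smooth
        return qp_base + offsets[2]
    return qp_base            # A4: low tolerance, complex -> base QP
```

The design intent follows the text: regions whose MTDD indicates high tolerance can absorb coarser quantization with little effect on the synthesized view, while A4 regions (depth discontinuities, complex texture) keep the base QP.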
Similar to the SKIP mode in traditional video coding standards, we define a SKIP mode here (termed the warped-skip mode) for the warped region to remove inter-view redundancy; the mode information is transmitted to the decoder, which requires only a few bits. Fig. 10 shows the flowchart of the proposed right view coding. If the DDM-derived flag of a CU equals 1, no further processing is carried out and the CU is marked as warped-skip; otherwise, the CU is processed as normal and encoded with the R-D optimization process. Fig. 11 shows examples of the DDM distribution for Leaving Laptop and Lovebird1, where black and white blocks represent the warped and disoccluded regions, respectively. It is obvious that the information to be encoded is comparatively small, since CUs with the warped-skip mode far outnumber the other CUs. Moreover, the computational complexity of the encoder is largely reduced, because R-D optimization consumes most of the computation resources.

Fig. 10. Flowchart of the proposed right view coding.
Fig. 11. Examples of the DDM map. (a) Leaving Laptop. (b) Lovebird1.

At the decoder, the synthesized right depth map is obtained by warping the decoded left depth map, as given in (14). Then, based on the mode information of each decoded CU, the reconstructed right depth map is obtained by (15), in which the decoded right depth map fills the non-skipped CUs. However, as analyzed in Section II, depth distortion affects the identification of disoccluded regions, and thus some small holes still appear in the reconstructed depth map. For these small holes, we use the total variation (TV) model [43] to inpaint the missing pixel values. In addition, a three-tap low-pass filter is applied to the boundaries of the DDM in both horizontal and vertical directions to eliminate ghost contours.

VI. EXPERIMENTAL RESULTS AND ANALYSES

A. Experimental Setup

In the experiments, we select the MPEG 3-D video test sequences Alt Moabit, Book Arrival, Dog, Leaving Laptop, Lovebird1, and Pantomime. For Alt Moabit, Book Arrival, and Leaving Laptop, which contain 16 views with 6.5 cm spacing between adjacent views, the tenth and eighth views (view10 and view8) are adopted as the left and right views, and the virtual view (view9) is synthesized. For Lovebird1, which contains 12 views with 3.5 cm spacing between adjacent views, the sixth and eighth views (view6 and view8) are adopted as the left and right views, and the virtual view (view7) is synthesized. For Dog and Pantomime, which contain 80 views with 5 cm spacing between adjacent views, the fortieth and forty-second views (view40 and view42) are adopted as the left and right views, and the virtual view (view41) is synthesized. The depth maps of the test sequences are generated by the Depth Estimation Reference Software (DERS) [44]. For all experiments, we used the JMVC software [6], ver. 8.3, to encode the color texture video and depth maps, and the View Synthesis Reference Software (VSRS) [34], version 3.5, to synthesize the virtual view. The detailed encoding settings for the color texture video and depth maps are as follows: the basic QP values are 22, 27, 32, and 37; the temporal GOP size is set to 8; and the total number of encoded frames in each view is 50.

B. Parameter Determination

In the proposed scheme, we determine the optimum parameter values by comparing Bjøntegaard delta PSNR (BD-PSNR) [45] values under different settings. In this experiment, we apply the base encoder to the left and right views simultaneously. Considering that one CU type usually has a dominant impact on view synthesis, for simplicity we fix its parameter and determine the remaining parameters one at a time: each parameter is varied in turn while the others are held at reference values, with the virtual view synthesized under the reference setting as the benchmark. The BD-PSNR of the synthesized virtual view is then calculated for each candidate setting, and the optimum value is the one yielding the maximum BD-PSNR. In the experiments, we pre-encode two GOPs of the Leaving Laptop and Lovebird1 test sequences and fit the resulting curves with a Gaussian function; the optimum value of each parameter is located at the crest of its fitted curve. The same parameter values are then used for all the test sequences.

C.
View-Synthesis R-D Performance Comparison

In order to objectively evaluate the performance of view synthesis, the average peak signal-to-noise ratio (PSNR) is measured, taking as reference the virtual view synthesized from the original color texture video using the original depth maps. The color texture video is encoded using the same QP as the depth

maps by the JMVC encoder. We compare the view-synthesis R-D performance of three schemes (the original JMVC scheme [6], Lee's scheme [27], and the proposed scheme) in Fig. 12, denoted as JMVC, Lee's [27], and Proposed, respectively. The vertical axis of each sub-figure shows the average PSNR of the synthesized virtual view, while the horizontal axis corresponds to the total bitrate of the depth maps. For the original JMVC scheme, the JMVC encoder is applied directly to the depth maps. For Lee's scheme, only the right view is encoded by predicting the skipped blocks from temporal and inter-view correlation; the left view is encoded as normal with the original JMVC encoder. The results show that Lee's scheme is superior to the original JMVC scheme but inferior to the proposed scheme. The reason is that the depth sensitivity for view synthesis is not considered in Lee's scheme, so the proposed scheme achieves a clear overall performance gain. Besides, the performance of Lee's scheme depends strongly on the number of predicted views (in fact, it is better suited to a three-view case, as demonstrated in [27]).

Fig. 12. View synthesis R-D performances of the three schemes. (a) Alt Moabit. (b) Book Arrival. (c) Dog. (d) Leaving Laptop. (e) Lovebird1. (f) Pantomime.
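As a rough illustration (not the authors' code), the BD-PSNR figures used in these comparisons can be reproduced from a few (bitrate, PSNR) points per scheme, e.g. one per basic QP in {22, 27, 32, 37}, with the standard Bjøntegaard procedure: fit a cubic polynomial to PSNR over log10(bitrate) and average the gap between the fitted curves over the overlapping rate range.

```python
import numpy as np

def bd_psnr(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta-PSNR of the test scheme over the anchor.

    PSNR is fitted as a cubic polynomial of log10(bitrate) for each
    scheme, and the average vertical gap between the two fitted curves
    is computed over the overlapping bitrate interval.
    """
    la = np.log10(np.asarray(rate_anchor, dtype=float))
    lt = np.log10(np.asarray(rate_test, dtype=float))
    pa = np.polyfit(la, psnr_anchor, 3)   # anchor R-D curve fit
    pt = np.polyfit(lt, psnr_test, 3)     # test R-D curve fit
    lo, hi = max(la.min(), lt.min()), min(la.max(), lt.max())
    int_a = np.polyval(np.polyint(pa), [lo, hi])
    int_t = np.polyval(np.polyint(pt), [lo, hi])
    # average gap = (difference of definite integrals) / interval width
    return ((int_t[1] - int_t[0]) - (int_a[1] - int_a[0])) / (hi - lo)
```

A positive value means the test scheme delivers higher synthesized-view PSNR at the same depth bitrate.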
In order to demonstrate the impact of each component of the proposed scheme, Table I lists the detailed bitrate, synthesized quality, and corresponding BD-PSNR for Lee's scheme, the proposed scheme with the base encoder only (Scheme-1), the proposed scheme with the side encoder only (Scheme-2), and the full proposed scheme (combining Scheme-1 and Scheme-2), with the original JMVC scheme as the benchmark (due to space limitations, Lee, S-1, S-2, and Pro represent the four schemes in this and the following tables). Overall, the proposed scheme performs better than its two constituent schemes (i.e., Scheme-1 and Scheme-2), each of which exploits only the depth sensitivity or only the inter-view redundancy of depth maps. Scheme-2 outperforms Scheme-1 for all test sequences: the estimated depth maps have relatively low inter-view consistency, which lowers the accuracy of conventional inter-view prediction, while Scheme-2 performs better in this regard. The performance of Lee's scheme is superior to Scheme-2 for most test sequences except Dog and Lovebird1, where more blocks are encoded with R-D optimization due to the significant inter-view inconsistency. To further analyze the view synthesis performance, we use the peak signal-to-perceptible-noise ratio (PSPNR) [46] to evaluate the perceptual quality of the virtual view. With the original JMVC scheme as the benchmark again, Tables II and III compare the Bjøntegaard delta PSPNR (BD-PSPNR) and Bjøntegaard delta bitrate (BD-RD) of Lee's [27], Scheme-1, Scheme-2, and Proposed. It is obvious that the proposed scheme provides better view synthesis performance than the other schemes; the overall performance of view synthesis is improved at the same bitrate.

D.
Subjective View-Synthesis Performance Comparison

Even though the purpose of the proposed scheme is to save bits in depth map coding, we do not reallocate the saved bits to texture coding, because this work focuses on depth map coding only. In order to show the impact of depth map coding on view synthesis, we compare the synthesized virtual views of Leaving Laptop and Lovebird1 with and without the proposed scheme. Figs. 13 and 14(a) and (b) show the decoded right depth maps with the original JMVC scheme and the reconstructed right depth maps with the proposed scheme under the

same basic QPs, respectively. The differences between the two depth maps are obvious, owing to the low depth consistency between the left and right views. The corresponding synthesized virtual views using the above depth maps are shown in Figs. 13 and 14(c) and (d), respectively. It is obvious that the two synthesized virtual views are very similar, with only small differences in local regions. Since we use the same texture images and different depth maps (encoded with the original JMVC scheme and with the proposed scheme) to synthesize the virtual view, even better synthesized-view quality would be obtained if the saved bits were allocated to texture coding. This further shows that certain distortions in depth maps are tolerable, and that depth distortion does not significantly degrade the quality of the synthesized virtual view.

TABLE I CODING BITRATE AND SYNTHESIZED QUALITY COMPARISON OF THE FOUR SCHEMES
TABLE II BD-PSPNR PERFORMANCE COMPARISON OF THE SCHEMES
TABLE III BD-RD PERFORMANCE COMPARISON OF THE SCHEMES

In addition, a subjective test is carried out on the virtual view videos. With the virtual view video synthesized from the original texture video and the original depth maps as the reference, the virtual views synthesized from the distorted texture video (JMVC coding) and the distorted depth maps produced by the three depth coding methods (JMVC, Lee's [27], and Proposed) are compared. The subjective tests were conducted in a laboratory designed for subjective quality tests according to the ITU-R BT recommendation. Two different videos are displayed simultaneously on a screen, and each subject is asked to rate the overall quality of the videos on five grades {5, 4, 3, 2, 1}, representing Excellent, Good, Fair, Poor, and Unsatisfactory, respectively.
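For illustration (not the paper's exact analysis), the five-grade scores from a small viewing panel can be aggregated into a Mean Opinion Score with a t-based confidence interval, as commonly recommended for subjective tests; the t value 2.365 below is an assumption corresponding to eight viewers (7 degrees of freedom).

```python
import statistics

def mos_with_ci(scores, t_value=2.365):
    """Mean Opinion Score of 5-point ratings plus a 95% confidence
    interval; t_value = 2.365 assumes n = 8 viewers (7 dof)."""
    n = len(scores)
    m = statistics.mean(scores)
    s = statistics.stdev(scores)      # sample standard deviation
    half = t_value * s / n ** 0.5     # half-width of the interval
    return m, (m - half, m + half)
```

For example, the hypothetical ratings [5, 4, 4, 3, 5, 4, 4, 3] give a MOS of 4.0.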
Eight non-expert adult viewers, aged 20 to 25, participated in the subjective evaluation. Fig. 15 shows the Mean Opinion Score (MOS) of the subjective evaluation, where a higher MOS indicates higher visual quality. We observe that the reference video usually has relatively higher visual quality, except for Alt Moabit and Lovebird1, for which the view video synthesized with the original depth maps suffers from serious geometric distortion. The subjective visual qualities of the three depth coding methods show almost no difference.

E. Computational Complexity Analysis

The complexity of the proposed scheme mainly depends on the predetermination of the MTDD and DDM, the encoding process with R-D optimization, and the view reconstruction process. Since predetermining the MTDD and DDM is

performed in an offline mode, an accurate measurement of its execution time is difficult; in practice, the predetermination time is lower than the encoding time. Therefore, we only conduct a qualitative analysis of the encoding computational complexity. Statistical analysis shows that the warped-skip mode accounts for more than 75% of all encoding modes in right view coding (more than 95% for Lovebird1 and Dog), and the encoding time for warped-skip CUs is negligible. Besides, the processing time of the view synthesis is significantly lower than the R-D optimization encoding time. Therefore, the overall encoding computational complexity of the proposed scheme is lower than that of the original JMVC scheme.

Fig. 13. View synthesis results of Leaving Laptop: (a) decoded depth map with the original JMVC scheme (41.68 dB); (b) reconstructed depth map with the proposed scheme (32.58 dB); (c) synthesized virtual view with the original JMVC scheme (39.20 dB); (d) synthesized virtual view with the proposed scheme (39.01 dB).
Fig. 14. View synthesis results of Lovebird1: (a) decoded depth map with the original JMVC scheme (47.41 dB); (b) reconstructed depth map with the proposed scheme (38.68 dB); (c) synthesized virtual view with the original JMVC scheme (36.92 dB); (d) synthesized virtual view with the proposed scheme (37.51 dB).
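The qualitative complexity argument can be made concrete with a back-of-the-envelope model (illustrative only; the per-CU cost ratio below is a hypothetical value, not a measured figure): if a fraction p of CUs take the warped-skip mode, each costing only a small fraction of a full R-D-optimized CU, the relative encoding time is roughly:

```python
def encoding_time_ratio(p_skip, skip_cost=0.02):
    """Approximate encoding time relative to full-RDO coding when a
    fraction p_skip of CUs use the warped-skip mode. skip_cost is the
    assumed per-CU cost of a warped-skip CU relative to a full
    R-D-optimized CU (hypothetical value, not measured)."""
    if not 0.0 <= p_skip <= 1.0:
        raise ValueError("p_skip must be in [0, 1]")
    return (1.0 - p_skip) + p_skip * skip_cost
```

With the reported proportions, 75% warped-skip CUs would leave roughly a quarter of the original encoding time and 95% under a tenth, consistent with the claim that the proposed scheme is less complex than the original JMVC encoder.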

Fig. 15. MOS of the subjective evaluation.

VII. CONCLUSION

This paper has presented a new depth map coding method to improve view synthesis performance based on distortion analyses. The prominent advantage of the proposed method is that we define the MTDD and the DDM to fully exploit the internal characteristic (i.e., the depth sensitivity property) and the external characteristic (i.e., the inter-view redundancy) of depth maps for view synthesis. Specifically, different types of CUs are extracted based on the distribution of the MTDD and assigned different QPs for base encoding, and a warped-skip mode is designed to remove inter-view redundancy based on the distribution of the DDM for side encoding. Experimental results have confirmed that the proposed method significantly improves the performance of view synthesis. In future work, we plan to tackle the following issues: 1) since the color texture video is directly encoded by the original JMVC encoder in the current implementation, we will study how to extend the proposed depth map coding to color texture video coding (the distortion characteristics of color texture video and depth maps are different); 2) since the current 3-D video coding standard does not include a dedicated depth map encoder, we can embed the MTDD and DDM models into the view synthesis optimization encoding option of the 3-D-HEVC anchor software; 3) more accurate region classification for depth maps should be considered; and 4) a more effective assessment metric is needed in designing the view synthesis distortion criterion.

REFERENCES
[1] K. Muller, P. Merkle, and T. Wiegand, "3-D video representation using depth maps," Proc. IEEE, vol. 99, no. 4, Apr.
[2] A. Gotchev, G. B. Akar, T. Capin, D. Strohmeier, and A. Boev, "Three-dimensional media for mobile devices," Proc.
IEEE, vol. 99, no. 4, Apr.
[3] C. Fehn, "Depth-image-based rendering (DIBR), compression and transmission for a new approach on 3-D-TV," in Proc. SPIE, San Jose, CA, Jan. 2004, vol. 5291.
[4] WD 3 Reference Software for MVC, ISO/IEC JTC1/SC29/WG11, Busan, Korea, Oct.
[5] Joint Multiview Video Model (JMVM) 7.0, ISO/IEC JTC1/SC29/WG11, Antalya, Turkey, Jan.
[6] Draft Reference Software for MVC, ISO/IEC MPEG & ITU-T VCEG, London, U.K., Jul.
[7] Y. Morvan, D. Farin, and P. H. N. de With, "Depth-image compression based on an R-D optimized quadtree decomposition for the transmission of multiview image," in Proc. IEEE Int. Conf. Image Process., San Antonio, TX, Sep. 2007.
[8] K. J. Oh, A. Vetro, and Y. S. Ho, "Depth coding using a boundary reconstruction filter for 3-D video system," IEEE Trans. Circuits Syst. Video Technol., vol. 21, no. 3, Mar.
[9] J. R. Hidalgo, J. R. Morros, P. Aflaki, F. Calderero, and F. Marqués, "Multiview depth coding based on combined color depth segmentation," J. Vis. Commun. Image Represent., vol. 23, no. 1, Jan.
[10] S. Milani and G. Calvagno, "A depth image coder based on progressive silhouettes," IEEE Signal Process. Lett., vol. 17, no. 8, Aug.
[11] V. A. Nguyen, D. Min, and M. N. Do, "Efficient techniques for depth video compression using weighted mode filtering," IEEE Trans. Circuits Syst. Video Technol., vol. 23, no. 2, Feb.
[12] I. Daribo, C. Tillier, and B. Pesquet-Popescu, "Motion vector sharing and bit-rate allocation for 3-D video-plus-depth coding," EURASIP J. Adv. Signal Process., vol. 2009, Jan.
[13] J. Zhang, M. M. Hannuksela, and H. Q. Li, "Joint multiview video plus depth coding," in Proc. IEEE Int. Conf. Image Process., Sep. 2010.
[14] W. S. Kim, A. Ortega, P. L. Lai, D. Tian, and C. Gomila, "Depth map distortion analysis for view rendering and depth coding," in Proc. IEEE Int. Conf. Image Process., Cairo, Egypt, Nov. 2009.
[15] B. T. Oh, J. Lee, and D. S. Park, "Depth map coding based on synthesized view distortion function," IEEE J.
Sel. Topics Signal Process., vol. 5, no. 7, Nov.
[16] Y. W. Liu, Q. M. Huang, S. W. Ma, D. B. Zhao, and W. Gao, "Joint video/depth rate allocation for 3-D video coding based on view synthesis distortion model," Signal Process.: Image Commun., vol. 24, no. 8, Sep.
[17] H. Yuan, J. Liu, Z. Li, and W. Liu, "Virtual view oriented distortion criterion for depth map coding," IET Electron. Lett., vol. 48, no. 1, Jan.
[18] G. Tech, H. Schwarz, K. Muller, and T. Wiegand, "3-D video coding using the synthesized view distortion change," presented at the Picture Coding Symp., Krakow, Poland, May.
[19] Y. Zhang, S. Kwong, L. Xu, S. D. Hu, G. Y. Jiang, and C.-C. Jay Kuo, "Regional bit allocation and rate distortion optimization for multiview depth video coding with view synthesis distortion model," IEEE Trans. Image Process., vol. 22, no. 9, Sep.
[20] J. M. Xiao, T. Tillo, and H. Yuan, "Real-time macroblock level bits allocation for depth maps in 3-D video coding," in Advances in Multimedia Information Processing-PCM. New York: Springer, 2012, vol. 7674, Lecture Notes in Computer Science.
[21] F. Shao, M. Yu, G. Y. Jiang, F. C. Li, and Z. J. Peng, "Depth map compression and depth-aided view rendering for 3-D video system," IET Signal Process., vol. 6, no. 3, May.
[22] D. V. S. X. De Silva, W. A. C. Fernando, S. T. Worrall, S. L. P. Yasakethu, and A. M. Kondoz, "Just noticeable difference in depth model for stereoscopic 3-D displays," in Proc. IEEE Int. Conf. Multimedia Expo, Jul. 2010.
[23] H. T. Nguyen and M. N. Do, "Error analysis for image-based rendering with depth information," IEEE Trans. Image Process., vol. 18, no. 4, Apr.
[24] Y. Zhao, C. Zhu, Z. Z. Chen, and L. Yu, "Depth no-synthesis-error model for view synthesis in 3-D video," IEEE Trans. Image Process., vol. 20, no. 8, Aug.
[25] G. Cheung, A. Kubota, and A. Ortega, "Sparse representation of depth maps for efficient transform coding," presented at the IEEE Picture Coding Symp., Nagoya, Japan, Dec.
[26] S. Yea and A.
Vetro, "View synthesis prediction for multiview video coding," Signal Process.: Image Commun., vol. 24, no. 1-2, Jan.
[27] J. Y. Lee, H. C. Wey, and D. S. Park, "A fast and efficient multi-view depth image coding method based on temporal and inter-view correlations of texture images," IEEE Trans. Circuits Syst. Video Technol., vol. 21, no. 12, Dec.
[28] I. Daribo, C. Tillier, and B. Pesquet-Popescu, "Distance dependent depth filtering in 3-D warping for 3DTV," in Proc. IEEE Int. Workshop Multimedia Signal Process., Crete, Greece, Oct. 2007.
[29] M. Zamarin, S. Milani, P. Zanuttigh, and G. M. Cortelazzo, "A novel multi-view image coding scheme based on view-warping and 3-D-DCT," J. Vis. Commun. Image Represent., vol. 21, no. 5-6, Jun.

[30] F. Jager and C. Feldmann, "Warped-skip mode for 3-D video coding," presented at the Picture Coding Symp., Krakow, Poland, May.
[31] J. Gautier, O. Le Meur, and C. Guillemot, "Depth-based image completion for view synthesis," presented at the 3DTV Conf., Antalya, Turkey, May.
[32] F. Shao, G. Y. Jiang, M. Yu, K. Chen, and Y. S. Ho, "Asymmetric coding of multi-view video plus depth based 3-D video for view rendering," IEEE Trans. Multimedia, vol. 14, no. 1, Feb.
[33] F. Shao, G. Y. Jiang, W. S. Lin, M. Yu, and Q. H. Dai, "Joint bit allocation and rate control for coding multi-view video plus depth based 3-D video," IEEE Trans. Multimedia, vol. 15, no. 8, Dec.
[34] 3DV/FTV EE2: Report on VSRS Extrapolation, ISO/IEC JTC1/SC29/WG11, Guangzhou, China, Oct.
[35] Draft Reference Software for MVC, ISO/IEC MPEG & ITU-T VCEG, London, U.K., Jul.
[36] Y. Mori, N. Fukushima, T. Yendo, T. Fujii, and M. Tanimoto, "View generation with 3-D warping using depth information for FTV," Signal Process.: Image Commun., vol. 24, no. 1-2, Jan.
[37] E. Ekmekcioglu, V. Velisavljević, and S. T. Worrall, "Content adaptive enhancement of multi-view depth maps for free viewpoint video," IEEE J. Sel. Topics Signal Process., vol. 5, no. 2, Apr.
[38] M. Kurc, O. Stankiewicz, and M. Domanski, "Depth map inter-view consistency refinement for multiview video," presented at the Picture Coding Symp., Poznan, Poland, May.
[39] X. H. Zhang, W. S. Lin, and P. Xue, "Just-noticeable difference estimation with pixels in images," J. Vis. Commun. Image Represent., vol. 19, no. 1, Jan.
[40] Y. Zhao, Z. Chen, C. Zhu, Y. P. Tan, and L. Yu, "Binocular JND model for stereoscopic images," IEEE Signal Process. Lett., vol. 18, no. 1, Jan.
[41] P. Merkle, Y. Morvan, A. Smolic, D. Farin, K. Muller, P. H. N. de With, and T. Wiegand, "The effect of multiview depth video compression on multiview rendering," Signal Process.: Image Commun., vol. 24, no. 1-2, Jan.
[42] L.
S. Karlsson and M. Sjostrom, "Layer assignment based on depth data distribution for multiview-plus-depth scalable video coding," IEEE Trans. Circuits Syst. Video Technol., vol. 21, no. 6, Jun.
[43] T. Chen, W. Yin, X. S. Zhou, D. Comaniciu, and T. S. Huang, "Total variation models for variable lighting face recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 9, Sep.
[44] Depth Estimation Reference Software (DERS) 3.0, ISO/IEC JTC1/SC29/WG11, Maui, HI, Apr.
[45] Calculation of Average PSNR Differences Between RD-Curves, ITU-T SG16/Q6, Austin, TX.
[46] Perceptual Measurement for Evaluating Quality of View Synthesis, ISO/IEC JTC1/SC29/WG11, Maui, HI, Apr.

Weisi Lin (M'92, SM'98) received the B.Sc. and M.Sc. degrees from Zhongshan University, Guangzhou, China, and the Ph.D. degree from King's College London, London, U.K. He was the Lab Head, Visual Processing, and the Acting Department Manager, Media Processing, at the Institute for Infocomm Research, Singapore. Currently, he is an Associate Professor in the School of Computer Engineering, Nanyang Technological University, Singapore. His areas of expertise include image processing, perceptual modeling, video compression, multimedia communication, and computer vision. He has published 200+ refereed papers in international journals and conferences. He is on the editorial board of the Journal of Visual Communication and Image Representation. Dr. Lin is on the editorial boards of the IEEE TRANSACTIONS ON MULTIMEDIA and the IEEE SIGNAL PROCESSING LETTERS. He served as the Lead Guest Editor for a special issue on perceptual signal processing of the IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING. He chairs the IEEE MMTC Special Interest Group on Quality of Experience. He has been elected as a Distinguished Lecturer of APSIPA (2012/3).
He is the Lead Technical Program Chair for the Pacific-Rim Conference on Multimedia (PCM) 2012, and a Technical Program Chair for the IEEE International Conference on Multimedia and Expo (ICME). He is a Chartered Engineer (U.K.), a Fellow of the Institution of Engineering and Technology, and an Honorary Fellow of the Singapore Institute of Engineering Technologists.

Gangyi Jiang received the M.S. degree from Hangzhou University, China, in 1992, and the Ph.D. degree from Ajou University, Gyeonggi-do, South Korea. He is now a Professor in the Faculty of Information Science and Engineering, Ningbo University, Ningbo, China. His research interests mainly include digital video compression and multi-view video coding.

Mei Yu received the M.S. degree from the Hangzhou Institute of Electronics Engineering, Hangzhou, China, in 1993, and the Ph.D. degree from Ajou University, Gyeonggi-do, South Korea. She is now a Professor in the Faculty of Information Science and Engineering, Ningbo University, Ningbo, China. Her research interests include image/video coding and video perception.

Feng Shao received the B.S. and Ph.D. degrees from Zhejiang University, Hangzhou, China, in 2002 and 2007, respectively, both in electronic science and technology. He is currently an Associate Professor in the Faculty of Information Science and Engineering, Ningbo University, Ningbo, China. He was a Visiting Fellow with the School of Computer Engineering, Nanyang Technological University, Singapore, from February 2012 to August. His research interests include 3-D video coding, 3-D quality assessment, and image perception.

Qionghai Dai (SM'05) received the B.S. degree from Shanxi Normal University, Shanxi, China, in 1987, and the M.E. and Ph.D. degrees from Northeastern University, Shenyang, China, in 1994 and 1996, respectively. Since 1997, he has been with the faculty of Tsinghua University, Beijing, China, where he is currently a Professor and the Director of the Broadband Networks and Digital Media Laboratory.
His research areas include video communication, computer vision, and computational photography.


INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO ISO/IEC JTC1/SC29/WG11 MPEG2011/N12559 February 2012,

More information

Quality improving techniques in DIBR for free-viewpoint video Do, Q.L.; Zinger, S.; Morvan, Y.; de With, P.H.N.

Quality improving techniques in DIBR for free-viewpoint video Do, Q.L.; Zinger, S.; Morvan, Y.; de With, P.H.N. Quality improving techniques in DIBR for free-viewpoint video Do, Q.L.; Zinger, S.; Morvan, Y.; de With, P.H.N. Published in: Proceedings of the 3DTV Conference : The True Vision - Capture, Transmission

More information

Key-Words: - Free viewpoint video, view generation, block based disparity map, disparity refinement, rayspace.

Key-Words: - Free viewpoint video, view generation, block based disparity map, disparity refinement, rayspace. New View Generation Method for Free-Viewpoint Video System GANGYI JIANG*, LIANGZHONG FAN, MEI YU AND FENG SHAO Faculty of Information Science and Engineering Ningbo University 315211 Ningbo CHINA jianggangyi@126.com

More information

A reversible data hiding based on adaptive prediction technique and histogram shifting

A reversible data hiding based on adaptive prediction technique and histogram shifting A reversible data hiding based on adaptive prediction technique and histogram shifting Rui Liu, Rongrong Ni, Yao Zhao Institute of Information Science Beijing Jiaotong University E-mail: rrni@bjtu.edu.cn

More information

Reducing/eliminating visual artifacts in HEVC by the deblocking filter.

Reducing/eliminating visual artifacts in HEVC by the deblocking filter. 1 Reducing/eliminating visual artifacts in HEVC by the deblocking filter. EE5359 Multimedia Processing Project Proposal Spring 2014 The University of Texas at Arlington Department of Electrical Engineering

More information

Optimizing the Deblocking Algorithm for. H.264 Decoder Implementation

Optimizing the Deblocking Algorithm for. H.264 Decoder Implementation Optimizing the Deblocking Algorithm for H.264 Decoder Implementation Ken Kin-Hung Lam Abstract In the emerging H.264 video coding standard, a deblocking/loop filter is required for improving the visual

More information

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD Siwei Ma, Shiqi Wang, Wen Gao {swma,sqwang, wgao}@pku.edu.cn Institute of Digital Media, Peking University ABSTRACT IEEE 1857 is a multi-part standard for multimedia

More information

Conversion of free-viewpoint 3D multi-view video for stereoscopic displays Do, Q.L.; Zinger, S.; de With, P.H.N.

Conversion of free-viewpoint 3D multi-view video for stereoscopic displays Do, Q.L.; Zinger, S.; de With, P.H.N. Conversion of free-viewpoint 3D multi-view video for stereoscopic displays Do, Q.L.; Zinger, S.; de With, P.H.N. Published in: Proceedings of the 2010 IEEE International Conference on Multimedia and Expo

More information

Scene Segmentation by Color and Depth Information and its Applications

Scene Segmentation by Color and Depth Information and its Applications Scene Segmentation by Color and Depth Information and its Applications Carlo Dal Mutto Pietro Zanuttigh Guido M. Cortelazzo Department of Information Engineering University of Padova Via Gradenigo 6/B,

More information

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding.

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Project Title: Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding. Midterm Report CS 584 Multimedia Communications Submitted by: Syed Jawwad Bukhari 2004-03-0028 About

More information

Graph-based representation for multiview images with complex camera configurations

Graph-based representation for multiview images with complex camera configurations Graph-based representation for multiview images with complex camera configurations Xin Su, Thomas Maugey, Christine Guillemot To cite this version: Xin Su, Thomas Maugey, Christine Guillemot. Graph-based

More information

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 7, NO. 2, APRIL 1997 429 Express Letters A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation Jianhua Lu and

More information

Novel United Buffer Rate Control Methods for Stereoscopic Video

Novel United Buffer Rate Control Methods for Stereoscopic Video JOURNAL OF SOFTWARE, VOL. 8, NO. 8, AUGUST 2013 2015 Novel United Buffer Rate Control Methods for Stereoscopic Video Yi Liao Faculty of Information Science and Engineering, Ningbo University, Ningbo, China

More information

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS

A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS A Novel Statistical Distortion Model Based on Mixed Laplacian and Uniform Distribution of Mpeg-4 FGS Xie Li and Wenjun Zhang Institute of Image Communication and Information Processing, Shanghai Jiaotong

More information

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 5, MAY

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 5, MAY IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 5, MAY 2015 1573 Graph-Based Representation for Multiview Image Geometry Thomas Maugey, Member, IEEE, Antonio Ortega, Fellow Member, IEEE, and Pascal

More information

A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation

A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 1, JANUARY 2001 111 A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation

More information

QUAD-TREE PARTITIONED COMPRESSED SENSING FOR DEPTH MAP CODING. Ying Liu, Krishna Rao Vijayanagar, and Joohee Kim

QUAD-TREE PARTITIONED COMPRESSED SENSING FOR DEPTH MAP CODING. Ying Liu, Krishna Rao Vijayanagar, and Joohee Kim QUAD-TREE PARTITIONED COMPRESSED SENSING FOR DEPTH MAP CODING Ying Liu, Krishna Rao Vijayanagar, and Joohee Kim Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago,

More information

EXPLORING ON STEGANOGRAPHY FOR LOW BIT RATE WAVELET BASED CODER IN IMAGE RETRIEVAL SYSTEM

EXPLORING ON STEGANOGRAPHY FOR LOW BIT RATE WAVELET BASED CODER IN IMAGE RETRIEVAL SYSTEM TENCON 2000 explore2 Page:1/6 11/08/00 EXPLORING ON STEGANOGRAPHY FOR LOW BIT RATE WAVELET BASED CODER IN IMAGE RETRIEVAL SYSTEM S. Areepongsa, N. Kaewkamnerd, Y. F. Syed, and K. R. Rao The University

More information

A SCALABLE CODING APPROACH FOR HIGH QUALITY DEPTH IMAGE COMPRESSION

A SCALABLE CODING APPROACH FOR HIGH QUALITY DEPTH IMAGE COMPRESSION This material is published in the open archive of Mid Sweden University DIVA http://miun.diva-portal.org to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein

More information

Lossless Compression of Stereo Disparity Maps for 3D

Lossless Compression of Stereo Disparity Maps for 3D 2012 IEEE International Conference on Multimedia and Expo Workshops Lossless Compression of Stereo Disparity Maps for 3D Marco Zamarin, Søren Forchhammer Department of Photonics Engineering Technical University

More information

Fast Mode Decision for Depth Video Coding Using H.264/MVC *

Fast Mode Decision for Depth Video Coding Using H.264/MVC * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 31, 1693-1710 (2015) Fast Mode Decision for Depth Video Coding Using H.264/MVC * CHIH-HUNG LU, HAN-HSUAN LIN AND CHIH-WEI TANG* Department of Communication

More information

A New Fast Motion Estimation Algorithm. - Literature Survey. Instructor: Brian L. Evans. Authors: Yue Chen, Yu Wang, Ying Lu.

A New Fast Motion Estimation Algorithm. - Literature Survey. Instructor: Brian L. Evans. Authors: Yue Chen, Yu Wang, Ying Lu. A New Fast Motion Estimation Algorithm - Literature Survey Instructor: Brian L. Evans Authors: Yue Chen, Yu Wang, Ying Lu Date: 10/19/1998 A New Fast Motion Estimation Algorithm 1. Abstract Video compression

More information

CONTENT ADAPTIVE COMPLEXITY REDUCTION SCHEME FOR QUALITY/FIDELITY SCALABLE HEVC

CONTENT ADAPTIVE COMPLEXITY REDUCTION SCHEME FOR QUALITY/FIDELITY SCALABLE HEVC CONTENT ADAPTIVE COMPLEXITY REDUCTION SCHEME FOR QUALITY/FIDELITY SCALABLE HEVC Hamid Reza Tohidypour, Mahsa T. Pourazad 1,2, and Panos Nasiopoulos 1 1 Department of Electrical & Computer Engineering,

More information

DISPARITY-ADJUSTED 3D MULTI-VIEW VIDEO CODING WITH DYNAMIC BACKGROUND MODELLING

DISPARITY-ADJUSTED 3D MULTI-VIEW VIDEO CODING WITH DYNAMIC BACKGROUND MODELLING DISPARITY-ADJUSTED 3D MULTI-VIEW VIDEO CODING WITH DYNAMIC BACKGROUND MODELLING Manoranjan Paul and Christopher J. Evans School of Computing and Mathematics, Charles Sturt University, Australia Email:

More information

Context based optimal shape coding

Context based optimal shape coding IEEE Signal Processing Society 1999 Workshop on Multimedia Signal Processing September 13-15, 1999, Copenhagen, Denmark Electronic Proceedings 1999 IEEE Context based optimal shape coding Gerry Melnikov,

More information

Implementation and analysis of Directional DCT in H.264

Implementation and analysis of Directional DCT in H.264 Implementation and analysis of Directional DCT in H.264 EE 5359 Multimedia Processing Guidance: Dr K R Rao Priyadarshini Anjanappa UTA ID: 1000730236 priyadarshini.anjanappa@mavs.uta.edu Introduction A

More information

An Independent Motion and Disparity Vector Prediction Method for Multiview Video Coding

An Independent Motion and Disparity Vector Prediction Method for Multiview Video Coding Preprint Version (2011) An Independent Motion and Disparity Vector Prediction Method for Multiview Video Coding Seungchul Ryu a, Jungdong Seo a, Dong Hyun Kim a, Jin Young Lee b, Ho-Cheon Wey b, and Kwanghoon

More information

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain

Automatic Video Caption Detection and Extraction in the DCT Compressed Domain Automatic Video Caption Detection and Extraction in the DCT Compressed Domain Chin-Fu Tsao 1, Yu-Hao Chen 1, Jin-Hau Kuo 1, Chia-wei Lin 1, and Ja-Ling Wu 1,2 1 Communication and Multimedia Laboratory,

More information

EFFICIENT PU MODE DECISION AND MOTION ESTIMATION FOR H.264/AVC TO HEVC TRANSCODER

EFFICIENT PU MODE DECISION AND MOTION ESTIMATION FOR H.264/AVC TO HEVC TRANSCODER EFFICIENT PU MODE DECISION AND MOTION ESTIMATION FOR H.264/AVC TO HEVC TRANSCODER Zong-Yi Chen, Jiunn-Tsair Fang 2, Tsai-Ling Liao, and Pao-Chi Chang Department of Communication Engineering, National Central

More information

Analysis of Depth Map Resampling Filters for Depth-based 3D Video Coding

Analysis of Depth Map Resampling Filters for Depth-based 3D Video Coding MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Analysis of Depth Map Resampling Filters for Depth-based 3D Video Coding Graziosi, D.B.; Rodrigues, N.M.M.; de Faria, S.M.M.; Tian, D.; Vetro,

More information

EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.

EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H. EE 5359 MULTIMEDIA PROCESSING SPRING 2011 Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.264 Under guidance of DR K R RAO DEPARTMENT OF ELECTRICAL ENGINEERING UNIVERSITY

More information

AN EFFICIENT VIDEO WATERMARKING USING COLOR HISTOGRAM ANALYSIS AND BITPLANE IMAGE ARRAYS

AN EFFICIENT VIDEO WATERMARKING USING COLOR HISTOGRAM ANALYSIS AND BITPLANE IMAGE ARRAYS AN EFFICIENT VIDEO WATERMARKING USING COLOR HISTOGRAM ANALYSIS AND BITPLANE IMAGE ARRAYS G Prakash 1,TVS Gowtham Prasad 2, T.Ravi Kumar Naidu 3 1MTech(DECS) student, Department of ECE, sree vidyanikethan

More information

Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform

Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform Optimized Progressive Coding of Stereo Images Using Discrete Wavelet Transform Torsten Palfner, Alexander Mali and Erika Müller Institute of Telecommunications and Information Technology, University of

More information

signal-to-noise ratio (PSNR), 2

signal-to-noise ratio (PSNR), 2 u m " The Integration in Optics, Mechanics, and Electronics of Digital Versatile Disc Systems (1/3) ---(IV) Digital Video and Audio Signal Processing ƒf NSC87-2218-E-009-036 86 8 1 --- 87 7 31 p m o This

More information

Depth Map Boundary Filter for Enhanced View Synthesis in 3D Video

Depth Map Boundary Filter for Enhanced View Synthesis in 3D Video J Sign Process Syst (2017) 88:323 331 DOI 10.1007/s11265-016-1158-x Depth Map Boundary Filter for Enhanced View Synthesis in 3D Video Yunseok Song 1 & Yo-Sung Ho 1 Received: 24 April 2016 /Accepted: 7

More information

Extensions of H.264/AVC for Multiview Video Compression

Extensions of H.264/AVC for Multiview Video Compression MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Extensions of H.264/AVC for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, Anthony Vetro, Huifang Sun TR2006-048 June

More information

An Efficient Saliency Based Lossless Video Compression Based On Block-By-Block Basis Method

An Efficient Saliency Based Lossless Video Compression Based On Block-By-Block Basis Method An Efficient Saliency Based Lossless Video Compression Based On Block-By-Block Basis Method Ms. P.MUTHUSELVI, M.E(CSE), V.P.M.M Engineering College for Women, Krishnankoil, Virudhungar(dt),Tamil Nadu Sukirthanagarajan@gmail.com

More information

A 3-D Virtual SPIHT for Scalable Very Low Bit-Rate Embedded Video Compression

A 3-D Virtual SPIHT for Scalable Very Low Bit-Rate Embedded Video Compression A 3-D Virtual SPIHT for Scalable Very Low Bit-Rate Embedded Video Compression Habibollah Danyali and Alfred Mertins University of Wollongong School of Electrical, Computer and Telecommunications Engineering

More information

Compression-Induced Rendering Distortion Analysis for Texture/Depth Rate Allocation in 3D Video Compression

Compression-Induced Rendering Distortion Analysis for Texture/Depth Rate Allocation in 3D Video Compression 2009 Data Compression Conference Compression-Induced Rendering Distortion Analysis for Texture/Depth Rate Allocation in 3D Video Compression Yanwei Liu, Siwei Ma, Qingming Huang, Debin Zhao, Wen Gao, Nan

More information

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding

Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding Deblocking Filter Algorithm with Low Complexity for H.264 Video Coding Jung-Ah Choi and Yo-Sung Ho Gwangju Institute of Science and Technology (GIST) 261 Cheomdan-gwagiro, Buk-gu, Gwangju, 500-712, Korea

More information

A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images

A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images A Laplacian Based Novel Approach to Efficient Text Localization in Grayscale Images Karthik Ram K.V & Mahantesh K Department of Electronics and Communication Engineering, SJB Institute of Technology, Bangalore,

More information

Focus on visual rendering quality through content-based depth map coding

Focus on visual rendering quality through content-based depth map coding Focus on visual rendering quality through content-based depth map coding Emilie Bosc, Muriel Pressigout, Luce Morin To cite this version: Emilie Bosc, Muriel Pressigout, Luce Morin. Focus on visual rendering

More information

FAST MOTION ESTIMATION DISCARDING LOW-IMPACT FRACTIONAL BLOCKS. Saverio G. Blasi, Ivan Zupancic and Ebroul Izquierdo

FAST MOTION ESTIMATION DISCARDING LOW-IMPACT FRACTIONAL BLOCKS. Saverio G. Blasi, Ivan Zupancic and Ebroul Izquierdo FAST MOTION ESTIMATION DISCARDING LOW-IMPACT FRACTIONAL BLOCKS Saverio G. Blasi, Ivan Zupancic and Ebroul Izquierdo School of Electronic Engineering and Computer Science, Queen Mary University of London

More information

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames

A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames A Quantized Transform-Domain Motion Estimation Technique for H.264 Secondary SP-frames Ki-Kit Lai, Yui-Lam Chan, and Wan-Chi Siu Centre for Signal Processing Department of Electronic and Information Engineering

More information

DEPTH IMAGE BASED RENDERING WITH ADVANCED TEXTURE SYNTHESIS. P. Ndjiki-Nya, M. Köppel, D. Doshkov, H. Lakshman, P. Merkle, K. Müller, and T.

DEPTH IMAGE BASED RENDERING WITH ADVANCED TEXTURE SYNTHESIS. P. Ndjiki-Nya, M. Köppel, D. Doshkov, H. Lakshman, P. Merkle, K. Müller, and T. DEPTH IMAGE BASED RENDERING WITH ADVANCED TEXTURE SYNTHESIS P. Ndjiki-Nya, M. Köppel, D. Doshkov, H. Lakshman, P. Merkle, K. Müller, and T. Wiegand Fraunhofer Institut for Telecommunications, Heinrich-Hertz-Institut

More information

Quality versus Intelligibility: Evaluating the Coding Trade-offs for American Sign Language Video

Quality versus Intelligibility: Evaluating the Coding Trade-offs for American Sign Language Video Quality versus Intelligibility: Evaluating the Coding Trade-offs for American Sign Language Video Frank Ciaramello, Jung Ko, Sheila Hemami School of Electrical and Computer Engineering Cornell University,

More information

View Synthesis for Multiview Video Compression

View Synthesis for Multiview Video Compression View Synthesis for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, and Anthony Vetro email:{martinian,jxin,avetro}@merl.com, behrens@tnt.uni-hannover.de Mitsubishi Electric Research

More information

On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks

On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks 2011 Wireless Advanced On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks S. Colonnese, F. Cuomo, O. Damiano, V. De Pascalis and T. Melodia University of Rome, Sapienza, DIET,

More information

Analysis of 3D and Multiview Extensions of the Emerging HEVC Standard

Analysis of 3D and Multiview Extensions of the Emerging HEVC Standard MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Analysis of 3D and Multiview Extensions of the Emerging HEVC Standard Vetro, A.; Tian, D. TR2012-068 August 2012 Abstract Standardization of

More information

Template based illumination compensation algorithm for multiview video coding

Template based illumination compensation algorithm for multiview video coding Template based illumination compensation algorithm for multiview video coding Xiaoming Li* a, Lianlian Jiang b, Siwei Ma b, Debin Zhao a, Wen Gao b a Department of Computer Science and technology, Harbin

More information

HEVC based Stereo Video codec

HEVC based Stereo Video codec based Stereo Video B Mallik*, A Sheikh Akbari*, P Bagheri Zadeh *School of Computing, Creative Technology & Engineering, Faculty of Arts, Environment & Technology, Leeds Beckett University, U.K. b.mallik6347@student.leedsbeckett.ac.uk,

More information

An Information Hiding Algorithm for HEVC Based on Angle Differences of Intra Prediction Mode

An Information Hiding Algorithm for HEVC Based on Angle Differences of Intra Prediction Mode An Information Hiding Algorithm for HEVC Based on Angle Differences of Intra Prediction Mode Jia-Ji Wang1, Rang-Ding Wang1*, Da-Wen Xu1, Wei Li1 CKC Software Lab, Ningbo University, Ningbo, Zhejiang 3152,

More information

Further Reduced Resolution Depth Coding for Stereoscopic 3D Video

Further Reduced Resolution Depth Coding for Stereoscopic 3D Video Further Reduced Resolution Depth Coding for Stereoscopic 3D Video N. S. Mohamad Anil Shah, H. Abdul Karim, and M. F. Ahmad Fauzi Multimedia University, 63100 Cyberjaya, Selangor, Malaysia Abstract In this

More information

Video Compression System for Online Usage Using DCT 1 S.B. Midhun Kumar, 2 Mr.A.Jayakumar M.E 1 UG Student, 2 Associate Professor

Video Compression System for Online Usage Using DCT 1 S.B. Midhun Kumar, 2 Mr.A.Jayakumar M.E 1 UG Student, 2 Associate Professor Video Compression System for Online Usage Using DCT 1 S.B. Midhun Kumar, 2 Mr.A.Jayakumar M.E 1 UG Student, 2 Associate Professor Department Electronics and Communication Engineering IFET College of Engineering

More information

Scalable Bit Allocation between Texture and Depth Views for 3D Video Streaming over Heterogeneous Networks

Scalable Bit Allocation between Texture and Depth Views for 3D Video Streaming over Heterogeneous Networks Scalable Bit Allocation between Texture and Depth Views for 3D Video Streaming over Heterogeneous Networks Jimin XIAO, Miska M. HANNUKSELA, Member, IEEE, Tammam TILLO, Senior Member, IEEE, Moncef GABBOUJ,

More information

3D Mesh Sequence Compression Using Thin-plate Spline based Prediction

3D Mesh Sequence Compression Using Thin-plate Spline based Prediction Appl. Math. Inf. Sci. 10, No. 4, 1603-1608 (2016) 1603 Applied Mathematics & Information Sciences An International Journal http://dx.doi.org/10.18576/amis/100440 3D Mesh Sequence Compression Using Thin-plate

More information

Image Error Concealment Based on Watermarking

Image Error Concealment Based on Watermarking Image Error Concealment Based on Watermarking Shinfeng D. Lin, Shih-Chieh Shie and Jie-Wei Chen Department of Computer Science and Information Engineering,National Dong Hwa Universuty, Hualien, Taiwan,

More information

Complexity Reduced Mode Selection of H.264/AVC Intra Coding

Complexity Reduced Mode Selection of H.264/AVC Intra Coding Complexity Reduced Mode Selection of H.264/AVC Intra Coding Mohammed Golam Sarwer 1,2, Lai-Man Po 1, Jonathan Wu 2 1 Department of Electronic Engineering City University of Hong Kong Kowloon, Hong Kong

More information

High Efficient Intra Coding Algorithm for H.265/HVC

High Efficient Intra Coding Algorithm for H.265/HVC H.265/HVC における高性能符号化アルゴリズムに関する研究 宋天 1,2* 三木拓也 2 島本隆 1,2 High Efficient Intra Coding Algorithm for H.265/HVC by Tian Song 1,2*, Takuya Miki 2 and Takashi Shimamoto 1,2 Abstract This work proposes a novel

More information

PAPER Optimal Quantization Parameter Set for MPEG-4 Bit-Rate Control

PAPER Optimal Quantization Parameter Set for MPEG-4 Bit-Rate Control 3338 PAPER Optimal Quantization Parameter Set for MPEG-4 Bit-Rate Control Dong-Wan SEO, Seong-Wook HAN, Yong-Goo KIM, and Yoonsik CHOE, Nonmembers SUMMARY In this paper, we propose an optimal bit rate

More information

MOTION estimation is one of the major techniques for

MOTION estimation is one of the major techniques for 522 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 18, NO. 4, APRIL 2008 New Block-Based Motion Estimation for Sequences with Brightness Variation and Its Application to Static Sprite

More information

Network Image Coding for Multicast

Network Image Coding for Multicast Network Image Coding for Multicast David Varodayan, David Chen and Bernd Girod Information Systems Laboratory, Stanford University Stanford, California, USA {varodayan, dmchen, bgirod}@stanford.edu Abstract

More information

ISSN: An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding

ISSN: An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding An Efficient Fully Exploiting Spatial Correlation of Compress Compound Images in Advanced Video Coding Ali Mohsin Kaittan*1 President of the Association of scientific research and development in Iraq Abstract

More information

CONTENT ADAPTIVE SCREEN IMAGE SCALING

CONTENT ADAPTIVE SCREEN IMAGE SCALING CONTENT ADAPTIVE SCREEN IMAGE SCALING Yao Zhai (*), Qifei Wang, Yan Lu, Shipeng Li University of Science and Technology of China, Hefei, Anhui, 37, China Microsoft Research, Beijing, 8, China ABSTRACT

More information

Image Quality Assessment Techniques: An Overview

Image Quality Assessment Techniques: An Overview Image Quality Assessment Techniques: An Overview Shruti Sonawane A. M. Deshpande Department of E&TC Department of E&TC TSSM s BSCOER, Pune, TSSM s BSCOER, Pune, Pune University, Maharashtra, India Pune

More information

A content based method for perceptually driven joint color/depth compression

A content based method for perceptually driven joint color/depth compression A content based method for perceptually driven joint color/depth compression Emilie Bosc, Luce Morin, Muriel Pressigout To cite this version: Emilie Bosc, Luce Morin, Muriel Pressigout. A content based

More information

A Comparison of Still-Image Compression Standards Using Different Image Quality Metrics and Proposed Methods for Improving Lossy Image Quality

A Comparison of Still-Image Compression Standards Using Different Image Quality Metrics and Proposed Methods for Improving Lossy Image Quality A Comparison of Still-Image Compression Standards Using Different Image Quality Metrics and Proposed Methods for Improving Lossy Image Quality Multidimensional DSP Literature Survey Eric Heinen 3/21/08

More information

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264 Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264 Jing Hu and Jerry D. Gibson Department of Electrical and Computer Engineering University of California, Santa Barbara, California

More information

Partial Video Encryption Using Random Permutation Based on Modification on Dct Based Transformation

Partial Video Encryption Using Random Permutation Based on Modification on Dct Based Transformation International Refereed Journal of Engineering and Science (IRJES) ISSN (Online) 2319-183X, (Print) 2319-1821 Volume 2, Issue 6 (June 2013), PP. 54-58 Partial Video Encryption Using Random Permutation Based

More information

/$ IEEE

/$ IEEE IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 56, NO. 1, JANUARY 2009 81 Bit-Level Extrinsic Information Exchange Method for Double-Binary Turbo Codes Ji-Hoon Kim, Student Member,

More information