3D Video Processing Algorithms Part I. Sergey Smirnov Atanas Gotchev Sumeet Sen Gerhard Tech Heribert Brust



3D Video Processing Algorithms Part I

Sergey Smirnov, Atanas Gotchev, Sumeet Sen, Gerhard Tech, Heribert Brust

Abstract: This report describes algorithms developed to enhance the quality of 3D video. On the pre-processing side, we address the following scenarios: stereo video of higher resolution to be downscaled to meet the resolution of a mobile 3D display; stereo video captured in noisy conditions (e.g. user-created content) to be denoised; and depth maps in the view-plus-depth format to be further refined. On the post-processing side, we address the problem of depth maps impaired by blocky artifacts resulting from block-transform-based encoders, such as H.264. For all these cases, we investigate advanced algorithms and present experimental results illustrating their performance.

Keywords: mobile 3D video resolution, mixed resolution coding, down-sampling, up-sampling, denoising, 3D grouping and transform-domain collaborative filtering, local polynomial approximation, bilateral filtering, hypothesis filtering, time consistency, depth map filtering.

Executive Summary

We present the first part of pre- and post-processing methods for 3D video represented in different formats. In this report we concentrate on sampling rate conversion for stereo video, stereo-video denoising, and refinement of depth maps in the view-plus-depth representation.

Sampling rate conversion is required when higher-definition video is to be downscaled to mobile resolution. It also appears in mixed resolution stereo representation schemes, where one of the channels is deliberately downscaled for the sake of more effective compression and then upscaled back for visualization. For this task, standard up- and down-sampling methods, as well as an alternative simple FIR down-sampling filter with variable cutoff frequency, have been presented and evaluated. Coding experiments demonstrate that the simple FIR filter with a cutoff frequency of approximately 0.6 outperforms the standard methods: PSNR gains of up to 1 dB at a constant bit rate, or bit rate savings of up to 30% at a constant PSNR, can be achieved.

Denoising of stereo video may be needed when the content to be delivered to the mobile device has been created under low-light conditions. Noisy channels are problematic not only for creating pleasant stereo perception but also for compression, depth estimation and view synthesis. One of the most competitive video denoising methods, abbreviated VBM3D (video block-matching in 3D), has been evaluated for its applicability and performance on stereo video. Experiments demonstrate that the denoised left and right video channels are of very high quality, with all 3D visual cues well preserved and in fact even enhanced. From an implementation point of view, the results show equal performance whether the algorithm is applied to the two channels independently or jointly. A marginal improvement can be expected only for content with a high amount of motion.
Deblocking of depth maps is perhaps one of the most important pre- and post-processing tasks for the view-plus-depth representation format, since practitioners tend to employ standard, i.e. block-transform-based, compression methods. A set of five filtering approaches has been tested, varying from simple Gaussian smoothing, through standard H.264 deblocking, to more sophisticated methods utilizing structural and colour constraints from the accompanying colour video channel. The methods have been optimized with respect to the quantization parameter of the H.264 compression used, and the experiments have ranked the methods by their performance. For the best-performing method, we suggest practical modifications leading to a faster and more memory-efficient implementation. We have also extended the same method to video and to more general types of depth impairment (e.g. resulting from fast depth estimation or noise). Our approach yields highly time-consistent depth sequences that adequately restore the depth properties of the 3D scenes.

Table of Contents

1 Introduction
2 Evaluation of down-sampling methods for Mixed Resolution Coding
2.1 Sampling Methods
2.1.1 Standard anti-aliasing filters
2.1.2 Standard interpolation filters
2.1.3 FIR anti-aliasing filter with variable cutoff frequency (VCF)
2.2 Coding Experiments
2.2.1 Setup
2.2.2 Results
3 Filtering of color stereo video sequences
3.1 Introduction
3.2 Denoising of stereo video by VBM3D
3.3 Experiments
4 Restoration of block transform compressed depth maps
4.1 Introduction
4.2 Problem Formulation
4.3 Depth maps filtering approaches
4.3.1 Gaussian Filtering
4.3.2 Adaptive H.264 Loop-Filtering
4.3.3 Local Polynomial Approximation approach
4.3.4 Bilateral Filter
4.3.5 Hypothesis filtering approach
4.4 Quality measures
4.5 Experimental results
5 Temporally-consistent filtering of depth map sequences
5.1 Introduction
5.2 Problem formulation
5.3 Extending the filtering approach to video
5.4 Experiments
5.5 Results
6 Conclusions

1 Introduction

This deliverable consists of four parts. The first part deals with down-sampling and up-sampling of stereo video in the mixed resolution stereo representation. The second part deals with color channel filtering, particularly with denoising, in order to increase the quality of subsequent depth estimation and view synthesis. The third part describes methods for deblocking of depth maps impaired by compression artifacts. In the fourth part we extend the most effective filtering approach from the previous part to depth map sequences and to more general types of depth map distortion. We especially target better time-consistency, to avoid flickering and some other 3D artifacts in the synthesized views [37]. The first part is authored by Gerhard Tech and Heribert Brust from Fraunhofer HHI, the second part by Sumeet Sen and Atanas Gotchev, and the third and fourth parts by Sergey Smirnov and Atanas Gotchev from TTY.

2 Evaluation of down-sampling methods for Mixed Resolution Coding

The mixed resolution approach is based on the transmission of one full and one down-sampled view. In a pre-processing step one view of a stereoscopic sequence is decimated. The decimated and the full view are coded and transmitted. At the receiver side the decimated view is up-sampled again ([1], [2]). Although decimation and interpolation are theoretically solved problems, in practice a great variety of up- and down-sampling methods exists, differing in the design of the anti-aliasing and interpolation filters. In this scope, additional design factors that affect the performance of up- and down-sampling have to be regarded: the distortions introduced by coding, and the low resolution of content suitable for display on mobile devices. To achieve the best overall quality using the mixed resolution approach, two standard methods previously used by the VCEG/MPEG Joint Video Team (JVT) are analyzed and evaluated in this section. Moreover, an approach using a down-sampling filter with variable cutoff frequency is optimized and evaluated.

2.1 Sampling Methods

The standard sampling methods discussed in sections 2.1.1 and 2.1.2 are implemented in the resample tool downconvert provided with the Reference Software for Scalable Video Coding, JSVM [3]. An implementation of the filter with variable cutoff frequency presented in section 2.1.3 is part of the Mathworks Matlab software [4]. All filters are applied separately in the vertical and horizontal directions.

2.1.1 Standard anti-aliasing filters

Sine-windowed sinc (SWS). For down-sampling the filter given in [5] is used. Filter coefficients are given by a sine-windowed sinc function of the form

    f(n) = sinc(n/2) · sin(π(n + N/2)/N)  for |n| < N/2,  f(n) = 0 otherwise,   (1)

where N determines the window length. For decimation by a factor of 2 this leads to a 14-tap filter. In [5] this filter is collapsed to a 12-tap filter, whereas the software implementation clips it to an 8-tap filter. The magnitude and impulse responses of this filter are shown in Figure 2.1.
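As a rough illustration of the windowed-sinc principle (a sketch only, not the actual JSVM coefficients; the tap count and exact window shape are our assumptions):

```python
import numpy as np

def sine_windowed_sinc(num_taps=15):
    """Sketch of a sine-windowed sinc anti-aliasing filter for decimation
    by 2 (cutoff at half Nyquist). Not the exact JSVM coefficients."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    ideal = np.sinc(n / 2.0)                                   # ideal half-band low-pass
    window = np.sin(np.pi * (np.arange(num_taps) + 0.5) / num_taps)  # sine window
    h = ideal * window
    return h / h.sum()                                         # unity gain at DC

def decimate_by_2(x, h):
    """Anti-alias filter, then drop every second sample."""
    return np.convolve(x, h, mode='same')[::2]
```

Down-sampling a constant 64-sample signal then yields 32 samples that remain constant away from the borders, confirming the unity DC gain.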

Figure 2.1: Impulse and magnitude response of the sine-windowed sinc down-sampling filter

Dyadic down-sampling filter (DDS). For down-sampling the dyadic filter presented in [6] is used. For the mixed resolution approach decimation by a factor of two is sufficient; therefore the filter has to be applied only once. The impulse and magnitude responses of the filter are shown in Figure 2.2.

Figure 2.2: Impulse and magnitude response of the dyadic down-sampling filter

2.1.2 Standard interpolation filters

SVC normative up-sampling (SNU). Interpolation is based on a set of 4-tap filters. These integer-based 4-tap filters are originally derived from the Lanczos-3 filter. For a detailed description of the complex interpolation process please refer to [7].

Dyadic up-sampling filter (DUS). After doubling of the sampling rate, the AVC 6-tap half-pel filter presented in [6] is applied for interpolation. The impulse and magnitude responses are shown in Figure 2.3.

Figure 2.3: Impulse and magnitude response of the dyadic interpolation filter

2.1.3 FIR anti-aliasing filter with variable cutoff frequency (VCF)

In addition to the down-sampling filters provided with the JSVM reference software, a Hamming-windowed FIR filter with varying cutoff frequency has been evaluated. For a detailed description of the filter design please refer to [8]. Figure 2.4 and Figure 2.5 show the impulse and magnitude responses for cutoff frequencies of 0.4 and 0.6. The order of the filter has been set to 10.

Figure 2.4: Impulse and magnitude response of the VCF filter with normalized cutoff frequency 0.4

Figure 2.5: Impulse and magnitude response of the VCF filter with normalized cutoff frequency 0.6
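The effect of varying the cutoff can be illustrated with a sketch of such a filter (mirroring a MATLAB fir1-style windowed design, which is our assumption about the referenced implementation):

```python
import numpy as np

def vcf_filter(order=10, cutoff=0.6):
    """Sketch of the variable-cutoff (VCF) anti-aliasing filter: an
    order-10 Hamming-windowed FIR low-pass, cutoff normalized to the
    Nyquist frequency."""
    n = np.arange(order + 1) - order / 2.0
    h = cutoff * np.sinc(cutoff * n) * np.hamming(order + 1)
    return h / h.sum()                       # normalize to unity DC gain

def magnitude_response(h, nfft=512):
    """Magnitude response on nfft/2 + 1 bins from DC to Nyquist."""
    return np.abs(np.fft.rfft(h, nfft))

# A higher cutoff retains more high-frequency detail before decimation:
H04 = magnitude_response(vcf_filter(cutoff=0.4))
H06 = magnitude_response(vcf_filter(cutoff=0.6))
```

Comparing the two responses at mid-band (normalized frequency 0.5) shows the 0.6-cutoff filter passing noticeably more energy than the 0.4-cutoff one, which is exactly the detail-vs-aliasing trade-off discussed in the results below.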

2.2 Coding Experiments

2.2.1 Setup

For the evaluation of the down-sampling filters, one view of each sequence has been down-sampled, encoded and up-sampled again. Codec parameters are given in Table 2.1.

Table 2.1: Codec settings

  Profile                  Baseline
  GOP Size                 1 (IPPP)
  Symbol Mode              CAVLC
  8x8 Transform            Disabled
  Search Range             48
  Intra Period             16
  Quantization Parameter   24, 28, 32, 36, 40

The filter combinations shown in Table 2.2 have been examined. The first two combinations are the standard filters provided with the JSVM software. Note that the SWS and SNU filters introduce and remove a half-pel shift, hence a combination with the other filters is not possible. The last nine combinations utilize the VCF filter with cutoff frequencies from 0.1 to 0.9 for anti-aliasing and the DUS filter for interpolation.

Table 2.2: Combinations of evaluated up- and down-sampling methods and cutoff frequencies

  Down     DDS    SWS    VCF   VCF   VCF   VCF   VCF   VCF   VCF   VCF   VCF
  Cutoff   ~0.4   ~      0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
  Up       DUS    SNU    DUS   DUS   DUS   DUS   DUS   DUS   DUS   DUS   DUS

The six sequences from the coding test set of the stereo video database [9] have been used for evaluation. This leads to a total of 6 (sequences) x 11 (up/down combinations) x 5 (QPs) = 330 coded sequences.
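The size of the experiment follows directly from the grid of test conditions; it can be enumerated mechanically (the sequence names below are placeholders, since the test-set names are not listed here):

```python
from itertools import product

# Enumerate the coding runs: 6 sequences x 11 filter combinations x 5 QPs.
sequences = [f"seq{i}" for i in range(1, 7)]            # placeholder names
combinations = [("DDS", "DUS"), ("SWS", "SNU")] + [
    (f"VCF-{c / 10:.1f}", "DUS") for c in range(1, 10)  # cutoffs 0.1 ... 0.9
]
qps = [24, 28, 32, 36, 40]

runs = list(product(sequences, combinations, qps))      # 330 coded sequences
```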

2.2.2 Results

Results of the coding experiments are presented in Figure 2.6. The curves depict the PSNR vs. the bit rate of the down-sampled, coded and re-up-sampled right view. The uncoded full right view has been used as reference. The solid curves show results for the VCF down-sampling filter in combination with the dyadic up-sampling filter. For each QP the cutoff frequency was varied from 0.1 to 0.9 with a step size of 0.1; in Figure 2.6 the corresponding nine points are marked with crosses for each QP. With an increased cutoff frequency more detail is retained in the smoothed picture, hence coding leads to an increased bit rate. Therefore the leftmost rate-distortion point for each QP corresponds to a cutoff frequency of 0.1 and the rightmost point to a frequency of 0.9. The envelope of the solid curves is depicted as a yellow dashed line in Figure 2.6 and gives the rate-distortion points with optimal cutoff frequency. Regarding the PSNR measure, for most sequences and rate points the optimal cutoff frequency is around 0.6. Lower frequencies lead to over-smoothing and strongly reduced image quality. Higher frequencies result not only in a further increased bit rate but also in a reduced PSNR through the introduction of aliasing artifacts.

Results obtained using the standard methods provided with the JSVM software are presented as black and magenta dashed lines in Figure 2.6. Since the cutoff frequency is fixed, only the QP was varied here. It can be seen that the dyadic up- and down-sampling approach performs slightly better than the combination of the SWS down-sampling filter and the SVC normative up-sampling filter. A comparison with the optimized VCF filter shows that both methods are outperformed: the VCF filter leads to PSNR gains of up to 1 dB at a constant bit rate, or bit rate savings of up to 30% at a constant PSNR.

Figure 2.6: The solid lines show PSNR vs. bit rate for the down-sampled, coded and re-up-sampled view using the VCF filter; each curve represents a fixed QP, with the cutoff frequency increasing from left (0.1) to right (0.9). The dashed yellow curve is the envelope of the solid lines and shows the optimal cutoff frequencies; the dashed magenta and black curves show the results for varying QPs obtained with the JSVM tools.

3 Filtering of color stereo video sequences

3.1 Introduction

In recent years, denoising of still images and video has received high interest due to the availability of mobile imaging platforms and the recent trends in user-created content. Capture of images and video has become quite popular with the use of consumer and compact cameras, and content created by users with non-professional equipment has been spreading through content-sharing platforms. In many cases such content is created in low-illumination conditions and is quite noisy. This has driven the research interest in developing high-performance denoising methods. The state-of-the-art denoising approaches seek similarities between non-local patches within images or video frames and utilize them to obtain highly overcomplete and sparse representations, usually in a transform domain, where the noise can be effectively separated from the information signal and subsequently suppressed [11], [12], [13], [14]. Methods based on non-local means [11] and collaborative non-local transform-domain filtering [13] are considered the most powerful denoising approaches. We refer to the review paper [12] for a nice overview of the topic.

In our development, we consider a scenario where the input stereo video is impaired by noise. We evaluate the importance of having more information, as in stereo, for achieving better denoising results. Similar problems have been addressed in [15], [16], [17], where non-local means have been applied on multiple frames or along with a given depth map in a noisy multi-view setting. In our setting, we adopt the collaborative transform-domain filtering approach known as 3D Block-Matching (BM3D) [13] and its video version VBM3D [14], as they have shown superior performance for conventional 2D video.
We aim at quantifying the performance of this algorithm for stereo video and at investigating the advantage stereo video would bring to the approach while using sparse 3D transform-domain collaborative filtering.

3.2 Denoising of stereo video by VBM3D

We have applied the VBM3D algorithm as in [14]. The algorithm operates by identifying similar blocks in the spatial and temporal neighborhood of a reference block. The similarity is measured by Euclidean distance and the similar blocks are collected in a stack (3D block). This step is called grouping. The advantage of grouping is that highly similar signal fragments are put and processed together. The noise is then suppressed using collaborative filtering in the DCT domain, which takes advantage of the increased correlation between the grouped blocks. For video, the denoising is performed in two steps: predictive-search block-matching is combined with collaborative hard-thresholding in the first step and with collaborative Wiener filtering in the second step. Figure 7 shows a pictorial representation of the algorithm. The predictive-search block-matching is performed for successive video frames: assuming that the intra-frame search has identified the blocks similar to the reference one, these blocks are used to find new similar ones in positions close to their spatial positions (predictive search). Thus, the similarity search is extended along the temporal dimension with no need for explicit motion estimation. The algorithm essentially depends on the search range within the current video frame with respect to the reference block (intra-frame search) and on the search range for each similar block along the temporal axis. For stereo, it is straightforward to extend the algorithm to search for similar blocks in the other view as well. In practice this requires some knowledge about the disparity range so as to adjust the inter-view search range.
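The grouping and collaborative hard-thresholding steps can be illustrated with a toy single-frame sketch (a simplification on our part: no predictive temporal search, no Wiener second stage, and no aggregation weights, all of which the real VBM3D has):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (stand-in for the codec's fast DCT)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2.0 / n)
    C[0] /= np.sqrt(2)
    return C

def group_similar(frame, ref, bsize, max_blocks=8):
    """Grouping: collect the blocks most similar to `ref` (Euclidean distance)."""
    H, W = frame.shape
    cands = []
    for y in range(H - bsize + 1):
        for x in range(W - bsize + 1):
            blk = frame[y:y + bsize, x:x + bsize]
            cands.append((float(np.sum((blk - ref) ** 2)), y, x))
    cands.sort(key=lambda t: t[0])
    return np.stack([frame[y:y + bsize, x:x + bsize] for _, y, x in cands[:max_blocks]])

def collaborative_filter(group, threshold):
    """Hard-threshold the stacked blocks in a separable 3D DCT domain."""
    C2 = dct_matrix(group.shape[-1])            # 2D transform within each block
    C1 = dct_matrix(group.shape[0])             # 1D transform across the stack
    spec = np.stack([C2 @ b @ C2.T for b in group])
    spec = np.tensordot(C1, spec, axes=(1, 0))
    spec[np.abs(spec) < threshold] = 0.0        # suppress small (noise) coefficients
    spec = np.tensordot(C1.T, spec, axes=(1, 0))
    return np.stack([C2.T @ b @ C2 for b in spec])
```

On a flat noisy patch, grouping followed by hard thresholding at about three noise standard deviations removes most of the noise while the (large) DC coefficient of the group survives, which is the essence of the collaborative-filtering idea.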
In this study we are interested in two cases: in the first case the two noisy video channels are denoised independently using VBM3D, and in the second case they are denoised jointly using the modified approach.

3.3 Experiments

Figure 7. Video 3D block-matching (VBM3D) denoising approach

In the first experiment, we add white Gaussian noise to ground-truth stereo video sequences, denoise them either jointly or individually, and then measure the denoising performance in terms of frame-wise PSNR between the ground-truth and denoised channels. The test sequences Horse and Car, of resolution 640x360, were used in the experiments. Figure 8 illustrates the experimental setting: the left and right channels are either passed to VBM3D independently, or interleaved into a single sequence and denoised jointly, and in each case the frame-wise PSNR against the original sequence is plotted.

Figure 8. Experimental setting for denoising of stereo video

The results for the Horse sequence are given in Figure 9 and the results for the Car sequence are given in Figure 10.

Figure 9. Denoising results for 'Horse' for the left and right channels. Red: noisy vs. noise-free; light blue: jointly denoised vs. noise-free; blue: individually denoised vs. noise-free

Figure 10. Denoising results for 'Car' for the left and right channels. Red: noisy vs. noise-free; light blue: jointly denoised vs. noise-free; blue: individually denoised vs. noise-free

As can be seen in the figures, stereo adds little to the denoising performance. The jointly denoised and individually denoised channels stay close to each other, with a small preference for the individual denoising.

In the second experiment, we put the noisy and denoised sequences to tasks such as depth estimation and view synthesis. The test involved the same noise-free and noisy sequences. Using the FhG depth estimator [18], the depth was estimated for the following stereo sequences: a) noise-free (i.e. ground-truth) sequences; b) noisy sequences; c) individually denoised sequences (denoised data 1); d) jointly denoised sequences (denoised data 2). The obtained depths were used to render the right channel using the corresponding left (noise-free, noisy, or denoised) channel. The resulting right-channel video sequences were compared with the original ones in terms of PSNR. The results are presented in Figure 11 and Figure 12.

Figure 11. PSNR of ground-truth vs. synthesized right channels obtained from different depth maps (noise-free data, noisy data, denoised data 1, denoised data 2), plotted per frame index for the Horse and Car sequences

Figure 12. Zoomed version of Figure 11

The denoising plays a substantial role in improving the quality of the synthesized view. The views synthesized from denoised data are even better than those rendered using depth estimated from the noise-free data. This suggests that, besides the simulated added noise, the original videos contained a small amount of inherent noise which impeded the depth estimation but was suppressed by the denoising technique. Improving the depth estimation quality and, subsequently, the quality of the synthesized view is a nice property of the VBM3D algorithm. In terms of depth estimation and view rendering, for the Horse test sequence the approach of individually denoising the left and right channels showed better performance, while for the Car sequence the approach of joint denoising was superior. This is caused by differences in the content. The Horse data contains a small amount of motion, with a static background dominating the scene, and there are also luminance differences between the left and right channels. Correspondingly, VBM3D gains more from finding similar blocks along the temporal dimension than between views, and individual processing of the views turns out to be more successful. In contrast, Car contains more motion, i.e. more changes along the temporal axis. The similarity search finds more similar blocks between views and filters them collaboratively in a successful manner.
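All comparisons in this section use frame-wise PSNR. As a concrete reference, a minimal implementation of the measure (the peak value 255 assumes 8-bit video):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Frame-wise PSNR in dB between a reference and a test frame."""
    mse = np.mean((np.asarray(ref, dtype=np.float64)
                   - np.asarray(test, dtype=np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```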

4 Restoration of block transform compressed depth maps

4.1 Introduction

One of the 3D video formats studied within the Mobile3DTV project is informally called video plus depth, where the 2D video frames are augmented with per-pixel depth information. The 2D color video is represented in its ordinary form (e.g. in a luminance-chrominance space) while the associated depth is represented as a quantized (gray-scale) map ranging from the minimum to the maximum distance with respect to the assumed camera position. Figure 13 illustrates the concept of view-plus-depth 3D representation for the popular test sequence Ballet dancer.

Figure 13. Illustration of the 'view+depth' format concept

Such a representation has a number of benefits: it ensures backward compatibility for legacy devices and offers easy rendering of virtual views for 3DTV and free-viewpoint TV applications, while also being compression-friendly. The latter feature is based on the observation that the depth channel is more compressible than any color video channel delivering the same 3D scene geometry information. We refer to Deliverables D2.2, D2.3, and D2.5 for more details about the compressibility of depth maps.

The depth image has two notable peculiarities. First, it is an image never seen by the viewer: it is used only for rendering new views (so-called depth-image-based rendering, DIBR). Second, being a range map, it exhibits smooth regions representing objects at the same distance, delineated by sharp transitions (object boundaries). Thus, it is quite different from the color texture images that block-transform-based compression methods are designed for. This peculiarity has been addressed in compression schemes especially tailored for depth images [19], [20]. Nevertheless, block-transform-based video coding schemes have been favored in rate-allocation studies because of the existing standardized encoders, such as H.264 and MPEG [21], [22].
In these studies, two rate-allocation approaches have been adopted. In the first, the bit allocation is optimized jointly for the video and depth to minimize the rendering distortion of the desired virtual view [21]. In the second, the video quality is maximized for the sake of backward compatibility, while the depth is encoded with a small fraction (10-15%) of the total bit rate [22]. The H.264 coding scheme has also been adopted within the project, where the total bit budget between color video and depth has been carefully jointly optimized [23] (see also D2.2 and D2.5). In the above rate-allocation approaches, especially at low bit rates, depth is compressed by enforcing strong quantization of the DCT coefficients. This creates the well-known blocking artifacts which are generic for block-transform-based compression schemes. For depth images, blocking leads to distorted depth discontinuities and therefore to distorted geometrical properties and object boundaries in the rendered view. The problem is illustrated by

Figure 14. The problem can be partially addressed by simple (e.g. Gaussian) smoothing, an approach also used for mitigating occlusion effects. While simple, this approach is also weak, as it destroys true sharp boundaries and impedes faithful virtual view rendering. We study the problem of restoring compressed depth maps affected by blocky artifacts from two points of view. Our first aim is to adapt and compare state-of-the-art methods originally designed to handle similar problems. We are interested in two groups of methods: methods from the first group regard the depth image as it is, i.e. they process it independently of the available color video; methods from the second group utilize structural information from the video channel in order to improve the depth map restoration. Our second aim is to identify appropriate quality measures to quantify the distortions in the depth image and their effect on the rendered virtual view.

Figure 14. Teddy dataset. (a) ground-truth depth; (b) rendered view using (a) (without occlusion filling); (c) ground-truth depth compressed as an H.264 I-frame with QP=51; (d) rendered view using (c)

4.2 Problem Formulation

Consider an individual colour video frame in some colour space. For the sake of clarity we consider the YUV colour space; however, most of the developments can be done in RGB as well. We denote the colour frame as a three-component vector C(x) = [Y(x), U(x), V(x)]^T, where

x ∈ Ω is a spatial variable, with Ω being the image domain. Along with the video frame, we consider the associated per-pixel depth z(x). A new, virtual view can be synthesized out of the given (reference) color frame and depth by applying projective geometry and knowledge about the reference-view camera [24]. The synthesized view is composed of two parts: the pixels visible from the position of the virtual-view camera and the pixels of occluded areas, with their corresponding domains. We consider the case where both the colour frame and the depth are coded as H.264 intra frames with some QPs, this leading to their quantized versions C_q(x) and z_q(x). We model the effect of quantization as quantization noise added to the uncompressed signal, namely

    C_q(x) = C(x) + n_C(x),   (2)
    z_q(x) = z(x) + n_z(x).   (3)

The quantization noise terms added to the color channels and the depth channel are considered independent white Gaussian processes, n_C ~ N(0, σ_C²) and n_z ~ N(0, σ_z²). While this model is simple, it has proven quite effective for mitigating the blocking artifacts arising from quantization of transform coefficients. In particular, it allows establishing a direct link between the quantization parameter (QP) and the quantization noise variance, to be used for tuning deblocking filtering algorithms [25]. Unnatural discontinuities at the boundaries of the transform blocks (the blocking artifacts) in the quantized depth image cause geometrical distortions and distorted object boundaries in the view rendered from the quantized depth and quantized reference view. The goal of the restoration of compressed depth maps is to mitigate the blocking effects in the depth image domain, i.e. to obtain a deblocked depth estimate which is closer to the original, uncompressed depth and which improves the quality of the rendered view.

4.3 Depth maps filtering approaches

We have implemented and compared five methods, which can be grouped into two groups.
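The QP-to-noise-variance link mentioned above can be sketched numerically. The step-size rule (the quantizer step doubling every 6 QP) is the standard H.264 one; modeling the quantization error as uniform over one step, so that its variance is the squared step divided by 12, is our simplifying assumption:

```python
import math

def h264_qstep(qp):
    """H.264 quantizer step size: doubles every 6 QP, with Qstep(QP=4) = 1."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantization_noise_std(qp):
    """Assumption: quantization error ~ uniform over one step, so its
    standard deviation is Qstep / sqrt(12)."""
    return h264_qstep(qp) / math.sqrt(12.0)
```

A filter tuned this way automatically smooths more aggressively at higher QPs, which is how the QP-dependent parameter optimizations below can be read.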
The first two methods work directly on the depth image, making no use of the given reference color video frame. These methods are simple, and by choosing them we wanted to check the effect of simple or adaptive smoothing of the depth image on the rendered view. The second group comprises methods which essentially utilize structural information from the video channel(s). The assumption here is that the video channel is coded with better quality and as such can provide trustworthy information about objects at different depths, to be used for restoring the true depth discontinuities. We aim at utilizing structural information, such as pixel neighborhood or color (dis-)similarity from the given video frame, to infer the true depth values.

4.3.1 Gaussian Filtering

Gaussian smoothing is a popular technique for removing predominantly high-frequency contamination. The method consists in convolving the noisy image with a 2D discrete smoothing kernel of the form

    g(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²)),   (4)

where the standard deviation σ is a free parameter which can be used to control the imposed smoothness. For our experiments we have tuned it as a function of the H.264 quantization parameter. The main drawback of Gaussian filtering is that it applies a fixed-size

rectangular window across true object boundaries, and thus smoothes out true image features together with the noise.

4.3.2 Adaptive H.264 Loop-Filtering

The H.264 video compression standard has a built-in deblocking algorithm addressing the problem of adaptive smoothing. It works adaptively on block boundaries, trying to avoid smoothing real signal discontinuities. To achieve this, two adaptive threshold functions have been experimentally defined to determine whether or not to apply smoothing across block boundaries. The functions depend on the QP as well as on two encoder-selectable offsets, which are included and transmitted in the slice header. These two offsets are the only user-tunable parameters allowing some adjustment of the smoothing for a specific application. For more details on H.264 deblocking we refer to [26].

4.3.3 Local Polynomial Approximation approach

The anisotropic local polynomial approximation (LPA) is a point-wise method for adaptive estimation in noisy conditions [27]. For every point of the image, local polynomial sectorial-neighborhood estimates are fitted for different directions. In the simpler case, instead of sectors, 1D directional estimates along four (by 90 degrees) or eight (by 45 degrees) different directions are used. The length of each estimate, denoted as its scale, is adjusted so as to meet the compromise between an exact polynomial model (low bias) and enough smoothing (low variance). A statistical criterion, the Intersection of Confidence Intervals (ICI) rule, is used to find this compromise [28], [29], i.e. the optimal scale for each direction. These optimal scales in each direction determine an anisotropic polygonal neighborhood for every point of the image, well adapted to the structure of the image. This neighborhood has been successfully utilized for shape-adaptive transform-based color image denoising and deblurring [25]. In the spirit of [25], we use the quantized luminance channel as the source of structural information.
The image is convolved with a set of 1D directional polynomial kernels g_{h,θ}, where the scale h (the kernel length) is taken from a set of different scales and θ is the direction, thus obtaining the estimates ŷ_{h,θ}(x). In order to find the optimal scale for each direction (hereafter the notation of direction is omitted), so-called confidence intervals

    D_h(x) = [ŷ_h(x) − Γ σ_h(x), ŷ_h(x) + Γ σ_h(x)]

are formed first, where σ_h(x) is the standard deviation of the estimate ŷ_h(x) and Γ > 0 is a threshold parameter [28], [29]. The optimal scale is the largest scale (in number of pixels) which still ensures a non-empty intersection of the confidence intervals. Figure 15a illustrates the optimal scale for each pixel (encoded with a different gray value) for a particular direction. The optimal scales for all directions form an adaptive polygonal neighborhood with the current pixel at its centre, as illustrated in Figure 15b. After finding the optimal neighborhood in the luminance image domain, the same neighborhood is used for smoothing the depth image (cf. Figure 15c). The smoothing is done by fitting a plane within the neighborhood. Since LPA is a point-wise procedure, the neighborhoods of different pixels overlap; correspondingly, depth pixels are estimated multiple times, depending on how many neighborhoods they fall into. The final estimate for each depth pixel is obtained by averaging the aggregated planar estimates for that pixel. Figure 15e illustrates the result of LPA-ICI filtering. Note that the scheme depends on two parameters: the noise variance of the luminance channel and the positive threshold parameter Γ. The former depends on the quantization of the color video; we assume low quantization noise. The latter can be adjusted so as to favor a higher amount

of smoothing. We have optimized it with respect to the quantization parameter of the depth channel.

Figure 15. LPA-ICI filtering of depth maps. a) Optimal scales for one of the directions; b) luminance channel with some of the found optimal neighborhoods; c) compressed depth with the same neighborhoods overlaid; d) input (compressed) depth; e) depth filtered by LPA-ICI.

4.3.4 Bilateral Filter

The goal of bilateral filtering is to smooth the image while preserving edges [30]. It utilizes information from all color channels to specify suitable weights for local (non-linear) neighborhood filtering. For grayscale images, the local weights of neighbors are calculated based on both their spatial distance and their photometric similarity, favoring nearby and similar values over distant and dissimilar ones in both the spatial domain and the intensity range. For color images, bilateral filtering uses color distance to quantify the photometric similarity between pixels, thus reducing phantom colors in the filtered image. Figure 16 a-e illustrates how the filtering window is formed. Such a collaboratively-weighted neighborhood defined by the color image is applicable also to the depth channel. The approach is similar to the one used in depth estimation, where contour color information has been used for finding correspondences [31]. In our setting, we have adopted a version of the bilateral filter as in [32]:

22 where, and. The two parameters and determine the spatial extent and the range extent of the weighting functions correspondingly. We have optimized them with respect to the QP:. A result of bilateral filtering is given in Figure 16 f,g. (5) a) b) c) d) e) f) g) Figure 16. Bilateral filtering of depth maps. a) color frame with reference pixel (in red); b) spatial proximity; c) colour similarity; d) colour window; e) combined spatial-colour window; f) blocky depth; g) bilaterally filtered depth 21

4.3.5 Hypothesis filtering approach

Originally, the considered method has been developed for increasing the resolution of low-resolution depth images, utilizing information from the high-resolution color image [32]. This method is perfectly applicable to our problem of suppressing compression artefacts and restoring real discontinuities in the depth map. In the original approach, a 3D cost volume is constructed frame-wise out of several depth hypotheses, and the hypothesis with the lowest cost is selected as the refined depth value at the current iteration. More specifically, the cost volume at the i-th iteration is formed as a truncated quadratic difference

C_i(x,d) = min( (d − z_i(x))², ηL ),    (6)

where d is the potential depth candidate, z_i(x) is the current depth estimate at coordinates x, and L is the search range controlled by a constant η. The obtained slices of the cost volume for different values of d largely keep the degraded pattern of z, as illustrated in Figure 17 left. Therefore, each slice of the cost volume undergoes joint bilateral filtering, i.e. each pixel of the cost slice is obtained as a weighted average of neighboring pixels, where the weights are also modified by the color similarity measured as the l1 distance between the corresponding pixel of the color video frame and the neighboring ones:

C̄_i(x,d) = (1/W(x)) Σ_{v∈N(x)} exp(−||x−v||²/(2σ_s²)) exp(−||Y(x)−Y(v)||₁/(2σ_r²)) C_i(v,d),    (7)

where W(x) is the normalizing sum of the weights and N(x) is the neighborhood of coordinate x. The reason for applying bilateral filtering is two-fold: it assumes that the depth reflects the piecewise smoothness of the surfaces of the given 3D scene and that the depth is correlated with the local scene color (the same local color corresponds to constant depth). Our experimental tests demonstrated that filtering of the cost volume (6) is more effective than directly filtering the noisy depth. After bilateral filtering, the slices get smoothed (Figure 17 right) and the depth for the next iteration is obtained as

z_{i+1}(x) = argmin_d C̄_i(x,d).    (8)

Figure 17. Result of filtering of the cost volume. Left: unfiltered cost function; right: bilaterally-filtered cost function.
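To make the cost-volume construction and the hypothesis selection concrete, the following sketch (our own naming; the product ηL is collapsed into a single truncation constant) builds the truncated-quadratic cost volume. Without the bilateral filtering step, picking the minimum-cost slice simply snaps each pixel to its nearest hypothesis:

```python
import numpy as np

def cost_volume(depth, d_candidates, trunc=400.0):
    """Truncated quadratic cost volume: one slice min((d - z(x))^2, trunc)
    per depth hypothesis d."""
    z = depth[np.newaxis, ...].astype(np.float64)               # 1 x H x W
    d = np.asarray(d_candidates, dtype=np.float64)[:, None, None]
    return np.minimum((d - z) ** 2, trunc)                      # slices x H x W

# Without any slice filtering, the minimum-cost hypothesis just returns
# the candidate nearest to the input depth:
z = np.array([[10.0, 37.0], [64.0, 90.0]])
cands = np.arange(0, 100, 5)                                    # hypotheses d
restored = cands[np.argmin(cost_volume(z, cands), axis=0)]
# restored == [[10, 35], [65, 90]]
```

The benefit of the method comes from smoothing each slice with the colour-guided weights before taking the argmin, which pulls the minimum away from the degraded pattern of z.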
The hypothesis filtering approach is illustrated in Figure 18. The approach methodologically assumes three steps: (1) form a cost volume, (2) filter the cost volume, (3) pick the minimum-cost hypothesis. In the original approach [32], a further refinement of the depth is suggested: instead of selecting the depth giving the minimum cost, as in Eq. (8), a quadratic function is fitted around that minimum and the position of the minimum of that function is selected instead.
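The quadratic refinement around the discrete minimum can be sketched as a three-point parabola fit (function name is ours):

```python
def parabolic_refine(costs, k):
    """Fit a parabola through the cost at the discrete minimum index k and
    its two neighbours; return the sub-sample position of its minimum."""
    c0, c1, c2 = costs[k - 1], costs[k], costs[k + 1]
    denom = c0 - 2.0 * c1 + c2
    if denom == 0.0:                # flat triple: keep the discrete minimum
        return float(k)
    return k + 0.5 * (c0 - c2) / denom
```

For a cost that is exactly quadratic around the minimum, the refinement recovers the continuous minimum exactly, which is what makes the coarse (rescaled) hypothesis grid described below affordable.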

Figure 18. Block diagram of hypothesis filtering.

We suggest several modifications to the original approach to make it more memory-efficient and to improve its speed. It is straightforward to see that there is no need to form the full cost volume in order to obtain the depth estimate for a given coordinate x at the i-th iteration. Instead, the cost function is formed and filtered over the required neighbourhood only, i.e.

z_{i+1}(x) = argmin_d (1/W(x)) Σ_{v∈N(x)} w(x,v) min( (z_i(v) − d)², ηL ),    (9)

where w(x,v) are the joint-bilateral weights as in Eq. (7). Furthermore, the computational cost is reduced by assuming that not all depth hypotheses are applicable for the current pixel: a safe assumption is that only depths within a limited range around the current estimate z_i(x) have to be checked.

Figure 19. Histograms of a non-compressed and a compressed depth map.

Additionally, the depth range is rescaled with the purpose of further reducing the number of hypotheses. This step is especially efficient against compression (blocky) artifacts: for compressed depth maps, the depth range appears sparse due to the quantization effect. Figure 19 illustrates histograms of depth values before and after compression, confirming the use of a rescaled search range of depth hypotheses. This modification speeds up the procedure and relies on the subsequent quadratic interpolation to find the true minimum. A pseudo-code of the suggested procedure in Eq. (9) is given in Table 1.
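The modified, windowed per-pixel search can be sketched in NumPy as follows. Function name and parameter values are our own, and for simplicity the candidate set is here restricted to the depth values present in the local window rather than the rescaled global range:

```python
import numpy as np

def hypothesis_filter(depth, color, radius=2, sigma_s=2.0, sigma_r=15.0, trunc=100.0):
    """Windowed hypothesis filtering (no full cost volume): for each pixel,
    weigh the truncated quadratic cost over a small neighbourhood with
    joint-bilateral weights and keep the cheapest depth hypothesis."""
    h, w = depth.shape
    out = depth.astype(np.float64).copy()
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    w_s = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))
    dpad = np.pad(depth.astype(np.float64), radius, mode='edge')
    cpad = np.pad(color.astype(np.float64),
                  ((radius, radius), (radius, radius), (0, 0)), mode='edge')
    for i in range(h):
        for j in range(w):
            D = dpad[i:i + 2*radius + 1, j:j + 2*radius + 1]
            C = cpad[i:i + 2*radius + 1, j:j + 2*radius + 1, :]
            cdist = np.abs(C - cpad[i + radius, j + radius]).sum(axis=2)
            W = w_s * np.exp(-cdist**2 / (2 * sigma_r**2))
            best_cost, best_d = np.inf, out[i, j]
            for d in np.unique(D):                  # reduced hypothesis set
                cost = (W * np.minimum((D - d)**2, trunc)).sum() / W.sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            out[i, j] = best_d
    return out
```

An isolated depth outlier is more expensive than the surrounding consensus value under any weighting, so it is snapped back to the dominant local depth.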

Table 1. Pseudo-code of the modified hypothesis filtering

    Rescale the range of the noisy depth image
    For every (x,y) in the noisy depth image
        D = window of the depth frame around (x,y)
        C = window of the color frame around (x,y)
        W = bilateral weights calculated from C
        Cmin = infinity
        For d = min(D) to max(D)
            cost = sum( W .* min((D - d)^2, threshold) ) / sum(W)
            If cost < Cmin
                Depth_new(x,y) = d
                Cmin = cost
            End
        End
    End
    Rescale the range of the filtered depth image

Figure 20. Execution time of different implementations of the filtering approach (bilateral filter directly on depth; original approach; proposed approach without cost volume) versus the number of slices.

Figure 20 illustrates the achievements in terms of speed. The figure shows experiments with depth filtering of a scene using different implementations of the filtering procedure. All implementations have been written in C and compiled into MEX files to be run from the Matlab environment. The vertical axis shows the execution time in seconds and the horizontal axis shows the number of slices employed (and thus the dynamic range assumed). In the figure, the dotted curve shows single-pass bilateral filtering; it does not depend on the dynamic range but on the window size, and is thus constant. The red curve shows the computation time of the original approach implemented as a three-step procedure over the full dynamic range; naturally, it is a linear function of the number of slices to be filtered. Our implementation (blue

curve) applying the reduced dynamic range also depends linearly on the number of slices, but with dramatically reduced steepness.

4.4 Quality measures

We have considered two groups of quality measures: the first group operates directly on the depth images (true and processed) and the second group operates on the rendered views (true and restored). While the measures of the first group are simpler and faster to calculate, the measures of the second group are closer to subjective perception.

PSNR of Restored Depth compares the compressed or restored depth ẑ against the ground-truth depth z:

PSNR = 10 log10( 255² / ((1/|X|) Σ_{x∈X} (z(x) − ẑ(x))²) ),    (10)

where |X| is the number of pixels of the depth image.

Percentage of bad pixels is a measure originally used to compare estimated depths from stereo [34]. It counts the percentage of pixels differing by more than a pre-specified threshold δ:

BAD = (100/|X|) Σ_{x∈X} [ |z(x) − ẑ(x)| > δ ].    (11)

Consider the gradient of the difference between the true depth and the approximated depth. By Depth Consistency we denote the percentage of pixels having a magnitude of that gradient higher than a pre-specified threshold:

CONSIST = (100/|X|) Σ_{x∈X} [ ||∇(z(x) − ẑ(x))|| > δ_g ].    (12)

The measure targets non-smooth areas in the restored depth, considered the main source of geometrical distortion, as illustrated in Figure 21.

Figure 21. Results of thresholding.

PSNR of Rendered View is calculated analogously to formula (10), but over the rendered view.

Gradient-normalized RMSE has been suggested in [36] as a performance metric for optical flow estimation algorithms, making it more robust to local intensity variations in textured areas. In our implementation we calculate it over the luminance channel of the rendered image, excluding truly occluded areas.
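The first-group measures (PSNR of restored depth, percentage of bad pixels, depth consistency) can be sketched directly in NumPy. Thresholds and function names here are illustrative, not those used in the report:

```python
import numpy as np

def depth_psnr(z_true, z_test):
    """PSNR of the restored depth against ground truth, 8-bit range."""
    mse = np.mean((z_true.astype(np.float64) - z_test.astype(np.float64))**2)
    return 10.0 * np.log10(255.0**2 / mse)

def bad_pixels(z_true, z_test, thr=1.0):
    """Percentage of pixels whose absolute error exceeds a threshold."""
    diff = np.abs(z_true.astype(np.float64) - z_test.astype(np.float64))
    return 100.0 * np.mean(diff > thr)

def depth_consistency(z_true, z_test, thr=5.0):
    """Percentage of pixels where the gradient magnitude of the depth
    difference exceeds a threshold (flags non-smooth restoration errors)."""
    diff = z_true.astype(np.float64) - z_test.astype(np.float64)
    gy, gx = np.gradient(diff)
    return 100.0 * np.mean(np.hypot(gx, gy) > thr)
```

Note how a single wrong pixel contributes once to the bad-pixel count but inflates the consistency measure over its whole gradient footprint, which is why the latter is more sensitive to edge damage.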

Discontinuity Falses accounts for the percentage of wrong occlusions in the rendered channel: either new occlusions of initially non-occluded pixels or falsely disoccluded pixels, counted relative to the cardinality (number of elements) of the corresponding domains.

4.5 Experimental results

We present two experiments. In the first experiment, we compare the performance of all depth restoration algorithms assuming the true color channel is given (it has also been used in the optimization of the tunable parameters). In the second experiment we study the effect of depth restoration in the case of mild quantization of the color channel. Figure 22 illustrates the performance of some of the filtering techniques. Rendering of the right channel has been accomplished using the original left channel and either the compressed or the filtered depth. No occlusion filling has been applied. Results of the first experiment are presented in Figure 23. Along the x-axis of all plots, the H.264 QPs are given; the area of interest is between 30 and 50. All measures but the BAD one distinguish the methods in a consistent way. The group of structurally-constrained methods clearly outperforms the simple methods working on the depth image only. The two PSNR-based measures seem to be less reliable in characterizing the performance of the methods. The three remaining measures, namely Depth Consistency, Discontinuity Falses and Gradient-normalized RMSE, behave in a consistent manner. While NRMSE is perhaps the measure closest to subjective perception, we also favor the other two measures of this group, as they are relatively simple and do not require calculation of the warped (rendered) image. To characterize the consistency of our optimized parameters, in Figure 23g we show the trend of CONSIST calculated for the algorithms with parameters optimized for NRMSE. One can see that the trend is quite consistent with that of Figure 23e (where the methods are both optimized and compared with respect to CONSIST).
The same can be seen when comparing Figure 23h with Figure 23f. In the former, the NRMSE is calculated over the test set while the algorithm parameters are optimized over the training set with respect to CONSIST. The measure shows the same trend as in the case when the algorithms are optimized with respect to the same measure. So far, we have been working with an uncompressed color channel, which has been involved in the optimizations and comparisons; our aim was to characterize the pure influence of the depth restoration only. In the second experiment we work with a quantized color channel. We assume mild quantization of the color image, e.g. by QP=35, and two QPs, 35 and 45, for the depth. For our test imagery, the first depth QP corresponds to about 10% of the total bitrate. The NRMSE of the rendered channel is calculated with respect to the channel rendered from uncompressed color and depth. The results are given in Figure 24. One can see that the depth post-processing clearly makes a difference, allowing stronger quantization of the depth channel while still achieving good quality.

Figure 22. Filtering of compressed depth maps. a) decompressed depth map; b) right channel rendered using the original left channel and the depth from a); c) depth filtered by the bilateral filter; d) right channel rendered using c); e) depth filtered by the hypothesis filter; f) right channel rendered using e).

Figure 23. Experiment 1. Horizontal axes show the H.264 QP. Panel metrics: Bad Pixels Percentage, Discontinuity Falses, PSNR of Restored Depth, PSNR of Rendered Channel, Depth Consistency and Normalized RMSE; compared methods: No Filtering, H.264 Loop Filter, Gaussian Smooth, LPA-ICI Filtering, Bilateral Filtering, Super Resolution. (a)-(f) Performance of the selected algorithms optimized for and compared by the same measure. (g) Performance measured by CONSIST for algorithms optimized for NRMSE. (h) Performance measured by NRMSE for algorithms optimized for CONSIST.

[Figure 24, first four panels] True Color, True Depth; Color QP=35, True Depth (NRMSE=10); True Color, Depth QP=35 (NRMSE=23); Color QP=35, Depth QP=35 (NRMSE=24).

[Figure 24, last four panels] True Color, Depth QP=45 (NRMSE=31); Color QP=35, Depth QP=45 (NRMSE=32); True Color, Filtered Depth from QP=45 (NRMSE=21); Color QP=35, Filtered Depth from QP=45 (NRMSE=22).

Figure 24. Experiment 2. Effect of compressed color and of compressed and filtered depth on the quality of the rendered view.

5 Temporally-consistent filtering of depth map sequences

5.1 Introduction

In the previous section we addressed the problem of refining depth maps impaired by compression artefacts. The quality of depth maps also depends on the way they have been generated: either through depth-from-stereo or depth-from-multiview types of algorithms, or using special depth sensors based on time-of-flight (ToF) principles, laser scanners or structured light. When accompanying video sequences, the consistency of successive depth maps in the sequence becomes an issue. Time-inconsistent depth sequences might cause flickering in the synthesized views as well as other 3D-specific artifacts [37]. The time-consistency issue has been addressed mainly at the stage of depth estimation, either by adding a smoothing constraint along the temporal dimension in the global optimization procedure of the depth estimation, or by simple median filtering along successive depth frames [38], [39]. In this section, we address the problem of filtering depth map sequences impaired by inaccurate depth estimation, noise or compression artifacts. We extend the approach from Section 4 toward video to tackle the time-consistency issue.

5.2 Problem formulation

We extend the formulation in Sub-section 4.2 to add the temporal dimension. Consider a color video sequence Y(x,t) in the YUV color space, accompanied by the associated per-pixel depth z(x,t), where x ∈ X is a spatial variable, X being the image domain, and t is the frame index. The virtual view to be synthesized out of the given (reference) color frame and depth at time t is denoted by V(x,t). It is composed of two parts, V = {V_v, V_o}, where V_v denotes the pixels visible from the position of the virtual-view camera and V_o denotes the pixels of occluded areas; the corresponding domains are denoted by X_v and X_o correspondingly, X = X_v ∪ X_o.
We consider the case where the depth sequence has been degraded by some impairment added to the true depth: z'(x,t) = z(x,t) + ε(x,t). Finally, we denote by V'(x,t) the virtual view synthesized out of the degraded depth, and by V̂(x,t) the virtual view synthesized out of the processed depth ẑ(x,t) and the given reference view. The goal of the depth filtering is to get an estimate of the depth sequence that is closer to the ground-truth depth sequence and provides a synthesized virtual view with improved quality.

5.3 Extending the filtering approach to video

In Section 4, we found that the hypothesis filter gives superior performance when applied to individual depth frames impaired by compression artifacts. Here, we extend the same approach to video and to more general types of depth artifacts. Eq. (8) is extended to video sequences as follows:

ẑ_{i+1}(x,t) = argmin_d Σ_{(v,s)∈N(x,t)} w_s(x,v) w_c(x,t,v,s) w_t(t,s) min( (z_i(v,s) − d)², ηL ),    (15)

where w_s, w_c and w_t are spatial-proximity, color-similarity and temporal-distance weights, and N(x,t) is a spatio-temporal neighborhood of the voxel (x,t).
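The separable spatial, colour and temporal penalties can be sketched as a weight function (function name and σ values are ours for illustration; in the report the parameters are tuned separately):

```python
import numpy as np

def st_weights(color_win, center_rgb, dxy, dt, sigma_s=2.0, sigma_r=15.0, sigma_t=1.0):
    """Spatio-temporal weights for the video hypothesis filter: colour
    similarity (l1 distance) times spatial proximity times a separate
    temporal-distance penalty -- no motion estimation is involved."""
    cdist = np.abs(color_win - center_rgb).sum(axis=-1)
    w_c = np.exp(-cdist ** 2 / (2 * sigma_r ** 2))   # colour similarity
    w_s = np.exp(-dxy ** 2 / (2 * sigma_s ** 2))     # spatial proximity
    w_t = np.exp(-dt ** 2 / (2 * sigma_t ** 2))      # temporal penalty
    return w_c * w_s * w_t
```

Keeping the temporal penalty separate means that a neighbouring frame two steps away is damped by a fixed factor regardless of the spatial window, which is the tuning flexibility mentioned below; pixels displaced by motion are suppressed by the colour term instead.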

Eq. (15) essentially means that the depth hypotheses are checked within a spatio-temporal window around the current depth voxel with coordinates (x,t). While the neighbouring voxels are weighted by their color similarity to the central one, the temporal distance is penalized separately from the spatial one to enable better flexibility in tuning the filter parameters. Note that the video filtering uses no explicit motion information: no motion estimation/compensation is applied. We rely on the color (dis-)similarity weights to sufficiently suppress depth voxels changed considerably by motion. The hypothesis filtering procedure for video is illustrated in Figure 25.

Figure 25. Extension of hypothesis filtering to video.

Experiments

We present two experiments. In the first experiment, we consider the depth sequence as estimated from noisy stereo sequences. Namely, a given stereo sequence is used to estimate the depth sequence. Then, white noise is added to the stereo video, and the noisy stereo video is used to estimate the impaired depth sequence. The latter is filtered by the suggested video hypothesis filtering. For comparison, median filtering is applied to the noisy depth sequence and to the per-frame hypothesis-filtered data. In our practical setting, we have used a stereo pair from the Cones test data of the Middlebury evaluation test bench [40]. For that stereo pair we have the ground-truth depth, and we also estimated the depth by the method in [41]. To simulate a stereo video, we repeated the stereo pair 40 times to form 40 successive video frames, then added a different amount of noise to each frame and estimated the depth from each so-obtained noisy stereo frame. The results of the different filtering techniques applied to the noisy depth sequence are given in Figure 26. The results are consistent over all measures and show considerable improvement along the temporal dimension when the video extension of the hypothesis filtering is applied.
The video hypothesis filtering not only manages to equalize the quality along the time axis but also improves the depth estimates compared to the ones obtained from noise-free data by the method from [41]. In the second experiment we simulate blocky artifacts in the depth channel. To create a ground-truth video-plus-depth sequence, we circularly shifted the same Cones sequence with a radius of 10 pixels,

also adding some noise to the shifting vectors, and then cropped the central parts of the so-obtained frames. Thus, we got a sequence simulating circular motion of the camera plus a small amount of shaking. The sequence was compressed by an H.264 encoder in IPIPIP mode, varying the quantization parameter (QP) slightly from frame to frame to simulate different amounts of blockiness in successive frames. The filtering results are presented in Figure 27. We kept the following filters: the single-frame hypothesis filter, the same followed by median filtering along time, and the video hypothesis filtering. As can be seen in the figure, the video version of the hypothesis filtering has the most consistent performance. It performs especially well around edges. The rendered frames are of similar quality, thus providing a smooth and flicker-free experience. The only exception is the BAD metric, where the compressed depth seems to be the best. This metric, originally introduced to measure the performance of depth estimation algorithms, simply counts pixels that differ between the ground truth and the processed image, no matter how big or small (above a threshold) the differences are. While all filtering algorithms introduce small changes over the whole image, these small changes amount to a higher percentage than that of the differing pixels in the quantized depth image. However, what really matters are the bigger differences appearing around edges. These are well tackled by the filtering, as seen in the other metrics. Especially informative is the NRMSE, which measures the quality of the rendered channel and is closer to human perception. There, the new filtering approach truly excels. Finally, we provide some visual illustrations of the performance of the algorithm. We use the Book Arrival sequence provided by Fraunhofer HHI, where the depth is estimated by the MPEG depth estimation software [42].
While the latter incorporates rather powerful techniques and yields high-quality and time-consistent depth maps, our technique still adds some improvements. Figure 28 shows the result of filtering for frame 20. From left to right, the figure shows the originally-estimated depth, the depth obtained after median filtering along time, and the depth resulting from the proposed method. The depth estimation has failed around the face of the person entering the room and in the floor area. Median filtering manages to correct the depth of the floor but fails to correct the face of the person. The proposed method restores both the floor and the face. The same sequence has been compressed/decompressed with H.264 intra-frame coding and then filtered. The result of decompression and filtering is shown in Figure 29. Again, despite the substantial blocking artefacts, details such as human faces have been successfully restored.

5.4 Results

Figure 26. Comparative results of the filtering approaches in Experiment 1 (Cones sequence; panel metrics, plotted per frame: PSNR, PSNR of virtual channel, CONSIST, normalized RMSE, BAD, BAD near discontinuities; compared methods: noise-free estimate; noisy estimate; noisy estimate + median (5 frames); noisy + hypothesis + median (5 frames); noisy + hypothesis; noisy + video hypothesis (3 frames); noisy + video hypothesis (5 frames)).

Figure 27. Comparative results of the filtering approaches in Experiment 2 (Cones sequence; same panel metrics as Figure 26; compared methods: noisy estimate; noisy + hypothesis; noisy + hypothesis + median (7 frames); noisy + video hypothesis (7 frames)).

Figure 28. Results of filtering of the 'Book Arrival' depth sequence. From left to right: originally-estimated depth; median-filtered depth; depth filtered by the proposed approach.

Figure 29. Filtering of a compressed depth sequence. From left to right: decompressed depth map; decompressed depth map filtered by the proposed approach.


More information

MULTIVIEW 3D VIDEO DENOISING IN SLIDING 3D DCT DOMAIN

MULTIVIEW 3D VIDEO DENOISING IN SLIDING 3D DCT DOMAIN 20th European Signal Processing Conference (EUSIPCO 2012) Bucharest, Romania, August 27-31, 2012 MULTIVIEW 3D VIDEO DENOISING IN SLIDING 3D DCT DOMAIN 1 Michal Joachimiak, 2 Dmytro Rusanovskyy 1 Dept.

More information

Correcting User Guided Image Segmentation

Correcting User Guided Image Segmentation Correcting User Guided Image Segmentation Garrett Bernstein (gsb29) Karen Ho (ksh33) Advanced Machine Learning: CS 6780 Abstract We tackle the problem of segmenting an image into planes given user input.

More information

Scene Segmentation by Color and Depth Information and its Applications

Scene Segmentation by Color and Depth Information and its Applications Scene Segmentation by Color and Depth Information and its Applications Carlo Dal Mutto Pietro Zanuttigh Guido M. Cortelazzo Department of Information Engineering University of Padova Via Gradenigo 6/B,

More information

FAST MOTION ESTIMATION WITH DUAL SEARCH WINDOW FOR STEREO 3D VIDEO ENCODING

FAST MOTION ESTIMATION WITH DUAL SEARCH WINDOW FOR STEREO 3D VIDEO ENCODING FAST MOTION ESTIMATION WITH DUAL SEARCH WINDOW FOR STEREO 3D VIDEO ENCODING 1 Michal Joachimiak, 2 Kemal Ugur 1 Dept. of Signal Processing, Tampere University of Technology, Tampere, Finland 2 Jani Lainema,

More information

Artifacts and Textured Region Detection

Artifacts and Textured Region Detection Artifacts and Textured Region Detection 1 Vishal Bangard ECE 738 - Spring 2003 I. INTRODUCTION A lot of transformations, when applied to images, lead to the development of various artifacts in them. In

More information

View Synthesis for Multiview Video Compression

View Synthesis for Multiview Video Compression View Synthesis for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, and Anthony Vetro email:{martinian,jxin,avetro}@merl.com, behrens@tnt.uni-hannover.de Mitsubishi Electric Research

More information

NEW CONCEPT FOR JOINT DISPARITY ESTIMATION AND SEGMENTATION FOR REAL-TIME VIDEO PROCESSING

NEW CONCEPT FOR JOINT DISPARITY ESTIMATION AND SEGMENTATION FOR REAL-TIME VIDEO PROCESSING NEW CONCEPT FOR JOINT DISPARITY ESTIMATION AND SEGMENTATION FOR REAL-TIME VIDEO PROCESSING Nicole Atzpadin 1, Serap Askar, Peter Kauff, Oliver Schreer Fraunhofer Institut für Nachrichtentechnik, Heinrich-Hertz-Institut,

More information

Advanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude

Advanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude Advanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude A. Migukin *, V. atkovnik and J. Astola Department of Signal Processing, Tampere University of Technology,

More information

Recent, Current and Future Developments in Video Coding

Recent, Current and Future Developments in Video Coding Recent, Current and Future Developments in Video Coding Jens-Rainer Ohm Inst. of Commun. Engineering Outline Recent and current activities in MPEG Video and JVT Scalable Video Coding Multiview Video Coding

More information

MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ)

MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ) 5 MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ) Contents 5.1 Introduction.128 5.2 Vector Quantization in MRT Domain Using Isometric Transformations and Scaling.130 5.2.1

More information

Patch-Based Color Image Denoising using efficient Pixel-Wise Weighting Techniques

Patch-Based Color Image Denoising using efficient Pixel-Wise Weighting Techniques Patch-Based Color Image Denoising using efficient Pixel-Wise Weighting Techniques Syed Gilani Pasha Assistant Professor, Dept. of ECE, School of Engineering, Central University of Karnataka, Gulbarga,

More information

On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks

On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks 2011 Wireless Advanced On the Adoption of Multiview Video Coding in Wireless Multimedia Sensor Networks S. Colonnese, F. Cuomo, O. Damiano, V. De Pascalis and T. Melodia University of Rome, Sapienza, DIET,

More information

A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation

A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation Alexander Andreopoulos, Hirak J. Kashyap, Tapan K. Nayak, Arnon Amir, Myron D. Flickner IBM Research March 25,

More information

Segmentation Based Stereo. Michael Bleyer LVA Stereo Vision

Segmentation Based Stereo. Michael Bleyer LVA Stereo Vision Segmentation Based Stereo Michael Bleyer LVA Stereo Vision What happened last time? Once again, we have looked at our energy function: E ( D) = m( p, dp) + p I < p, q > We have investigated the matching

More information

MRT based Fixed Block size Transform Coding

MRT based Fixed Block size Transform Coding 3 MRT based Fixed Block size Transform Coding Contents 3.1 Transform Coding..64 3.1.1 Transform Selection...65 3.1.2 Sub-image size selection... 66 3.1.3 Bit Allocation.....67 3.2 Transform coding using

More information

Lecture 13 Video Coding H.264 / MPEG4 AVC

Lecture 13 Video Coding H.264 / MPEG4 AVC Lecture 13 Video Coding H.264 / MPEG4 AVC Last time we saw the macro block partition of H.264, the integer DCT transform, and the cascade using the DC coefficients with the WHT. H.264 has more interesting

More information

Lecture 7: Most Common Edge Detectors

Lecture 7: Most Common Edge Detectors #1 Lecture 7: Most Common Edge Detectors Saad Bedros sbedros@umn.edu Edge Detection Goal: Identify sudden changes (discontinuities) in an image Intuitively, most semantic and shape information from the

More information

A deblocking filter with two separate modes in block-based video coding

A deblocking filter with two separate modes in block-based video coding A deblocing filter with two separate modes in bloc-based video coding Sung Deu Kim Jaeyoun Yi and Jong Beom Ra Dept. of Electrical Engineering Korea Advanced Institute of Science and Technology 7- Kusongdong

More information

MAXIMIZING BANDWIDTH EFFICIENCY

MAXIMIZING BANDWIDTH EFFICIENCY MAXIMIZING BANDWIDTH EFFICIENCY Benefits of Mezzanine Encoding Rev PA1 Ericsson AB 2016 1 (19) 1 Motivation 1.1 Consumption of Available Bandwidth Pressure on available fiber bandwidth continues to outpace

More information

[2006] IEEE. Reprinted, with permission, from [Wenjing Jia, Huaifeng Zhang, Xiangjian He, and Qiang Wu, A Comparison on Histogram Based Image

[2006] IEEE. Reprinted, with permission, from [Wenjing Jia, Huaifeng Zhang, Xiangjian He, and Qiang Wu, A Comparison on Histogram Based Image [6] IEEE. Reprinted, with permission, from [Wenjing Jia, Huaifeng Zhang, Xiangjian He, and Qiang Wu, A Comparison on Histogram Based Image Matching Methods, Video and Signal Based Surveillance, 6. AVSS

More information

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

2014 Summer School on MPEG/VCEG Video. Video Coding Concept 2014 Summer School on MPEG/VCEG Video 1 Video Coding Concept Outline 2 Introduction Capture and representation of digital video Fundamentals of video coding Summary Outline 3 Introduction Capture and representation

More information

Structure-adaptive Image Denoising with 3D Collaborative Filtering

Structure-adaptive Image Denoising with 3D Collaborative Filtering , pp.42-47 http://dx.doi.org/10.14257/astl.2015.80.09 Structure-adaptive Image Denoising with 3D Collaborative Filtering Xuemei Wang 1, Dengyin Zhang 2, Min Zhu 2,3, Yingtian Ji 2, Jin Wang 4 1 College

More information

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 5, MAY

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 5, MAY IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 5, MAY 2015 1573 Graph-Based Representation for Multiview Image Geometry Thomas Maugey, Member, IEEE, Antonio Ortega, Fellow Member, IEEE, and Pascal

More information

International Journal of Emerging Technology and Advanced Engineering Website: (ISSN , Volume 2, Issue 4, April 2012)

International Journal of Emerging Technology and Advanced Engineering Website:   (ISSN , Volume 2, Issue 4, April 2012) A Technical Analysis Towards Digital Video Compression Rutika Joshi 1, Rajesh Rai 2, Rajesh Nema 3 1 Student, Electronics and Communication Department, NIIST College, Bhopal, 2,3 Prof., Electronics and

More information

An Approach for Reduction of Rain Streaks from a Single Image

An Approach for Reduction of Rain Streaks from a Single Image An Approach for Reduction of Rain Streaks from a Single Image Vijayakumar Majjagi 1, Netravati U M 2 1 4 th Semester, M. Tech, Digital Electronics, Department of Electronics and Communication G M Institute

More information

Multimedia Systems Image III (Image Compression, JPEG) Mahdi Amiri April 2011 Sharif University of Technology

Multimedia Systems Image III (Image Compression, JPEG) Mahdi Amiri April 2011 Sharif University of Technology Course Presentation Multimedia Systems Image III (Image Compression, JPEG) Mahdi Amiri April 2011 Sharif University of Technology Image Compression Basics Large amount of data in digital images File size

More information

Mesh Based Interpolative Coding (MBIC)

Mesh Based Interpolative Coding (MBIC) Mesh Based Interpolative Coding (MBIC) Eckhart Baum, Joachim Speidel Institut für Nachrichtenübertragung, University of Stuttgart An alternative method to H.6 encoding of moving images at bit rates below

More information

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami to MPEG Prof. Pratikgiri Goswami Electronics & Communication Department, Shree Swami Atmanand Saraswati Institute of Technology, Surat. Outline of Topics 1 2 Coding 3 Video Object Representation Outline

More information

Stereo Vision II: Dense Stereo Matching

Stereo Vision II: Dense Stereo Matching Stereo Vision II: Dense Stereo Matching Nassir Navab Slides prepared by Christian Unger Outline. Hardware. Challenges. Taxonomy of Stereo Matching. Analysis of Different Problems. Practical Considerations.

More information

EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.

EE 5359 MULTIMEDIA PROCESSING SPRING Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H. EE 5359 MULTIMEDIA PROCESSING SPRING 2011 Final Report IMPLEMENTATION AND ANALYSIS OF DIRECTIONAL DISCRETE COSINE TRANSFORM IN H.264 Under guidance of DR K R RAO DEPARTMENT OF ELECTRICAL ENGINEERING UNIVERSITY

More information

CIRCULAR MOIRÉ PATTERNS IN 3D COMPUTER VISION APPLICATIONS

CIRCULAR MOIRÉ PATTERNS IN 3D COMPUTER VISION APPLICATIONS CIRCULAR MOIRÉ PATTERNS IN 3D COMPUTER VISION APPLICATIONS Setiawan Hadi Mathematics Department, Universitas Padjadjaran e-mail : shadi@unpad.ac.id Abstract Geometric patterns generated by superimposing

More information

Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction

Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction Yongying Gao and Hayder Radha Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48823 email:

More information

Performance Comparison between DWT-based and DCT-based Encoders

Performance Comparison between DWT-based and DCT-based Encoders , pp.83-87 http://dx.doi.org/10.14257/astl.2014.75.19 Performance Comparison between DWT-based and DCT-based Encoders Xin Lu 1 and Xuesong Jin 2 * 1 School of Electronics and Information Engineering, Harbin

More information

Computer Vision 2. SS 18 Dr. Benjamin Guthier Professur für Bildverarbeitung. Computer Vision 2 Dr. Benjamin Guthier

Computer Vision 2. SS 18 Dr. Benjamin Guthier Professur für Bildverarbeitung. Computer Vision 2 Dr. Benjamin Guthier Computer Vision 2 SS 18 Dr. Benjamin Guthier Professur für Bildverarbeitung Computer Vision 2 Dr. Benjamin Guthier 1. IMAGE PROCESSING Computer Vision 2 Dr. Benjamin Guthier Content of this Chapter Non-linear

More information

Perceptual Grouping from Motion Cues Using Tensor Voting

Perceptual Grouping from Motion Cues Using Tensor Voting Perceptual Grouping from Motion Cues Using Tensor Voting 1. Research Team Project Leader: Graduate Students: Prof. Gérard Medioni, Computer Science Mircea Nicolescu, Changki Min 2. Statement of Project

More information

A Statistical Consistency Check for the Space Carving Algorithm.

A Statistical Consistency Check for the Space Carving Algorithm. A Statistical Consistency Check for the Space Carving Algorithm. A. Broadhurst and R. Cipolla Dept. of Engineering, Univ. of Cambridge, Cambridge, CB2 1PZ aeb29 cipolla @eng.cam.ac.uk Abstract This paper

More information

Implementation and analysis of Directional DCT in H.264

Implementation and analysis of Directional DCT in H.264 Implementation and analysis of Directional DCT in H.264 EE 5359 Multimedia Processing Guidance: Dr K R Rao Priyadarshini Anjanappa UTA ID: 1000730236 priyadarshini.anjanappa@mavs.uta.edu Introduction A

More information

Compression of Stereo Images using a Huffman-Zip Scheme

Compression of Stereo Images using a Huffman-Zip Scheme Compression of Stereo Images using a Huffman-Zip Scheme John Hamann, Vickey Yeh Department of Electrical Engineering, Stanford University Stanford, CA 94304 jhamann@stanford.edu, vickey@stanford.edu Abstract

More information

Video Coding Using Spatially Varying Transform

Video Coding Using Spatially Varying Transform Video Coding Using Spatially Varying Transform Cixun Zhang 1, Kemal Ugur 2, Jani Lainema 2, and Moncef Gabbouj 1 1 Tampere University of Technology, Tampere, Finland {cixun.zhang,moncef.gabbouj}@tut.fi

More information

Anno accademico 2006/2007. Davide Migliore

Anno accademico 2006/2007. Davide Migliore Robotica Anno accademico 6/7 Davide Migliore migliore@elet.polimi.it Today What is a feature? Some useful information The world of features: Detectors Edges detection Corners/Points detection Descriptors?!?!?

More information

Motion and Tracking. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

Motion and Tracking. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE) Motion and Tracking Andrea Torsello DAIS Università Ca Foscari via Torino 155, 30172 Mestre (VE) Motion Segmentation Segment the video into multiple coherently moving objects Motion and Perceptual Organization

More information

Motion Estimation for Video Coding Standards

Motion Estimation for Video Coding Standards Motion Estimation for Video Coding Standards Prof. Ja-Ling Wu Department of Computer Science and Information Engineering National Taiwan University Introduction of Motion Estimation The goal of video compression

More information

Accurate 3D Face and Body Modeling from a Single Fixed Kinect

Accurate 3D Face and Body Modeling from a Single Fixed Kinect Accurate 3D Face and Body Modeling from a Single Fixed Kinect Ruizhe Wang*, Matthias Hernandez*, Jongmoo Choi, Gérard Medioni Computer Vision Lab, IRIS University of Southern California Abstract In this

More information

MR IMAGE SEGMENTATION

MR IMAGE SEGMENTATION MR IMAGE SEGMENTATION Prepared by : Monil Shah What is Segmentation? Partitioning a region or regions of interest in images such that each region corresponds to one or more anatomic structures Classification

More information

Image Restoration and Reconstruction

Image Restoration and Reconstruction Image Restoration and Reconstruction Image restoration Objective process to improve an image, as opposed to the subjective process of image enhancement Enhancement uses heuristics to improve the image

More information

Guided Image Super-Resolution: A New Technique for Photogeometric Super-Resolution in Hybrid 3-D Range Imaging

Guided Image Super-Resolution: A New Technique for Photogeometric Super-Resolution in Hybrid 3-D Range Imaging Guided Image Super-Resolution: A New Technique for Photogeometric Super-Resolution in Hybrid 3-D Range Imaging Florin C. Ghesu 1, Thomas Köhler 1,2, Sven Haase 1, Joachim Hornegger 1,2 04.09.2014 1 Pattern

More information

Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling

Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling Moritz Baecher May 15, 29 1 Introduction Edge-preserving smoothing and super-resolution are classic and important

More information

Digital Image Processing. Prof. P. K. Biswas. Department of Electronic & Electrical Communication Engineering

Digital Image Processing. Prof. P. K. Biswas. Department of Electronic & Electrical Communication Engineering Digital Image Processing Prof. P. K. Biswas Department of Electronic & Electrical Communication Engineering Indian Institute of Technology, Kharagpur Lecture - 21 Image Enhancement Frequency Domain Processing

More information

Depth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth

Depth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth Common Classification Tasks Recognition of individual objects/faces Analyze object-specific features (e.g., key points) Train with images from different viewing angles Recognition of object classes Analyze

More information

ELEC Dr Reji Mathew Electrical Engineering UNSW

ELEC Dr Reji Mathew Electrical Engineering UNSW ELEC 4622 Dr Reji Mathew Electrical Engineering UNSW Review of Motion Modelling and Estimation Introduction to Motion Modelling & Estimation Forward Motion Backward Motion Block Motion Estimation Motion

More information

Image Processing Via Pixel Permutations

Image Processing Via Pixel Permutations Image Processing Via Pixel Permutations Michael Elad The Computer Science Department The Technion Israel Institute of technology Haifa 32000, Israel Joint work with Idan Ram Israel Cohen The Electrical

More information

x' = c 1 x + c 2 y + c 3 xy + c 4 y' = c 5 x + c 6 y + c 7 xy + c 8

x' = c 1 x + c 2 y + c 3 xy + c 4 y' = c 5 x + c 6 y + c 7 xy + c 8 1. Explain about gray level interpolation. The distortion correction equations yield non integer values for x' and y'. Because the distorted image g is digital, its pixel values are defined only at integer

More information

Video Quality Analysis for H.264 Based on Human Visual System

Video Quality Analysis for H.264 Based on Human Visual System IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021 ISSN (p): 2278-8719 Vol. 04 Issue 08 (August. 2014) V4 PP 01-07 www.iosrjen.org Subrahmanyam.Ch 1 Dr.D.Venkata Rao 2 Dr.N.Usha Rani 3 1 (Research

More information

Image Processing Lecture 10

Image Processing Lecture 10 Image Restoration Image restoration attempts to reconstruct or recover an image that has been degraded by a degradation phenomenon. Thus, restoration techniques are oriented toward modeling the degradation

More information

Image Restoration Using DNN

Image Restoration Using DNN Image Restoration Using DNN Hila Levi & Eran Amar Images were taken from: http://people.tuebingen.mpg.de/burger/neural_denoising/ Agenda Domain Expertise vs. End-to-End optimization Image Denoising and

More information

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 7, NO. 2, APRIL 1997 429 Express Letters A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation Jianhua Lu and

More information

VC 12/13 T16 Video Compression

VC 12/13 T16 Video Compression VC 12/13 T16 Video Compression Mestrado em Ciência de Computadores Mestrado Integrado em Engenharia de Redes e Sistemas Informáticos Miguel Tavares Coimbra Outline The need for compression Types of redundancy

More information

Image Segmentation Via Iterative Geodesic Averaging

Image Segmentation Via Iterative Geodesic Averaging Image Segmentation Via Iterative Geodesic Averaging Asmaa Hosni, Michael Bleyer and Margrit Gelautz Institute for Software Technology and Interactive Systems, Vienna University of Technology Favoritenstr.

More information

Investigation of the GoP Structure for H.26L Video Streams

Investigation of the GoP Structure for H.26L Video Streams Investigation of the GoP Structure for H.26L Video Streams F. Fitzek P. Seeling M. Reisslein M. Rossi M. Zorzi acticom GmbH mobile networks R & D Group Germany [fitzek seeling]@acticom.de Arizona State

More information

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate

More information

Coding of Coefficients of two-dimensional non-separable Adaptive Wiener Interpolation Filter

Coding of Coefficients of two-dimensional non-separable Adaptive Wiener Interpolation Filter Coding of Coefficients of two-dimensional non-separable Adaptive Wiener Interpolation Filter Y. Vatis, B. Edler, I. Wassermann, D. T. Nguyen and J. Ostermann ABSTRACT Standard video compression techniques

More information

CS4442/9542b Artificial Intelligence II prof. Olga Veksler

CS4442/9542b Artificial Intelligence II prof. Olga Veksler CS4442/9542b Artificial Intelligence II prof. Olga Veksler Lecture 8 Computer Vision Introduction, Filtering Some slides from: D. Jacobs, D. Lowe, S. Seitz, A.Efros, X. Li, R. Fergus, J. Hayes, S. Lazebnik,

More information

Segmentation and Tracking of Partial Planar Templates

Segmentation and Tracking of Partial Planar Templates Segmentation and Tracking of Partial Planar Templates Abdelsalam Masoud William Hoff Colorado School of Mines Colorado School of Mines Golden, CO 800 Golden, CO 800 amasoud@mines.edu whoff@mines.edu Abstract

More information

VIDEO COMPRESSION STANDARDS

VIDEO COMPRESSION STANDARDS VIDEO COMPRESSION STANDARDS Family of standards: the evolution of the coding model state of the art (and implementation technology support): H.261: videoconference x64 (1988) MPEG-1: CD storage (up to

More information

Image Frame Fusion using 3D Anisotropic Diffusion

Image Frame Fusion using 3D Anisotropic Diffusion Image Frame Fusion using 3D Anisotropic Diffusion Fatih Kahraman 1, C. Deniz Mendi 1, Muhittin Gökmen 2 1 TUBITAK Marmara Research Center, Informatics Institute, Kocaeli, Turkey 2 ITU Computer Engineering

More information

Digital Image Processing COSC 6380/4393

Digital Image Processing COSC 6380/4393 Digital Image Processing COSC 6380/4393 Lecture 21 Nov 16 th, 2017 Pranav Mantini Ack: Shah. M Image Processing Geometric Transformation Point Operations Filtering (spatial, Frequency) Input Restoration/

More information

Spatio-Temporal Stereo Disparity Integration

Spatio-Temporal Stereo Disparity Integration Spatio-Temporal Stereo Disparity Integration Sandino Morales and Reinhard Klette The.enpeda.. Project, The University of Auckland Tamaki Innovation Campus, Auckland, New Zealand pmor085@aucklanduni.ac.nz

More information

Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology

Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology Course Presentation Multimedia Systems Video II (Video Coding) Mahdi Amiri April 2012 Sharif University of Technology Video Coding Correlation in Video Sequence Spatial correlation Similar pixels seem

More information

BLIND QUALITY ASSESSMENT OF JPEG2000 COMPRESSED IMAGES USING NATURAL SCENE STATISTICS. Hamid R. Sheikh, Alan C. Bovik and Lawrence Cormack

BLIND QUALITY ASSESSMENT OF JPEG2000 COMPRESSED IMAGES USING NATURAL SCENE STATISTICS. Hamid R. Sheikh, Alan C. Bovik and Lawrence Cormack BLIND QUALITY ASSESSMENT OF JPEG2 COMPRESSED IMAGES USING NATURAL SCENE STATISTICS Hamid R. Sheikh, Alan C. Bovik and Lawrence Cormack Laboratory for Image and Video Engineering, Department of Electrical

More information

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong)

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong) Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong) References: [1] http://homepages.inf.ed.ac.uk/rbf/hipr2/index.htm [2] http://www.cs.wisc.edu/~dyer/cs540/notes/vision.html

More information