Sprite Generation and Coding in Multiview Image Sequences
302 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 2, MARCH 2000

Sprite Generation and Coding in Multiview Image Sequences

Nikos Grammalidis, Student Member, IEEE, Dimitris Beletsiotis, and Michael G. Strintzis, Senior Member, IEEE

Abstract—A novel algorithm for the generation of background sprite images from multiview image sequences is presented. A dynamic programming algorithm, first proposed in [1], using a multiview matching cost as well as pure geometrical constraints, is used to provide an estimate of the disparity field and to identify occluded areas. By combining motion, disparity, and occlusion information, a sprite image corresponding to the first (main) view at the first time instant is generated. Image pixels from other views that are occluded in the main view are also added to the sprite. Finally, the sprite coding method defined by MPEG-4 is extended for multiview image sequences based on the generated sprite. Experimental results are presented, demonstrating the performance of the proposed technique and comparing it with standard MPEG-4 coding methods applied independently to each view.

Index Terms—Dynamic programming, MPEG-4, multiview video applications, sprites.

I. INTRODUCTION

A background sprite is an image composed of pixels belonging to a video object visible throughout a video segment. For instance, a sprite generated from a panning sequence will contain all visible pixels of the background object throughout the sequence. Certain portions of this background may not be visible in certain frames, due to occlusion by the foreground objects or to the camera motion. Since the sprite contains all parts of the background that were visible at least once in the image sequence, it can be used for the reconstruction or the predictive coding of the background. Sprites for background representation are also commonly referred to in the literature as salient stills [2], [3] or background mosaics [4]–[10].
The procedure for generating background sprite images from a video sequence typically starts by detecting scene cuts (changes) and thus dividing the video sequence into subsequences containing similar content. A background mosaic (sprite) is then generated for each subsequence by warping (aligning) different instances of the background region to a fixed coordinate system, after estimating their motion using a two-dimensional (2-D) or three-dimensional (3-D) motion model. Finally, information from all warped images is combined into the sprite image by using median filtering or averaging operations.

Manuscript received March 15, 1999; revised October 20. This work was supported by the EU IST INTERFACE and the GSRT PAVE and PANORAMA projects. This paper was recommended by Guest Editor Y. Wang. The authors are with the Information Processing Laboratory, Department of Electrical and Computer Engineering, University of Thessaloniki, Greece (e-mail: ngramm@dion.ee.auth.gr; dbel@panorama.ee.auth.gr; strintzi@eng.auth.gr).

A method for encoding sprite images has been included in the emerging MPEG-4 standard [11], [12]. This method is based on describing simple camera motion models (e.g., translational, affine, or perspective) by the 2-D motion of a number of points, called reference points. Since the sprite images are often much larger than the initial images, their coding is complicated by the significant delay (latency) incurred when they are coded and decoded as I-frames. Since the frames following the first are coded and decoded based on the sprite image, such delays may hinder real-time implementation. However, in MPEG-4, the sprite coding syntax allows large static sprite images to be transmitted piece by piece as well as hierarchically, so that the latency incurred in displaying a video sequence is significantly reduced.
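To make the reference-point idea concrete, the following is a small sketch (Python/NumPy; the function name and the exact-solve formulation are our own illustration, not MPEG-4 normative syntax) of how a 6-parameter affine warp can be recovered from the motion trajectories of three non-collinear reference points:

```python
import numpy as np

def affine_from_reference_points(ref, warped):
    """Recover a 6-parameter affine warp (x' = a1*x + a2*y + a3,
    y' = a4*x + a5*y + a6) exactly from the motion trajectories of
    three non-collinear reference points.  ref and warped are (3, 2)
    arrays of corresponding positions; three points give six linear
    equations in the six parameters."""
    A = np.zeros((6, 6))
    b = np.empty(6)
    for k in range(3):
        x, y = ref[k]
        A[2 * k] = [x, y, 1.0, 0.0, 0.0, 0.0]
        A[2 * k + 1] = [0.0, 0.0, 0.0, x, y, 1.0]
        b[2 * k], b[2 * k + 1] = warped[k]
    return np.linalg.solve(A, b)
```

A perspective warp would analogously be described by four reference points; the decoder re-derives the same transform from the transmitted point trajectories.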
In earlier sprite-generation procedures, no segmentation was used, and the generated sprites always corresponded to the region with the dominant motion, i.e., usually the background. In this case, foreground objects were removed by using temporal averaging or median filtering. However, in order to improve the quality of the generated sprite images and to be able to generate sprites for foreground objects, a number of techniques have been proposed to segment the scene into a number of layers [10], [13]–[15]. Layers are regions which typically correspond to the physical objects in the scene. If a multilayered scene description is available, sprites can easily be obtained for each layer using standard sprite-generation techniques, or more sophisticated ones involving depth and transparent objects [15]. Although much effort has been spent in the past to design sophisticated layer-segmentation procedures, this remains in many respects an open problem.

In this paper, the sprite generation and coding procedures are generalized for the case of multiocular systems, consisting of two or more cameras. Multiocular systems provide the viewer with the appropriate monoscopic or stereoscopic view of a scene, depending on his or her position. Several coding schemes have been proposed for stereoscopic [16] and multiview image sequences [1], [17]. A common characteristic of these coding schemes is the use of disparity information to eliminate redundancies between images from different views. Furthermore, the detection of occlusions, i.e., points not visible in all views, provides additional information that can improve coding results. Techniques based on dynamic programming have been applied for the purposes of disparity estimation and simultaneous occlusion detection [1], [18]–[21] in stereoscopic sequences.
A significant advantage of these techniques is that they provide a global solution to the disparity-estimation/occlusion-detection problem under local constraints, such as constraints related to correlation, smoothness, or the disparity gradient limit.
GRAMMALIDIS et al.: SPRITE GENERATION AND CODING IN MULTIVIEW IMAGE SEQUENCES 303

Fig. 1. A multiview system with three viewpoints.

In previously proposed sprite-generation techniques [4]–[6], motion information has been extensively used to identify the objects in the scene (segmentation) and to determine their position in the sprite image (warping). The present paper proposes novel techniques for sprite generation in which foreground and background segmentation is mainly based on the estimated disparity and occlusion information. Clearly, this is a more natural way for background identification, especially in sequences with very small motion, where segmentation based on motion may fail. Furthermore, motion information is used in this paper in a second segmentation step, in order to assign small or occluded regions to the background or foreground regions.

The main contribution of this paper is the use of disparity and occlusion information to add information from all available views to the background sprite image. For example, a part of the background that is occluded in one view may be added to the sprite from another view, where this part appears. The sprite is generated in two stages: the first involves the frames from the first (leftmost) view, uses disparity and occlusion information for segmentation purposes, and is otherwise similar to previously proposed sprite-generation procedures [5], [11]. The second stage involves the frames from the other views and is based exclusively on the estimated disparity and occlusion information. The sprite coding mode defined by MPEG-4 is then used to code the background region in the entire multiview sequence. Full compliance with the MPEG-4 sprite coding mode is achieved by using the same 6-parameter affine model to describe both the motion and the disparity information defining the warping transformation between a frame and the sprite image.
This model has been shown to be efficient in situations where either the structure of the imaged scene is approximately planar, or the scene is sufficiently far from the camera [6], [22]. The entire multiview sequence can then be coded, according to the MPEG-4 sprite coding mode, by reordering all the frames from a group of frames (GOF) into a single sequence as follows: first the frames from the first view, then the corresponding frames from the second view, and so on. An advantage of this technique is that no disparity or occlusion information needs to be coded for the background region. Experimental results demonstrate a significant reduction in the required bit rate when a single sprite image is used for the entire background of the multiview sequence.

The rest of the paper is organized as follows. The algorithm used for disparity estimation and occlusion detection, which was described in detail in [1], is summarized in Section II. The procedure used to generate a sprite image from the first (leftmost) view of a multiview image sequence is described in Section III. In Section IV, the procedure to generate sprites from the other available views, based on disparity and occlusion information, is presented. Then, the sprite coding scheme defined by MPEG-4 is generalized for the case of multiview sequences. In Section V, experimental results are obtained using a four-view sequence and a stereo sequence. Comparisons are made against standard MPEG-4 coding schemes, with or without the use of sprite images, applied independently to each view. Finally, conclusions and suggestions for future extensions of the proposed approach are presented in Section VI.

II. A METHOD FOR DISPARITY ESTIMATION AND OCCLUSION DETECTION

Consider a multiocular system with N viewpoints arranged on a horizontal line. A trinocular system (N = 3) is shown in Fig. 1. Let I_i be the image corresponding to viewpoint i, and let x_i denote the x-coordinate of the (perspective) projection of a 3-D point P in I_i.
We shall estimate disparity and occluded areas for the first (leftmost) image I_1. The disparity with respect to I_i of a point P visible in I_1 is defined by

    d_i(P) = x_i - x_1,   if P is visible in I_i
    d_i(P) undefined,     if P is occluded in I_i.    (1)
By assuming that central projection is used and that all optical axes are parallel, it may be shown [1] that if P is visible in I_i, then its disparity equals

    d_i = f B_i / Z    (2)

where B_i is the baseline corresponding to the ith viewpoint, Z is the depth of P, and f is the (common) focal length. In the example of Fig. 1, points belonging to three of the marked segments are visible in both I_2 and I_3; thus, both d_2 and d_3 are defined for these points. Since the depth is constant within each of these segments, (2) implies that the corresponding disparities d_2 and d_3 also remain constant within each of these segments. However, for the points of one marked segment only d_2 is defined, while for the points of another, neither d_2 nor d_3 is defined.

A pixel in the first view will be said to be in state S_i if it is visible in views I_1, ..., I_i. In particular, it will be in state S_N if it is visible in all views, and in state S_1 if it is visible only in I_1, i.e., occluded in all views but the first. For a pixel in state S_N, it is seen from (2) that

    d_i = (B_i / B_N) d_N,   i = 2, ..., N.    (3)

Thus, knowledge of d_N implies knowledge of all d_i, i = 2, ..., N. A dynamic programming scheme was proposed in [1] and [23] so as to estimate the disparity d_N and the state for each pixel in I_1. The corresponding valid disparity values are then found from (3). The disparity field obtained in this manner corresponds to each pixel of the first (leftmost) view and as such may be termed the L→R disparity field, to distinguish it from the converse R→L disparity field, corresponding to each pixel of the Nth (rightmost) view.

For pixels in state S_N, the multiview matching cost between all corresponding pixels is defined as the weighted sum

    C(p, d_N) = sum_{i=2}^{N} w_i sum_{q in W(p)} | I_1(q) - I_i(q + d_i) |    (4)

where the disparity d_i, applied to the horizontal coordinate, is given by (3), rounded to the nearest integer, the fixed weights w_i are chosen heuristically, and W(p) is a window centered on the working pixel p.
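A rough sketch of such a window-based multiview matching cost, under the assumption of purely horizontal disparities scaled by the baseline ratio of (3) (the function name, the clamping at image borders, and the weight convention are illustrative choices, not the exact formulation of [1]):

```python
import numpy as np

def multiview_matching_cost(images, p, d_N, baselines, weights, win=2):
    """Matching cost for pixel p = (row, col) of the first view, given a
    candidate disparity d_N toward the last (Nth) view.  Following (3),
    the disparity toward view i is d_i = (B_i / B_N) * d_N, rounded to
    the nearest integer; the cost sums weighted absolute luminance
    differences over a (2*win+1)^2 window centered on p, with indices
    clamped at the image borders."""
    I1 = images[0]
    rows, cols = I1.shape
    r, c = p
    cost = 0.0
    for i in range(1, len(images)):
        d_i = int(round(baselines[i] / baselines[-1] * d_N))
        for dr in range(-win, win + 1):
            for dc in range(-win, win + 1):
                rr = min(max(r + dr, 0), rows - 1)
                cc = min(max(c + dc, 0), cols - 1)
                ci = min(max(cc + d_i, 0), cols - 1)
                cost += weights[i] * abs(float(I1[rr, cc]) -
                                         float(images[i][rr, ci]))
    return cost
```

For a correct candidate disparity, the cost is near zero in untextured-noise-free regions and grows with the photometric mismatch otherwise.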
A dynamic programming algorithm for the calculation of the disparity of pixels in I_1 and for the identification of areas occluded in at least one view may then be based on the scheme shown schematically in Fig. 2. The multiview matching cost (4) is associated with the matching transition, while a fixed occlusion cost is used for the occlusion transitions. Using this algorithm, only two states are identified: S_N, assigned to pixels visible in all views, and an occlusion state, indicating that the pixel is occluded in at least one view. Finer estimation of disparity and detection of occluded regions is achieved by iteratively applying the same algorithm within each of the occluded segments detected by the above algorithm. As detailed in [1], the same dynamic programming algorithm can be used to provide the R→L disparity field and the corresponding state information with respect to the rightmost view.

Fig. 2. Allowed transitions between states in the general (N-view) case. The disparity d = d_N is estimated, and the occluded pixels (those not in state S_N) are detected.

III. MULTIVIEW SPRITE GENERATION USING INFORMATION FROM THE FIRST VIEW

Sprites are typically generated from monoscopic image sequences by first using a scene-cut detector to identify groups of frames in which a significant part of the scene (usually a large part of the background) remains substantially the same. Each of the resulting subsequences is assumed to contain similar image content, and each is processed independently of the others. Each frame of the subsequence is segmented into a number of regions, each defining a different object. Then, a binary mask representing the shape of each object is produced, which, together with the luminance (or color) information for this object, comprises the video object plane (VOP) in MPEG-4 terminology. The segmentation may be based on luminance, motion, and, in the case of multiview image sequences, disparity information.
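Returning to the dynamic program of Section II: the following is a deliberately simplified single-scanline sketch with one matched state and one occluded state (the full algorithm of Fig. 2 uses the state set S_1, ..., S_N and iterates inside detected occlusions; the linear smoothness term and the cost shapes here are illustrative assumptions of ours):

```python
import numpy as np

def scanline_dp(cost, occ_cost, smooth=1.0):
    """cost: (W, D) array of matching costs for each pixel/candidate
    disparity along one scanline.  Each pixel is labeled either with a
    disparity index (matched) or -1 (occluded, paying a fixed occlusion
    cost); dynamic programming returns the globally cheapest labeling,
    with a linear smoothness penalty on disparity jumps."""
    W, D = cost.shape
    INF = float("inf")
    best = np.full((W, D), INF)      # min cost ending matched at (x, d)
    best_occ = np.full(W, INF)       # min cost ending occluded at x
    back = np.zeros((W, D), int)     # predecessor label (-1 = occluded)
    back_occ = np.zeros(W, int)
    best[0] = cost[0]
    best_occ[0] = occ_cost
    for x in range(1, W):
        prev_d = int(np.argmin(best[x - 1]))
        for d in range(D):
            cand = best[x - 1] + smooth * np.abs(np.arange(D) - d)
            j = int(np.argmin(cand))
            if cand[j] <= best_occ[x - 1]:
                best[x, d] = cand[j] + cost[x, d]
                back[x, d] = j
            else:
                best[x, d] = best_occ[x - 1] + cost[x, d]
                back[x, d] = -1
        if best_occ[x - 1] <= best[x - 1].min():
            best_occ[x] = best_occ[x - 1] + occ_cost
            back_occ[x] = -1
        else:
            best_occ[x] = best[x - 1].min() + occ_cost
            back_occ[x] = prev_d
    # backtrack the globally cheapest labeling
    labels = np.empty(W, int)
    labels[-1] = -1 if best_occ[-1] < best[-1].min() else int(np.argmin(best[-1]))
    for x in range(W - 1, 0, -1):
        labels[x - 1] = back_occ[x] if labels[x] == -1 else back[x, labels[x]]
    return labels
```

A small occlusion cost lets the program declare pixels occluded wherever no disparity matches well; a large one forces a dense disparity assignment.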
Segmentation based on disparity information has the following advantages.

1) In sequences produced in videoconferencing and other similar applications, the observed motion may be very small and inadequate for accurate segmentation. However, for such sequences, efficient segmentation into foreground and background regions is possible by using multiple cameras and exploiting disparity information.

2) Luminance-edge and motion-edge information may be conveniently exploited in disparity-estimation techniques based on dynamic programming, so that disparity changes are encouraged near luminance or motion edges [20]. Thus, the produced objects have more or less constant disparity, motion, and texture information. This approach is very
suitable for situations where the disparity variations between the background region and the foreground objects are small (e.g., if the distance between the objects and the camera is very large). In such situations, even though segmentation based on disparity alone might fail, the use of luminance or motion information may significantly improve the final segmentation result.

3) Disparity provides a convenient means of layering the objects in the scene: objects with smaller absolute disparity values are at a larger distance from the viewer and are accordingly assigned to deeper layers.

For the above reasons, we use disparity information as the main cue in the proposed segmentation procedure. Specifically, in order to generate the background sprite image, the background region is identified in each frame of the first (leftmost) view. A two-stage motion- and disparity-based segmentation technique is proposed to identify the foreground and background regions. In the first stage, a simple thresholding of the disparity fields is used to initialize the segmentation map for each frame. The threshold value is determined on the basis of the disparity histogram. Pixels occluded in at least one view are left out of this initial classification. A second stage is used to correct errors in the initial segmentation result caused by minor local disparity changes or estimation errors. A connected-component labeling procedure is used to find small connected regions, which are labeled as artifacts and are excluded from the motion-model estimation stage which follows. After the background region has been identified, its motion is described using a 6-parameter 2-D affine motion model.

Fig. 3. Object segmentation and motion estimation.

Fig. 4. Procedure for generating sprites from multiview image sequences.
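As an illustration, the two-stage idea (disparity thresholding followed by removal of small connected regions) and the least-squares fit of the 6-parameter affine model can be sketched as follows. This is a rough Python/NumPy sketch: the function names, the median as a stand-in for a histogram-derived threshold, and the 4-neighborhood labeling are our own illustrative choices, not the paper's exact procedure.

```python
import numpy as np
from collections import deque

def initial_segmentation(disp, occluded, min_size=20):
    """Stage 1: threshold the disparity field (the median of the valid
    disparities stands in for a threshold picked from the histogram);
    then drop connected regions smaller than min_size as artifacts.
    Returns 1 = foreground, 0 = background, -1 = unclassified
    (occluded pixels and small artifacts)."""
    t = np.median(disp[~occluded])
    seg = np.where(disp > t, 1, 0)
    seg[occluded] = -1
    H, W = seg.shape
    seen = np.zeros((H, W), bool)
    for r in range(H):
        for c in range(W):
            if seen[r, c] or seg[r, c] == -1:
                continue
            # 4-neighborhood BFS connected-component labeling
            q, comp, lab = deque([(r, c)]), [(r, c)], seg[r, c]
            seen[r, c] = True
            while q:
                y, x = q.popleft()
                for yy, xx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= yy < H and 0 <= xx < W and not seen[yy, xx] \
                            and seg[yy, xx] == lab:
                        seen[yy, xx] = True
                        q.append((yy, xx))
                        comp.append((yy, xx))
            if len(comp) < min_size:        # too small: mark as artifact
                for y, x in comp:
                    seg[y, x] = -1
    return seg

def fit_affine(src, dst):
    """Least-squares fit of a 6-parameter affine model,
    x' = a1*x + a2*y + a3 and y' = a4*x + a5*y + a6, from M point
    correspondences src -> dst (each of shape (M, 2)):
    2M linear equations in six unknowns."""
    M = len(src)
    A = np.zeros((2 * M, 6))
    b = np.empty(2 * M)
    A[0::2, 0], A[0::2, 1], A[0::2, 2] = src[:, 0], src[:, 1], 1.0
    A[1::2, 3], A[1::2, 4], A[1::2, 5] = src[:, 0], src[:, 1], 1.0
    b[0::2], b[1::2] = dst[:, 0], dst[:, 1]
    return np.linalg.lstsq(A, b, rcond=None)[0]

def apply_affine(a, pts):
    x, y = pts[:, 0], pts[:, 1]
    return np.stack([a[0] * x + a[1] * y + a[2],
                     a[3] * x + a[4] * y + a[5]], axis=1)
```

With more than three block-matching correspondences, the least-squares solution averages out individual matching errors in the estimated background motion.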
The 6-parameter affine model can be expressed as

    x_{t+1} = a_1 x_t + a_2 y_t + a_3
    y_{t+1} = a_4 x_t + a_5 y_t + a_6    (6)

where (x_t, y_t) is the pixel position at time t and (x_{t+1}, y_{t+1}) is the corresponding pixel position at time t + 1; the upper and lower indices in the pixel notation indicate the time and the view (here the first), respectively. The estimation of the parameters a_1, ..., a_6 is based on correspondences obtained using a standard exhaustive-search block-matching procedure. If M matches are available in the background region, (6) yields a system of 2M equations with six unknowns. The unknown parameters are then estimated from this system of equations using least-squares techniques.

The pixels in occluded or very small regions that were excluded from the motion-estimation stage are now assigned to the background region if the average displaced frame difference (DFD) for such a region is smaller than the average DFD in the background region. In the above, the DFD is computed from (6)
using the estimated motion parameters; here, I_t^i denotes the frame from the ith view at time t, and the average is taken over the number of pixels in the region considered. Using this procedure, the final segmentation map is obtained, and the final motion parameters for the background region are recalculated. The entire segmentation and motion-estimation scheme is summarized in Fig. 3.

The sprite-generation procedure for the background from the main (leftmost) view uses the coordinate system of the first frame as the reference coordinate system for the sprite image. For each frame, the estimated motion parameters are used to compute the motion (warping) transformation of the object between the current frame and the reference coordinate system. Using (6), this transformation can be written as

    p_0 = W_t(p_t)    (7)

where W_t is the composition of the frame-to-frame affine motions from time t back to the reference frame and p_t denotes a pixel position in frame t of the first view. Thus, the video object is warped toward the reference coordinate system. After processing all frames of the sequence, a temporal median is used to produce the final sprite image from the warped objects. An alternative method for producing the final sprite image is a progressive averaging procedure [11]

    S_n = (n S_{n-1} + W_n) / (n + 1)    (8)

in which averaging may be performed on the spot, as each sample from the warped images becomes available; here, S_n and W_n are the sprite image and the warped image corresponding to the nth time instant. This method is faster and requires less memory than the former; however, the use of median filtering in the former method improves the quality of the sprite images, since outliers (wrong samples) and noise are eliminated.

Fig. 5. Estimated disparity fields and state maps for the left and right frames of the Claude sequence. Occluded areas are shown in black. (a) L→R disparity field. (b) R→L disparity field. (c) State map for the L→R disparity field. (d) State map for the R→L disparity field.

IV.
MULTIVIEW SPRITE GENERATION AND CODING USING INFORMATION FROM THE REMAINING VIEWS

After the initial sprite-image generation based on information obtained from the first view, information from the remaining views may be added based on the estimated disparity and occlusion information. Specifically, the warping parameters corresponding to a frame of view i, in the general case of N views, are computed on the basis of the estimated disparity information. The position of a pixel of the background in the frame of view i is modeled by an affine transformation of its position in the corresponding frame of the first view:

    x^i = b_1 x^1 + b_2 y^1 + b_3    (9)
    y^i = b_4 x^1 + b_5 y^1 + b_6.   (10)

These affine model parameters are estimated using least-squares techniques based on the disparity of the pixels that are visible in all intermediate views (state S_N). The total warping transformation between the frame of view i and the sprite image is obtained by combining this affine model for the disparity with the warping model between the corresponding frame of the first view and the sprite image (aligned to the first frame)

    W_t^i = W_t ∘ D_t^i    (11)

where W_t is defined in (7) and D_t^i denotes the affine disparity mapping of (9) and (10). The multiview sprite-generation procedure is illustrated in Fig. 4.

Fig. 6. (a) Initial support map after thresholding the disparity images. (b) Final support map after motion estimation.

A significant advantage of constructing the sprite image using more than one view is that pixels that are occluded in some of the views are still retained in the sprite image. More specifically, in order to generate the sprite image, we use the L→R and R→L disparity fields and the corresponding state maps to produce the disparity field corresponding to each frame. Foreground and background segmentation is based on thresholding the disparity field after a preprocessing step in which occluded
segments between pixels that have similar disparity to that of the foreground or background regions are assigned to these regions. Assuming that the foreground region has been identified correctly, all other pixels, even those occluded in the left or the right view, can be assimilated into the background. Then, the affine disparity model for the background can be used to model the disparity in the entire background region. As a result, pixels in occluded regions provide additional samples for the warped frames used in the construction of the sprite image.

Fig. 7. Comparison of background sprites obtained from monoscopic and multiview sequences. (a) Monoscopic sprite obtained from ten frames of the left sequence. (b) Multiview sprite obtained from ten frames of all four views using method A1. (c) Multiview sprite obtained from ten frames of all four views using method A2.

Based on the information used for generating the sprite image, we have used and evaluated the following approaches.

1) In method A1, the sprite image is initially generated using only the pixels from the first (leftmost) view. Then, pixels from the frames of the remaining views, corresponding to sprite locations where no luminance value has yet been assigned, are added to the sprite image. This method yields sprite images with better visual quality, because the pixels obtained from the first view are more reliable candidates for the sprite. However, as verified by the experimental results, this leads to very good reconstruction of the leftmost view, but not equally good results for the other views.

TABLE I. CODING OF THE BACKGROUND REGION OF THE CLAUDE SEQUENCE. A1, A2: coding using a single sprite image. B: independent MPEG-4 coding of each view using the MPEG-4 sprite-coding mode. C: independent MPEG-4 coding of each view without sprite coding.
2) Method A2 uses an averaging procedure over all available pixels from the frames of all views to update the sprite image. This creates a more balanced sprite image that can be used to obtain satisfactory reconstructions for all channels, since all views contribute information equally to the sprite image. A drawback, however, is that some blurring may be introduced into the sprite image when some of the averaged samples are noisy, due to the inadequacy
of the affine disparity model to describe the local disparity, or due to luminance changes among the different views.

The coding of the multiview sequence using the generated sprite image conforms to the MPEG-4 specifications for sprite coding [11], [12]. This is achieved by reordering all the frames of a group of frames (GOF) into the following order: all frames of the first view in temporal order, then the corresponding frames of the second view, and so on, up to the Nth view (12). The warping parameters for each frame of this sequence toward the sprite image are then given by (6) for the first (left) view and by (11) for the remaining views. In order to code the warping transformation used to generate the reconstructed images from the sprite image, each transformation is expressed as a set of motion trajectories of a number of reference points. The number of reference points needed to encode the warping parameters determines the transform to be used for warping; e.g., three reference points fully describe the affine transforms of (6) and (11).

V. EXPERIMENTAL RESULTS

Results are presented for the four-view sequence Claude and the stereoscopic sequence Aqua.1 Fig. 5(a)–(d) shows the L→R and R→L disparity fields and the corresponding state maps obtained using the proposed algorithm for the first frame. Fig. 6(a) illustrates the initial segmentation map for the first frame, obtained by thresholding the disparity field of Fig. 5(a), while Fig. 6(b) shows the final segmentation map obtained by the proposed algorithm summarized in Fig. 3. The sprite for the background generated from ten frames of the first (leftmost) view is shown in Fig. 7(a). The sprite obtained by adding occluded pixels from the other three views using method A1, discussed in Section IV, is shown in Fig. 7(b). The background sprite image obtained when averaging information from all four sequences according to method A2 is shown in Fig. 7(c). As seen, many new pixels are added to the sprite image.
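A compact sketch of the two sprite-update rules of Section IV and of the GOF reordering of (12) (Python; NaN marks sprite locations with no sample yet — the function names and the NaN convention are our own illustrative choices):

```python
import numpy as np

def update_sprite_A1(sprite, warped_view):
    """Method A1: fill only sprite locations that are still empty (NaN);
    pixels already taken from the first view are kept untouched."""
    out = sprite.copy()
    empty = np.isnan(out) & ~np.isnan(warped_view)
    out[empty] = warped_view[empty]
    return out

def update_sprite_A2(sprite, warped_views):
    """Method A2: average all available samples (first-view sprite plus
    the warped contributions of every other view) at each location."""
    stack = np.stack([sprite] + list(warped_views))
    return np.nanmean(stack, axis=0)

def reorder_gof(gof):
    """(12)-style reordering: gof[t][i] is the frame at time t from
    view i; output is all frames of view 1, then view 2, ..., view N."""
    n_views = len(gof[0])
    return [gof[t][i] for i in range(n_views) for t in range(len(gof))]
```

The A1 rule preserves the first-view samples exactly, which is why it favors reconstruction of the leftmost view, while the A2 rule averages across views and distributes the error more evenly.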
Most of the pixels are seen to be at the correct positions; however, some blurring can be observed in Fig. 7(c), produced using method A2, especially near the upper-left corner. The blurring is mainly due to averaging noisy samples in locations where the local disparity is not adequately described by the affine disparity model.

Coding results for both sprite-based and standard (nonsprite) MPEG-4 coding methods were obtained using the software implementation of the MPEG-4 Version 1 encoder/decoder provided by the ACTS 098 MoMuSys project [24], [25]. In the proposed approaches, methods A1 and A2, the coding of all four views is based on a single sprite image obtained from all views. For comparison, independent coding of each view using MPEG-4 coders with a different sprite image for each view (method B) was also evaluated. Coding results provided by independent coding of each view using standard MPEG-4 coding are also presented (method C). Results for the coding of the background region of the Claude sequence using these techniques are presented in Table I. Methods A1 and A2 are seen to require approximately four times less bit rate to encode the background in all views when compared to method B, and eight times less bit rate when compared to method C. Method A1 results in negligible degradation of the reconstruction quality for the first view; however, the reconstruction quality of the other three views falls by approximately 4.5 dB.

1 Both sequences were prepared by THOMSON-CSF/LER for the RACE Project 2045 DISTIMA and the ACTS 092 Project PANORAMA.

Fig. 8. (a), (b) Original images for the first and fourth views. (c), (d) Reconstructed images for the first and fourth views. Sprite coding (method A2) is used for the background and standard MPEG-4 coding for the foreground. (e), (f) Reconstructed images for the first and fourth views using method C.
Method A2 produces the same significant bit-rate savings at a loss of approximately 3 dB in the reconstruction of all four views. In terms of visual quality, some blurring can be observed in reconstructions obtained using method A2, due to the sprite-image blurring effects discussed above. We have also used the proposed method to reconstruct each frame by applying sprite coding (method A2) to the background object, while using standard (nonsprite) MPEG-4 coding for the foreground object. The corresponding coding results are presented in Table II. The original first frames from the first and the last view are shown in Fig. 8(a) and (b), respectively, while the corresponding reconstructed frames are shown in Fig. 8(c) and (d). Fig. 8(e) and (f) illustrates the corresponding reconstructed frames obtained using method C (standard MPEG-4 coding) for the entire image area. Similar results
for the middle (second and third) views are presented in Fig. 9(a)–(f). As seen, the visual quality of the reconstructed frames using multiview sprite coding is comparable to that obtained using standard MPEG-4 coding.

Fig. 9. (a), (b) Original images from the second and third views. (c), (d) Reconstructed images from the second and third views. Sprite coding (method A2) is used for the background and standard MPEG-4 coding for the foreground region. (e), (f) Reconstructed images for the second and third views using method C.

TABLE II. CODING OF THE BACKGROUND AND FOREGROUND REGIONS OF THE CLAUDE SEQUENCE. FOREGROUND IS CODED USING STANDARD MPEG-4 CODING. A1, A2: coding using a single sprite image. B: independent MPEG-4 coding of each view using the MPEG-4 sprite-coding mode. C: independent MPEG-4 coding of each view without sprite coding.

The disparity fields estimated for the first frame of the Aqua sequence are presented in Fig. 10(a) and (b), where occluded regions are marked in black. Although significant depth variations can be observed, satisfactory segmentation of the background region is possible using the proposed approach. Coding results for the background region are presented in Table III, while results for the entire frames of the stereoscopic sequence are presented in Table IV (using standard MPEG-4 coding for the foreground region). Finally, two reconstructed frames obtained using method A2 are presented in Fig. 10(c) and (d).

Fig. 10. (a), (b) Disparity and occlusion maps for the left and right views (occluded regions are shown in black). (c), (d) Reconstructed images for the left and right views. Sprite coding (method A2) is used for the background and standard MPEG-4 coding for the foreground region.

The sprite generated from five frames of the left sequence is shown in Fig.
11(a), while the sprites generated from both views using methods A1 and A2 are shown in Fig. 11(b) and (c), respectively.

VI. CONCLUSIONS AND SUGGESTIONS FOR FUTURE WORK

A method for sprite generation from multiview sequences was proposed. Disparity and occlusion estimation is based on an efficient dynamic programming algorithm using information from all views of the multiview sequence. By combining motion, disparity, and occlusion information, a sprite image corresponding to the first (main) view at the first time instant is generated. Image pixels from other views that are occluded in the main view are added to the sprite. The sprite coding method defined by MPEG-4 was extended for the case of a multiview image sequence, based on the generated sprite. Experimental results demonstrating the performance of the proposed technique and comparing it with methods using sprite generation from monoscopic sequences were presented. An additional advantage of this technique is that the generated sprite images (mosaics) contain more pixels; thus, additional information is available that may be exploited in other interesting sprite applications, such as object tracking, background substitution, or annotation in multiview sequences.

Significant depth changes or difficulties in the segmentation procedure may hinder successful sprite generation. In order to improve results, various approaches may be followed in the future. More complex warping models than the simple affine or perspective models used by MPEG-4 could be defined to describe the motion of nonplanar surfaces or complex camera motions. Another approach would be to segment the scene into more than two regions (multiple layers) and use a different warping model for each layer. Efficient sprite-generation procedures for multiple layers, considering transparent objects and depth variations, have already
been proposed [15]. In this case, additional depth information, which is necessary to resynthesize the images from the sprite, has to be coded. However, using sprite coding for more than one layer is not supported by the current version of MPEG-4, since shape coding is not supported for layers coded in sprite-coding mode. This inhibits sprite coding of more than one layer using MPEG-4-compliant methods. In sequences where there are significant luminance changes among the different views, an interesting extension of the proposed technique would be to incorporate photometric-correction methods similar to those used in [26]. Specifically, the luminance direction and a normal vector could be estimated for the entire background region. Then, an iterative technique could be used to improve the estimation of the affine model parameters by using the photometrically corrected luminance values instead of the real ones.

Fig. 11. Background sprites generated from five frames of the Aqua sequence. (a) Monoscopic sprite generated from the left view. (b) Multiview sprite obtained using method A1. (c) Multiview sprite obtained using method A2.

TABLE III. CODING OF THE BACKGROUND REGION OF THE AQUA SEQUENCE. A1, A2: coding using a single sprite image. B: independent MPEG-4 coding of each view using the MPEG-4 sprite-coding mode. C: independent MPEG-4 coding of each view without sprite coding.

TABLE IV. CODING OF THE BACKGROUND AND FOREGROUND REGIONS OF THE AQUA SEQUENCE. IN ALL CASES, FOREGROUND IS CODED USING STANDARD MPEG-4 CODING. A1, A2: coding using a single sprite image. B: independent MPEG-4 coding of each view using the MPEG-4 sprite-coding mode. C: independent MPEG-4 coding of each view without sprite coding.

REFERENCES

[1] N. Grammalidis and M. G.
Strintzis, "Disparity and occlusion estimation in multiocular systems and their coding for the communication of multiview image sequences," IEEE Trans. Circuits Syst. Video Technol., vol. 8, June.
[2] M. Massey and W. Bender, "Salient stills: Process and practice," IBM Syst. J., vol. 35, no. 3/4.
[3] L. Teodosio and W. Bender, "Salient video stills: Content and context preserved," in Proc. 1st ACM Int. Conf. Multimedia (MULTIMEDIA '93), New York, Aug. 1993.
[4] F. Dufaux and F. Moscheni, "Background mosaicking for low bit rate coding," in Proc. Int. Conf. Image Processing, Lausanne, Switzerland, Sept.
[5] M. Irani, P. Anandan, J. Bergen, R. Kumar, and S. Hsu, "Efficient representations of video sequences and their applications," Signal Processing: Image Commun., vol. 8, no. 4, May.
[6] M. Irani and P. Anandan, "Video indexing based on mosaic representations," Proc. IEEE, vol. 16, no. 5, May.
[7] R. Szeliski, "Video mosaics for virtual environments," IEEE Comput. Graphics Applicat., vol. 16, Mar.
[8] R. Szeliski and H.-Y. Shum, "Creating full view panoramic mosaics and environment maps," in Proc. ACM SIGGRAPH 97 Conf., T. Whitted, Ed., Aug. 1997.
[9] S. Mann and R. W. Picard, "Virtual bellows: Constructing high quality stills from video," in Proc. ICIP Int. Conf. Image Processing, Nov.
[10] M. Lee, W. Chen, C. B. Lin, C. Gu, T. Markoc, and R. Szeliski, "A layered video object coding system using sprite and affine motion model," IEEE Trans. Circuits Syst. Video Technol., vol. 7, Feb.
[11] MPEG-4 Video Group, "MPEG-4 verification model version 11.0," Tech. Rep. ISO/IEC JTC1/SC29/WG11/MPEG98/N2172, T. Ebrahimi, Ed., Tokyo, Japan, Mar.
[12] T. Sikora, "The MPEG-4 video standard verification model," IEEE Trans. Circuits Syst. Video Technol., vol. 7, Feb.
[13] J. Y. Wang and E. H. Adelson, "Representing moving images with layers," IEEE Trans. Image Processing, vol. 3, Sept.
[14] T. Darrell and A. Pentland, "Cooperative robust estimation using layers of support," IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, May.
[15] S. Baker, R. Szeliski, and P. Anandan, "A layered approach to stereo reconstruction," in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR '98), Santa Barbara, CA, June 1998.
[16] M. Ziegler, "Digital stereoscopic imaging and application. A way toward new dimensions: The RACE II Project DISTIMA," in Inst. Elect. Eng. Colloq. Stereoscopic Television, London, U.K., Oct.
[17] J.-R. Ohm and K. Muller, "Incomplete 3D-multiview representation of video objects," IEEE Trans. Circuits Syst. I, vol. 47, Feb.
[18] I. J. Cox, S. Hingorani, B. M. Maggs, and S. B. Rao, "Stereo without disparity gradient smoothing: A Bayesian sensor fusion solution," in Proc. British Machine Vision Conf., New York, 1992.
[19] S. S. Intille and A. F. Bobick, "Disparity-space images and large occlusion stereo," M.I.T. Media Lab Perceptual Computing Group, Cambridge, MA, Tech. Rep. 220.
[20] S. S. Intille and A. F. Bobick, "Incorporating intensity edges in the recovery of occlusion regions," M.I.T. Media Lab Perceptual Computing Group, Cambridge, MA, Tech. Rep. 246.
[21] L. Falkenhagen, R. Koch, A. Kopernik, and M. Strintzis, "Disparity estimation based on 3-D arbitrarily shaped regions," Digital Stereoscopic Imaging and Applications (DISTIMA), RACE Project R2045, Tech. Rep. R2045/UH/DS/P/023/b1.
[22] O. Faugeras, Three-Dimensional Computer Vision. Cambridge, MA: MIT Press.
[23] N. Grammalidis and M. G. Strintzis, "Disparity and occlusion estimation for multiview image sequences using dynamic programming," in Proc. Int. Conf.
Image Processing (ICIP '96), Lausanne, Switzerland, Sept.
[24] ACTS 098 MoMuSys Project, "Software simulation of MPEG-4 video coder." [Online]. Available FTP: /drogo.cselt.stet.it/pub/mpeg/mpeg-4_fcd/Visual/Natural/MoMuSys-VFCD-V, file 507.tar.gz.
[25] R. Koenen, F. Pereira, and L. Chiariglione, "MPEG-4: Context and objectives," Signal Processing: Image Commun., vol. 9, no. 4, May.
[26] G. Bozdagi, A. M. Tekalp, and L. Onural, "3-D motion estimation and wireframe adaptation including photometric effects for model-based coding of facial image sequences," IEEE Trans. Circuits Syst. Video Technol., vol. 4, June.

Nikos Grammalidis (S'93) received the Diploma in electrical engineering from the Aristotle University of Thessaloniki, Greece. He is currently working toward the Ph.D. degree in the Information Processing Laboratory, Aristotle University of Thessaloniki. His research interests include computer vision and multiview image-sequence coding and processing.

Dimitris Beletsiotis received the Diploma in electrical engineering from the Electrical Engineering Department, Aristotle University of Thessaloniki, Greece. Presently he is serving in the Greek Army. His research interests include video-coding and video-processing applications.

Michael G. Strintzis (S'68-M'70-SM'80) received the Diploma in electrical engineering from the National Technical University of Athens, Athens, Greece, in 1967, and the M.A. and Ph.D. degrees in electrical engineering from Princeton University, Princeton, NJ, in 1969 and 1970, respectively. He then joined the Electrical Engineering Department, University of Pittsburgh, Pittsburgh, PA, where he served as Assistant and Associate Professor. Since 1980, he has been Professor of Electrical and Computer Engineering at the University of Thessaloniki and, since 1999, Director of the Informatics and Telematics Research Institute, Thessaloniki, Greece.
His current research interests include 2-D and 3-D image coding, image processing, biomedical signal and image processing, and DVD and Internet data authentication and copy protection. Dr. Strintzis was awarded one of the Centennial Medals of the IEEE in 1994.
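The conclusions note that MPEG-4 sprite reconstruction is limited to simple affine or perspective warping models. As a minimal illustrative sketch only (NumPy-based, with hypothetical function names and a direct matrix parameterization rather than MPEG-4's normative trajectory-point syntax), the two models map an image point (x, y) into sprite coordinates as follows:

```python
import numpy as np

def warp_affine(points, a):
    """Map Nx2 (x, y) image points into sprite coordinates with a
    6-parameter affine model a = [a0, a1, a2, a3, a4, a5]:
    x' = a0*x + a1*y + a2,  y' = a3*x + a4*y + a5."""
    x, y = points[:, 0], points[:, 1]
    xs = a[0] * x + a[1] * y + a[2]
    ys = a[3] * x + a[4] * y + a[5]
    return np.stack([xs, ys], axis=1)

def warp_perspective(points, h):
    """Map Nx2 (x, y) points with an 8-parameter perspective model;
    h is a 3x3 homography with h[2, 2] normalized to 1."""
    pts = np.hstack([points, np.ones((len(points), 1))])  # homogeneous coords
    mapped = pts @ h.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide out the projective scale
```

The affine model is the special case of the perspective one in which the third row of the homography is (0, 0, 1), i.e., there is no projective division; this is why affine warping suffices for planar backgrounds under camera rotation-free motion, while the full perspective model is needed for general planar scenes.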
More informationFRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS
FRAME-RATE UP-CONVERSION USING TRANSMITTED TRUE MOTION VECTORS Yen-Kuang Chen 1, Anthony Vetro 2, Huifang Sun 3, and S. Y. Kung 4 Intel Corp. 1, Mitsubishi Electric ITA 2 3, and Princeton University 1
More informationFace Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN
2016 International Conference on Artificial Intelligence: Techniques and Applications (AITA 2016) ISBN: 978-1-60595-389-2 Face Recognition Using Vector Quantization Histogram and Support Vector Machine
More information