3D object articulation and motion estimation in model-based stereoscopic videoconference image sequence analysis and coding
Signal Processing: Image Communication 14 (1999) 817–840

3D object articulation and motion estimation in model-based stereoscopic videoconference image sequence analysis and coding

Dimitrios Tzovaras*, Ioannis Kompatsiaris, Michael G. Strintzis
Information Processing Laboratory, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, Thessaloniki 54006, Greece

Received 29 November 1996

Abstract

This paper describes a procedure for model-based analysis and coding of both left and right channels of a stereoscopic image sequence. The proposed scheme starts with a hierarchical dynamic programming technique for matching across the epipolar line for efficient disparity/depth estimation. Foreground/background segmentation is initially based on depth estimation and is improved using motion and luminance information. The model is initialised by the adaptation of a wireframe model to the consistent depth information. Robust classification techniques are then used to obtain an articulated description of the foreground of the scene (head, neck, shoulders). The object articulation procedure is based on a novel scheme for the segmentation of the rigid 3D motion fields of the triangle patches of the 3D model object. Spatial neighbourhood constraints are used to improve the reliability of the original triangle motion estimation. The motion estimation and motion field segmentation procedures are repeated iteratively until a satisfactory object articulation emerges. The rigid 3D motion is then re-computed for each sub-object and, finally, a novel technique is used to estimate flexible motion of the nodes of the wireframe from the rigid 3D motion vectors computed for the wireframe triangles containing each specific node. The performance of the resulting analysis and compression method is evaluated experimentally. © 1999 Elsevier Science B.V. All rights reserved.
Keywords: Stereoscopic image sequence analysis; Model-based coding; Object articulation; Non-rigid 3D motion estimation

* Corresponding author. tzovaras@dion.ee.auth.gr. This work was supported by the EU CEC Project ACTS PANORAMA (Package for New Autostereoscopic Multiview Systems and Applications, ACTS project 092). © 1999 Elsevier Science B.V. All rights reserved.

1. Introduction

The transmission of full-motion video through limited capacity channels is critically dependent on the ability of the compression schemes to reach target bit-rates while still maintaining acceptable visual quality
[15]. In order to achieve this, motion estimation and motion compensated prediction are frequently used, so as to reduce temporal redundancy in image sequences [22]. Similarly, in the coding of stereo and multiview images, prediction may be based on disparity compensation [33] or the best of motion and disparity compensation [34]. Stereoscopic video processing has recently been the focus of considerable attention in the literature [3,6,10,13,16,24,26,28,32]. A stereoscopic pair of image sequences, recorded with a difference in the view angle, allows the three-dimensional (3D) perception of the scene by the human observer, by exposing to each eye the respective image sequence. This creates an enhanced 3D feeling and increased "telepresence" in teleconferencing and several other (medical, entertainment, etc.) applications. In both monoscopic and stereoscopic vision, the ability of model-based techniques to describe a scene in a structural way has opened new areas of applications. Video production, realistic computer graphics, multimedia interfaces and medical visualisation are some of the applications that may benefit by exploiting the potential of model-based schemes. Object-based techniques have been extensively investigated for monoscopic image sequence coding [4,11,12,21]. Several object-oriented coding schemes have also been proposed for stereoscopic image sequence coding [7,24,25,29,31,32]. The advantages of using model-based techniques for stereo image sequence coding were reviewed in [25], where a feature-based 3D motion estimation scheme was presented. In [24], disparity is estimated using a dynamic programming scheme and is subsequently used for object segmentation. The segmentation algorithm is based on region growing, and the criterion used for the definition of each object is based on the homogeneity of the respective disparity fields.
In [10], the objects in the scene are identified using a segmentation method based on the homogeneity of the 2D motion field computed by a block matching procedure. Then the 3D motion of each object is modeled using the approach presented in [1], with depth estimated from disparity. Finally, an interframe coding scheme based on 3D motion compensation is evaluated. A disadvantage of the segmentation technique used in this procedure is its failure to guarantee high performance of the resulting 3D motion compensation method. Alternatively, 3D models of objects may be derived from stereo images. This usually requires estimation of dense disparity fields, postprocessing to remove erroneous estimates and fitting of a parametrised surface model to the calculated depth map [14]. In [17] an algorithm was presented which optimally models the scene using a hierarchically structured wire-frame model derived directly from intensity images. The wire-frame model consists of adjacent triangles that may be split into smaller ones over areas that need to be represented in higher detail. The motion of the model surface using both rigid and non-rigid body assumptions is estimated concurrently with depth parameters. Knowledge-based image sequence coding has also attracted much interest recently, especially for the coding of facial image sequences in videophone applications. In [2], one such method is based on the generation of a generic face model and the use of efficient techniques for rigid and flexible 3D motion estimation. In the present paper, a procedure for model-based analysis and coding of both left and right channels of a stereoscopic image sequence is proposed. The methodology used overcomes a major obstacle in stereoscopic video coding, caused by the difficult problem of determining and handling coherently corresponding objects in the left and right images.
This is achieved in this paper by defining segmentation and object articulation in the 3D space, thus ensuring that all ensuing operations remain coherent for both the left and the right aspects of the scene. Each object is described by a mesh consisting of a set of interconnected triangles. The 3D motion of each triangle is estimated using a robust algorithm for the minimisation of the least median of squares error and by imposing neighbourhood constraints, such as introduced in [18,19], to guarantee the smoothness of the resulting vector field. A novel iterative object articulation technique for stereoscopic image sequences is then used to segment the 3D vector field and thus to derive a foreground object articulation. Triangle motion estimation and classification are repeated iteratively until satisfactory object articulation is achieved. Rigid 3D motion estimation is performed next for each resulting sub-object, using motion information from both left and right cameras. Finally, a procedure is proposed for the
estimation of the non-rigid motion of each wireframe node based on the 3D motion of the neighbouring wireframe triangles. The paper is organised as follows. In Section 2 the camera geometry of the stereoscopic system is described. Next, in Section 3, an overview of the proposed stereoscopic image sequence analysis system is presented. Section 4 presents the techniques used for disparity/depth estimation, foreground/background segmentation and model initialisation, and Section 5 the initial adaptation of the 3D model. The technique used for object articulation is examined in Section 6, while the 3D motion estimation procedure used is presented in Section 7. The rigid 3D motion estimation procedure for each articulated 3D object is discussed in Section 7.1. Finally, in Section 7.2, an approach is considered for non-rigid motion estimation based on the rigid 3D motion vectors of small surface patches, computed during the object articulation procedure. Experimental results given in Section 8 demonstrate the performance of the proposed methods. Conclusions are drawn in Section 9.

2. Camera geometry

The geometry of the stereoscopic camera arrangement used is shown in Fig. 1, where three reference coordinate frames are defined: the world reference frame, attached to the imaged scene; the camera reference frames, attached to the camera system, where the Z_c-axis is the optical axis and the X_c and Y_c axes are parallel to the image plane (here c refers to the respective camera, i.e. c = l, r for the left and right cameras, respectively); and the image reference frames, where the X_c and Y_c axes, respectively, define the horizontal and vertical directions on the digital image, where again c = l, r refers to the images produced, respectively, by the left and right cameras.

Fig. 1. Stereoscopic camera geometry.
The camera geometry is described by the following set of equations mapping the 3D world coordinates (x, y, z) of a generic point P into the 2D coordinates (X_c, Y_c) of its projection on the image planes.

Change of reference frame from world coordinates to camera coordinates:

P_c = [x_c, y_c, z_c]^T = R_c [x, y, z]^T + T_c,  (1)

where c = l, c = r for the left and right cameras, respectively, and R_c and T_c are, respectively, the rotation matrix and the translation vector.

Perspective projection of a scene point to the image plane (the centre of projection is the centre of the lens and the projection plane is the camera CCD sensor):

[X_c, Y_c]^T = (f_c / z_c) [x_c, y_c]^T.  (2)

Change of coordinate frame from camera coordinates (X_c, Y_c) to image coordinates (X_c^im, Y_c^im): this operation simply consists of a 2D translation and scale change,

X_c^im = C_x^c + X_c / d_x^c,  Y_c^im = C_y^c + Y_c / d_y^c,  (3)

where d_x^c and d_y^c are the horizontal and vertical size of an image pixel, respectively, and (C_x^c, C_y^c) are the image coordinates of the optical centre OC_c of camera c. As seen from the above description, the camera geometry is completely specified by a small set of parameters estimated during camera calibration.

3. Overview of the stereoscopic image sequence analysis and coding scheme

In the proposed model-based stereoscopic image sequence analysis and coding scheme (see Fig. 2), both left and right channels are coded using 3D rigid and non-rigid motion compensation. The approach taken is to define fully 3D models of objects composed of interconnecting wire-mesh triangles. In this way, complete left-to-right object correspondence is intrinsically established.

Fig. 2. The proposed stereoscopic image sequence coding scheme.
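As an illustration, the projection chain of Eqs. (1)-(3) can be sketched as follows; the function and parameter names are chosen here for illustration and are not part of the original paper.

```python
import numpy as np

def world_to_image(p_world, R, T, f, dx, dy, Cx, Cy):
    """Map a 3D world point to pixel coordinates for one camera.

    Implements the chain of Eqs. (1)-(3): rigid change of frame,
    perspective projection, and the 2D translation/scale to pixel units.
    """
    p_cam = R @ p_world + T          # Eq. (1): world -> camera frame
    X = f * p_cam[0] / p_cam[2]      # Eq. (2): perspective projection
    Y = f * p_cam[1] / p_cam[2]
    return Cx + X / dx, Cy + Y / dy  # Eq. (3): camera -> pixel coordinates
```

Calling this once per camera (c = l, r) with that camera's calibration parameters yields the pair of stereo projections of a scene point.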
The left-to-right (LR) and right-to-left (RL) disparity fields are estimated first, using a hierarchical dynamic programming disparity estimation procedure. The consistency of the computed disparity fields is then checked for the 'points of interest', which are provided by the model initialisation procedure. A reliable disparity estimate is obtained for those points of interest with inconsistent left-right disparities. An initial foreground/background segmentation procedure follows, leading to a 3D wireframe model adapted to the foreground object using the reliable disparity estimates. In order to improve rigid 3D motion estimation, the foreground object is articulated, producing sub-objects defined by the homogeneity of their 3D motion. This rigid 3D motion is estimated using least median of squares minimisation of a cost function taking into account the reliability of the projected rigid 3D motion in both the left and right channels. Neighbourhood constraints are also imposed to improve the reliability of the motion estimation procedure. Following articulation, the rigid 3D motion of each sub-object produced is estimated using the same motion estimation procedure, without this time imposing neighbourhood constraints. Finally, the non-rigid motion of each node of the wireframe is estimated from the rigid 3D motion of the wireframe triangles containing this node as a vertex. A block diagram of the proposed encoder is shown in Fig. 3. Its constituent components are described in detail in the ensuing sections.

4. Depth estimation, scene segmentation and model initialisation

4.1. Disparity/depth estimation

Since the stereo camera configuration is known, the depth estimation problem reduces to that of disparity estimation [3,28,30]. A dynamic programming algorithm, minimising a combined cost function for two corresponding lines of the stereoscopic image pair, is used for disparity estimation.
The basic algorithm adapts the results of [5,24] using blocks rather than pixels. Furthermore, a novel hierarchical version of this algorithm is implemented so as to speed up its execution. The cost function takes into consideration the displaced frame difference (DFD) as well as the smoothness of the resulting vector field, in the following way. Due to the epipolar line constraint [3], the search area for each pixel p_r = (i_r, j_r) of the right image is an interval S(p_r) on the epipolar line in the left image, determined by a minimum and maximum allowed disparity. If p_l = (i_l, j_l) ∈ S(p_r) is the pixel in the left image matching with pixel p_r of the right image and d(p_r) is the disparity vector corresponding to this match, the following cumulative cost function is minimised with respect to d(p_r) for the path ending at the pixel p_r in each line of the right image:

C(p_r) = min_{d(p_r)} [C(p_r − 1) + c(p_r, d(p_r))],  (4)

where p_r − 1 denotes the preceding pixel on the line. The cost function c(p_r, d(p_r)) is determined by

c(p_r, d(p_r)) = R(p_r) DFD(p_r, d(p_r)) + SMF(p_r, d(p_r)).  (5)

The first term in Eq. (5) contains the absolute difference of two corresponding image intensity blocks, centred at the matched pixels in the right and left images, respectively,

DFD(p_r, d(p_r)) = Σ_{(X,Y)∈W} |I_r(i_r + X, j_r + Y) − I_l(i_l + X, j_l + Y)|,  (6)

where W is a rectangular window. Multiplication with the reliability function R(p_r) relaxes the DFD weight, keeping only the second term active in homogeneous regions where the matching reliability is small. The
disparity vector is considered reliable whenever it corresponds to a pixel on an edge or in a highly textured area. For the detection of edges and textured areas a variant of the technique in [8] was used, based on the observation that highly textured areas exhibit high local intensity variance in all directions, while on edges the intensity variance is higher across the direction of the edge.

Fig. 3. A block diagram of the proposed encoder.

The second term in Eq. (5) is the smoothing function

SMF(p_r, d(p_r)) = Σ_{n=1}^{N} |d(p_r) − d_n| R(d_n),  (7)
where d_n, n = 1,…,N, are the vectors neighbouring d(p_r). Multiplication by the factor R(d_n) aims to attenuate the contribution of unreliable vectors to the smoothing function. Finally, the dynamic programming algorithm selects as the best path up to that stage the one with the minimum cumulative cost (Eq. (4)). A hierarchical version of this approach was utilised in order to speed up the estimation process and to produce a smooth disparity field without discontinuities. In this version, the dynamic programming algorithm is applied at the coarse resolution level and an initial estimate of the disparity vectors is produced. The disparity information is then propagated to the next resolution level, where it is corrected so that the cost function is further minimised. This process is iterated until full resolution is achieved. Along with the dense disparity field, the variance of the disparity estimate for each pixel of the image is also computed, using

σ²(p_r, d(p_r)) = (1/(2N+1)²) Σ_{k=−N}^{N} Σ_{l=−N}^{N} (I_r(i_r + k, j_r + l) − I_l(i_l + k, j_l + l))²,  (8)

where (2N+1)×(2N+1) is the dimension of the rectangular window W. Finally, depth is estimated from disparity, using the camera geometry as in [32].

4.2. Foreground/background segmentation

The model is initialised by separating the body in the videoconference scene from the background using an initial foreground/background segmentation procedure. The depth map produced by the method in Section 4.1, applied to the full resolution image, may be used for this purpose. However, to reduce as much as possible the effects of errors in depth estimation, we propose instead the use of a hierarchical foreground/background segmentation, focused on the determination of only the largest disparity vectors. These vectors correspond to foreground objects (objects that lie very close to the camera). This information is propagated to the higher resolution level, where it is corrected in a coarse-to-fine manner.
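A minimal single-scanline sketch of the dynamic programming search of Eqs. (4)-(7): here a plain per-pixel absolute difference stands in for the block-based, reliability-weighted cost of the paper, and a weight lam plays the role of the smoothness term; all names and the cost simplifications are illustrative.

```python
import numpy as np

def scanline_disparity(right, left, d_min, d_max, lam=0.1):
    """Dynamic programming along one epipolar line (cf. Eq. (4)).

    Match cost |I_r(j) - I_l(j + d)| is a 1-pixel stand-in for the
    windowed DFD of Eq. (6); lam * |d - d_prev| stands in for the
    smoothing term SMF of Eq. (7).
    """
    n, nd = len(right), d_max - d_min + 1
    INF = 1e9

    def match(j, k):  # cost of assigning disparity d_min + k to pixel j
        jl = j + d_min + k
        return abs(float(right[j]) - float(left[jl])) if 0 <= jl < n else INF

    cost = [match(0, k) for k in range(nd)]
    back = []
    for j in range(1, n):
        row, brow = [], []
        for k in range(nd):
            # best transition from the previous pixel's disparity
            c, arg = min((cost[kp] + lam * abs(k - kp), kp) for kp in range(nd))
            row.append(c + match(j, k))
            brow.append(arg)
        cost, back = row, back + [brow]
    # backtrack the minimum-cost path
    k = int(np.argmin(cost))
    disp = [0] * n
    for j in range(n - 1, -1, -1):
        disp[j] = d_min + k
        if j > 0:
            k = back[j - 1][k]
    return disp
```

In the hierarchical version described above, such a search would be run at the coarsest level and its result used to restrict the disparity range at each finer level.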
Thus, by carefully selecting the search area of the disparity estimator at each resolution, an initial foreground/background segmentation mask is formed. The resulting segmentation map is then post-processed using a motion detection mask and the luminance edge information. The motion detection mask is defined by simple subtraction of consecutive frames of the same channel of the image sequence. Note that in this phase the aim is not to calculate motion accurately, but rather to identify regions with very high or very low motion. The motion detection mask contains important information for both inner and boundary areas of the foreground object, while luminance edge information carries important information about errors that occur mainly on the silhouette (border) of the foreground object. The foreground object boundary is found as the part of the image where both the depth gradient and the luminance gradient are high. Summarising, the following algorithm is used for foreground/background separation, as shown in Fig. 4. The disparity information at level l of the algorithm is segmented using a histogram-based segmentation algorithm, and areas corresponding to large disparity values are identified as objects close to the camera. The segmentation information is propagated to the finer resolution level, where it is corrected appropriately. At the full resolution level, the resulting segmentation mask is post-processed using motion and luminance information as follows: each portion of the scene designated as background by the disparity segmentation procedure is re-examined in view of its motion u and its depth and luminance gradients g (Fig. 4). If all these parameters exceed preselected thresholds, this portion of the scene is confirmed as being part of the foreground. Otherwise, it is relegated to the background.
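The final post-processing rule above (promote a background-labelled pixel to foreground only when its motion and both gradients all exceed their thresholds) amounts to a simple mask operation; a toy sketch with illustrative names:

```python
import numpy as np

def refine_foreground(fg, motion, depth_grad, lum_grad, t_u, t_d, t_g):
    """Post-process an initial disparity-based foreground mask (cf. Fig. 4).

    A pixel labelled background is promoted to foreground only when its
    motion and its depth and luminance gradients all exceed the
    preselected thresholds t_u, t_d and t_g.
    """
    promote = (~fg) & (motion > t_u) & (depth_grad > t_d) & (lum_grad > t_g)
    return fg | promote
```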
Fig. 4. The proposed foreground/background estimation scheme.

A 3D wireframe is adapted to the foreground object produced by the described procedure. Then, using the reliable depth estimates as described in the following sections, the final 3D model is created.

4.3. Consistency checking and disparity evaluation for the points of interest

A set F of points of interest is first defined, composed of points in the 3D space with left or right image projections located on depth and luminance edges. The latter are extracted using the edge detection algorithm presented in Section 4.1. For each of these points, the disparity estimation algorithm produces left-to-right (LR) and right-to-left (RL) disparity fields. However, the LR and RL disparity fields may be
inconsistent because of occlusions and errors in the disparity estimation procedure. Thus, a consistency checking algorithm is used to indicate the correct matches, followed by an averaging procedure (Kalman estimate) which assigns a depth value to pixels with inconsistent matches. More specifically, the correspondence between pixels p_r = (i_r, j_r) of the right image and p_l = (i_l, j_l) of the left image is considered consistent if

d_l(p_r + d_r(p_r)) = −d_r(p_r).

If the above relation is not valid, a more reliable depth estimate must be assigned to that pixel. The method in [9] is applied to this effect, using the reliability of the disparity estimates as a weighting function. Specifically, the disparity d_r(p_r) and the disparity d_l(p_l) satisfying

p_r = p_l + d_l(p_l)  (9)

are averaged with respect to their disparity error variances as follows:

d̂ = (σ_l² d_r − σ_r² d_l)/(σ_l² + σ_r²),  σ̂² = σ_r² σ_l²/(σ_r² + σ_l²),

where σ_r² and σ_l² are, respectively, the variances of the disparity estimates d_r and d_l, computed at the disparity/depth estimation stage using Eq. (8), and σ̂² is the variance of the averaged disparity. If more than one disparity vector d_l(p_l) satisfies Eq. (9), the one with the minimum estimation variance σ_l² is selected. The consistency checking algorithm is applied to the set of all points of interest, selected as above so as to have projections on depth and luminance edges, and reliable depth estimates for pixels with either consistent or corrected disparity are obtained to be used for model initialisation. The result of this procedure is a set F of points of interest (x_i^f, y_i^f, z_i^f) whose projections are located on the foreground depth-map and luminance edges of either the left or right camera, z_i^f being their estimated depth.

5. Initial 3D model adaptation

For the generation of the 3D model object, depth information must be modeled using a wire mesh.
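The variance-weighted fusion of Section 4.3 for inconsistent matches can be sketched as follows; the sign convention assumes, as in the consistency test above, that the RL and LR disparities of a consistent pair are opposite, and all names are illustrative.

```python
def fuse_disparity(d_r, var_r, d_l, var_l):
    """Kalman-style averaging of an RL disparity d_r and an LR disparity
    d_l (opposite sign convention), each weighted by the other
    estimate's error variance (computed as in Eq. (8))."""
    d_hat = (var_l * d_r - var_r * d_l) / (var_l + var_r)
    var_hat = (var_l * var_r) / (var_l + var_r)
    return d_hat, var_hat
```

Note that the fused variance is always smaller than either input variance, reflecting the gain from combining the two estimates.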
We shall generate a surface model of the form [17]

z = s(x, y, P̂),  (10)

where P̂ = {(x_i, y_i, z_i), i = 1,…,N} is a set of 3D 'control' points or 'nodes' that determine the shape of the surface and (x, y, z) are the coordinates of a 3D point. An initial choice for P̂ is the regular tessellation shown in Fig. 5(a). The consistency checking algorithm, described in the previous section, is applied to all control points to assign corrected depth values to every node of the 3D model. Automatic adaptation of the 3D model (Fig. 5(c) and (d)) to the foreground object is sought by forcing the 3D model to meet the boundary of the foreground/background segmentation mask (Fig. 5(b)). A set of reference image points G = {(x̃_j, ỹ_j, z̃_j), j = 1,…,Q} is defined as the aggregate of F and P̂:

G = F ∪ P̂,  (11)

where F is the set of points of interest defined in the preceding section and P̂ are the nodes of the 3D model with the corrected depth values. Then, s can be modelled by a piecewise linear surface, consisting of adjoint triangular patches, which may be written in the form

z = z_1 g_1(x, y) + z_2 g_2(x, y) + z_3 g_3(x, y),  (12)
Fig. 5. (a) Initial triangulation of the image plane. (b) Foreground/background segmentation. (c) Part of the initial triangulation corresponding to the foreground object. (d) Expanded wireframe adapted to the foreground object. (e) Barycentric coordinates.
if (x, y, z) is a point on the triangular patch with vertices P_1 = (x_1, y_1, z_1), P_2 = (x_2, y_2, z_2) and P_3 = (x_3, y_3, z_3). The functions g_i(x, y) are the barycentric coordinates of (x, y, z) relative to the triangle, given by g_i(x, y) = Area(a_i)/Area(P_1 P_2 P_3) (Fig. 5(e)). The reconstruction of a surface from consistent sparse depth measurements may be effected by minimising a functional of the form

E(P̂) = Σ_{j=1}^{Q} (s(x̃_j, ỹ_j, P̂) − z̃_j)².  (13)

The value of the sum (13) expresses confidence in the reference points (x̃_j, ỹ_j, z̃_j) ∈ G, j = 1,…,Q. Note that no smoothness constraint is imposed on the surface of the 3D model, since the depth estimates for these points are considered very reliable. Replacing Eq. (12) in Eq. (13) yields

E(P̂) = ‖A P̂ − B‖²,  (14)

where, treating P̂ as the N×1 vector of node depths, A is a Q×N matrix and B a Q×1 vector given by

A_ji = g_i(x̃_j, ỹ_j) if (x̃_j, ỹ_j) lies inside a triangle having node i as a vertex, and A_ji = 0 otherwise, j = 1,…,Q, i = 1,…,N;  B_j = z̃_j, j = 1,…,Q.

The vector P̂ minimising Eq. (14) is

P̂ = (AᵀA)⁻¹AᵀB,  (15)

which defines the nodes of the wire-mesh surface. Using Eq. (12), the depth z of any point on a patch can be expressed in terms of the depth information of the nodes of the wireframe and the X and Y coordinates of that point. Hence, full depth information will be available if only the depths of the nodes of the wireframe are transmitted.

6. Object articulation

A novel subdivision method based on the rigid 3D motion parameters of each triangle and the error variance of the rigid 3D motion estimation is proposed for the articulation of the foreground object (separation of the head and shoulders). The model initialisation procedure described above results in a set of interconnected triangles in 3D space: T_k, k = 1,…,K, where K is the number of triangles of the 3D model. In the following, S^(i) will denote an articulation of the 3D model at iteration i of the articulation algorithm, consisting of sub-objects s_k^(i), k = 1,…,M^(i).
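A toy sketch of the surface fit of Eqs. (13)-(15) in Section 5: barycentric coordinates as area ratios (Fig. 5(e)), and a least-squares solve for the node depths. P̂ is treated as the vector of node depths, and numpy's least-squares routine stands in for the explicit normal-equations form of Eq. (15); all names are illustrative.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates of 2D point p in triangle (a, b, c),
    computed as ratios of triangle areas (cf. Fig. 5(e))."""
    def area(u, v, w):
        return 0.5 * abs((v[0]-u[0]) * (w[1]-u[1]) - (w[0]-u[0]) * (v[1]-u[1]))
    whole = area(a, b, c)
    return (area(p, b, c) / whole,
            area(p, a, c) / whole,
            area(p, a, b) / whole)

def fit_node_depths(A, B):
    """Solve min ||A z - B||^2 for the node depths z (cf. Eqs. (14)-(15)).

    A[j, i] holds the barycentric coordinate g_i of reference point j
    with respect to node i (zero when the point lies outside every
    triangle incident on that node); B[j] is the measured depth of
    reference point j.
    """
    z, *_ = np.linalg.lstsq(A, B, rcond=None)
    return z
```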
The proposed iterative object articulation procedure is composed of the following steps:

Step 1. Set i = 0. Let the initial segmentation be S^(0) = {s_k^(0), k = 1,…,K}, with s_k^(0) = T_k. Let also the initial neighbourhood of each triangle be empty, i.e. T_k^{S,0} = ∅.
Step 2. Apply the 3D rigid motion estimation algorithm to each triangle T_k, taking into account the neighbourhood constraint imposed by the neighbourhood T_k^{S,i}. This constraint is described in detail in Section 6.1 below.
Step 3. Set i = i + 1. Execute the object segmentation procedure that subdivides the initial object into M^(i) sub-objects, i.e. S^(i) = {s_k^(i), k = 1,…,M^(i)}.
Step 4. Use the segmentation map S^(i) to define the new neighbourhood T_k^{S,i} of each triangle T_k.
Step 5. If S^(i) = S^(i−1), then stop. Else go to Step 2.

The proposed algorithm can also be explained by the example of Fig. 6(a)-(c). Fig. 6(a) illustrates the initial phase of the algorithm, where each triangle is treated as an object. The estimated rigid 3D motion
Fig. 6. (a) Initial phase of the object articulation algorithm. (b) The output of the rigid 3D motion estimation procedure for each triangle. (c) The output of the object segmentation procedure. (d) Non-rigid 3D motion estimation example. The light grey vector represents the rigid motion of the working node, while the black vectors represent estimates for the motion of the same node using the 3D motion parameters corresponding to each triangle containing the working node.
vectors of each triangle, computed at the second step of the proposed algorithm, are shown in Fig. 6(b), and the output of the object segmentation procedure of Step 3 is shown in Fig. 6(c). Based on this new object segmentation map, the rigid 3D motion estimation and object segmentation procedures are then further refined iteratively. The 3D motion estimation of each triangle and the object segmentation procedure are described in more detail below.

6.1. Rigid 3D motion estimation of small surface patches

The foreground object of a typical videophone scene is composed of more than one sub-object (head, neck, shoulders, etc.), each of which exhibits different rigid 3D motion. Thus, object articulation has to be completed and the rigid motion of each sub-object must be estimated. For rigid 3D motion estimation of each triangle T_k we use least median of squares minimisation. This procedure removes the outliers from the initial data set and finds the estimate that minimises the median of the square error. More specifically, the rigid motion of each triangle T_k, k = 1,…,K, where K is the number of triangles in the foreground object, is modeled using a linear 3D model with three rotation and three translation parameters [1]:

[x(t+1)]   [  1    −w_z    w_y ] [x(t)]   [t_x]
[y(t+1)] = [  w_z    1    −w_x ] [y(t)] + [t_y],  (16)
[z(t+1)]   [ −w_y   w_x     1  ] [z(t)]   [t_z]

where (x(t), y(t), z(t)) is a point on the plane defined by the coordinates of the vertices of triangle T_k. Since the triangle motion is to be used for object articulation, neighbourhood constraints are needed for the estimation of the model parameter vector a = (w_x, w_y, w_z, t_x, t_y, t_z), in order to guarantee a smooth estimated triangle motion vector field that can be successfully segmented. Let N_k be the ensemble of the triangles neighbouring the triangle T_k.
If triangle T_k belongs to the region s_l^(i) of S^(i) at iteration i of the object articulation algorithm, we define as neighbourhood T_k^{S,i} of the triangle T_k the set of its neighbours belonging to the same region:

T_k^{S,i} = {T_m ∈ N_k : T_m ∈ s_l^(i)}.

For example, in order to define the neighbourhood of triangle A in Fig. 6(c), we first consider all triangles that share at least one common vertex with triangle A (i.e. N_A = {B, C, D, E, H, I, J, K}). From the set N_A, only the triangles belonging to the same object as triangle A are finally defined as the neighbourhood of triangle A (i.e. T_A^{S,i} = {B, C, D, E}). Then, for each triangle T_k, the set of points belonging to T_k^{S,i} is input to the 3D rigid motion estimation procedure so as to smooth the motion field produced.

6.2. The 3D motion estimation algorithm

For the estimation of the model parameter vector a_k = (w_x, w_y, w_z, t_x, t_y, t_z) of each neighbourhood T_k^{S,i} at iteration i of the object articulation procedure, the MLMS iterative algorithm [27] was used. The MLMS algorithm is based on median filtering and is very efficient in suppressing noise with a large amount of outliers (i.e. in situations where conventional least-squares techniques usually fail). As noted in the previous sections, the 3D motion of each extended neighbourhood T_k^{S,i} of a triangle T_k is modelled in the global coordinate system by

P(t+1) = R_k P(t) + T_k,  (17)
where the matrix R_k and the vector T_k are defined from Eq. (16). Since initial motion estimates are available in the left and right camera images, the rigid 3D motion must be projected on the left and right coordinate systems. Using Eqs. (17) and (1),

P_c(t+1) = R_{c,k} P_c(t) + T_{c,k},  (18)

where

R_{c,k} = R_c R_k R_c^{−1},  (19)

T_{c,k} = −R_c R_k R_c^{−1} T_c + R_c T_k + T_c,  (20)

and R_{c,k} and T_{c,k} are the 3D motion rotation and translation matrices corresponding to camera c and triangle k. Using the fact that the matrices R_{c,k} and T_{c,k} are of the form

R_{c,k} = [ 1, −w_z^c, w_y^c ; w_z^c, 1, −w_x^c ; −w_y^c, w_x^c, 1 ],  T_{c,k} = [t_x^c, t_y^c, t_z^c]^T,  (21)

and also using Eqs. (2) and (3), the projected 2D motion vector in camera c, d^c(X, Y), is given by

d_X^c(X(t), Y(t)) = f_c [−w_x^c x_c y_c + w_y^c (x_c² + z_c²) − w_z^c y_c z_c + t_x^c z_c − t_z^c x_c] / [(−w_y^c x_c + w_x^c y_c + z_c + t_z^c) z_c d_x^c],  (22)

d_Y^c(X(t), Y(t)) = f_c [w_x^c (y_c² + z_c²) − w_y^c x_c y_c − w_z^c x_c z_c − t_y^c z_c + t_z^c y_c] / [(−w_y^c x_c + w_x^c y_c + z_c + t_z^c) z_c d_y^c],  (23)

where all camera coordinates on the right-hand sides are evaluated at time t and d^c(X, Y) = (d_X^c(X(t), Y(t)), d_Y^c(X(t), Y(t))). Using the initially estimated 2D motion vectors corresponding to the left and right cameras and Eqs. (22) and (23), along with Eqs. (19) and (20) evaluated for c = l and c = r, a linear system for the global motion parameter vector a_k of triangle T_k is formed. Note that the parameters of a_k are implicitly contained in Eqs. (22) and (23), since a_{c,k} = (w_x^c, w_y^c, w_z^c, t_x^c, t_y^c, t_z^c) and a_k are related by Eqs. (19) and (20). This is a system of 2(N_l^k + N_r^k) equations with six unknowns, where N_l^k and N_r^k are the numbers of reference points of the set G of Eq. (11) contained in the plane defined by the coordinates of the vertices of triangle k, in the left and right image planes, respectively. If N_l^k + N_r^k ≥ 3, the system is determined or overdetermined and can be solved using least-squares methods or, alternatively, by the robust least median of squares motion estimation algorithm described in detail in [27]. The reference points initially chosen should be enough to guarantee N_l^k + N_r^k ≥ 3 for each triangle. As explained in Section 5, this is ensured by choosing in Eq.
(11) as reference points all triangle vertices plus the points of interest on depth and luminance edges.

6.3. Object segmentation

At each iteration of the object articulation method, the rigidity constraint imposed on each rigid object component is exploited. This constraint requires that the distance between any pair of points of a rigid object component must remain constant at all times and configurations. Thus, the motion of a rigid model object component represented by a mesh of triangles can be completely described using the same six motion parameters. Therefore, to achieve object articulation, neighbouring triangles which exhibit similar 3D motion parameters are clustered into patches. In the ideal case, these patches will represent the complete visible surface of the moving object components of the articulated object.
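A simplified sketch of recovering the six parameters a = (w_x, w_y, w_z, t_x, t_y, t_z) of the linearised model of Eq. (16): here the linear system is built directly from 3D point correspondences rather than from the projected 2D motion vectors of Eqs. (22)-(23), and plain least squares stands in for the robust MLMS estimator of [27]; names are illustrative.

```python
import numpy as np

def estimate_rigid_params(pts_t, pts_t1):
    """Least-squares fit of a = (wx, wy, wz, tx, ty, tz) such that
    P(t+1) = R P(t) + T with the linearised rotation of Eq. (16).

    Each 3D correspondence contributes three linear equations in a:
      dx = -wz*y + wy*z + tx
      dy =  wz*x - wx*z + ty
      dz =  wx*y - wy*x + tz
    """
    rows, rhs = [], []
    for (x, y, z), p1 in zip(pts_t, pts_t1):
        d = np.asarray(p1, dtype=float) - np.array([x, y, z], dtype=float)
        rows += [[0.0, z, -y, 1.0, 0.0, 0.0],
                 [-z, 0.0, x, 0.0, 1.0, 0.0],
                 [y, -x, 0.0, 0.0, 0.0, 1.0]]
        rhs += list(d)
    a, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return a
```

Because the model is linear in a, three or more non-collinear correspondences determine the parameters; with noisy data, the robust median-based estimator of the paper would replace the least-squares solve.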
More specifically, the following iterative algorithm is proposed:

Step 1. Set j = 0, set M_0 = K and set S_0 = S.

Step 2. For each patch s_k, k = 1, ..., M_j, execute the clustering step below.

Step 3. For all patches s_l that belong to the neighbourhood of s_k, if

||(σ_l a_k + σ_k a_l)/(σ_k + σ_l) − a_k|| ≤ th,

cluster s_l into s_k, i.e. set s_k = s_k ∪ s_l and M_j = M_j − 1. In the above, a_m, m = k, l, are the motion parameters and σ_m, m = k, l, is the variance of the 3D motion estimate, i.e. the Displaced Frame Difference (DFD) of patch s_m computed by compensating the projected 3D motion in the left and right cameras, and th is a threshold. Also,

σ_k = (1/N_k) Σ_{P ∈ s_k} (I_l(P(t)) − I_l(P(t+1)))² + (1/N_k) Σ_{P ∈ s_k} (I_r(P(t)) − I_r(P(t+1)))²,

where N_k is the number of points contained in patch s_k, and P(t) and P(t+1) are two corresponding points at time instants t and t+1, respectively.

Step 4. Set j = j + 1 and set M_j equal to the current number of patches M. Set S_j = {s_k}, k = 1, ..., M_j. If S_j = S_{j−1}, stop; else go to Step 2.

7. 3D motion estimation of each sub-object

7.1. Rigid 3D motion estimation of each sub-object

The object articulation procedure identifies a number of sub-objects of the 3D model object as areas with homogeneous motion. A sub-object s_k represents a surface patch of the 3D model object consisting of N_k control points and q_k triangles. A sub-object may consist of only q_k = 1 triangle. The motion of an arbitrary point P(t) on the sub-object s_k to its new position P(t+1) is described by

P(t+1) = R_k P(t) + T_k,   (24)

where k = 1, ..., M, M is the number of sub-objects, and, as before [1],

        | 1    -w_z   w_y |           | t_x |
R_k =   | w_z   1    -w_x |,   T_k =  | t_y |.
        | -w_y  w_x   1   |           | t_z |

For the estimation of the model parameter vector a_k = (w_x, w_y, w_z, t_x, t_y, t_z) the MLMS iterative algorithm described earlier is used, this time without imposing neighbourhood constraints.

7.2. Non-rigid 3D motion estimation

The rigid motion of the articulated objects cannot compensate errors occurring due to local motion (such as the movement of eyes and lips).
These errors can only be compensated by appropriately deforming the nodes of the wireframe, so that the local motion is also followed. An analysis-by-synthesis approach is proposed for the computation of the non-rigid motion Δ̃_i at node i, which minimises the DFD between the image frame at time t+1 and the 3D non-rigid-motion-compensated estimate of frame t+1 from frame t.
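The selection rule just described can be sketched as follows (Python, worked in 2D image coordinates for brevity; the image functions, node region and candidate set are synthetic stand-ins for the quantities of this section): among a set of candidate corrections, keep the one whose rigid-plus-non-rigid compensation minimises the DFD summed over the two views.

```python
def dfd(region_pts, motion, img_t, img_t1):
    """Mean squared displaced frame difference of a node's region under `motion`."""
    err = 0.0
    for p in region_pts:
        q = (p[0] + motion[0], p[1] + motion[1])  # compensated position
        err += (img_t1(p) - img_t(q)) ** 2
    return err / len(region_pts)

def select_nonrigid(region_pts, rigid, candidates, views):
    """Analysis-by-synthesis selection: choose the candidate non-rigid correction
    minimising the DFD summed over the left and right views."""
    def cost(delta):
        total = (rigid[0] + delta[0], rigid[1] + delta[1])  # rigid + non-rigid
        return sum(dfd(region_pts, total, it, it1) for it, it1 in views)
    return min(candidates, key=cost)
```

In the paper the candidates come from the triangles containing the node; here any candidate list can be supplied.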
More specifically, the 3D motion r_i, i = 1, ..., N, of the wire-mesh nodes is governed by (24). Alternative estimates of the motion of the same node are provided by applying to (16) the 3D motion parameters originally estimated for a triangle containing node i (see Fig. 6(d)). Since the motion of each triangle reflects both global rigid motion and local deformations, the difference of these two estimates of the motion of each node may be assumed to approximate the non-rigid motion component of the node. If r_i is the rigid 3D motion of node i and Δ_i^k, k = 1, ..., N_i, are the estimates of the motion of node i produced by rotating and translating this node with the rotation and translation parameters corresponding to each triangle T_k containing node i, we define as candidates for the minimisation of the DFD of the reconstruction error

Δ̃_i^k = Δ_i^k − r_i,   k = 1, ..., N_i,   (25)

where N_i is the number of neighbourhood triangles of node i. The final non-rigid motion vector Δ̃_i is chosen to be

Δ̃_i = arg min_{Δ̃_i^k, k = 1, ..., N_i} (DFD_l(Δ̃_i^k) + DFD_r(Δ̃_i^k)),

where

DFD_c(Δ̃_i) = (1/N_{R_i}) Σ_{P(t) ∈ R_i} (I_c^{t+1}(P(t)) − I_c^{t}(P(t) + r_i + Δ̃_i))².

In the above equation, P(t) denotes the 3D coordinates of a point of the node's region at time instance t, and P(t) + r_i + Δ̃_i the corresponding corrected coordinates at time instance t+1 for the non-rigid motion vector Δ̃_i. The intensities I_c at time instances t and t+1 are calculated for cameras c = l, r over a region R_i defined as the aggregate of the planes of all triangles containing node i, and N_{R_i} is the number of points contained in region R_i.

8. Experimental results

The proposed model-based analysis and coding method was evaluated on the right and left channels of a stereoscopic image sequence. The first frame of both channels is transmitted using intra-frame coding techniques, as in H.263 [20].
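The candidate construction of Eq. (25) in Section 7.2 can be sketched as follows (Python; `apply_rigid` and `candidate_corrections` are illustrative names, and the linearized rotation of Eq. (21) is assumed): each triangle containing the node proposes a displacement via its own rigid parameters, and subtracting the node's rigid displacement leaves a candidate non-rigid correction.

```python
def apply_rigid(p, a):
    """Move point p by the small-rotation rigid motion a = (wx, wy, wz, tx, ty, tz)."""
    x, y, z = p
    wx, wy, wz, tx, ty, tz = a
    return (x - wz * y + wy * z + tx,
            wz * x + y - wx * z + ty,
            -wy * x + wx * y + z + tz)

def candidate_corrections(node, rigid_a, triangle_params):
    """Eq. (25): for each triangle containing the node, subtract the node's
    rigid displacement from the displacement the triangle's motion implies."""
    px, py, pz = node
    rx, ry, rz = apply_rigid(node, rigid_a)
    r = (rx - px, ry - py, rz - pz)          # rigid displacement r_i of the node
    cands = []
    for a_k in triangle_params:
        qx, qy, qz = apply_rigid(node, a_k)
        full = (qx - px, qy - py, qz - pz)   # displacement under triangle k's motion
        cands.append(tuple(f - ri for f, ri in zip(full, r)))
    return cands
```

The resulting list would then be scanned by the DFD-minimising selection of Section 7.2.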
The performance of the proposed methods was investigated in application to the compression of the interlaced stereoscopic videoconference sequence 'Claude'. The hierarchical dynamic programming procedure for matching across the epipolar line, described in Section 4.1, with two levels of hierarchy was used for left-to-right (LR) and right-to-left (RL) disparity/depth estimation. The search area for disparity was chosen to be ±62 and ±2 half-pixels for the x and y coordinates, respectively. Fig. 7(b) and (d) show the computed left and right channel depth maps using the hierarchical dynamic programming approach. The depth map has the same resolution as the original image, since it is computed from a dense disparity field. Depth information is quantised to 256 levels; darker areas represent objects closer to the cameras. The smoothing properties of the dynamic programming method are seen to result in more realistic depth-map estimates. Foreground/background separation is performed next, using the coarse-to-fine technique described in Section 4.2. The motion detection mask along with the luminance edge information are then used to improve the results of the initial segmentation. The resulting foreground/background mask of 'Claude' is shown in Fig. 5(b). This sequence was prepared by THOMSON BROADCASTING SYSTEMS for use in the DISTIMA RACE project.
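The dynamic-programming idea behind these depth maps can be illustrated with a deliberately simplified scanline sketch (Python; this is not the paper's hierarchical algorithm of Section 4.1, and the absolute-difference cost and `smooth` penalty are assumptions): each pixel of the left scanline is assigned the disparity minimising a matching cost plus a penalty on disparity changes between neighbours, solved exactly by dynamic programming with backtracking.

```python
def dp_disparity(left, right, max_disp, smooth=1.0):
    """Scanline disparity by dynamic programming: per-pixel matching cost plus
    a smoothness penalty smooth * |d - d_prev| between neighbouring pixels."""
    n = len(left)
    INF = float('inf')
    cost = [[INF] * (max_disp + 1) for _ in range(n)]
    back = [[0] * (max_disp + 1) for _ in range(n)]
    cost[0][0] = abs(left[0] - right[0])  # nonzero disparities at x = 0 fall off the line
    for i in range(1, n):
        for d in range(max_disp + 1):
            if i - d < 0:
                continue  # matching pixel would lie outside the right scanline
            best, arg = INF, 0
            for dprev in range(max_disp + 1):
                c = cost[i - 1][dprev] + smooth * abs(d - dprev)
                if c < best:
                    best, arg = c, dprev
            cost[i][d] = abs(left[i] - right[i - d]) + best
            back[i][d] = arg
    d = min(range(max_disp + 1), key=lambda dd: cost[n - 1][dd])
    path = [d]
    for i in range(n - 1, 0, -1):  # backtrack the optimal disparity path
        d = back[i][d]
        path.append(d)
    path.reverse()
    return path
```

A real implementation would add occlusion handling, the two hierarchy levels, and the LR/RL consistency checking described in the text.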
Fig. 7. (a) Original left channel image 'Claude' (frame 2). (b) Corresponding depth map estimated using dynamic programming. (c) Original right channel image 'Claude' (frame 2). (d) Corresponding depth map estimated using dynamic programming.

The LR and RL disparity estimates are then subjected to the consistency checking procedure, and inconsistent matches are corrected by reliably fusing LR and RL information as described in Section 4.3. On the basis of the consistent and corrected depth information at all reference points, the wireframe model is adapted to the foreground object (Fig. 5(c) and (d)). The rigid 3D motion of each triangle of the foreground object is computed next, using the technique described in Section 6.1. The output of the proposed local 3D motion estimator is a set of 3D motion parameters assigned to each triangle of the wireframe 3D model. In order to show the resulting local 3D motion we have produced a visualization of the rotation and translation parameters of the homogeneous motion matrix. For the rotation parameters, the direction of the vector assigned to each triangle shows the rotation axis, and the size of the vector as well as the color of the triangle show the magnitude of the angle of
Fig. 8. (a) Visualization of the rotation parameters of the rigid 3D motion for each triangle of the 3D model of 'Claude'. (b) Visualization of the translation parameters of the rigid 3D motion for each triangle of the 3D model of 'Claude'.

the rotation. For the translation parameters, the direction of the vector at each triangle shows the direction of the local 3D motion, while the size of the vector as well as the color of the triangle show the magnitude of the translation. The visualisation of the rotation and translation parameters of the rigid 3D motion of each wireframe triangle for 'Claude' is shown in Fig. 8(a) and (b), respectively. As is in this way demonstrated, the head and the shoulders undergo different motion. The resulting articulation of the foreground object achieved by the methods of Section 6 is shown in Fig. 9(a). The accuracy of this object articulation is remarkable, and is due to the fact that the foreground/background segmentation and object articulation procedures are defined in 3D space in terms of triangle rather than pixel motion. In this way, complete correspondence is achieved between objects in the right and left channel images. Following object articulation, the algorithm presented in Section 7.1 is used for rigid 3D motion estimation. The computed motion parameter vectors between frames 1 and 2 for the head and shoulders sub-objects are shown in Fig. 9(b). As seen, the 3D motion of the 'shoulders' sub-object is negligible, while the 3D motion of the 'head' has significant rotation and translation parameters (this can also be observed by examining the original frames 1 and 2 of 'Claude'). The performance of the algorithm in terms of PSNR is evaluated in Tables 1 and 2, where the quality of the reconstruction of the whole image as well as of the 'head' and 'shoulders' sub-objects alone is presented. Fig.
10(a) and (c) show the reconstructed left and right images, respectively, using rigid 3D motion compensation, while Fig. 10(b) and (d) show the corresponding prediction errors. As Tables 1 and 2 show, rigid 3D motion is not sufficient for very accurate reconstruction of the 'head' sub-object, and thus non-rigid 3D motion must be used to improve the performance of the algorithm. The analysis-by-synthesis technique presented in Section 7.2 is then used for non-rigid 3D motion estimation. The quality of the reconstruction in terms of PSNR is reported in Tables 1 and 2, where an improvement of about 1 dB is seen to be achieved by non-rigid 3D motion compensation. Fig. 11(a) and (c) show details of the reconstruction error using rigid 3D motion compensation of the left and right images, respectively, while Fig. 11(b) and (d) show the corresponding prediction errors using non-rigid 3D motion compensation.
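For reference, the PSNR figures in Tables 1 and 2 follow the standard definition; a minimal sketch, assuming 8-bit intensities (peak value 255):

```python
import math

def psnr(orig, recon, peak=255.0):
    """Peak signal-to-noise ratio in dB between an original frame and its
    motion-compensated reconstruction."""
    mse = sum((a - b) ** 2 for a, b in zip(orig, recon)) / len(orig)
    return float('inf') if mse == 0 else 10.0 * math.log10(peak * peak / mse)
```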
Fig. 9. (a) Articulation of the foreground object. (b) The 3D motion parameter vectors corresponding to the head and shoulder sub-objects.

Table 1
PSNR (dB) of the reconstruction of the left channel frame 2 of 'Claude' using rigid and non-rigid 3D motion compensation

Object        Rigid    Non-rigid
Whole image
Head
Shoulders

Table 2
PSNR (dB) of the reconstruction of the right channel frame 2 of 'Claude' using rigid and non-rigid 3D motion compensation

Object        Rigid    Non-rigid
Whole image
Head
Shoulders

The proposed algorithm was also tested for the coding of a sequence of frames at 10 frames/s. The model adaptation, depth estimation and object articulation procedures were applied only at the beginning of each group of frames; each group of frames consists of 10 frames. The first frame of each group of frames was transmitted using intra-frame coding techniques. In the intermediate frames the model and articulation
Fig. 10. (a) Reconstructed left channel of 'Claude' using rigid 3D motion compensation. (b) The corresponding prediction error. (c) Reconstructed right channel of 'Claude' using rigid 3D motion compensation. (d) The corresponding prediction error.

information are self-adapted using the rigid and non-rigid 3D motion information. The only parameters that need to be transmitted are the six parameters of the rigid 3D motion and the 3D non-rigid motion vector for each node of the wireframe. The methodology developed in this paper allows both left and right images to be reconstructed using the same 3D rigid motion vectors, thus achieving considerable bit-rate savings. The coding algorithm requires a bit-rate of 24.4 kbps and produces better image quality than a correspondingly simple block-matching motion estimation algorithm [23], as shown in Figs. 12 and 13. The simple block-matching approach is identical to that used in H.263 and consists of absolute displaced frame difference minimization, by searching exhaustively within a search area of -15, ..., 15 half-pixels in the previous frame, centered at the position of the examined block. In both coders, only the first frame of
Fig. 11. (a) Detail of the reconstruction error of the left channel of 'Claude' using rigid 3D motion compensation. (b) The corresponding prediction error using non-rigid 3D motion compensation. (c) Detail of the reconstruction error of the right channel of 'Claude' using rigid 3D motion compensation. (d) The corresponding prediction error using non-rigid 3D motion compensation.

Fig. 12. PSNR of each frame of the left channel for the proposed algorithm, compared with the block matching scheme with a block size of 16×16 pixels.
Fig. 13. PSNR of each frame of the right channel for the proposed algorithm, compared with the block matching scheme with a block size of 16×16 pixels.

each group of frames was transmitted using intra-frame coding. It was also assumed that each frame was predicted from the reconstructed previous frame, and that the prediction error was not transmitted. The bit-rate required by this scheme with a 16×16 block size was 24.5 kbps.

9. Conclusions

In this paper we addressed the problem of rigid and non-rigid 3D motion estimation for model-based stereoscopic videoconference image sequence analysis and coding. On the basis of foreground/background segmentation using motion, depth and luminance information, the model was initialised by automatically adapting a wireframe model to the consistent depth information. Object articulation was then performed based on the rigid 3D motion of small surface patches. Spatial constraints were imposed to increase the reliability of the obtained 3D motion estimates for each triangle patch. A novel iterative classification technique was then used to obtain an articulated description of the scene (head, neck, shoulders). Finally, flexible motion of the nodes of the wireframe was estimated using a novel technique based on the rigid 3D motions of the triangles containing each specific node.

The results of the algorithm can be used in a range of applications. For video production and computer graphics applications, the 3D motion of a specific scene could be used to produce a scene with similar motion but different texture, as when producing a video with a model mimicking the motion of an actor.
The method can also find useful applications in image analysis, since it provides an analytic representation of the motion of the object (at either the triangle or the wireframe-node level) that can be used for the segmentation or articulation of the object into rigidly moving components. Finally, the method was experimentally shown to be efficient for very low bit-rate coding of stereoscopic image sequences.
References

[1] G. Adiv, Determining three-dimensional motion and structure from optical flow generated by several moving objects, IEEE Trans. Pattern Anal. Mach. Intell. 7 (July 1985) 384–401.
[2] K. Aizawa, H. Harashima, T. Saito, Model-based analysis synthesis image coding (MBASIC) system for a person's face, Signal Processing: Image Communication 1 (October 1989) 139–152.
[3] S. Barnard, W. Thompson, Disparity analysis of images, IEEE Trans. Pattern Anal. Mach. Intell. 2 (July 1980) 333–340.
[4] G. Bozdagi, A.M. Tekalp, L. Onural, 3-D motion estimation and wireframe adaptation including photometric effects for model-based coding of facial image sequences, IEEE Trans. Circuits Systems Video Technol. (June 1994) 246–256.
[5] I.J. Cox, S. Hingorani, B.M. Maggs, S.B. Rao, Stereo without regularization, Tech. Rep., NEC Research Institute, Princeton, USA, October.
[6] I. Cox, S. Hingorani, S. Rao, B. Maggs, A maximum likelihood stereo algorithm, Comput. Vision Graphics Image Process. (1995), to appear.
[7] J.L. Dugelay, D. Pele, Motion and disparity analysis of a stereoscopic sequence. Application to 3DTV coding, EUSIPCO '92, October 1992, pp. 1295–1298.
[8] W.L.O. Egger, M. Kunt, High compression image coding using an adaptive morphological subband decomposition, Proc. IEEE 83 (February 1995) 272–287.
[9] L. Falkenhagen, 3D object-based depth estimation from stereoscopic image sequences, in: Proc. Internat. Workshop on Stereoscopic and 3D Imaging '95, Santorini, Greece, September 1995, pp. 81–86.
[10] N. Grammalidis, S. Malassiotis, D. Tzovaras, M.G. Strintzis, Stereo image sequence coding based on three-dimensional motion estimation and compensation, Signal Processing: Image Communication 7 (August 1995) 129–145.
[11] M. Hötter, Object-oriented analysis–synthesis coding based on moving two-dimensional objects, Signal Processing: Image Communication 2 (December 1990) 409–428.
[12] M.
Hötter, Optimization and efficiency of an object-oriented analysis–synthesis coder, Signal Processing: Image Communication 4 (April 1994) 181–194.
[13] E. Izquierdo, M. Ernst, Motion/disparity analysis for 3D-video-conference applications, in: M.G. Strintzis et al. (Eds.), Proc. Internat. Workshop on Stereoscopic and 3D Imaging, Santorini, Greece, September 1995, pp. 180–186.
[14] R. Koch, Dynamic 3D scene analysis through synthesis feedback control, IEEE Trans. Pattern Anal. Mach. Intell. 15 (June 1993) 556–568.
[15] H. Li, A. Lundmark, R. Forchheimer, Image sequence coding at very low bitrates: a review, IEEE Trans. Image Process. 3 (September 1995) 589–609.
[16] J. Liu, R. Skerjanc, Stereo and motion correspondence in a sequence of stereo images, Signal Processing: Image Communication 5 (October 1993) 305–318.
[17] S. Malassiotis, M.G. Strintzis, Optimal 3D mesh object modeling for depth estimation from stereo images, in: Proc. 4th European Workshop on 3D Television, Rome, October.
[18] G. Martinez, Shape estimation of moving articulated 3D objects for object-based analysis-synthesis coding (OBASC), in: Internat. Workshop on Coding Techniques for Very Low Bit-rate Video, Tokyo, Japan, November.
[19] G. Martínez, Object articulation for model-based facial image coding, Signal Processing: Image Communication (September 1996).
[20] MPEG-2, Generic coding of moving pictures and associated audio information, Tech. Rep. ISO/IEC 13818.
[21] H.G. Musmann, M. Hötter, J. Ostermann, Object-oriented analysis–synthesis coding of moving images, Signal Processing: Image Communication 1 (October 1989) 117–138.
[22] H.G. Musmann, P. Pirsch, H.J. Grallert, Advances in picture coding, Proc. IEEE 73 (April 1985) 523–548.
[23] A.N. Netravali, B.G. Haskell, Digital Pictures: Representation and Compression, Plenum Press, New York and London.
[24] S. Panis, M. Ziegler, Object based coding using motion and stereo information, in: Proc.
Picture Coding Symposium (PCS '94), Sacramento, California, September 1994, pp. 308–312.
[25] D.V. Papadimitriou, Stereo in model-based image coding, in: Internat. Workshop on Coding Techniques for Very Low Bit-rate Video (VLBV 94), Colchester, April 1994.
[26] L. Robert, R. Deriche, Dense depth map reconstruction using a multiscale regularization approach with discontinuities preserving, in: M.G. Strintzis et al. (Eds.), Proc. Internat. Workshop on Stereoscopic and 3D Imaging, Santorini, Greece, September 1995, pp. 32–39.
[27] S.S. Sinha, B.G. Schunck, A two-stage algorithm for discontinuity-preserving surface reconstruction, IEEE Trans. Pattern Anal. Mach. Intell. 14 (January 1992).
[28] A. Tamtaoui, C. Labit, Constrained disparity and motion estimators for 3DTV image sequence coding, Signal Processing: Image Communication 4 (November 1991) 45–54.
[29] A. Tamtaoui, C. Labit, Symmetrical stereo matching for 3DTV sequence coding, in: Picture Coding Symp. PCS '93, March 1993.
[30] D. Tzovaras, N. Grammalidis, M.G. Strintzis, Depth map coding for stereo and multiview image sequence transmission, in: Internat. Workshop on Stereoscopic and 3D Imaging (IWS3DI'95), Santorini, Greece, September 1995, pp. 75–80.
More informationColour Segmentation-based Computation of Dense Optical Flow with Application to Video Object Segmentation
ÖGAI Journal 24/1 11 Colour Segmentation-based Computation of Dense Optical Flow with Application to Video Object Segmentation Michael Bleyer, Margrit Gelautz, Christoph Rhemann Vienna University of Technology
More informationDepartment of Electrical Engineering, Keio University Hiyoshi Kouhoku-ku Yokohama 223, Japan
Shape Modeling from Multiple View Images Using GAs Satoshi KIRIHARA and Hideo SAITO Department of Electrical Engineering, Keio University 3-14-1 Hiyoshi Kouhoku-ku Yokohama 223, Japan TEL +81-45-563-1141
More informationLocal qualitative shape from stereo. without detailed correspondence. Extended Abstract. Shimon Edelman. Internet:
Local qualitative shape from stereo without detailed correspondence Extended Abstract Shimon Edelman Center for Biological Information Processing MIT E25-201, Cambridge MA 02139 Internet: edelman@ai.mit.edu
More informationREDUCTION OF CODING ARTIFACTS IN LOW-BIT-RATE VIDEO CODING. Robert L. Stevenson. usually degrade edge information in the original image.
REDUCTION OF CODING ARTIFACTS IN LOW-BIT-RATE VIDEO CODING Robert L. Stevenson Laboratory for Image and Signal Processing Department of Electrical Engineering University of Notre Dame Notre Dame, IN 46556
More informationRENDERING AND ANALYSIS OF FACES USING MULTIPLE IMAGES WITH 3D GEOMETRY. Peter Eisert and Jürgen Rurainsky
RENDERING AND ANALYSIS OF FACES USING MULTIPLE IMAGES WITH 3D GEOMETRY Peter Eisert and Jürgen Rurainsky Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institute Image Processing Department
More informationSTEREOSCOPIC IMAGE PROCESSING
STEREOSCOPIC IMAGE PROCESSING Reginald L. Lagendijk, Ruggero E.H. Franich 1 and Emile A. Hendriks 2 Delft University of Technology Department of Electrical Engineering 4 Mekelweg, 2628 CD Delft, The Netherlands
More informationModel-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding
344 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 3, APRIL 2000 Model-Aided Coding: A New Approach to Incorporate Facial Animation into Motion-Compensated Video Coding Peter
More informationCHAPTER 3 DISPARITY AND DEPTH MAP COMPUTATION
CHAPTER 3 DISPARITY AND DEPTH MAP COMPUTATION In this chapter we will discuss the process of disparity computation. It plays an important role in our caricature system because all 3D coordinates of nodes
More informationRectification and Distortion Correction
Rectification and Distortion Correction Hagen Spies March 12, 2003 Computer Vision Laboratory Department of Electrical Engineering Linköping University, Sweden Contents Distortion Correction Rectification
More informationCOMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION
COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION Mr.V.SRINIVASA RAO 1 Prof.A.SATYA KALYAN 2 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING PRASAD V POTLURI SIDDHARTHA
More informationLearning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009
Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer
More informationDetecting Planar Homographies in an Image Pair. submission 335. all matches. identication as a rst step in an image analysis
Detecting Planar Homographies in an Image Pair submission 335 Abstract This paper proposes an algorithm that detects the occurrence of planar homographies in an uncalibrated image pair. It then shows that
More informationFigure 1: Representation of moving images using layers Once a set of ane models has been found, similar models are grouped based in a mean-square dist
ON THE USE OF LAYERS FOR VIDEO CODING AND OBJECT MANIPULATION Luis Torres, David Garca and Anna Mates Dept. of Signal Theory and Communications Universitat Politecnica de Catalunya Gran Capita s/n, D5
More informationPhase2. Phase 1. Video Sequence. Frame Intensities. 1 Bi-ME Bi-ME Bi-ME. Motion Vectors. temporal training. Snake Images. Boundary Smoothing
CIRCULAR VITERBI BASED ADAPTIVE SYSTEM FOR AUTOMATIC VIDEO OBJECT SEGMENTATION I-Jong Lin, S.Y. Kung ijonglin@ee.princeton.edu Princeton University Abstract - Many future video standards such as MPEG-4
More informationA Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 1, JANUARY 2001 111 A Low Bit-Rate Video Codec Based on Two-Dimensional Mesh Motion Compensation with Adaptive Interpolation
More informationProf. Fanny Ficuciello Robotics for Bioengineering Visual Servoing
Visual servoing vision allows a robotic system to obtain geometrical and qualitative information on the surrounding environment high level control motion planning (look-and-move visual grasping) low level
More informationView Synthesis for Multiview Video Compression
MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com View Synthesis for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, and Anthony Vetro TR2006-035 April 2006 Abstract
More informationAccurate and Dense Wide-Baseline Stereo Matching Using SW-POC
Accurate and Dense Wide-Baseline Stereo Matching Using SW-POC Shuji Sakai, Koichi Ito, Takafumi Aoki Graduate School of Information Sciences, Tohoku University, Sendai, 980 8579, Japan Email: sakai@aoki.ecei.tohoku.ac.jp
More informationCHAPTER 5 MOTION DETECTION AND ANALYSIS
CHAPTER 5 MOTION DETECTION AND ANALYSIS 5.1. Introduction: Motion processing is gaining an intense attention from the researchers with the progress in motion studies and processing competence. A series
More informationRegion Segmentation for Facial Image Compression
Region Segmentation for Facial Image Compression Alexander Tropf and Douglas Chai Visual Information Processing Research Group School of Engineering and Mathematics, Edith Cowan University Perth, Australia
More informationExtensions of H.264/AVC for Multiview Video Compression
MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Extensions of H.264/AVC for Multiview Video Compression Emin Martinian, Alexander Behrens, Jun Xin, Anthony Vetro, Huifang Sun TR2006-048 June
More informationThe Video Z-buffer: A Concept for Facilitating Monoscopic Image Compression by exploiting the 3-D Stereoscopic Depth map
The Video Z-buffer: A Concept for Facilitating Monoscopic Image Compression by exploiting the 3-D Stereoscopic Depth map Sriram Sethuraman 1 and M. W. Siegel 2 1 David Sarnoff Research Center, Princeton,
More informationPRELIMINARY RESULTS ON REAL-TIME 3D FEATURE-BASED TRACKER 1. We present some preliminary results on a system for tracking 3D motion using
PRELIMINARY RESULTS ON REAL-TIME 3D FEATURE-BASED TRACKER 1 Tak-keung CHENG derek@cs.mu.oz.au Leslie KITCHEN ljk@cs.mu.oz.au Computer Vision and Pattern Recognition Laboratory, Department of Computer Science,
More informationStereo Vision. MAN-522 Computer Vision
Stereo Vision MAN-522 Computer Vision What is the goal of stereo vision? The recovery of the 3D structure of a scene using two or more images of the 3D scene, each acquired from a different viewpoint in
More informationVery Low Bit Rate Color Video
1 Very Low Bit Rate Color Video Coding Using Adaptive Subband Vector Quantization with Dynamic Bit Allocation Stathis P. Voukelatos and John J. Soraghan This work was supported by the GEC-Marconi Hirst
More informationAn Embedded Wavelet Video Coder. Using Three-Dimensional Set. Partitioning in Hierarchical Trees. Beong-Jo Kim and William A.
An Embedded Wavelet Video Coder Using Three-Dimensional Set Partitioning in Hierarchical Trees (SPIHT) Beong-Jo Kim and William A. Pearlman Department of Electrical, Computer, and Systems Engineering Rensselaer
More informationStereo imaging ideal geometry
Stereo imaging ideal geometry (X,Y,Z) Z f (x L,y L ) f (x R,y R ) Optical axes are parallel Optical axes separated by baseline, b. Line connecting lens centers is perpendicular to the optical axis, and
More informationAn Embedded Wavelet Video. Set Partitioning in Hierarchical. Beong-Jo Kim and William A. Pearlman
An Embedded Wavelet Video Coder Using Three-Dimensional Set Partitioning in Hierarchical Trees (SPIHT) 1 Beong-Jo Kim and William A. Pearlman Department of Electrical, Computer, and Systems Engineering
More informationEfficient Stereo Image Rectification Method Using Horizontal Baseline
Efficient Stereo Image Rectification Method Using Horizontal Baseline Yun-Suk Kang and Yo-Sung Ho School of Information and Communicatitions Gwangju Institute of Science and Technology (GIST) 261 Cheomdan-gwagiro,
More informationRESTORATION OF DEGRADED DOCUMENTS USING IMAGE BINARIZATION TECHNIQUE
RESTORATION OF DEGRADED DOCUMENTS USING IMAGE BINARIZATION TECHNIQUE K. Kaviya Selvi 1 and R. S. Sabeenian 2 1 Department of Electronics and Communication Engineering, Communication Systems, Sona College
More informationA Real Time System for Detecting and Tracking People. Ismail Haritaoglu, David Harwood and Larry S. Davis. University of Maryland
W 4 : Who? When? Where? What? A Real Time System for Detecting and Tracking People Ismail Haritaoglu, David Harwood and Larry S. Davis Computer Vision Laboratory University of Maryland College Park, MD
More informationVisual Hulls from Single Uncalibrated Snapshots Using Two Planar Mirrors
Visual Hulls from Single Uncalibrated Snapshots Using Two Planar Mirrors Keith Forbes 1 Anthon Voigt 2 Ndimi Bodika 2 1 Digital Image Processing Group 2 Automation and Informatics Group Department of Electrical
More informationScene Segmentation by Color and Depth Information and its Applications
Scene Segmentation by Color and Depth Information and its Applications Carlo Dal Mutto Pietro Zanuttigh Guido M. Cortelazzo Department of Information Engineering University of Padova Via Gradenigo 6/B,
More informationOcclusion Detection of Real Objects using Contour Based Stereo Matching
Occlusion Detection of Real Objects using Contour Based Stereo Matching Kenichi Hayashi, Hirokazu Kato, Shogo Nishida Graduate School of Engineering Science, Osaka University,1-3 Machikaneyama-cho, Toyonaka,
More informationUniversity of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /WIVC.1996.
Czerepinski, P. J., & Bull, D. R. (1996). Coderoriented matching criteria for motion estimation. In Proc. 1st Intl workshop on Wireless Image and Video Communications (pp. 38 42). Institute of Electrical
More informationImage Segmentation Techniques for Object-Based Coding
Image Techniques for Object-Based Coding Junaid Ahmed, Joseph Bosworth, and Scott T. Acton The Oklahoma Imaging Laboratory School of Electrical and Computer Engineering Oklahoma State University {ajunaid,bosworj,sacton}@okstate.edu
More informationVIDEO COMPRESSION STANDARDS
VIDEO COMPRESSION STANDARDS Family of standards: the evolution of the coding model state of the art (and implementation technology support): H.261: videoconference x64 (1988) MPEG-1: CD storage (up to
More informationCalibrating a Structured Light System Dr Alan M. McIvor Robert J. Valkenburg Machine Vision Team, Industrial Research Limited P.O. Box 2225, Auckland
Calibrating a Structured Light System Dr Alan M. McIvor Robert J. Valkenburg Machine Vision Team, Industrial Research Limited P.O. Box 2225, Auckland New Zealand Tel: +64 9 3034116, Fax: +64 9 302 8106
More informationChapter 3 Image Registration. Chapter 3 Image Registration
Chapter 3 Image Registration Distributed Algorithms for Introduction (1) Definition: Image Registration Input: 2 images of the same scene but taken from different perspectives Goal: Identify transformation
More informationGeometric transform motion compensation for low bit. rate video coding. Sergio M. M. de Faria
Geometric transform motion compensation for low bit rate video coding Sergio M. M. de Faria Instituto de Telecomunicac~oes / Instituto Politecnico de Leiria Pinhal de Marrocos, Polo II-FCTUC 3000 Coimbra,
More information3D Sensing and Reconstruction Readings: Ch 12: , Ch 13: ,
3D Sensing and Reconstruction Readings: Ch 12: 12.5-6, Ch 13: 13.1-3, 13.9.4 Perspective Geometry Camera Model Stereo Triangulation 3D Reconstruction by Space Carving 3D Shape from X means getting 3D coordinates
More informationRealtime View Adaptation of Video Objects in 3-Dimensional Virtual Environments
Contact Details of Presenting Author Edward Cooke (cooke@hhi.de) Tel: +49-30-31002 613 Fax: +49-30-3927200 Summation Abstract o Examination of the representation of time-critical, arbitrary-shaped, video
More informationMOTION. Feature Matching/Tracking. Control Signal Generation REFERENCE IMAGE
Head-Eye Coordination: A Closed-Form Solution M. Xie School of Mechanical & Production Engineering Nanyang Technological University, Singapore 639798 Email: mmxie@ntuix.ntu.ac.sg ABSTRACT In this paper,
More information3. International Conference on Face and Gesture Recognition, April 14-16, 1998, Nara, Japan 1. A Real Time System for Detecting and Tracking People
3. International Conference on Face and Gesture Recognition, April 14-16, 1998, Nara, Japan 1 W 4 : Who? When? Where? What? A Real Time System for Detecting and Tracking People Ismail Haritaoglu, David
More informationDimensional Imaging IWSNHC3DI'99, Santorini, Greece, September SYNTHETIC HYBRID OR NATURAL FIT?
International Workshop on Synthetic Natural Hybrid Coding and Three Dimensional Imaging IWSNHC3DI'99, Santorini, Greece, September 1999. 3-D IMAGING AND COMPRESSION { SYNTHETIC HYBRID OR NATURAL FIT? Bernd
More informationA Hierarchical Statistical Framework for the Segmentation of Deformable Objects in Image Sequences Charles Kervrann and Fabrice Heitz IRISA / INRIA -
A hierarchical statistical framework for the segmentation of deformable objects in image sequences Charles Kervrann and Fabrice Heitz IRISA/INRIA, Campus Universitaire de Beaulieu, 35042 Rennes Cedex,
More informationLOCALIZATION OF FACIAL REGIONS AND FEATURES IN COLOR IMAGES. Karin Sobottka Ioannis Pitas
LOCALIZATION OF FACIAL REGIONS AND FEATURES IN COLOR IMAGES Karin Sobottka Ioannis Pitas Department of Informatics, University of Thessaloniki 540 06, Greece e-mail:fsobottka, pitasg@zeus.csd.auth.gr Index
More informationEE795: Computer Vision and Intelligent Systems
EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 14 130307 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Stereo Dense Motion Estimation Translational
More informationMotion-Compensated Subband Coding. Patrick Waldemar, Michael Rauth and Tor A. Ramstad
Video Compression by Three-dimensional Motion-Compensated Subband Coding Patrick Waldemar, Michael Rauth and Tor A. Ramstad Department of telecommunications, The Norwegian Institute of Technology, N-7034
More informationPlatelet-based coding of depth maps for the transmission of multiview images
Platelet-based coding of depth maps for the transmission of multiview images Yannick Morvan a, Peter H. N. de With a,b and Dirk Farin a a Eindhoven University of Technology, P.O. Box 513, The Netherlands;
More informationMultiple View Geometry
Multiple View Geometry CS 6320, Spring 2013 Guest Lecture Marcel Prastawa adapted from Pollefeys, Shah, and Zisserman Single view computer vision Projective actions of cameras Camera callibration Photometric
More informationMOTION ESTIMATION IN MPEG-2 VIDEO ENCODING USING A PARALLEL BLOCK MATCHING ALGORITHM. Daniel Grosu, Honorius G^almeanu
MOTION ESTIMATION IN MPEG-2 VIDEO ENCODING USING A PARALLEL BLOCK MATCHING ALGORITHM Daniel Grosu, Honorius G^almeanu Multimedia Group - Department of Electronics and Computers Transilvania University
More informationFast Lighting Independent Background Subtraction
Fast Lighting Independent Background Subtraction Yuri Ivanov Aaron Bobick John Liu [yivanov bobick johnliu]@media.mit.edu MIT Media Laboratory February 2, 2001 Abstract This paper describes a new method
More information