Generic 3D Face Pose Estimation using Facial Shapes

Jingu Heo
CyLab Biometrics Center, Carnegie Mellon University
5000 Forbes Ave, Pittsburgh, PA 15213

Marios Savvides
CyLab Biometrics Center, Carnegie Mellon University
5000 Forbes Ave, Pittsburgh, PA 15213

Abstract

Generic 3D face pose estimation from a single 2D facial image is a crucial requirement for face-related research areas. To address the remaining challenges in face pose estimation identified by Murphy-Chutorian and Trivedi [13], we believe the first step is to create a large corpus of 3D facial shapes in which the statistical relationship between projected 2D shapes and the corresponding pose parameters can be easily observed. Because facial geometry provides the most essential information about facial pose, understanding the effect of pose parameters on 2D facial shapes is a key step toward solving these challenges. In this paper, we present the tasks necessary to reconstruct 3D facial shapes from multiple 2D images and then explain how to generate 2D projected shapes at any rotation interval. To deal with self-occlusions, a novel hidden points removal (HPR) algorithm is also proposed. By flexibly changing the number of points in the 2D shapes, we evaluate two different approaches to generic 3D pose estimation at both coarse and fine levels and analyze the importance of facial shapes for generic 3D pose estimation.

1. Introduction

Face (head) pose estimation¹ has been widely investigated for decades and still has room for improvement on the remaining challenges raised in a recent survey [13]. More accurate and automatic pose estimation is desired for numerous applications such as face tracking and recognition, human-computer interaction, and video database indexing: ideally, it should run in real time from a monocular camera, under various lighting conditions and resolutions, be invariant to identity, handle a full range of head motion, and process multiple people simultaneously. Since human faces in digital imagery are affected by numerous factors, both intrinsic appearance changes (expression, aging, and eyeglasses) and extrinsic changes (illumination, camera geometry, and distortions), achieving invariance to these changes is an ongoing research topic and the ultimate goal not only of pose estimation but also of other face-related research areas, including face detection, face tracking, face alignment, and face recognition. It is known that the simplest 3D pose estimation can be achieved using facial geometry assumptions [6] [7] [10], i.e., generic distances among facial features. However, no rigorous evaluations validating these geometry assumptions against facial shape variations have been reported. Furthermore, little attention has been paid to understanding the relationships among 3D facial shapes, their 2D projected shapes, and the associated pose parameters, because generating and processing a large corpus of 3D face databases is not a trivial task in itself. We believe that facial geometry provides the most important information about facial pose; therefore, understanding the effect of pose parameters on 2D facial shapes is the most essential step toward solving the remaining challenges.

¹ We use "face" instead of "head" in pose estimation, since the three degrees of freedom (DOF) can be reliably estimated from facial features, without information about the ears and the head contour.

In this paper, we provide an efficient solution to this problem using a large corpus of 3D shapes, easily acquired from multiple 2D images. Instead of relying on dense facial shapes, we use at most 79 points in our 3D shape reconstruction. We then rotate each 3D face shape model and obtain 2D projected shapes at any rotation interval. A novel Hidden Points Removal (HPR) algorithm is also proposed to deal with self-occlusions. Using this much smaller number of points, we present two different approaches to generic 3D pose estimation and analyze the importance of facial shapes for pose estimation. We show that generic 3D face pose estimation at both coarse and fine levels can be achieved effectively using a set of important facial feature points.

2. Background

Face pose estimation is the process of retrieving the direction or orientation of a face by transforming the pixel-level representation of a human face into a high-level concept of direction [13], i.e., three degrees of freedom (DOF). Existing face pose estimation methods can be largely divided into two categories based on the type of features they use: appearance-based and shape-based approaches. Appearance-based approaches include flexible model-based methods [21] and ordinal pixel representation methods [2] [16]. These methods often require additional regression models, such as Support Vector Regression (SVR), Neural Networks (NN) [3], and manifold embedding methods [25], in order to learn or analyze the nonlinear relationships between appearance information and pose parameters [26]. Because of the difficulty of learning innumerable appearance changes in facial images, over-fitting or poor generalization is the key limiting factor for these appearance methods. They also carry a large computational burden compared to shape-based methods and typically have problems handling three DOF at a fine level of pose estimation.

Shape-based methods, on the other hand, can be further divided into two groups: point-based methods [23] [15] and geometry-based methods [6] [7], which utilize a set of important facial fiducial points. The challenging task in shape-based methods is not the pose estimator itself but the automatic detection of the facial features needed to achieve three DOF. This automatic facial feature detection normally complicates the evaluation of pose estimation, since poor performance in facial feature detection typically results in poor performance in pose estimation. Multi-view face detectors [11] and view-based flexible models [22] are often utilized to improve or initialize the feature localization step for coping with off-angle face images. However, generic 3D face alignment [8] is another important research topic in its own right and still has room for improvement in dealing with shapes under a wide range of 3D rotation changes. We expect that recently proposed facial feature alignment schemes [4] [12] may reduce this feature localization problem; however, even with accurate facial feature alignment, we believe that generic 3D pose estimation at a fine level is not a completely solved problem. Point-based methods typically require a set of points describing the overall shape of a face (the number of points varies within the range of 50 to 100), while geometry-based methods often use a minimal number of points and typically do not require additional regression models. For these reasons, geometry-based methods have attracted researchers seeking three DOF at a fine resolution level, requiring only the locations of the centers of the two eyes, the nose tip, and the two mouth corners. In [6], the facial normal, which contains the three orientations of a face, is obtained from these five points by utilizing a set of fixed distance ratios between facial features (we detail this method and provide an alternative solution in Section 4). However, many existing pose estimation methods are not suitable for achieving both coarse and fine resolution together; some methods can only estimate pose at a very coarse level and have difficulty achieving fine-level pose estimation, especially appearance-based methods. To build a fine-resolution pose estimator with an appearance-based method, one must consider continuous variations of face images.

As mentioned earlier, learning seamless appearance variations in face images is still a challenging problem. A good example of appearance-based pose estimation at an extremely coarse level is view-based face detection [11], which can retrieve single-DOF pose information relative to the frontal view with three discrete values. Although these view-based or pixel-based approaches are popular for detecting faces, improving face detection with more discrete or continuous pose angles may require significant modification of their algorithms.

To aid readability regarding the three DOF, we characterize three angles: roll, pitch, and yaw. Roll is the rotation about the z axis, resulting in rotation of the xy plane. Pitch is the rotation about the x axis, resulting in rotation of the yz plane. Similarly, yaw is the rotation about the y axis, changing the xz values. Pose estimation is therefore the inverse problem of retrieving these three pose parameters from the projected features in a 2D face image; a minimal sketch of the convention follows.

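For concreteness, this convention can be written as rotation matrices; a minimal sketch in NumPy (the function name and the composition order are our own choices, not specified by the paper):

```python
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    """Compose a 3D rotation from pitch (about x), yaw (about y),
    and roll (about z), all given in radians."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])  # pitch: moves the yz plane
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])  # yaw: moves the xz plane
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])  # roll: moves the xy plane
    return Rz @ Ry @ Rx  # one possible composition order
```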

Since observing the three 3D pose parameters requires a minimum of five facial feature points, the five-point method [6] is one of the most attractive candidates for a generic 3D fine pose estimator. In this paper, we focus on analyzing the five-point method with a much simpler solution and on providing more comprehensive evaluation results for generic people, by using a large corpus of 3D shapes in which the relationship between projected 2D shapes and the corresponding pose parameters can be easily manipulated and observed. Although several databases exist for pose estimation [13], to the best of our knowledge most of them provide projected 2D appearances and no shape information. We believe that facial shape information is the most essential for pose estimation; thus our database can be of great importance for analyzing the statistical shape variations caused by 3D pose changes. In addition, we propose a way to make our pose estimator more accurate by incorporating a larger number of facial features.

The rest of the paper is organized as follows. The detailed procedure for generating the 3D database and obtaining 2D projected shapes is presented in Section 3, along with a new HPR algorithm. Two different methods for generic 3D pose estimation are presented in Section 4, and Section 5 contains the performance evaluation results. Finally, in Section 6, we summarize the proposed work and discuss future work.

3. Proposed Work

In this section, we describe the steps needed to build a sparse 3D shape database and introduce the functions needed to obtain 2D shapes and pose parameters at any angle interval. A visual overview of the proposed work is presented in Fig. 1.

Figure 1. Overview of the proposed work.

Based on each reconstructed 3D shape, we obtain a set of novel 2D projected shapes at a fine angle interval. During the projection step, we identify hidden points. A 3D pose estimator is then obtained by utilizing point sets of different sizes, with or without regression models. We detail each step of the proposed work in the following sections.

3.1. Sparse 3D Face Reconstruction

3D reconstruction from multiple 2D face images is one of the most convenient ways of acquiring 3D facial shape information. A closely related research topic for sparse 3D reconstruction is Structure from Motion (SFM), which has been successfully applied in many computer vision algorithms [9]. The basic theory behind SFM is presented in this section. We define the 2D shape matrix $S_{2 \times n}$ from the 2D coordinates $(x, y)$ of the $n$ vertices:

$$S_{2 \times n} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \end{pmatrix} \qquad (1)$$

where each column contains a vector of $(x, y)$ coordinates. Similarly, the 3D shape matrix is built from the 3D coordinates $(x, y, z)$:

$$S_{3 \times n} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{pmatrix} \qquad (2)$$

where each column contains a vector of $(x, y, z)$ coordinates. The goal of the 3D reconstruction problem from multiple 2D observations is to recover the 3D shape information under noisy conditions. Since a set of 2D points is an instance of a projection of 3D points with a 2D translation vector $t$, we write:

$$s_{2d} = P s_{3d} + t \qquad (3)$$

where $s_{2d} = (x, y)^T$ and $s_{3d} = (x, y, z)^T$ denote a point in 2D and 3D respectively, and $P$ is the projection matrix, which must be specified depending on the application. For multiple camera observations $i$ and multiple points $n$, the goal of reconstruction is to minimize the overall error:

$$\arg\min_{P^i, t^i, s_{3d_n}} \sum_{n, i} \left\| s^i_{2d_n} - (P^i s_{3d_n} + t^i) \right\|^2. \qquad (4)$$

The factorization algorithm [24] achieves the above minimization under Gaussian noise. The measurement matrix $Y$ is obtained by stacking the 2D observations $S_{2 \times n}$ from all views. The factorization algorithm can then estimate $S_{3 \times n}$ by decomposing $Y = P_{2i \times 3} S_{3 \times n}$ through Singular Value Decomposition (SVD), exploiting the fact that the rank of $Y$ is at most 3; a sketch of this step follows below. A metric upgrade is then performed with additional constraints [24]. Using this SFM technique [24], we reconstructed 249 sparse 3D faces (79 points each) from the first session of the MPIE database [17]. Five images of the same person (within ±30 degrees of the frontal view) under pose variations with the same expression were used for each reconstruction. We utilize these shapes to generate 2D projected shapes at arbitrary 3D rotations.

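A minimal sketch of the rank-3 factorization step, assuming the 2D observations in each view have already been centered (which removes $t$) and omitting the metric upgrade; the function name is ours:

```python
import numpy as np

def factorize(Y):
    """Rank-3 factorization of a measurement matrix (Tomasi-Kanade style [24]).
    Y: (2*i, n) matrix stacking the centered 2D observations of n points
       over i views, one (x-row, y-row) pair per view.
    Returns the affine motion P_hat (2*i, 3) and shape S_hat (3, n)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Under noise-free orthography rank(Y) <= 3, so keep the top 3 components.
    P_hat = U[:, :3] * np.sqrt(s[:3])
    S_hat = np.sqrt(s[:3])[:, None] * Vt[:3, :]
    # A metric upgrade (imposing orthonormality on each view's camera rows)
    # is still required to obtain a Euclidean reconstruction [24].
    return P_hat, S_hat
```
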
3.2. 2D Projected Shapes

The procedure for synthesizing 2D projected shapes at any desired angle can be explained in three steps. First, we normalize the reconstructed 3D shapes in order to compensate for scale and rotation differences. Based on the reconstructed shapes, we perform an initial adjustment to ensure that every 3D shape is frontal and that its 2D projections lie along the z axis (relative to the frontal viewing angle). In other words, we project each 3D shape under the scaled version of orthographic projection (weak-perspective projection), which is a fairly good approximation of projective geometry. Second, we rotate each 3D shape by any desired angle interval and obtain the 2D projected shapes. Depending on the interval, pose estimation can operate at either a coarse or a fine level. In this way, we can estimate pose at both levels without changing the main algorithms; all that needs to be determined is the angular interval within which 2D projected shapes are assigned the same pose (a sketch is given below).

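A minimal sketch of steps one and two under weak-perspective projection (the function name, the scale handling, and the sampling grid are illustrative):

```python
import numpy as np

P = np.array([[1., 0., 0.],
              [0., 1., 0.]])  # orthographic projection matrix P_{2x3}

def project_shape(S3, yaw_deg, pitch_deg, scale=1.0):
    """Rotate a frontal 3D shape S3 (3, n) by yaw (about y) and pitch
    (about x), then apply scaled orthographic (weak-perspective) projection."""
    ay, ax = np.radians(yaw_deg), np.radians(pitch_deg)
    Ry = np.array([[np.cos(ay), 0, np.sin(ay)],
                   [0, 1, 0],
                   [-np.sin(ay), 0, np.cos(ay)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(ax), -np.sin(ax)],
                   [0, np.sin(ax), np.cos(ax)]])
    return scale * (P @ Ry @ Rx @ S3)  # (2, n) projected shape

# Sampling every 2 degrees within [-40, 40] in both angles, e.g.:
# shapes = {(y, p): project_shape(S3, y, p)
#           for y in range(-40, 41, 2) for p in range(-40, 41, 2)}
```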

Finally, in order to handle occlusions while displaying, the visibility of the points must be determined. Current methods for handling visibility need improvement, however, since they typically rely on texture (depth-buffer) or surface normals to decide whether points are visible from a viewing direction. This visibility problem is a crucial element to address, especially for 2D projected shapes under a wide range of pose changes. We detail this problem in the next section, where we develop an improved way of performing hidden points removal on a set of sparse points based on surface normals and depth relative to the viewing angle.

3.3. A New Hidden Points Removal Method

Hidden Points Removal (HPR) is the process of determining the visibility of a point cloud from a given viewpoint. A closely related technique is Hidden Surface Removal (HSR): HPR determines the visibility of points, while HSR emphasizes the visibility of surfaces. Popular approaches for determining visibility include the z-buffer method [14] [20] and surface reconstruction based methods [5] [18] [1]. These methods typically require dense points in order to obtain triangular meshes from which surface normals can be computed smoothly. Recently, an efficient approach was proposed that avoids the computationally expensive surface normal computation [19]. The authors of [19] also provide an automated way of computing the required radius parameter; however, their method cannot successfully handle points with self-occlusions and needs a dense set of points to reliably determine visibility. In order to eliminate the radius selection of [19], we propose a new HPR algorithm based on surface normals and depth information relative to the viewing point. Since surface orientation alone is not sufficient to decide point visibility, relative depth information must also be considered. Our proposed HPR method is similar in spirit to surface reconstruction methods [18] [1]; however, we do not attempt to reconstruct smooth surfaces. Rather, simple triangulation information over a set of points is utilized.

An overview of our proposed HPR method is as follows. We first compute a mesh from the set of points (79 points) obtained by the 3D reconstruction method of the previous section. We then compute surface normals and depth information relative to the viewing point. Each surface normal determines the orientation of its triangle with respect to the viewing angle, while the relative depth information helps determine self-occlusions. By utilizing the surface orientation and relative depth inferred from the triangles adjacent to each point, we can efficiently determine the visibility of every point. Finally, we obtain the 2D projected shapes along with their visibility and the corresponding pose parameters. This determination of visibility is a valuable component for extreme off-angle pose estimation.

We now explain the proposed HPR method in detail. Formally, the goal of an ordinary HPR process for human facial shapes can be stated as follows.

Definition: Given a set of points $S_{3 \times n}$, considered a sampling of a continuous surface $S_{3 \times m}$ with $m \gg n$, and a viewpoint (camera position) $C$, determine which points of $S_{3 \times n}$ are visible from $C$.

We write a 2D shape as an instance of a 2D projection of a 3D shape with a 2D translation vector $t$:

$$S_{2 \times n} = P_w S_{3 \times n} + t. \qquad (5)$$

The weak-perspective projection matrix $P_w$ can be decomposed as $s_c P_{2 \times 3} R$, where $s_c$ is the scale factor, $R$ is the 3D rotation matrix, and $P_{2 \times 3}$ is the orthographic projection matrix:

$$P_{2 \times 3} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \qquad (6)$$

The HPR process then separates $S_{2 \times n}$ into visible points $S^v_{2 \times (n - n_h)}$ and occluded points $S^h_{2 \times n_h}$:

$$S_{2 \times n} = \{ S^v_{2 \times (n - n_h)}, \, S^h_{2 \times n_h} \} \qquad (7)$$

where $n_h$ is the number of hidden points. Since our viewing point $C$ lies on the z axis, we set $C = [0 \; 0 \; c]$, where $c$ is an arbitrarily large positive number; this justifies the use of weak-perspective projection for our 2D projected shapes. In addition, all 2D projected shapes and the corresponding rotation parameters are generated with respect to this fixed viewing position $[0 \; 0 \; c]$. The following four steps, based on the toy example in Fig. 2, contain the main idea of the proposed HPR process; a code sketch follows the list.

Figure 2. The proposed HPR algorithm on a toy example. After applying HPR to the original five points (left), the three points on the right side are identified as visible from C.

1. After applying the rotation to the 3D shape relative to $C$, triangulate the shape based on its points. In Fig. 2, for example, the 3D shape comprises the five points S1 to S5, and the Delaunay triangulation yields the anti-clockwise triangles $T_{123}$, $T_{243}$, $T_{135}$, and $T_{345}$.

2. Compute a surface normal for each triangle. For the triangle built on the three points (S1, S2, S3), for example, the surface normal is

$$n_{T_{123}} = (S2 - S1) \times (S3 - S1) \qquad (8)$$

$$n_{T_{123}} \leftarrow n_{T_{123}} / \| n_{T_{123}} \| \qquad (9)$$

where $\times$ is the cross product of two vectors and $\| \cdot \|$ denotes the Euclidean norm.

3. Calculate the angle between $C$ and $n_{T_{123}}$, i.e., $\theta = \arccos\left( \frac{C \cdot n_{T_{123}}}{\| C \| \, \| n_{T_{123}} \|} \right)$. If $\theta \geq 90°$, the triangle is oriented backwards and its points become candidates for occlusion. If every triangle containing a given point is oriented backwards, the point is indeed invisible; if at least one of its triangles has $\theta < 90°$, the point remains a visibility candidate. However, even when $\theta < 90°$, the points of a triangle can still be invisible due to self-occlusion, so the following additional step is used.

4. Compute the distance from $C$ to the center of gravity of each triangle. For each triangle, sorted by this distance, check whether any other point projects inside the triangle; a point that does, and that lies farther from $C$ than the triangle, is indeed occluded.

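A minimal sketch of these four steps, under our own assumptions: the Delaunay mesh is built once on the frontal shape's xy coordinates (so it is stable under rotation), all triangles are oriented counter-clockwise, and the step-4 depth comparison is approximated with the triangle centroid; SciPy and the function name are ours.

```python
import numpy as np
from scipy.spatial import Delaunay

def visibility_mask(S3_frontal, R, C=np.array([0., 0., 1e6])):
    """Boolean visibility of each point of a frontal 3D shape (3, n)
    after rotation R (3, 3), viewed from C on the +z axis."""
    # Step 1: triangulate once on the frontal xy coordinates and enforce
    # a consistent counter-clockwise orientation (+z normals when frontal).
    front = S3_frontal.T
    tris = Delaunay(front[:, :2]).simplices
    for t in tris:
        a, b, c = front[t]
        if np.cross(b - a, c - a)[2] < 0:
            t[[1, 2]] = t[[2, 1]]
    pts = (R @ S3_frontal).T                     # rotated points, (n, 3)
    view = C / np.linalg.norm(C)
    visible = np.zeros(len(pts), dtype=bool)
    facing = []
    for t in tris:
        a, b, c = pts[t]
        n = np.cross(b - a, c - a)               # step 2: triangle normal
        n /= np.linalg.norm(n)
        # Step 3: a point stays a candidate if at least one of its
        # triangles faces the viewpoint (angle to C below 90 degrees).
        if np.arccos(np.clip(n @ view, -1.0, 1.0)) < np.pi / 2:
            visible[t] = True
            facing.append(t)
    # Step 4: a candidate projecting strictly inside a closer,
    # front-facing triangle is self-occluded.
    for t in facing:
        tri = pts[t]
        z_tri = tri[:, 2].mean()                 # centroid depth (approximation)
        B = np.array([tri[1, :2] - tri[0, :2], tri[2, :2] - tri[0, :2]]).T
        for i in np.where(visible)[0]:
            if i in t or pts[i, 2] >= z_tri:
                continue                         # own vertex, or not behind
            lam = np.linalg.solve(B, pts[i, :2] - tri[0, :2])
            if lam.min() > 1e-9 and lam.sum() < 1 - 1e-9:
                visible[i] = False
    return visible
```

Applied to every sampled rotation, the mask realizes the split of eq. (7) into visible and hidden points.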

Based on the above HPR procedure, which infers the visibility of points mainly from triangle information (surface orientation and relative depth), we process the 2D facial shapes obtained while rotating and projecting a 3D shape. The results are shown in Fig. 3.

Figure 3. Examples of the hidden points removal process.

Visible points ($S^v_{2 \times (n - n_h)}$) are depicted in blue, while occluded points ($S^h_{2 \times n_h}$) are plotted in red. As these figures show, we can successfully identify the visibility of the points. The HPR process in this paper is mainly used to identify commonly visible points for a given viewing direction; however, it can easily be applied to face alignment problems that need an automated scheme for detecting points and identifying their visibility simultaneously [8].

4. Generic 3D Pose Estimation Methods

In this section, we present two different types of shape-based methods for achieving generic 3D fine pose estimation: a pure geometry-based method and a shape-based method with a multivariate regression model. A modified version of the geometry-based method, which utilizes five points, is explained first. The evaluation of both methods is discussed in Section 5. For the shape-based method, we apply three different numbers of points (5, 50, and 79) with multivariate regression models. Since we are also interested in learning the relationships between pose parameters and shape changes simultaneously, multivariate regression models enable us to do so. Other approaches, such as SVR and NN, are also good candidates for learning the nonlinear relationship; however, we find that they are only suitable for coarse pose estimation and have difficulty learning multiple pose parameters jointly and simultaneously.

It is important to note that we focus on yaw and pitch estimation throughout the evaluation, since the roll angle can be obtained directly from the centers of the two eyes.

4.1. Modified Geometry-Based Method

In [6], the three DOF are calculated in a spherical coordinate system by estimating the facial normal. Based on a set of fixed distance ratios for generic people, the tilt and slant angles are computed for any viewing direction. Since several ambiguities exist in the original method [6], we re-interpret it with a more intuitive solution. The major difference in our approach is the use of the projected lengths of the facial normal along both axes. In addition, we fix the viewing direction $C$ and do not use the spherical coordinate system. In our work, we first correct the z-axis rotation based on the centers of the two eyes. A visual illustration of this de-rotation about the z axis is shown in Fig. 4.

Figure 4. Illustration of the de-rotation used to reliably compute the facial normal from a single face image.

Once the face axis, the line through the middle point of the two eyes and the middle point of the mouth corners, is retrieved, we take a predetermined fixed point on the face axis and draw the line from it through the nose tip. This line is the facial normal $n_f$, which carries the most important information about the 3D orientation of the face. However, decomposing the facial normal into pose parameters is not trivial, because we have to choose the fixed point, the point where the facial normal attaches, on the face axis. We describe this problem using the four points (fa, fb, fc, fd) shown in Fig. 4. fa and fd are the middle points between the two eyes and between the two mouth corners, respectively. The line through fa and fd is the face axis. fb is the point on the face axis orthogonal to the facial normal, while fc is calculated as

$$fc = (1 - t)\, fa + t\, fd \qquad (10)$$

where $t$ must be determined. The author of [6] chose $t$ empirically from the relative distance ratio

$$L_f = \frac{\| fb - fd \|}{\| fa - fd \|} \qquad (11)$$

where $\| \cdot \|$ denotes the Euclidean distance (for a frontal face fb coincides with fc, so this ratio determines $t$); the value used there was 0.40. However, the average ratio computed from our shape database is about 0.45, and there is significant variation around it. Besides this relative distance ratio, we consider the length of the facial normal, which is closely related to the relative nose height [6]. Unlike the original method, which uses a generic nose height directly [6], we compute the projected distances of the facial normal along the observed y axis and x axis:

$$dy = \| fb - fc \|, \qquad dx = \| n_f - fb \|_x \qquad (12)$$

where $\| \cdot \|_x$ is the Euclidean distance along the x direction. A visual illustration of these distances is given in Fig. 4. However, these projected distances require ground truth, i.e., they vary from person to person. A common average 3D shape, obtained from our 3D shape database, and its projected distances in 2D are good candidates for this purpose; the relationships between the generic nose height and the projected distances in both yaw and pitch should therefore be utilized. Eq. (12) is then normalized by the mean face-axis length and rewritten as

$$dy = \frac{\| fb - fc \|}{\| fa - fd \|}, \qquad dx = \frac{\| n_f - fb \|_x}{\| fa - fd \|}. \qquad (13)$$

The generic relative nose length ratio is

$$l_n = \frac{L_n}{\| fa - fd \|} \qquad (14)$$

where fa and fd are obtained from the average 3D shape, $L_n$ is the length of the average nose height, and $l_n$ is the nose length normalized by the average length of the face axis. Using our 3D database, we set $l_n$ to 0.50, which differs from the value chosen in [6]. Finally, the pitch and yaw angles are computed from the observed relative distances divided by the maximum nose height $l_n$:

$$\theta_x = \arcsin\left( \frac{dy}{l_n} \right), \qquad \theta_y = \arcsin\left( \frac{dx}{l_n} \right). \qquad (15)$$

The intuition behind this equation is that larger rotations lead to larger projected distances, bounded by the maximum nose height. For a 90-degree rotation in yaw, for example, the projected distance along the x axis reaches its maximum, the average nose height. A visual summary of the proposed modified geometry-based method is given in Fig. 5, and a code sketch follows at the end of this subsection.

Figure 5. Summary of the proposed modified geometry-based method for a generic 3D fine pose estimator.

However, both the original and the modified geometry-based methods may suffer for shapes that deviate significantly from the mean: there is substantial variation around the mean, and for people with completely different facial shapes (including nose height), reliable performance cannot be guaranteed.

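A minimal sketch of this five-point estimator. The function name is ours; the constants follow the ratios quoted above; eq. (10) is instantiated with the reading that fb coincides with fc for a frontal face; and the recovered pitch/yaw are magnitudes whose signs can be restored from the side of the face axis on which the nose tip falls.

```python
import numpy as np

L_F = 0.45  # mean relative position of fb on the face axis (our database)
L_N = 0.50  # mean relative nose length l_n (our database)

def pose_from_5_points(eye_l, eye_r, nose, mouth_l, mouth_r):
    """Estimate (pitch, yaw, roll) in degrees from five 2D points,
    following the modified geometry-based method (eqs. 10-15).
    Pitch and yaw are returned as magnitudes (see note above)."""
    eye_l, eye_r = np.asarray(eye_l, float), np.asarray(eye_r, float)
    nose = np.asarray(nose, float)
    fa = (eye_l + eye_r) / 2.0                       # midpoint of the eyes
    fd = (np.asarray(mouth_l, float) + np.asarray(mouth_r, float)) / 2.0
    # Roll from the eye line; de-rotate fd and the nose tip about fa.
    roll = np.arctan2(eye_r[1] - eye_l[1], eye_r[0] - eye_l[0])
    Rz = np.array([[np.cos(-roll), -np.sin(-roll)],
                   [np.sin(-roll),  np.cos(-roll)]])
    fd, nose = Rz @ (fd - fa) + fa, Rz @ (nose - fa) + fa
    axis_len = np.linalg.norm(fd - fa)
    axis_dir = (fd - fa) / axis_len                  # face axis direction
    fc = fa + (1.0 - L_F) * (fd - fa)                # eq. (10): ||fc-fd|| = L_F ||fa-fd||
    fb = fa + ((nose - fa) @ axis_dir) * axis_dir    # foot of the nose tip on the axis
    dy = np.linalg.norm(fb - fc) / axis_len          # eq. (13), axial offset
    dx = abs(nose[0] - fb[0]) / axis_len             # eq. (13), lateral offset
    pitch = np.degrees(np.arcsin(np.clip(dy / L_N, -1.0, 1.0)))  # eq. (15)
    yaw = np.degrees(np.arcsin(np.clip(dx / L_N, -1.0, 1.0)))
    return pitch, yaw, np.degrees(roll)
```
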
We therefore present an alternative way of achieving generic fine 3D pose estimation in the next section.

4.2. Multivariate Linear Regression Model

In this section, we utilize the shape information alone (the locations of the points), without computing any geometry between the points, for pose estimation. As in the geometry-based method, a generic mean 3D shape is used. We employ a multivariate linear regression model, which enables us to learn multiple pose parameters (yaw and pitch) simultaneously through a single regression matrix. Furthermore, we can easily increase the number of points to improve the performance of the regression model. An overview of the proposed approach, using different numbers of points with multivariate linear regression models, is shown in Fig. 6.

Figure 6. Overview of the multivariate regression model for pose estimation with varying numbers of points.

The major concern with this approach is whether a single generic 3D shape and its 2D projected shapes are suitable for learning relationships that generalize to a variety of people with different shapes. We provide an answer to this question below.

The overall procedure of the proposed pose estimator with a multivariate linear regression model can be divided into training and testing stages. In the training stage, we need to learn the relationship between the rotation parameters and the 2D projected shapes. The relationship involving $R_{\theta_y}$ and $R_{\theta_x}$ must be computed by regression analysis (we do not model $\theta_z$, since $\theta_z$ can be obtained directly from the centers of the two eyes). Eq. (3) then becomes:

$$S_{2d} = P R_{\theta_y} R_{\theta_x} S. \qquad (16)$$

We store the centered and energy-normalized 2D shape projection vectors $f_{\theta_y \theta_x}$ over the pose parameters ($\theta_y$ and $\theta_x$) into $F_{\theta_y \theta_x}$, and the corresponding parameter vectors $r_{\theta_y \theta_x}$ into $R_{\theta_y \theta_x}$, respectively. We write $f_{\theta_y \theta_x}$ as:

$$f_{\theta_y \theta_x} = \frac{s_{1d_{2n}}(\theta_y, \theta_x) - \bar{s}_{1d_{2n}}(\theta_y, \theta_x)}{\left\| s_{1d_{2n}}(\theta_y, \theta_x) - \bar{s}_{1d_{2n}}(\theta_y, \theta_x) \right\|_2} \qquad (17)$$

where $s_{1d_{2n}}$ is the vectorized version of $S_{2d}$ and $\bar{s}_{1d_{2n}}$ denotes its mean. The relationship between the 3D pose parameters and the 2D shapes can then be modeled by multivariate regression analysis:

$$M_{\theta_y \theta_x} = R_{\theta_y \theta_x} F^T_{\theta_y \theta_x} \left( F_{\theta_y \theta_x} F^T_{\theta_y \theta_x} + \epsilon I \right)^{-1} \qquad (18)$$

where $\epsilon$ is the regularization parameter and $I$ is the identity matrix. In the testing stage, we estimate the rotation parameters for a given 2D input shape. This is done simply by:

$$(\theta_y, \theta_x)^T = M_{\theta_y \theta_x} \, s_{1d_{2n}} \qquad (19)$$

where $s_{1d_{2n}}$ denotes the centered and energy-normalized input 2D shape.

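A minimal sketch of the training and testing stages (eqs. 17-19). The zeroing of hidden points anticipates the handling described in Section 5, and the regularization value and function names are illustrative.

```python
import numpy as np

def normalize_shape(s, hidden=None):
    """Center and energy-normalize a vectorized 2D shape (eq. 17).
    Hidden coordinates are zeroed first, as done in Section 5."""
    s = np.asarray(s, float).copy()
    if hidden is not None:
        s[hidden] = 0.0
    s -= s.mean()
    return s / np.linalg.norm(s)

def train_regressor(F, R, eps=1e-3):
    """Learn the regression matrix M of eq. (18).
    F: (2n, k) normalized training shapes, one per column;
    R: (2, k) corresponding (yaw, pitch) parameters."""
    return R @ F.T @ np.linalg.inv(F @ F.T + eps * np.eye(F.shape[0]))

def predict_pose(M, s, hidden=None):
    """Estimate (yaw, pitch) for one vectorized 2D input shape (eq. 19)."""
    return M @ normalize_shape(s, hidden)
```
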
5. Experimental Results

Since our goal is generic 3D fine pose estimation, we utilize only a set of fixed ratios and the 2D projected shapes obtained from a single global 3D shape model; we expect that a person-specific 3D shape model would yield an even more accurate pose estimator. We evaluate the two shape-based methods in this section. In order to test generalization, we use only an average model, obtained from the first 100 3D shapes in the MPIE database [17]; the remaining 3D shapes and their 2D projections are used for testing throughout the evaluations. The main advantage of the geometry-based method is that it has no training stage. For the point-based method, we consider three different sets of points and train a corresponding regression model for each. Using an average 3D shape, we generate 1,500 2D projected shapes at a 2-degree interval in each angle. The average errors tend to increase when input shapes are affected by large rotations, especially near the corners of the pose range ($\theta_y = \pm 40°$, $\theta_x = \pm 40°$). The average training and testing results are shown in Table 1.

Table 1. Performance comparison of pose estimation: mean squared errors (MSE) within -40 to 40 degrees in yaw and pitch, reported for the 5-point geometry method (testing only) and for the 5-, 50-, and 79-point regression models (training and testing).

We achieve similar performance for both 5-point approaches, with and without a regression model, in both angles. The best performance comes from the 50-point regression model, with reasonable generalization errors. The overall shape information of the important facial features, such as the shapes of the eyes, nose, and mouth, is an important factor in achieving reasonable performance. We conclude that 5-point methods may serve as an initial but reasonable pose estimator; however, applications that require high pose estimation accuracy should utilize a larger number of points and a set of generic shapes. Throughout the evaluation, we used the commonly visible points (out of the 79) under rotations of -40 to 40 degrees in yaw and pitch from the frontal view. Since some points are invisible under such rotations, we identify them using the proposed HPR approach.

It is worth mentioning that each multivariate regression model requires the same number of points in both the training and testing stages. We achieve this by setting the spatial locations of the invisible points to zero for each multivariate regression model; in this way, we avoid the effect of the hidden points on pose estimation in both stages. We expect that the proposed pose estimator can easily be extended to handle profile faces. One important task is to fix a set of points that are statistically observed in profile faces: about 46 points, obtained from our 3D shape database, can be utilized to handle right profile faces (50 to 100 degrees) and left profile faces (-50 to -100 degrees). Exactly the same procedure as in our pose estimator can then be applied to profile images, although in this paper we present results using frontal shapes within -40 to 40 degrees in yaw and pitch. It is important to note that we do not attempt to deal with face alignment; this paper focuses only on pose estimation. Since 3D face alignment is itself a challenging research topic [8], we expect that reducing the fitting errors in 3D facial alignment is the key step toward our generic 3D pose estimation. Furthermore, the HPR process is also a crucial task to consider in 3D face alignment.

6. Discussion and Future Work

In this paper, we have presented two different ways of achieving generic 3D fine pose estimation by utilizing facial shapes. We have shown that sparse shape information of human faces is a crucial element for generic pose estimation at a fine angle interval. It is our ongoing work to develop a robust 3D face alignment method to accompany the proposed pose estimator, which would make it operate in a fully automated manner.

References

[1] B. Mederos, N. Amenta, L. Velho, and L. H. de Figueiredo. Surface reconstruction from noisy point clouds. Eurographics Symposium on Geometry Processing, pages 53-62, 2005.
[2] D. Beymer. Face recognition under varying pose. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1994.
[3] C. M. Bishop. Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 1995.
[4] D. Cristinacce and T. Cootes. Automatic feature localisation with constrained local models. Pattern Recognition, 41(10), 2008.
[5] D. Cohen-Or, Y. Chrysanthou, C. Silva, and F. Durand. A survey of visibility for walkthrough applications. IEEE Transactions on Visualization and Computer Graphics, 9(3), 2003.
[6] A. Gee and R. Cipolla. Determining the gaze of faces in images. Image and Vision Computing, 12(10), 1994.
[7] A. Gee and R. Cipolla. 3D pose estimation of the face from video. In Face Recognition: From Theory to Applications, NATO ASI Series F, Springer-Verlag, 1998.
[8] L. Gu and T. Kanade. 3D alignment of face in a single image. In Proc. of IEEE Int'l Conf. on Computer Vision and Pattern Recognition, 2006.
[9] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2004.
[10] Q. Ji and R. Hu. 3D face pose estimation and tracking from a monocular camera. Image and Vision Computing, 20(7), 2002.
[11] M. Jones and P. Viola. Fast multi-view face detection. Mitsubishi Electric Research Laboratories, MERL-TR, 2003.
[12] S. Milborrow and F. Nicolls. Locating facial features with an extended active shape model. In Proc. of the European Conf. on Computer Vision (ECCV), 2008.
[13] E. Murphy-Chutorian and M. M. Trivedi. Head pose estimation in computer vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(4):607-626, 2009.
[14] N. Greene, M. Kass, and G. Miller. Hierarchical z-buffer visibility. SIGGRAPH, 1993.
[15] N. Krüger, M. Pötzsch, and C. von der Malsburg. Determination of face position and pose with a learned representation based on labeled graphs. Image and Vision Computing, 15(8), 1997.
[16] S. Niyogi and W. Freeman. Example-based head tracking. In Proc. IEEE Int'l Conf. on Automatic Face and Gesture Recognition, 1996.
[17] R. Gross, I. Matthews, J. Cohn, T. Kanade, and S. Baker. Multi-PIE. In Proc. of Int'l Conf. on Automatic Face and Gesture Recognition, 2008.
[18] S. Rusinkiewicz and M. Levoy. QSplat: A multiresolution point rendering system for large meshes. SIGGRAPH, 2000.
[19] S. Katz, A. Tal, and R. Basri. Direct visibility of point sets. ACM Transactions on Graphics, 26(3), 2007.
[20] M. Sainz and R. Pajarola. Point-based rendering techniques. Computers & Graphics, 28(6), 2004.
[21] T. Cootes, G. Edwards, and C. Taylor. Active appearance models. In Proc. of the European Conf. on Computer Vision, volume 2, 1998.
[22] T. Cootes, K. Walker, and C. Taylor. View-based active appearance models. In Proc. IEEE Int'l Conf. on Automatic Face and Gesture Recognition, 2000.
[23] T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham. Active shape models: Their training and application. Computer Vision and Image Understanding, 61(1):38-59, 1995.
[24] C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. Int'l Journal of Computer Vision, 9(2):137-154, 1992.
[25] J. Wu and M. M. Trivedi. A two-stage head pose estimation framework and evaluation. Pattern Recognition, 41(3), 2008.
[26] Y. Li, S. Gong, J. Sherrah, and H. Liddell. Support vector machine based multi-view face detection and recognition. Image and Vision Computing, 22(5), 2004.


More information

Mei Han Takeo Kanade. January Carnegie Mellon University. Pittsburgh, PA Abstract

Mei Han Takeo Kanade. January Carnegie Mellon University. Pittsburgh, PA Abstract Scene Reconstruction from Multiple Uncalibrated Views Mei Han Takeo Kanade January 000 CMU-RI-TR-00-09 The Robotics Institute Carnegie Mellon University Pittsburgh, PA 1513 Abstract We describe a factorization-based

More information

An idea which can be used once is a trick. If it can be used more than once it becomes a method

An idea which can be used once is a trick. If it can be used more than once it becomes a method An idea which can be used once is a trick. If it can be used more than once it becomes a method - George Polya and Gabor Szego University of Texas at Arlington Rigid Body Transformations & Generalized

More information

arxiv: v1 [cs.cv] 2 May 2016

arxiv: v1 [cs.cv] 2 May 2016 16-811 Math Fundamentals for Robotics Comparison of Optimization Methods in Optical Flow Estimation Final Report, Fall 2015 arxiv:1605.00572v1 [cs.cv] 2 May 2016 Contents Noranart Vesdapunt Master of Computer

More information

Passive driver gaze tracking with active appearance models

Passive driver gaze tracking with active appearance models Carnegie Mellon University Research Showcase @ CMU Robotics Institute School of Computer Science 2004 Passive driver gaze tracking with active appearance models Takahiro Ishikawa Carnegie Mellon University

More information

Structure from Motion

Structure from Motion 11/18/11 Structure from Motion Computer Vision CS 143, Brown James Hays Many slides adapted from Derek Hoiem, Lana Lazebnik, Silvio Saverese, Steve Seitz, and Martial Hebert This class: structure from

More information

Occlusion Robust Multi-Camera Face Tracking

Occlusion Robust Multi-Camera Face Tracking Occlusion Robust Multi-Camera Face Tracking Josh Harguess, Changbo Hu, J. K. Aggarwal Computer & Vision Research Center / Department of ECE The University of Texas at Austin harguess@utexas.edu, changbo.hu@gmail.com,

More information

Eye Detection by Haar wavelets and cascaded Support Vector Machine

Eye Detection by Haar wavelets and cascaded Support Vector Machine Eye Detection by Haar wavelets and cascaded Support Vector Machine Vishal Agrawal B.Tech 4th Year Guide: Simant Dubey / Amitabha Mukherjee Dept of Computer Science and Engineering IIT Kanpur - 208 016

More information

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University. 3D Computer Vision Structured Light II Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Introduction

More information

Vehicle Dimensions Estimation Scheme Using AAM on Stereoscopic Video

Vehicle Dimensions Estimation Scheme Using AAM on Stereoscopic Video Workshop on Vehicle Retrieval in Surveillance (VRS) in conjunction with 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance Vehicle Dimensions Estimation Scheme Using

More information

Textureless Layers CMU-RI-TR Qifa Ke, Simon Baker, and Takeo Kanade

Textureless Layers CMU-RI-TR Qifa Ke, Simon Baker, and Takeo Kanade Textureless Layers CMU-RI-TR-04-17 Qifa Ke, Simon Baker, and Takeo Kanade The Robotics Institute Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA 15213 Abstract Layers are one of the most well

More information

A face recognition system based on local feature analysis

A face recognition system based on local feature analysis A face recognition system based on local feature analysis Stefano Arca, Paola Campadelli, Raffaella Lanzarotti Dipartimento di Scienze dell Informazione Università degli Studi di Milano Via Comelico, 39/41

More information

Module 4F12: Computer Vision and Robotics Solutions to Examples Paper 2

Module 4F12: Computer Vision and Robotics Solutions to Examples Paper 2 Engineering Tripos Part IIB FOURTH YEAR Module 4F2: Computer Vision and Robotics Solutions to Examples Paper 2. Perspective projection and vanishing points (a) Consider a line in 3D space, defined in camera-centered

More information

REAL-TIME FACE SWAPPING IN VIDEO SEQUENCES: MAGIC MIRROR

REAL-TIME FACE SWAPPING IN VIDEO SEQUENCES: MAGIC MIRROR REAL-TIME FACE SWAPPING IN VIDEO SEQUENCES: MAGIC MIRROR Nuri Murat Arar1, Fatma Gu ney1, Nasuh Kaan Bekmezci1, Hua Gao2 and Hazım Kemal Ekenel1,2,3 1 Department of Computer Engineering, Bogazici University,

More information

Silhouette-based Multiple-View Camera Calibration

Silhouette-based Multiple-View Camera Calibration Silhouette-based Multiple-View Camera Calibration Prashant Ramanathan, Eckehard Steinbach, and Bernd Girod Information Systems Laboratory, Electrical Engineering Department, Stanford University Stanford,

More information

Using temporal seeding to constrain the disparity search range in stereo matching

Using temporal seeding to constrain the disparity search range in stereo matching Using temporal seeding to constrain the disparity search range in stereo matching Thulani Ndhlovu Mobile Intelligent Autonomous Systems CSIR South Africa Email: tndhlovu@csir.co.za Fred Nicolls Department

More information

Structure from Motion and Multi- view Geometry. Last lecture

Structure from Motion and Multi- view Geometry. Last lecture Structure from Motion and Multi- view Geometry Topics in Image-Based Modeling and Rendering CSE291 J00 Lecture 5 Last lecture S. J. Gortler, R. Grzeszczuk, R. Szeliski,M. F. Cohen The Lumigraph, SIGGRAPH,

More information

Vision Review: Image Formation. Course web page:

Vision Review: Image Formation. Course web page: Vision Review: Image Formation Course web page: www.cis.udel.edu/~cer/arv September 10, 2002 Announcements Lecture on Thursday will be about Matlab; next Tuesday will be Image Processing The dates some

More information

Structure from Motion

Structure from Motion Structure from Motion Outline Bundle Adjustment Ambguities in Reconstruction Affine Factorization Extensions Structure from motion Recover both 3D scene geoemetry and camera positions SLAM: Simultaneous

More information

On the Dimensionality of Deformable Face Models

On the Dimensionality of Deformable Face Models On the Dimensionality of Deformable Face Models CMU-RI-TR-06-12 Iain Matthews, Jing Xiao, and Simon Baker The Robotics Institute Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA 15213 Abstract

More information

Agenda. Rotations. Camera models. Camera calibration. Homographies

Agenda. Rotations. Camera models. Camera calibration. Homographies Agenda Rotations Camera models Camera calibration Homographies D Rotations R Y = Z r r r r r r r r r Y Z Think of as change of basis where ri = r(i,:) are orthonormal basis vectors r rotated coordinate

More information

TEXTURE OVERLAY ONTO NON-RIGID SURFACE USING COMMODITY DEPTH CAMERA

TEXTURE OVERLAY ONTO NON-RIGID SURFACE USING COMMODITY DEPTH CAMERA TEXTURE OVERLAY ONTO NON-RIGID SURFACE USING COMMODITY DEPTH CAMERA Tomoki Hayashi 1, Francois de Sorbier 1 and Hideo Saito 1 1 Graduate School of Science and Technology, Keio University, 3-14-1 Hiyoshi,

More information

CSE 252B: Computer Vision II

CSE 252B: Computer Vision II CSE 252B: Computer Vision II Lecturer: Serge Belongie Scribe: Sameer Agarwal LECTURE 1 Image Formation 1.1. The geometry of image formation We begin by considering the process of image formation when a

More information

Project Updates Short lecture Volumetric Modeling +2 papers

Project Updates Short lecture Volumetric Modeling +2 papers Volumetric Modeling Schedule (tentative) Feb 20 Feb 27 Mar 5 Introduction Lecture: Geometry, Camera Model, Calibration Lecture: Features, Tracking/Matching Mar 12 Mar 19 Mar 26 Apr 2 Apr 9 Apr 16 Apr 23

More information

FACE RECOGNITION USING INDEPENDENT COMPONENT

FACE RECOGNITION USING INDEPENDENT COMPONENT Chapter 5 FACE RECOGNITION USING INDEPENDENT COMPONENT ANALYSIS OF GABORJET (GABORJET-ICA) 5.1 INTRODUCTION PCA is probably the most widely used subspace projection technique for face recognition. A major

More information

Mysteries of Parameterizing Camera Motion - Part 1

Mysteries of Parameterizing Camera Motion - Part 1 Mysteries of Parameterizing Camera Motion - Part 1 Instructor - Simon Lucey 16-623 - Advanced Computer Vision Apps Today Motivation SO(3) Convex? Exponential Maps SL(3) Group. Adapted from: Computer vision:

More information

Full-Motion Recovery from Multiple Video Cameras Applied to Face Tracking and Recognition

Full-Motion Recovery from Multiple Video Cameras Applied to Face Tracking and Recognition Full-Motion Recovery from Multiple Video Cameras Applied to Face Tracking and Recognition Josh Harguess, Changbo Hu, J. K. Aggarwal Computer & Vision Research Center / Department of ECE The University

More information

Compositing a bird's eye view mosaic

Compositing a bird's eye view mosaic Compositing a bird's eye view mosaic Robert Laganiere School of Information Technology and Engineering University of Ottawa Ottawa, Ont KN 6N Abstract This paper describes a method that allows the composition

More information

3D Morphable Model Parameter Estimation

3D Morphable Model Parameter Estimation 3D Morphable Model Parameter Estimation Nathan Faggian 1, Andrew P. Paplinski 1, and Jamie Sherrah 2 1 Monash University, Australia, Faculty of Information Technology, Clayton 2 Clarity Visual Intelligence,

More information

Real Time Face Tracking and Pose Estimation Using an Adaptive Correlation Filter for Human-Robot Interaction

Real Time Face Tracking and Pose Estimation Using an Adaptive Correlation Filter for Human-Robot Interaction Real Time Face Tracking and Pose Estimation Using an Adaptive Correlation Filter for Human-Robot Interaction Vo Duc My and Andreas Zell Abstract In this paper, we present a real time algorithm for mobile

More information

Face Alignment Across Large Poses: A 3D Solution

Face Alignment Across Large Poses: A 3D Solution Face Alignment Across Large Poses: A 3D Solution Outline Face Alignment Related Works 3D Morphable Model Projected Normalized Coordinate Code Network Structure 3D Image Rotation Performance on Datasets

More information

Ruch (Motion) Rozpoznawanie Obrazów Krzysztof Krawiec Instytut Informatyki, Politechnika Poznańska. Krzysztof Krawiec IDSS

Ruch (Motion) Rozpoznawanie Obrazów Krzysztof Krawiec Instytut Informatyki, Politechnika Poznańska. Krzysztof Krawiec IDSS Ruch (Motion) Rozpoznawanie Obrazów Krzysztof Krawiec Instytut Informatyki, Politechnika Poznańska 1 Krzysztof Krawiec IDSS 2 The importance of visual motion Adds entirely new (temporal) dimension to visual

More information

Multiple View Geometry

Multiple View Geometry Multiple View Geometry CS 6320, Spring 2013 Guest Lecture Marcel Prastawa adapted from Pollefeys, Shah, and Zisserman Single view computer vision Projective actions of cameras Camera callibration Photometric

More information

Face View Synthesis Across Large Angles

Face View Synthesis Across Large Angles Face View Synthesis Across Large Angles Jiang Ni and Henry Schneiderman Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 1513, USA Abstract. Pose variations, especially large out-of-plane

More information