Registration of Expressions Data using a 3D Morphable Model

Curzio Basso
DISI, Università di Genova, Italy
Email: curzio.basso@disi.unige.it

Thomas Vetter
Departement Informatik, Universität Basel, Switzerland
Email: thomas.vetter@unibas.ch

Abstract

The registration of 3D scans of faces is a key step for many applications, in particular for building 3D Morphable Models. Although a number of algorithms are already available for registering data with neutral expression, the registration of scans with arbitrary expressions is typically performed under the assumption of a known, fixed identity. We present a novel algorithm which removes this restriction, making it possible to register 3D scans of faces with arbitrary identity and expression. Furthermore, our algorithm can process incomplete data, yielding results which are both continuous and with low reconstruction error. Even in the case of complete, expression-less data, our method can yield better results than previous algorithms, thanks to an adaptive smoothing which regularizes the resulting surface only where the estimated correspondence is unreliable.

Index Terms: 3D morphable models, 3D registration, facial expressions, surface reconstruction.

I. INTRODUCTION

The registration of 3D scans of human faces is a key step in their processing for many applications. We present an algorithm closely related to the methods using a regularized energy minimization [1]-[3]. This is a common approach, since the regularization term provides the advantage of handling missing data and inconsistencies in the correspondence. Typically the correspondence is derived from a manually-defined sparse correspondence [4], [5] or with an Iterative Closest Point (ICP) algorithm [3]. We estimate a dense correspondence following an approach similar to [6], [7]. This results in a more accurate registration, and the correspondence is also used to determine the local importance of the regularization term. With respect to previous methods, our registration algorithm presents three novelties.

Unified Processing.
Although some very efficient methods for registering 3D scans of human faces have already been published [2], [5], [6], [8]-[10], data with varying identities and data with varying expressions are typically treated separately. That is, either a neutral expression is assumed and the identity varies, or the identity of the subject is fixed and the goal is to register her different facial expressions. An exception is the method in [11], which however needs around 70 user-defined landmark points. Our algorithm can be applied to 3D face scans with arbitrary identity and expressions, which makes it suitable for applications where no such prior knowledge is available (e.g. recognition).

This work is based on "Registration of Expressions Data using a 3D Morphable Model", by C. Basso, P. Paysan and T. Vetter, which appeared in the Proceedings of the 7th IEEE Int. Conf. on Automatic Face and Gesture Recognition, Southampton, UK, April 2006. © 2006 IEEE. It was supported in part by the Swiss National Science Foundation in the scope of the NCCR CO-ME project 5005-66380.

Reconstruction of Missing Data. The input data of the registration algorithm is typically incomplete. In previous methods this problem is either not considered or it is addressed from a purely geometric point of view, a clear drawback if the results of the registration have to be used to build statistical models of human faces. In our registration algorithm the reconstruction of the missing areas takes into account not only its geometric properties but also its likelihood with respect to already available data.

Robustness. A further novelty of our algorithm is related to the estimation of correspondence. In methods which regularize the result ([1]-[3]), the relative importance of the correspondence in the registration process, for instance with respect to the smoothness of the registration result, is globally fixed. This might either result in loss of information, or introduce errors in the registration results.
By setting the relative importance of the correspondence locally, our algorithm retains as much correspondence information as possible, while at the same time being robust with respect to errors in its estimation.

II. 3D MORPHABLE MODELS

Before describing the registration algorithm, we review the concept of the 3D Morphable Model (3DMM) and show how the 3DMM is extended to handle both identity and expression as separate sources of variation. Recall that the shape of a 3D mesh is completely defined by the positions $(x_i, y_i, z_i)$ of its $n$ vertices, and can therefore be represented by an $n \times 3$ matrix

$$S = \begin{pmatrix} x_1 & y_1 & z_1 \\ \vdots & \vdots & \vdots \\ x_n & y_n & z_n \end{pmatrix}, \tag{1}$$

or alternatively as a $3n$-dimensional column vector obtained by flattening the matrix:

$$s = \mathrm{vec}(S) = (x_1, y_1, z_1, \ldots, x_n, y_n, z_n)^T. \tag{2}$$

A similar representation can be used for the texture of the 3D mesh. Denoting by $(r_i, g_i, b_i)$ the color of the $i$-th vertex, the full texture is stored either in an $n \times 3$ matrix $T$ or in a vector

$$t = \mathrm{vec}(T) = (r_1, g_1, b_1, \ldots, r_n, g_n, b_n)^T. \tag{3}$$

In this section we develop the model of the shape, using the vector representation. The texture model is obtained following the same procedure.

Three-dimensional morphable models are built under the assumption that a shape vector $s$ can be approximated by a linear Gaussian model, defined by a mean vector $\bar{s} \in \mathbb{R}^{3n}$ and a generative matrix $C \in \mathbb{R}^{3n \times k}$ with $k < 3n$:

$$s = \bar{s} + C \alpha + \epsilon, \tag{4}$$

where $\epsilon \in \mathbb{R}^{3n}$ is the approximation error. The vector $\alpha \in \mathbb{R}^{k}$ holds the latent variables of the model, and both $\alpha$ and $\epsilon$ follow a Gaussian distribution with zero mean and diagonal covariance:

$$p(\alpha) = \mathcal{N}(0, I) \quad \text{and} \quad p(\epsilon) = \mathcal{N}(0, \sigma^2 I). \tag{5}$$

The model parameters $\bar{s}$, $C$ and $\sigma$ can be estimated by maximizing the likelihood of a training set of example shapes $s_1, \ldots, s_m$. We report here only the solution of the maximization, and refer to [12] for details. The maximum-likelihood estimator of the model average $\bar{s}$ is given by the sample mean

$$\bar{s} = \frac{1}{m} \sum_{i=1}^{m} s_i. \tag{6}$$

The generative matrix $C$ and the noise variance $\sigma^2$ can be estimated through the Singular Value Decomposition (SVD) of the centered data matrix:

$$A = (s_1 - \bar{s}, \ldots, s_m - \bar{s}) \in \mathbb{R}^{3n \times m} \tag{7}$$

$$\phantom{A} = U W V^T. \tag{8}$$

Recall that $U$ is a column-orthogonal matrix ($U^T U = I$) whose columns hold the normalized principal components of the data sample, and that $W$ is a diagonal matrix with elements $w_i$. Retaining only the first $k$ principal components, the optimal estimates of $C$ and $\sigma^2$ are given by

$$\sigma^2 = \frac{1}{(m-1)(3n-k)} \sum_{i=k+1}^{m-1} w_i^2, \tag{9}$$

$$C = U_k \left( \Lambda_k - \sigma^2 I \right)^{1/2}, \tag{10}$$

where $U_k$ is the $3n \times k$ matrix holding the first $k$ columns of $U$, and $\Lambda_k$ is the $k \times k$ diagonal matrix with elements $w_i^2/(m-1)$.
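The estimation in eqs. (6)-(10) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `fit_linear_gaussian_model` and the one-column-per-example layout are hypothetical, and the clamp inside the square root is a numerical safeguard that is not part of the equations.

```python
import numpy as np

def fit_linear_gaussian_model(shapes, k):
    """Estimate mean, generative matrix C and noise variance sigma^2.

    shapes: (3n, m) array with one example shape per column (assumed layout).
    k:      number of principal components to retain, k <= m - 1.
    """
    d, m = shapes.shape
    mean = shapes.mean(axis=1, keepdims=True)            # eq. (6)
    A = shapes - mean                                     # eq. (7)
    U, w, _ = np.linalg.svd(A, full_matrices=False)       # eq. (8)
    # eq. (9): variance of the discarded components k+1 .. m-1
    sigma2 = np.sum(w[k:m - 1] ** 2) / ((m - 1) * (d - k))
    lam = w[:k] ** 2 / (m - 1)                            # diagonal of Lambda_k
    # eq. (10); the clamp only guards against round-off (not in the paper)
    C = U[:, :k] * np.sqrt(np.maximum(lam - sigma2, 0.0))
    return mean, C, sigma2
```

With `k = m - 1` the noise variance is zero and `C @ C.T` reproduces the sample covariance, which matches the PCA case.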
When all the components are retained (the case $k = m - 1$), this model is equivalent to PCA; however, when some of the higher components $u_i$ are discarded, their contribution to the sample variance accumulates in the model noise and scales down the remaining components (while in PCA their magnitudes stay constant).

A. Combined Model

In order to model expressions and identity as separate sources of variation, we assume that a generic face shape is the sum of an identity vector and an expression vector:

$$s = s_{id} + s_{xp}, \tag{11}$$

while the face texture depends only on the identity. The vectors $s_{id}$ and $s_{xp}$ hold respectively the face shape with neutral expression and the displacements of the vertices due to the expression; assigning them separate linear Gaussian models we obtain:

$$s = \bar{s}_{id} + C_{id}\, \alpha_{id} + \bar{s}_{xp} + C_{xp}\, \alpha_{xp} + \epsilon, \tag{12}$$

with the usual Gaussian prior for the latent variables $\alpha_{id}$, $\alpha_{xp}$ and the noise $\epsilon$. Clearly, once the model parameters are fixed, this is equivalent to the model of eq. (4), with the only difference that the matrix $C = [C_{id}\; C_{xp}]$ is not column-orthogonal.

In order to learn the distinct model parameters for the identity and expression components we use two training sets. A first set of examples with neutral expression and varying identity is used to estimate the identity parameters $\bar{s}_{id}$ and $C_{id}$, as outlined in the previous section. The expression parameters $\bar{s}_{xp}$ and $C_{xp}$ are estimated from a second set of expression examples, acquired from $p$ different persons. Given the $i$-th individual, we have its neutral expression $n_i$ and $m_i$ examples $s_{ij}$, from which we build a matrix

$$B_i = (s_{i1} - n_i, \ldots, s_{i m_i} - n_i) \in \mathbb{R}^{3n \times m_i}. \tag{13}$$

All the person-specific matrices $B_i$ are then put together into a matrix

$$B = (B_1 \ldots B_p) \in \mathbb{R}^{3n \times \sum_i m_i}. \tag{14}$$

The average expression $\bar{s}_{xp}$ is computed as the mean of the columns of $B$, which is then recentered and decomposed by SVD to obtain the matrix $C_{xp}$ as in eq. (10).

III. REGISTRATION ALGORITHM

In the previous section, we introduced the 3D Morphable Model and the procedure used to learn its parameters.
In doing this, we tacitly assumed that the training data were already registered with a reference example, an essential requirement to ensure that they approximately lie on a linear subspace. In this section we address the registration problem. We start by describing in section III-A the procedure required to preprocess the data. After preprocessing, the novel mesh is registered in three steps (see also the diagram of figure 2). First, we use the 3D Morphable Model defined in section II to approximate the input mesh. Then, we use an algorithm similar to the modified optical flow of [6] to estimate the correspondence between the approximation and the input mesh. The advantage of using the approximation in place of a fixed reference lies in the assumptions on which the optical flow algorithm relies. As explained in section III-C, the algorithm tries to estimate the correspondence under the assumption that corresponding pixels have the same intensity in both views. In applications like ours this assumption does not hold, and computing the approximation of the input is a way to overcome the problem. In the final step, the shape is registered by solving an optimization problem. For the registration of the texture this last step is different, and it is described in section III-E.

Note that if we are starting from scratch, no model will be available; in this case, we perform the first step using an artificial model built from two hand-made meshes with identical shape and texture apart from the mouth position, closed in one mesh and open in the other. As soon as enough examples have been registered, they can be used to build a model which replaces the initial one.

A. Preprocessing

Initially the shape of a novel mesh is defined in terms of the coordinate system of the particular acquisition device used. As a preliminary step, we coarsely align the novel mesh to the reference by finding the optimal rigid transformation, in a least-squares sense, which maps a few (less than 10) manually-placed landmarks of the novel mesh to the corresponding points in the reference. We use the algorithm presented in [13], which we briefly describe here. Denoting by $L_S$ and $L_R$ the two $N \times 3$ matrices holding the positions of the $N$ landmarks in the novel and reference meshes, we first compute the mean positions $\bar{l}_S$ and $\bar{l}_R$ of the two groups of landmarks. After SVD of the matrix

$$M = (L_R - \bar{l}_R)^T (L_S - \bar{l}_S) \tag{15}$$

$$\phantom{M} = U_M W_M V_M^T, \tag{16}$$

the optimal rotation is estimated as $R = U_M S_M V_M^T$, where $S_M$ is a diagonal $3 \times 3$ matrix defined as

$$S_M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \mathrm{sgn}\,|M| \end{pmatrix}.$$

The translation is then estimated as $t = \bar{l}_R - R\, \bar{l}_S$.
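The landmark-based alignment of eqs. (15)-(16) can be sketched as follows. The function name `rigid_align` is hypothetical, and the sign is computed from $\det(U_M V_M^T)$ rather than from $|M|$ directly; the two agree whenever $M$ is non-singular, so this is a numerical convenience and an assumption of the sketch.

```python
import numpy as np

def rigid_align(L_S, L_R):
    """Least-squares rotation R and translation t mapping L_S onto L_R.

    L_S, L_R: (N, 3) arrays of corresponding landmark positions in the
    novel and reference meshes.
    """
    l_S = L_S.mean(axis=0)
    l_R = L_R.mean(axis=0)
    M = (L_R - l_R).T @ (L_S - l_S)          # eq. (15), a 3 x 3 matrix
    U, _, Vt = np.linalg.svd(M)              # eq. (16)
    # S_M prevents a reflection; det(U V^T) is +1 or -1 (assumed equivalent
    # to sgn|M| for non-singular M)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ S @ Vt
    t = l_R - R @ l_S
    return R, t
```

A vertex position `x` of the novel mesh is then moved to `R @ x + t`.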
After the coarse alignment, we compute a 2D representation of the novel mesh by cylindrical projection. The aligned position $w = (w_x, w_y, w_z)$ of each vertex is transformed into cylindrical coordinates:

$$\phi = \arctan\frac{w_x}{w_z} + \pi, \qquad y = w_y, \qquad \rho = (w_x^2 + w_z^2)^{1/2}.$$

Figure 1. Projection of a 3D mesh to a 2D domain. The top image is the projection of the texture, while the bottom image is a grey-coded representation of the range values. The range values are enough to recover the geometry, since the image coordinates provide the angle $\phi$ and the height $y$. The black areas in both images are the void pixels.

The cylindrical coordinates $(\phi, y)$ are used to define a point $(u, v)$ in a domain $[0, W] \times [0, H] \subset \mathbb{R}^2$ by means of the following transformation:

$$(u, v) = \left( \frac{W}{2\pi}\, \phi,\; \frac{H}{2} - k y \right),$$

where the factor $k$ influences the vertical resolution of the 2D domain. Following this transformation each vertex of the novel mesh is mapped to a point in the 2D domain $[0, W] \times [0, H]$, which is therefore partially covered by the mapping of the mesh triangles. The areas of the domain which are not covered by the triangles are void regions, in the sense that they do not represent a surface. Each non-void pixel of the 2D domain corresponds to a point of the mesh surface, and its shape and color can be computed by interpolating the values at the vertices of the triangle containing it. The resulting 2D representation (an example is shown in figure 1) is an image-like structure storing 4 scalars at each pixel: the three texture colors plus the value of $\rho$ ($\phi$ and $y$ can be computed from the pixel position).

In the following, we denote by $I$ this $4 \times W \times H$ structure holding the shape and color information of the novel mesh, by $I_{uv}$ the vector holding the values of $I$ at the point $(u, v)$, and by $I_{uv,i}$ the $i$-th value of $I_{uv}$. Let us also define a norm on this type of structure as

$$\| I_{uv} \|_w = \left( \sum_i w_i\, I_{uv,i}^2 \right)^{1/2}, \tag{17}$$

where the weights $w_i$ take into account the fact that we are combining heterogeneous quantities.

Note that the use of a cylindrical projection as a 2D parameterization of the surface is, in some respects, suboptimal.
Certainly, it is not a bijective mapping from the 2D domain to the surface, due to the occlusions which might, and in most cases will, occur. However, although we lose some information during the cylindrical projection, we also gain the considerable advantage of having a pseudo-parameterization which is consistent over all the examples. This is of particular importance when fitting the 3DMM to an example.

To conclude, we must stress that the data obtained from the projection to 2D could still need an additional processing step, since they will typically include structures which could disturb the registration, e.g. clothes or hair. In the case of scans with open mouth, the information about the mouth interior, which is not explicitly modeled by our reference, is especially problematic. All these structures have to be removed at this stage.

Figure 2. Flow diagram of the registration method. (1) The morphable model is fitted to the novel mesh (b). (2) A correspondence is estimated via optical flow between the approximation (c) and the input. Using the correspondence, the input is resampled, yielding the incomplete surface (d). (3) The registration result (e) is obtained by minimizing an energy which depends on the resampling of the input. Each result of the registration increases the set of examples used to build the morphable model. © 2006 IEEE.

B. Approximation

The goal of this first step is to optimize the shape coefficients $\alpha$ and the texture coefficients $\beta$ of a given 3D Morphable Model so that its shape $S(\alpha)$ and texture $T(\beta)$ fit the input mesh. As well as optimizing for the shape and texture, we also optimize the pose of the model (specified by a vector of coefficients $\rho$) and a color transformation parameterized by a vector $\gamma$. The color transformation acting on the texture $T(\beta)$ is similar to a rigid transformation:

$$T'(\beta, \gamma) = T(\beta)\, M(\gamma)^T\, U(\gamma) + a(\gamma)^T.$$

The color offset $a(\gamma)$ plays the role of the translation, the diagonal matrix $U(\gamma)$ specifies the color gains, in analogy with scaling, while the contrast matrix $M(\gamma)$ acts as a rotation:

$$M(\gamma) = \gamma_1 I_3 + (1 - \gamma_1) \begin{pmatrix} 0.3 & 0.59 & 0.11 \\ 0.3 & 0.59 & 0.11 \\ 0.3 & 0.59 & 0.11 \end{pmatrix}.$$

If we denote the model parameters by $\theta = (\alpha, \beta, \rho, \gamma)$, the optimal values of $\theta$ are found by maximizing the posterior probability $P(\theta \mid I)$. This is equivalent to the minimization of its log-inverse:

$$-\log P(\theta \mid I) = -\log \frac{P(I \mid \theta)\, P(\theta)}{P(I)} = -\log P(I \mid \theta) - \log P(\theta) + \text{const.} = E_d(\theta) + E_p(\theta) + \text{const.},$$

where $E_d(\theta)$ is the approximation error and $E_p(\theta)$ accounts for the prior probabilities of the model coefficients. Denoting by $\tilde{I}$ the cylindrical projection of the approximation, the approximation error is defined as a sum over all the non-void pixels:

$$E_d(\theta) = \frac{1}{\sigma_I^2} \sum_{u,v} \| I_{uv} - \tilde{I}_{uv} \|_w^2,$$

where the noise variance $\sigma_I^2$ is set equal to the estimated model noise $\sigma^2$ (see eq. (9)), and $\| \cdot \|_w$ is the norm defined in eq. (17). The prior probabilities of the model parameters $\alpha$ and $\beta$ are Gaussian distributions with mean zero, and assuming a Gaussian distribution also for the geometric and color transformations, the term $E_p$ can be written as:

$$E_p(\theta) = \sum_i \frac{\alpha_i^2}{\sigma_{S,i}^2} + \sum_i \frac{\beta_i^2}{\sigma_{T,i}^2} + \sum_i \frac{(\rho_i - \mu_{R,i})^2}{\sigma_{R,i}^2} + \sum_i \frac{(\gamma_i - \mu_{C,i})^2}{\sigma_{C,i}^2},$$

where the coefficients $\sigma_{R,i}$ and $\sigma_{C,i}$ are empirically chosen.

The minimum of the cost function is found with a stochastic quasi-Newton descent as in [6]. The algorithm is stochastic in the sense that at each iteration $E_d$ is estimated on a random sample of the vertices by

$$\tilde{E}_d = \sum_{(u,v) \in \mathcal{I}} \| I_{uv} - \tilde{I}_{uv} \|_w^2,$$

where $\mathcal{I}$ denotes the set of the coordinates in the 2D domain of the sampled vertices. In order for $\tilde{E}_d$ to be an unbiased estimate, we have to sample the vertices with a probability proportional to the projected area of the adjoining triangles. Given $\mathcal{I}$, the first derivatives of the cost function with respect to the parameters $\alpha$, $\beta$, $\rho$ and $\gamma$ are computed, and the update is computed as in a standard Newton descent algorithm. The computation of the updates also requires a diagonal approximation of the Hessian matrix of the cost function, which is estimated on a larger sample of vertices every few thousand iterations. Whenever the Hessian is estimated, the visibility of the vertices is also checked: since some of the vertices can be occluded in the mapping to 2D, these must not be taken into account when estimating the Hessian or choosing the sample $\mathcal{I}$. An example of approximation is shown in figure 2(c); the novel mesh is in figure 2(b) and the reference in figure 2(a).
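The color transformation used in the approximation step can be illustrated with a small sketch. The text only states that the transformation is parameterized by $\gamma$ and that $\gamma_1$ controls the contrast; passing the gains and the offset as separate arguments, and the function names, are assumptions made for this example.

```python
import numpy as np

# Luminance weights appearing in the rows of the contrast matrix
LUMA = np.array([0.3, 0.59, 0.11])

def contrast_matrix(g1):
    """M(gamma) = g1 * I_3 + (1 - g1) * (matrix with LUMA in every row)."""
    return g1 * np.eye(3) + (1.0 - g1) * np.tile(LUMA, (3, 1))

def color_transform(T, g1, gains, offset):
    """Apply T' = T M^T U + a^T to an (n, 3) texture matrix T.

    gains:  three color gains, the diagonal of U (hypothetical argument).
    offset: color offset a, added to every row (hypothetical argument).
    """
    return T @ contrast_matrix(g1).T @ np.diag(gains) + np.asarray(offset)
```

With `g1 = 1` the contrast matrix is the identity, while `g1 = 0` replaces every color by its luminance, so a pure red vertex `(1, 0, 0)` becomes `(0.3, 0.3, 0.3)` before gains and offset are applied.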

Algorithm 1: Optical flow algorithm for 3D data. See section III-C for a detailed description of the single lines.

Input: projections $I$ and $\tilde{I}$
Output: flow field $s$

 1  begin
 2    $J, \tilde{J}$ ← init features of $I, \tilde{I}$;
 3    $J^l, \tilde{J}^l$ with $l = l_{max} \ldots 0$ ← init pyramids of $J, \tilde{J}$;
 4    $s^{l_{max}}$ ← 0;
 5    for $l$ ← $l_{max}$ to 0 do
 6      if $l < l_{max}$ then
 7        $\psi$ ← expand $s^{l+1}$;
 8      else
 9        $\psi$ ← 0
10      $K$ ← warp $\tilde{J}^l$ with $\psi$;
11      $\varphi$ ← flow from $K$ to $J^l$;
12      $s^l$ ← smooth $\varphi \circ \psi$;
13    $s$ ← $s^0$;
14  end

C. Correspondence Estimation

After the approximation $\tilde{I}$ has been computed, we need to estimate a correspondence between it and the input $I$. To this aim we use a modification of the optical flow algorithm presented in [14]. We briefly recall that this method computes the flow $s$ between two scalar-valued images $I_1$ and $I_2$ by minimizing for each pixel $(u, v)$ the energy

$$\sum_{(u',v') \in R(u,v)} \left( s^T\, \frac{\nabla I_1 + \nabla I_2}{2} + I_2(u',v') - I_1(u',v') \right)^2, \tag{18}$$

where $R$ is a neighborhood of the pixel, and the gradients of the images are computed by finite differences. The derivation of the above energy, however, relies on a first-order approximation which is valid only insofar as the image changes $I_2 - I_1$ are small. In general this is not true, and in order to overcome the problem [14] adopted a coarse-to-fine approach, in which the optical flow is iteratively computed at different resolutions.

Let us now consider the extension to 3D data (references to the lines of the pseudocode are given in brackets). Initially (line 2), the features of the input are computed. These are two 6-valued images $J(u, v) = (n(u, v), c(u, v))$ and $\tilde{J}$, made up of the normals $n$ and texture values $c$. In order to compute the normals for the cylindrical projections, recall that given a 3D surface defined parametrically as $s(u, v) = (x(u, v), y(u, v), z(u, v))$, the unit normal vector is defined at each point as

$$n = \frac{a \times b}{\| a \times b \|} \tag{19}$$

with

$$a = \frac{\partial s}{\partial u} \quad \text{and} \quad b = \frac{\partial s}{\partial v}.$$

The vector field $n(u, v)$ is therefore computed by projecting the Cartesian coordinates of the 3D surface to the cylindrical domain, computing (by finite differences) their first derivatives $a$ and $b$, and finally applying equation (19).
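The normal computation of equation (19) can be sketched as below. The sketch assumes the Cartesian coordinates are already sampled on a regular $(u, v)$ grid with no void pixels, with $u$ running along the columns; the function name `unit_normals` is hypothetical.

```python
import numpy as np

def unit_normals(X):
    """Unit normals of a surface sampled on the (u, v) grid.

    X: (H, W, 3) array of Cartesian coordinates; rows follow v, columns u.
    Returns an (H, W, 3) array of unit normals, eq. (19).
    """
    a = np.gradient(X, axis=1)     # ds/du by finite differences
    b = np.gradient(X, axis=0)     # ds/dv by finite differences
    n = np.cross(a, b)
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    # Leave degenerate (zero-length) normals at zero instead of dividing by 0
    return n / np.where(norm > 0, norm, 1.0)
```

`np.gradient` uses central differences in the interior and one-sided differences on the image border; void regions would need the special treatment discussed at the end of this section.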
Given the feature images $J$ and $\tilde{J}$, we compute the Laplacian pyramids (see [15]) up to a certain depth $l_{max}$ (line 3). Note that when building the Laplacian pyramids, the normals lose their geometric meaning; this is not a problem, since the normals are used as surface features and not for their geometric properties.

After these initialization steps, the coarse-to-fine loop is entered. At each iteration the flow field computed at the previous iteration is expanded to the current resolution (lines 6-9), and the resulting flow field $\psi$ is used to prewarp the reference of the current level, $\tilde{J}^l$ (line 10). Then, a flow field $\varphi$ is computed from the result of the warp, $K$, to the input $J^l$ (line 11). The computation of the flow field is performed by minimizing an energy with essentially the same form as equation (18), but since the images are vector-valued the terms of the sum are squared norms:

$$\sum_{(u',v') \in R(u,v)} \left\| s^T\, \frac{\nabla J_1 + \nabla J_2}{2} + J_2(u',v') - J_1(u',v') \right\|^2.$$

Setting to zero the derivative of the energy with respect to the flow $s$, we obtain for each pixel $(u, v)$ a linear system which can be easily solved. Finally, the flow fields $\varphi$ (resulting from the solution of the optical flow equation) and $\psi$ (the expansion of the flow field of the previous resolution level) are concatenated and smoothed (line 12). Smoothing the solution of the optical flow equation proves necessary because of the high sensitivity of the optical flow solution to noise in areas where the contrast in the features is low.

We conclude this section with a technical remark about the computation of the image derivatives and of the Laplacian pyramid. As noted in the section on preprocessing, the 2D representations of the 3D surfaces have void regions which do not describe surface points; one should modify the numerical schemes accordingly. In the case of derivatives, for instance, where one would normally use a central difference, it is advisable to use a forward or backward difference if the pixel is on a boundary between a void and a non-void region.
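The remark on derivatives near void regions can be made concrete with a small sketch. It assumes a boolean mask marking the non-void pixels, handles only the derivative along $u$, and sets the derivative to zero where no valid neighbor exists; the mask, the function name `masked_du` and that fallback are all assumptions of this example.

```python
import numpy as np

def masked_du(img, valid):
    """Derivative along u of a scalar image, aware of void pixels.

    img:   (H, W) float array.
    valid: (H, W) boolean array, True for non-void pixels.
    """
    H, W = img.shape
    out = np.zeros_like(img, dtype=float)
    for v in range(H):
        for u in range(W):
            if not valid[v, u]:
                continue                      # void pixels carry no derivative
            left = u > 0 and valid[v, u - 1]
            right = u < W - 1 and valid[v, u + 1]
            if left and right:                # central difference
                out[v, u] = 0.5 * (img[v, u + 1] - img[v, u - 1])
            elif right:                       # forward difference
                out[v, u] = img[v, u + 1] - img[v, u]
            elif left:                        # backward difference
                out[v, u] = img[v, u] - img[v, u - 1]
    return out
```

The explicit loops favor readability over speed; a vectorized version would follow the same case distinction.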
The output of the optical flow algorithm is a flow field defined over the 2D domain,

$$s : [0, W] \times [0, H] \to \mathbb{R}^2,$$

which is combined with the correspondence implicitly defined by the approximation. In practice, if the $i$-th vertex of the approximation $a_i$ is projected to the point $p_i = (u_i, v_i)$ of the 2D domain, combining this position with the result of the optical flow we obtain as refined corresponding point:

$$p'_i = (u_i + s_u(u_i, v_i),\; v_i + s_v(u_i, v_i)).$$

Sampling the input $I$ at $p'_i$ we can derive the position $w_i$ of the point corresponding to $a_i$ (see figure 3). We now face two problems: the input data are typically incomplete (see figure 2(d)), so that for some $a_i$ we have no corresponding $w_i$, and the optical flow is not reliable everywhere. For these reasons, the vertex positions $w_i$ computed through optical flow do not directly define the registration result.

Figure 3. Notation used for defining the energy minimized during registration. The positions of the vertices in the approximation are denoted by $a_i$, the corresponding sampled positions on the input by $w_i$, and the unknown vertex positions in the solution by $v_i$. The displacements from the approximation to the solution are denoted by $d_i$. © 2006 IEEE.

D. Energy Minimization

We compute the final result $v_i$ by minimizing an energy made up of a data term, depending on the positions $w_i$ obtained from optical flow, and a smoothness term. This allows us to reconstruct the positions of the vertices without correspondence, and to regularize the positions of the vertices where the correspondence is unreliable. Since the positions $w_i$ are defined only for the vertices with a correspondence, we define the data term only for the subset $C$ of such vertices:

$$E_d = \sum_{i \in C} \| v_i - w_i \|^2. \tag{20}$$

Let $a_i$ denote the positions of the vertices in the approximation; the regularization term depends on the displacements with respect to the approximation, $d_i = v_i - a_i$ (see figure 3):

$$E_s = \sum_i \sum_{j \in N_i} e_{ij}\, \| d_j - d_i \|^2, \tag{21}$$

where $N_i$ denotes the neighborhood of the $i$-th vertex, and the coefficients $e_{ij}$ weight the relative importance of each edge in the smoothness energy of a vertex. A good criterion for the choice of the coefficients $e_{ij}$ is to look at how much the edges deform in the examples already registered. Denoting by $\sigma_{ij}$ the standard deviations of the edge lengths over the examples, we set

$$e_{ij} = \frac{\sigma_{ij}^{-2}}{\sum_{j \in N_i} \sigma_{ij}^{-2}}. \tag{22}$$

Substituting eq. (22) in eq. (21), we can verify that with this choice the smoothness term for each vertex becomes

$$\sum_{j \in N_i} e_{ij}\, \| d_j - d_i \|^2 \;\propto\; \sum_{j \in N_i} \frac{\| d_j - d_i \|^2}{\sigma_{ij}^2}, \tag{23}$$

so that the influence of each edge on the vertex energy is weighted by its deformations in the available examples.

The adaptive smoothing is achieved by weighting each term of $E_d$ with a coefficient $\lambda_i$:

$$E = \frac{1}{2} \sum_{i \in C} \lambda_i\, \| v_i - w_i \|^2 + \frac{1}{2} \sum_i \sum_{j \in N_i} e_{ij}\, \| d_j - d_i \|^2. \tag{24}$$

Note that the positions of the vertices without correspondence are determined only by the regularization term; its minimization produces a reconstruction of the missing vertices which is continuous and with low reconstruction error, as shown in [16]. The coefficients $\lambda_i$ should be large where the correspondence is reliable, to let the data term dominate, and small otherwise, to regularize the result. As a measure of the correspondence quality, we use the smoothness of the displacement field $w_i - a_i$, defined as

$$s_i = \sum_{j \in N_i} \frac{\| (w_j - a_j) - (w_i - a_i) \|}{\| a_j - a_i \|}. \tag{25}$$

As shown in the example of figure 7, high values of this quantity indicate problems in the correspondence. In our experiments, we set $\lambda_i$ to $10^{2}$ for $s_i < 0.2$, to $10^{-7}$ for $s_i \geq 1$, and to $10$ for $0.2 \leq s_i < 1$, which produces a slight smoothing. Of course more fine-grained choices of $\lambda_i$ are possible, but in our experiments this choice proved to be sufficiently flexible.

The global minimum of the energy (24) is found by setting to zero its derivative with respect to the unknowns $v_i$. This yields a sparse linear system, which can be efficiently solved with standard algorithms (in our implementation we used [17]). Denoting by $D$, $A$ and $W$ the $n \times 3$ matrices holding the values of the vectors $d_i$, $a_i$ and $w_i$, the system is

$$\left( \Lambda + I - (K + K^T)/2 \right) D = \Lambda\, (W - A), \tag{26}$$

where $\Lambda$ is an $n \times n$ diagonal matrix with elements $\lambda_i/2$ (zero for the vertices without correspondence), and $K$ is a sparse $n \times n$ matrix with $K_{ij} = e_{ij}$ if $\{i, j\}$ is an edge of the mesh and $K_{ij} = 0$ otherwise. Solving eq. (26) for $D$, the registered positions of the vertices are found as $v_i = a_i + d_i$.

E. Texture processing

The energy minimization described in the previous section concerns only the registration of the input shape. For registering the texture we follow a different procedure, due to the different nature of the data.
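As an aside on the shape registration of section III-D, the linear system of eq. (26) can be sketched on a toy mesh. The sketch uses a dense solver for readability, whereas the text uses a sparse one; the function name `register_vertices`, the edge-list input and the dictionary of $\sigma_{ij}$ values are assumptions, and every vertex is assumed to have at least one neighbor.

```python
import numpy as np

def register_vertices(A, W, has_corr, lam, edges, sigma):
    """Solve eq. (26) for the displacements D and return V = A + D.

    A, W:     (n, 3) vertex positions of the approximation and of the
              resampled input (rows of W without correspondence are ignored).
    has_corr: (n,) boolean array, True for the vertices in the set C.
    lam:      (n,) data weights lambda_i.
    edges:    list of (i, j) index pairs, one per mesh edge.
    sigma:    dict mapping each (i, j) pair to the edge-length std sigma_ij.
    """
    n = A.shape[0]
    K = np.zeros((n, n))
    for (i, j) in edges:
        K[i, j] = K[j, i] = sigma[(i, j)] ** -2
    K /= K.sum(axis=1, keepdims=True)          # eq. (22): rows sum to one
    Lam = np.diag(np.where(has_corr, lam, 0.0) / 2.0)
    rhs = Lam @ np.where(has_corr[:, None], W - A, 0.0)
    D = np.linalg.solve(Lam + np.eye(n) - (K + K.T) / 2.0, rhs)
    return A + D
```

On a closed ring of vertices with equal edge weights, a constant displacement observed on some vertices is propagated unchanged to the vertices without correspondence, which is the behavior expected from the regularization term.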
With current 3D acquisition technologies, one can often obtain a high-resolution texture of the scanned surface. In order to lose as little texture information as possible during the registration process, at the pre-processing stage (see section III-A) we mapped the texture coordinates to the 2D domain, just like the shape information. After the correspondence estimation, the cylindrical projection of the texture coordinates can be sampled exactly like the mapping of the shape, so that we can associate a texture coordinate with every vertex for which a correspondence to the novel surface is defined. The sampled texture coordinates refer to the original texture image, so that part of the registration result can be textured with the original images without any loss of information (see figure 5).

However, only part of the result can be textured in this way, since in general the texture information will be incomplete, like the shape information. Moreover, a consistent texturing of all the registration results is needed for generating new textures via linear combinations. Therefore two additional steps are required: first the original texture is warped to a fixed texture map, and then the missing texture is reconstructed with a diffusion algorithm.

Figure 4. Three examples from our dataset, with the originals on the top row and the registration results on the bottom row. Note how the texture in the last two examples is corrected during registration in order to be consistent with the other examples in the model. © 2006 IEEE.

Figure 5. The registration result can be partially textured with the original images, as in this case, where the two original texture images (top row) can be mapped onto the surface of the registered shape (bottom row). The texturing, however, will be partial, since not all regions are present in the original images (black areas in the bottom images).

1) Texture Warping: The warping of the original texture is performed by considering the triangles of the model for which all the vertices have a correspondence to the novel surface. Given such a triangle, a point $p$ belonging to it will be mapped to a point $p'$ in the original texture $T'$, so that the registered texture can be defined as $T(p) = T'(p')$. Although the correspondence is explicitly defined only for the vertices $p_i = (u_i, v_i)$, $i \in \{0, 1, 2\}$, of a given triangle, we can estimate $p'$ by interpolation, assuming it has the same barycentric coordinates $(b_1, b_2)$ as $p$ with respect to the triangle vertices:

$$p' = p'_0 + b_1 (p'_1 - p'_0) + b_2 (p'_2 - p'_0),$$

where $p'_i$ are the points in $T'$ corresponding to the triangle vertices $p_i$.

2) Texture Reconstruction: The warping of the original image defines only a part of the model texture; the rest of it has to be reconstructed. To this aim we use a method based on a push-pull algorithm [18] presented in [19]. The algorithm reconstructs the missing areas of an image by iteratively downsampling it to a given scale (the push step) and then upsampling it back to the original scale (the pull step), while keeping the known area constant. Repeating these two steps until convergence effectively achieves a diffusion of the known color values through the missing regions. By repeating the process with different and increasing bottom-scale resolutions, one obtains a fast and smooth approximation of the missing areas based only on the known pixel colors. Unlike in the original setting, in our case the areas to reconstruct are not completely without information, since we have the approximation $\hat{T}$ previously obtained via the 3D fitting (section III-B). It then makes sense to apply the above algorithm to the residual image $T - \hat{T}$ and then add the result to $\hat{T}$, retaining in this way any structure in the approximating texture.

3) Texture Coordinates Reconstruction: A final issue is that some of the original texture information might be discarded in case of holes in the input surface, as for the eyes in figure 6. Although the surface of the eyes has not been acquired, we have its texture information, and we would like to use it rather than approximate it by the method described above. We therefore apply the push-pull method to the 2D projections of the texture coordinates, in order to reconstruct their values in the areas of interest. By reconstructing the texture coordinates before the warping, we can use the original texture information also for areas where the surface could not be reconstructed.
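The warping and reconstruction steps of this section can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`barycentric_map`, `push_pull`, `reconstruct_texture`) are hypothetical, images are plain nested lists of gray values, and the push-pull is simplified to a single pass over a 2x2 mean pyramid instead of the iterated scheme with increasing bottom-scale resolutions described above. The last function shows the residual idea: only the difference between the warped texture and the fitted approximation is diffused, and the approximation is then added back.

```python
def barycentric_map(b1, b2, q0, q1, q2):
    """Map barycentric coordinates (b1, b2) into the triangle (q0, q1, q2)
    of the original texture: q0 + b1*(q1 - q0) + b2*(q2 - q0)."""
    return tuple(q0[k] + b1 * (q1[k] - q0[k]) + b2 * (q2[k] - q0[k])
                 for k in range(2))


def push_pull(img, known):
    """Fill the unknown pixels of a 2D grid; known pixels are kept as they are.

    img   : list of rows of floats
    known : same shape, True where the pixel value is valid
    """
    h, w = len(img), len(img[0])
    if all(all(row) for row in known):
        return [row[:] for row in img]
    if h == 1 and w == 1:
        return [[0.0]]  # nothing known anywhere: fall back to zero
    # Downsample: average the known pixels of every 2x2 block.
    ch, cw = (h + 1) // 2, (w + 1) // 2
    coarse = [[0.0] * cw for _ in range(ch)]
    coarse_known = [[False] * cw for _ in range(ch)]
    for i in range(ch):
        for j in range(cw):
            vals = [img[y][x]
                    for y in range(2 * i, min(2 * i + 2, h))
                    for x in range(2 * j, min(2 * j + 2, w))
                    if known[y][x]]
            if vals:
                coarse[i][j] = sum(vals) / len(vals)
                coarse_known[i][j] = True
    # Recurse so that the coarse level itself has no holes left.
    coarse = push_pull(coarse, coarse_known)
    # Upsample: copy coarse values into the unknown pixels only.
    return [[img[y][x] if known[y][x] else coarse[y // 2][x // 2]
             for x in range(w)] for y in range(h)]


def reconstruct_texture(warped, known, approx):
    """Diffuse the residual (warped - approx) and add the approximation back."""
    h, w = len(warped), len(warped[0])
    residual = [[warped[y][x] - approx[y][x] if known[y][x] else 0.0
                 for x in range(w)] for y in range(h)]
    filled = push_pull(residual, known)
    return [[approx[y][x] + filled[y][x] for x in range(w)] for y in range(h)]
```

Where no warped pixel is available at all, the residual is zero and the result falls back to the fitted approximation, which is the behavior the residual formulation is meant to provide. The same `push_pull` routine could in principle be applied per channel to the projected texture coordinates, as in step 3.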
IV. RESULTS

In order to test our algorithm, we applied it to a heterogeneous collection of 485 3D scans, acquired in part with a Cyberware scanner and in part with a phase-shift system. The collection is about equally divided into examples with neutral expression (233 scans, all with different identities) and examples with emotional expressions or visemes (252 scans, from 33 subjects). Figure 4 shows different examples of the training data and the results of their registration.

Figure 6. Reconstruction of texture coordinates for the eyes. The top image shows the texture resulting from the push-pull algorithm in the eye region. A much better result can be obtained by applying the push-pull algorithm to the texture coordinate data and then using the reconstructed coordinates to sample the original texture images (middle). The bottom image shows the resolution improvement obtained by doubling the size of the registered texture. © 2006 IEEE.

Figure 7. Registration of an example with our algorithm (bottom row) and with the algorithm of [6] (top row). The leftmost column shows the registration results and the middle column their shape caricatures. The caricature on top reveals the correspondence problems of the previous algorithm. In the rightmost column, a color-coded rendering of the displacement smoothness (red is lower smoothness, green higher) shows that this measure can be used to detect correspondence problems. © 2006 IEEE.

Figure 8. Average distance of the vertices from the original surface, ranging from 0.0 mm (green) to 1.0 mm (red). The black areas correspond to vertices missing in at least one example. Most of the vertices are close to the surface. © 2006 IEEE.

As we mentioned in the introduction, our algorithm is robust to errors in the correspondence estimation, thanks to the smoothness term in the minimized energy. Although there is no obvious way to measure its robustness, we assume that the smoothness of the displacement field between the registered results and their average shape is a reliable way to detect problematic areas of the results, as shown in figure 7.
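The exact formula behind the displacement smoothness is not given in this section, so the following is only a hypothetical proxy meant to make the idea concrete. It assumes that each registered vertex has a displacement from the average shape and that a vertex is suspicious when its displacement differs strongly from the mean displacement of its mesh neighbors. The function name `displacement_roughness` and the neighbor-list input are assumptions, not part of the paper.

```python
import math


def displacement_roughness(vertices, mean_shape, neighbors):
    """Per-vertex roughness of the displacement field (lower is smoother).

    vertices   : list of (x, y, z) registered vertex positions
    mean_shape : list of (x, y, z) positions of the average shape
    neighbors  : neighbors[i] is a non-empty list of vertex indices
                 adjacent to vertex i in the mesh
    """
    # Displacement of every vertex from the average shape.
    disp = [tuple(v[k] - m[k] for k in range(3))
            for v, m in zip(vertices, mean_shape)]
    roughness = []
    for i, nbrs in enumerate(neighbors):
        # Mean displacement over the neighbors of vertex i.
        avg = [sum(disp[j][k] for j in nbrs) / len(nbrs) for k in range(3)]
        # Distance between the vertex displacement and that mean.
        roughness.append(
            math.sqrt(sum((disp[i][k] - avg[k]) ** 2 for k in range(3))))
    return roughness
```

A rigid shift of the whole mesh gives zero roughness everywhere, while a single vertex pulled away from its neighbors stands out. This is the kind of local irregularity that the color-coded rendering in figure 7 is meant to expose.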
A comparison between the average values of this smoothness measure for the results of our registration algorithm and for the results of an algorithm based only on the correspondence estimation [6] confirms, as expected, that our method yields more regular results (the lower the better): 0.138 (σ = 0.086) vs. 0.204 (σ = 0.18). To rule out the possibility that the results are too smooth, resulting in a bad approximation of the input, we also checked the distances between the registered vertices and the input surfaces. The results, summarized in figure 8, show that the smoothing noticeably affects the distance from the original surface only in small areas. On the rest of the face the vertices are within a distance of 1.0 mm from the surface.

We conclude this section by showing that the improvement in the quality of the results also has a positive effect on the quality of the morphable model. To this aim, we performed a 10-fold cross-validation (for more details see [20]) on two models built with the registration results of the previous comparison, in order to estimate the generalization error of the model. This is the expected error made by the model in reconstructing novel data that was not in the training set. As shown in figure 9, the new results provide a model which is much more compact: 40 components are enough to achieve an error smaller than that obtained with 170 components of the old model.

Figure 9. Generalization errors of the 3DMMs obtained with our algorithm and with the algorithm of [6]. © 2006 IEEE.

V. CONCLUSION

We described an algorithm aimed at registering 3D face data with arbitrary identity and expression. The algorithm also allows us to register data with missing values and offers control over the regularity of the registered results, thanks to the adaptive smoothing performed in the last step of the registration. In fact, the last step is a generalized version of the surface reconstruction method we described in [16], where we showed that its results are better than a purely statistical reconstruction (as in [21]).

With the new algorithm we were able to build a 3DMM combining both identity and expression variations, which in principle might be used in many applications, as well as in the registration itself: face recognition (both in 3D and 2D), video tracking, face animation, and expression normalization in images. Although we used a linear model, one could also use a bilinear model as in [11]. In principle such a model has the advantage of capturing the dependency between identity and expressions, at the expense of an increased complexity. However, in the 3D face recognition experiments we performed, the bilinear model had a worse identification rate than the linear model, and we decided to use the latter.

REFERENCES

[1] R. Szeliski and S. Lavallée, "Matching 3-d anatomical surfaces with non-rigid deformations using octree-splines," in IEEE Workshop on Biomedical Image Analysis. IEEE Computer Society, 1994, pp. 144–153.
[2] S. Marschner, B. Guenter, and S. Raghupathy, "Modeling and rendering for realistic facial animation," in Proceedings of the 11th Eurographics Workshop on Rendering, 2000, pp. 231–242.
[3] B. Allen, B. Curless, and Z. Popović, "The space of human body shapes: reconstruction and parameterization from range scans," in Proceedings of the 30th International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 2003), A. P. Rockwood, Ed. ACM Press, 2003.
[4] E. Praun, W. Sweldens, and P. Schröder, "Consistent mesh parameterization," in Proceedings of the 28th International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 2001), L. Pocock, Ed. ACM Press, 2001, pp. 179–184.
[5] K. Kähler, J. Haber, H. Yamauchi, and H.-P. Seidel, "Head shop: Generating animated head models with anatomical structure," in Proceedings of the 2002 ACM SIGGRAPH Symposium on Computer Animation, S. Spencer, Ed., 2002, pp. 55–64.
[6] V. Blanz and T. Vetter, "A morphable model for the synthesis of 3d faces," in Proceedings of the 26th International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 99). ACM Press, 1999, pp. 187–194.
[7] C. Basso, T. Vetter, and V. Blanz, "Regularized 3d morphable models," in Proceedings of the 1st IEEE International Workshop on Higher-Level Knowledge in 3D Modeling and Motion Analysis (HLK 2003). Nice, France: IEEE Computer Society Press, 17 October 2003, pp. 3–11.
[8] V. Blanz, C. Basso, T. Poggio, and T. Vetter, "Reanimating faces in images and video," Computer Graphics Forum, vol. 22, no. 3, pp. 641–650, 2003, Best Paper Award.
[9] L. Zhang, N. Snavely, B. Curless, and S. Seitz, "Spacetime faces: High resolution capture for modeling and animation," in Proceedings of the 31st International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 2004), D. Slothower, Ed. ACM Press, August 8–12, 2004, pp. 548–558.
[10] Y. Wang, X. Huang, C.-S. Lee, S. Zhang, Z. Li, D. Samaras, D. Metaxas, A. Elgammal, and P. Huang, "High resolution acquisition, learning and transfer of dynamic 3-d facial expressions," in Proceedings of Eurographics 2004, M.-P. Cani and M. Slater, Eds., vol. 23(3), The Eurographics Association. Blackwell Publishing, 2004.
[11] D. Vlasic, M. Brand, H. Pfister, and J. Popović, "Face transfer with multilinear models," in Proceedings of the 32nd International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 2005), J. L. Mohler, Ed. ACM Press, July 31–August 4, 2005.
[12] M. E. Tipping and C. M. Bishop, "Probabilistic principal component analysis," Journal of the Royal Statistical Society, Series B, vol. 61, no. 3, pp. 611–622, 1999.
[13] S. Umeyama, "Least-squares estimation of transformation parameters between two point patterns," PAMI, vol. 13, no. 4, pp. 376–380, 1991.
[14] J. Bergen, P. Anandan, K. Hanna, and R. Hingorani, "Hierarchical model-based motion estimation," in Proceedings of the European Conference on Computer Vision, 1992, pp. 237–252.
[15] P. Burt and E. Adelson, "The laplacian pyramid as a compact image code," IEEE Transactions on Communications, pp. 532–540, 1983.
[16] C. Basso and T. Vetter, "Surface reconstruction via statistical models," in Proceedings of 2nd International Conference on Reconstruction of Soft Facial Parts (RSFP 2005), Remagen, Germany, 17–18 March 2005.
[17] T. A. Davis, "Umfpack version 4.3," 2004, http://www.cse.ufl.edu/research/sparse/umfpack/.
[18] S. Gortler, R. Grzeszczuk, R. Szeliski, and M. Cohen, "The lumigraph," in Proceedings of the 23rd International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 96). ACM Press, 1996, pp. 43–54.
[19] I. Drori, D. Cohen-Or, and H. Yeshurun, "Fragment-based image completion," in Proceedings of the 30th International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 2003), A. P. Rockwood, Ed. ACM Press, 2003, pp. 303–312.
[20] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Springer-Verlag, 2001.
[21] V. Blanz, A. Mehl, T. Vetter, and H. P. Seidel, "A statistical method for robust 3d surface reconstruction from sparse data," in Int. Symp. on 3D Data Processing, Visualization and Transmission, Thessaloniki, Greece, 2004.
[22] A. P. Rockwood, Ed., Proceedings of the 30th International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 2003). ACM Press, 2003.