Fitting a Morphable Model to 3D Scans of Faces


Volker Blanz, Universität Siegen, Siegen, Germany, blanz@informatik.uni-siegen.de
Kristina Scherbaum, MPI Informatik, Saarbrücken, Germany, scherbaum@mpi-inf.mpg.de
Hans-Peter Seidel, MPI Informatik, Saarbrücken, Germany, hpseidel@mpi-inf.mpg.de

Abstract

This paper presents a top-down approach to 3D data analysis by fitting a Morphable Model to scans of faces. In a unified framework, the algorithm optimizes shape, texture, pose and illumination simultaneously. The algorithm can be used as a core component in face recognition from scans. In an analysis-by-synthesis approach, raw scans are transformed into a PCA-based representation that is robust with respect to changes in pose and illumination. Illumination conditions are estimated in an explicit simulation that involves specular and diffuse components. The algorithm inverts the effect of shading in order to obtain the diffuse reflectance in each point of the facial surface. Our results include illumination correction, surface completion and face recognition on the FRGC database of scans.

1. Introduction

Face recognition from 3D scans has become a very active field of research due to the rapid progress in 3D scanning technology. In scans, changes in pose are easy to compensate by a rigid transformation. On the other hand, range data are often noisy and incomplete, and using shape only would ignore many person-specific features such as the colors of the eyes. The main idea of our approach is to exploit both shape and texture information of the input scan in a simultaneous fitting procedure, and to use a 3D Morphable Model for a PCA-based representation of faces. Our method builds upon an algorithm for fitting a Morphable Model to photographs [6]. We generalize this algorithm by including range data in the cost function that is optimized during fitting. More specifically, the algorithm synthesizes a random subset of pixels from the scan in each iteration by simulating rigid transformation, perspective projection and illumination.
In an iterative optimization, it makes these as similar as possible to the color and depth values found in the scan. Based on an analytical derivative of the cost function, the algorithm optimizes pose, shape, texture and lighting. For initialization, the algorithm uses a set of about 7 feature points that have to be defined manually, or may be identified automatically by feature detection algorithms. One of the outputs of the system is a set of model coefficients that can be used for face recognition. Moreover, we obtain a textured 3D model from the linear span of example faces of the Morphable Model. The fitting procedure establishes point-to-point correspondence of the model to the scan, so we can sample the veridical cartesian coordinates and color values of the scan, and substitute them in the face model. The result is a resampled version of the original scan that can be morphed with other faces. We estimate and remove the effect of illumination, and thus obtain the approximate diffuse reflectance at each surface point. This is important for simulating new illuminations on the scan.

The contributions of this paper are:

- an algorithm for fitting a model to shape and texture simultaneously, specifically designed for perspective projection, which is found in most scanners, such as structured light scanners, time-of-flight scanners and those laser scanners that have a fixed center of projection,
- compensation of lighting effects as an integrated part of the fitting procedure,
- simulation of both specular and diffuse reflection,
- model-based handling of saturated color values in the texture (which are common in scans and pose problems to non-model-based approaches), and
- a comparison between recognition from photographs (textures) only and recognition from scans in a sophisticated, model-based algorithm.

2. Related Work

Many of the early methods on face recognition from range data have relied on feature points, curvatures or curves [8, 16, 13, 4].
Other geometrical criteria include Hausdorff distance [17], free-form surfaces and point signatures [25, 10], or bending-invariant canonical forms for surface representation [7]. Similar to the Eigenface approach in image data, several authors have applied Principal Component Analysis (PCA)

Figure 1. From a raw input scan (first and second column, top and bottom), the fitting procedure generates a best fit within the vector space of examples (third column). Shape and texture sampling (Section 5) includes the original data into the reconstructed model wherever possible (fourth column). Note that the algorithm has automatically removed lighting effects, including the saturated color values.

to range data after rigid alignment of the data [2, 1, 14, 9, 28], or after registration in the image plane using salient features [15]. In 3D Morphable Models, PCA is not applied to the depth or radius values, but to the cartesian coordinates of surface points [5]. It is important that prior to PCA, the scans are registered in a dense point-by-point correspondence using a modified optical flow algorithm that identifies corresponding features in the range data [5]. This algorithm has only been used for building the Morphable Model, but not for recognizing faces in new scans, which would probably be possible. The Morphable Model has been studied in the field of image analysis and shape reconstruction [5, 6]. On a very general level, it is a powerful approach to identify corresponding features in order to solve recognition problems. In face recognition from images, this is reflected in the paradigm shift from Eigenfaces [26] to Morphable Models [27, 5] and Active Appearance Models [11]. Corresponding features may either be found by dedicated feature detectors, or by an iterative fitting procedure that minimizes a distance function. A number of algorithms have been presented for the alignment of face scans. Unlike the Iterative Closest Point (ICP) algorithm [3], which is designed for registering parts of the same surface by a rigid transformation, these can be used to register scans of different individuals.
Blanz and Vetter briefly describe an algorithm for fitting the Morphable Model to new 3D scans [5]: In an analysis-by-synthesis loop that is similar to their image analysis algorithm [5, 6], the algorithm strives to reproduce the radius values of cylindrical Cyberware™ scans by a linear combination of example faces and a rigid transformation. The algorithm minimizes the sum of square distances of the radius and texture values of the model to those of the input scan. Our algorithm takes this approach several steps further by posing the problem of 3D shape fitting as a generalized problem of fitting to images, by dealing with the natural representation of most 3D scanners, which is based on perspective projection, and by taking illumination into account. Zhang et al. [29] presented a method that tracks a face model during a 3D motion sequence, and that fits a template mesh to the initial frame. Similar to our approach, they project the model to the image plane of the depth map, and update the shape such that it follows the scan with minimal changes. They use texture to track the motion of points in the image plane during the sequence by an optical flow algorithm. Mao et al. [21] use elastic deformation of a generic model, starting from manually placed landmarks. For a comparison of untextured scans, Russ et al. [24] detect five facial features, perform ICP and a subsequent normal search to establish correspondence between an input scan and a reference face. Face recognition is then based on PCA coefficients of shape vectors. Unlike this algorithm, our approach deforms the PCA-based model to fit the input scan, solving the problems of correspondence and PCA decomposition simultaneously. Mian et al. [22] compare pairs of scans using a hybrid system that uses feature points, ICP of the nose and forehead areas, and PCA. While the previous algorithms did not account for illumination effects in the scans, Malassiotis and Strintzis [20] use the depth information of scans for face detection, pose estimation and a warp-based rectification of the input image (the texture of the scan).
To make the system robust with respect to illumination, the faces in the database are rendered with different lightings, and a Support Vector Machine is trained on these data. Lu et al. [19] construct a 3D model of a face by combining several 2.5D scans, and then match this to a new probe

scan by coarse alignment based on feature points, and fine alignment based on ICP. Root mean square distance is used as a measure for shape similarity. On a set of candidates, they synthesize different shadings of their textures, and use LDA for a comparison with the probe texture. Unlike this work, we fit a deformable model to the scan, and integrate this with a closed analysis-by-synthesis loop for simulating the effect of lighting on texture. Lu and Jain [18] consider non-rigid deformations due to facial expressions, and iteratively optimize in alternating order the rigid transformation by ICP, and the expression by minimizing the sum of squared distances. Texture is not estimated in this work. In contrast, we optimize rigid transformation, non-rigid deformation, texture and lighting in a unified framework.

3. A Morphable Model of 3D Faces

This section summarizes how a Morphable Model of 3D faces [27, 5] is built from a training set of 200 textured Cyberware™ laser scans that are stored in cylindrical coordinates. These scans cover most of the facial surface from ear to ear, and are of relatively high quality, but it takes about 20 seconds to record a full scan because the sensor of the scanner moves around the person's head. In Section 4, this general Morphable Model will be applied to input scans of new individuals recorded with a scanner that uses a perspective projection. In the Morphable Model, shape and texture vectors are defined such that any linear combination of examples

S = \sum_{i=1}^{m} a_i S_i, \quad T = \sum_{i=1}^{m} b_i T_i   (1)

is a realistic face if S, T are within a few standard deviations from their averages. In the conversion of the laser scans of the training set into shape and texture vectors S_i, T_i, it is essential to establish dense point-to-point correspondence of all scans with a reference face to make sure that vector dimensions in S, T describe the same point, such as the tip of the nose, in all faces. Correspondence is computed automatically using optical flow [5].
Each vector S_i is the 3D shape, stored in terms of the x, y, z-coordinates of all vertices k ∈ {1, ..., n}, n = 75972, of a 3D mesh:

S_i = (x_1, y_1, z_1, x_2, ..., x_n, y_n, z_n)^T.   (2)

In the same way, we form texture vectors from the red, green, and blue values of all vertices' surface colors:

T_i = (R_1, G_1, B_1, R_2, ..., R_n, G_n, B_n)^T.   (3)

Finally, we perform a Principal Component Analysis (PCA) to estimate the principal axes s_i, t_i of variation around the averages \bar{s} and \bar{t}, and the standard deviations σ_{S,i} and σ_{T,i}. The principal axes form an orthogonal basis, so

S = \bar{s} + \sum_{i=1}^{m} α_i s_i, \quad T = \bar{t} + \sum_{i=1}^{m} β_i t_i.   (4)

4. Model-Based Shape Analysis

The fitting algorithm is a generalization of a model-based algorithm for image analysis [6]. As we have pointed out above, most 3D scans are parameterized and sampled in terms of image coordinates u, v in a perspective projection. In each sample point, the scan stores the r, g, b components of the texture and the cartesian coordinates x, y, z of the point, so we can write the scan as

I_input(u, v) = (r(u, v), g(u, v), b(u, v), x(u, v), y(u, v), z(u, v))^T.   (5)

The algorithm solves the following optimization problem: Given I_input(u, v), find the shape and texture vectors S, T, the rigid pose transformation, camera parameters and lighting such that

1. the camera produces a color image that is as similar as possible to the texture r(u, v), g(u, v), b(u, v), and
2. the cartesian coordinates of the surface points fit the shape x(u, v) = (x(u, v), y(u, v), z(u, v))^T.

Solving the first problem, which is equivalent to 3D shape reconstruction from images [6], uniquely defines the rigid transformation and all the other parameters: Points such as the tip of the nose, which have coordinates x_k = (x_k, y_k, z_k)^T within the shape vector S, are mapped by the rigid transformation and the perspective projection to a pixel u_k, v_k in the image, and the color values in this pixel should be reproduced by the estimated texture and lighting.
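As a toy illustration of Equations (2)-(4), the synthesis of a face from the PCA model can be sketched as follows. This is a minimal sketch with random principal axes and small sizes standing in for the real model (which has n = 75972 vertices); all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_verts, n_comps = 100, 5          # toy sizes; the real model uses n = 75972 vertices
s_bar = rng.normal(size=3 * n_verts)             # average shape (x1, y1, z1, ...)
t_bar = rng.normal(size=3 * n_verts)             # average texture (R1, G1, B1, ...)
S_axes = rng.normal(size=(n_comps, 3 * n_verts)) # principal axes s_i (rows)
T_axes = rng.normal(size=(n_comps, 3 * n_verts)) # principal axes t_i (rows)

def synthesize(alpha, beta):
    """Equation (4): S = s_bar + sum_i alpha_i s_i, T = t_bar + sum_i beta_i t_i."""
    S = s_bar + alpha @ S_axes
    T = t_bar + beta @ T_axes
    return S, T

# Zero coefficients reproduce the average face.
S, T = synthesize(np.zeros(n_comps), np.zeros(n_comps))
assert np.allclose(S, s_bar) and np.allclose(T, t_bar)
```

In the real model the axes come from PCA of the registered training scans, so small coefficients (within a few standard deviations σ_{S,i}, σ_{T,i}) stay in the region of realistic faces.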
The same perspective projection solves the second problem, because the pixel u_k, v_k also stores the 3D coordinates of the same point, x(u_k, v_k). However, the 3D coordinates will, in general, differ by a rigid transformation that depends on the definition of coordinates by the manufacturer of the scanner. The algorithm, therefore, has to find two rigid transformations: one that maps the Morphable Model to camera coordinates such that the perspective projection fits with the coordinates u_k, v_k, and one that aligns the coordinate system of the model (in our case the camera coordinates) with the coordinate system of the scanner. We separate the two problems by pre-aligning the scans with our camera coordinate system in a first step. Before we describe this alignment, let us introduce some notation.

4.1. Rigid Transformation and Perspective Projection

In our analysis-by-synthesis approach, each vertex k is mapped from the model-based coordinates x_k = (x_k, y_k, z_k)^T in S (Equation 2) to the screen coordinates u_k, v_k in the following way:

Figure 2. If the reconstruction is computed only from the input texture (first image), the algorithm estimates the most plausible 3D shape (second image), given the shading and shape of the front view. Fitting the model to both texture and shape (third image) captures more characteristics of the face, which are close to the ground truth that we obtain when sampling the texture and shape values (right image).

A rigid transformation maps x_k to a position relative to the camera:

w_k = (w_{x,k}, w_{y,k}, w_{z,k})^T = R_γ R_θ R_φ x_k + t_w.   (6)

The angles φ and θ control in-depth rotations around the vertical and horizontal axes, γ defines a rotation around the camera axis, and t_w is a spatial shift. A perspective projection then maps vertex k to image plane coordinates u_k, v_k:

u_k = u_0 + f w_{x,k} / w_{z,k}, \quad v_k = v_0 - f w_{y,k} / w_{z,k}.   (7)

Here f is the focal length of the camera, which is located in the origin, and u_0, v_0 defines the image-plane position of the optical axis (principal point).

4.2. Prealignment of Scans

In the scan, the 3D coordinates found in u_k, v_k are x(u_k, v_k). The camera of the scanner mapped these to u_k, v_k, so we can infer the camera calibration of the scanner and thus

1. transform the scan coordinates x(u, v) to camera coordinates w(u, v), i.e. estimate the extrinsic camera parameters, and
2. estimate the focal length (and potentially more intrinsic camera parameters), and use these as fixed, known values in the subsequent model fitting.

In fact, the prealignment reverse-engineers the camera parameters that have been used in the software of the scanner from redundancies in the data. In the well-known literature on camera calibration, there are a number of algorithms that could be used for this task. For simplicity, we modified the model fitting algorithm [6] that will be used in the next processing step anyway. This makes sure that the definition of all camera parameters is consistent in both steps. The modified algorithm solves the non-linear problem of camera calibration iteratively.
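The projection of Equations (6)-(7) and the calibration cost it induces can be sketched as follows. This is a sketch under assumptions: the axis conventions for φ, θ, γ and the parameter packing in `e_cal` are choices made for illustration, and the check uses synthetic points rather than a real scan.

```python
import numpy as np

def project(x, phi, theta, gamma, t_w, f, u0=0.0, v0=0.0):
    """Equations (6) and (7): rigid transformation, then perspective projection.
    Assumed conventions: phi, theta, gamma rotate about the y, x and z axes."""
    c, s = np.cos(phi), np.sin(phi)
    R_phi = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])      # vertical axis
    c, s = np.cos(theta), np.sin(theta)
    R_theta = np.array([[1, 0, 0], [0, c, -s], [0, s, c]])    # horizontal axis
    c, s = np.cos(gamma), np.sin(gamma)
    R_gamma = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])    # camera axis
    w = x @ (R_gamma @ R_theta @ R_phi).T + t_w               # Equation (6)
    u = u0 + f * w[:, 0] / w[:, 2]                            # Equation (7)
    v = v0 - f * w[:, 1] / w[:, 2]
    return np.stack([u, v], axis=1), w

def e_cal(params, x_scan, uv_obs):
    """Calibration cost E_cal(phi, theta, gamma, t_w, f): the sum of squared
    reprojection errors over the sampled scan points."""
    phi, theta, gamma, tx, ty, tz, f = params
    uv, _ = project(x_scan, phi, theta, gamma, np.array([tx, ty, tz]), f)
    return float(np.sum((uv_obs - uv) ** 2))

# Synthetic check: n = 10 points projected with the true parameters give E_cal = 0.
rng = np.random.default_rng(1)
x_scan = rng.uniform([-50, -50, 900], [50, 50, 1100], size=(10, 3))
true = np.array([0.05, -0.02, 0.01, 5.0, -3.0, 100.0, 800.0])
uv_obs, _ = project(x_scan, *true[:3], true[3:6], true[6])
assert e_cal(true, x_scan, uv_obs) < 1e-9
```

The paper minimizes E_cal with Newton's method using analytic derivatives of (6) and (7); the sketch only evaluates the cost.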
First, we select n = 10 random non-void points u_i, v_i from the scan, and store their coordinates x(u_i, v_i). Equations (6) and (7) map these to image plane coordinates u'_i, v'_i. This defines a cost function

E_cal(φ, θ, γ, t_w, f) = \sum_{i=1}^{n} (u_i - u'_i)^2 + (v_i - v'_i)^2.

We find the minimum of E_cal by Newton's algorithm, using analytic derivatives of the rigid transformation and the perspective projection (6), (7). Using the rigid transformation (6), we map all scan coordinates x(u, v) to our estimated camera coordinates w(u, v).

4.3. Fitting the Model to Scans

The fitting algorithm finds the model coefficients, rigid transformation and lighting such that each vertex k of the Morphable Model is mapped to image plane coordinates u_k, v_k such that the color values are matched and the camera coordinates w_{z,k} of the model are as close as possible to the camera coordinates w_z(u_k, v_k) of the scan. Note that we only fit the depth w_z of the vertices: The frontoparallel coordinates w_x and w_y are fixed already by the fact that the model point and the scan point are in the same image position u_k, v_k, and an additional restriction on w_x and w_y would prevent the model from sliding along the surface to find the best match in terms of feature correspondence. In order to fit the model to the texture of the scan, the algorithm has to compensate effects of illumination and of the overall color distribution.

4.3.1 Illumination and Color

We assume that the scanning setup involves similar lighting effects as a standard photograph. We propose to simulate this explicitly, in the same way as it has been done for fitting a model to images [6]. This paragraph summarizes the steps involved in image synthesis, which will be part of the analysis algorithm. The normal vector of a triangle k_1 k_2 k_3 of the Morphable Model is given by a vector product of the edges, n = (x_{k1} - x_{k2}) × (x_{k1} - x_{k3}), which is normalized to unit length and rotated along with the head (Equation 6). For fitting the model to an image, it is sufficient to consider the centers of triangles only, most of which are about 0.2 mm² in size.
The 3D coordinate and color of the center are the arithmetic means of the corners' values. In the following, we do not formally distinguish between triangle centers and vertices k. The algorithm simulates ambient light with red, green, and blue intensities L_{r,amb}, L_{g,amb}, L_{b,amb}, and directed light with intensities L_{r,dir}, L_{g,dir}, L_{b,dir} from a direction l defined by two angles θ_l and φ_l:

l = (cos(θ_l) sin(φ_l), sin(θ_l), cos(θ_l) cos(φ_l))^T.   (8)

The illumination model of Phong (see [12]) approximately describes the diffuse and specular reflection of a surface. In each vertex k, the red channel is

L_{r,k} = R_k L_{r,amb} + R_k L_{r,dir} ⟨n_k, l⟩ + k_s L_{r,dir} ⟨r_k, v̂_k⟩^ν,   (9)

where R_k is the red component of the diffuse reflection coefficient stored in the texture vector T, k_s is the specular reflectance, ν defines the angular distribution of the specular reflections, v̂_k is the viewing direction, and r_k = 2 ⟨n_k, l⟩ n_k - l is the direction of maximum specular reflection [12]. Depending on the camera of the scanner, the textures may be color or gray level, and they may differ in overall tone. We apply gains g_r, g_g, g_b, offsets o_r, o_g, o_b, and a color contrast c to each channel. The overall luminance L of a colored point is [12]

L = 0.3 L_r + 0.59 L_g + 0.11 L_b.   (10)

Color contrast interpolates between the original color value and this luminance, so for the red channel we set

r = g_r (c L_r + (1 - c) L) + o_r.   (11)

Green and blue channels are computed in the same way. The colors r, g and b are drawn at a position (u, v) in the final image I_model.

4.3.2 Optimization

Just as in image analysis [6], the fitting algorithm optimizes shape coefficients α = (α_1, α_2, ...)^T and texture coefficients β = (β_1, β_2, ...)^T along with 21 rendering parameters, concatenated into a vector ρ, that contains the pose angles φ, θ and γ, the 3D translation t_w, ambient light intensities L_{r,amb}, L_{g,amb}, L_{b,amb}, directed light intensities L_{r,dir}, L_{g,dir}, L_{b,dir}, the angles θ_l and φ_l of the directed light, color contrast c, and gains and offsets of the color channels g_r, g_g, g_b, o_r, o_g, o_b. Unlike [6], we now keep the focal length f fixed. The main part of the cost function is a least-squares difference between the transformed input scan

I_input(u, v) = (r(u, v), g(u, v), b(u, v), w_z(u, v))^T   (12)

and the values I_model synthesized by the model:

E_I = \sum_{u,v} (I_input - I_model)^T Λ (I_input - I_model)   (13)

with a diagonal weight matrix Λ that contains an empirical scaling value between shape and texture, which is 128 in our system (depth is in mm, texture is in {0, ..., 255}). For initialization, another cost function is added to E_I that measures the distances between manually defined feature points j in the image plane, u_{init,j}, v_{init,j}, and the image coordinates of the projection u_{model,k_j}, v_{model,k_j} of the corresponding, manually defined model vertices k_j:

E_F = \sum_j ‖ (u_{init,j} - u_{model,k_j}, \; v_{init,j} - v_{model,k_j})^T ‖².   (14)

This additional term pulls the face model to the approximate position in the image plane in the first iterations. Its weight is reduced to 0 during the process of optimization. To avoid overfitting, we apply a regularization by adding penalty terms that measure the PCA-based Mahalanobis distance from the average face and the initial parameters [6, 5]:

E = η_I E_I + η_F E_F + \sum_i α_i² / σ_{S,i}² + \sum_i β_i² / σ_{T,i}² + \sum_i (ρ_i - ρ̄_i)² / σ_{R,i}².   (15)

Ad-hoc choices of η_I and η_F are used to control the relative weights of E_I, E_F, and the prior probability terms in (15). At the beginning, the prior probability and E_F are weighted high. The final iterations put more weight on E_I, and no longer rely on E_F. Triangles that are invisible due to self-occlusion of the face are discarded in the cost function. This is tested by a z-buffer criterion. Also, we discard shape data that are void and colors that are saturated. The algorithm takes cast shadows into account in the Phong model, based on a shadow-buffer criterion.

Figure 3. The textures on the right have been sampled from the scans in the left column. The inversion of illumination effects has removed most of the harsh lighting from the original textures. The method compensates both the results of overexposure and inhomogeneous shading of the face.
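The shading and color pipeline of Equations (8)-(11) can be sketched per vertex as follows. This is a minimal sketch: the values of k_s and ν are assumptions, scalars stand in for the per-vertex arrays, and a clamp to zero for back-facing terms is added for robustness (the equations in the text do not spell this out).

```python
import numpy as np

def light_direction(theta_l, phi_l):
    """Equation (8): directed light direction from two angles."""
    return np.array([np.cos(theta_l) * np.sin(phi_l),
                     np.sin(theta_l),
                     np.cos(theta_l) * np.cos(phi_l)])

def phong_channel(R_k, L_amb, L_dir, n_k, l, v_hat, k_s=0.2, nu=20.0):
    """Equation (9) for one color channel of vertex k (k_s, nu are assumed)."""
    r_k = 2.0 * np.dot(n_k, l) * n_k - l          # direction of max specular reflection
    return (R_k * L_amb
            + R_k * L_dir * max(np.dot(n_k, l), 0.0)
            + k_s * L_dir * max(np.dot(r_k, v_hat), 0.0) ** nu)

def color_transform(L_rgb, gains, offsets, c):
    """Equations (10) and (11): luminance, contrast, gain and offset."""
    L = 0.3 * L_rgb[0] + 0.59 * L_rgb[1] + 0.11 * L_rgb[2]    # Equation (10)
    return gains * (c * L_rgb + (1.0 - c) * L) + offsets       # Equation (11)

l = light_direction(0.0, 0.0)                  # light along +z
n = np.array([0.0, 0.0, 1.0])                  # normal facing the light
v_hat = np.array([0.0, 0.0, 1.0])              # viewer along +z
Lr = phong_channel(0.5, 0.3, 0.7, n, l, v_hat)
rgb = color_transform(np.array([Lr, Lr, Lr]), np.ones(3), np.zeros(3), 1.0)
# With c = 1, unit gains and zero offsets, the transform is the identity.
```

During fitting, the model evaluates this pipeline at the sampled triangle centers and compares the result against the scan texture via E_I.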

Figure 4. Morph between the 3D face on the left (Sample A) and the face on the right (Sample B). Both are in correspondence with the reference face due to the reconstruction using the Morphable Model and the sampling procedure. In the morph (middle), facial features are preserved while regions without remarkable characteristics are averaged smoothly.

The cost function is optimized with a stochastic version of Newton's method [6]: The algorithm selects 40 random triangles in each iteration, with a probability proportional to their area in the u, v domain, and evaluates E_I and its gradient only at their centers. The gradient is computed analytically using the chain rule and the equations of the synthesis that were given in this section. After fitting the entire face model to the image, the eyes, nose, mouth, and the surrounding region are optimized separately. The fitting procedure takes about 4 minutes on a 3.4 GHz Xeon processor.

5. Shape and Texture Sampling

After the fitting procedure, the optimized face is represented as a shape vector S and a texture vector T. Note that the texture values describe the diffuse reflectance of the face in each point, so the effect of illumination, which was part of the optimization problem, has already been compensated by the algorithm. However, both S and T are linear combinations of examples (Section 3), and they do not capture all details in shape or texture found in the original scan. For computer graphics applications, we can therefore sample the shape and texture of the original surface using the following algorithm, which is an extension of [5]: For each vertex k, the optimized model and camera parameters predict an image-plane position u_k, v_k. There, we find the coordinates of the scanned point in camera coordinates, w(u_k, v_k) (Section 4.2). By inverting the rigid transformation (6) with the parameters of the model fitting procedure (Section 4.3), we obtain coordinates that are consistent with the vertex coordinates x_k in S and can replace them.
If the point u_k, v_k in the scan is void or the distance to the estimated position exceeds a threshold, the estimated value is retained. In the color values at u_k, v_k, the effects of shading and color transformation have to be compensated. With the optimized parameters, we invert the color transformation (11), subtract the specular highlight (9), which we can estimate from the estimated light direction and surface normal, and divide by the sum of ambient and diffuse lighting to obtain the diffuse reflectances R_k, G_k, B_k. Note that the sampled scan is now a new shape and texture vector that is in full correspondence with the Morphable Model.

5.1. Saturated Color Values

In many scans and images, color values are saturated due to overexposure. On those pixels in the raw scan texture, the red, green or blue values are close to 255. We do not perform texture sampling in these pixels, because the assumptions of our illumination model are violated, so the inversion would not give correct results. Instead, the algorithm retains the estimated color values from the previous section. The model-based approach and the explicit simulation of lighting prove to be very convenient in this context. For a smooth transition between sampled and estimated color values, the algorithm creates a lookup mask in the u, v domain of the original scan, blurs this binary mask and uses the continuous values from the blurred mask as relative weights of sampled versus estimated texture. As a result, we obtain a texture vector that captures details of the eyes and other structures, but does not contain the specular highlights of the original data.

6. Results

We have tested the algorithm on a portion of the Face Recognition Grand Challenge (FRGC) dataset [23]. We selected pairs of scans of 150 individuals, taken under uncontrolled conditions. The scans of each person were recorded on two different days. For fitting, we used the 100 most relevant principal components. We manually clicked 7 points in each face, such as the corners of the eyes and the tip of the nose.
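The illumination inversion used during texture sampling can be sketched per channel as follows. This is a sketch under assumptions: the color contrast c is taken as 1 so the channels decouple (the full inversion of Equation (11) mixes channels through the luminance), scalars stand in for per-vertex arrays, and k_s and ν are assumed values.

```python
import numpy as np

def invert_shading(r_obs, gain, offset, L_amb, L_dir, n_dot_l, spec,
                   k_s=0.2, nu=20.0):
    """Recover the diffuse reflectance R_k from an observed color value by
    undoing the color transform (11) and the Phong model (9). `n_dot_l` is
    <n_k, l> and `spec` is <r_k, v_hat>, both from the estimated light
    direction and surface normal. Assumes color contrast c = 1."""
    L_r = (r_obs - offset) / gain                       # invert Equation (11)
    L_r -= k_s * L_dir * max(spec, 0.0) ** nu           # subtract specular highlight
    return L_r / (L_amb + L_dir * max(n_dot_l, 0.0))    # divide by ambient + diffuse

# Round trip: shade a known reflectance with Equation (9), then invert.
R_true, L_amb, L_dir, n_dot_l, spec = 0.5, 0.3, 0.7, 0.8, 0.9
shaded = R_true * L_amb + R_true * L_dir * n_dot_l + 0.2 * L_dir * spec ** 20.0
observed = 2.0 * shaded + 0.1                           # gain 2.0, offset 0.1
R_rec = invert_shading(observed, 2.0, 0.1, L_amb, L_dir, n_dot_l, spec)
assert abs(R_rec - R_true) < 1e-9
```

Saturated pixels (values near 255) would violate the linearity assumed here, which is why the paper skips sampling there and keeps the model-estimated reflectance instead.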
Figure 1 shows a typical result of fitting the model to one of the scans. Given the shape and texture (top left image), which we rendered from novel viewpoints in the second column of the figure, we obtained the best fit shown in the third column. The profile view of the reconstructed face shows many characteristic features of the face, such as the curved nose and the dent under the lower lip. To go beyond the linear span of examples, we sampled the true shape and texture of the face (right column). The images show that the reconstructed surface (at the ears) and the sampled surface are closely aligned. The mean depth error |w_{z,k} - w_z(u_k, v_k)| between the vertices of the reconstruction in Figure 1 and the ground truth scan was 1.88 mm. The mean error on

all 2 × 150 scans was 1.02 mm when neglecting outliers above a Euclidean distance of 10 mm and vertex viewing angles above 80°. The average percentage of outliers per face is 24%. When the outliers are also included in the average mean error for all 2 × 150 scans, the result is 2.74 mm.

Figure 5. ROC curves of verification across the two sets A and B, using Set A as gallery and Set B as probe (left) and vice versa (right), for the 3D, 2D and cross-modal intraPCA conditions. In the case of 3D-3D verification, at 1 percent false alarm rate, the hit rate is 92% for both types of comparison (A(3D)/B(3D) in the left image and B(3D)/A(3D) in the right image).

Figure 2 shows an additional set of results and demonstrates how the 3D information improves the reconstruction of the profile from the front view. As shown in Figure 1, the textures of the reconstructed and of the sampled face are normalized in terms of lighting and overall hue, so they can be used for simulating new illumination in computer graphics. To show how the algorithm removes lighting effects, Figure 3 gives a side-by-side comparison of two textures that were reconstructed and sampled from two different harsh illuminations. The result shows that the saturation of color values and the shading are removed successfully, and only a relatively small difference between the textures remains, so the textures are ready for simulating new illuminations. In Figure 4, we show how two of the scans (the one from Figure 1 and the right face in Figure 4) can be morphed within the framework of the Morphable Model.
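Since such morphs operate on faces that are in dense correspondence, they reduce to a convex combination of shape and texture vectors. A minimal sketch (the function name and the (shape, texture) vector-pair layout are illustrative, not taken from the paper):

```python
import numpy as np

def morph(face_a, face_b, t):
    """Morph between two faces given as (shape, texture) vector pairs.

    Both faces are assumed to be in full correspondence with the
    reference face, so corresponding entries of the stacked coordinate
    and color vectors refer to the same facial feature, and a morph is
    simply a linear blend with weight t in [0, 1].
    """
    (s_a, tex_a), (s_b, tex_b) = face_a, face_b
    shape = (1.0 - t) * s_a + t * s_b
    texture = (1.0 - t) * tex_a + t * tex_b
    return shape, texture
```

At t = 0 the first face is reproduced exactly, at t = 1 the second; intermediate values give the in-between faces shown in Figure 4.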
This is due to the fact that both the reconstructed and the sampled faces are in correspondence with the reference face.

Finally, we investigated a face recognition scenario with our algorithm, and evaluated how the additional shape information improves the performance compared to the image-only condition. After model fitting, we rescaled the model coefficients to α_i/σ_{S,i} and β_i/σ_{T,i}, and concatenated them to a coefficient vector. Using 100 coefficients each for shape and texture, for the entire face and for each of the segments (eyes, nose, mouth and the surrounding region), this adds up to 1000 dimensions. As a criterion for similarity, we used the scalar product. This is the same method as in [6]. We also performed a PCA of intra-object variation, and compensated for these variations [6]. Intra-object PCA was done with reconstructions from other faces that are not in the test set, using image-based or scan-based reconstructions for image-based or scan-based recognition, respectively. In the cross-modal condition, we used the intra-object PCA pooled from reconstructions from scans and from images. Table 1 gives the percentage of correct identification for a comparison of scans versus scans, images versus images, and cross-modal recognition. The results in Table 1 indicate that the use of range data improves the performance compared to the image-only condition. This is also shown in the ROC curves (Figure 5). The cross-modal condition is competitive with 2D-2D in verification, but not yet in identification. We plan a more sophisticated treatment of the intra-person variation between reconstructions from scans and those from images, but the results already show that the joint representation in the Morphable Model is a viable way for cross-modal recognition.

Gallery   Probe    Correct Ident.   intraPCA
A(3D)     B(3D)    96.0             3D
B(3D)     A(3D)    92.0             3D
A(2D)     B(2D)    84.7             2D
B(2D)     A(2D)    79.3             2D
A(3D)     B(2D)    71.3             2D and 3D
B(3D)     A(2D)    66.0             2D and 3D
A(2D)     B(3D)    66.7             2D and 3D
B(2D)     A(3D)    70.0             2D and 3D

Table 1.
Percentages of correct identification of n = 150 individuals in two sets of scans (A and B), comparing scans (3D shape and texture) or texture only (2D). The last four rows show cross-modal recognition.

7. Conclusion

Our results demonstrate that analysis-by-synthesis is not only a promising strategy in image analysis, but can also be applied to range data. The main idea is to explicitly simulate the projection of surface data into the pixels of a scan, and the effects of illumination that are found in the texture. The technique has a number of applications in biometric identification, but also in computer graphics, for example as a robust and reliable way to transform scans into shape and texture vectors of a Morphable Model for animation and

other high-level manipulations. It can be used for bootstrapping the Morphable Model [5] by including more and more scans in the vector space of faces. The algorithm may also be a tool for preprocessing raw scans, filling in missing regions automatically, and registering multiple scans.

References

[1] B. Achermann, X. Jiang, and H. Bunke. Face recognition using range images. In VSMM '97: Proc. of the 1997 Int. Conf. on Virtual Systems and MultiMedia, page 129, Washington, DC, USA, 1997. IEEE Computer Society.
[2] J. J. Atick, P. A. Griffin, and A. N. Redlich. Statistical approach to shape from shading: Reconstruction of three-dimensional face surfaces from single two-dimensional images. Neural Computation, 8:1321–1340, 1996.
[3] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, 14(2):239–256, 1992.
[4] C. Beumier and M. Acheroy. Face verification from 3D and grey level clues. Pattern Recognition Letters, 22:1321–1329, 2001.
[5] V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In Computer Graphics Proc. SIGGRAPH '99, pages 187–194, 1999.
[6] V. Blanz and T. Vetter. Face recognition based on fitting a 3D morphable model. IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(9):1063–1074, 2003.
[7] A. M. Bronstein, M. M. Bronstein, and R. Kimmel. Expression-invariant 3D face recognition. In Proc. Audio- and Video-based Biometric Person Authentication (AVBPA), Lecture Notes in Comp. Science No. 2688, pages 62–69. Springer, 2003.
[8] J. Y. Cartoux, J. T. Lapreste, and M. Richetin. Face authentication or recognition by profile extraction from range images. In Workshop on Interpretation of 3D Scenes, pages 194–199, 1989.
[9] K. I. Chang, K. W. Bowyer, and P. J. Flynn. An evaluation of multimodal 2D+3D face biometrics. IEEE Trans. Pattern Anal. Mach. Intell., 27(4):619–624, 2005.
[10] C.-S. Chua, F. Han, and Y. K. Ho. 3D human face recognition using point signature. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 233–238, 2000.
[11] T. Cootes, G. Edwards, and C. Taylor. Active appearance models. In Burkhardt and Neumann, editors, Computer Vision – ECCV '98, Vol. II, Freiburg, Germany, 1998. Springer, Lecture Notes in Computer Science 1407.
[12] J. Foley, A. van Dam, S. K. Feiner, and J. F. Hughes. Computer Graphics: Principles and Practice. Addison-Wesley, Reading, MA, 2nd edition, 1996.
[13] G. G. Gordon. Face recognition based on depth and curvature features. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 808–810, 1992.
[14] T. Heseltine, N. Pears, and J. Austin. Three-dimensional face recognition: An eigensurface approach. In Proc. IEEE International Conference on Image Processing, pages 1421–1424, Singapore, 2004. Poster.
[15] C. Hesher, A. Srivastava, and G. Erlebacher. A novel technique for face recognition using range imaging. In Proc. Seventh International Symposium on Signal Processing and Its Applications, volume 2, pages 201–204, 2003.
[16] J. C. Lee and E. Milios. Matching range images of human faces. In Proc. IEEE International Conference on Computer Vision, pages 722–726, 1990.
[17] Y. H. Lee and J. C. Shim. Curvature based human face recognition using depth weighted Hausdorff distance. In Proc. IEEE International Conference on Image Processing, pages 1429–1432, Singapore, 2004.
[18] X. Lu and A. K. Jain. Deformation modeling for robust 3D face matching. In CVPR '06, pages 1377–1383, Washington, DC, USA, 2006. IEEE Computer Society.
[19] X. Lu, A. K. Jain, and D. Colbry. Matching 2.5D face scans to 3D models. IEEE Trans. Pattern Anal. Mach. Intell., 28(1):31–43, 2006.
[20] S. Malassiotis and M. G. Strintzis. Pose and illumination compensation for 3D face recognition. In Proc. International Conference on Image Processing, Singapore, 2004.
[21] Z. Mao, J. P. Siebert, W. P. Cockshott, and A. F. Ayoub. Constructing dense correspondences to analyze 3D facial change. In ICPR '04, Volume 3, pages 144–148, Washington, DC, USA, 2004. IEEE Computer Society.
[22] A. S. Mian, M. Bennamoun, and R. Owens.
2D and 3D multimodal hybrid face recognition. In ECCV '06, pages 344–355, 2006.
[23] P. J. Phillips, P. J. Flynn, T. Scruggs, K. W. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek. Overview of the face recognition grand challenge. In CVPR '05, Volume 1, pages 947–954, Washington, DC, USA, 2005. IEEE Computer Society.
[24] T. Russ, C. Boehnen, and T. Peters. 3D face recognition using 3D alignment for PCA. In CVPR '06, pages 1391–1398, Washington, DC, USA, 2006. IEEE Computer Society.
[25] H. T. Tanaka, M. Ikeda, and H. Chiaki. Curvature-based face surface recognition using spherical correlation – principal directions for curved object recognition. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 372–377, 1998.
[26] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3:71–86, 1991.
[27] T. Vetter and T. Poggio. Linear object classes and image synthesis from a single example image. IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(7):733–742, 1997.
[28] X. Yuan, J. Lu, and T. Yahagi. A method of 3D face recognition based on principal component analysis algorithm. In IEEE International Symposium on Circuits and Systems, volume 4, pages 3211–3214, 2005.
[29] L. Zhang, N. Snavely, B. Curless, and S. M. Seitz. Spacetime faces: high resolution capture for modeling and animation. In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers, pages 548–558, New York, NY, USA, 2004. ACM Press.