Data-Driven Face Modeling and Animation

1. Research Team

Project Leader: Prof. Ulrich Neumann, IMSC and Computer Science
Post Doc(s): John P. Lewis
Graduate Students: Zhigang Deng, Zhenyah Mo
Undergraduate Students: Albin Cheenath

2. Statement of Project Goals

Avatars, computer animation, and other forms of human-centric computing all potentially require the generation of novel but realistic facial appearance and motion. Such face modeling and animation is currently the province of computer artists. While manually animated computer faces can be quite expressive, they usually fall short of complete realism in their appearance and motion (the recent movie Final Fantasy is one demonstration of the current state of the art). The manual effort required to model and animate faces also prevents their use in avatars and other contexts where the required behavior is not known in advance. Data-Driven Face Modeling and Animation is a new project that explores the use of facial models built directly from captured data to address the goals of realism and automation. More specifically, a range of typical data (either face shapes or motions) is first captured and manually labeled, and a high-level model of this data is created. Once this model is available, its parameters can be explored to extrapolate or synthesize novel face shapes or motions. While this general data-driven approach is a recent theme across several research groups, IMSC already has some unique results, as shown below.

3. Project Role in Support of IMSC Strategic Plan

Data-Driven Face Modeling and Animation is part of the general IMSC effort towards expressive human interaction in virtual and augmented reality environments. It is one of a trio of technologies being developed specifically for placing humans in virtual spaces. The complementary Facial Expression Processing project is directed towards capturing human expression (including non-speech gestures) and adapting the results to different computer facial models. The third project in the trio, Hair Modeling and Animation, addresses a major gap in current computer graphics renditions of humans.

4. Discussion of Methodology Used

We have developed two initial demonstrations of a data-driven approach to face modeling. The first models the combined shape and appearance of a database of two-dimensional faces. We demonstrate the capabilities of this model by extrapolating hidden regions of previously unprocessed faces (Figure 1). The second demonstration builds a mouth viseme and eye movement model from captured three-dimensional speech motion. This model is then used to synthesize facial motion on a 3D avatar in response to novel speech.

In the two-dimensional statistical face modeling demonstration, we adopt a principal component model of both the texture and shape of a number of faces. Specifically, a number of feature points are manually labeled on a large database of face photographs. A linear subspace (eigenspace) describing the shape variation across these feature points is obtained. The shape variation is then removed by morphing all faces to the mean shape, and a second model describing the face texture variation is obtained. This basic shape+texture linear subspace modeling approach was independently developed by a number of research groups and goes by various names, including Active Appearance Models and Morphable Models. Our contributions include:

Group modeling and outlier rejection. A comprehensive facial model should be able to represent faces of both sexes and of different ages and ethnicities. On the other hand, if abundant facial data for young Asian females (for example) is available, adding data on (for example) older black males will do little to improve the model for young Asian females, and may indeed degrade it. In our work we replace the global principal component analysis with a representation that considers near neighbors to be part of the same group. This improves modeling and prediction even given limited data (see Figure 1).

Model extrapolation. We use the statistics implicit in the data to extrapolate new face regions. In Figure 1, regions of novel faces not among the training pictures (left) are removed (center). The model then predicts the missing regions (right). The prediction is not always accurate, but any statistical regularities in the data will be revealed by our process. For example, if a particular type of face almost always has a wide mouth and thin lips, then the prediction will be of a wide mouth and thin lips. If the statistical trends are weak, the extrapolated result will appear blurry, as in the second row of Figure 1.
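To make the eigenspace extrapolation concrete, the following is a minimal Python/NumPy sketch of the idea. It assumes shape-normalized face textures stacked as rows of a matrix; the function names, the use of a single global PCA (rather than our near-neighbor group model), and the simple least-squares fill-in are illustrative assumptions, not the project's actual Java/C++ implementation.

```python
import numpy as np

def fit_pca(faces, k):
    """faces: (N, D) matrix, one shape-normalized face texture per row
    (each face already warped to the mean landmark shape).
    Returns the mean face and the top-k principal components."""
    mean = faces.mean(axis=0)
    # Thin SVD of the centered data yields the eigenfaces.
    _, _, Vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, Vt[:k]              # mean: (D,), basis: (k, D)

def extrapolate(face, visible, mean, basis):
    """Predict the hidden pixels of `face` from the visible ones.

    visible: boolean mask of length D marking observed pixels.
    Fits the PCA coefficients by least squares using only the visible
    pixels, then reconstructs the whole face and copies the predicted
    values into the hidden region.
    """
    A = basis[:, visible].T                       # (n_visible, k)
    b = face[visible] - mean[visible]             # (n_visible,)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    recon = mean + coeffs @ basis                 # full reconstruction
    out = face.copy()
    out[~visible] = recon[~visible]               # fill hidden region only
    return out
```

In the group-modeling variant, the fit would use only the principal components of the query face's near-neighbor group rather than a basis computed over the whole database.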

The human face is of paramount importance to other humans, and we can effortlessly make refined judgments about the attention and emotional qualities conveyed by a speaker. The eyes and mouth are the parts of the face most directly used to convey information, and they are the areas most attended to by an observer. In our second exploration of a data-driven approach, a three-dimensional avatar is animated from novel audio, using a data-driven approach to synthesize the crucial mouth and eye movements. To obtain the mouth motion model, facial motion capture data from the mouth region is analyzed to obtain a movement subspace; operating in this subspace, a smooth viseme co-articulation path for any given utterance is then obtained using a dynamic programming approach similar to that used in [1]. The resulting co-articulated viseme sequence is applied to a 3D avatar model using morphing by radial basis function interpolation.
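The dynamic programming step can be sketched as a Viterbi-style search: each animation frame has a target point in the mouth movement subspace, and the path chooses among captured mouth configurations so as to balance fidelity to the targets against frame-to-frame smoothness. The cost terms, weights, and data layout below are illustrative assumptions rather than the exact formulation of [1].

```python
import numpy as np

def viseme_path(targets, candidates, smooth_weight=1.0):
    """targets: list of T per-frame target points (d-dim) in the mouth
    subspace. candidates: (M, d) array of mouth configurations observed
    in motion capture. Returns a length-T list of candidate indices
    minimizing fit-to-target cost plus a smoothness penalty."""
    T, (M, _) = len(targets), candidates.shape
    # Per-frame cost of each candidate: distance to that frame's target.
    unary = np.stack([np.linalg.norm(candidates - t, axis=1)
                      for t in targets])                    # (T, M)
    # Pairwise smoothness cost between consecutive candidates.
    diff = candidates[:, None, :] - candidates[None, :, :]
    pair = smooth_weight * np.linalg.norm(diff, axis=2)     # (M, M)

    cost = unary[0].copy()
    back = np.zeros((T, M), dtype=int)
    for t in range(1, T):
        total = cost[:, None] + pair        # rows: previous, cols: current
        back[t] = total.argmin(axis=0)      # best predecessor per candidate
        cost = total.min(axis=0) + unary[t]
    # Backtrack the globally optimal path.
    path = [int(cost.argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Each index in the returned path selects a subspace point; in our pipeline the corresponding mouth shape is then transferred to the avatar mesh by radial basis function morphing.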

Eye movements and blinks are not strongly correlated with speech sounds, so we adopted a statistical texture synthesis method to capture this lack of correlation while still producing characteristic movements. Specifically, we employ the popular non-parametric sampling method [2], operating on the one-dimensional eye-blink signal (see Figure 2) and the quaternion eye orientation signal (data obtained from motion capture and hand labeling). Eye vergence will be considered in future work.
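In outline, non-parametric sampling in the style of [2] extends a signal one sample at a time by finding places in the captured data whose recent history resembles the synthesized history. The window size, match tolerance, and seeding below are illustrative choices, and the quaternion orientation signal would need an analogous distance on rotations.

```python
import numpy as np

def synthesize_blinks(captured, length, window=30, rng=None):
    """captured: 1D array of a recorded blink signal (e.g. eyelid
    openness over time). Returns `length` new samples that mimic its
    local statistics without repeating it verbatim."""
    rng = np.random.default_rng() if rng is None else rng
    # Seed the output with a random chunk of the captured signal.
    start = rng.integers(0, len(captured) - window)
    out = list(captured[start:start + window])
    # All length-`window` neighborhoods of the captured signal
    # (dropping the last sample so every neighborhood has a successor).
    hoods = np.lib.stride_tricks.sliding_window_view(
        captured[:-1], window)                         # (K, window)
    while len(out) < length:
        query = np.asarray(out[-window:])
        dists = np.linalg.norm(hoods - query, axis=1)
        # Sample among the close matches instead of always taking the
        # single best, so the synthesis does not lock into a loop.
        good = np.flatnonzero(dists <= 1.1 * dists.min() + 1e-9)
        pick = rng.choice(good)
        out.append(captured[pick + window])            # successor sample
    return np.asarray(out[:length])
```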

5. Short Description of Achievements in Previous Years

The Data-Driven Face Modeling and Animation effort is a new project, started within the last year. It uses experience we gained in the ongoing Facial Expression Processing project, described in a separate project summary.

5a. Detail of Accomplishments During the Past Year

This is a new project undertaken during the last year. Section 4 describes the accomplishments to date.

Figure 1. Regions of novel faces not among the training data (left) are obscured (center). Our model then predicts the missing regions (right).

6. Other Relevant Work Being Conducted and How this Project is Different

Data-driven modeling is a current trend in computer graphics research, and it has been applied to speech modeling [3] as well as other areas such as body animation and texture synthesis. Our work is distinguished in several ways. The rejection of outliers in the two-dimensional statistical model refines the modeling capability offered by the basic appearance model approach, but without introducing additional complexity. We are the first group to apply data-driven modeling to eye motion as well as speech.

Figure 2. Motion capture data for eye blinks over time (red graph), and novel synthesized eye blink motion (blue graph).

7. Plan for the Next Year

It is most effective to evaluate research results in the context in which they will eventually be used. One of our main goals is to enable natural and expressive avatars. As such, we will work towards a version of data-driven facial modeling that is suitable for avatars used in the Communication testbed. Realism is a major challenge for computer face renditions: while stylized or cartoon-like faces are generally effective, attempts at realism often appear rigid or even creepy. Since realism is not the focus of our research, we will explore data-driven approaches to stylized face renditions.

8. Expected Milestones and Deliverables

The current deliverables include several manually annotated facial databases and the two prototype programs described above (roughly 10,000 lines of Java and C++, respectively). Our future milestones are as follows:

- Stylized facial rendering from data
- Data-driven lip-synch: improvements to quality and speed

9. Member Company Benefits

Sophisticated statistical models of face appearance such as the one we prototyped have a number of uses, including:

- Older or younger versions of a face can be approximately extrapolated, by first fitting the face using the model and then moving along an age axis in face space [4].
- The estimation of a missing face part may have application in reconstructive surgery: in cases where a face region is deformed at birth, there is no prior photograph to provide a guide for the surgery.
- Similarly, the estimation of missing face parts may be useful in security or other person-identification contexts, as when a photograph of an individual is available but portions of the face are obscured.
- Existing face-finding algorithms often cannot localize parts of a found face other than the eyes and mouth center. By pairing such an algorithm with a statistical appearance model, it will be possible to automatically locate many specific facial features, such as the mouth corners, nostrils, and points along the eyebrows. This capability is in turn useful in diverse areas including previewing medical procedures, virtual cosmetic preview, and various computer gaming applications.

10. References

[1] CHUANG, E., AND BREGLER, C. 2002. Performance Driven Facial Animation using Blendshape Interpolation. CS-TR-2002-02, Department of Computer Science, Stanford University.
[2] EFROS, A. A., AND LEUNG, T. K. 1999. Texture Synthesis by Non-parametric Sampling. ICCV '99, 1033-1038.
[3] EZZAT, T., GEIGER, G., AND POGGIO, T. 2002. Trainable Videorealistic Speech Animation. ACM Transactions on Graphics 21, 3, 388-398.
[4] TIDDEMAN, B., BURT, D., AND PERRETT, D. 2001. Prototyping and Transforming Facial Textures for Perception Research. IEEE Computer Graphics and Applications 21, 5, 42-50.
