An Example-based Approach to Text-driven Speech Animation with Emotional Expressions

Hyewon Pyun, Wonseok Chae, Yejin Kim, Hyungwoo Kang, and Sung Yong Shin

CS/TR, July 19, 2004
KAIST, Department of Computer Science

1 Introduction

1.1 Motivation

Visual speech animations of virtual characters have been playing increasingly important roles in computer graphics applications such as computer games, movies, and internet broadcasting. Beyond those applications, the face is the most important part of the human body in everyday life: it is how we recognize each other and express our emotions. Realistic facial animation is therefore effective in human-computer interaction. For example, virtual characters with talking faces are now widely used as guiding agents at information desks, digital actors in movies, and avatars in internet chatrooms.

In general, the visual speech animation of a virtual character must be synchronized with a given input speech track. Most previous speech animation approaches have therefore focused on lip synchronization [2, 4, 5, 6, 7, 11, 24]; that is, they have mainly been concerned with producing a visual speech animation in which lip movements are synchronized with the speech track. For a more natural and realistic facial animation, however, it is necessary to incorporate emotional expressions into such a pure lip-sync animation. While many approaches have been proposed for generating emotional expressions on given face models [1, 9, 10, 12, 15, 18, 21, 23, 25], none of them provides an explicit solution for combining lip movements and facial expressions seamlessly.

In this paper, we propose a novel scheme for the real-time speech animation of a 3D face model that effectively combines lip-sync movements with emotional expressions. To achieve this goal, we address three issues. First, for realistic 3D lip synchronization, we give a simple, effective scheme for producing a text-driven lip-sync animation with coarticulation effects. Second, for fast and intuitive facial expression control, we provide an example-based expression synthesis scheme based on scattered data interpolation. Finally, for combining the lip-sync animation with emotional expressions in an on-line manner, we present an importance-based scheme for compositing facial models. The resulting facial animations are smooth, expressive, easy to control, and generated in real time.

1.2 Related Work

Speech Animation

Many methods have been proposed for generating a visual speech animation synchronized with a given speech track; they produce either 2D or 3D facial animations. 2D methods produced video streams based on analysis of input videos or image samples [2, 4, 6, 7]. While the 2D methods were capable of generating realistic speech animations, their main concern was lip synchronization. To incorporate upper-face expressions as well, Brand proposed a whole-face speech animation method driven by an audio signal, based on a statistical training model [3]. With this method, however, it is not easy for animators to control expressions or predict the animation results.

3D speech animation methods provided animators with more freedom in viewing, lighting, manipulation, and reuse. Based on Parke's parametric facial expression model [14], Pearce et al. developed an early system for 3D speech animation [17]. This system was able to convert a string of phonemes into time-varying control parameters to produce an animation sequence. Cohen and Massaro further extended this approach to generate a 3D speech animation from text [5]. Waters et al. developed a text-driven 3D speech animation system called DECface, based on the key-framing of 3D viseme models [24]. Kalberer et al. proposed a similar method, but captured the viseme models from a talking human face with a 3D digitizer to produce more realistic speech animation [11]. All of these 3D methods have also focused mainly on lip synchronization. Although the parametric approach allows for the control of upper-face expressions, manipulating facial expressions with a set of parameters is not intuitive for animators.

Expression Modeling

There is also a rich body of work on creating whole-face expressions independently of a speech track. Physically-based approaches synthesized facial expressions by simulating the physical properties of facial skin and muscles [12, 21, 23]. Parke proposed a parametric approach that represents the motion of a group of vertices to generate a wide range of facial expressions [15].

In performance-driven approaches, facial animations were generated from facial motion data captured from the live performance of an actor [9, 25]. Kalra et al. used free-form deformation to manipulate facial expressions [10]. Pighin et al. presented an example-based approach to generate photorealistic 3D facial expressions from a set of 2D photographs [18], and Blanz and Vetter proposed an expression synthesis scheme using a large set of example 3D face models [1]. Most of these approaches have focused on pure expression synthesis, without addressing the problem of synchronization with a given speech track.

1.3 Overview

As shown in Figure 1, our method consists of three main components responsible for lip movements, emotional expressions, and their composition, respectively. Our method first generates a visual speech animation synchronized with a speech track obtained from text. We use an available TTS (Text-To-Speech) system to obtain a synthesized speech track, together with the corresponding phonemes and their lengths. As a preprocess, we construct a set of 3D face models called visemes (the term viseme is an abbreviation of "visual phoneme"), that is, the visual counterparts of phonemes [8]. Regarding a sequence of 3D viseme models as key-frames, our method then generates a visual speech animation synchronized with the phoneme sequence by interpolating the viseme models. To reflect coarticulation effects, the viseme model at each key-frame is adjusted in accordance with the length of the corresponding phoneme.

Figure 1: System overview (text → TTS → lip-sync animation generation; user-supplied emotional parameters → facial expression animation generation; composition → speech animation).
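To make the data flow of Figure 1 concrete, the following Python sketch shows one way the three components could be driven frame by frame. It is illustrative only and not the authors' implementation; the three callables stand in for the procedures of Sections 2, 3, and 4, and their names are ours.

def animate(phoneme_lengths, emotion_curve, synth_viseme, synth_expression,
            composite, fps=30.0):
    """Per-frame driver for the pipeline of Figure 1 (a sketch).

    phoneme_lengths: phoneme durations in seconds, as reported by the TTS system.
    emotion_curve:   maps a time t to a 2D emotion parameter vector (Section 3).
    synth_viseme, synth_expression, composite: placeholder callables for the
    components described in Sections 2, 3, and 4, respectively.
    """
    total = float(sum(phoneme_lengths))
    frames = []
    for k in range(int(total * fps)):
        t = k / fps
        viseme = synth_viseme(t)                          # lip-sync face at time t
        expression = synth_expression(emotion_curve(t))   # emotion face at time t
        frames.append(composite(viseme, expression))      # Section 4 composition
    return frames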

While synthesizing the output viseme model at each frame, our method simultaneously produces an emotional expression model to be combined with the output viseme model. For intuitive and efficient synthesis of the emotional expression model, we adopt the scattered data interpolation scheme proposed by Sloan et al. [20]. Referring to the emotion space diagram [19], we first parameterize a set of example 3D face models with key-expressions on a 2D space. Given a parameter vector, we obtain the corresponding emotional expression model by blending the key-expression models. Based on cardinal basis functions, this scheme produces the expression model in real time.

Finally, we combine the output viseme model with the expression model at every frame to obtain a smooth, expressive, and realistic 3D facial animation. To avoid conflicts between the viseme model and the expression model, we propose an importance-based approach for their seamless composition. That is, we assign each vertex of the face model an importance value based on its relative contribution to accurate pronunciation with respect to the emotional expressions. The viseme model and the expression model at each frame are blended vertex-wise using the importance values as weights, producing a convincing visual speech animation with emotional expressions in real time. While the output viseme model is synthesized at every frame driven by the input text, the corresponding expression model is generated simultaneously in accordance with a stream of emotion parameters interactively fed by an animator.

The remainder of this paper is organized as follows. In Sections 2 and 3, we describe our methods for generating the output viseme models and expression models, respectively. We explain how to combine them to obtain the final result in Section 4. In Section 5, we show some experimental results. Finally, we conclude this paper and discuss future research in Section 6.

2 Lip-Sync Animation

In this section, we present our method for generating a visual speech animation synchronized with a given speech track. In general, the speech track can be obtained either by recording a human voice or from a speech synthesis system. Since our objective is to develop a text-driven speech animation system, we utilize an available TTS system to acquire the input speech track. One advantage of using a TTS system is that we can directly obtain the sequence of phonemes and their lengths. Given such information, our initial objective is to generate the lip-motion animation synchronized with the phoneme sequence.

For generating realistic lip movements, we predefine a set of key-viseme models for English phonemes, based on the assumption that any output viseme model can be made by blending a finite set of key-visemes [6]. Although the number of phonemes in American English is larger, the number of corresponding key-visemes can be much smaller, since a single viseme can represent two or more phonemes. We select 14 key-visemes for our experiments: 6 key-visemes represent the consonantal phonemes, 7 key-visemes represent the vocalic phonemes, and one extra key-viseme represents silence. This last key-viseme is also used as the base model, whose shape is deformed to create all other key-viseme models. Thus, the animator needs to construct the 14 3D face models corresponding to the key-visemes. Figures 2 and 3 show the key-viseme models for the vowels and the consonants, respectively.

Figure 2: Key-viseme models for the vowels (a, ae, o, uh, u, e, i).

Figure 3: Key-viseme models for the consonants (g, d, m, r, f, ch).

Figure 4: Key-viseme models for an example phoneme sequence ([Hello]: h, e, l, o, u).

From the set of key-viseme models, we select the models corresponding to the phonemes in the input sequence (Figure 4).

Let H and L denote the phoneme sequence and the phoneme length sequence, respectively, obtained from the TTS system, and let P denote the corresponding key-viseme sequence:

H = {H_1, H_2, ..., H_m},   (1)
L = {l_1, l_2, ..., l_m},   (2)
P = {P_1, P_2, ..., P_m},   (3)

where each key-viseme model P_j is a polygonal mesh composed of a set of vertices v_i^j = (x_i^j, y_i^j, z_i^j), 1 ≤ i ≤ n, and a set of edges connecting them. From L, we first compute the time instant t_j, 1 ≤ j ≤ m, at which each key-viseme model P_j is located along the time axis. Let T be the sequence of such time instants:

T = {t_1, t_2, ..., t_m}.   (4)

Assuming that a local extremum is achieved at every key-frame, where a key-viseme model is placed, we compute each t_j for key-viseme P_j as follows:

t_j = l_j / 2 + \sum_{k=1}^{j-1} l_k.   (5)

Here, we also assume that each key-viseme model is placed at the middle of the corresponding phoneme duration along the time axis. With every key-viseme model P_j placed at the corresponding time t_j, we first describe our basic interpolation scheme and then elaborate it to handle more complex cases such as long phoneme durations and coarticulation. Our basic interpolation scheme constructs a piecewise cubic spline interpolating each sequence of corresponding vertices of the selected key-viseme models (Figure 5).

Figure 5: A piecewise cubic spline interpolation.
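As a small illustration of Equation (5), the Python snippet below (ours, not part of the report) computes the key-frame times from a list of phoneme lengths returned by the TTS system.

import numpy as np

def keyframe_times(lengths):
    # Equation (5): place each key-viseme at the middle of its phoneme,
    # t_j = l_j / 2 + sum of the preceding phoneme lengths.
    lengths = np.asarray(lengths, dtype=float)
    starts = np.concatenate(([0.0], np.cumsum(lengths)[:-1]))
    return starts + lengths / 2.0

# Example: three phonemes lasting 0.08 s, 0.12 s, and 0.10 s
print(keyframe_times([0.08, 0.12, 0.10]))   # -> [0.04 0.14 0.25]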

We set the tangent vector of this curve at every key-frame to zero to ensure a local extremum at the key-frame. The x coordinate of the vertex v_i at time t (t_j ≤ t ≤ t_{j+1}) is represented by a cubic polynomial:

x_i(t) = a t^3 + b t^2 + c t + d.   (6)

Here, the coefficients a, b, c, d are obtained from the following constraints:

x_i(t_j) = x_i^j,   x_i(t_{j+1}) = x_i^{j+1},   x_i'(t_j) = x_i'(t_{j+1}) = 0,   (7)

where x_i^j and x_i^{j+1} denote the x coordinates of the vertex v_i at times t_j and t_{j+1}, respectively. The y and z coordinates are computed similarly.

We now explain how to handle a very long phoneme duration, where the same lip shape needs to be kept for a while. For each phoneme H_j, we assign a viseme maintenance interval M_j centered at t_j, whose length l(M_j) is defined as follows:

l(M_j) = l_j - δ_1  if l_j > δ_1,  and  0 otherwise.   (8)

If l_j > δ_1, we use both ends of this interval (denoted t_j^s and t_j^e) instead of t_j for spline interpolation with the neighboring key-visemes. Phoneme H_3 in Figure 5 shows this case; its magnified view is given in Figure 6.

Figure 6: A viseme maintenance interval.

Finally, we describe how to incorporate coarticulation effects into our visual speech animation. When a real person speaks, the lips do not always fully reach each key-viseme in the sequence, since the duration of each phoneme is usually very short. Therefore, the lip shapes of the neighboring phonemes tend to force the current viseme toward a similar lip shape. That is, the viseme for each phoneme is affected by the current phoneme length and the neighboring lip shapes; this effect is called coarticulation.
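The following Python sketch illustrates this part of the scheme. With zero tangents at both key-frames, the cubic of Equations (6)-(7) reduces to a smoothstep blend between consecutive key-visemes, and the maintenance interval of Equation (8) is included as a small helper. It is our reading of the text, not the authors' code.

import numpy as np

def interpolate_visemes(key_positions, key_times, t):
    # Piecewise cubic interpolation with zero tangents at every key-frame
    # (Equations 6-7). key_positions is an (m, n, 3) array of m key-viseme
    # meshes with n vertices each; key_times are the t_j of Equation (5).
    key_times = np.asarray(key_times, dtype=float)
    t = np.clip(t, key_times[0], key_times[-1])
    j = min(np.searchsorted(key_times, t, side="right") - 1, len(key_times) - 2)
    s = (t - key_times[j]) / (key_times[j + 1] - key_times[j])
    w = 3.0 * s**2 - 2.0 * s**3          # cubic blend with zero end tangents
    return (1.0 - w) * key_positions[j] + w * key_positions[j + 1]

def maintenance_interval(t_j, l_j, delta1):
    # Equation (8): for a long phoneme, the key pose is held over an interval
    # of length l_j - delta1 centred at t_j instead of a single instant.
    half = max(l_j - delta1, 0.0) / 2.0
    return t_j - half, t_j + half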

To approximate this coarticulation effect, we propose a simple and effective scheme based on shape blending of successive visemes. Starting from the first key-viseme model P_1 in P, we successively adjust each key-viseme model P_j with respect to the previous key-viseme model, to obtain a new key-viseme sequence \tilde{P}:

\tilde{P} = {\tilde{P}_1, \tilde{P}_2, ..., \tilde{P}_m},   (9)

where a vertex position \tilde{v}_i^j of key-viseme \tilde{P}_j is obtained as follows:

\tilde{v}_i^j = w(l_j) v_i^j + (1 - w(l_j)) \tilde{v}_i^{j-1}.   (10)

Here, the weight w(l_j) is a function of the corresponding phoneme length l_j. When l_j is long enough, the original key-viseme is achieved fully; otherwise, the key-viseme is adjusted according to Equation (10). From empirical observations, we derive a heuristic for viseme transition: the transition speed from one key-viseme to the next is initially low, then abruptly becomes high, and finally slows down again, as shown in Figure 7. Based on this heuristic, we define the weight function w(l_j):

w(l_j) = -2 l_j^3 / δ_2^3 + 3 l_j^2 / δ_2^2  if l_j < δ_2,  and  1 otherwise,   (11)

which is derived from the constraints w(0) = 0, w(δ_2) = 1, and w'(0) = w'(δ_2) = 0. With the modified key-viseme sequence \tilde{P} obtained in this way, we then apply our key-framing scheme as explained above to obtain a smooth and natural-looking lip-sync animation.

Figure 7: The weight function w(l).
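Below is a Python sketch of Equations (9)-(11). It assumes that the previously adjusted key-viseme is used on the right-hand side of Equation (10), which is how we read "successively adjust"; δ_2 is a user-chosen threshold.

import numpy as np

def coarticulation_weight(l, delta2):
    # Equation (11): w(0)=0, w(delta2)=1, w'(0)=w'(delta2)=0, and w=1 beyond delta2.
    if l >= delta2:
        return 1.0
    s = l / delta2
    return 3.0 * s**2 - 2.0 * s**3

def adjust_visemes(key_visemes, lengths, delta2):
    # Equations (9)-(10): blend each key-viseme toward its predecessor when the
    # phoneme is short. key_visemes is an (m, n, 3) array of vertex positions.
    adjusted = np.array(key_visemes, dtype=float, copy=True)
    for j in range(1, len(adjusted)):
        w = coarticulation_weight(lengths[j], delta2)
        adjusted[j] = w * adjusted[j] + (1.0 - w) * adjusted[j - 1]
    return adjusted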

3 Emotional Expression Animation

In this section, we describe our example-based emotional expression synthesis scheme. Like the viseme models for pronunciation, a human face has well-characterized emotional key-expressions, so a variety of emotional expressions can be obtained by blending those key-expressions. Referring to the emotion space diagram [19], we choose six emotional key-expressions: neutral, happy, sad, surprised, afraid, and angry. The 3D face models corresponding to these key-expressions and their parameterization are shown in Figures 8 and 9, respectively. Note that the neutral expression model is the same as the viseme model representing silence, that is, the base model.

Figure 8: Emotional key-expression models (neutral, happy, sad, surprised, afraid, angry).

For intuitive and efficient emotional expression synthesis from these key-expression models, we adopt a multi-dimensional scattered data interpolation scheme [20]. We first parameterize the key-expression models in a 2D space with the neutral expression located at the center (Figure 9). Our expression synthesis problem is then transformed into a scattered data interpolation problem. We predefine the weight function for each key-expression model based on cardinal basis functions consisting of linear and radial basis functions. The global shapes of the weight functions are first approximated by linear basis functions and then adjusted locally by radial basis functions to exactly interpolate the key-expression models. At runtime, we interactively specify a parameter vector and synthesize the corresponding expression model by blending the key-expression models with the weight values obtained by evaluating the predefined weight functions at this parameter vector. The weight function w_i(·) of each key-expression model E_i, 1 ≤ i ≤ M, at a parameter vector p is defined as follows:

w_i(p) = \sum_{l=0}^{2} a_{il} A_l(p) + \sum_{j=1}^{M} r_{ji} R_j(p),   (12)

where A_l(p) and a_{il} are the linear basis functions and their coefficients, respectively, and R_j(p) and r_{ji} are the radial basis functions and their coefficients.

Let p_i, 1 ≤ i ≤ M, be the parameter vector of key-expression model E_i. To interpolate the key-expression models exactly, the weight of a key-expression model E_i should be one at p_i and zero at every p_j with j ≠ i; that is, w_i(p_j) = 1 for i = j and w_i(p_j) = 0 for i ≠ j. Ignoring the second term of Equation (12), we first solve for the linear coefficients a_{il}:

w_i(p) = \sum_{l=0}^{2} a_{il} A_l(p).   (13)

The linear bases are simply A_l(p) = p_l, 1 ≤ l ≤ 2, where p_l is the l-th component of p, and A_0(p) = 1. Using the parameter vector p_i of each key-expression model and its weight w_i(p_i), we employ a least-squares method to evaluate the unknown linear coefficients a_{il}.

Given the linear approximation, we then compute the residuals for the key-expression models as follows:

\tilde{w}_i(p) = w_i(p) - \sum_{l=0}^{2} a_{il} A_l(p),  for all i.   (14)

With these residuals, we solve for the radial coefficients r_{ji} in Equation (12). The radial basis function R_j(p) is a function of the Euclidean distance between p and p_j in the parameter space:

R_j(p) = B(\| p - p_j \| / α),  for 1 ≤ j ≤ M,   (15)

where B(·) is the cubic B-spline function and α is the dilation factor, chosen as the separation to the nearest other key-expression model in the parameter space. The radial coefficients are then found by solving the matrix system

r R = \tilde{w},   (16)

where r is an M × M matrix of the unknown radial coefficients r_{ji}, and R and \tilde{w} are matrices of the same size defined by the radial bases and the residuals, respectively, such that R_{ij} = R_i(p_j) and \tilde{w}_{ij} = \tilde{w}_i(p_j).

With the weight functions predefined, we can blend the key-expression models at runtime. Using the predefined weight functions for the key-expression models E_j, 1 ≤ j ≤ M, as given in Equation (12), we generate a new face model E_new at the parameter vector p_in:

v_i^new(p_in) = \sum_{j=1}^{M} w_j(p_in) v_i^j,   (17)

where v_i^new and v_i^j, 1 ≤ i ≤ n, denote the i-th vertex of E_new and of E_j, respectively.
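The Python sketch below illustrates how the weight functions of Equations (12)-(16) can be precomputed and then evaluated for the runtime blend of Equation (17). It follows the scheme of Sloan et al. [20] as described above; the exact B-spline kernel and the per-key dilation factor are our assumptions rather than values taken from the report.

import numpy as np

def cubic_bspline(t):
    # Uniform cubic B-spline kernel B(|t|), nonzero for |t| < 2 (an assumed kernel).
    t = np.abs(np.asarray(t, dtype=float))
    out = np.zeros_like(t)
    near, far = t < 1.0, (t >= 1.0) & (t < 2.0)
    out[near] = (4.0 - 6.0 * t[near]**2 + 3.0 * t[near]**3) / 6.0
    out[far] = (2.0 - t[far])**3 / 6.0
    return out

def fit_cardinal_weights(params):
    # Precompute the cardinal-basis weight functions of Equation (12) for the
    # key-expression parameter vectors params (M x 2). Returns weights(p),
    # the M blend weights at an arbitrary parameter vector p.
    params = np.asarray(params, dtype=float)
    M = len(params)
    A = np.hstack([np.ones((M, 1)), params])                 # A_0 = 1, A_1, A_2
    target = np.eye(M)                                       # cardinal values w_i(p_j)
    lin, *_ = np.linalg.lstsq(A, target, rcond=None)         # Equation (13), least squares
    dists = np.linalg.norm(params[:, None, :] - params[None, :, :], axis=-1)
    alpha = np.where(dists > 0, dists, np.inf).min(axis=1)   # nearest other key (dilation)
    R = cubic_bspline(dists / alpha[None, :])                # R[k, j] = R_j(p_k), Equation (15)
    residual = target - A @ lin                              # Equation (14)
    rad = np.linalg.solve(R, residual)                       # Equation (16)

    def weights(p):
        p = np.asarray(p, dtype=float)
        a = np.concatenate(([1.0], p))
        r = cubic_bspline(np.linalg.norm(p - params, axis=1) / alpha)
        return a @ lin + r @ rad

    return weights

# Runtime blending (Equation 17): with the key-expression meshes stacked in an
# (M, n, 3) array E, the face at parameter p is np.tensordot(weights(p), E, 1).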

In practice, we employ a cubic spline curve in the parameter space to continuously feed the emotional parameters during an animation (see Figure 9).

Figure 9: The parameter space for expression synthesis (the neutral expression at the origin, with the other key-expressions placed around it).

4 Composition

In this section, we explain how to combine a visual speech animation and emotional expressions. Since they are generated frame by frame simultaneously, our problem reduces to compositing a viseme model and an expression model. Analyzing the key-viseme models and the key-expression models with respect to the base model (the neutral expression), we characterize the vertices of the face model in terms of their contributions to facial movements. For example, vertices near the eyes contribute mainly to emotional expressions, while vertices near the mouth contribute to both pronunciation and emotional expressions. However, when the two types of facial movement conflict, the vertex movements should be constrained mainly by pronunciation. Based on this observation, we introduce the notion of importance, which measures the relative contribution of each vertex to pronunciation with respect to the emotional expressions. Let α_i denote the importance value of vertex v_i, with 0 ≤ α_i ≤ 1 for all i. If α_i ≥ 0.5, the movement of v_i is constrained by pronunciation; otherwise, it is constrained by the emotional expressions.

To compute the importance values effectively, we have empirically derived the following three rules. First, the importance of a vertex is proportional to the norm of its displacement vector from the corresponding vertex of the base model. Second, a vertex with a small displacement is considered important if it has a neighboring vertex with a large displacement. Finally, a vertex of high importance is constrained by pronunciation, and a vertex of low importance is constrained by the emotional expressions.

According to these rules, we compute the importance of each vertex in three steps. In the first two steps, two independent importance values are computed from the viseme models and the expression models, respectively; they are then combined into the final importance in the third step. Let p_1(v_i) and e_1(v_i), 1 ≤ i ≤ n, be the pronunciation and emotion importances, respectively. In the first step, these importances are computed from the maximum norms of the displacement vectors of each vertex v_i over the respective models, normalized so that their values range from 0 to 1:

p_1(v_i) = \max_j \| v_i^{P_j} - v_i \| / \max_{j,k} \| v_k^{P_j} - v_k \|,
e_1(v_i) = \max_j \| v_i^{E_j} - v_i \| / \max_{j,k} \| v_k^{E_j} - v_k \|,

where v_i^{P_j} and v_i^{E_j} denote the vertices in a viseme model P_j and an expression model E_j, respectively, corresponding to a vertex v_i in the base model.

In the second step, we propagate the importance value of each vertex to its neighboring vertices if it is large enough. The importances p_2(v_i) and e_2(v_i) are obtained as follows:

p_2(v_i) = \max( {p_1(v_i)} ∪ {p_1(v_j) : \| v_i - v_j \| < L_p, p_1(v_j) > S_1} ),
e_2(v_i) = \max( {e_1(v_i)} ∪ {e_1(v_j) : \| v_i - v_j \| < L_e, e_1(v_j) > S_1} ),

where S_1, L_p, and L_e are control parameters.

In the final step, the importance α_i of the vertex v_i is obtained as follows:

α_i = p_2(v_i) (1 - e_2(v_i))  if p_2(v_i) < S_2,  and  1 - (1 - p_2(v_i)) e_2(v_i)  otherwise.   (18)

Equation (18) adjusts the importance values so that they cluster near the two extremes, zero and one. Figure 10 shows the importance distributions after the first, second, and third steps; brighter regions indicate higher importance values.

Now we are ready to explain how to composite an expression model E and a viseme model P derived from the base model B. Let B = {v_1, v_2, ..., v_n}, E = {v_1^E, v_2^E, ..., v_n^E}, and P = {v_1^P, v_2^P, ..., v_n^P}.
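A brute-force Python sketch of the three steps is given below. The thresholds S_1, S_2 and the propagation radii L_p, L_e are control parameters whose values the report does not state, so the defaults here are placeholders.

import numpy as np

def importance(base, visemes, expressions, S1=0.3, S2=0.5, Lp=0.05, Le=0.05):
    # base: (n, 3) vertices of the base model; visemes, expressions: (m, n, 3)
    # stacks of key-viseme and key-expression models. Returns alpha, shape (n,).
    def step1(models):
        # Normalized maximum displacement norm per vertex.
        disp = np.linalg.norm(models - base[None, :, :], axis=-1).max(axis=0)
        return disp / disp.max()

    def step2(vals, radius):
        # Propagate large values (> S1) to vertices within the given radius.
        out = vals.copy()
        big = vals > S1
        for i in range(len(base)):
            near = np.linalg.norm(base - base[i], axis=1) < radius
            candidates = vals[near & big]
            if candidates.size:
                out[i] = max(out[i], candidates.max())
        return out

    p2 = step2(step1(visemes), Lp)
    e2 = step2(step1(expressions), Le)
    # Step 3 (Equation 18): push the combined values toward 0 or 1.
    return np.where(p2 < S2, p2 * (1.0 - e2), 1.0 - (1.0 - p2) * e2)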

Figure 10: Importance distributions: (a) p_1(v), (b) e_1(v), (c) p_2(v), (d) e_2(v), (e) α.

The vertices v_i^E and v_i^P, 1 ≤ i ≤ n, are obtained by displacing v_i where needed, so a natural correspondence is established between vertices with the same subscript. For every vertex v_i, we define displacement vectors Δv_i^E and Δv_i^P as follows:

Δv_i^E = v_i^E - v_i,  and  Δv_i^P = v_i^P - v_i.

Let C = {v_1^C, v_2^C, ..., v_n^C} be the composite model and Δv_i^C = v_i^C - v_i the displacement vector of a vertex v_i^C in C. Then v_i^C must lie on the plane spanned by Δv_i^E and Δv_i^P and containing v_i, as shown in Figure 11.

Figure 11: Composition of the two displacements.

Consider a vertex v_i^C, 1 ≤ i ≤ n, of the combined model C. If α_i ≥ 0.5, then the displacement vector Δv_i^P should be preserved in Δv_i^C for accurate pronunciation. Therefore, letting P_⊥(Δv_i^E) be the component of Δv_i^E perpendicular to Δv_i^P, only this component of Δv_i^E can contribute to Δv_i^C (see Figure 11). If α_i < 0.5, the roles of Δv_i^P and Δv_i^E are switched. Thus, we have

v_i^C = v_i + Δv_i^P + (1 - α_i) P_⊥(Δv_i^E)  if α_i ≥ 0.5,  and
v_i^C = v_i + Δv_i^E + α_i E_⊥(Δv_i^P)  otherwise,

where

P_⊥(Δv_i^E) = Δv_i^E - ((Δv_i^E · Δv_i^P) / \|Δv_i^P\|^2) Δv_i^P,  and
E_⊥(Δv_i^P) = Δv_i^P - ((Δv_i^P · Δv_i^E) / \|Δv_i^E\|^2) Δv_i^E.
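The per-vertex composition above can be written compactly as follows (a Python sketch, not the authors' code); degenerate displacement vectors are guarded with a small epsilon.

import numpy as np

def composite(base, viseme, expression, alpha, eps=1e-12):
    # base, viseme, expression: (n, 3) vertex arrays of B, P, and E;
    # alpha: (n,) importance values from the previous step.
    dP = viseme - base          # delta v^P
    dE = expression - base      # delta v^E

    def perp(a, b):
        # Component of a perpendicular to b (returns a where b is ~zero).
        bb = np.einsum("ij,ij->i", b, b)
        coef = np.where(bb > eps, np.einsum("ij,ij->i", a, b) / np.maximum(bb, eps), 0.0)
        return a - coef[:, None] * b

    lip = base + dP + (1.0 - alpha)[:, None] * perp(dE, dP)   # case alpha >= 0.5
    emo = base + dE + alpha[:, None] * perp(dP, dE)           # case alpha <  0.5
    return np.where((alpha >= 0.5)[:, None], lip, emo)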

Figure 12 shows a composite model (Figure 12(c)) constructed from a viseme model (Figure 12(a)) and an emotional expression model (Figure 12(b)). Note that the shape of the viseme model is preserved around the mouth, while that of the emotional expression model is preserved in the other parts of the face.

Figure 12: Composition: (a) viseme model (vowel "i"), (b) expression model ("happy"), (c) composite model.

5 Experimental Results

To obtain a phoneme sequence from input text, we adopt a commercial TTS system called VoiceText. For a given 3D face model, our method first generates a sequence of output visemes synchronized with the phoneme sequence. Using the emotion space diagram as explained in Section 3, our method simultaneously synthesizes emotional expressions, which are immediately combined with the output viseme sequence frame by frame to give the final animation on the fly. All experiments were conducted on an Intel Pentium 4 PC (2.4 GHz processor with 512 MB of memory).

Figure 13 shows the three models used for our experiments, and Table 1 gives the number of vertices and polygons in each model. In Figure 15, we show, row by row, 12 sample frames each of a lip-sync animation, an expression animation, and their composite animation generated for the model Man; note that only the composite version in the third row is actually displayed at runtime. Each composite model is obtained from the corresponding viseme model and expression model shown in the same column.

Figure 13: Models used for the experiments: (a) Man, (b) Woman, (c) Gorilla.

Table 1: Model specification (vertex and polygon counts for the Man, Woman, and Gorilla models).

The lip-sync animation was produced from input text A given in Table 2, while the emotional expressions were simultaneously synthesized from the input parameter curve shown in Figure 14(a). Note that our expression synthesis scheme can extrapolate expression models at parameter vectors even outside the convex hull of the parameter vectors of the key-expression models. Each composite model in the final animation nicely reflects the corresponding lip motion and emotional expression without conflicts. Similarly, Figures 16 and 17 show the animation results for the other two models. For the Woman model, we used input text B in Table 2 and the parameter curve in Figure 14(b); for the Gorilla model, text C and the curve in Figure 14(c).

Figure 14: Various emotional parameter curves (a), (b), (c), drawn in the parameter space spanned by the happy, surprised, sad, afraid, angry, and neutral key-expressions.

Table 2: Input sentences (quotes from Albert Einstein).
A: "The significant problems we face cannot be solved at the same level of thinking we were at when we created them."
B: "The grand aim of all science is to cover the greatest number of empirical facts by logical deduction from the smallest number of hypotheses or axioms."
C: "Any man who can drive safely while kissing a pretty girl is simply not giving the kiss the attention it deserves."

For efficiency analysis, we also ran an experiment on each of the three models with the same input text, a part of Abraham Lincoln's Gettysburg address. The text is composed of 266 words, and the TTS system produced a corresponding phoneme sequence 86 seconds long. We applied the same emotion parameter curve to all three models. Table 3 shows statistics obtained from this experiment. For each model, the table gives the computation time (in milliseconds) per frame for lip synchronization, emotion expression, and their composition, excluding rendering time. The frame rate indicates the number of frames per second for the final animation, including rendering time. As shown in this table, our method exhibits real-time performance in producing visual speech animations from text.

Table 3: Time-efficiency of our method for the Man, Woman, and Gorilla models (L: lip-sync animation, E: expression animation, C: composite animation; per-frame computation times in msec and frame rates in Hz).

Figure 15: Visual speech animation with the model Man.

Figure 16: Visual speech animation with the model Woman.

6 Conclusions

We have presented an example-based approach to creating a visual speech animation with emotional expressions in real time. Our contributions are three-fold. First, we suggested a simple, effective scheme for producing a text-driven 3D lip-sync animation with coarticulation effects. Second, we provided an example-based expression synthesis scheme based on scattered data interpolation. Our final contribution is an importance-based scheme for compositing a viseme (face) model and an expression (face) model, which enables a lip-sync animation to be combined with emotional expressions frame by frame in an on-line manner.

Figure 17: Visual speech animation with the model Gorilla.

There are several aspects for further improvement. For more realistic facial animation, we would like to extend our scheme to incorporate subtle movements in addition to emotional expressions, such as eye blinking, eyeball rolling, and head shaking or nodding. Although our lip-sync animation scheme mainly refers to the phoneme length, we also need to consider other factors, such as intonation or accent, that also affect lip movements. Finally, instead of providing the emotional parameters interactively through a user interface, we will try to extract the emotional parameters automatically from the input text so that the expression models can be derived from the phoneme sequence directly.

References

[1] V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In Proceedings of SIGGRAPH 99.

[2] C. Bregler, M. Covell, and M. Slaney. Video rewrite: Driving visual speech with audio. In ACM SIGGRAPH 97 Conference Proceedings.

[3] M. Brand. Voice puppetry. In Proceedings of SIGGRAPH 99.

[4] E. Cosatto and H. Graf. Photo-realistic talking-heads from image samples. IEEE Transactions on Multimedia, 2(3).

[5] M. Cohen and D. Massaro. Modeling coarticulation in synthetic visual speech. In N. M. Thalmann and D. Thalmann (Eds.), Models and Techniques in Computer Animation, Springer-Verlag, Tokyo.

[6] T. Ezzat and T. Poggio. Visual speech synthesis by morphing visemes. International Journal of Computer Vision, 38.

[7] T. Ezzat, G. Geiger, and T. Poggio. Trainable videorealistic speech animation. In ACM SIGGRAPH 2002 Conference Proceedings.

[8] C. G. Fisher. Confusions among visually perceived consonants. Journal of Speech and Hearing Research, 11.

[9] B. Guenter, C. Grimm, D. Wood, H. Malvar, and F. Pighin. Making faces. In ACM SIGGRAPH 98 Conference Proceedings.

[10] P. Kalra, A. Mangili, N. M. Thalmann, and D. Thalmann. Simulation of facial muscle actions based on rational free form deformations. In Proceedings of Eurographics 92.

[11] G. A. Kalberer and L. V. Gool. Face animation based on observed 3D speech dynamics. In Computer Animation 2001.

[12] Y. C. Lee, D. Terzopoulos, and K. Waters. Realistic modeling for facial animation. In Proceedings of SIGGRAPH 95.

[13] J. P. Lewis, M. Cordner, and N. Fong. Pose space deformation: A unified approach to shape interpolation and skeleton-driven deformation. In ACM SIGGRAPH 2000 Conference Proceedings.

[14] F. I. Parke. A Parametric Model of Human Faces. PhD thesis, University of Utah.

[15] F. I. Parke. Parameterized models for facial animation. IEEE Computer Graphics and Applications, 2(9).

[16] F. I. Parke and K. Waters. Computer Facial Animation, 1996.

[17] A. Pearce, B. Wyvill, G. Wyvill, and D. Hill. Speech and expression: A computer solution to face animation. In Graphics Interface.

[18] F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, and D. H. Salesin. Synthesizing realistic facial expressions from photographs. In ACM SIGGRAPH 98 Conference Proceedings.

[19] J. A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, Vol. 39.

[20] P.-P. Sloan, C. F. Rose, and M. F. Cohen. Shape by example. In Proceedings of the 2001 Symposium on Interactive 3D Graphics.

[21] D. Terzopoulos and K. Waters. Physically-based facial modeling, analysis, and animation. Journal of Visualization and Computer Animation, Vol. 1, No. 4.

[22] I. H. Witten. Principles of Computer Speech. Academic Press.

[23] K. Waters. A muscle model for animating three-dimensional facial expressions. In Proceedings of SIGGRAPH 87.

[24] K. Waters and T. M. Levergood. DECface: An automatic lip synchronization algorithm for synthetic faces. Technical Report CRL 93/4, DEC Cambridge Research Laboratory, Cambridge, MA.

[25] L. Williams. Performance driven facial animation. In Proceedings of SIGGRAPH 90.


More information

Parameterization of Triangular Meshes with Virtual Boundaries

Parameterization of Triangular Meshes with Virtual Boundaries Parameterization of Triangular Meshes with Virtual Boundaries Yunjin Lee 1;Λ Hyoung Seok Kim 2;y Seungyong Lee 1;z 1 Department of Computer Science and Engineering Pohang University of Science and Technology

More information

Real-time Speech Motion Synthesis from Recorded Motions

Real-time Speech Motion Synthesis from Recorded Motions Eurographics/ACM SIGGRAPH Symposium on Computer Animation (2004) R. Boulic, D. K. Pai (Editors) Real-time Speech Motion Synthesis from Recorded Motions Yong Cao 1,2 Petros Faloutsos 1 Eddie Kohler 1 Frédéric

More information

Resynthesizing Facial Animation through 3D Model-Based Tracking

Resynthesizing Facial Animation through 3D Model-Based Tracking Resynthesizing Facial Animation through 3D Model-Based Tracking Frédéric Pighin y Richard Szeliski z David H. Salesin yz y University of Washington z Microsoft Research Abstract Given video footage of

More information

2D Image Morphing using Pixels based Color Transition Methods

2D Image Morphing using Pixels based Color Transition Methods 2D Image Morphing using Pixels based Color Transition Methods H.B. Kekre Senior Professor, Computer Engineering,MP STME, SVKM S NMIMS University, Mumbai,India Tanuja K. Sarode Asst.Professor, Thadomal

More information

Adding Hand Motion to the Motion Capture Based Character Animation

Adding Hand Motion to the Motion Capture Based Character Animation Adding Hand Motion to the Motion Capture Based Character Animation Ge Jin and James Hahn Computer Science Department, George Washington University, Washington DC 20052 {jinge, hahn}@gwu.edu Abstract. Most

More information

Recovering Non-Rigid 3D Shape from Image Streams

Recovering Non-Rigid 3D Shape from Image Streams Recovering Non-Rigid D Shape from Image Streams Christoph Bregler Aaron Hertzmann Henning Biermann Computer Science Department NYU Media Research Lab Stanford University 9 Broadway, th floor Stanford,

More information

Machine Learning for Video-Based Rendering

Machine Learning for Video-Based Rendering Machine Learning for Video-Based Rendering Arno Schödl arno@schoedl.org Irfan Essa irfan@cc.gatech.edu Georgia Institute of Technology GVU Center / College of Computing Atlanta, GA 30332-0280, USA. Abstract

More information

Announcements. Midterms back at end of class ½ lecture and ½ demo in mocap lab. Have you started on the ray tracer? If not, please do due April 10th

Announcements. Midterms back at end of class ½ lecture and ½ demo in mocap lab. Have you started on the ray tracer? If not, please do due April 10th Announcements Midterms back at end of class ½ lecture and ½ demo in mocap lab Have you started on the ray tracer? If not, please do due April 10th 1 Overview of Animation Section Techniques Traditional

More information

Speech Driven Face Animation Based on Dynamic Concatenation Model

Speech Driven Face Animation Based on Dynamic Concatenation Model Journal of Information & Computational Science 3: 4 (2006) 1 Available at http://www.joics.com Speech Driven Face Animation Based on Dynamic Concatenation Model Jianhua Tao, Panrong Yin National Laboratory

More information

Chapter 9 Animation System

Chapter 9 Animation System Chapter 9 Animation System 9.1 Types of Character Animation Cel Animation Cel animation is a specific type of traditional animation. A cel is a transparent sheet of plastic on which images can be painted

More information

Normals of subdivision surfaces and their control polyhedra

Normals of subdivision surfaces and their control polyhedra Computer Aided Geometric Design 24 (27 112 116 www.elsevier.com/locate/cagd Normals of subdivision surfaces and their control polyhedra I. Ginkel a,j.peters b,,g.umlauf a a University of Kaiserslautern,

More information

Efficient Rendering of Glossy Reflection Using Graphics Hardware

Efficient Rendering of Glossy Reflection Using Graphics Hardware Efficient Rendering of Glossy Reflection Using Graphics Hardware Yoshinori Dobashi Yuki Yamada Tsuyoshi Yamamoto Hokkaido University Kita-ku Kita 14, Nishi 9, Sapporo 060-0814, Japan Phone: +81.11.706.6530,

More information

Real time facial expression recognition from image sequences using Support Vector Machines

Real time facial expression recognition from image sequences using Support Vector Machines Real time facial expression recognition from image sequences using Support Vector Machines I. Kotsia a and I. Pitas a a Aristotle University of Thessaloniki, Department of Informatics, Box 451, 54124 Thessaloniki,

More information

Multi-modal Translation and Evaluation of Lip-synchronization using Noise Added Voice

Multi-modal Translation and Evaluation of Lip-synchronization using Noise Added Voice Multi-modal Translation and Evaluation of Lip-synchronization using Noise Added Voice Shigeo MORISHIMA (,2), Satoshi NAKAMURA (2) () Faculty of Engineering, Seikei University. --, Kichijoji-Kitamachi,

More information

Morphable Displacement Field Based Image Matching for Face Recognition across Pose

Morphable Displacement Field Based Image Matching for Face Recognition across Pose Morphable Displacement Field Based Image Matching for Face Recognition across Pose Speaker: Iacopo Masi Authors: Shaoxin Li Xin Liu Xiujuan Chai Haihong Zhang Shihong Lao Shiguang Shan Work presented as

More information

Approximation of 3D-Parametric Functions by Bicubic B-spline Functions

Approximation of 3D-Parametric Functions by Bicubic B-spline Functions International Journal of Mathematical Modelling & Computations Vol. 02, No. 03, 2012, 211-220 Approximation of 3D-Parametric Functions by Bicubic B-spline Functions M. Amirfakhrian a, a Department of Mathematics,

More information

Animation. CS 4620 Lecture 33. Cornell CS4620 Fall Kavita Bala

Animation. CS 4620 Lecture 33. Cornell CS4620 Fall Kavita Bala Animation CS 4620 Lecture 33 Cornell CS4620 Fall 2015 1 Announcements Grading A5 (and A6) on Monday after TG 4621: one-on-one sessions with TA this Friday w/ prior instructor Steve Marschner 2 Quaternions

More information

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations Prashant Ramanathan and Bernd Girod Department of Electrical Engineering Stanford University Stanford CA 945

More information

For each question, indicate whether the statement is true or false by circling T or F, respectively.

For each question, indicate whether the statement is true or false by circling T or F, respectively. True/False For each question, indicate whether the statement is true or false by circling T or F, respectively. 1. (T/F) Rasterization occurs before vertex transformation in the graphics pipeline. 2. (T/F)

More information

Motion Capture, Motion Edition

Motion Capture, Motion Edition Motion Capture, Motion Edition 2013-14 Overview Historical background Motion Capture, Motion Edition Motion capture systems Motion capture workflow Re-use of motion data Combining motion data and physical

More information