Phonemes Interpolation


Phonemes Interpolation

Fawaz Y. Annaz*1 and Mohammad H. Sadaghiani*2
*1 Institut Teknologi Brunei, Electrical & Electronic Engineering Department, Faculty of Engineering, Jalan Tungku Link, Gadong, BE 1410, Bandar Seri Begawan, Brunei Darussalam
*2 University of Nottingham, Malaysia Campus, Electrical & Electronic Engineering Department, Jalan Broga, Semenyih, Selangor, Malaysia

Keywords: Lagrange Interpolation, Barycentric Lagrange Interpolation, Phonemes, Viseme.

Abstract

Learning a language starts immediately after birth, in the form of repeating basic sounds and gestures generated by adults (usually the parents). While teaching is achieved by initial pronunciation through exaggerated gestures and sounds, learning is accompanied by memorising, comprehending and eventually reproducing those gestures and sounds. In fact, parents exaggerate speech by breaking it up into simpler sounds and gestures, at levels that babies can more readily accept. The aim of this paper is to demonstrate methods in which fundamental articulated phonemes are represented by signatures that reflect dynamic mouth-movement contours. The paper starts by explaining the basic (yet limited) Lagrange interpolation method, used to produce fundamental signatures of lip movements by tracking upper-lip and corner-lip feature points. The paper then proposes a method that produces more compact polynomials using the Barycentric Lagrange interpolation method, which overcomes the limitations of the earlier approach.

1. Introduction

This paper addresses fundamental concepts in human audio-visual communication, and focuses on the shape of the mouth during speech. This field has received interest from various groups, ranging from English Language Teachers (ELT) in the classroom to users interacting with an animated face or an electromechanical robot-head interface.
Thus, the work will be of interest to groups in robotics, the movie industry, biometrics, real-time translation, and future machine interaction. The difficulty in this field lies in understanding and combining the scientific and artistic significance of speech and communication, and the way it should be delivered and perceived between humans and/or machines. Facial animation is one concept that emerges from this science; it was pioneered by Parke [1] in 1974, and since then significant time and effort has been devoted to perfecting the fusion of science and art [2]-[4]. The definition of a generic framework to map speech components onto animated visual pronunciation models is another important concept, going back as far as 1968 and known today as the concept of phonemes. This, in turn, led to the birth of the viseme approach [5]: the study of the visual observations of phonemes, which examines mouth-contour geometry to classify and analyse the statistical and physical characteristics of consonants and vowels. Bridging audio phonemes and their corresponding visual visemes has thus become the basis for research in speech visualization and perception. It is also interesting to consider the effect coarticulation has on speech, and the effect simple concatenation in the visual modality has on the preceding and following phonemes. This is clearly evident in the visual domain; thus, studies of speech-driven systems [6], [7], as well as speech- and text-driven systems [8]-[12], emerged to generate animated lip movements from such inputs. In machine-learning-based speech animation, speech and the corresponding visual parameters are used to train Hidden Markov Models (HMMs) to create constraints and trajectory functions that determine speech and visual features. The accuracy of the approximated visual trajectories depends on the chosen training set, which directly affects the quality of the synthesized results.
In [13], an HMM was used to animate and synchronize a 2D face model from speech and Active Appearance Model (AAM) feature parameters. In this paper, each viseme is interpreted as a series of frames that describe a phoneme over a time interval, and is represented by interpolating trajectory paths from control points. Thus, each viseme is expressed in a mathematical form, which may be further simplified by considering ratios or other geometric transformation rules. The authors in [14] proposed 2D trajectories of mouth-cavity area versus aspect ratio to describe Japanese words; however, they did not suggest path formulation over discrete frames or define mathematical signatures of spoken words. Here, Lagrange interpolation [15] is proposed to construct polynomials by interpolating the sets of control points resulting from lip deformation. The paper initially describes the suggested approach in section 2, followed by an introduction to the signature concept and examples of suggested signatures in sections 3, 4 and 5.

2. The Approach

The main aim of this paper is to determine mathematical signatures for the indexed paths of feature points that correspond to a sequence of mouth movements. In this introductory paper, isolated visual units of speech (phonemes) are examined to explain our approach. Each pronounced phoneme is represented by a set of progressive 2D frames, which some authors refer to as visemes. In this approach, the mouth height and width (per frame) make up unique sets of feature points per spoken phoneme, thus resulting in unique signature sets. It is also proposed that a fixed set of 30 frames represents each word, regardless of its length. Figure 1 shows an example of a fictitious word with five framed feature points (visemes), represented by the vector [F_i^U, F_i^C], where the lip is approximated by ellipses. The feature points in each viseme can simply be the pixel coordinates per frame that make up the ellipse equations, which can be stored and recovered when necessary.

Vowels have a longer visual duration than consonants, and they connect consonants in word structures. Thus, in a speech recognition system, the vowels play a significant role in recognition [16]. It is therefore important to derive mathematical expressions for consonant-vowel phonemes, such as those shown in Table 1 of the International Phonetic Alphabet in American English.

Table 1. The IPA and ARPABET Vowels Notations

IPA  ARPABET  Example  |  IPA  ARPABET  Example
i    IY       beet     |  ʌ    AH       but
I    IH       bit      |  ɔ    AO       bought
æ    AE       bat      |  U    UH       foot
ε    EH       bet      |  u    UW       boot
a    AA       hot      |  o    OW       show

Note that the basis functions l_i(x) corresponding to the nodes x_i (defined in section 3) satisfy

\sum_{i=0}^{N-1} l_i(x) = 1    (3)

In this method, increasing the number of frames increases the degree of the Lagrange polynomial; however, this does not imply an increase in accuracy. For example, a set of 15 samples extracted while pronouncing the vowel /UW/ gives the interpolation

L(f) = c_{14} f^{14} + c_{13} f^{13} + \dots + c_1 f + 21    (4)

where the leading coefficients are very large (of the order of 10^8, 10^6 and 10^5).

Figure 1. A Fictitious Word with Five Framed Feature Points (Visemes)
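As a concrete illustration of the fixed-length representation described above, the sketch below resamples a variable-length sequence of [F_U, F_C] feature-point vectors onto the proposed 30 frames. The frame values and the use of linear resampling are assumptions for illustration only; the paper does not prescribe a particular resampling rule.

```python
import numpy as np

def resample_to_fixed(frames, n_out=30):
    """Resample a variable-length sequence of [F_U, F_C] feature vectors
    onto a fixed number of frames (30 per word, as proposed above)."""
    frames = np.asarray(frames, float)              # shape (n_in, 2)
    t_in = np.linspace(0.0, 1.0, len(frames))
    t_out = np.linspace(0.0, 1.0, n_out)
    # Linearly interpolate each feature track onto the common time base
    return np.column_stack([np.interp(t_out, t_in, frames[:, k]) for k in range(2)])

# Five hypothetical visemes: [upper-lip height F_U, corner-lip width F_C] in pixels
word = [[21, 80], [24, 85], [30, 95], [24, 88], [21, 80]]
fixed = resample_to_fixed(word)
assert fixed.shape == (30, 2)
```

Every word, long or short, then yields a 30-frame feature matrix, so signatures of different words become directly comparable.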
3. The Lagrange Interpolation

The Lagrange method is a popular choice for deriving mathematical functions of the paths of the feature-point vectors that describe spoken phonemes. Here, Lagrange interpolation reconstructs a continuous polynomial L(x) that spans uniformly over an interval [a, b], from a set of samples at nodes x_i ∈ R, i ∈ N:

L(x) = \sum_{i=0}^{N-1} f(x_i)\, l_i(x)    (1)

l_i(x) = \frac{\prod_{k=0,\,k \neq i}^{N-1} (x - x_k)}{\prod_{k=0,\,k \neq i}^{N-1} (x_i - x_k)}    (2)

Figure 2. The Vowel Viseme /UW/ Upper and Corner Feature-Points Lagrange Interpolation

The resulting phoneme expression exhibits very high or very low amplitudes at the interval boundaries, reducing the accuracy of the interpolation. The method is therefore limited, and becomes impractical when dealing with a large number of samples, inducing very high errors between the function and its interpolating curve. This can be demonstrated by simply substituting f = 0.3 into (4) to get

L(0.3) ≈ 43 as an evaluation of a corner feature point whose actual value is approximately 190. This high fluctuation in amplitude at the boundaries, known as the Runge effect [17], results in an error between the function and its interpolating curve. The Lagrange interpolations of the vowel /UW/ (for both the upper and corner feature points) are plotted in Figure 2 (a and b). The Runge phenomenon is clearly visible on the first and last pairs of nodes, appearing as oscillation at the interval boundaries, and hence as an error between the function and its interpolating curve. To exclude these saturated regions of the Lagrange polynomial, a more elegant solution is proposed through the Barycentric Lagrange Polynomial (BLP) interpolation, discussed next.

4. The Barycentric Lagrange Interpolation

The boundary oscillation (Runge phenomenon) was treated by the authors in [18] by rearranging the positions of the sample nodes x_i and modifying the intervals, to formulate the Barycentric Lagrange interpolation:

L_B(x) = \frac{\sum_{i=0}^{N-1} f(x_i)\, \frac{w_i}{x - x_i}}{\sum_{i=0}^{N-1} \frac{w_i}{x - x_i}}    (5)

This modification tackles the problem of destructive oscillations at the interval boundaries by a transformation to another domain. The transformation uses Chebyshev points of the second kind, x_i = \cos\!\left(\frac{i\pi}{N-1}\right), spanned on the interval [-1, 1]. The weighting function w_i in (5) can be simplified as [19]:

w_i = (-1)^i \delta_i, \qquad \delta_i = \begin{cases} 1/2, & i = 0 \text{ or } i = N-1 \\ 1, & \text{otherwise} \end{cases}    (6)

Equation (5) thus defines an interpolation procedure combining the Lagrange method with the Chebyshev nodes over the interval [-1, 1]. Applying the same set of samples used in (4) to the Barycentric Lagrange interpolation method leads to a ratio of polynomials:

L_B(f) = \frac{a_{14} f^{14} + a_{13} f^{13} + \dots + a_1 f + a_0}{b_{12} f^{12} + b_{11} f^{11} + \dots + b_1 f + b_0}    (7)

In comparison to the expression in (4), substituting a value close to the boundaries for f in (7) results in an amplitude that is very close to that of the neighbouring sample. For example, letting f = 0.3 in (7) results in L_B(0.3) = 22.25, which is very close to the actual values of the first and second samples (F_0^U = 21, F_1^U = 24). This definition changes the previous interpolation and assigns a polynomial with proper amplitudes to the same sample set of Figure 2. Thus the final, desirable representation of the phoneme /UW/ curve over a uniformly spanned interval [a, b] is as shown in Figure 3, which shows all samples in a function without the high boundary amplitudes.

Figure 3. The Barycentric Lagrange Interpolations Over the Uniformly Spanned Intervals

5. Visual Signatures

Building on the above approach, and aiming to further reduce the number of mathematical expressions to yield a more compact signature expression, the ratios of the upper and lower feature points were considered. This is shown in Figure 4 and will be referred to as the Rational Signature.

Figure 4. The BLP for a Set of Feature Points
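The two interpolation schemes of sections 3 and 4 can be sketched side by side as follows. The lip-amplitude function below is a hypothetical stand-in (a sharp, Runge-style profile; the paper's measured /UW/ samples are not reproduced here). The first function implements the equispaced Lagrange form of (1)-(2); the second implements the barycentric form (5) with the Chebyshev nodes and weights of (6).

```python
import numpy as np

def lagrange_interp(x, nodes, samples):
    """Equispaced Lagrange interpolation, eqs. (1)-(2)."""
    x = np.atleast_1d(np.asarray(x, float))
    out = np.zeros_like(x)
    for i, (xi, fi) in enumerate(zip(nodes, samples)):
        li = np.ones_like(x)                       # basis l_i(x), eq. (2)
        for k, xk in enumerate(nodes):
            if k != i:
                li *= (x - xk) / (xi - xk)
        out += fi * li                             # eq. (1)
    return out

def barycentric_interp(x, nodes, samples, w):
    """Barycentric Lagrange interpolation, eq. (5)."""
    x = np.atleast_1d(np.asarray(x, float))
    out = np.empty_like(x)
    for j, xj in enumerate(x):
        d = xj - nodes
        hit = np.isclose(d, 0.0)
        if hit.any():                              # exactly on a node
            out[j] = samples[np.argmax(hit)]
        else:
            c = w / d
            out[j] = np.dot(c, samples) / c.sum()
    return out

N = 15
# Chebyshev points of the second kind, x_i = cos(i*pi/(N-1)), on [-1, 1]
cheb = np.cos(np.arange(N) * np.pi / (N - 1))
# Weights of eq. (6): w_i = (-1)^i, halved at the two endpoints
w = (-1.0) ** np.arange(N)
w[0] *= 0.5
w[-1] *= 0.5

# Hypothetical upper-lip amplitude profile (sharp peak, Runge-style)
f = lambda x: 21.0 + 30.0 / (1.0 + 25.0 * x**2)

equi = np.linspace(-1.0, 1.0, N)
xs = np.linspace(-1.0, 1.0, 401)

# Both schemes reproduce their samples exactly at the nodes ...
assert np.allclose(lagrange_interp(equi, equi, f(equi)), f(equi))
assert np.allclose(barycentric_interp(cheb, cheb, f(cheb), w), f(cheb))

# ... but the equispaced polynomial overshoots near the boundaries (the
# Runge effect), while the Chebyshev/barycentric interpolant stays close.
err_equi = np.max(np.abs(lagrange_interp(xs, equi, f(equi)) - f(xs)))
err_bary = np.max(np.abs(barycentric_interp(xs, cheb, f(cheb), w) - f(xs)))
assert err_bary < err_equi
```

The barycentric form also evaluates in O(N) per point rather than O(N^2), which matters when signatures are evaluated densely over the 30-frame interval.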

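The Rational Signature of section 5 can be sketched by interpolating the ratio of the two feature-point tracks rather than each track separately, halving the number of expressions per phoneme. The sample values below are hypothetical stand-ins, not the paper's measurements.

```python
import numpy as np

N = 15
# Chebyshev points of the second kind and barycentric weights, as in section 4
x = np.cos(np.arange(N) * np.pi / (N - 1))
w = (-1.0) ** np.arange(N)
w[0] *= 0.5
w[-1] *= 0.5

# Hypothetical upper-lip (F_U) and corner-lip (F_C) amplitudes per frame
F_U = 21.0 + 30.0 * (1.0 - x**2)
F_C = 80.0 + 15.0 * (1.0 - x**2)

# A single rational signature: interpolate the ratio F_U/F_C instead of
# keeping two separate polynomials
ratio = F_U / F_C

def bary(t, nodes, samples, wts):
    """Barycentric evaluation, eq. (5), at a scalar t away from the nodes."""
    c = wts / (t - nodes)
    return np.dot(c, samples) / c.sum()

# The interpolated ratio stays within the range of the sampled ratios
mid = bary(0.1, x, ratio, w)
assert ratio.min() - 1e-6 <= mid <= ratio.max() + 1e-6
```

Because the ratio is dimensionless, such a signature is also less sensitive to overall mouth size than the raw pixel amplitudes.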
6. Conclusions

The experiment in formulating visemes was conducted on feature points extracted from video files of visemes, filmed at a rate of 30 frames per second, for a set of three speakers pronouncing the phonemes /IY/, /IH/, /AE/ and /AH/. Clearly, amplitudes vary according to the speaker, and duration (number of frames) varies according to the pronounced phoneme. However, the main aim here is to show that it is feasible to build patterns, or signatures, for the various phonemes. The presented work is still under development, and it is highly likely that further rules must be added to clearly distinguish and identify phonemes, words and portions of speech.

The main aim of this paper was to derive mathematical expressions for some of the basic phonemes. Two methods were considered, Lagrange interpolation and Barycentric Lagrange interpolation, where the latter gave a more accurate representation of a pronunciation envelope. In this analysis, the top-centre and the corner of the lip were selected as the feature points of the derived signatures. The paper finally presented more compact expressions by considering feature-point ratios. As mentioned earlier, the presented work is still under development, and there remains a need to further develop the approach to clearly distinguish and identify phonemes, words and portions of speech.

Figure 5. The BLP of Feature-Point Ratios for the Phonemes /IY/, /IH/, /AE/ and /AH/

References

[1]. F. I. Parke, A Parametric Model for Human Faces, Utah: The University of Utah.
[2]. F. I. Parke, Computer Generated Animation of Faces, Utah: The University of Utah.
[3]. N. Magnenat-Thalmann and D. Thalmann, "The Direction of Synthetic Actors in the Film 'Rendez-vous à Montréal'," IEEE Computer Graphics and Applications, vol. 7, no. 12, pp. 9-19.
[4]. L. Xie and Z. Q. Liu, "Realistic Mouth-Synching for Speech-Driven Talking Face Using Articulatory Modeling," IEEE Transactions on Multimedia, vol. 9, no. 3.
[5]. C. G.
Fisher, "Confusions Among Visually Perceived Consonants," Journal of Speech and Hearing Research, vol. 11, no. 4.
[6]. G. Ananthakrishnan and O. Engwall, "Important Regions in the Articulator Trajectory," in International Seminar on Speech Production, Strasbourg.
[7]. D. Jiang, I. Ravyse, H. Sahli and W. Verhelst, "Speech Driven Realistic Mouth Animation Based on Multimodal Unit Selection," Journal of Multi-Modal User Interfaces, vol. 2.
[8]. R. Gutierrez-Osuna, P. K. Kakumanu, A. Esposito, O. N. Garcia, A. Bojorquez, J. L. Castillo and I. Rudomin, "Speech-Driven Facial Animation with Realistic Dynamics," IEEE Transactions on Multimedia, vol. 7, no. 1.
[9]. E. Cosatto and H. Graf, "Sample-Based Synthesis of Photorealistic Talking Heads," in Computer Animation.
[10]. Z. Deng, U. Neumann, J. P. Lewis, T. Y. Kim, M. Bulut and S. Narayanan, "Expressive Facial Animation Synthesis by Learning Speech Coarticulation and Expression Spaces," IEEE Transactions on Visualization and Computer Graphics, vol. 12, no. 6, pp. 1-12.
[11]. Y. Cao, P. Faloutsos, E. Kohler and F. Pighin, "Real-Time Speech Motion Synthesis from Recorded Motions," in ACM SIGGRAPH/Eurographics Symposium on Computer Animation, New York, 2004.

[12]. S. Morishima, K. Aizawa and H. Harashima, "An Intelligent Facial Image Coding Driven by Speech and Phoneme," in IEEE ICASSP, Glasgow.
[13]. G. Englebienne, Animating Faces from Speech, The University of Manchester.
[14]. T. Saitoh and R. Konishi, "Word Recognition Based on Two-Dimensional Lip Motion Trajectory," in IEEE International Symposium on Intelligent Signal Processing and Communication.
[15]. J. L. Lagrange, Leçons élémentaires sur les mathématiques, données à l'École Normale en 1795, Paris: Oeuvres VII, Gauthier-Villars, 1877.
[16]. L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall.
[17]. C. Runge, "Über empirische Funktionen und die Interpolation zwischen äquidistanten Ordinaten," Zeitschrift für Mathematik und Physik.
[18]. H. E. Salzer, "Lagrangian Interpolation at the Chebyshev Points x_{n,ν} = cos(νπ/n), ν = 0(1)n; Some Unnoted Advantages," The Computer Journal.
[19]. J. P. Berrut and L. N. Trefethen, "Barycentric Lagrange Interpolation," SIAM Review, 2004.


Real-Time Speech-Driven Face Animation with Expressions Using Neural Networks IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 1, JANUARY 2002 100 Real-Time Speech-Driven Face Animation with Expressions Using Neural Networks Pengyu Hong, Zhen Wen, and Thomas S. Huang, Fellow,

More information

FACIAL ANIMATION WITH MOTION CAPTURE BASED ON SURFACE BLENDING

FACIAL ANIMATION WITH MOTION CAPTURE BASED ON SURFACE BLENDING FACIAL ANIMATION WITH MOTION CAPTURE BASED ON SURFACE BLENDING Lijia Zhu and Won-Sook Lee School of Information Technology and Engineering, University of Ottawa 800 King Edward Ave., Ottawa, Ontario, Canada,

More information

Joint Learning of Speech-Driven Facial Motion with Bidirectional Long-Short Term Memory

Joint Learning of Speech-Driven Facial Motion with Bidirectional Long-Short Term Memory Joint Learning of Speech-Driven Facial Motion with Bidirectional Long-Short Term Memory Najmeh Sadoughi and Carlos Busso Multimodal Signal Processing (MSP) Laboratory, Department of Electrical and Computer

More information

Facial Image Synthesis 1 Barry-John Theobald and Jeffrey F. Cohn

Facial Image Synthesis 1 Barry-John Theobald and Jeffrey F. Cohn Facial Image Synthesis Page 1 of 5 Facial Image Synthesis 1 Barry-John Theobald and Jeffrey F. Cohn 1 Introduction Facial expression has been central to the

More information

An Introductory SIGMA/W Example

An Introductory SIGMA/W Example 1 Introduction An Introductory SIGMA/W Example This is a fairly simple introductory example. The primary purpose is to demonstrate to new SIGMA/W users how to get started, to introduce the usual type of

More information

Speech Driven Facial Animation

Speech Driven Facial Animation Speech Driven Facial Animation P. Kakumanu R. Gutierrez-Osuna A. Esposito R. Bryll A. Goshtasby O.. Garcia 2 Department of Computer Science and Engineering Wright State University 364 Colonel Glenn Hwy

More information

Human body animation. Computer Animation. Human Body Animation. Skeletal Animation

Human body animation. Computer Animation. Human Body Animation. Skeletal Animation Computer Animation Aitor Rovira March 2010 Human body animation Based on slides by Marco Gillies Human Body Animation Skeletal Animation Skeletal Animation (FK, IK) Motion Capture Motion Editing (retargeting,

More information

Non-rigid body Object Tracking using Fuzzy Neural System based on Multiple ROIs and Adaptive Motion Frame Method

Non-rigid body Object Tracking using Fuzzy Neural System based on Multiple ROIs and Adaptive Motion Frame Method Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio, TX, USA - October 2009 Non-rigid body Object Tracking using Fuzzy Neural System based on Multiple ROIs

More information

Optimized Face Animation with Morph-Targets

Optimized Face Animation with Morph-Targets Optimized Face Animation with Morph-Targets Uwe Berner TU Darmstadt, Interactive Graphics Systems Group (GRIS) Fraunhoferstrasse 5 64283 Darmstadt, Germany uberner@gris.informatik.tudarmstadt.de ABSTRACT

More information

1st component influence. y axis location (mm) Incoming context phone. Audio Visual Codebook. Visual phoneme similarity matrix

1st component influence. y axis location (mm) Incoming context phone. Audio Visual Codebook. Visual phoneme similarity matrix ISCA Archive 3-D FACE POINT TRAJECTORY SYNTHESIS USING AN AUTOMATICALLY DERIVED VISUAL PHONEME SIMILARITY MATRIX Levent M. Arslan and David Talkin Entropic Inc., Washington, DC, 20003 ABSTRACT This paper

More information

Chapter 8 Visualization and Optimization

Chapter 8 Visualization and Optimization Chapter 8 Visualization and Optimization Recommended reference books: [1] Edited by R. S. Gallagher: Computer Visualization, Graphics Techniques for Scientific and Engineering Analysis by CRC, 1994 [2]

More information

A new trainable trajectory formation system for facial animation

A new trainable trajectory formation system for facial animation ISCA Archive http://www.isca-speech.org/archive ITRW on Experimental Linguistics Athens, Greece August 28-30, 2006 A new trainable trajectory formation system for facial animation Oxana Govokhina 1,2,

More information

Master s Thesis. Cloning Facial Expressions with User-defined Example Models

Master s Thesis. Cloning Facial Expressions with User-defined Example Models Master s Thesis Cloning Facial Expressions with User-defined Example Models ( Kim, Yejin) Department of Electrical Engineering and Computer Science Division of Computer Science Korea Advanced Institute

More information

Learning based face hallucination techniques: A survey

Learning based face hallucination techniques: A survey Vol. 3 (2014-15) pp. 37-45. : A survey Premitha Premnath K Department of Computer Science & Engineering Vidya Academy of Science & Technology Thrissur - 680501, Kerala, India (email: premithakpnath@gmail.com)

More information

LOW-DIMENSIONAL MOTION FEATURES FOR AUDIO-VISUAL SPEECH RECOGNITION

LOW-DIMENSIONAL MOTION FEATURES FOR AUDIO-VISUAL SPEECH RECOGNITION LOW-DIMENSIONAL MOTION FEATURES FOR AUDIO-VISUAL SPEECH Andrés Vallés Carboneras, Mihai Gurban +, and Jean-Philippe Thiran + + Signal Processing Institute, E.T.S.I. de Telecomunicación Ecole Polytechnique

More information

The accuracy and robustness of motion

The accuracy and robustness of motion Orthogonal-Blendshape-Based Editing System for Facial Motion Capture Data Qing Li and Zhigang Deng University of Houston The accuracy and robustness of motion capture has made it a popular technique for

More information

A NEURAL NETWORK APPLICATION FOR A COMPUTER ACCESS SECURITY SYSTEM: KEYSTROKE DYNAMICS VERSUS VOICE PATTERNS

A NEURAL NETWORK APPLICATION FOR A COMPUTER ACCESS SECURITY SYSTEM: KEYSTROKE DYNAMICS VERSUS VOICE PATTERNS A NEURAL NETWORK APPLICATION FOR A COMPUTER ACCESS SECURITY SYSTEM: KEYSTROKE DYNAMICS VERSUS VOICE PATTERNS A. SERMET ANAGUN Industrial Engineering Department, Osmangazi University, Eskisehir, Turkey

More information

CHAPTER 8 Multimedia Information Retrieval

CHAPTER 8 Multimedia Information Retrieval CHAPTER 8 Multimedia Information Retrieval Introduction Text has been the predominant medium for the communication of information. With the availability of better computing capabilities such as availability

More information

Extraction of Human Gait Features from Enhanced Human Silhouette Images

Extraction of Human Gait Features from Enhanced Human Silhouette Images 2009 IEEE International Conference on Signal and Image Processing Applications Extraction of Human Gait Features from Enhanced Human Silhouette Images Hu Ng #1, Wooi-Haw Tan *2, Hau-Lee Tong #3, Junaidi

More information

C O M P U T E R G R A P H I C S. Computer Animation. Guoying Zhao 1 / 66

C O M P U T E R G R A P H I C S. Computer Animation. Guoying Zhao 1 / 66 Computer Animation Guoying Zhao 1 / 66 Basic Elements of Computer Graphics Modeling construct the 3D model of the scene Rendering Render the 3D model, compute the color of each pixel. The color is related

More information

Confidence Measures: how much we can trust our speech recognizers

Confidence Measures: how much we can trust our speech recognizers Confidence Measures: how much we can trust our speech recognizers Prof. Hui Jiang Department of Computer Science York University, Toronto, Ontario, Canada Email: hj@cs.yorku.ca Outline Speech recognition

More information

Facial Expression Analysis for Model-Based Coding of Video Sequences

Facial Expression Analysis for Model-Based Coding of Video Sequences Picture Coding Symposium, pp. 33-38, Berlin, September 1997. Facial Expression Analysis for Model-Based Coding of Video Sequences Peter Eisert and Bernd Girod Telecommunications Institute, University of

More information

Audio-to-Visual Speech Conversion using Deep Neural Networks

Audio-to-Visual Speech Conversion using Deep Neural Networks INTERSPEECH 216 September 8 12, 216, San Francisco, USA Audio-to-Visual Speech Conversion using Deep Neural Networks Sarah Taylor 1, Akihiro Kato 1, Iain Matthews 2 and Ben Milner 1 1 University of East

More information

Facial Animation System Design based on Image Processing DU Xueyan1, a

Facial Animation System Design based on Image Processing DU Xueyan1, a 4th International Conference on Machinery, Materials and Computing Technology (ICMMCT 206) Facial Animation System Design based on Image Processing DU Xueyan, a Foreign Language School, Wuhan Polytechnic,

More information

Probabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information

Probabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information Probabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information Mustafa Berkay Yilmaz, Hakan Erdogan, Mustafa Unel Sabanci University, Faculty of Engineering and Natural

More information

An Interactive Interface for Directing Virtual Humans

An Interactive Interface for Directing Virtual Humans An Interactive Interface for Directing Virtual Humans Gael Sannier 1, Selim Balcisoy 2, Nadia Magnenat-Thalmann 1, Daniel Thalmann 2 1) MIRALab, University of Geneva, 24 rue du Général Dufour CH 1211 Geneva,

More information

MARATHI TEXT-TO-SPEECH SYNTHESISYSTEM FOR ANDROID PHONES

MARATHI TEXT-TO-SPEECH SYNTHESISYSTEM FOR ANDROID PHONES International Journal of Advances in Applied Science and Engineering (IJAEAS) ISSN (P): 2348-1811; ISSN (E): 2348-182X Vol. 3, Issue 2, May 2016, 34-38 IIST MARATHI TEXT-TO-SPEECH SYNTHESISYSTEM FOR ANDROID

More information

A Practical and Configurable Lip Sync Method for Games

A Practical and Configurable Lip Sync Method for Games A Practical and Configurable Lip Sync Method for Games Yuyu Xu Andrew W. Feng Stacy Marsella Ari Shapiro USC Institute for Creative Technologies Figure 1: Accurate lip synchronization results for multiple

More information

Joint Matrix Quantization of Face Parameters and LPC Coefficients for Low Bit Rate Audiovisual Speech Coding

Joint Matrix Quantization of Face Parameters and LPC Coefficients for Low Bit Rate Audiovisual Speech Coding IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 12, NO. 3, MAY 2004 265 Joint Matrix Quantization of Face Parameters and LPC Coefficients for Low Bit Rate Audiovisual Speech Coding Laurent Girin

More information

FACIAL MOVEMENT BASED PERSON AUTHENTICATION

FACIAL MOVEMENT BASED PERSON AUTHENTICATION FACIAL MOVEMENT BASED PERSON AUTHENTICATION Pengqing Xie Yang Liu (Presenter) Yong Guan Iowa State University Department of Electrical and Computer Engineering OUTLINE Introduction Literature Review Methodology

More information

Automatic Animation of High Resolution Images

Automatic Animation of High Resolution Images 2012 IEEE 27 th Convention of Electrical and Electronics Engineers in Israel Automatic Animation of High Resolution Images Dmitry Batenkov, Gregory Dinkin, Yosef Yomdin Department of Mathematics The Weizmann

More information

Mouth Center Detection under Active Near Infrared Illumination

Mouth Center Detection under Active Near Infrared Illumination Proceedings of the 6th WSEAS International Conference on SIGNAL PROCESSING, Dallas, Texas, USA, March 22-24, 2007 173 Mouth Center Detection under Active Near Infrared Illumination THORSTEN GERNOTH, RALPH

More information

3 CHOPS - LIP SYNCHING PART 2

3 CHOPS - LIP SYNCHING PART 2 3 CHOPS - LIP SYNCHING PART 2 In this lesson you will be building a more complex CHOP network to create a more automated lip-synch. This will utilize the voice split CHOP, and the voice sync CHOP. Using

More information

C H A P T E R Introduction

C H A P T E R Introduction C H A P T E R 1 Introduction M ultimedia is probably one of the most overused terms of the 90s (for example, see [Sch97]). The field is at the crossroads of several major industries: computing, telecommunications,

More information

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA Chapter 1 : BioMath: Transformation of Graphs Use the results in part (a) to identify the vertex of the parabola. c. Find a vertical line on your graph paper so that when you fold the paper, the left portion

More information

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects.

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

Visualization and Analysis of Inverse Kinematics Algorithms Using Performance Metric Maps

Visualization and Analysis of Inverse Kinematics Algorithms Using Performance Metric Maps Visualization and Analysis of Inverse Kinematics Algorithms Using Performance Metric Maps Oliver Cardwell, Ramakrishnan Mukundan Department of Computer Science and Software Engineering University of Canterbury

More information

Singularity Analysis of an Extensible Kinematic Architecture: Assur Class N, Order N 1

Singularity Analysis of an Extensible Kinematic Architecture: Assur Class N, Order N 1 David H. Myszka e-mail: dmyszka@udayton.edu Andrew P. Murray e-mail: murray@notes.udayton.edu University of Dayton, Dayton, OH 45469 James P. Schmiedeler The Ohio State University, Columbus, OH 43210 e-mail:

More information

Free-Form Shape Optimization using CAD Models

Free-Form Shape Optimization using CAD Models Free-Form Shape Optimization using CAD Models D. Baumgärtner 1, M. Breitenberger 1, K.-U. Bletzinger 1 1 Lehrstuhl für Statik, Technische Universität München (TUM), Arcisstraße 21, D-80333 München 1 Motivation

More information

An Automatic 3D Face Model Segmentation for Acquiring Weight Motion Area

An Automatic 3D Face Model Segmentation for Acquiring Weight Motion Area An Automatic 3D Face Model Segmentation for Acquiring Weight Motion Area Rio Caesar Suyoto Samuel Gandang Gunanto Magister Informatics Engineering Atma Jaya Yogyakarta University Sleman, Indonesia Magister

More information

Combining Audio and Video for Detection of Spontaneous Emotions

Combining Audio and Video for Detection of Spontaneous Emotions Combining Audio and Video for Detection of Spontaneous Emotions Rok Gajšek, Vitomir Štruc, Simon Dobrišek, Janez Žibert, France Mihelič, and Nikola Pavešić Faculty of Electrical Engineering, University

More information

A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition

A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition A Novel Visual Speech Representation and HMM Classification for Visual Speech Recognition Dahai Yu, Ovidiu Ghita, Alistair Sutherland*, Paul F Whelan Vision Systems Group, School of Electronic Engineering

More information

2D to pseudo-3d conversion of "head and shoulder" images using feature based parametric disparity maps

2D to pseudo-3d conversion of head and shoulder images using feature based parametric disparity maps University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2001 2D to pseudo-3d conversion of "head and shoulder" images using feature

More information

Building speaker-specific lip models for talking heads from 3D face data

Building speaker-specific lip models for talking heads from 3D face data Building speaker-specific lip models for talking heads from 3D face data Takaaki Kuratate 1,2, Marcia Riley 1 1 Institute for Cognitive Systems, Technical University Munich, Germany 2 MARCS Auditory Laboratories,

More information

Short Survey on Static Hand Gesture Recognition

Short Survey on Static Hand Gesture Recognition Short Survey on Static Hand Gesture Recognition Huu-Hung Huynh University of Science and Technology The University of Danang, Vietnam Duc-Hoang Vo University of Science and Technology The University of

More information

COORDINATE MEASUREMENTS OF COMPLEX-SHAPE SURFACES

COORDINATE MEASUREMENTS OF COMPLEX-SHAPE SURFACES XIX IMEKO World Congress Fundamental and Applied Metrology September 6 11, 2009, Lisbon, Portugal COORDINATE MEASUREMENTS OF COMPLEX-SHAPE SURFACES Andrzej Werner 1, Malgorzata Poniatowska 2 1 Faculty

More information

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal.

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

Wavelet Transform in Face Recognition

Wavelet Transform in Face Recognition J. Bobulski, Wavelet Transform in Face Recognition,In: Saeed K., Pejaś J., Mosdorf R., Biometrics, Computer Security Systems and Artificial Intelligence Applications, Springer Science + Business Media,

More information