Lip Tracking for MPEG-4 Facial Animation

Zhilin Wu, Petar S. Aleksic, and Aggelos K. Katsaggelos
Department of Electrical and Computer Engineering, Northwestern University
2145 North Sheridan Road, Evanston, IL
{zlwu, apetar, aggk}@ece.nwu.edu

Abstract

It is very important to accurately track the mouth of a talking person for many applications, such as face recognition and human-computer interaction. This is in general a difficult problem due to the complexity of shapes, colors, textures, and changing lighting conditions. In this paper we develop techniques for outer and inner lip tracking. From the tracking results, FAPs are extracted, which are used to drive an MPEG-4 decoder. A novel method consisting of a Gradient Vector Flow (GVF) snake with a parabolic template as an additional external force is proposed. Based on the results of the outer lip tracking, the inner lip is tracked using a similarity function and a temporal smoothness constraint. Numerical results are presented using the Bernstein database.

1. Introduction

MPEG-4 is an emerging multimedia compression standard expected to have an important impact on a number of future consumer electronics products. MPEG-4 is the first audiovisual object-based representation standard, as opposed to most existing frame-based standards for video representation. One of the prominent features of MPEG-4 is facial animation. By controlling the Facial Definition Parameters (FDPs) and the Facial Animation Parameters (FAPs), a face can be animated with different shapes, textures, and expressions. This kind of animation can be used in a number of applications, such as web-based customer service with talking heads, or games. It can also be of great help to hearing-impaired people by providing visual information. In addition, for video conferencing, MPEG-4 facial animation objects could be a rather cost-efficient alternative. The animation objects can imitate a real person and animate the talking head satisfactorily, as long as the parameters are extracted accurately. Transmission of all 68 FAPs at 30 frames/second without any compression requires about 30 kbps for a talking head. The data rate can be reduced to less than 0.5 kbps with further compression techniques, such as FAP interpolation [1][2], while standard video transmission requires tens of Mbps. The key issue, then, is how to accurately obtain the parameters of a face.

Many studies have been done on this topic. The most interesting features of a face are the eyes and the mouth, because they are the prominent moving features. Some early studies used markers on the speakers' faces. This requirement considerably constrains users in various environments. Some recent studies used templates [3][4] and active contours [5]. The use of templates is valid in many cases. However, it usually requires a great amount of training and may not result in an exact fit of the features. The use of active contours [6] is appropriate especially when the feature shape is hard to represent with a simple template. Nevertheless, this method is sensitive to salient regions close to the desired feature, and random noise may strongly affect the deformation of the active contour. For example, reflections on the lips may be stronger than a lip edge in terms of intensity difference. Such a reflection has a great effect in pulling the active contour towards it; in this case, the active contour tends to track the reflection on the lips, not the real lip boundaries. In this paper we develop a method combining both active contours and a template.
The template represents the shape of the feature, and the active contours track its exact position. The advantage is that the final tracking results depend on both the Gradient Vector Flow (GVF) snake field and the template, appropriately weighted in terms of their qualities. This combination results in accurate and robust tracking of the outer lips in our experiments with all frames of the Bernstein audio-visual database [7]. We found experimentally that it is considerably harder to track the inner lips using the same approach applied to the tracking of the outer lips. We therefore applied a similarity function to determine the inner lip boundaries. In addition, a temporal smoothing constraint is applied to improve the accuracy of the tracking result.

The paper is organized as follows. The database is briefly described in Sec. 2. The procedure for mouth tracking is described in Sec. 3, followed by the description of the proposed algorithms for outer and inner lip tracking in Sec. 4 and 5, respectively. The generation of FAPs is described in Sec. 6, and conclusions are drawn in Sec. 7.

Figure 1. FAP extraction system: video sequence → nose tracking → mouth extraction → outer lip tracking (GVF snake and parabola fitting) → inner lip tracking (similarity function) → group 2 and group 8 FAPs.

2. The audio-visual database

This work utilizes speechreading material from the Bernstein Lipreading Corpus [7]. This high-quality audiovisual database includes a total of 954 sentences, of which 474 were uttered by a single female speaker and the remaining 480 by a male speaker. For each of the sentences, the database contains a speech waveform, a word-level transcription, and a video sequence time-synchronized with the speech waveform. Each utterance begins and ends with a period of silence. The vocabulary size is approximately 1,000 words, and the average utterance length is approximately 4 seconds. In order to extract visual features from the database, the video was sampled at a rate of 30 frames/sec (fps) with a spatial resolution of 320 x 240 pixels, 24 bits per pixel.

3. Mouth tracking

Figure 1 illustrates the FAP extraction system we have implemented [8]. In order to extract the mouth area from the Bernstein Lipreading Corpus, a neutral facial expression image was chosen among the sampled video images (Fig. 2a). A 7 x 44 image of the nostrils (Fig. 2b) was extracted from the neutral facial expression image to serve as a template for the template matching algorithm. The nostrils were chosen since they do not deform significantly during articulation [9]. The template matching algorithm, applied to the first frame of each sequence, locates the nostrils by searching an area centered at the neutral-face nose location for the best match. In the subsequent frames, the search area is constrained to a 3 x 3 pixel area centered at the nose location in the previous frame. Once the nose location has been identified, a rectangular 90 x 68 pixel region enclosing the mouth is extracted (Fig. 2c). In this work, only the group 2 and group 8 FAPs, which describe the inner and outer lip movement, are considered, as shown in Fig. 3.

Figure 2. (a) Neutral facial expression image; (b) extracted nose template; (c) extracted mouth image.

Figure 3. Outer and inner lip position FAPs.
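As an illustration of the nostril template matching step described above, the following is a minimal sketch using a sum-of-squared-differences search over a constrained window. It assumes grayscale numpy arrays; the function and parameter names are ours, not the paper's.

```python
import numpy as np

def locate_template(frame, template, center, radius):
    """Exhaustive SSD template matching inside a square search window
    around `center` (row, col); returns the best top-left corner."""
    th, tw = template.shape
    t = template.astype(np.float64)
    r0, c0 = center
    best_ssd, best_pos = np.inf, None
    for r in range(max(0, r0 - radius), min(frame.shape[0] - th, r0 + radius) + 1):
        for c in range(max(0, c0 - radius), min(frame.shape[1] - tw, c0 + radius) + 1):
            # sum of squared differences between the template and this patch
            ssd = np.sum((frame[r:r + th, c:c + tw].astype(np.float64) - t) ** 2)
            if ssd < best_ssd:
                best_ssd, best_pos = ssd, (r, c)
    return best_pos
```

On the first frame the search radius would be set large enough to cover the expected nose position; on subsequent frames a small radius around the previous nose location suffices, and the mouth region is then cropped at a fixed offset below the detected nostrils.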
4. Outer lip tracking

The outer lip tracking algorithm that we developed is a combination of an active contour algorithm and parabola templates [8][10]. We use the GVF snake as the active contour algorithm, since it provides a large capture range, and two parabolas as templates, as described next.

4.1. GVF snake

A snake is an elastic curve defined by a set of control points [6], and is used for finding visual features such as lines, edges, or contours. The snake parametric representation is given by

x(s) = [x(s), y(s)],  s ∈ [0, 1],   (1)

where x(s) and y(s) are the vertical and horizontal coordinates and s is the normalized independent parameter. The snake deformation, controlled by internal and external snake forces, moves the snake through the image by minimizing the functional

E = ∫₀¹ [ (1/2)(α|x′(s)|² + β|x″(s)|²) + E_ext(x(s)) ] ds,   (2)

where α and β are weights that control the snake's tension and rigidity. The external force E_ext(x(s)) is derived from the image data. The GVF [11][12], defined as the vector field v(x, y) = (u(x, y), v(x, y)), can be used as an external force. It is computed by minimizing the functional

E = ∬ µ(u_x² + u_y² + v_x² + v_y²) + |∇f|² |v − ∇f|² dx dy,   (3)

where f is an edge map derived from the image using a gradient operator. The parameter µ is a weighting factor which is determined based on the noise level in the image. The important property of the GVF is that, when used as an external force, it increases the capture range of the snake algorithm. Figure 4 depicts an example of the GVF and the snake results.
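To make Eq. (3) concrete, the following numpy sketch performs the standard gradient-descent iteration for the GVF field of Xu and Prince [11]; the values of mu, the time step, and the iteration count are illustrative, not the settings used in the paper.

```python
import numpy as np

def laplacian(a):
    """5-point discrete Laplacian with replicated border values."""
    p = np.pad(a, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * a

def gvf_field(edge_map, mu=0.2, iters=200, dt=1.0):
    """Gradient descent on the functional of Eq. (3): returns the
    GVF field (u, v) for the edge map f."""
    fy, fx = np.gradient(edge_map.astype(np.float64))  # f_y, f_x
    u, v = fx.copy(), fy.copy()                        # initialize with grad(f)
    b = fx ** 2 + fy ** 2                              # |grad(f)|^2
    for _ in range(iters):
        # u_t = mu * Laplacian(u) - (u - f_x) * |grad(f)|^2, and likewise for v
        u += dt * (mu * laplacian(u) - b * (u - fx))
        v += dt * (mu * laplacian(v) - b * (v - fy))
    return u, v
```

Far from edges, |∇f|² vanishes and the update reduces to pure diffusion, which is what spreads the edge forces across the image and gives the snake its large capture range.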

Figure 4. (a) Mouth image; (b) GVF; and (c) the final snake.

4.2. Parabola templates

Based on our investigations, we concluded that the snake algorithm, when it uses only the GVF as an external force, is sensitive to random image noise and to salient features around the lips (e.g., lip reflections). In order to improve the lip tracking performance, two parabolas are fit along the upper and lower lip. Edge detection is performed on every extracted mouth image using the Canny detector to obtain an edge map image (Fig. 5b). In order to obtain sets of points on the upper and lower lip, the edge map image is scanned column-wise, keeping only the first and the last encountered nonzero pixels (Fig. 5c). Parabolas are then fitted through each of the obtained sets of points (Fig. 5d).

Figure 5. (a) Extracted mouth image; (b) edge map image; (c) upper and lower lip boundaries; and (d) fitted parabolas.

The noise present in the mouth image and the texture of the area around the mouth may in some cases cause inaccurate fitting of the parabolas to the outer lips. We resolve these cases by taking several noise eliminating steps, described in detail in [8][10]. Afterwards, the image consisting of the two final parabolas is blurred, and the parabola external force, v_parabola, is obtained using a gradient operator. v_parabola is added to the GVF external force, v, to obtain the final external force, v_final, by appropriately weighting the two external forces, that is,

v_final = v + w_t · v_parabola.   (4)

A fixed value of w_t proved to provide consistently better results. The final external force, v_final, is used in the snake algorithm. Shown in Figure 6 are the snake results for cases of a bad quality GVF (Fig. 6a), badly fitted parabolas (Fig. 6b), and the improved results obtained by combining the two (Fig. 6c). The advantage of the developed algorithm lies in the fact that both the GVF and the parabola templates contribute to good tracking results, and their combination provides improved results in most cases.

Figure 6. (a1-c1) Mouth images; (a2-c2) GVFs; (a3-c3) fitted parabolas; and (a4-c4) snake results, for the cases when the GVF (a4) or the parabola templates (b4) do not give good results when applied individually, and when both methods give good results (c4).
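The column-scan and parabola fit of Section 4.2, together with the force combination of Eq. (4), can be sketched as follows; the least-squares fit via np.polyfit and all names are our choices for illustration, not necessarily the paper's implementation.

```python
import numpy as np

def fit_lip_parabolas(edges):
    """Fit one parabola through the first (top-most) and one through the
    last (bottom-most) edge pixel of every column of a binary edge map."""
    cols = np.flatnonzero(edges.any(axis=0))                      # columns containing edges
    top = edges[:, cols].argmax(axis=0)                           # first nonzero per column
    bot = edges.shape[0] - 1 - edges[::-1, cols].argmax(axis=0)   # last nonzero per column
    upper = np.polyfit(cols, top, 2)   # coefficients of y = a*x^2 + b*x + c
    lower = np.polyfit(cols, bot, 2)
    return upper, lower

def combine_forces(v_gvf, v_parabola, w_t):
    """Final external force of Eq. (4): v_final = v + w_t * v_parabola."""
    return v_gvf + w_t * v_parabola
```

In the full system, the fitted parabolas would first be rasterized into an image, blurred, and differentiated with a gradient operator to produce v_parabola, as described above.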

5. Inner lip tracking

The tracking of the inner lips presents a more challenging task than the tracking of the outer lips. This is primarily due to the fact that the area inside the mouth is of similar color, texture, and intensity as the lips. In addition, teeth appear and disappear during typical conversation and further complicate matters. The technique applied to the tracking of the outer lips, described above, did not prove to provide good results when applied to the tracking of the inner lips. We therefore resorted to an approach based on similarity functions, as described next.

5.1. Inner lip model

We use two parabolas as inner lip templates. Figure 7 shows the mouth model, where the outer curve is the continuous contour obtained from the method described in Section 4. The inner lip model consists of two parabolas, which share two inner corners defining a line at angle θ with respect to the horizontal axis. The two curves modeling the outer and inner lips define two areas: one between them (denoted by x₁) and one inside the inner lips (denoted by x₂).

Figure 7. Outer and inner lip model, showing the regions x₁ and x₂ and the inner-corner angle θ.

5.2. Similarity function

The best boundary separating the regions x₁ and x₂ is the one defining the largest unlikeness between the two regions. Given a pixel q, we can classify it to region x₁ if

p(q|x₁) > p(q|x₂),   (5)

where p(q|xᵢ), i = 1, 2, are the probabilities of q given that it belongs to region xᵢ. We may also treat p(q|xᵢ) as the value of the histogram of region xᵢ, in which case the comparison in Eq. (5) becomes

h₁(q) > h₂(q),   (6)

where hᵢ(·), i = 1, 2, are the histograms of regions x₁ and x₂. Assuming that the histograms h₁ and h₂ of the lip and mouth regions, respectively, are known, a region R can be classified as a lip region if

f(R) = Σ_{q∈R} log( h₁(q) / h₂(q) )   (7)

is maximized over all possible shapes of R. Alternatively, a region R can be classified as a mouth region if f(R) is minimized. This function measures how similar the area R is to the regions x₁ and x₂: the larger f(R) is, the closer this area is to x₁, and the smaller f(R) is, the closer it is to x₂. To make the tracking algorithm luminance insensitive, we use the hue and saturation color space instead of the original RGB color space.

5.3. Training and tracking

The only training required is to obtain the two histograms, for the lip and inside-mouth regions. We arbitrarily selected a set of frames in a sequence. For each frame we used the outer lip tracking results obtained by the algorithm described earlier, and a hand-labeled inner lip parabola contour, to obtain the lip and inside-mouth regions. From these frames, we obtained 973 pixels for the lip region (x₁) and 4898 pixels for the inside-mouth region (x₂) for training. To compute the two histograms, a bin size of 64 was applied. In the Bernstein database the speaker rarely tilts her head; therefore, for this set of experiments, θ was set equal to zero for all frames.

In finding the inner lip boundary, the results of the application of the outer lip tracking algorithm are used. Four displacement variables (d₁, d₂, j₁, j₂) are defined and the mouth region is denoted by R, as shown in Fig. 8. The mouth region is then obtained as the result of the minimization

(D₁, D₂, J₁, J₂) = arg min_{d₁,d₂,j₁,j₂} f(d₁, d₂, j₁, j₂)   (8)

over all possible combinations (d₁, d₂, j₁, j₂). The optimal (D₁, D₂, J₁, J₂) is then used to define the two parabolas for the upper and lower inner lips. An example of the application of this algorithm is shown in Fig. 9.

Figure 8. Inner lip tracking procedure, showing the displacement variables d₁, d₂, j₁, j₂ and the region R.

Figure 9. Inner lip tracking results as two parabolas.
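To make Eqs. (7) and (8) concrete, here is a minimal numpy sketch. It assumes the two trained hue-saturation histograms h1 and h2, and a helper region_mask that rasterizes a candidate (d1, d2, j1, j2) into the boolean mask of its region R; that helper and all names are assumptions for illustration, not the paper's code.

```python
import numpy as np

def log_ratio_image(bin_index, h1, h2, eps=1e-6):
    """Per-pixel values log(h1(q) / h2(q)) used in Eq. (7), where
    `bin_index` holds each pixel's hue-saturation histogram bin."""
    return np.log((h1[bin_index] + eps) / (h2[bin_index] + eps))

def track_inner_lip(L, candidates, region_mask):
    """Grid search of Eq. (8): pick the displacements (d1, d2, j1, j2)
    whose enclosed region R minimizes f(R), the sum of L over R."""
    best_f, best_d = np.inf, None
    for d in candidates:
        f = L[region_mask(d)].sum()   # f(R) for this candidate region
        if f < best_f:
            best_f, best_d = f, d
    return best_d, best_f
```

Precomputing the log-ratio image L once per frame makes each candidate evaluation a single masked sum, so the exhaustive search over the displacement grid stays cheap.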

5.4. Temporal smoothing

In order to preserve some form of temporal continuity, the values of the displacement parameters resulting from the minimization in Eq. (8) are used as predictors, which are corrected by the errors of this prediction with respect to the predictions resulting from the previous frames. As an example, for the variable D₁, the predictor form we used is the following:

D̂₁(k) = D₁(k) + [a₁(D̂₁(k−1) − D₁(k)) + a₂(D̂₁(k−2) − D₁(k))],   (9)

where k, k−1, and k−2 are the indices of the current and previous frames, D₁(k) is the value resulting from the minimization in the current frame, and ^ denotes corrected values. We want to couple the degree of trust in an estimate to the actual minimum value of the function f(·), denoted by f_min. The form of the prediction coefficients we used is the following:

a₁ = (1 − ω₁)/2,  a₂ = (1 − ω₂)/2,   (10)

where

ω₁ = 1/2 + (1/π) tan⁻¹(f_min/n₁)   (11)

and

ω₂ = 1/2 + (1/π) tan⁻¹(f_min/n₂),   (12)

and where the values of n₁ and n₂ are chosen experimentally, so that a forgetting factor is introduced. Clearly, if f_min << 0, then ω₁ = ω₂ = 0 and D̂₁(k) = (D̂₁(k−1) + D̂₁(k−2))/2; that is, the current estimate is deemed unreliable, and its corrected value is based solely on the values obtained for the previous frames. On the other hand, if f_min >> 0, then ω₁ = ω₂ = 1 and D̂₁(k) = D₁(k); that is, the current value resulting from the minimization needs no correction. Based on numerical evaluation, the application of temporal smoothing improved the tracking results by reducing the error on average.
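Under the reconstruction of Eqs. (9)-(12) given above, the smoothing step reduces to a few lines; the following sketch assumes scalar displacements, with n1 and n2 standing in for the experimentally chosen constants.

```python
import numpy as np

def smooth_displacement(D_k, Dhat_km1, Dhat_km2, f_min, n1, n2):
    """Temporal smoothing per Eqs. (9)-(12): trust the raw minimizer D_k
    when f_min is large; fall back to the average of the two previous
    corrected values when f_min is very negative."""
    w1 = 0.5 + np.arctan(f_min / n1) / np.pi   # Eq. (11): tends to 0 or 1
    w2 = 0.5 + np.arctan(f_min / n2) / np.pi   # Eq. (12)
    a1 = (1.0 - w1) / 2.0                      # Eq. (10)
    a2 = (1.0 - w2) / 2.0
    # Eq. (9): a1 = a2 = 1/2 yields the average of the previous corrected
    # values; a1 = a2 = 0 returns D_k unchanged.
    return D_k + a1 * (Dhat_km1 - D_k) + a2 * (Dhat_km2 - D_k)
```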
5.5. Numerical evaluation

To evaluate the inner lip tracking results, we hand labeled the inner lips for all frames of one sequence. A pixel that lies in only one of the tracked and hand-labeled inner lip regions is treated as an error pixel. The tracking error is defined as the number of error pixels divided by the number of pixels in the hand-labeled mouth area. Figure 10(a) shows the errors so defined for the frames tested, and Figure 10(b) shows the corresponding number of error pixels for each frame, that is, the numerator of the fraction used for Fig. 10(a).

Figure 10. Inner lip tracking error: (a) error ratio; (b) number of error pixels.

From the error ratio plot we can see some large errors. These errors typically correspond to frames in which the mouth is closed or nearly closed. Although the tracking of the inner lip is quite accurate in such frames, the error can still be large, since the number of pixels in R (a closed mouth) is very small. This is supported by Fig. 10(b): in most cases there is no correspondence between the peaks in the two plots (10(a) and 10(b)). Shown in Fig. 11 are two image examples. In both of them, the solid lines of the inner lip represent the hand-labeled boundary and the dotted lines represent the tracked results. In (a) the tracking result is close to the truth and the error is small. In (b) the mouth is nearly closed, and although the tracking result is quite accurate, the error is comparatively large.

Figure 11. Inner lip tracking error examples.

6. FAP generation

The FAPs defined in the MPEG-4 standard are the minimum set of facial animation parameters responsible for describing the movements of a face. They manipulate key feature points on a mesh model of a head in order to animate all kinds of facial movements and expressions. These parameters are either low level (i.e., the displacement of a specific single point of the face) or high level (i.e., the production of a facial expression) [13]. There are in total 68 FAPs, divided into 10 groups. We are interested in the outer and inner lip FAPs, which are in group 8 and group 2, respectively; there are 10 such FAPs in each of these groups. All FAPs are expressed in terms of Facial Animation Parameter Units (FAPUs). These units are normalized by certain essential facial feature distances in order to give an accurate and consistent representation. Two FAPUs are involved in the mouth-related FAPs: the mouth width, and the separation between the horizontal line of the nostrils and the horizontal line of the neutral mouth corners. Each distance is normalized to 1024.

In the Bernstein database, the first frame in each sequence is a neutral face. Therefore, we can compute the two FAPUs from the first frame and apply them to all the remaining frames. The tracked outer lip of the first frame is the neutral outer lip. Since a neutral mouth is a closed mouth, the line connecting the two outer lip corners is the neutral inner lip. In each following frame, the positions of the 10 outer lip and 10 inner lip FAP points are compared to the neutral FAP positions and are normalized by the FAPUs. This difference represents the mouth movement.
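As a sketch of the FAPU normalization described above, assuming lip feature points given as (x, y) pixel coordinates measured on the neutral first frame; the helper names are illustrative rather than the system's actual interface.

```python
def mouth_fapus(left_corner, right_corner, nostril_line_y):
    """The two mouth-related FAPUs from the neutral frame: mouth width,
    and the separation between the nostril line and the line of the
    neutral mouth corners."""
    mouth_width = abs(right_corner[0] - left_corner[0])
    mouth_nose_sep = abs(left_corner[1] - nostril_line_y)  # corners lie on the neutral mouth line
    return mouth_width, mouth_nose_sep

def fap_value(coord, neutral_coord, fapu):
    """A low-level FAP: the displacement of a feature point from its
    neutral position, expressed so that one FAPU spans 1024 units."""
    return 1024.0 * (coord - neutral_coord) / fapu
```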

Our system automatically reads all sequences from the Bernstein audio-visual database and generates all outer and inner lip FAPs. These parameters are then input to an MPEG-4 facial animation player [14] to generate MPEG-4 sequences. Based on a visual evaluation of the synthesized video, the mouth movements are very realistic and close to the original video. Some results are shown in Fig. 12. Images (a) and (c) are two original frames with the tracked outer and inner lips. Images (b) and (d) are the corresponding frames generated by the MPEG-4 decoder with the FAPs extracted from the original frames. When the sequences are played by the MPEG-4 player driven by the FAPs, realistic motion and articulation is observed. With the Facial Animation Engine [15], all animated MPEG-4 face sequences were well synchronized with the acoustic signals.

Figure 12. (a), (c) Original images with tracked outer and inner lips; (b), (d) MPEG-4 facial animations.

7. Conclusions and future work

We have presented a combined method to accurately track the outer lips, using a GVF snake with parabolic templates as an additional external force. This combination relaxes the requirements on both the salience of the boundaries and the accuracy of the templates, and it is more robust when tracking noisy images. We have also presented an inner lip tracking method that uses a similarity function with temporal smoothing. An encoder that generates MPEG-4 FAPs from the continuous lip boundaries was also designed. Good results have been achieved on the Bernstein database sequences. The outer lip FAPs have been used in our audio-visual speech recognition system, where they greatly improved recognition performance [8][10]. In order to obtain a more realistic talking head, additional facial features, such as the eyes, need to be precisely tracked. Another key issue demanding further work is making facial feature tracking real-time, which is extremely important for video conferencing applications.

References

[1] H. Tao, H.H. Chen, W. Wu, and T.S. Huang, Compression of MPEG-4 Facial Animation Parameters for Transmission of Talking Heads, IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, no. 2, March 1999.
[2] F. Lavagetto and R. Pockaj, An Efficient Use of MPEG-4 FAP Interpolation for Facial Animation at 70 Bits/Frame, IEEE Trans. on Circuits and Systems for Video Technology, vol. 11, no. 10, Oct. 2001.
[3] A.L. Yuille, P.W. Hallinan, and D.S. Cohen, Feature Extraction from Faces Using Deformable Templates, Int. J. of Computer Vision, vol. 8, no. 2, pp. 99-111, 1992.
[4] J. Luettin and N.A. Thacker, Speechreading Using Probabilistic Models, Computer Vision and Image Understanding, vol. 65, no. 2, pp. 163-178, 1997.
[5] M. Pardas, Extraction and Tracking of the Eyelids, Proc. of ICASSP, New York, July 2000.
[6] M. Kass, A. Witkin, and D. Terzopoulos, Snakes: Active Contour Models, Int. J. of Computer Vision, vol. 1, no. 4, pp. 321-331, 1987.
[7] L. Bernstein and S. Eberhardt, Johns Hopkins Lipreading Corpus I-II, Tech. Rep., Johns Hopkins University, Baltimore, MD, 1986.
[8] P.S. Aleksic, J.J. Williams, Z. Wu, and A.K. Katsaggelos, Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features, to appear in EURASIP Journal on Applied Signal Processing, 2002.
[9] H.P. Graf, T. Chen, E. Petajan, and E. Cosatto, Locating Faces and Facial Parts, Proc. Int. Workshop on Automatic Face and Gesture Recognition, 1995.
[10] P.S. Aleksic, J.J. Williams, Z. Wu, and A.K. Katsaggelos, Audio-Visual Continuous Speech Recognition Using MPEG-4 Compliant Visual Features, to appear, Proc. ICIP, September 2002.
[11] C. Xu and J.L. Prince, Gradient Vector Flow: A New External Force for Snakes, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1997.
[12] C. Xu, D.L. Pham, and J.L. Prince, Medical Image Segmentation Using Deformable Models, in SPIE Handbook on Medical Imaging, Volume III: Medical Image Analysis, J.M. Fitzpatrick and M. Sonka, Eds., May 2000.
[13] Text for ISO/IEC FDIS 14496-2 Visual, ISO/IEC JTC1/SC29/WG11 N2502, Nov. 1998.
[14] F. Lavagetto and R. Pockaj, The Facial Animation Engine: Toward a High-Level Interface for the Design of MPEG-4 Compliant Animated Faces, IEEE Trans. on Circuits and Systems for Video Technology, vol. 9, no. 2, pp. 277-289, March 1999.
[15] /fae.htm
