Lipreading using Profile Lips Rebuilt by 3D Data from the Kinect
Journal of Computational Information Systems 11: 7 (2015) Available at

Lipreading using Profile Lips Rebuilt by 3D Data from the Kinect

Jianrong WANG 1, Yongchun GAO 1, Ju ZHANG 1, Jianguo WEI 2, Jianwu DANG 1
1 School of Computer Science and Technology, Tianjin University, Tianjin, China
2 School of Computer Software, Tianjin University, Tianjin, China

Abstract

Lipreading plays an important role in helping hearing-impaired people understand fluent speech, yet most lipreading studies assume frontal images of the speaker's face, which are easily affected by variations in each speaker's lip size and in illumination. This paper investigates the contribution of 3D data captured by the Kinect to the robustness of a lipreading system. To supplement the information in the frontal lip, left and right profile lips are rebuilt from the 3D coordinates. In the experiments, the feature integrating the profile lips yields superior performance over the visual-only feature, with a relative increase of 7% in recognition rate.

Keywords: Lipreading; 3D Data; Kinect; Profile Lip

1 Introduction

Lipreading improves the robustness of speech recognition by establishing and analysing parameters of mouth movement, or by directly using image sequences for classification and identification. Lipreading systems based on image sequences of the speaker's lips have recently attracted significant interest, and a great deal of progress has been achieved [1, 2, 11]. Lipreading research puts particular emphasis on lip detection and feature extraction. Feature extraction is a crucial part of a lipreading system, and various visual features have been proposed in the literature. In general, they can be categorized into three kinds: 1. pixel based, where the entire image containing the speaker's lip is considered informative; 2.
lip contour based, in which a lip contour model serves as the visual feature; and 3. a combination of 1 and 2. Among these approaches, the pixel-based one is assumed to be the most efficient [12, 13, 16]. However, differences may arise when collecting the data, such as different lip sizes across speakers, local and global changes in illumination, and variations in head pose; in addition, poor mouth ROI localisation may occur during lip detection. These differences can significantly degrade the performance of a lipreading system.

Project supported in part by the National Natural Science Foundation (surface project No , National Key Basic Research Program No. 2013CB and key projects No ). Corresponding author. Address: jianguo.fr@gmail.com (Jianguo WEI) / Copyright 2015 Binary Information Press DOI: /jcis13691 April 1, 2015
To alleviate the above problems, a few datasets and experimental results have been published that utilize some form of 3D information from the speaker's face. For example, [10] developed a lip tracking system that allows the speaker's head to move in 3D and rotate up to 30 degrees away from the camera. In [19], three-dimensional characteristics were used for word recognition, and the results indicated that the recognition rate for three-dimensional characteristics was higher than for two-dimensional ones. The in-car Spanish database AV@CAR was captured from six different angles in order to reconstruct a 3D textured mesh of the speaker's face [14]. More recently, the MS Kinect sensor has been supported by an SDK that provides tools for real-time face tracking and predefines the face with 121 3D coordinate points. As a result, some researchers have turned their attention to multi-modal AVSR systems: the University of Texas recorded the BAVCD database [4] and built a multi-modal AVSR system investigating the use of facial depth information [5, 6], while a Turkish university used angles computed from the 3D coordinate points as features and a KNN classifier to recognize words [22]. However, most lipreading research has been confined to the frontal face, whether using visual features or 3D data; in the real world, though, it is hard for a speaker to maintain a frontal view at all times. Consequently, some work has focused on non-frontal video data for AVASR [23, 24], and the results demonstrated that useful speech information can be obtained from non-frontal visual features; the profile lip feature even yielded superior results in [8]. The main purpose of this paper is to propose a new lipreading system integrating the speech information extracted from the visual data with profile lips rebuilt from the 3D data.
This constitutes the first attempt at a lipreading system using this novel feature. In this paper, a Chinese audio-visual corpus with 3D data was collected, and a projection technique using 3D coordinates to locate the lip was introduced. Since changes in the speaker's head pose may leave different information in the two profile lips, the main contribution of this paper is to rebuild both profile lips from the 3D coordinates captured by the Kinect to supplement the information in the frontal lip. The remainder of this paper is organized as follows: Section 2 introduces the new lipreading system, including locating the lip by 3D projection, rebuilding the profile lips, and integrating them with the visual information. The results are presented in Section 3. Finally, Section 4 concludes this work.

2 The Lipreading System

This part consists of lip location by 3D projection, rebuilding of the profile lips, feature extraction applied to both the visual data and the 3D data, and model training and testing on the feature integrating the visual feature with the profile lip feature. These are discussed in more detail in the following subsections. The lipreading system overview is depicted in Fig. 1.

2.1 Lip location by 3D projection

Before feature extraction, the primary task is lip location. This paper adopts 3D projection rather than traditional methods based on image processing [13, 20]. The 3D projection uses the 3D data captured by the Kinect together with the imaging principle, shown in Fig. 2, to estimate the coordinate of the center pixel of the lip; then, taking this center as the midpoint, it extends outward to
obtain the lip portion, a 32*32 pixel area around the lip.

Fig. 1: Overview of the lipreading system integrating the visual data with 3D data

Fig. 2: The schematic of the Kinect imaging principle

The schematic of the Kinect imaging principle defines three coordinate systems: x1o1y1 is the camera coordinate system, x2o2y2 is the imaging plane coordinate system, and upv is the image coordinate system. The center of the camera coordinate system lies on the same straight line as that of the imaging plane coordinate system. Assuming the distance between them is m1, the coordinate of o2 is (0, 0, m1). Let the horizontal field-of-view angle be α and the vertical angle be β. If m1 is known, the real length and width of the imaging plane are respectively

Length: L = 2 m1 tan(α/2) (1)
Width: S = 2 m1 tan(β/2) (2)

Suppose there is a point in space with coordinate (x1, y1, z1), and its image point on the imaging plane is a, with coordinate (x2, y2, z2), where z2 = m1. By similar triangles, the following relation holds:

x1/x2 = y1/y2 = z1/z2 (3)
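As a concrete sketch of this projection (Eq. (3) above together with the imaging-plane and pixel scaling of Eqs. (1), (2), (4) and (5)), the code below maps a camera-space point to a pixel coordinate. `project_point` is a hypothetical helper name; m1 = 0.7 m follows the paper, and the vertical centre offset (240 − y3) is our assumption, since the text states only the horizontal term (320 + x3) explicitly.

```python
import math

def project_point(x1, y1, z1, m1=0.7, res=(640, 480),
                  h_fov_deg=57.0, v_fov_deg=43.0):
    """Map a camera-space point (x1, y1, z1) to a pixel coordinate using
    the Kinect imaging principle. m1 is the camera-to-imaging-plane
    distance (0.7 m in the paper); the FOV angles are the Kinect's
    57 and 43 degrees."""
    m, n = res
    # Real length/width of the imaging plane from the FOV angles (Eqs. 1-2)
    L = 2 * m1 * math.tan(math.radians(h_fov_deg) / 2)
    S = 2 * m1 * math.tan(math.radians(v_fov_deg) / 2)
    # Similar triangles: where the point lands on the imaging plane (Eq. 3)
    x2 = x1 * m1 / z1
    y2 = y1 * m1 / z1
    # The pixel grid is proportional to the imaging plane (Eqs. 4-5)
    x3 = x2 * m / L
    y3 = y2 * n / S
    # Shift to the image centre; the horizontal term (320 + x3) is stated
    # in the paper, the vertical one (240 - y3) is our assumption
    return m / 2 + x3, n / 2 - y3
```

Applied to the 3D point of the lip centre, the returned pixel would serve as the midpoint of the 32*32 ROI.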
In addition, assume the pixel coordinate of a on the imaging plane is (x3, y3, z3) with o2 as the origin. If the resolution of the image is m*n, the formulas below follow from the fact that the pixel grid is proportional to the imaging plane:

x2/x3 = L/m (4)
y2/y3 = S/n (5)

Since the horizontal field-of-view angle of the Kinect is 57° and the vertical angle is 43°, together with the principle of coordinate transformation in the Kinect SDK, the real pixel coordinate of a in the picture is (320 + x3, y3). Combining these formulas shows that m1, the distance between the imaging plane and the camera, is the only unknown. This paper takes m1 as 0.7 m and then converts the color images to grayscale. The lip regions obtained by the above steps are shown in Fig. 3.

Fig. 3: The gray image of the lip area for all the speakers

2.2 Profile lips rebuilt from 3D data

The Kinect is supported by a Face Tracking SDK that predefines 121 face points with 3D coordinates, 18 of which represent the lip, and each lip point is assigned an integer ID. Take the rebuilding of the right profile lip as an example. To locate the lip region and identify the graphics border, the first step is to generate a grid map of the right lip. To avoid interference from the left lip, only the 11 3D coordinate points corresponding to the right lip are chosen. Fig. 4(a) shows the right lip contour plotted from these 11 3D points, which only conveys the two-dimensional shape of the lip. The lip contour is then interpolated onto a grid map according to the correspondence between the z-axis and the x- and y-axes, as exhibited in Fig. 4(b). The second step is filling the grid map with color. Fig.
4(c) shows clearly that the color shading corresponds to the z-axis: the color deepens as the distance decreases. The final step is projection and rotation. What has been obtained so far is only the right lip from the front view. Projecting by changing the viewpoint is necessary to generate the profile lip, i.e., the right profile lip as seen from the speaker's right side. Because this projected view is oriented downward, it must be rotated 90 degrees to obtain the right profile lip in the normal orientation. Finally, the rebuilt profile lip is saved as a 60*60 pixel picture in BMP format. Fig. 4 provides a flowchart of these steps. Since the right lip and the left lip of each speaker contain different information, this paper rebuilds both the right and the left profile lips following the same procedure as in the flowchart. Fig. 5 presents the right profile lip and the left profile lip rebuilt from the 3D data of the same frame.
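A minimal sketch of the three rebuilding steps might look as follows. `rebuild_right_profile` is a hypothetical name; nearest-neighbour rasterisation stands in for the paper's interpolation, and a simple rotation stands in for the perspective projection (both assumptions).

```python
import numpy as np

def rebuild_right_profile(points, grid=60):
    """Sketch of rebuilding a profile lip from the 11 right-lip 3D points.
    Step 1: rasterise depth (z) over an x-y grid (nearest neighbour here,
    in place of the paper's interpolation). Step 2: map depth to intensity,
    deepening the shade as the point gets closer. Step 3: stand-in for the
    side-view projection plus the 90-degree rotation to the normal view."""
    pts = np.asarray(points, dtype=float)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    gx = np.linspace(x.min(), x.max(), grid)
    gy = np.linspace(y.min(), y.max(), grid)
    xx, yy = np.meshgrid(gx, gy)
    # Step 1: each grid cell takes the depth of its nearest lip point
    d2 = (xx[..., None] - x) ** 2 + (yy[..., None] - y) ** 2
    depth = z[d2.argmin(axis=-1)]
    # Step 2: larger intensity (deeper shade) for closer points
    shade = 255.0 * (depth.max() - depth) / (np.ptp(depth) + 1e-9)
    # Step 3: rotate 90 degrees; a real implementation would first render
    # the coloured surface from the side view
    return np.rot90(shade).astype(np.uint8)   # 60*60, saved as BMP in the paper
```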
Fig. 4: The flowchart of rebuilding the profile lip from 3D data

Fig. 5: The left profile lip (a) and the right profile lip (b) rebuilt from the 3D data of the same frame

2.3 Feature extraction

After obtaining the gray-level image of the ROI and the rebuilt profile lips from the previous processing, the next step is to transform this image information into feature vectors that capture the speech information. This paper applies the same feature extraction method to the ROI images and the profile lip images; the procedure is illustrated in Fig. 6.

Fig. 6: The block diagram of feature extraction

Motivated by previous work [13, 17], this paper chooses the DCT transform, whose variants include one-dimensional DCT, two-dimensional DCT, and block-based DCT. This paper applies the two-dimensional DCT, as the variants perform similarly for the lipreading task [19], and uses the Zig-Zag method to unroll the DCT matrix into a 1*1024 row vector. Before applying the DCT to the profile lips rebuilt from the 3D data, however, each image needs to be compressed to 32*32 pixels, because the DCT allows fast implementations when the dimensions are powers of 2 [13, 18]. To avoid the curse of dimensionality, this paper applies PCA to reduce the dimensionality, in view of its excellent ability to compress information. This combination is assumed to take the advantages of these
two transforms: the DCT is preferable for differentiating frequencies, while PCA is beneficial for selecting the most important components [21]. This combination method outperforms the traditional Zig-Zag [7]. This paper projects the features down to 52 dimensions. Normalization is necessary to improve the robustness of the features. This paper applies feature mean normalization (FMN) by simply subtracting the feature mean computed over the entire utterance of length T:

x̄_t = x_t − (1/T) Σ_{i=1}^{T} x_i,  t = 1, 2, ..., T (6)

where t is the time frame, T is the total number of frames in one word, and x_t is the visual feature vector. To capture information representing the lip movement, take J as the window length and H as the hop, and concatenate the J frames of features within a window, similarly to the windowing in audio signal processing, to obtain the lip dynamic information:

C_t = [x_{t−⌊J/2⌋}^T, ..., x_t^T, ..., x_{t+⌊J/2⌋}^T]^T (7)

where x_t is the feature vector of frame t. This paper takes J = 3, H = 1.

Fig. 7: The schematic of windowing for the visual feature

3 Experiments and Results

This paper performs speech recognition on isolated Chinese words. The baseline experiments are implemented with a single feature (i.e., the visual-only feature and each side of the profile lips). Then visual-only lipreading is compared with lipreading fused with the 3D data, and lipreading with a single profile lip is compared with lipreading using both profile lips. The HTK toolkit is used for both system training and testing [25], implementing three-state phoneme HMMs with a mixture of two Gaussians per state. Since windowing the integrated feature increases its dimensionality and can easily cause the curse of dimensionality, PCA is applied to reduce the dimensionality of the integrated feature.
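The feature pipeline of Section 2.3 (2D DCT, Zig-Zag scan, FMN of Eq. (6) and windowing of Eq. (7); PCA omitted) might be sketched as below. The JPEG-style Zig-Zag ordering and the edge padding in the windowing step are our assumptions, as the paper does not state its exact variants.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2D DCT-II of a square block (applied to the 32*32 lip
    images in the paper)."""
    n = block.shape[0]
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2.0)
    return C @ block @ C.T

def zigzag(mat):
    """Zig-Zag scan of the DCT matrix into a row vector (1*1024 for a
    32*32 input); JPEG-style ordering assumed."""
    n = mat.shape[0]
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return np.array([mat[i, j] for i, j in order])

def fmn(feats):
    """Feature mean normalization, Eq. (6): subtract the mean over the
    utterance's T frames. `feats` has shape (T, D)."""
    feats = np.asarray(feats, dtype=float)
    return feats - feats.mean(axis=0)

def window_concat(feats, J=3, H=1):
    """Windowing of Eq. (7): concatenate J neighbouring frames with hop H.
    Edge frames are padded by repetition -- an assumption, since the paper
    does not state its boundary handling."""
    feats = np.asarray(feats, dtype=float)
    half = J // 2
    padded = np.pad(feats, ((half, half), (0, 0)), mode="edge")
    return np.stack([padded[t:t + J].ravel()
                     for t in range(0, len(feats), H)])
```

In use, each 32*32 image would pass through `dct2` and `zigzag`, PCA would project the result down to 52 dimensions, and `fmn` plus `window_concat` would then produce the dynamic feature before fusion.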
When integrating the right profile lip with the left profile lip (referred to as RL), each feature has dimension 39. When integrating the visual feature with the one rebuilt from the 3D data, the visual feature has dimension 52 and the 3D feature has dimension 26. In addition, the dimension of every integrated feature is reduced to 78 after PCA.

3.1 Database

To allow experiments on a Chinese audio-visual corpus with 3D data, a suitable database was collected in the recording studio of the computer science and technology department at Tianjin University,
which provides clean acoustics and controlled illumination. Each speaker sat 0.9 m from the camera in front of a solid blue background. The corpus consists of audio and full-face frontal video with 3D data from 10 speakers, with an equal number of men and women. 40 Chinese words were compiled to guarantee phoneme balance, and each word was pronounced by each speaker 10 times. The device used to capture the data is the Microsoft Kinect. The Kinect uses 4 microphones to capture audio, together with a color camera for the color video images; the 3D data is collected by a laser emitter and an IR camera. This guarantees that the Kinect can capture the audio, visual, and 3D data at the same time. The audio is two-channel 16-bit 44.1 kHz PCM; the color video is 640*480 pixels, 24-bit RGB at 30 fps. Corresponding to each frame of color image, the Kinect yields 3D data whose full format is shown in Fig. 8, providing 121 3D points to describe the face contour: the first and second columns are the timestamp of the 3D data, the third column is the ID number of each of the 121 points, and the next three columns are respectively the x-, y-, and z-coordinates of the points.

Fig. 8: The format of the 3D coordinate data captured by the Kinect

3.2 Experimental results

The experimental results are given in the following tables. Table 1 lists the baseline results; Table 2 covers the integrated-feature experiments, including the left profile lip fused with the right profile lip, and the visual feature integrated respectively with the left profile lip, the right profile lip, and RL.
Table 1: Word recognition accuracy based on visual-only features and profile lips rebuilt from 3D data

Feature | Recognition accuracy
Visual-only |
Left profile |
Right profile |

Table 2: Word recognition accuracy for the integrated features

Integrated feature | Before PCA | After PCA
Left profile lip + right profile lip (RL) | |
Visual + left profile lip | |
Visual + right profile lip | |
Visual + RL | |

Checking the database, it is easy to find that some speakers' heads turn slightly to the left for some words, owing to the fact that it is hard to maintain a frontal view continuously. This leads to the result
that the rebuilt right profile lip outperforms the left profile lip, as shown in Table 1. This is consistent with the results in [3, 9, 15] that recognition accuracy degrades as the speaker's head pose deviates from frontal. Table 2 presents the effect of integrating the 3D data with the visual data, as well as the benefit of feature transformation by PCA. It is obvious that the integrated feature does improve recognition accuracy compared with any single feature. Furthermore, PCA significantly improves lipreading accuracy. That the feature fusing the left profile lip with the right profile lip outperforms either single profile lip reflects that the information contained in one profile lip is limited and cannot fully represent the side-lip information of the speaker. It can also be noted that the feature integrating the visual data with the 3D data performs better than the visual-only feature, demonstrating that the 3D data contributes greatly to the robustness of lipreading.

4 Conclusion

This paper explores a new lipreading system that adopts 3D data captured by the Kinect, integrating profile lips rebuilt from the 3D data with the visual feature to improve the performance of traditional lipreading. In addition, this paper employs 3D projection to locate the lip and applies the same feature extraction method to the visual data and the 3D data. The results reveal that the 3D data does improve the robustness of visual-only lipreading. The results also indicate that the left profile lips rebuilt from the 3D data outperform the right ones, exactly consistent with the conclusion in [3, 9, 15] that performance degrades as the head pose deviates from the frontal view.
However, most work has neglected the contribution of 3D data to profile-lip lipreading, and no database exists to support such work. The future work of this paper is therefore to build a database containing 3D data as well as audio and visual data, to explore whether the 3D data provides sufficient information to improve the robustness of multi-pose lipreading.

Acknowledgement

The research was supported in part by the National Natural Science Foundation (surface project No , National Key Basic Research Program No. 2013CB and key projects No ).

References

[1] C. Bregler and Y. Konig. Eigenlips for robust speech recognition. In Acoustics, Speech, and Signal Processing, ICASSP-94, 1994 IEEE International Conference on, vol. 2, pp. II-669. IEEE,
[2] G. I. Chiou and J.-N. Hwang. Lipreading from color video. Image Processing, IEEE Transactions on, 6 (8): , 1997.
[3] V. Estellers and J.-P. Thiran. Multipose audio-visual speech recognition. In EUSIPCO Proceedings, number EPFL-CONF ,
[4] G. Galatas, G. Potamianos, D. I. Kosmopoulos, C. McMurrough, and F. Makedon. Bilingual corpus for AVASR using multiple sensors and depth information. In AVSP, pp ,
[5] G. Galatas, G. Potamianos, and F. Makedon. Audio-visual speech recognition incorporating facial depth information captured by the Kinect. In Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European, pp . IEEE,
[6] G. Galatas, G. Potamianos, and F. Makedon. Audio-visual speech recognition using depth information from the Kinect in noisy video conditions. In Proceedings of the 5th International Conference on PErvasive Technologies Related to Assistive Environments, pp. 2. ACM,
[7] X. Hong, H. Yao, Y. Wan, and R. Chen. A PCA-based visual DCT feature extraction method for lipreading. In Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 06, International Conference on, pp . IEEE,
[8] K. Kumar, T. Chen, and R. M. Stern. Profile view lip reading. In Acoustics, Speech and Signal Processing, ICASSP , IEEE International Conference on, vol. 4, pp. IV-429. IEEE,
[9] K. Kumatani and R. Stiefelhagen. State synchronous modeling on phone boundary for audio visual speech recognition and application to multi-view face images. In Acoustics, Speech and Signal Processing, ICASSP , IEEE International Conference on, vol. 4, pp. IV-417. IEEE,
[10] G. Loy, E.-J. Holden, and R. Owens. 3D head tracker for an automatic lipreading system. In Proc. Australian Conf. on Robotics and Automation (ACRA2000),
[11] J. Luettin, N. A. Thacker, and S. W. Beet. Speechreading using shape and intensity information. In Spoken Language, ICSLP 96, Proceedings, Fourth International Conference on, vol. 1, pp . IEEE,
[12] I. Matthews, T. F. Cootes, J. A. Bangham, S. Cox, and R. Harvey. Extraction of visual features for lipreading. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24 (2): ,
[13] C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, D. Vergyri, J. Sison, A. Mashari, and J. Zhou. Audio-visual speech recognition. In Final Workshop 2000 Report, vol. 764,
[14] A. Ortega, F. Sukno, E. Lleida, A. F. Frangi, A. Miguel, L. Buera, and E. Zacur. AV@CAR: A Spanish multichannel multimodal corpus for in-vehicle automatic audio-visual speech recognition. In LREC,
[15] A. Pass, J. Zhang, and D. Stewart. An investigation into features for multi-view lipreading. In Image Processing (ICIP), th IEEE International Conference on, pp . IEEE,
[16] G. Potamianos, H. P. Graf, and E. Cosatto. An image transform approach for HMM based automatic lipreading. In Image Processing, ICIP 98, Proceedings International Conference on, pp . IEEE,
[17] G. Potamianos, C. Neti, G. Iyengar, A. W. Senior, and A. Verma. A cascade visual front end for speaker independent automatic speechreading. International Journal of Speech Technology, 4 (3-4): ,
[18] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C: The Art of Scientific Computing, pp. 92,
[19] K. Uda, N. Tagawa, A. Minagawa, and T. Moriya. Effectiveness evaluation of word characteristics obtained from 3D image information for lipreading. In Image Analysis and Processing, Proceedings, 11th International Conference on, pp . IEEE, 2001.
[20] S. Werda, W. Mahdi, and A. B. Hamadou. Lip localization and viseme classification for visual speech recognition. arXiv preprint arXiv: ,
[21] Q. Yang and X. Chen. An improved grid search algorithm and its application in PCA and SVM based face recognition. Journal of Computational Information Systems, 10 (3): ,
[22] A. Yargic and M. Dogan. A lip reading application on MS Kinect camera. In Innovations in Intelligent Systems and Applications (INISTA), 2013 IEEE International Symposium on, pp . IEEE,
[23] T. Yoshinaga, S. Tamura, K. Iwano, and S. Furui. Audio-visual speech recognition using lip movement extracted from side-face images. In AVSP 2003 - International Conference on Audio-Visual Speech Processing,
[24] T. Yoshinaga, S. Tamura, K. Iwano, and S. Furui. Audio-visual speech recognition using new lip features extracted from side-face images. In COST278 and ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction,
[25] S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, et al. The HTK Book (for HTK version 3.4). Cambridge University Engineering Department, 2 (2): 2-3, 2006.
More informationVisual Front-End Wars: Viola-Jones Face Detector vs Fourier Lucas-Kanade
ISCA Archive http://www.isca-speech.org/archive Auditory-Visual Speech Processing (AVSP) 2013 Annecy, France August 29 - September 1, 2013 Visual Front-End Wars: Viola-Jones Face Detector vs Fourier Lucas-Kanade
More informationJPEG compression of monochrome 2D-barcode images using DCT coefficient distributions
Edith Cowan University Research Online ECU Publications Pre. JPEG compression of monochrome D-barcode images using DCT coefficient distributions Keng Teong Tan Hong Kong Baptist University Douglas Chai
More informationLOCAL APPEARANCE BASED FACE RECOGNITION USING DISCRETE COSINE TRANSFORM
LOCAL APPEARANCE BASED FACE RECOGNITION USING DISCRETE COSINE TRANSFORM Hazim Kemal Ekenel, Rainer Stiefelhagen Interactive Systems Labs, University of Karlsruhe Am Fasanengarten 5, 76131, Karlsruhe, Germany
More informationNearest Clustering Algorithm for Satellite Image Classification in Remote Sensing Applications
Nearest Clustering Algorithm for Satellite Image Classification in Remote Sensing Applications Anil K Goswami 1, Swati Sharma 2, Praveen Kumar 3 1 DRDO, New Delhi, India 2 PDM College of Engineering for
More informationMultifactor Fusion for Audio-Visual Speaker Recognition
Proceedings of the 7th WSEAS International Conference on Signal, Speech and Image Processing, Beijing, China, September 15-17, 2007 70 Multifactor Fusion for Audio-Visual Speaker Recognition GIRIJA CHETTY
More informationHuman Motion Detection and Tracking for Video Surveillance
Human Motion Detection and Tracking for Video Surveillance Prithviraj Banerjee and Somnath Sengupta Department of Electronics and Electrical Communication Engineering Indian Institute of Technology, Kharagpur,
More informationComponent-based Face Recognition with 3D Morphable Models
Component-based Face Recognition with 3D Morphable Models Jennifer Huang 1, Bernd Heisele 1,2, and Volker Blanz 3 1 Center for Biological and Computational Learning, M.I.T., Cambridge, MA, USA 2 Honda
More informationFace Recognition At-a-Distance Based on Sparse-Stereo Reconstruction
Face Recognition At-a-Distance Based on Sparse-Stereo Reconstruction Ham Rara, Shireen Elhabian, Asem Ali University of Louisville Louisville, KY {hmrara01,syelha01,amali003}@louisville.edu Mike Miller,
More informationResearch on Emotion Recognition for Facial Expression Images Based on Hidden Markov Model
e-issn: 2349-9745 p-issn: 2393-8161 Scientific Journal Impact Factor (SJIF): 1.711 International Journal of Modern Trends in Engineering and Research www.ijmter.com Research on Emotion Recognition for
More informationMulti-View Image Coding in 3-D Space Based on 3-D Reconstruction
Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction Yongying Gao and Hayder Radha Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48823 email:
More informationAn ICA based Approach for Complex Color Scene Text Binarization
An ICA based Approach for Complex Color Scene Text Binarization Siddharth Kherada IIIT-Hyderabad, India siddharth.kherada@research.iiit.ac.in Anoop M. Namboodiri IIIT-Hyderabad, India anoop@iiit.ac.in
More informationFace Quality Assessment System in Video Sequences
Face Quality Assessment System in Video Sequences Kamal Nasrollahi, Thomas B. Moeslund Laboratory of Computer Vision and Media Technology, Aalborg University Niels Jernes Vej 14, 9220 Aalborg Øst, Denmark
More informationHuman pose estimation using Active Shape Models
Human pose estimation using Active Shape Models Changhyuk Jang and Keechul Jung Abstract Human pose estimation can be executed using Active Shape Models. The existing techniques for applying to human-body
More informationOn Modeling Variations for Face Authentication
On Modeling Variations for Face Authentication Xiaoming Liu Tsuhan Chen B.V.K. Vijaya Kumar Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213 xiaoming@andrew.cmu.edu
More informationHand gesture recognition with Leap Motion and Kinect devices
Hand gesture recognition with Leap Motion and devices Giulio Marin, Fabio Dominio and Pietro Zanuttigh Department of Information Engineering University of Padova, Italy Abstract The recent introduction
More informationIris Recognition for Eyelash Detection Using Gabor Filter
Iris Recognition for Eyelash Detection Using Gabor Filter Rupesh Mude 1, Meenakshi R Patel 2 Computer Science and Engineering Rungta College of Engineering and Technology, Bhilai Abstract :- Iris recognition
More informationHead Frontal-View Identification Using Extended LLE
Head Frontal-View Identification Using Extended LLE Chao Wang Center for Spoken Language Understanding, Oregon Health and Science University Abstract Automatic head frontal-view identification is challenging
More information3D LIP TRACKING AND CO-INERTIA ANALYSIS FOR IMPROVED ROBUSTNESS OF AUDIO-VIDEO AUTOMATIC SPEECH RECOGNITION
3D LIP TRACKING AND CO-INERTIA ANALYSIS FOR IMPROVED ROBUSTNESS OF AUDIO-VIDEO AUTOMATIC SPEECH RECOGNITION Roland Goecke 1,2 1 Autonomous System and Sensing Technologies, National ICT Australia, Canberra,
More informationCombining Audio and Video for Detection of Spontaneous Emotions
Combining Audio and Video for Detection of Spontaneous Emotions Rok Gajšek, Vitomir Štruc, Simon Dobrišek, Janez Žibert, France Mihelič, and Nikola Pavešić Faculty of Electrical Engineering, University
More informationarxiv: v1 [cs.cv] 3 Oct 2017
Which phoneme-to-viseme maps best improve visual-only computer lip-reading? Helen L. Bear, Richard W. Harvey, Barry-John Theobald and Yuxuan Lan School of Computing Sciences, University of East Anglia,
More informationMouth Region Localization Method Based on Gaussian Mixture Model
Mouth Region Localization Method Based on Gaussian Mixture Model Kenichi Kumatani and Rainer Stiefelhagen Universitaet Karlsruhe (TH), Interactive Systems Labs, Am Fasanengarten 5, 76131 Karlsruhe, Germany
More informationRobust Steganography Using Texture Synthesis
Robust Steganography Using Texture Synthesis Zhenxing Qian 1, Hang Zhou 2, Weiming Zhang 2, Xinpeng Zhang 1 1. School of Communication and Information Engineering, Shanghai University, Shanghai, 200444,
More informationGeneric Face Alignment Using an Improved Active Shape Model
Generic Face Alignment Using an Improved Active Shape Model Liting Wang, Xiaoqing Ding, Chi Fang Electronic Engineering Department, Tsinghua University, Beijing, China {wanglt, dxq, fangchi} @ocrserv.ee.tsinghua.edu.cn
More informationIMPROVED FACE RECOGNITION USING ICP TECHNIQUES INCAMERA SURVEILLANCE SYSTEMS. Kirthiga, M.E-Communication system, PREC, Thanjavur
IMPROVED FACE RECOGNITION USING ICP TECHNIQUES INCAMERA SURVEILLANCE SYSTEMS Kirthiga, M.E-Communication system, PREC, Thanjavur R.Kannan,Assistant professor,prec Abstract: Face Recognition is important
More informationA reversible data hiding based on adaptive prediction technique and histogram shifting
A reversible data hiding based on adaptive prediction technique and histogram shifting Rui Liu, Rongrong Ni, Yao Zhao Institute of Information Science Beijing Jiaotong University E-mail: rrni@bjtu.edu.cn
More informationFace Alignment Under Various Poses and Expressions
Face Alignment Under Various Poses and Expressions Shengjun Xin and Haizhou Ai Computer Science and Technology Department, Tsinghua University, Beijing 100084, China ahz@mail.tsinghua.edu.cn Abstract.
More informationHuman Detection and Tracking for Video Surveillance: A Cognitive Science Approach
Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach Vandit Gajjar gajjar.vandit.381@ldce.ac.in Ayesha Gurnani gurnani.ayesha.52@ldce.ac.in Yash Khandhediya khandhediya.yash.364@ldce.ac.in
More informationA Study on Similarity Computations in Template Matching Technique for Identity Verification
A Study on Similarity Computations in Template Matching Technique for Identity Verification Lam, S. K., Yeong, C. Y., Yew, C. T., Chai, W. S., Suandi, S. A. Intelligent Biometric Group, School of Electrical
More informationTowards Lipreading Sentences with Active Appearance Models
Towards Lipreading Sentences with Active Appearance Models George Sterpu, Naomi Harte Sigmedia, ADAPT Centre, School of Engineering, Trinity College Dublin, Ireland sterpug@tcd.ie, nharte@tcd.ie Abstract
More informationAudio-visual speech recognition using deep bottleneck features and high-performance lipreading
Proceedings of APSIPA Annual Summit and Conference 215 16-19 December 215 Audio-visual speech recognition using deep bottleneck features and high-performance lipreading Satoshi TAMURA, Hiroshi NINOMIYA,
More informationPerformance analysis of robust road sign identification
IOP Conference Series: Materials Science and Engineering OPEN ACCESS Performance analysis of robust road sign identification To cite this article: Nursabillilah M Ali et al 2013 IOP Conf. Ser.: Mater.
More informationAutomatic Shadow Removal by Illuminance in HSV Color Space
Computer Science and Information Technology 3(3): 70-75, 2015 DOI: 10.13189/csit.2015.030303 http://www.hrpub.org Automatic Shadow Removal by Illuminance in HSV Color Space Wenbo Huang 1, KyoungYeon Kim
More informationSOME stereo image-matching methods require a user-selected
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 3, NO. 2, APRIL 2006 207 Seed Point Selection Method for Triangle Constrained Image Matching Propagation Qing Zhu, Bo Wu, and Zhi-Xiang Xu Abstract In order
More informationMultidirectional 2DPCA Based Face Recognition System
Multidirectional 2DPCA Based Face Recognition System Shilpi Soni 1, Raj Kumar Sahu 2 1 M.E. Scholar, Department of E&Tc Engg, CSIT, Durg 2 Associate Professor, Department of E&Tc Engg, CSIT, Durg Email:
More informationAudio Visual Isolated Oriya Digit Recognition Using HMM and DWT
Conference on Advances in Communication and Control Systems 2013 (CAC2S 2013) Audio Visual Isolated Oriya Digit Recognition Using HMM and DWT Astik Biswas Department of Electrical Engineering, NIT Rourkela,Orrisa
More information1. INTRODUCTION ABSTRACT
Weighted Fusion of Depth and Inertial Data to Improve View Invariance for Human Action Recognition Chen Chen a, Huiyan Hao a,b, Roozbeh Jafari c, Nasser Kehtarnavaz a a Center for Research in Computer
More informationImage Processing Pipeline for Facial Expression Recognition under Variable Lighting
Image Processing Pipeline for Facial Expression Recognition under Variable Lighting Ralph Ma, Amr Mohamed ralphma@stanford.edu, amr1@stanford.edu Abstract Much research has been done in the field of automated
More informationAlgorithm research of 3D point cloud registration based on iterative closest point 1
Acta Technica 62, No. 3B/2017, 189 196 c 2017 Institute of Thermomechanics CAS, v.v.i. Algorithm research of 3D point cloud registration based on iterative closest point 1 Qian Gao 2, Yujian Wang 2,3,
More informationThe Novel Approach for 3D Face Recognition Using Simple Preprocessing Method
The Novel Approach for 3D Face Recognition Using Simple Preprocessing Method Parvin Aminnejad 1, Ahmad Ayatollahi 2, Siamak Aminnejad 3, Reihaneh Asghari Abstract In this work, we presented a novel approach
More informationImage Inpainting by Hyperbolic Selection of Pixels for Two Dimensional Bicubic Interpolations
Image Inpainting by Hyperbolic Selection of Pixels for Two Dimensional Bicubic Interpolations Mehran Motmaen motmaen73@gmail.com Majid Mohrekesh mmohrekesh@yahoo.com Mojtaba Akbari mojtaba.akbari@ec.iut.ac.ir
More informationDeduction and Logic Implementation of the Fractal Scan Algorithm
Deduction and Logic Implementation of the Fractal Scan Algorithm Zhangjin Chen, Feng Ran, Zheming Jin Microelectronic R&D center, Shanghai University Shanghai, China and Meihua Xu School of Mechatronical
More informationScene Text Detection Using Machine Learning Classifiers
601 Scene Text Detection Using Machine Learning Classifiers Nafla C.N. 1, Sneha K. 2, Divya K.P. 3 1 (Department of CSE, RCET, Akkikkvu, Thrissur) 2 (Department of CSE, RCET, Akkikkvu, Thrissur) 3 (Department
More informationRecognition of Gurmukhi Text from Sign Board Images Captured from Mobile Camera
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 17 (2014), pp. 1839-1845 International Research Publications House http://www. irphouse.com Recognition of
More informationSCALE BASED FEATURES FOR AUDIOVISUAL SPEECH RECOGNITION
IEE Colloquium on Integrated Audio-Visual Processing for Recognition, Synthesis and Communication, pp 8/1 8/7, 1996 1 SCALE BASED FEATURES FOR AUDIOVISUAL SPEECH RECOGNITION I A Matthews, J A Bangham and
More informationAudio-Visual Speech Processing System for Polish with Dynamic Bayesian Network Models
Proceedings of the orld Congress on Electrical Engineering and Computer Systems and Science (EECSS 2015) Barcelona, Spain, July 13-14, 2015 Paper No. 343 Audio-Visual Speech Processing System for Polish
More informationAUDIOVISUAL SPEECH RECOGNITION USING MULTISCALE NONLINEAR IMAGE DECOMPOSITION
AUDIOVISUAL SPEECH RECOGNITION USING MULTISCALE NONLINEAR IMAGE DECOMPOSITION Iain Matthews, J. Andrew Bangham and Stephen Cox School of Information Systems, University of East Anglia, Norwich, NR4 7TJ,
More informationAn Approach for Real Time Moving Object Extraction based on Edge Region Determination
An Approach for Real Time Moving Object Extraction based on Edge Region Determination Sabrina Hoque Tuli Department of Computer Science and Engineering, Chittagong University of Engineering and Technology,
More informationTri-modal Human Body Segmentation
Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4
More informationA NEW ROBUST IMAGE WATERMARKING SCHEME BASED ON DWT WITH SVD
A NEW ROBUST IMAGE WATERMARKING SCHEME BASED ON WITH S.Shanmugaprabha PG Scholar, Dept of Computer Science & Engineering VMKV Engineering College, Salem India N.Malmurugan Director Sri Ranganathar Institute
More informationMeasurement of pinna flare angle and its effect on individualized head-related transfer functions
PROCEEDINGS of the 22 nd International Congress on Acoustics Free-Field Virtual Psychoacoustics and Hearing Impairment: Paper ICA2016-53 Measurement of pinna flare angle and its effect on individualized
More informationEUSIPCO A SPACE-VARIANT CUBIC-SPLINE INTERPOLATION
EUSIPCO 213 1569744341 A SPACE-VARIAN CUBIC-SPLINE INERPOLAION Jianxing Jiang, Shaohua Hong, Lin Wang Department of Communication Engineering, Xiamen University, Xiamen, Fujian, 3615, P.R. China. ABSRAC
More information3-D MRI Brain Scan Classification Using A Point Series Based Representation
3-D MRI Brain Scan Classification Using A Point Series Based Representation Akadej Udomchaiporn 1, Frans Coenen 1, Marta García-Fiñana 2, and Vanessa Sluming 3 1 Department of Computer Science, University
More informationFACE ANALYSIS AND SYNTHESIS FOR INTERACTIVE ENTERTAINMENT
FACE ANALYSIS AND SYNTHESIS FOR INTERACTIVE ENTERTAINMENT Shoichiro IWASAWA*I, Tatsuo YOTSUKURA*2, Shigeo MORISHIMA*2 */ Telecommunication Advancement Organization *2Facu!ty of Engineering, Seikei University
More informationVideo Inter-frame Forgery Identification Based on Optical Flow Consistency
Sensors & Transducers 24 by IFSA Publishing, S. L. http://www.sensorsportal.com Video Inter-frame Forgery Identification Based on Optical Flow Consistency Qi Wang, Zhaohong Li, Zhenzhen Zhang, Qinglong
More informationAdaptive Skin Color Classifier for Face Outline Models
Adaptive Skin Color Classifier for Face Outline Models M. Wimmer, B. Radig, M. Beetz Informatik IX, Technische Universität München, Germany Boltzmannstr. 3, 87548 Garching, Germany [wimmerm, radig, beetz]@informatik.tu-muenchen.de
More informationA QR code identification technology in package auto-sorting system
Modern Physics Letters B Vol. 31, Nos. 19 21 (2017) 1740035 (5 pages) c World Scientific Publishing Company DOI: 10.1142/S0217984917400358 A QR code identification technology in package auto-sorting system
More informationREAL-TIME FACE SWAPPING IN VIDEO SEQUENCES: MAGIC MIRROR
REAL-TIME FACE SWAPPING IN VIDEO SEQUENCES: MAGIC MIRROR Nuri Murat Arar1, Fatma Gu ney1, Nasuh Kaan Bekmezci1, Hua Gao2 and Hazım Kemal Ekenel1,2,3 1 Department of Computer Engineering, Bogazici University,
More informationIntensity-Depth Face Alignment Using Cascade Shape Regression
Intensity-Depth Face Alignment Using Cascade Shape Regression Yang Cao 1 and Bao-Liang Lu 1,2 1 Center for Brain-like Computing and Machine Intelligence Department of Computer Science and Engineering Shanghai
More informationArticulatory Features for Robust Visual Speech Recognition
Articulatory Features for Robust Visual Speech Recognition Kate Saenko, Trevor Darrell, and James Glass MIT Computer Science and Artificial Intelligence Laboratory 32 Vassar Street Cambridge, Massachusetts,
More informationRobust biometric image watermarking for fingerprint and face template protection
Robust biometric image watermarking for fingerprint and face template protection Mayank Vatsa 1, Richa Singh 1, Afzel Noore 1a),MaxM.Houck 2, and Keith Morris 2 1 West Virginia University, Morgantown,
More informationIRIS SEGMENTATION OF NON-IDEAL IMAGES
IRIS SEGMENTATION OF NON-IDEAL IMAGES William S. Weld St. Lawrence University Computer Science Department Canton, NY 13617 Xiaojun Qi, Ph.D Utah State University Computer Science Department Logan, UT 84322
More informationA Fast Personal Palm print Authentication based on 3D-Multi Wavelet Transformation
A Fast Personal Palm print Authentication based on 3D-Multi Wavelet Transformation * A. H. M. Al-Helali, * W. A. Mahmmoud, and * H. A. Ali * Al- Isra Private University Email: adnan_hadi@yahoo.com Abstract:
More informationMoving Object Detection and Tracking for Video Survelliance
Moving Object Detection and Tracking for Video Survelliance Ms Jyoti J. Jadhav 1 E&TC Department, Dr.D.Y.Patil College of Engineering, Pune University, Ambi-Pune E-mail- Jyotijadhav48@gmail.com, Contact
More informationStacked Denoising Autoencoders for Face Pose Normalization
Stacked Denoising Autoencoders for Face Pose Normalization Yoonseop Kang 1, Kang-Tae Lee 2,JihyunEun 2, Sung Eun Park 2 and Seungjin Choi 1 1 Department of Computer Science and Engineering Pohang University
More informationSpeaker Localisation Using Audio-Visual Synchrony: An Empirical Study
Speaker Localisation Using Audio-Visual Synchrony: An Empirical Study H.J. Nock, G. Iyengar, and C. Neti IBM TJ Watson Research Center, PO Box 218, Yorktown Heights, NY 10598. USA. Abstract. This paper
More informationConversion of 2D Image into 3D and Face Recognition Based Attendance System
Conversion of 2D Image into 3D and Face Recognition Based Attendance System Warsha Kandlikar, Toradmal Savita Laxman, Deshmukh Sonali Jagannath Scientist C, Electronics Design and Technology, NIELIT Aurangabad,
More information