Vulnerability of Voice Verification System with STC anti-spoofing detector to different methods of spoofing attacks


Vadim Shchemelinin 1,2, Alexandr Kozlov 2, Galina Lavrentyeva 2, Sergey Novoselov 1,2 and Konstantin Simonchik 1,2

1 ITMO University, St. Petersburg, Russia
2 Speech Technology Center Limited, St. Petersburg, Russia

Abstract. This paper explores the robustness of a text-independent voice verification system against different spoofing attacks based on speech synthesis and voice conversion techniques. Our experiments show that attacks based on speech synthesis are the most dangerous, but that a spoofing detection module based on the standard TV-JFA approach can reduce the False Acceptance error rate of the whole speaker recognition system from 80% to 1%.

Keywords: spoofing, anti-spoofing, speaker recognition, TV, SVM

1 Introduction

Speaker verification systems have become widespread in recent years. They are used in many areas of our lives: forensic research, physical access control systems, banking, and on the web. The two main roles of such systems in everyday life are usability enhancement and security. To fulfil these roles, a voice verification system must be highly robust, especially if it is used for access to a bank account or personal information.

For this reason, it is important to continuously assess the stability of voice verification systems against spoofing attacks. The greatest threat comes from automatable spoofing methods based on speech synthesis or voice conversion techniques. It is shown in [1, 2] that such attack methods may raise the false acceptance error rate to unacceptable values. As the security threat has grown, methods for detecting such attacks have also been developed. However, the question of their reliability and performance evaluation is still open.

The aim of our study was to determine the most dangerous spoofing methods for a modern verification system working together with a spoofing detection module.

2 Voice Verification System with Anti-spoofing

2.1 Voice Verification Module

One standard use case for text-independent voice verification systems is building a voice model of the client and comparing it with the client's etalon (reference) model during interaction with an IVR (Interactive Voice Response) system in a call center. The user calls the call center and uses voice commands to go through the IVR menu. Throughout the call session, the client's speech is sent to the verification system, which builds the voice model and decides whether access to confidential information should be granted or denied.

In our experiments an i-vector based speaker recognition system was used. Before feature extraction, a signal preprocessing module was applied. It included energy-based voice activity detection and clipping [3], pulse and multi-tonal detection. Pre-emphasis was also applied, and the speech signal was divided into 22 ms frames with a 50% overlap and, as in spoofing detection, multiplied by a Hamming window function.

As front-end features we selected 13 MFCC features of each frame with their first and second derivatives. The derivatives were estimated over a 5-frame context, and we also applied cepstral mean subtraction (CMS) to the cepstral coefficients.

For the acoustic space modelling we used Total Variability super-vectors with the Probabilistic LDA approach (TV-PLDA) to achieve better performance [4, 5]. According to this approach, the distribution of the i-vectors can be expressed as follows:

µ = m + Tω + ϵ,

where µ is the super-vector of the Gaussian Mixture Model (GMM) parameters of the speaker model, m is the super-vector of the Universal Background Model (UBM) parameters, T is the TV matrix defining the basis in the reduced feature space, ω is the i-vector in the reduced feature space, ω ~ N(0, 1), and ϵ is the error vector.

In our system the dimension of the TV space was 600, and the UBM was gender-independent with 512 components. The UBM was obtained by standard ML training on the telephone part of the NIST's SRE datasets (all languages, both genders) [6, 7]. In our study we used more than 4000 training speakers in total. We also used a diagonal, not a full-covariance, GMM UBM. The i-vector extractor and PLDA matrix were trained on more than [...] telephone and microphone recordings from the NIST datasets, comprising more than 4000 speakers' voices.

2.2 Spoofing Detection Module

The spoofing detection method was used in the considered speaker verification system as a preliminary step. It was first introduced in the ASVspoof Challenge 2015 [8] and achieved 3.922% EER for unknown types of spoofing attacks and 0.008% EER for known spoofing attacks. It should be mentioned that for the HMM-based spoofing attacks of the ASVspoof Challenge evaluation base, zero spoofing detection error was achieved. That was the motivation for including this method in the ASV system.

The anti-spoofing method consists of four main components:

- Pre-detection
- Acoustic feature extractor
- TV i-vector extractor
- SVM classifier

The pre-detector checked whether the input signal had zero temporal energy and, in that case, declared the signal a spoofing attack. Otherwise, acoustic features were extracted from the signal. As front-end acoustic features we used 12 Mel-Frequency Cepstral Coefficients (MFCC), 12 Mel-Frequency Principal Coefficients (MFPC) and 12 CosPhase Principal Coefficients (CosPhasePC) based on the phase spectrum, with their first and second derivatives. To obtain these coefficients, Hamming windowing was used with a window length of 256 and 50% overlap.

For the acoustic space modelling we used the standard TV-JFA approach, which is the state of the art in speaker verification [7, 9, 10]. According to this version of joint factor analysis, the i-vector of the Total Variability space is extracted by means of a JFA modification, which is a usual Gaussian factor analyser defined on the mean super-vectors of the Universal Background Model (UBM) and the Total Variability matrix T. The UBM was a diagonal-covariance Gaussian mixture model (GMM) of the described features, trained by the standard EM algorithm. For the anti-spoofing method the UBM had 1024 components, and the dimension of the TV space was [...].

Fig. 1. Voice Verification System with Anti-spoofing scheme
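The framing step of the verification front end (pre-emphasis, 22 ms frames with 50% overlap, Hamming window) and the zero-energy pre-detector can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 8 kHz sampling rate in the usage note, the pre-emphasis coefficient of 0.97 and the exact-zero energy test are all assumed here and do not come from the paper.

```python
import math


def hamming(n):
    """Return an n-point Hamming window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, sample_rate, frame_ms=22.0, overlap=0.5, preemph=0.97):
    """Pre-emphasize, cut into overlapping frames, apply a Hamming window.

    frame_ms and overlap follow the text (22 ms, 50%); preemph is an
    assumed, commonly used coefficient.
    """
    samples = [float(x) for x in samples]
    # First-order pre-emphasis: y[n] = x[n] - a * x[n-1]
    emphasized = samples[:1] + [
        samples[i] - preemph * samples[i - 1] for i in range(1, len(samples))
    ]
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    if frame_len < 1:
        raise ValueError("frame length must be at least one sample")
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    window = hamming(frame_len)
    frames = []
    # Only complete frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(emphasized) - frame_len + 1, hop):
        frames.append([emphasized[start + i] * window[i] for i in range(frame_len)])
    return frames


def has_zero_energy(samples):
    """Pre-detector sketch: True if the signal carries no energy at all."""
    return sum(float(x) ** 2 for x in samples) == 0.0
```

For one second of audio at an assumed 8 kHz sampling rate, `frame_signal(signal, 8000)` yields frames of 176 samples with a hop of 88 samples. A recording for which `has_zero_energy` returns `True` would be declared a spoofing attack before any features are extracted.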

2.3 Fusion Decision Module

The Fusion Decision Module combined the output of the speaker recognition module with the output of the spoofing detection module, as shown in figure 1. The joint decision of the verification and spoofing detection modules was expressed as

P = P_verification · (1 − P_spoofing),

where P_verification is the probability that the speaker in the test recording is the same as the speaker in the etalon, and P_spoofing is the probability that the test recording is a spoofing attack. To calculate probabilities from scores, we used the BOSARIS toolkit [18].

3 Experiments with Different Types of Spoofing

Fig. 2. DET curves (False Rejection (%) against False Acceptance (%)) for verification system without spoofing detection module against different methods of attacks. Curves: Baseline, S1 spoofing, S2 spoofing, S3 spoofing, S4 spoofing, S5 spoofing.

To examine the vulnerability of the voice verification system to different methods of spoofing attacks, we used the ASVspoof development dataset [11]. It includes genuine and spoofed speech of 35 speakers, 15 male and 20 female. There are 3497 genuine and spoofed trials. Spoofed speech is generated according to one of the five spoofing methods (S1 - S5) as follows:

S1 - Voice conversion based on a simplified frame selection algorithm [12, 13]. The converted speech is generated by selecting target speech frames.
S2 - The simplest voice conversion algorithm [14], which adjusts only the first mel-cepstral coefficient in order to shift the slope of the source spectrum to the target.
S3 - A Hidden Markov model based speech synthesis system using speaker adaptation techniques [15] and only 20 adaptation utterances.
S4 - The same Hidden Markov model based speech synthesis system using speaker adaptation techniques [15], with 40 adaptation utterances.
S5 - A voice conversion method based on the voice conversion toolkit within the Festvox system [16].

First, we checked how strongly the FA error rate increased when the voice verification system did not contain a spoofing detection module. In this step we also wanted to make sure that the spoofing techniques proposed for the ASVspoof Challenge 2015 were a threat to the verification system. As the baseline we used only the genuine speech of all speakers from the dataset described above. It is interesting to note that S2, based on conversion of the first mel-cepstral coefficient, gives the greatest detection error [17], while this method has the least impact on the verification system without a spoofing detector, as shown in figure 2.

Fig. 3. DET curves (False Rejection (%) against False Acceptance (%)) for verification system with spoofing detection module against different methods of attacks. Curves: Baseline, S1 spoofing, S2 spoofing, S3 spoofing, S4 spoofing, S5 spoofing.
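To make the evaluation procedure concrete, here is a minimal Python sketch of the fusion rule P = P_verification (1 - P_spoofing) and of measuring the FA rate of spoofed trials at the threshold of the baseline EER point. The function names, the toy scores and the brute-force threshold search are illustrative assumptions rather than the authors' implementation. The paper obtained probabilities from scores with the BOSARIS toolkit, which is not used here.

```python
def fuse(p_verification, p_spoofing):
    """Fusion rule from the text: P = P_verification * (1 - P_spoofing)."""
    return p_verification * (1.0 - p_spoofing)


def eer_threshold(target_scores, impostor_scores):
    """Return the score threshold at which FR and FA rates are closest.

    A trial is accepted when its score is >= threshold. Every observed
    score is tried as a candidate threshold (simple, not optimized).
    """
    best_thr, best_gap = None, float("inf")
    for thr in sorted(set(target_scores) | set(impostor_scores)):
        fr = sum(s < thr for s in target_scores) / len(target_scores)
        fa = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        if abs(fr - fa) < best_gap:
            best_thr, best_gap = thr, abs(fr - fa)
    return best_thr


def false_acceptance_rate(attack_scores, threshold):
    """Fraction of attack trials whose score passes the threshold."""
    return sum(s >= threshold for s in attack_scores) / len(attack_scores)


# Toy baseline (no spoofing): genuine target and zero-effort impostor trials.
targets = [0.9, 0.8, 0.7, 0.6]
impostors = [0.1, 0.2, 0.3, 0.65]
threshold = eer_threshold(targets, impostors)

# Toy spoofed trials: verification probability and spoofing probability.
spoof_verification = [0.95, 0.85, 0.75]
spoof_detection = [0.99, 0.98, 0.10]

fa_without = false_acceptance_rate(spoof_verification, threshold)
fused = [fuse(p, s) for p, s in zip(spoof_verification, spoof_detection)]
fa_with = false_acceptance_rate(fused, threshold)
```

In this toy setup the threshold is fixed once on the baseline trials and then reused for the spoofed trials, so `fa_without` and `fa_with` play the role of the FA values compared with the detector off and on.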

The results of the experiments with the spoofing detection module enabled are presented in figure 3. In addition, table 1 compares the FA values at the threshold of the baseline EER point with the spoofing detection module on and off.

Table 1. FA verification error for spoofing the verification system based on different algorithms (FA for threshold in EER point).

Voice verification system                      S1      S2     S3      S4      S5
Without spoofing detection module              52.5%   1.7%   68.5%   77.1%   63.7%
With TV-JFA based spoofing detection module    0.36%   0%     0.23%   1.35%   0.98%

As can be seen from the table, the spoofing detection module significantly improves the FA error rate. The results also demonstrate that synthesis-based spoofing methods are more dangerous than those based on voice conversion techniques.

4 Conclusions

In this paper we analyzed the vulnerability of a voice verification system based on state-of-the-art speaker recognition and spoofing detection methods to different spoofing methods based on text-to-speech and voice conversion algorithms.

As the experiments demonstrated, spoofing with a TTS voice is more threatening than the other methods. For instance, the Hidden Markov model based speech synthesis spoofing method gave a 1.35% False Acceptance error, compared to 0.98% for the method based on the voice conversion toolkit.

It can also be concluded that it is important to evaluate spoofing detection methods together with voice verification systems. First, a spoofing detector can be reliable against spoofing attacks that are not effective. Second, the system EER can be increased by the false acceptance errors of the spoofing detector.

Our results showed once again that it is highly necessary to test verification systems against spoofing by different methods, and to develop anti-spoofing algorithms that are reliable in real use cases.

This work was partially financially supported by the Government of the Russian Federation, Grant 074-U01.

References

1. Shchemelinin V., Simonchik K.: Examining Vulnerability of Voice Verification Systems to Spoofing Attacks by Means of a TTS System. Proceedings of the SPECOM 2013 (Plzen, Czech Republic, September 1-5, 2013). (2013)
2. Shchemelinin V., Topchina M., Simonchik K.: Vulnerability of Voice Verification Systems to Spoofing Attacks with TTS Voices Based on Automatically Labeled Telephone Speech. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 8773, No. LNAI. (2014)
3. Aleinik S., Matveev Y.N.: Detection of Clipped Fragments in Speech Signals. International Journal of Electrical, Electronic Science and Engineering, 8(2). (2014)
4. Kenny P.: Bayesian speaker verification with heavy tailed priors. Proceedings of the Odyssey Speaker and Language Recognition Workshop (Brno, Czech Republic, June 2010). (2010)
5. Simonchik K., Pekhovsky T., Shulipa A., Afanasyev A.: Supervized Mixture of PLDA Models for Cross-Channel Speaker Verification. Proceedings of the 13th Annual Conference of the International Speech Communication Association, Interspeech-2012 (Portland, Oregon, USA, September 9-13). (2012)
6. Matveev Yu., Simonchik K.: The speaker identification system for the NIST SRE. Proc. The 20th International Conference on Computer Graphics and Vision, GraphiCon 2010 (St. Petersburg, Russia, September). (2010)
7. Kozlov A., Kudashev O., Matveev Yu., Pekhovsky T., Simonchik K., Shulipa A.: SVID speaker recognition system for the NIST SRE. Lecture Notes in Computer Science (LNCS), vol. 8113. (2013)
8. Wu Z., Kinnunen T., Evans N., Yamagishi J.: ASVspoof 2015: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan. Dec. 19, 2014
9. Novoselov S., Pekhovsky T., Simonchik K.: STC Speaker Recognition System for the NIST i-vector Challenge. In: Proc. Odyssey: The Speaker and Language Recognition Workshop. (2014)
10. Kinnunen T., Li H.: An overview of text-independent speaker recognition: from features to supervectors. Speech Communication, vol. 52. (2010)
11. Wu Z., Kinnunen T., Evans N., Yamagishi J., Hanilçi C., Sahidullah M., Sizov A.: ASVspoof 2015: the First Automatic Speaker Verification Spoofing and Countermeasures Challenge. spoofingchallenge.org/is2015_asvspoof.pdf
12. Dutoit T., Holzapfel A., Jottrand M., Moinet A., Perez J., Stylianou Y.: Towards a voice conversion system based on frame selection. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
13. Wu Z., Virtanen T., Kinnunen T., Chng E., Li H.: Exemplar-based unit selection for voice conversion utilizing temporal information. Interspeech
14. Fukada T., Tokuda K., Kobayashi T., Imai S.: An adaptive algorithm for mel-cepstral analysis of speech. In: Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP)
15. Yamagishi J., Kobayashi T., Nakano Y., Ogata K., Isogai J.: Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm. IEEE Trans. Audio, Speech and Language Processing, vol. 17, no. 1, pp. 66-83
16. Festvox project
17. Novoselov S., Kozlov A., Lavrentyeva G., Simonchik K., Shchemelinin V.: STC Anti-spoofing Systems for the ASVspoof 2015 Challenge. wp-content/uploads/2015/06/technical_report_asvspoof2015_stc.pdf
18. BOSARIS Toolkit


More information

Voice Command Based Computer Application Control Using MFCC

Voice Command Based Computer Application Control Using MFCC Voice Command Based Computer Application Control Using MFCC Abinayaa B., Arun D., Darshini B., Nataraj C Department of Embedded Systems Technologies, Sri Ramakrishna College of Engineering, Coimbatore,

More information

SOUND EVENT DETECTION AND CONTEXT RECOGNITION 1 INTRODUCTION. Toni Heittola 1, Annamaria Mesaros 1, Tuomas Virtanen 1, Antti Eronen 2

SOUND EVENT DETECTION AND CONTEXT RECOGNITION 1 INTRODUCTION. Toni Heittola 1, Annamaria Mesaros 1, Tuomas Virtanen 1, Antti Eronen 2 Toni Heittola 1, Annamaria Mesaros 1, Tuomas Virtanen 1, Antti Eronen 2 1 Department of Signal Processing, Tampere University of Technology Korkeakoulunkatu 1, 33720, Tampere, Finland toni.heittola@tut.fi,

More information

Voice Conversion Using Dynamic Kernel. Partial Least Squares Regression

Voice Conversion Using Dynamic Kernel. Partial Least Squares Regression Voice Conversion Using Dynamic Kernel 1 Partial Least Squares Regression Elina Helander, Hanna Silén, Tuomas Virtanen, Member, IEEE, and Moncef Gabbouj, Fellow, IEEE Abstract A drawback of many voice conversion

More information

Variable-Component Deep Neural Network for Robust Speech Recognition

Variable-Component Deep Neural Network for Robust Speech Recognition Variable-Component Deep Neural Network for Robust Speech Recognition Rui Zhao 1, Jinyu Li 2, and Yifan Gong 2 1 Microsoft Search Technology Center Asia, Beijing, China 2 Microsoft Corporation, One Microsoft

More information

Voice Conversion for Non-Parallel Datasets Using Dynamic Kernel Partial Least Squares Regression

Voice Conversion for Non-Parallel Datasets Using Dynamic Kernel Partial Least Squares Regression INTERSPEECH 2013 Voice Conversion for Non-Parallel Datasets Using Dynamic Kernel Partial Least Squares Regression Hanna Silén, Jani Nurminen, Elina Helander, Moncef Gabbouj Department of Signal Processing,

More information

Pitch Prediction from Mel-frequency Cepstral Coefficients Using Sparse Spectrum Recovery

Pitch Prediction from Mel-frequency Cepstral Coefficients Using Sparse Spectrum Recovery Pitch Prediction from Mel-frequency Cepstral Coefficients Using Sparse Spectrum Recovery Achuth Rao MV, Prasanta Kumar Ghosh SPIRE LAB Electrical Engineering, Indian Institute of Science (IISc), Bangalore,

More information

Shape and Texture Based Countermeasure to Protect Face Recognition Systems Against Mask Attacks

Shape and Texture Based Countermeasure to Protect Face Recognition Systems Against Mask Attacks 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops Shape and Texture Based Countermeasure to Protect Face Recognition Systems Against Mask Attacks Neslihan Kose and Jean-Luc Dugelay

More information

Authentication of Fingerprint Recognition Using Natural Language Processing

Authentication of Fingerprint Recognition Using Natural Language Processing Authentication of Fingerprint Recognition Using Natural Language Shrikala B. Digavadekar 1, Prof. Ravindra T. Patil 2 1 Tatyasaheb Kore Institute of Engineering & Technology, Warananagar, India 2 Tatyasaheb

More information

ASVspoof 2019: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan

ASVspoof 2019: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan ASVspoof 2019: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan ASVspoof consortium http://www.asvspoof.org/ January 15, 2019 1 Introduction The ASVspoof 2019 challenge

More information

Agatha: Multimodal Biometric Authentication Platform in Large-Scale Databases

Agatha: Multimodal Biometric Authentication Platform in Large-Scale Databases Agatha: Multimodal Biometric Authentication Platform in Large-Scale Databases David Hernando David Gómez Javier Rodríguez Saeta Pascual Ejarque 2 Javier Hernando 2 Biometric Technologies S.L., Barcelona,

More information

Text-Independent Speaker Identification

Text-Independent Speaker Identification December 8, 1999 Text-Independent Speaker Identification Til T. Phan and Thomas Soong 1.0 Introduction 1.1 Motivation The problem of speaker identification is an area with many different applications.

More information

Voice Authentication Using Short Phrases: Examining Accuracy, Security and Privacy Issues

Voice Authentication Using Short Phrases: Examining Accuracy, Security and Privacy Issues Voice Authentication Using Short Phrases: Examining Accuracy, Security and Privacy Issues R.C. Johnson, Terrance E. Boult University of Colorado, Colorado Springs Colorado Springs, CO, USA rjohnso9 tboult

More information

Detector. Flash. Detector

Detector. Flash. Detector CLIPS at TRECvid: Shot Boundary Detection and Feature Detection Georges M. Quénot, Daniel Moraru, and Laurent Besacier CLIPS-IMAG, BP53, 38041 Grenoble Cedex 9, France Georges.Quenot@imag.fr Abstract This

More information

Time Analysis of Pulse-based Face Anti-Spoofing in Visible and NIR

Time Analysis of Pulse-based Face Anti-Spoofing in Visible and NIR Time Analysis of Pulse-based Face Anti-Spoofing in Visible and NIR Javier Hernandez-Ortega, Julian Fierrez, Aythami Morales, and Pedro Tome Biometrics and Data Pattern Analytics BiDA Lab Universidad Autónoma

More information

Speaker Diarization System Based on GMM and BIC

Speaker Diarization System Based on GMM and BIC Speaer Diarization System Based on GMM and BIC Tantan Liu 1, Xiaoxing Liu 1, Yonghong Yan 1 1 ThinIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences Beijing 100080 {tliu, xliu,yyan}@hccl.ioa.ac.cn

More information

Writer Identification In Music Score Documents Without Staff-Line Removal

Writer Identification In Music Score Documents Without Staff-Line Removal Writer Identification In Music Score Documents Without Staff-Line Removal Anirban Hati, Partha P. Roy and Umapada Pal Computer Vision and Pattern Recognition Unit Indian Statistical Institute Kolkata,

More information

Optimization of Observation Membership Function By Particle Swarm Method for Enhancing Performances of Speaker Identification

Optimization of Observation Membership Function By Particle Swarm Method for Enhancing Performances of Speaker Identification Proceedings of the 6th WSEAS International Conference on SIGNAL PROCESSING, Dallas, Texas, USA, March 22-24, 2007 52 Optimization of Observation Membership Function By Particle Swarm Method for Enhancing

More information

Simultaneous Design of Feature Extractor and Pattern Classifer Using the Minimum Classification Error Training Algorithm

Simultaneous Design of Feature Extractor and Pattern Classifer Using the Minimum Classification Error Training Algorithm Griffith Research Online https://research-repository.griffith.edu.au Simultaneous Design of Feature Extractor and Pattern Classifer Using the Minimum Classification Error Training Algorithm Author Paliwal,

More information

Input speech signal. Selected /Rejected. Pre-processing Feature extraction Matching algorithm. Database. Figure 1: Process flow in ASR

Input speech signal. Selected /Rejected. Pre-processing Feature extraction Matching algorithm. Database. Figure 1: Process flow in ASR Volume 5, Issue 1, January 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Feature Extraction

More information

Audiovisual Synchrony Detection with Optimized Audio Features

Audiovisual Synchrony Detection with Optimized Audio Features Audiovisual Synchrony Detection with Optimized Audio Features Sami Sieranoja, Md Sahidullah, Tomi Kinnunen School of Computing University of Eastern Finland, Joensuu, Finland Jukka Komulainen, Abdenour

More information

Biometrics already form a significant component of current. Biometrics Systems Under Spoofing Attack

Biometrics already form a significant component of current. Biometrics Systems Under Spoofing Attack [ Abdenour Hadid, Nicholas Evans, Sébastien Marcel, and Julian Fierrez ] s Systems Under Spoofing Attack [ An evaluation methodology and lessons learned ] istockphoto.com/greyfebruary s already form a

More information

Baseball Game Highlight & Event Detection

Baseball Game Highlight & Event Detection Baseball Game Highlight & Event Detection Student: Harry Chao Course Adviser: Winston Hu 1 Outline 1. Goal 2. Previous methods 3. My flowchart 4. My methods 5. Experimental result 6. Conclusion & Future

More information

Discriminative training and Feature combination

Discriminative training and Feature combination Discriminative training and Feature combination Steve Renals Automatic Speech Recognition ASR Lecture 13 16 March 2009 Steve Renals Discriminative training and Feature combination 1 Overview Hot topics

More information

Outline. Incorporating Biometric Quality In Multi-Biometrics FUSION. Results. Motivation. Image Quality: The FVC Experience

Outline. Incorporating Biometric Quality In Multi-Biometrics FUSION. Results. Motivation. Image Quality: The FVC Experience Incorporating Biometric Quality In Multi-Biometrics FUSION QUALITY Julian Fierrez-Aguilar, Javier Ortega-Garcia Biometrics Research Lab. - ATVS Universidad Autónoma de Madrid, SPAIN Loris Nanni, Raffaele

More information

Biometrics Technology: Multi-modal (Part 2)

Biometrics Technology: Multi-modal (Part 2) Biometrics Technology: Multi-modal (Part 2) References: At the Level: [M7] U. Dieckmann, P. Plankensteiner and T. Wagner, "SESAM: A biometric person identification system using sensor fusion ", Pattern

More information

Applications of Keyword-Constraining in Speaker Recognition. Howard Lei. July 2, Introduction 3

Applications of Keyword-Constraining in Speaker Recognition. Howard Lei. July 2, Introduction 3 Applications of Keyword-Constraining in Speaker Recognition Howard Lei hlei@icsi.berkeley.edu July 2, 2007 Contents 1 Introduction 3 2 The keyword HMM system 4 2.1 Background keyword HMM training............................

More information

How accurate is AGNITIO KIVOX Voice ID?

How accurate is AGNITIO KIVOX Voice ID? How accurate is AGNITIO KIVOX Voice ID? Overview Using natural speech, KIVOX can work with error rates below 1%. When optimized for short utterances, where the same phrase is used for enrolment and authentication,

More information

Speech Recognition Lecture 8: Acoustic Models. Eugene Weinstein Google, NYU Courant Institute Slide Credit: Mehryar Mohri

Speech Recognition Lecture 8: Acoustic Models. Eugene Weinstein Google, NYU Courant Institute Slide Credit: Mehryar Mohri Speech Recognition Lecture 8: Acoustic Models. Eugene Weinstein Google, NYU Courant Institute eugenew@cs.nyu.edu Slide Credit: Mehryar Mohri Speech Recognition Components Acoustic and pronunciation model:

More information

FUSION MODEL BASED ON CONVOLUTIONAL NEURAL NETWORKS WITH TWO FEATURES FOR ACOUSTIC SCENE CLASSIFICATION

FUSION MODEL BASED ON CONVOLUTIONAL NEURAL NETWORKS WITH TWO FEATURES FOR ACOUSTIC SCENE CLASSIFICATION Please contact the conference organizers at dcasechallenge@gmail.com if you require an accessible file, as the files provided by ConfTool Pro to reviewers are filtered to remove author information, and

More information

Xing Fan, Carlos Busso and John H.L. Hansen

Xing Fan, Carlos Busso and John H.L. Hansen Xing Fan, Carlos Busso and John H.L. Hansen Center for Robust Speech Systems (CRSS) Erik Jonsson School of Engineering & Computer Science Department of Electrical Engineering University of Texas at Dallas

More information

Multi-Modal Human Verification Using Face and Speech

Multi-Modal Human Verification Using Face and Speech 22 Multi-Modal Human Verification Using Face and Speech Changhan Park 1 and Joonki Paik 2 1 Advanced Technology R&D Center, Samsung Thales Co., Ltd., 2 Graduate School of Advanced Imaging Science, Multimedia,

More information

Neetha Das Prof. Andy Khong

Neetha Das Prof. Andy Khong Neetha Das Prof. Andy Khong Contents Introduction and aim Current system at IMI Proposed new classification model Support Vector Machines Initial audio data collection and processing Features and their

More information

On-line Signature Verification on a Mobile Platform

On-line Signature Verification on a Mobile Platform On-line Signature Verification on a Mobile Platform Nesma Houmani, Sonia Garcia-Salicetti, Bernadette Dorizzi, and Mounim El-Yacoubi Institut Telecom; Telecom SudParis; Intermedia Team, 9 rue Charles Fourier,

More information

LARGE-SCALE SPEAKER IDENTIFICATION

LARGE-SCALE SPEAKER IDENTIFICATION 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) LARGE-SCALE SPEAKER IDENTIFICATION Ludwig Schmidt MIT Matthew Sharifi and Ignacio Lopez Moreno Google, Inc. ABSTRACT

More information

Multimodal Biometric System by Feature Level Fusion of Palmprint and Fingerprint

Multimodal Biometric System by Feature Level Fusion of Palmprint and Fingerprint Multimodal Biometric System by Feature Level Fusion of Palmprint and Fingerprint Navdeep Bajwa M.Tech (Student) Computer Science GIMET, PTU Regional Center Amritsar, India Er. Gaurav Kumar M.Tech (Supervisor)

More information

EFFECTIVE METHODOLOGY FOR DETECTING AND PREVENTING FACE SPOOFING ATTACKS

EFFECTIVE METHODOLOGY FOR DETECTING AND PREVENTING FACE SPOOFING ATTACKS EFFECTIVE METHODOLOGY FOR DETECTING AND PREVENTING FACE SPOOFING ATTACKS 1 Mr. Kaustubh D.Vishnu, 2 Dr. R.D. Raut, 3 Dr. V. M. Thakare 1,2,3 SGBAU, Amravati,Maharashtra, (India) ABSTRACT Biometric system

More information

Secure E- Commerce Transaction using Noisy Password with Voiceprint and OTP

Secure E- Commerce Transaction using Noisy Password with Voiceprint and OTP Secure E- Commerce Transaction using Noisy Password with Voiceprint and OTP Komal K. Kumbhare Department of Computer Engineering B. D. C. O. E. Sevagram, India komalkumbhare27@gmail.com Prof. K. V. Warkar

More information

Chapter 3. Speech segmentation. 3.1 Preprocessing

Chapter 3. Speech segmentation. 3.1 Preprocessing , as done in this dissertation, refers to the process of determining the boundaries between phonemes in the speech signal. No higher-level lexical information is used to accomplish this. This chapter presents

More information

Production of Video Images by Computer Controlled Cameras and Its Application to TV Conference System

Production of Video Images by Computer Controlled Cameras and Its Application to TV Conference System Proc. of IEEE Conference on Computer Vision and Pattern Recognition, vol.2, II-131 II-137, Dec. 2001. Production of Video Images by Computer Controlled Cameras and Its Application to TV Conference System

More information

RLAT Rapid Language Adaptation Toolkit

RLAT Rapid Language Adaptation Toolkit RLAT Rapid Language Adaptation Toolkit Tim Schlippe May 15, 2012 RLAT Rapid Language Adaptation Toolkit - 2 RLAT Rapid Language Adaptation Toolkit RLAT Rapid Language Adaptation Toolkit - 3 Outline Introduction

More information

Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data

Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data Martin Karafiát Λ, Igor Szöke, and Jan Černocký Brno University of Technology, Faculty of Information Technology Department

More information

REAL-TIME ROAD SIGNS RECOGNITION USING MOBILE GPU

REAL-TIME ROAD SIGNS RECOGNITION USING MOBILE GPU High-Performance Сomputing REAL-TIME ROAD SIGNS RECOGNITION USING MOBILE GPU P.Y. Yakimov Samara National Research University, Samara, Russia Abstract. This article shows an effective implementation of

More information

EM Algorithm with Split and Merge in Trajectory Clustering for Automatic Speech Recognition

EM Algorithm with Split and Merge in Trajectory Clustering for Automatic Speech Recognition EM Algorithm with Split and Merge in Trajectory Clustering for Automatic Speech Recognition Yan Han and Lou Boves Department of Language and Speech, Radboud University Nijmegen, The Netherlands {Y.Han,

More information

Audio-visual interaction in sparse representation features for noise robust audio-visual speech recognition

Audio-visual interaction in sparse representation features for noise robust audio-visual speech recognition ISCA Archive http://www.isca-speech.org/archive Auditory-Visual Speech Processing (AVSP) 2013 Annecy, France August 29 - September 1, 2013 Audio-visual interaction in sparse representation features for

More information

HIDDEN Markov model (HMM)-based statistical parametric

HIDDEN Markov model (HMM)-based statistical parametric 1492 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 20, NO. 5, JULY 2012 Minimum Kullback Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis Zhen-Hua Ling, Member,

More information

Lec 08 Feature Aggregation II: Fisher Vector, Super Vector and AKULA

Lec 08 Feature Aggregation II: Fisher Vector, Super Vector and AKULA Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 08 Feature Aggregation II: Fisher Vector, Super Vector and AKULA Zhu Li Dept of CSEE,

More information

Figure 1. Example sample for fabric mask. In the second column, the mask is worn on the face. The picture is taken from [5].

Figure 1. Example sample for fabric mask. In the second column, the mask is worn on the face. The picture is taken from [5]. ON THE VULNERABILITY OF FACE RECOGNITION SYSTEMS TO SPOOFING MASK ATTACKS Neslihan Kose, Jean-Luc Dugelay Multimedia Department, EURECOM, Sophia-Antipolis, France {neslihan.kose, jean-luc.dugelay}@eurecom.fr

More information

2. Basic Task of Pattern Classification

2. Basic Task of Pattern Classification 2. Basic Task of Pattern Classification Definition of the Task Informal Definition: Telling things apart 3 Definition: http://www.webopedia.com/term/p/pattern_recognition.html pattern recognition Last

More information

Query-by-example spoken term detection based on phonetic posteriorgram Query-by-example spoken term detection based on phonetic posteriorgram

Query-by-example spoken term detection based on phonetic posteriorgram Query-by-example spoken term detection based on phonetic posteriorgram International Conference on Education, Management and Computing Technology (ICEMCT 2015) Query-by-example spoken term detection based on phonetic posteriorgram Query-by-example spoken term detection based

More information

Gender Classification Technique Based on Facial Features using Neural Network

Gender Classification Technique Based on Facial Features using Neural Network Gender Classification Technique Based on Facial Features using Neural Network Anushri Jaswante Dr. Asif Ullah Khan Dr. Bhupesh Gour Computer Science & Engineering, Rajiv Gandhi Proudyogiki Vishwavidyalaya,

More information

Multi-modal Person Identification in a Smart Environment

Multi-modal Person Identification in a Smart Environment Multi-modal Person Identification in a Smart Environment Hazım Kemal Ekenel 1, Mika Fischer 1, Qin Jin 2, Rainer Stiefelhagen 1 1 Interactive Systems Labs (ISL), Universität Karlsruhe (TH), 76131 Karlsruhe,

More information