SRE08 system. Nir Krause, Ran Gazit, Gennady Karvitsky. Leave impersonators, fraudsters and identity thieves speechless
1 Leave impersonators, fraudsters and identity thieves speechless. SRE08 system. Nir Krause, Ran Gazit, Gennady Karvitsky. Copyright 2008 PerSay Inc. All Rights Reserved
2 Focus: Multilingual telephone speech and 10sec conditions
3
4 Qualcomm-ICSI-OGI Wiener filter (MIC only)
5 MFCC & LPCC
6 SGM: Svm in the Gmm Models space. SGM = GMM-SVM = GSV = GMS = ?
7 NAP SGM
8 GMM
9 TNO. Thank you David!
10 Tuning with Focal. Thank you Niko!
11 Condition: PRS1 (Primary)
  Short2-short3: LPCC NAP SGM + MFCC NAP SGM + TNO
  Short2-10sec: LPCC NAP SGM + MFCC NAP SGM
  10sec-10sec: LPCC SGM + MFCC SGM + GMM
  Short2-summed: LPCC NAP SGM + MFCC NAP SGM
12
13 Super vectors
14 Super vector generation
  m1: means-only Bayesian adaptation of a UBM with 512 Gaussians (m_ubm), using the top 10 scoring Gaussians. The result is a super vector of Gaussian means.
  m2 = m1 - m_ubm
  m3(gaussian, feature): m2(gaussian, feature) scaled by weight(gaussian) and var(gaussian, feature)
  m = m3 / ||m3|| (L2 normalization)
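The steps on slide 14 can be sketched in NumPy as follows. This is a minimal illustration, not PerSay's implementation: the function names are hypothetical, the UBM is assumed to have diagonal covariances, and the scaling of m2 by sqrt(weight) / sqrt(var) is an assumption borrowed from the commonly used GMM supervector kernel, since the exact form on the slide did not survive transcription.

```python
import numpy as np

def map_adapt_means(frames, ubm_means, ubm_vars, ubm_weights,
                    relevance=3.0, top_n=10):
    """Means-only Bayesian (MAP) adaptation of a diagonal-covariance UBM.

    frames: (T, F); ubm_means, ubm_vars: (G, F); ubm_weights: (G,).
    """
    # Weighted log-likelihood of every frame under every Gaussian: (T, G).
    diff = frames[:, None, :] - ubm_means[None, :, :]
    log_p = -0.5 * np.sum(diff ** 2 / ubm_vars + np.log(2 * np.pi * ubm_vars), axis=2)
    log_p = log_p + np.log(ubm_weights)

    # Keep only the top-N scoring Gaussians for each frame.
    top_n = min(top_n, log_p.shape[1])
    kth = np.partition(log_p, -top_n, axis=1)[:, -top_n][:, None]
    log_p = np.where(log_p >= kth, log_p, -np.inf)

    # Posterior probabilities (responsibilities) of the kept Gaussians.
    log_p = log_p - log_p.max(axis=1, keepdims=True)
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order statistics, then relevance-factor interpolation.
    n = post.sum(axis=0)                               # (G,)
    mean_est = (post.T @ frames) / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]
    return alpha * mean_est + (1.0 - alpha) * ubm_means  # m1

def supervector(adapted_means, ubm_means, ubm_vars, ubm_weights):
    """Build an L2-normalized super vector from adapted means."""
    m2 = adapted_means - ubm_means
    # ASSUMPTION: sqrt(weight) / sqrt(var) scaling; the slide's exact form is unknown.
    m3 = m2 * np.sqrt(ubm_weights)[:, None] / np.sqrt(ubm_vars)
    v = m3.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# Toy usage with a random 8-Gaussian "UBM" (the slides use 512 Gaussians).
rng = np.random.default_rng(0)
G, F = 8, 5
toy_means = rng.normal(size=(G, F))
toy_vars = np.ones((G, F))
toy_weights = np.full(G, 1.0 / G)
toy_frames = rng.normal(size=(200, F)) + 0.3
m1 = map_adapt_means(toy_frames, toy_means, toy_vars, toy_weights, relevance=3.0, top_n=4)
sv = supervector(m1, toy_means, toy_vars, toy_weights)
```

The relevance factor controls how far the adapted means move away from the UBM: Gaussians that see little data stay close to m_ubm, which is why it appears among the tuned parameters later in the deck.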
15 Data engineering
16 Strategy: gender dependent; sub-condition dependent; greedy
17
18 Optimized parameters: UBM, negative & NAP speakers (among a few choices); NAP dimension; relevance factor
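To make the "NAP dimension" parameter concrete, here is a minimal sketch of nuisance attribute projection (NAP) in its common textbook form: estimate the main directions of within-speaker (session) variability from speakers with several calls each, then project those directions out of every super vector. The function names are hypothetical and the deck does not say how PerSay estimated the projection, so treat this as an illustration only.

```python
import numpy as np

def nap_projection_basis(supervectors, speaker_ids, nap_dim):
    """Return U (D, nap_dim): top within-speaker variability directions."""
    X = np.asarray(supervectors, dtype=float)
    ids = np.asarray(speaker_ids)

    # Subtract each speaker's mean so that only session variability remains.
    centered = np.empty_like(X)
    for spk in np.unique(ids):
        mask = ids == spk
        centered[mask] = X[mask] - X[mask].mean(axis=0)

    # Right singular vectors are the principal directions of that variability.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:nap_dim].T

def apply_nap(x, U):
    """Project out the nuisance subspace: x - U U^T x (works on rows of x)."""
    x = np.asarray(x, dtype=float)
    return x - (x @ U) @ U.T

# Toy usage: 5 speakers with 6 "calls" each, 20-dimensional super vectors.
rng = np.random.default_rng(0)
toy_sv = rng.normal(size=(30, 20))
toy_ids = np.repeat(np.arange(5), 6)
U = nap_projection_basis(toy_sv, toy_ids, nap_dim=3)
compensated = apply_nap(toy_sv, U)
```

A larger NAP dimension removes more session variability but can also remove speaker information, which is presumably why it was tuned per condition.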
19 LPCC & MFCC NAP SGM: short2-short3 phone (in both train & test), short2-summed
  Table columns: Males, Females, Description
  UBM background data: segments of different speakers from SREs 99, 03, 04 & 05.
  Negative examples: segments from CallFriend and SREs 99, 03, 04 & 05.
  NAP: different speakers with at least 6 calls in SREs 04 & 05, who do not appear in the negative examples.
  NAP dimension, Relevance factor (Males, Females): 3 3
20 LPCC NAP SGM: short2-short3 microphone (in either test or train)
  Table columns: Males, Females, Description
  UBM background data: segments of different speakers from SREs 99, 03, 04 & 05.
  Negative examples: segments from SREs 99, 03, 04 & 05, including 05 mic test data.
  NAP: different speakers with at least 6 calls in SREs 04 & 05 mic, who do not appear in the negative examples.
  NAP dimension, Relevance factor (Males, Females): 3 3
21 LPCC NAP SGM: short2-10sec (MFCC used slightly different background data)
  Table columns: Males, Females, Description
  UBM background data: segments of different speakers from SREs 99, 03, 04 & 05.
  Negative examples: segments from SREs 99, 03, 04 & 05. Only the first 15 sec of net audio were used to create the super vector, to match the test segment length.
  NAP: different speakers with at least 6 calls in SREs 04 & 05, who do not appear in the negative examples.
  NAP dimension, Relevance factor (Males, Females): 3 3
22 LPCC & MFCC SGM: 10sec-10sec
  Table columns: Males, Females, Description
  UBM background data: segments of different speakers from SREs 99, 03, 04 & 05.
  Negative examples: segments from SREs 99, 03, 04 & 05. Only the first 15 sec of net audio were used to create the super vector, to match the test segment length.
  Relevance factor (Males, Females): 1 1
  Other: the silence detector parameters were optimized for this condition, to extract more frames.
23 10sec-10sec GMM: same as last year
24
25 Equal fusion; Focal logistic regression
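The two fusion options named on slide 25 can be contrasted with a short sketch. Equal fusion is taken here to mean a plain average of the subsystem scores, and the logistic regression is a bare gradient-descent version written for illustration; it is not the FoCal toolkit, whose objective and optimizer differ. All function names are hypothetical.

```python
import numpy as np

def equal_fusion(scores):
    """Average the subsystem scores with equal weights.

    scores: (n_trials, n_systems) array of per-system scores.
    """
    return np.asarray(scores, dtype=float).mean(axis=1)

def train_logreg_fusion(scores, labels, lr=0.1, n_iter=2000):
    """Fit fusion weights w and offset b by plain logistic regression.

    labels: 1 for target trials, 0 for non-target trials.
    """
    S = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=float)
    w = np.zeros(S.shape[1])
    b = 0.0
    for _ in range(n_iter):
        z = np.clip(S @ w + b, -30.0, 30.0)   # clip to keep exp() well behaved
        p = 1.0 / (1.0 + np.exp(-z))
        w -= lr * (S.T @ (p - y)) / len(y)    # gradient of the mean log loss
        b -= lr * np.mean(p - y)
    return w, b

def fuse(scores, w, b):
    """Apply trained fusion weights; the output behaves like a log-odds score."""
    return np.asarray(scores, dtype=float) @ w + b

# Toy usage: two subsystems scoring 200 target and 200 non-target trials.
rng = np.random.default_rng(0)
toy_labels = np.concatenate([np.ones(200), np.zeros(200)])
toy_scores = rng.normal(size=(400, 2)) + np.where(toy_labels[:, None] == 1, 1.0, -1.0)
w, b = train_logreg_fusion(toy_scores, toy_labels)
fused = fuse(toy_scores, w, b)
averaged = equal_fusion(toy_scores)
```

Trained fusion needs development trials that resemble the evaluation data, whereas equal fusion needs none, which is one reason a deck might report both.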
26 Good old SRE06 (didn't use the new short-short lists)
27 Results
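28 Short2-short3 results table (values not preserved). Columns: Int-Int, Int-Int same, Int-Int different, Int-Tel, Tel-Mic, Tel-Tel, Tel-Tel Eng, Tel-Tel native Eng. Rows: PRS1 (NAP SGM + TNO): EER, minDCF, actDCF; PRS2 (NAP SGM): EER, minDCF, actDCF.
29 Short2-summed results table (values not preserved). Columns: Tel-tel, Tel-tel Eng. Rows: EER, minDCF, actDCF.
30 Short2-10sec results table (values not preserved). Columns: Tel-tel, Tel-tel Eng. Rows: EER, minDCF, actDCF.
31 10sec-10sec results table (values not preserved). Columns: Tel-tel, Tel-tel Eng. Rows: PRS1: EER, minDCF, actDCF; PRS2 (SGM): EER, minDCF, actDCF; PRS3 (Tnormed GMM): EER, minDCF, actDCF.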
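For readers unfamiliar with the EER figure reported in the results tables, the sketch below shows one simple way to compute an equal error rate from lists of target and non-target scores. The function name is hypothetical and this is not the NIST scoring tool; minDCF and actDCF are omitted because they also depend on the evaluation's cost parameters.

```python
import numpy as np

def compute_eer(target_scores, nontarget_scores):
    """Equal error rate: the point where miss rate and false-alarm rate meet."""
    tar = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)

    # Try every observed score as a decision threshold (accept if score >= t).
    thresholds = np.sort(np.concatenate([tar, non]))
    p_miss = np.array([(tar < t).mean() for t in thresholds])
    p_fa = np.array([(non >= t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and average them.
    i = np.argmin(np.abs(p_miss - p_fa))
    return 0.5 * (p_miss[i] + p_fa[i])

# Toy usage: perfectly separated scores give an EER of 0.
eer_separated = compute_eer([2.0, 3.0], [0.0, 1.0])
```

This brute-force version is quadratic in the number of trials; it is fine for small score lists, while large evaluations usually use a sort-based sweep instead.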
32 Road to no-where: Z/T/ZT norm for SGM? Only in GMM
33 Road to no-where: 1024 Gaussians
34 Road to no-where: Concatenate a male-adapted & a female-adapted super vector (as in SRI's MLLR)
35 Road to no-where: Wiener filter on telephone data
36 Road to no-where: Factor analysis session compensation: as good as NAP, doesn't improve much with fusion
37 Road to no-where: NAP on 10sec training duration doesn't help. Helps when training on 2.5 min, testing on 10sec
38 Road to no-where: Fusion of same system with different background data: usually not useful
39 Back to the future: Joint Factor Analysis, still in process
40 PerSay's NIST vs. customers (mainly call centers of banks, telecoms, etc.)
41 Dev & Test data
  NIST: Dev: SRE 06, 05, 04. Different background and development data. Thousands of speakers. SRE target models. 100,000 tests. 100GB
  Customers: Same dev & test: speakers. 10 speakers. 3 speakers
42 Duration
  NIST: Focus on Train: 2.5 minutes, Test: 2.5 minutes
  Customers: Train: ~1 minute, Test: 20 sec summed. Can we remove the agent?
43 TI/TD
  NIST: Text Independent
  Customers: Text Dependent (90%), 0.5-3% EER; Text Independent (10%)
44 Thank You