Detection of Acoustic Events in Meeting-Room Environment
1 Detection of Acoustic Events in Meeting-Room Environment
Presented by Andriy Temko, Department of Electrical and Electronic Engineering
11/Dec/2008
2 Page 2 of 34 Content
- Introduction
- State of the Art
- Acoustic Event Classification
- Acoustic Event Detection
- Participation in International Evaluations
- Speech Activity Detection
- UPC Smart-Room Activities
3 Introduction
- Need for environments (smart-rooms) in which computers do not require much human attention and humans focus on interacting with other humans rather than with the computer, e.g. the CHIL project
- Multimodality
- Technologies (person ID, person localization and tracking, meeting summarization)
- Need for a large set of unobtrusive sensors
4 Introduction
- In such environments, human activity is reflected in a rich variety of acoustic events (AEs)
- Need for perceptual components (activity detection, automatic speech recognition, speaker identification and verification, and speaker localization)
- Speech is the most informative AE; however, detection and classification of other AEs may:
  - Describe the social activity taking place in the room (e.g. laughter or clapping during a meeting, a door slam when a meeting has just started)
  - Increase the robustness of other (audio) technologies, e.g. speech recognition and speech activity detection
5 State of the Art
- Sound taxonomy
- Applications of audio recognition
- Features and classifiers
- Methods of detection of AEs
6 Sound Taxonomy
Acoustic taxonomy of sound:
- Continuous tone
- Continuous non-tone
- Non-repetitive non-continuous
- Regular repetitive non-continuous
- Irregular repetitive non-continuous
- Other
Semantic taxonomy of sound:
- Speech noises: repetitions, corrections, false starts, mispronunciation
- Human vocal tract non-speech noises: breaths, laughter, throat/cough, disagreement, whisper, mouth (generic)
- Non-speech environment noises: clapping, beep, chair moving, door slams, footsteps, key jingling, music (radio, handy)
7 Applications
- Audio indexing and retrieval (speech, background noise, etc.)
- Audio recognition for a given environment (kitchen, shops, hospitals, etc.)
- Recognition of generic sounds (birds, animals, alarms, etc.)
- Classification of acoustic environments (street, restaurants, library, airport, etc.)
8 Features and Classifiers
Features:
- Conventional ASR features
- Perceptual features (energy, silence ratio, spectral tilt, SBE, spectral flux, zero-crossing, etc.)
- Concatenation of the ASR and perceptual features
- Feature selection (feature space reduction): LDA, PCA, ICA, etc.
Classifiers:
- Distance-based classifiers (Euclidean, Mahalanobis, k-NN, etc.)
- Generative classifiers (conventional ASR): GMM, HMM, etc.
- Discriminative classifiers: ANN, decision trees, SVM, etc.
9 Methods of Detection
- Detection by classification
- Detection and classification (classification of the segment bounded by a detection algorithm)
10 Acoustic Event Classification
- SVM-based Clustering Schemes (multi-class problem)
- Comparison of Sequence Discriminant SVM (dynamic data problem)
- Fuzzy Integral Based Information Fusion for Classification of Highly Confusable Non-Speech Sounds (fusion and feature selection)
11 SVM-based Clustering Schemes (I)
- First attempt to deal with AEC, comparing different feature sets and two distinct classifiers: SVM and GMM
- Database of AEs: 16 classes (cough, laugh, paper, door slam, steps, etc.)
- Defined acoustic feature sets: 9 feature sets (different combinations of perceptual features and ASR features)
12 SVM-based Clustering Schemes (II)
[Figure: accuracy (ACC, %) versus feature set for SVM and GMM]
Results of AEC with binary-tree multi-class SVM and GMM. The best feature set for SVM is 8 and for GMM is 9.
[Figure: accuracy (ACC, %) versus feature set for the classes liquid, sneeze, and sniff]
The best feature sets for liquid, sneeze, and sniff differ from one another, and none of them is the best overall feature set (8).
13 SVM-based Clustering Schemes (III)
Outline of the variable-feature-set clustering algorithm:
- Split the classes into two clusters with the least possible confusion, based on confusion matrices (exhaustive search)
- The algorithm returns both the two clusters of classes and the feature set that provide the smallest number of confusions
Variable-feature-set clustering schemes:
- Introduce knowledge about class confusions
- Reported metrics: classification rate (%) and relative average error reduction (%)
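The exhaustive split search outlined on this slide could be sketched roughly as follows. This is only an illustration under stated assumptions: each feature set is assumed to come with a square confusion matrix (rows are true classes, columns are predicted classes), and the helper names `cross_confusions` and `best_split` are hypothetical, not taken from the presentation.

```python
from itertools import combinations


def cross_confusions(conf, left, right):
    """Count samples of one cluster that were classified into the other."""
    return sum(conf[i][j] + conf[j][i] for i in left for j in right)


def best_split(conf_by_feature_set):
    """Try every two-cluster split of the classes for every feature set.

    conf_by_feature_set maps a feature-set id to a confusion matrix.
    Returns (confusions, feature_set, left_cluster, right_cluster) for the
    split with the fewest cross-cluster confusions (None if no split exists).
    """
    best = None
    for feature_set, conf in conf_by_feature_set.items():
        n = len(conf)
        # Class 0 always goes to the left cluster so each split is seen once.
        for size in range(0, n - 1):
            for rest in combinations(range(1, n), size):
                left = (0,) + rest
                right = tuple(c for c in range(n) if c not in left)
                cost = cross_confusions(conf, left, right)
                if best is None or cost < best[0]:
                    best = (cost, feature_set, left, right)
    return best


# Toy confusion matrices (assumed data): with feature set "a",
# classes 0/2 and classes 1/3 are confused only with each other.
toy = {
    "a": [[10, 0, 3, 0], [0, 10, 0, 2], [3, 0, 10, 0], [0, 2, 0, 10]],
    "b": [[10, 1, 1, 1], [1, 10, 1, 1], [1, 1, 10, 1], [1, 1, 1, 10]],
}
result = best_split(toy)
```

Applying the same search recursively to each returned cluster would yield a binary tree of classifiers, each node with its own feature set; the number of splits grows exponentially with the number of classes, so the exhaustive search is only practical for small class sets such as the 16 classes mentioned earlier.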
14 Acoustic Event Classification
- SVM-based Clustering Schemes (multi-class problem)
- Comparison of Sequence Discriminant SVM (dynamic data problem)
- Fuzzy Integral Based Information Fusion for Classification of Highly Confusable Non-Speech Sounds (fusion and feature selection)
15 Comparison of Sequence Discriminant SVM (I)
- A drawback of SVMs when dealing with audio data is their restriction to a fixed-length input, whereas an acoustic event is represented by a variable-size time sequence of feature vectors
[Figure: variable-size sequence of feature vectors (features by frames 1 ... N)]
- Solution 1: normalize the size of the time sequence of feature vectors
- Solution 2: find a suitable kernel function that can deal with sequential data of variable length
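One simple way to realise Solution 1 is to replace each variable-length sequence with per-feature statistics computed over its frames. The sketch below assumes mean and standard deviation as the statistics; the slides do not state which normalisation was actually used, and the function name `sequence_statistics` is hypothetical.

```python
from statistics import fmean, pstdev


def sequence_statistics(frames):
    """Map a variable-length list of feature vectors to one fixed-length vector.

    The output holds the per-feature means followed by the per-feature
    (population) standard deviations, so its length is twice the feature
    dimension regardless of the number of frames.
    """
    if not frames:
        raise ValueError("need at least one frame")
    columns = list(zip(*frames))  # one tuple per feature dimension
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means + stds


short_event = [[1.0, 2.0], [3.0, 6.0]]                      # 2 frames
long_event = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]  # 5 frames
short_vec = sequence_statistics(short_event)
long_vec = sequence_statistics(long_event)
```

Both events end up as 4-dimensional vectors that a standard SVM can consume, at the cost of discarding the temporal order of the frames.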
16 Comparison of Sequence Discriminant SVM (II)
[Figure: systems ranked by recognition rate (%) for AEs with temporal structure (sneeze, music) and for AEs without temporal structure (paper pen writing, liquid pouring). Compared systems: Fisher, LikeRatio (Fisher Ratio), Outerpr, GDTW, DTAK, PolyDTW, GMM, SVM stat]
Comparison results:
- The Fisher kernel obtained the best overall results
- DTW-based kernels work well for sounds that show a temporal structure
- The observed bias of the classifiers towards specific types of classes is a good condition for a successful application of fusion techniques
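To illustrate what a DTW-based kernel for variable-length sequences looks like, the sketch below computes a dynamic time warping (DTW) alignment cost and plugs it into a Gaussian-style kernel. It is written in the spirit of a Gaussian DTW kernel; whether it matches the exact GDTW, DTAK, or PolyDTW variants compared on the slide is an assumption, and the parameter `gamma` and both function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """DTW alignment cost between two sequences of feature vectors.

    Uses the Euclidean distance between frames as the local cost and allows
    the three classic steps (insertion, deletion, match).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def gaussian_dtw_kernel(a, b, gamma=1.0):
    """Similarity in (0, 1]: equals 1 when the sequences align at zero cost."""
    return math.exp(-gamma * dtw_distance(a, b))


seq_a = [[0.0], [1.0], [2.0]]
seq_b = [[0.0], [1.0], [1.0], [2.0]]  # same shape, one frame repeated
seq_c = [[0.0], [3.0]]
```

Here `seq_a` and `seq_b` differ in length but align perfectly, so the kernel value is 1. Kernels built this way are not guaranteed to be positive definite, which is one reason several DTW-based variants exist.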
17 Acoustic Event Classification
- SVM-based Clustering Schemes (multi-class problem)
- Comparison of Sequence Discriminant SVM (dynamic data problem)
- Fuzzy Integral Based Information Fusion for Classification of Highly Confusable Non-Speech Sounds (fusion and feature selection)
18 Fuzzy Integral Information Fusion (I)
Acoustic features:
1. Zero crossing rate (1)
2. Short-time energy (1)
3. Fundamental frequency (1)
4. Sub-band log energies (4)
5. Sub-band log energy distribution (4)
6. Sub-band log energy correlations (4)
7. Sub-band log energy time differences (4)
8. Spectral centroid (1)
9. Spectral roll-off (1)
10. Spectral bandwidth (1)
Feature-level and decision-level information fusion:
[Figure (a): Feat 1, Feat 2, ..., Feat N feed a single classifier, which outputs the event hypothesis]
[Figure (b): Feat 1, Feat 2, ..., Feat N feed Classifier 1, Classifier 2, ..., Classifier N; their outputs Out 1, Out 2, ..., Out N are combined by the FI into the event hypothesis]
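Three of the listed features (zero crossing rate, short-time energy, and spectral centroid) can be computed from a single frame of samples roughly as follows. This is a minimal sketch using textbook definitions; the exact definitions, windowing, and normalisation used in the presented system are not given in the slides, so those details are assumptions.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for x, y in zip(frame, frame[1:]) if (x >= 0) != (y >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz), using a naive DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        mags.append(abs(coeff))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * mag for k, mag in enumerate(mags)) / total


# Assumed test signal: a 1 kHz sine sampled at 8 kHz, 64 samples (8 full periods).
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate + 0.3) for t in range(64)]
zcr = zero_crossing_rate(tone)
energy = short_time_energy(tone)
centroid = spectral_centroid(tone, rate)
```

For this pure tone the centroid sits at about 1000 Hz and the energy at about 0.5. The naive DFT is quadratic in the frame length, so an FFT routine would replace it in any real front end.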
19 Fuzzy Integral and Fuzzy Measure
Weighted arithmetical mean (WAM):
$$M_{WAM} = \sum_{i \in Z} \mu(i)\, h_i, \qquad \sum_{i \in Z} \mu(i) = 1, \qquad \mu(i) \ge 0 \ \text{for all } i \in Z$$
Fuzzy integral (Choquet):
$$M_{FI}(\mu, h) = \sum_{i=1}^{z} \left[ \mu(i, \ldots, z) - \mu(i+1, \ldots, z) \right] h_i, \qquad h_1 \le \ldots \le h_z$$
$$\mu(\emptyset) = 0, \qquad \mu(Z) = 1, \qquad S \subseteq T \Rightarrow \mu(S) \le \mu(T)$$
where the $h_i$ are the information sources.
[Figure: lattice representation of the fuzzy measure for n = 4, with nodes $\mu_1, \ldots, \mu_4$, $\mu_{12}, \ldots, \mu_{34}$, $\mu_{123}, \ldots, \mu_{234}$, and $\mu_{1234}$]
The FM can be:
- learnt from data
- provided by an expert
- calculated from fuzzy densities
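The two aggregation operators on this slide can be sketched in a few lines. The sketch assumes the fuzzy measure is stored as a dictionary from `frozenset` of source names to a value in [0, 1]; that representation and the function names are illustrative choices, not part of the presentation.

```python
from itertools import combinations


def weighted_arithmetic_mean(h, weights):
    """WAM: sum of mu(i) * h_i with non-negative weights that sum to one."""
    return sum(weights[src] * score for src, score in h.items())


def choquet_integral(h, mu):
    """Choquet fuzzy integral of the scores h with respect to the measure mu.

    h maps each source to its score; mu maps frozensets of sources to their
    measure. Sources are sorted so that h_1 <= ... <= h_z, as on the slide.
    """
    order = sorted(h, key=h.get)
    total = 0.0
    for pos, src in enumerate(order):
        upper = frozenset(order[pos:])            # sources i, ..., z
        upper_next = frozenset(order[pos + 1:])   # sources i+1, ..., z
        mu_next = mu[upper_next] if upper_next else 0.0  # mu(empty set) = 0
        total += (mu[upper] - mu_next) * h[src]
    return total


def additive_measure(weights):
    """Build the fuzzy measure in which mu(S) is the sum of the weights in S."""
    sources = list(weights)
    return {
        frozenset(combo): sum(weights[s] for s in combo)
        for size in range(1, len(sources) + 1)
        for combo in combinations(sources, size)
    }


scores = {"a": 0.2, "b": 0.9, "c": 0.6}
weights = {"a": 0.5, "b": 0.3, "c": 0.2}
wam = weighted_arithmetic_mean(scores, weights)
fi_additive = choquet_integral(scores, additive_measure(weights))

# A measure that is 1 only on the full set turns the integral into the minimum.
strict = {key: 0.0 for key in additive_measure(weights)}
strict[frozenset(scores)] = 1.0
fi_min = choquet_integral(scores, strict)
```

With an additive measure the Choquet integral reduces to the WAM (both give 0.49 here), while the non-additive `strict` measure yields the minimum score (0.2). The extra flexibility of the fuzzy integral comes entirely from the measure values on subsets of more than one source.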
20 Fuzzy Integral Information Fusion (II)
[Figure: performance of the 10 SVM systems, each running on one feature type, of the feature-level combination of the 10 features with SVM, and of the decision-level fusion with the WAM and FI operators. The FM is learnt from data.]
21 Fuzzy Integral Information Fusion (III)
[Figure: importance and interaction of the features]
- Importance and interactions are extracted from the fuzzy measure
- The most important features: 4, 6, 7
- Highly redundant features (light cells): {4,5}, {1,8}
- Feature selection: with feature selection based on information extracted from the FM, an improvement over the baseline was obtained
- Fusion of different classifiers: SVM with signal-level features + HMM with frame-level features
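The slide does not say how importance and interaction are derived from the fuzzy measure. A common choice in the fuzzy-measure literature is the Shapley value for importance and the pairwise interaction index for interaction; the sketch below assumes those two definitions, and the function names are hypothetical. The measure is assumed to be a dictionary keyed by `frozenset`, including the empty set with value 0.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of items as a frozenset, including the empty set."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def shapley_importance(mu, sources, i):
    """Average marginal contribution of source i over all coalitions."""
    z = len(sources)
    others = [s for s in sources if s != i]
    value = 0.0
    for s in subsets(others):
        weight = factorial(z - len(s) - 1) * factorial(len(s)) / factorial(z)
        value += weight * (mu[s | {i}] - mu[s])
    return value


def interaction_index(mu, sources, i, j):
    """Pairwise interaction: negative values suggest redundancy between i and j."""
    z = len(sources)
    others = [s for s in sources if s not in (i, j)]
    value = 0.0
    for s in subsets(others):
        weight = factorial(z - len(s) - 2) * factorial(len(s)) / factorial(z - 1)
        value += weight * (mu[s | {i, j}] - mu[s | {i}] - mu[s | {j}] + mu[s])
    return value


# Toy measure (assumed data): each source alone already reaches 0.8,
# so adding the second one contributes little.
pair = ["a", "b"]
mu_pair = {
    frozenset(): 0.0,
    frozenset({"a"}): 0.8,
    frozenset({"b"}): 0.8,
    frozenset({"a", "b"}): 1.0,
}
importance_a = shapley_importance(mu_pair, pair, "a")
interaction_ab = interaction_index(mu_pair, pair, "a", "b")
```

In the toy measure the two sources are equally important (0.5 each) and their interaction is negative (-0.6), the kind of pattern that would flag a redundant pair as a candidate for feature selection.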
22 Fuzzy Integral Information Fusion (IV)
- Fusion of several information sources with the FI formalism showed significant improvements with respect to a single information source
- Decision-level FI fusion showed results comparable to SVM feature-level fusion
- Importance and the degree of interaction can be used for feature selection and give valuable insight into the problem
- FI may be a good choice when feature-level fusion is not an option (different nature of the involved features, application/technique dependence)
- Finalist of the Best Journal Paper Award of 2008 in Speech Technologies
23 Acoustic Event Detection
- UPC Acoustic Event Detection Systems 2006
- UPC Acoustic Event Detection Systems 2007
24 UPC Acoustic Event Detection Systems 2006
[Figure: feature extraction, then a segmentation step (SVM silence/non-silence segmentation), then a classification step (SVM classification)]
- 1st SVM: pre-segmentation based on silence/non-silence detection
- 2nd SVM: trained on 12 classes + speech + unknown for event detection
25 UPC Acoustic Event Detection Systems
[Figure: decision windows (labelled "... ms" and "1 second") are each classified by an SVM; post-processing of the window decisions yields the final decision, and AEs are assigned to segments in the system output]
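The figure only names a post-processing stage between the per-window SVM decisions and the final segments. One plausible form of such post-processing, assumed here purely for illustration, is a majority vote over neighbouring windows followed by merging runs of identical labels into timed segments. The function names, the `context` parameter, and the window shift value are all hypothetical.

```python
from collections import Counter
from itertools import groupby


def smooth_labels(labels, context=1):
    """Replace each window label by the majority label of its neighbourhood.

    On a tie the original label is kept when it is among the most frequent
    labels; otherwise the first most frequent label in the window is used.
    """
    smoothed = []
    for idx, label in enumerate(labels):
        lo = max(0, idx - context)
        hi = min(len(labels), idx + context + 1)
        counts = Counter(labels[lo:hi])
        top = max(counts.values())
        winners = [lab for lab, count in counts.items() if count == top]
        smoothed.append(label if label in winners else winners[0])
    return smoothed


def labels_to_segments(labels, shift):
    """Merge runs of equal labels into (start_time, end_time, label) tuples."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((start * shift, (start + length) * shift, label))
        start += length
    return segments


# Assumed window decisions; the isolated "door" is treated as a spurious hit.
decisions = ["sil", "sil", "door", "sil", "sil", "clap", "clap", "clap", "sil"]
smoothed = smooth_labels(decisions, context=1)
segments = labels_to_segments(smoothed, shift=0.5)  # 0.5 s shift is an assumption
```

The isolated `door` decision is voted away, and the three `clap` windows become a single segment from 2.5 s to 4.0 s. A larger `context` removes more spurious decisions but also erases genuinely short events.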
26 Participation in International Evaluations
- CHIL Dry-Run Evaluations 2004 (1st place, 2 participants)
- CHIL Evaluations 2005 (1st place, 2 participants)
- CLEAR Evaluations 2006 (1st place, 3 participants)
- CLEAR Evaluations 2007 (2nd place, 8 participants)
27 Speech Activity Detection
- AED is event-based, SAD is frame-based
- Dataset reduction algorithm
- Metric: NIST error rate = duration of incorrect decisions / duration of all speech segments
- Features:
[Figure: feature extractor producing FF, ΔFF, ΔΔFF, ΔlogE, lfed, hfed, and ldam, followed by LDA; the diagram shows context windows of frames t ... t+15 and t-9 ... t+9 around the current frame t]
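Because SAD decisions are frame-based, the metric on this slide can be computed directly from two equal-length label sequences. The sketch assumes frames of equal duration, so the ratio of durations becomes a ratio of frame counts; the function name is hypothetical.

```python
def nist_sad_error_rate(reference, hypothesis):
    """Duration of incorrect decisions divided by duration of reference speech.

    Both arguments are per-frame labels (truthy means speech) over frames of
    equal duration, so frame counts stand in for durations.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    incorrect = sum(1 for ref, hyp in zip(reference, hypothesis) if bool(ref) != bool(hyp))
    speech = sum(1 for ref in reference if ref)
    if speech == 0:
        raise ValueError("reference contains no speech frames")
    return incorrect / speech


ref = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
hyp = [1, 1, 0, 1, 1, 0, 0, 0, 1, 1]  # one missed speech frame, one false alarm
error = nist_sad_error_rate(ref, hyp)
```

Both missed speech and false alarms count as incorrect decisions, while only speech appears in the denominator, so the rate can exceed 100% on recordings with little speech.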
28 Enhanced SVM Training for SAD
Dataset reduction algorithm:
- Step 1. Divide all the data into chunks of 1000 samples per chunk.
- Step 2. Train a PSVM on each chunk, performing 5-fold cross-validation (CV).
- Step 3. Apply an appropriate threshold to the highest/lowest CV accuracy.
- Step 4. Train a classical SVM on the amount of data selected in Step 3.
[Figure: classical SVM and PSVM, each shown with planes H-1 and H1 and the separating hyperplane H0]
NIST SAD Evaluations 2006: best SAD system out of 10 systems, with a 4.88% error rate
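The control flow of Steps 1 to 3 can be sketched independently of any SVM library. In the sketch below the PSVM cross-validation is represented by a caller-supplied function `cv_accuracy`, and the thresholding rule by a caller-supplied predicate `keep`; both are placeholders, since the slide does not specify how the threshold is applied. All names here are hypothetical.

```python
def split_into_chunks(samples, chunk_size=1000):
    """Step 1: cut the data into consecutive chunks of chunk_size samples."""
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


def reduce_dataset(samples, cv_accuracy, keep, chunk_size=1000):
    """Steps 2 and 3: score every chunk and keep the chunks that pass the rule.

    cv_accuracy(chunk) should return the cross-validation accuracy of a fast
    classifier (a PSVM on the slide) trained on that chunk.
    keep(accuracy) decides whether the chunk enters the reduced training set.
    """
    selected = []
    for chunk in split_into_chunks(samples, chunk_size):
        if keep(cv_accuracy(chunk)):
            selected.extend(chunk)
    return selected


# Stand-ins for illustration only: fixed accuracies per chunk and a rule
# that drops chunks the fast classifier already handles almost perfectly.
data = list(range(2500))
fake_accuracy = {0: 0.99, 1000: 0.80, 2000: 0.55}
reduced = reduce_dataset(
    data,
    cv_accuracy=lambda chunk: fake_accuracy[chunk[0]],
    keep=lambda acc: acc < 0.95,
)
```

The reduced set would then be passed to a classical SVM trainer (Step 4). The point of the scheme is that the expensive classifier only sees the data that the cheap per-chunk classifier identified as worth keeping.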
29 UPC Smart-Room Activities
- UPC Smart-Room
- Database Recording
- Real-Time Implementation
- Demonstrations
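30 UPC Smart-Room
[Figure: floor plan of the UPC smart-room with x and y axes, showing the labelled cameras (among them Cam1, Cam2, Cam6, Cam8, a zenithal camera, and a PTZ camera), the MkIII microphone array (MkIII_001 to MkIII_064), and the Hammer_xx microphones (including Hammer_13 to Hammer_20)]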
31 Real-Time Implementation
- Implementation of the AED component as a smart-flow client
32 Demonstrations (I)
Participation in several demonstrations:
- Mockup demo: journalist scenario
- Reliably detected AEs: door knock, door opening/closing, speech, applause, and key jingles
33 Demonstrations (II)
Participation in several demonstrations:
- Virtual UPC smart-room: 3D person tracking, ASL, and AED
34 Demonstrations (III)
Participation in several demonstrations:
- Demo of AED & ASL: real-time video and graphical representation
More informationFeatures: representation, normalization, selection. Chapter e-9
Features: representation, normalization, selection Chapter e-9 1 Features Distinguish between instances (e.g. an image that you need to classify), and the features you create for an instance. Features
More informationClassification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University
Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate
More informationon learned visual embedding patrick pérez Allegro Workshop Inria Rhônes-Alpes 22 July 2015
on learned visual embedding patrick pérez Allegro Workshop Inria Rhônes-Alpes 22 July 2015 Vector visual representation Fixed-size image representation High-dim (100 100,000) Generic, unsupervised: BoW,
More informationNew Results in Low Bit Rate Speech Coding and Bandwidth Extension
Audio Engineering Society Convention Paper Presented at the 121st Convention 2006 October 5 8 San Francisco, CA, USA This convention paper has been reproduced from the author's advance manuscript, without
More informationCS570: Introduction to Data Mining
CS570: Introduction to Data Mining Classification Advanced Reading: Chapter 8 & 9 Han, Chapters 4 & 5 Tan Anca Doloc-Mihu, Ph.D. Slides courtesy of Li Xiong, Ph.D., 2011 Han, Kamber & Pei. Data Mining.
More informationSystem Identification Related Problems at SMN
Ericsson research SeRvices, MulTimedia and Networks System Identification Related Problems at SMN Erlendur Karlsson SysId Related Problems @ ER/SMN Ericsson External 2015-04-28 Page 1 Outline Research
More informationSTC ANTI-SPOOFING SYSTEMS FOR THE ASVSPOOF 2015 CHALLENGE
STC ANTI-SPOOFING SYSTEMS FOR THE ASVSPOOF 2015 CHALLENGE Sergey Novoselov 1,2, Alexandr Kozlov 2, Galina Lavrentyeva 1,2, Konstantin Simonchik 1,2, Vadim Shchemelinin 1,2 1 ITMO University, St. Petersburg,
More informationGait analysis for person recognition using principal component analysis and support vector machines
Gait analysis for person recognition using principal component analysis and support vector machines O V Strukova 1, LV Shiripova 1 and E V Myasnikov 1 1 Samara National Research University, Moskovskoe
More informationDynamic Time Warping
Centre for Vision Speech & Signal Processing University of Surrey, Guildford GU2 7XH. Dynamic Time Warping Dr Philip Jackson Acoustic features Distance measures Pattern matching Distortion penalties DTW
More informationTag Recommendation for Photos
Tag Recommendation for Photos Gowtham Kumar Ramani, Rahul Batra, Tripti Assudani December 10, 2009 Abstract. We present a real-time recommendation system for photo annotation that can be used in Flickr.
More informationMotivation: why study sound? Motivation (2) Related Work. Our Study. Related Work (2)
Motivation: why study sound? Interactive Learning of the Acoustic Properties of Objects by a Robot Sound Producing Event Jivko Sinapov Mark Wiemer Alexander Stoytchev {jsinapov banff alexs}@iastate.edu
More informationPattern recognition. Classification/Clustering GW Chapter 12 (some concepts) Textures
Pattern recognition Classification/Clustering GW Chapter 12 (some concepts) Textures Patterns and pattern classes Pattern: arrangement of descriptors Descriptors: features Patten class: family of patterns
More informationA Geometric Perspective on Machine Learning
A Geometric Perspective on Machine Learning Partha Niyogi The University of Chicago Collaborators: M. Belkin, V. Sindhwani, X. He, S. Smale, S. Weinberger A Geometric Perspectiveon Machine Learning p.1
More informationEE 6882 Statistical Methods for Video Indexing and Analysis
EE 6882 Statistical Methods for Video Indexing and Analysis Fall 2004 Prof. ShihFu Chang http://www.ee.columbia.edu/~sfchang Lecture 1 part A (9/8/04) 1 EE E6882 SVIA Lecture #1 Part I Introduction Course
More informationRobust PDF Table Locator
Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records
More informationSystem Identification Related Problems at
media Technologies @ Ericsson research (New organization Taking Form) System Identification Related Problems at MT@ER Erlendur Karlsson, PhD 1 Outline Ericsson Publications and Blogs System Identification
More information3D Object Recognition using Multiclass SVM-KNN
3D Object Recognition using Multiclass SVM-KNN R. Muralidharan, C. Chandradekar April 29, 2014 Presented by: Tasadduk Chowdhury Problem We address the problem of recognizing 3D objects based on various
More informationDynamic Human Fatigue Detection Using Feature-Level Fusion
Dynamic Human Fatigue Detection Using Feature-Level Fusion Xiao Fan, Bao-Cai Yin, and Yan-Feng Sun Beijing Key Laboratory of Multimedia and Intelligent Software, College of Computer Science and Technology,
More information3 Feature Selection & Feature Extraction
3 Feature Selection & Feature Extraction Overview: 3.1 Introduction 3.2 Feature Extraction 3.3 Feature Selection 3.3.1 Max-Dependency, Max-Relevance, Min-Redundancy 3.3.2 Relevance Filter 3.3.3 Redundancy
More informationDESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES
EXPERIMENTAL WORK PART I CHAPTER 6 DESIGN AND EVALUATION OF MACHINE LEARNING MODELS WITH STATISTICAL FEATURES The evaluation of models built using statistical in conjunction with various feature subset
More informationStatistical Analysis of Metabolomics Data. Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte
Statistical Analysis of Metabolomics Data Xiuxia Du Department of Bioinformatics & Genomics University of North Carolina at Charlotte Outline Introduction Data pre-treatment 1. Normalization 2. Centering,
More information