Modeling the Spectral Envelope of Musical Instruments
|
|
- Willa Barker
- 5 years ago
- Views:
Transcription
1 Modeling the Spectral Envelope of Musical Instruments Juan José Burred IRCAM Équipe Analyse/Synthèse Axel Röbel / Xavier Rodet Technical University of Berlin Communication Systems Group Prof. Thomas Sikora Séminaire Recherche-Technologie IRCAM, 12th April 2006
2 Presentation Outline 1. Context: source separation 2. Definition and model requirements 3. Spectral basis decompositions Spectral PCA Previous applications of spectral PCA Training spectral PCA 4. Dealing with variable supports 5. Evaluation framework 6. Experiments and results 7. Modeling of the coefficients 8. Conclusions/future work 2/33
3 Research context Main research topic: Underdetermined Source Separation Less mixtures than sources: strong a priori knowledge is needed Knowledge about the mixing process: mixing models Knowledge about the sources General statistic properties: sparsity (past work) Source-dependent modeling (e.g. model of the violin, piano,...) 3-month stay at IRCAM to work on spectral envelope modeling Such a model will be used in a probabilistic framework as a source of a priori knowledge about the signals to be unmixed Other possible applications: instrument classification, transcription, realistic signal transformations 3/33
4 Spectral envelope: definition Spectral envelope: a function of frequency that matches the amplitudes of the individual partials of the spectrum. [Figure source: D. Schwarz, Spectral Envelopes in Sound Analysis and Synthesis, MSc Thesis, IRCAM, 1998] Motivation: a sound's spectral envelope is the basic defining factor for its timbre. Dynamic behaviour: changes over time and can change with f0. 4/33
5 Desirable features of the model for source separation Ultimate goal: segregation of the overlapping partial peaks in the spectrum Accuracy The envelope obtained from the model should match the candidate partials as exactly as possible. Time evolution should be reflected in the model. Demanding requirement that is not always necessary in other modeling applications such as classification or retrieval-by-similarity. Generalization Ability to handle with unknown, real-world mixtures. Need for database training and extraction of prototypes. Compactness Efficient computation. Together with generality and accuracy, it implies that the model has captured the essential characteristics of the source. 5/33
6 Methods for spectral envelope extraction Estimation on whole spectrum Linear Predictive Coding (LPC) Cepstral smoothing Iterative algorithms (True Envelope) Estimation based on additive analysis Additive analysis + interpolation between partials Discrete All-Pole (DAP) Discrete cepstrum We have chosen to develop a model based on full additive analysis We can use the frequency information for evaluation and parallel modeling It is possible to resynthesize 6/33
7 Sinusoidal Modeling A quasi-periodic signal can be modeled by a sum of sinusoids that evolve in amplitude and frequency: The instantaneous frequency is the derivative of the total phase: Frame-based processing: STFT Pitch detection Partial tracking Resynthesis by time interpolation of the parameters 7/33
8 Spectral Basis Decompositions (1) General basis expansion signal model: X : original data matrix C : transformed data matrix (coefficients) B : transformation basis. Columns: basis vectors (e.g.: DFT, STFT, filter banks, wavelets, PCA, ICA, sparse decompositions) Application to time-frequency representations: X is a t-f representation with spectral bands and time frames, Temporal orientation: X(n,k) N x N temporal basis Spectral orientation: X(k,n) K x K spectral basis 8/33
9 Spectral Basis Decompositions (2) Example: truncated PCA decomposition of a violin t-f representation with first 3 basis = x input data coefficients spectral basis Interpretation as projection into a vector subspace spanned by : 9/33
10 Spectral Basis Decompositions (3) Adaptive transforms applied to spectral basis decomposition: Principal Component Analysis (PCA) Yields optimally compact representation Main application: dimensionality reduction Independent Component Analysis (ICA) Yields statistically independent coefficients Main application: Determined Blind Source Separation Independence has proven unnecessary for our representation purposes When applied to a t-f data matrix it is called Independent Subspace Analysis (ISA) Main application: Source Separation from single channel Non-negative Matrix Factorization (NMF) Basis decomposition with non-negativity constraint Has been used to extract features from magnitude spectrograms However, we will work with logarithmic amplitudes can be negative 10/33
11 Principal Component Analysis (1) Problem formulation 1: find the orthogonal directions of maximum variance of a data set [Figure source: T. Jehan, Creating Music by Listening, PhD Thesis, MIT, 2005] Problem formulation 2: find the reduced-dimension representation of a data set that minimizes the approximation error Both problems are equivalent, and their solution is PCA 11/33
12 Principal Component Analysis (2) PCA is defined by the linear transformation matrix of the input data: are the unit-length eigenvectors of the sample covariance : diagonal matrix of the eigenvalues, sorted in decreasing order:, input data must be centered: the variance of the i-th principal component equals the i-th eigenvalue the output data matrix is uncorrelated PCA can be efficiently implemented with Singular Value Decomposition (SVD) 12/33
13 Principal Component Analysis (3) Dimensionality reduction with PCA: keep the first R < K eigenvectors corresponding to the R largest eigenvalues Y r : R x N reduced dimension representation E r : K x N reduced PCA basis approximate reconstruction: reconstruction error (Mean-Square Error) the MSE is equal to the sum of the ignored eigenvalues 13/33
14 An example of spectral PCA PCA applied to the partial amplitudes of a single horn note Original sound full sinusoidal analysis explained variance 1 dim 2 dim 10 dim 14/33
15 Previous applications of spectral PCA (1) Data reduction of additive analysis/synthesis data [Sandell&Martens, 1995] Perceptual experiments Single notes, no training 40-70% data reduction to obtain nearly identical tones Additive analysis/synthesis using Multidimensional Scaling (MDS) [Hourdin, Charbonneau, Moussa, 1997] MDS similar concept to PCA Main goal: representation of sound trajectories in timbre space No training 75% of information for musically acceptable sounds 90% of information for sounds indistinguishable from the original [Figure source: C. Hourdin, G. Charbonneat, T. Moussa, A Multidimensional Scaling Analysis of Musical Instruments Time-Varying Spectra, Comp. Music Journal, 1997] 15/33
16 Previous applications of spectral PCA (2) Sonological models for timbre characterization [De Poli & Prandoni, 1997] PCA input data are a fixed number of MFCC cepstral coefficients Rough approximation of the envelope, no training [Figure source: G. De Poli, P. Prandoni, Sonological models for timbre characterization, J. New Music Research, 1997] Feature extraction in the MPEG-7 standard [Casey, 2001] Another context: general sound description. Not based on spectral envelope. 16/33
17 Training spectral PCA Training database Concatenation of sound samples Pre-processing Centering mean PCA basis Model database Model space Coefficient modeling statistical model / curve prototype 17/33
18 Dealing with variable supports (1) We wish to concatenate the partial amplitudes of several notes in order to train a common PCA basis. It is straightforward to extract a fixed number of partials for each training sample and arrange them in the data matrix X(p,l), where p is the partial and l the frame index. (Partial Indexing, PI) NOTE 1 NOTE 2 NOTE 3 X(p,l) = partial number time frames 18/33
19 Dealing with variable supports (2) original frequency support However, when using notes of different pitches to generalize the model we are in effect misaligning some frequency information. Ex.: Training 1 octave (C4-B4) of an alto saxophone frequency misalignment original data frequencyaligned features f0-correlated features X(p,l) 19/33
20 Dealing with variable supports (3) To correct the misalignment of frequency-invariant features (fixed formants, resonances): set maximum frequency extract a different number of partials for each note interpolate in frequency to get data matrix (Envelope Interpolation, EI) We define a regular frequency grid (grid index: g) We compare two interpolation methods: Linear interpolation: Cubic polynomial interpolation: Find interpolation polynomial so that 20/33
21 Dealing with variable supports (4) Ex.: Training 1 octave (C4-B4) of an alto saxophone, extracting all partials up to the 20 th partial of the highest note, linearly interpolating with a regular frequency grid of 40 points Envelope interpolation X(g,l) 21/33
22 Partial indexing vs. Envelope Interpolation Taking the partial index as spectral index in the data matrix misaligns the frequency-invariant features (formants, resonances) of the spectral envelope. Frequency interpolation avoids this but introduces interpolation errors. On the other hand, partial indexing aligns f0-correlated resonances. In principle, frequency alignment is desirable because: Prototype spectral shapes will be learned more effectively. The data matrix will be more correlated and thus PCA will be able to achieve a better compression. The question arises: Which of these strategies is more appropriate for the PCA model? In other words: What kind of features (f0-correlated or invariant) are more important for our model? 22/33
23 Cross-validation framework Training database Preprocessing parameters Test database Preprocessing PCA/ dim.red EXP 1: compactness basis EXP 2: accuracy EXP 3: generalization Model space Model space Reconstruction Reinterpolation 23/33
24 Results 1: compactness Explained, accumulated variance (eigenvalues): bassoon Exp: 4 th octave, 2 instr. from the RWC database violin piano 24/33
25 Results 2: Accuracy Relative Spectral Error (RSE) of the reconstructed partials, reinterpolated at the original frequencies bassoon Exp: 4 th octave, Training: 2 instr., Test: 1 instr. from RWC violin piano 25/33
26 Experiment 3: generality (1) Problem: measure distance between data cloud of training coefficients and data cloud of test coefficients without assuming any probability distribution Training Test The data clouds do not necessarily form a gaussian cluster In such a case, we cannot trust a distribution measure based on normal parameters (divergence, Bhattacharyya, Cross Likelihood Ratio) 26/33
27 Experiment 3: generality (2) Measures not assuming any distribution (i.e., solely based on point topology) will be more reliable in the general case. Ex.: Kullback-Leibler Divergence: Compared to averaged mininum Mahalanobis distance between points: where KL, piano av min Mahal., piano 27/33
28 Results 3: generality Averaged minimum Mahalanobis distance between training and test data clouds bassoon Exp: 4 th octave, Training: 2 instr., Test: 1 instr. from RWC violin piano 28/33
29 Modeling the coefficients (1) Further generalization is possible by modeling the transformed coefficients Common approaches from Music Information Retrieval: GMM (Gaussian Mixture Models) HMM (Hidden Markov Models) To fully characterize the dynamic behavior of the envelopes, we choose to model the coefficients as a prototype trajectory. 29/33
30 Modeling the coefficients (2) First experiments: simple time interpolation and averaging in low dim space training data cloud prototype trajectory prototype spectral envelope Exp: piano, 4 th octave, Training: 2 instr. from RWC 30/33
31 Modeling the coefficients (3) Example: training of several instruments on the same space (e.g. for timbre characterization, blind source separation) 31/33
32 Modeling the coefficients (4) Further refinement: application of Principal Curves Nonlinear extension to PCA Has been used to model gestures captured by sensors [Figure source: A.F.Bobick, A.D. Wilson, A state-based approach to the representation and recognition of gesture, IEEE Trans. Pattern Analysis and Machine Intelligence, 1997] 32/33
33 Conclusions / Future work When training the PCA model with notes of different pitch, frequency interpolation improves accuracy of the model. The interpolation error is compensated by the gain in correlation between envelope time frames in training data. Appropriate framework for dynamic timbre modeling using prototype trajectories. Future work Integration in a source separation framework Refinement of trajectory models Modeling of frequency information 33/33
Dynamic Spectral Envelope Modeling for the Analysis of Musical Instrument Sounds
IEEE TRANS. ACOUST., SPEECH, SIGNAL PROCESS., VOL. X, NO. X, X 29 1 Dynamic Spectral Envelope Modeling for the Analysis of Musical Instrument Sounds Juan José Burred, Axel Röbel and Thomas Sikora Abstract
More informationDetection of goal event in soccer videos
Detection of goal event in soccer videos Hyoung-Gook Kim, Steffen Roeber, Amjad Samour, Thomas Sikora Department of Communication Systems, Technical University of Berlin, Einsteinufer 17, D-10587 Berlin,
More informationUnsupervised Learning
Unsupervised Learning Learning without Class Labels (or correct outputs) Density Estimation Learn P(X) given training data for X Clustering Partition data into clusters Dimensionality Reduction Discover
More informationSpectral modeling of musical sounds
Spectral modeling of musical sounds Xavier Serra Audiovisual Institute, Pompeu Fabra University http://www.iua.upf.es xserra@iua.upf.es 1. Introduction Spectral based analysis/synthesis techniques offer
More informationCHROMA AND MFCC BASED PATTERN RECOGNITION IN AUDIO FILES UTILIZING HIDDEN MARKOV MODELS AND DYNAMIC PROGRAMMING. Alexander Wankhammer Peter Sciri
1 CHROMA AND MFCC BASED PATTERN RECOGNITION IN AUDIO FILES UTILIZING HIDDEN MARKOV MODELS AND DYNAMIC PROGRAMMING Alexander Wankhammer Peter Sciri introduction./the idea > overview What is musical structure?
More informationDimension Reduction CS534
Dimension Reduction CS534 Why dimension reduction? High dimensionality large number of features E.g., documents represented by thousands of words, millions of bigrams Images represented by thousands of
More informationINDEPENDENT COMPONENT ANALYSIS APPLIED TO fmri DATA: A GENERATIVE MODEL FOR VALIDATING RESULTS
INDEPENDENT COMPONENT ANALYSIS APPLIED TO fmri DATA: A GENERATIVE MODEL FOR VALIDATING RESULTS V. Calhoun 1,2, T. Adali, 2 and G. Pearlson 1 1 Johns Hopkins University Division of Psychiatric Neuro-Imaging,
More informationFeature selection. Term 2011/2012 LSI - FIB. Javier Béjar cbea (LSI - FIB) Feature selection Term 2011/ / 22
Feature selection Javier Béjar cbea LSI - FIB Term 2011/2012 Javier Béjar cbea (LSI - FIB) Feature selection Term 2011/2012 1 / 22 Outline 1 Dimensionality reduction 2 Projections 3 Attribute selection
More informationLinear Methods for Regression and Shrinkage Methods
Linear Methods for Regression and Shrinkage Methods Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 Linear Regression Models Least Squares Input vectors
More informationCSC 411: Lecture 14: Principal Components Analysis & Autoencoders
CSC 411: Lecture 14: Principal Components Analysis & Autoencoders Richard Zemel, Raquel Urtasun and Sanja Fidler University of Toronto Zemel, Urtasun, Fidler (UofT) CSC 411: 14-PCA & Autoencoders 1 / 18
More information1 Introduction. 3 Data Preprocessing. 2 Literature Review
Rock or not? This sure does. [Category] Audio & Music CS 229 Project Report Anand Venkatesan(anand95), Arjun Parthipan(arjun777), Lakshmi Manoharan(mlakshmi) 1 Introduction Music Genre Classification continues
More information( ) =cov X Y = W PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components
Review Lecture 14 ! PRINCIPAL COMPONENT ANALYSIS Eigenvectors of the covariance matrix are the principal components 1. =cov X Top K principal components are the eigenvectors with K largest eigenvalues
More informationCSC 411: Lecture 14: Principal Components Analysis & Autoencoders
CSC 411: Lecture 14: Principal Components Analysis & Autoencoders Raquel Urtasun & Rich Zemel University of Toronto Nov 4, 2015 Urtasun & Zemel (UofT) CSC 411: 14-PCA & Autoencoders Nov 4, 2015 1 / 18
More informationConvention Paper Presented at the 120th Convention 2006 May Paris, France
Audio Engineering Society Convention Paper Presented at the 12th Convention 26 May 2 23 Paris, France This convention paper has been reproduced from the author s advance manuscript, without editing, corrections,
More informationECG782: Multidimensional Digital Signal Processing
Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu ECG782: Multidimensional Digital Signal Processing Spring 2014 TTh 14:30-15:45 CBC C313 Lecture 06 Image Structures 13/02/06 http://www.ee.unlv.edu/~b1morris/ecg782/
More informationFMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu
FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Introduction Often, we have only a set of features x = x 1, x 2,, x n, but no associated response y. Therefore we are not interested in prediction nor classification,
More informationMULTIVARIATE TEXTURE DISCRIMINATION USING A PRINCIPAL GEODESIC CLASSIFIER
MULTIVARIATE TEXTURE DISCRIMINATION USING A PRINCIPAL GEODESIC CLASSIFIER A.Shabbir 1, 2 and G.Verdoolaege 1, 3 1 Department of Applied Physics, Ghent University, B-9000 Ghent, Belgium 2 Max Planck Institute
More informationIMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING
SECOND EDITION IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING ith Algorithms for ENVI/IDL Morton J. Canty с*' Q\ CRC Press Taylor &. Francis Group Boca Raton London New York CRC
More informationCPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016
CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2016 A2/Midterm: Admin Grades/solutions will be posted after class. Assignment 4: Posted, due November 14. Extra office hours:
More informationInternational Journal of Digital Application & Contemporary research Website: (Volume 1, Issue 8, March 2013)
Face Recognition using ICA for Biometric Security System Meenakshi A.D. Abstract An amount of current face recognition procedures use face representations originate by unsupervised statistical approaches.
More informationData Preprocessing. Javier Béjar. URL - Spring 2018 CS - MAI 1/78 BY: $\
Data Preprocessing Javier Béjar BY: $\ URL - Spring 2018 C CS - MAI 1/78 Introduction Data representation Unstructured datasets: Examples described by a flat set of attributes: attribute-value matrix Structured
More informationCIE L*a*b* color model
CIE L*a*b* color model To further strengthen the correlation between the color model and human perception, we apply the following non-linear transformation: with where (X n,y n,z n ) are the tristimulus
More informationMusical Applications of MPEG-7 Audio
Musical Applications of MPEG-7 Audio Michael Casey, MERL Cambridge Research Laboratory Abstract the MPEG-7 international standard contains new tools for computing similarity and classification of audio
More informationUsing Noise Substitution for Backwards-Compatible Audio Codec Improvement
Using Noise Substitution for Backwards-Compatible Audio Codec Improvement Colin Raffel AES 129th Convention San Francisco, CA February 16, 2011 Outline Introduction and Motivation Coding Error Analysis
More informationClustering Lecture 5: Mixture Model
Clustering Lecture 5: Mixture Model Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced topics
More informationAn Effective Approach in Face Recognition using Image Processing Concepts
An Effective Approach in Face Recognition using Image Processing Concepts K. Ganapathi Babu 1, M.A.Rama Prasad 2 1 Pursuing M.Tech in CSE at VLITS,Vadlamudi Guntur Dist., A.P., India 2 Asst.Prof, Department
More informationConvention Paper Presented at the 120th Convention 2006 May Paris, France
Audio Engineering Society Convention Paper Presented at the 12th Convention 26 May 2 23 Paris, France This convention paper has been reproduced from the author s advance manuscript, without editing, corrections,
More informationCLASSIFICATION AND CHANGE DETECTION
IMAGE ANALYSIS, CLASSIFICATION AND CHANGE DETECTION IN REMOTE SENSING With Algorithms for ENVI/IDL and Python THIRD EDITION Morton J. Canty CRC Press Taylor & Francis Group Boca Raton London NewYork CRC
More informationFacial Expression Recognition Using Non-negative Matrix Factorization
Facial Expression Recognition Using Non-negative Matrix Factorization Symeon Nikitidis, Anastasios Tefas and Ioannis Pitas Artificial Intelligence & Information Analysis Lab Department of Informatics Aristotle,
More informationLast week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints
Last week Multi-Frame Structure from Motion: Multi-View Stereo Unknown camera viewpoints Last week PCA Today Recognition Today Recognition Recognition problems What is it? Object detection Who is it? Recognizing
More informationImage Transformation Techniques Dr. Rajeev Srivastava Dept. of Computer Engineering, ITBHU, Varanasi
Image Transformation Techniques Dr. Rajeev Srivastava Dept. of Computer Engineering, ITBHU, Varanasi 1. Introduction The choice of a particular transform in a given application depends on the amount of
More informationDiscriminative training and Feature combination
Discriminative training and Feature combination Steve Renals Automatic Speech Recognition ASR Lecture 13 16 March 2009 Steve Renals Discriminative training and Feature combination 1 Overview Hot topics
More informationLecture 8 Object Descriptors
Lecture 8 Object Descriptors Azadeh Fakhrzadeh Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University 2 Reading instructions Chapter 11.1 11.4 in G-W Azadeh Fakhrzadeh
More informationENHANCED RADAR IMAGING VIA SPARSITY REGULARIZED 2D LINEAR PREDICTION
ENHANCED RADAR IMAGING VIA SPARSITY REGULARIZED 2D LINEAR PREDICTION I.Erer 1, K. Sarikaya 1,2, H.Bozkurt 1 1 Department of Electronics and Telecommunications Engineering Electrics and Electronics Faculty,
More informationRobust Face Recognition via Sparse Representation Authors: John Wright, Allen Y. Yang, Arvind Ganesh, S. Shankar Sastry, and Yi Ma
Robust Face Recognition via Sparse Representation Authors: John Wright, Allen Y. Yang, Arvind Ganesh, S. Shankar Sastry, and Yi Ma Presented by Hu Han Jan. 30 2014 For CSE 902 by Prof. Anil K. Jain: Selected
More information22 October, 2012 MVA ENS Cachan. Lecture 5: Introduction to generative models Iasonas Kokkinos
Machine Learning for Computer Vision 1 22 October, 2012 MVA ENS Cachan Lecture 5: Introduction to generative models Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Center for Visual Computing Ecole Centrale Paris
More informationAnalysis of Functional MRI Timeseries Data Using Signal Processing Techniques
Analysis of Functional MRI Timeseries Data Using Signal Processing Techniques Sea Chen Department of Biomedical Engineering Advisors: Dr. Charles A. Bouman and Dr. Mark J. Lowe S. Chen Final Exam October
More informationApplication of Principal Components Analysis and Gaussian Mixture Models to Printer Identification
Application of Principal Components Analysis and Gaussian Mixture Models to Printer Identification Gazi. Ali, Pei-Ju Chiang Aravind K. Mikkilineni, George T. Chiu Edward J. Delp, and Jan P. Allebach School
More informationHybrid Speech Synthesis
Hybrid Speech Synthesis Simon King Centre for Speech Technology Research University of Edinburgh 2 What are you going to learn? Another recap of unit selection let s properly understand the Acoustic Space
More informationSchedule for Rest of Semester
Schedule for Rest of Semester Date Lecture Topic 11/20 24 Texture 11/27 25 Review of Statistics & Linear Algebra, Eigenvectors 11/29 26 Eigenvector expansions, Pattern Recognition 12/4 27 Cameras & calibration
More informationSequence representation of Music Structure using Higher-Order Similarity Matrix and Maximum likelihood approach
Sequence representation of Music Structure using Higher-Order Similarity Matrix and Maximum likelihood approach G. Peeters IRCAM (Sound Analysis/Synthesis Team) - CNRS (STMS) 1 supported by the RIAM Ecoute
More informationSpectral Classification
Spectral Classification Spectral Classification Supervised versus Unsupervised Classification n Unsupervised Classes are determined by the computer. Also referred to as clustering n Supervised Classes
More informationHuman Motion Synthesis by Motion Manifold Learning and Motion Primitive Segmentation
Human Motion Synthesis by Motion Manifold Learning and Motion Primitive Segmentation Chan-Su Lee and Ahmed Elgammal Rutgers University, Piscataway, NJ, USA {chansu, elgammal}@cs.rutgers.edu Abstract. We
More informationCS 1675 Introduction to Machine Learning Lecture 18. Clustering. Clustering. Groups together similar instances in the data sample
CS 1675 Introduction to Machine Learning Lecture 18 Clustering Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Clustering Groups together similar instances in the data sample Basic clustering problem:
More information3 Feature Selection & Feature Extraction
3 Feature Selection & Feature Extraction Overview: 3.1 Introduction 3.2 Feature Extraction 3.3 Feature Selection 3.3.1 Max-Dependency, Max-Relevance, Min-Redundancy 3.3.2 Relevance Filter 3.3.3 Redundancy
More informationClustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin
Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014
More informationMedia Segmentation using Self-Similarity Decomposition
Media Segmentation using Self-Similarity Decomposition Jonathan T. Foote and Matthew L. Cooper FX Palo Alto Laboratory Palo Alto, CA 93 USA {foote, cooper}@fxpal.com ABSTRACT We present a framework for
More informationSlides adapted from Marshall Tappen and Bryan Russell. Algorithms in Nature. Non-negative matrix factorization
Slides adapted from Marshall Tappen and Bryan Russell Algorithms in Nature Non-negative matrix factorization Dimensionality Reduction The curse of dimensionality: Too many features makes it difficult to
More informationSPEECH FEATURE EXTRACTION USING WEIGHTED HIGHER-ORDER LOCAL AUTO-CORRELATION
Far East Journal of Electronics and Communications Volume 3, Number 2, 2009, Pages 125-140 Published Online: September 14, 2009 This paper is available online at http://www.pphmj.com 2009 Pushpa Publishing
More informationPRINCIPAL COMPONENT ANALYSIS OF RASTERISED AUDIO FOR CROSS- SYNTHESIS
PRINCIPAL COMPONENT ANALYSIS OF RASTERISED AUDIO FOR CROSS- SYNTHESIS Jeremy J. Wells Audio Lab, Department of Electronics, University of York, YO10 5DD, UK. jjw100@ohm.york.ac.uk Gregory Aldam Department
More informationPreface to the Second Edition. Preface to the First Edition. 1 Introduction 1
Preface to the Second Edition Preface to the First Edition vii xi 1 Introduction 1 2 Overview of Supervised Learning 9 2.1 Introduction... 9 2.2 Variable Types and Terminology... 9 2.3 Two Simple Approaches
More informationLearning based face hallucination techniques: A survey
Vol. 3 (2014-15) pp. 37-45. : A survey Premitha Premnath K Department of Computer Science & Engineering Vidya Academy of Science & Technology Thrissur - 680501, Kerala, India (email: premithakpnath@gmail.com)
More informationINTERACTIVE REFINEMENT OF SUPERVISED AND SEMI-SUPERVISED SOUND SOURCE SEPARATION ESTIMATES
INTERACTIVE REFINEMENT OF SUPERVISED AND SEMI-SUPERVISED SOUND SOURCE SEPARATION ESTIMATES Nicholas J. Bryan Gautham J. Mysore Center for Computer Research in Music and Acoustics, Stanford University Adobe
More informationAcoustic to Articulatory Mapping using Memory Based Regression and Trajectory Smoothing
Acoustic to Articulatory Mapping using Memory Based Regression and Trajectory Smoothing Samer Al Moubayed Center for Speech Technology, Department of Speech, Music, and Hearing, KTH, Sweden. sameram@kth.se
More informationOptimum Array Processing
Optimum Array Processing Part IV of Detection, Estimation, and Modulation Theory Harry L. Van Trees WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Preface xix 1 Introduction 1 1.1 Array Processing
More information1 Audio quality determination based on perceptual measurement techniques 1 John G. Beerends
Contents List of Figures List of Tables Contributing Authors xiii xxi xxiii Introduction Karlheinz Brandenburg and Mark Kahrs xxix 1 Audio quality determination based on perceptual measurement techniques
More informationMPEG-7 Sound Recognition Tools
MPEG-7 Sound Recognition Tools Michael Casey, member IEEE Abstract The MPEG-7 sound recognition descriptors and description schemes consist of tools for indexing audio media using probabilistic sound models.
More informationDimensionality Reduction, including by Feature Selection.
Dimensionality Reduction, including by Feature Selection www.cs.wisc.edu/~dpage/cs760 Goals for the lecture you should understand the following concepts filtering-based feature selection information gain
More informationFMRI data: Independent Component Analysis (GIFT) & Connectivity Analysis (FNC)
FMRI data: Independent Component Analysis (GIFT) & Connectivity Analysis (FNC) Software: Matlab Toolbox: GIFT & FNC Yingying Wang, Ph.D. in Biomedical Engineering 10 16 th, 2014 PI: Dr. Nadine Gaab Outline
More information716 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 14, NO. 5, MAY 2004
716 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 14, NO. 5, MAY 2004 Audio Classification Based on MPEG-7 Spectral Basis Representations Hyoung-Gook Kim, Nicolas Moreau, and Thomas
More informationClustering. CS294 Practical Machine Learning Junming Yin 10/09/06
Clustering CS294 Practical Machine Learning Junming Yin 10/09/06 Outline Introduction Unsupervised learning What is clustering? Application Dissimilarity (similarity) of objects Clustering algorithm K-means,
More informationFace detection and recognition. Detection Recognition Sally
Face detection and recognition Detection Recognition Sally Face detection & recognition Viola & Jones detector Available in open CV Face recognition Eigenfaces for face recognition Metric learning identification
More informationDimension reduction : PCA and Clustering
Dimension reduction : PCA and Clustering By Hanne Jarmer Slides by Christopher Workman Center for Biological Sequence Analysis DTU The DNA Array Analysis Pipeline Array design Probe design Question Experimental
More informationPerceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects.
Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general
More informationA voice interface for sound generators: adaptive and automatic mapping of gestures to sound
University of Wollongong Research Online University of Wollongong in Dubai - Papers University of Wollongong in Dubai 2012 A voice interface for sound generators: adaptive and automatic mapping of gestures
More informationCOSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor
COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality
More informationAudio-coding standards
Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.
More informationWAVELET BASED NONLINEAR SEPARATION OF IMAGES. Mariana S. C. Almeida and Luís B. Almeida
WAVELET BASED NONLINEAR SEPARATION OF IMAGES Mariana S. C. Almeida and Luís B. Almeida Instituto de Telecomunicações Instituto Superior Técnico Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal mariana.almeida@lx.it.pt,
More informationScan Matching. Pieter Abbeel UC Berkeley EECS. Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics
Scan Matching Pieter Abbeel UC Berkeley EECS Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics Scan Matching Overview Problem statement: Given a scan and a map, or a scan and a scan,
More informationNew Approaches for EEG Source Localization and Dipole Moment Estimation. Shun Chi Wu, Yuchen Yao, A. Lee Swindlehurst University of California Irvine
New Approaches for EEG Source Localization and Dipole Moment Estimation Shun Chi Wu, Yuchen Yao, A. Lee Swindlehurst University of California Irvine Outline Motivation why EEG? Mathematical Model equivalent
More informationImage Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 18.
Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 18 Image Hashing Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu, Ph:
More informationSpeech and audio coding
Institut Mines-Telecom Speech and audio coding Marco Cagnazzo, cagnazzo@telecom-paristech.fr MN910 Advanced compression Outline Introduction Introduction Speech signal Music signal Masking Codeurs simples
More informationRecognition, SVD, and PCA
Recognition, SVD, and PCA Recognition Suppose you want to find a face in an image One possibility: look for something that looks sort of like a face (oval, dark band near top, dark band near bottom) Another
More informationIntroduction to ANSYS DesignXplorer
Lecture 4 14. 5 Release Introduction to ANSYS DesignXplorer 1 2013 ANSYS, Inc. September 27, 2013 s are functions of different nature where the output parameters are described in terms of the input parameters
More informationMultimedia Database Systems. Retrieval by Content
Multimedia Database Systems Retrieval by Content MIR Motivation Large volumes of data world-wide are not only based on text: Satellite images (oil spill), deep space images (NASA) Medical images (X-rays,
More informationBoth LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal.
Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general
More informationAdvanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude
Advanced phase retrieval: maximum likelihood technique with sparse regularization of phase and amplitude A. Migukin *, V. atkovnik and J. Astola Department of Signal Processing, Tampere University of Technology,
More informationCSC 411 Lecture 18: Matrix Factorizations
CSC 411 Lecture 18: Matrix Factorizations Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 18-Matrix Factorizations 1 / 27 Overview Recall PCA: project data
More informationAvailable online Journal of Scientific and Engineering Research, 2016, 3(4): Research Article
Available online www.jsaer.com, 2016, 3(4):417-422 Research Article ISSN: 2394-2630 CODEN(USA): JSERBR Automatic Indexing of Multimedia Documents by Neural Networks Dabbabi Turkia 1, Lamia Bouafif 2, Ellouze
More informationAudio-coding standards
Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.
More informationText-Independent Speaker Identification
December 8, 1999 Text-Independent Speaker Identification Til T. Phan and Thomas Soong 1.0 Introduction 1.1 Motivation The problem of speaker identification is an area with many different applications.
More informationCSC321: Neural Networks. Lecture 13: Learning without a teacher: Autoencoders and Principal Components Analysis. Geoffrey Hinton
CSC321: Neural Networks Lecture 13: Learning without a teacher: Autoencoders and Principal Components Analysis Geoffrey Hinton Three problems with backpropagation Where does the supervision come from?
More informationGraduate School for Integrative Sciences & Engineering 2 Arts and Creativity Laboratory, Interactive and Digital Media Institute
ADAPTING GENERAL PURPOSE INTERFACES TO SYNTHESIS ENGINES USING UNSUPERVISED DIMENSIONALITY REDUCTION TECHNIQUES AND INVERSE MAPPING FROM FEATURES TO PARAMETERS Stefano Fasciani 1,2 Lonce Wyse 2,3 1 Graduate
More informationCluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1
Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods
More informationLarge-Scale Face Manifold Learning
Large-Scale Face Manifold Learning Sanjiv Kumar Google Research New York, NY * Joint work with A. Talwalkar, H. Rowley and M. Mohri 1 Face Manifold Learning 50 x 50 pixel faces R 2500 50 x 50 pixel random
More informationAnalyzing Vocal Patterns to Determine Emotion Maisy Wieman, Andy Sun
Analyzing Vocal Patterns to Determine Emotion Maisy Wieman, Andy Sun 1. Introduction The human voice is very versatile and carries a multitude of emotions. Emotion in speech carries extra insight about
More informationAn M-Channel Critically Sampled Graph Filter Bank
An M-Channel Critically Sampled Graph Filter Bank Yan Jin and David Shuman March 7, 2017 ICASSP, New Orleans, LA Special thanks and acknowledgement: Andre Archer, Andrew Bernoff, Andrew Beveridge, Stefan
More informationPrincipal Component Image Interpretation A Logical and Statistical Approach
Principal Component Image Interpretation A Logical and Statistical Approach Md Shahid Latif M.Tech Student, Department of Remote Sensing, Birla Institute of Technology, Mesra Ranchi, Jharkhand-835215 Abstract
More informationCS 521 Data Mining Techniques Instructor: Abdullah Mueen
CS 521 Data Mining Techniques Instructor: Abdullah Mueen LECTURE 2: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major Tasks
More informationReversible Wavelets for Embedded Image Compression. Sri Rama Prasanna Pavani Electrical and Computer Engineering, CU Boulder
Reversible Wavelets for Embedded Image Compression Sri Rama Prasanna Pavani Electrical and Computer Engineering, CU Boulder pavani@colorado.edu APPM 7400 - Wavelets and Imaging Prof. Gregory Beylkin -
More informationAn Introduction to Pattern Recognition
An Introduction to Pattern Recognition Speaker : Wei lun Chao Advisor : Prof. Jian-jiun Ding DISP Lab Graduate Institute of Communication Engineering 1 Abstract Not a new research field Wide range included
More informationProbabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information
Probabilistic Facial Feature Extraction Using Joint Distribution of Location and Texture Information Mustafa Berkay Yilmaz, Hakan Erdogan, Mustafa Unel Sabanci University, Faculty of Engineering and Natural
More informationData Preprocessing. Javier Béjar AMLT /2017 CS - MAI. (CS - MAI) Data Preprocessing AMLT / / 71 BY: $\
Data Preprocessing S - MAI AMLT - 2016/2017 (S - MAI) Data Preprocessing AMLT - 2016/2017 1 / 71 Outline 1 Introduction Data Representation 2 Data Preprocessing Outliers Missing Values Normalization Discretization
More informationCS229 Final Project: Audio Query By Gesture
CS229 Final Project: Audio Query By Gesture by Steinunn Arnardottir, Luke Dahl and Juhan Nam {steinunn,lukedahl,juhan}@ccrma.stanford.edu December 2, 28 Introduction In the field of Music Information Retrieval
More informationReviewer Profiling Using Sparse Matrix Regression
Reviewer Profiling Using Sparse Matrix Regression Evangelos E. Papalexakis, Nicholas D. Sidiropoulos, Minos N. Garofalakis Technical University of Crete, ECE department 14 December 2010, OEDM 2010, Sydney,
More informationCOMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS
COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS Toomas Kirt Supervisor: Leo Võhandu Tallinn Technical University Toomas.Kirt@mail.ee Abstract: Key words: For the visualisation
More informationImage Analysis, Classification and Change Detection in Remote Sensing
Image Analysis, Classification and Change Detection in Remote Sensing WITH ALGORITHMS FOR ENVI/IDL Morton J. Canty Taylor &. Francis Taylor & Francis Group Boca Raton London New York CRC is an imprint
More information9 length of contour = no. of horizontal and vertical components + ( 2 no. of diagonal components) diameter of boundary B
8. Boundary Descriptor 8.. Some Simple Descriptors length of contour : simplest descriptor - chain-coded curve 9 length of contour no. of horiontal and vertical components ( no. of diagonal components
More informationClustering and Visualisation of Data
Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some
More information