Machine Learning and Visualisation


1 Machine Learning and Visualisation. Ian T. Nabney, Aston University, Birmingham, UK. March 2015.

2 Outline. The challenge of hidden knowledge; data visualisation: latent variable models; data visualisation: topographic mappings; non-linear modelling and feature selection.

3 Acknowledgements. Collaborators: Chris Bishop, Mike Tipping, David Lowe, Markus Svensén, Chris Williams; Peter Tiňo, Yi Sun, Dharmesh Maniyar, John Owen; Phil Laflin, Bruce Williams, Paola Gaolini, Jens Lösel; Martin Schroeder, Ain Abdul Karim, Dan Cornford, Cliff Bailey, Naomi Hubber, Shahzad Mumtaz, Michel Randrianandrasana; Richard Barnes, Colin Smith, Dan Wells.

4 Hidden Knowledge. Understanding the vast quantities of data that surround us is a real challenge, particularly in situations with many variables, but we can understand more of it with help. Machine learning is the computer-based generation of models from data. A model is a parameterised function from input attributes to an output prediction; the parameters of the model express the hidden connection between inputs and predictions, and they are learned from data.

5 Data Visualisation. What is Visualisation? The goal of visualisation is to present data in a human-readable way. Visualisation is an important tool for developing a better understanding of large, complex datasets; it is particularly helpful for users, such as research scientists or clinicians, who are not specialists in data modelling. Uses include: detection of outliers; clustering and segmentation; an aid to feature selection; feedback on the results of analysis. There are two aspects: data projection and information visualisation.

6 Data Visualisation. Data Projection. The goal is to project data to a lower-dimensional space (usually 2-d) while preserving as much information or structure as possible. Once the projection is done, standard information visualisation approaches can be used to support user interaction. The quantity and complexity of many datasets mean that simple visualisation methods, such as Principal Component Analysis, are not very effective.

7 Data Visualisation. Information Visualisation. Shneiderman: overview first; zoom and filter; details on demand. The overview is provided by the projection; zooming is possible in Matlab plots; filtering is done by user interaction, e.g. specifying a pattern of values that is of interest; details are given by providing local information. We will see more of this later in practical examples.

8 Data Visualisation. Information Visualisation Examples: word cloud.

9 Data Visualisation. Uncertainty. "Doubt is not a pleasant condition, but certainty is absurd." (Voltaire) Real data is noisy, so we are forced to deal with uncertainty, yet we need to be quantitative. The optimal formalism for inference in the presence of uncertainty is probability theory. We assume the presence of an underlying regularity in order to make predictions. Bayesian inference allows us to reason probabilistically about the model as well as the data.

10 Data Visualisation. Data Projection. [Diagram: a parameterised mapping f(y; W) from the data space D (coordinates y1, y2, y3) to the visualisation space V.] Define f to optimise some criterion: for PCA the criterion is variance, for the Sammon mapping it is stress.

11 Data Visualisation. What can we learn from this? [Plot: a two-dimensional visualisation with points labelled Sinus, VEL and VER.]

12 Data Visualisation. Projection. What is the simplest way to project data? A linear map. What is the best way to linearly project data? We want to preserve as much information as possible. If we assume that information is measured by variance, this implies choosing new coordinate axes along the directions of maximal variance; these can be found by analysing the covariance matrix of the data. This gives Principal Component Analysis (PCA). For large datasets, the end result is usually a circular blob in the middle of the screen.

13 Data Visualisation. PCA. Let S be the covariance matrix of the data, so that $S_{ij} = \frac{1}{N} \sum_{n} (x_i^n - \bar{x}_i)(x_j^n - \bar{x}_j)$. The first q principal components are the first q eigenvectors $w_j$ of S, ordered by the size of the eigenvalues $\lambda_j$. The percentage of the variance explained by the first q PCs is $\sum_{j=1}^{q} \lambda_j / \sum_{j=1}^{d} \lambda_j$, where the data dimension is d. These vectors are orthonormal (perpendicular and of unit length), and the variance of the data projected onto them is maximal. To plot the sorted principal values in Matlab: plot(-sort(-eig(cov(data))));
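
Expanding that one-liner, here is a minimal base-Matlab sketch of the full PCA projection; it assumes data is an N-by-d matrix, and all variable names are illustrative:

% Project an N-by-d data matrix onto its first two principal components.
Xc = data - repmat(mean(data, 1), size(data, 1), 1);   % centre the data
S = cov(Xc);                         % d-by-d covariance matrix
[V, D] = eig(S);                     % eigenvectors and eigenvalues of S
[lambda, idx] = sort(diag(D), 'descend');
W = V(:, idx(1:2));                  % first two principal directions
proj = Xc * W;                       % N-by-2 projected coordinates
explained = sum(lambda(1:2)) / sum(lambda);   % fraction of variance retained
plot(proj(:, 1), proj(:, 2), '.');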

14 Data Visualisation: Topographic Mappings. The basic aim is that distances in the visualisation space are as close as possible to those in the original data space. Given a dissimilarity matrix $d_{ij}$, we want to map data points $x_i$ to points $y_i$ in a feature space such that their dissimilarities $\tilde{d}_{ij}$ in feature space are as close as possible to the $d_{ij}$; we say that the map preserves similarities. The stress measure is used as the objective function: $E = \frac{1}{\sum_{i<j} d_{ij}} \sum_{i<j} \frac{(d_{ij} - \tilde{d}_{ij})^2}{d_{ij}}$.
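
A base-Matlab sketch of this stress computation, assuming X holds the data and Y a candidate 2-d configuration (illustrative names):

% Stress of a 2-d configuration Y against the original data X.
sqx = sum(X.^2, 2);
D  = sqrt(max(bsxfun(@plus, sqx, sqx') - 2 * (X * X'), 0));   % data-space distances
sqy = sum(Y.^2, 2);
Dy = sqrt(max(bsxfun(@plus, sqy, sqy') - 2 * (Y * Y'), 0));   % feature-space distances
mask = triu(true(size(D)), 1);       % use each pair i < j once
d = D(mask); dy = Dy(mask);
d(d == 0) = eps;                     % guard against coincident points
E = sum((d - dy).^2 ./ d) / sum(d);  % the stress measure above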

15 Data Visualisation: Topographic Mappings. Multi-Dimensional Scaling. Given distances or dissimilarities $d_{rs}$ between every pair of observations, we try to preserve these as far as possible in a lower-dimensional space. In classical scaling, the distance between the objects is assumed to be Euclidean; a linear projection then corresponds to PCA. The Sammon mapping is a non-linear multidimensional scaling technique that is more general (and more widely used) than classical scaling. NeuroScale is a neural-network-based scaling technique that has the advantage of actually giving a map that generalises!

16 Data Visualisation: Topographic Mappings. NeuroScale. [Figure: the NeuroScale mapping.]

17 Data Visualisation: Topographic Mappings. Biological Application: Streptomyces Gene Expression. Data supplied by Colin Smith (Surrey University). Streptomyces coelicolor is a bacterium which undergoes developmental changes correlated with sporulation and the production of antibiotics; its genes include more than 20 clusters coding for secondary metabolites, including a large proportion of regulatory genes. The dataset consists of ten time points from 16 to 67 hours after inoculation of the growth medium; the analysis is based on 3067 genes that were significantly expressed. SCO6283, SCO6284, SCO6277 and SCO6278 are co-regulated genes involved in the synthesis of a type I polyketide; SCO3245 is involved in the synthesis of lipid.

18 Data Visualisation: Topographic Mappings. Streptomycin. [Image: the life of streptomycin.] Bioinformatics: measuring the expression levels of thousands of genes over multiple timepoints.

19 Data Visualisation: Topographic Mappings. SCO6283, SCO6284, SCO6277 and SCO6278 lie in cluster 11; SCO3245 lies in cluster 12.

20 Data Visualisation: Topographic Mappings. Genes involved in the synthesis of two distinct secondary metabolites may be co-regulated by a common network.

21 Data Visualisation: Latent Variable Models. The projection approach is one way of reducing the data complexity; an alternative view is to hypothesise how the data might have been generated. Hidden Connections: "A hidden connection is stronger than an obvious one." (Heraclitus)

22 Data Visualisation: Latent Variable Models. How is the idea of hidden connections applied to statistical pattern recognition? Separate the observed variables from the latent variables: latent variables generate observations, and we use (probabilistic) inference to deduce what is happening in latent variable space, often via Bayes' theorem: $P(L \mid O) = \frac{P(O \mid L)\,P(L)}{P(O)}$. Static case: GTM, with two latent variables and a non-linear transformation to observation space. Dynamic cases: Hidden Markov Models (discrete state space; speech recognition) and State Space Models (continuous state space; tracking).
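
A toy numeric illustration of Bayes' theorem for a discrete latent variable (the probabilities below are invented):

% Two latent states with prior P(L) and likelihoods P(O | L).
priorL = [0.7, 0.3];          % P(L = 1), P(L = 2)
likeO  = [0.2, 0.9];          % P(O | L = 1), P(O | L = 2)
postL  = priorL .* likeO;     % numerator P(O | L) P(L)
postL  = postL / sum(postL);  % divide by P(O) to normalise
disp(postL);                  % posterior P(L | O), approx [0.34, 0.66]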

23 Data Visualisation: Latent Variable Models. Visualisation with Density Models. Construct a generative model for the data: a mapping from a low-dimensional latent space H to the data space D. It maps latent variables r to observed variables x, giving a probability density $p(x \mid r)$. To visualise the data we want to map from observed variables to latent variables, so we use Bayes' theorem to compute $p(r \mid x) = \frac{p(x \mid r)\,p(r)}{p(x)}$. We then plot a summary statistic of $p(r_i \mid x_i)$ for each data point $x_i$: usually the mean. If the mapping is linear and there is a single Gaussian noise model, we recover PCA.
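
A sketch of the posterior-mean computation for a grid-based generative model of this kind, assuming an isotropic Gaussian noise model and a uniform prior over K latent grid points; R (K-by-2 latent grid), M (K-by-d images of the grid under the trained mapping), X (N-by-d data) and s2 (noise variance) are illustrative names:

% Responsibilities p(r_k | x_n) and posterior means in latent space.
sqm = sum(M.^2, 2);
sqx = sum(X.^2, 2);
dist2 = bsxfun(@plus, sqx, sqm') - 2 * (X * M');  % N-by-K squared distances
logR = -dist2 / (2 * s2);                         % log p(x | r_k) up to a constant;
                                                  % a uniform prior cancels on normalising
logR = bsxfun(@minus, logR, max(logR, [], 2));    % stabilise before exponentiating
Resp = exp(logR);
Resp = bsxfun(@rdivide, Resp, sum(Resp, 2));      % responsibilities sum to one per point
means = Resp * R;                                 % N-by-2 posterior means to plot
plot(means(:, 1), means(:, 2), '.');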

24 Data Visualisation: Latent Variable Models. [Diagram: the mapping y(z; w) from the latent space (coordinates z1, z2) to the data space (coordinates x1, x2, x3).]

25 Data Visualisation: Latent Variable Models. The Generative Topographic Mapping. GTM (Bishop, Svensén and Williams) is a latent variable model with a non-linear RBF mapping from a (usually two-dimensional) latent space H to the data space D. The data doesn't live exactly on the manifold, so we smear it with Gaussian noise. We introduce a latent space density p(x), approximated by a data sample. This is a generative probabilistic model. It assumes that the data lies close to a two-dimensional manifold; this is likely to be too simple a model for interesting data, but we can measure the non-linearity of the sheet and use this to understand the visualisation plot. The model is trained in a maximum likelihood framework using an iterative algorithm (EM).
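
For reference, training a GTM with the Netlab toolbox looks roughly like the following. This is a sketch modelled on Netlab's demgtm demo; the function names and argument layouts (gtm, gtminit, gtmem, gtmlmean and the foptions vector) are assumptions tied to that toolbox and may differ between versions:

% Fit a GTM with an 8-by-8 latent grid and a 4-by-4 grid of RBF centres.
net = gtm(2, 64, size(data, 2), 16, 'gaussian', 0.1);
options = foptions;                 % Netlab's default options vector
net = gtminit(net, options, data, 'regular', [8 8], [4 4]);
options(1) = 1;                     % display error values during training
options(14) = 30;                   % number of EM iterations
[net, options, errlog] = gtmem(net, data, options);
means = gtmlmean(net, data);        % posterior means in latent space
plot(means(:, 1), means(:, 2), '.');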

29 Data Visualisation: Latent Variable Models. Enhancements to GTM. Curvatures give more information about the shape of the manifold. A hierarchy allows the user to drill down into the data, with either user-defined or automated (MML) selection of sub-model positions. Temporal dependencies in the data are handled by GTM through Time. Discrete data is handled by the Latent Trait Model (LTM), and all the other goodies work for it as well. The model can cope with missing data in training and visualisation. Further extensions: MML methods for feature selection; structured covariance; mixed data types.

30 Data Visualisation: Latent Variable Models. Local Parallel Coordinates. Parallel coordinates maps a d-dimensional data space onto two display dimensions by using d equidistant axes parallel to the y-axis. Each data point is displayed as a piecewise-linear graph intersecting each axis at the position corresponding to the data value for that dimension. It is impractical to display this for all the data points, so we allow the user to select a region of interest; the user can also interact with the local parallel coordinates plot to obtain detailed information (see the sketch below).
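
A minimal base-Matlab sketch of the local parallel coordinates view; data (N-by-d) and the logical selection index sel are illustrative names:

% Parallel-coordinates view of a user-selected region of interest.
local = data(sel, :);
% Scale each dimension to [0, 1] so the axes are comparable.
lo = min(local, [], 1); hi = max(local, [], 1);
scaled = bsxfun(@rdivide, bsxfun(@minus, local, lo), max(hi - lo, eps));
plot(1:size(local, 2), scaled', '-');   % one piecewise-linear graph per point
xlabel('dimension'); ylabel('scaled value');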

31 Data Visualisation: Latent Variable Models. Hierarchical GTM: Drilling Down. Bishop and Tipping introduced the idea of hierarchical visualisation for probabilistic PCA; we have developed a general framework for arbitrary latent variable models. Because GTM is a generative latent variable model, it is straightforward to train hierarchical mixtures of GTMs. We model the whole data set with a GTM at the top level, which is broken down into clusters at deeper levels of the hierarchy. Because the data can be visualised at each level of the hierarchy, the selection of the clusters that are used to train GTMs at the next level down can be carried out interactively by the user.

32 Data Visualisation: Latent Variable Models. Chemometric Application: HTS Data Exploration. Scientists at Pfizer searching for active compounds can now screen millions of compounds in a fortnight. The aims are to gain a better understanding of the results of multiple screens through the use of novel data visualisation and modelling techniques; to find clusters of similar compounds (measured in terms of biological activity) and use a representative subset to reduce the number of compounds in a screen; and to build local prediction models.

33 Data Visualisation: Latent Variable Models. We have taken data from Jens Lösel (Pfizer) consisting of 14-dimensional vectors representing chemical compounds using topological indices developed at Pfizer; the task is to predict logP. The plots segment the data (by responsibility), which can be used to build local predictive models that are often more accurate than global models. Only 14 inputs are used, far fewer than other methods of predicting logP require, and the results are comparable with other logP algorithms.

34 Data Visualisation: Latent Variable Models. [Visualisation plot.]

35 Data Visualisation: Latent Variable Models. [Visualisation plot.]

36 Data Visualisation: Latent Variable Models. Gaussian Process Latent Variable Model. [Visualisation plot.]

37 Non-linear Modelling and Feature Selection. Many chemometric problems can best be addressed using non-linear predictive models (e.g. QSAR). Models must be multivariate (there is no single "silver bullet"), but there are hundreds (thousands, tens of thousands) of possible features (e.g. for small molecules, proteins, ...). Linear models have a constant sensitivity to input variables; non-linear models have a variable sensitivity, with niches of good performance and variable importance, as the sketch below illustrates.
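
A small base-Matlab illustration of the contrast: the sensitivity (derivative) of a linear model is constant, while that of a single Gaussian RBF varies with the input (all values below are invented):

% Constant vs variable sensitivity of a model output to its input.
x = linspace(-3, 3, 200)';
w = 0.8; c = 0; s2 = 0.5;
lin_sens = w * ones(size(x));            % d/dx of w*x is w everywhere
rbf = exp(-(x - c).^2 / (2 * s2));
rbf_sens = -(x - c) / s2 .* rbf;         % d/dx of a Gaussian basis function
plot(x, lin_sens, '-', x, rbf_sens, '--');
legend('linear model', 'RBF model');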

38 Non-linear Modelling and Feature Selection. GTM-FS. [Plot: $d_1$ and $d_2$ have high saliency, $d_3$ has low saliency.]

39 Non-linear Modelling and Feature Selection. Chemometric Data. [Panels: GTM visualisation and GTM-FS visualisation, with magnification factors on a log scale.]

40 Non-linear Modelling and Feature Selection. Feature Saliencies. Both GTM models outperform the Kohonen SOM. GTM-FS performs better than GTM on magnification factors (71 to 126) and (subjectively) has more coherent clusters; GTM-FS performs worse than GTM on nearest-neighbour error (41% to 38%).
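
Such a nearest-neighbour label error can be computed directly in the projected space. A minimal base-Matlab sketch, assuming proj is the N-by-2 projection and labels the class labels (both illustrative names):

% Leave-one-out nearest-neighbour label error in visualisation space.
N = size(proj, 1);
sq = sum(proj.^2, 2);
D = bsxfun(@plus, sq, sq') - 2 * (proj * proj');  % squared distances
D(1:N+1:end) = inf;                  % exclude each point from its own neighbours
[~, nn] = min(D, [], 2);             % index of each point's nearest neighbour
err = mean(labels(nn) ~= labels);    % fraction with a differently-labelled neighbour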

41 Block GTM. Block-structured Covariance. Include prior information about the correlations of variables in a GTM by using a full covariance matrix in the noise model and enforcing a block structure: $\Sigma = \begin{pmatrix} \Sigma_1 & & 0 \\ & \ddots & \\ 0 & & \Sigma_p \end{pmatrix}$. This results in a reasonably sparse covariance matrix and keeps the number of unknown parameters low, while the additional flexibility allows the model to fit the data more closely. The extension of the learning algorithm is straightforward; the only changes occur in the computation of the responsibilities in the E-step and of $\Sigma$ in the M-step.
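
As a small illustration of the block structure, base Matlab's blkdiag assembles a covariance of this form from per-block matrices (the block values below are made up):

% Build a block-structured covariance; off-block entries are fixed at zero.
Sigma1 = [1.0 0.6; 0.6 1.0];
Sigma2 = [1.0 0.8 0.5; 0.8 1.0 0.7; 0.5 0.7 1.0];
Sigma3 = 1.0;
Sigma = blkdiag(Sigma1, Sigma2, Sigma3);
% Only the within-block entries are free parameters, far fewer
% than a full 6-by-6 covariance would need.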

42 Block-structured Covariance. Finding the Blocks: I. Find the block structure by visualising the correlation coefficients as a heat map. For this method to be successful, one needs to order the heat map so that highly correlated variables are close to each other (i.e. forming blocks). Generate a dendrogram using hierarchical clustering, combined with heuristics to reorder the leaves to reflect their proximity: the tree is ordered in such a way that the distance between neighbouring leaves is minimised. We use a recursive algorithm, Optimal Leaf Ordering (OLO), available in the Matlab Bioinformatics Toolbox, which swaps sub-trees if this reduces the distances to neighbours.
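
A sketch of the heat-map reordering, assuming the toolbox functions pdist, linkage and optimalleaforder are available (the talk cites the Bioinformatics Toolbox; recent Statistics Toolbox releases also ship optimalleaforder, so exact placement varies):

% Reorder a correlation heat map with hierarchical clustering + OLO.
C = corrcoef(data);                    % d-by-d correlation matrix
dissim = pdist(C);                     % dissimilarity between variables' profiles
tree = linkage(dissim, 'average');
order = optimalleaforder(tree, dissim);
imagesc(C(order, order)); colorbar;    % blocks of correlated variables emerge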

43 Block-structured Covariance. Finding the Blocks: II. Bayesian Correlation Estimation, based on the paper of Liechty et al. (2004). For the grouping, one is only interested in the off-diagonal elements of the empirical correlation matrix C. Assume that $C_{ij} \sim N(\mu, \sigma^2)$ with priors $\mu \sim N(0, \tau^2)$ and $\sigma^2 \sim IG(\alpha, \beta)$, with the hyperparameters known. Extend this to groups with means $\mu_{\theta_i, \theta_j}$, where the posterior $p(\theta_i)$ defines the groups. The full posterior distribution of $\theta_i$, $\mu$ and $\sigma$ can be sampled using the Metropolis-Hastings algorithm, but this is very slow, so we created a simpler "Quick BCE" which just estimates $p(\theta_i = k)$.
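
A minimal random-walk Metropolis sketch for the simplest version of this model: a single mean mu, with sigma2 and tau2 treated as known and c holding the off-diagonal correlations (all names illustrative, not the full grouped sampler):

% Random-walk Metropolis sampling of mu given the correlations c.
logpost = @(mu) -sum((c - mu).^2) / (2 * sigma2) - mu^2 / (2 * tau2);
mu = 0; samples = zeros(5000, 1);
for t = 1:numel(samples)
    prop = mu + 0.1 * randn;                 % random-walk proposal
    if log(rand) < logpost(prop) - logpost(mu)
        mu = prop;                           % accept the proposal
    end
    samples(t) = mu;                         % otherwise keep the current value
end
hist(samples(1000:end), 30);                 % posterior of mu after burn-in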

44 Block-structured Covariance. Results on Toy Data. [Plot: the nearest-neighbour label error with high (ST=20) and low (ST=2) structure for the GTM model with different covariance structures. Legend: PCA, blue dotted line with large dots; S-GTM, green constant line with crosses; B-GTM, red dashed line with diamonds; F-GTM, black dash-dotted line.]

45 Conclusions. Visualisation is an important tool for all types of user, and the domain expert must be involved in the process. Interaction with the plots allows the user to query the data more effectively. Presenting the data in the right way is key. Feature selection is a very important tool. Accounting for known structure (e.g. block covariance) improves results.

46 AgustaWestland. AW has pioneered CVM, the continuous recording of airframe vibration (0-200 Hz), to improve the investigation of unusual occurrences and to monitor airframe integrity. The aims are to develop a probabilistic framework for inferring flight mode and key parameters from multiple streams of vibration data; to improve indicators of airframe condition, using the wavelet transform and kernel entropy to assess the dynamics (i.e. non-stationary characteristics) of the vibration signal; and to provide integrated diagnosis based on probabilistic models of normality, using a belief network to model prior knowledge about the domain and the interactions between key variables.

47 Understanding the Data. Eight sensors measuring vibration; 108 frequency bands per sensor.

