Topographic Local PCA Maps

Peter Meinicke and Helge Ritter
Neuroinformatics Group, University of Bielefeld
{pmeinick,

Abstract

We present a model for coupling Local Principal Component Analysers based on a probabilistic notion of neighbourhood, which is inspired by the Self-Organizing Map (SOM). With our approach, topologically ordered configurations of Local PCAs arise from homotopy-based minimization of a global error function. We indicate that such an approach can be viewed as a natural generalization of the basic SOM while, unlike the SOM, it is not restricted to capturing the variation of multivariate data only along a small number of grid dimensions. We show the close relations to the Adaptive Subspace SOM (ASSOM), and by experimental results on synthetic and high-dimensional real-world data we demonstrate the capabilities of the model.

1 Introduction

Local PCA (LPCA) learning [7, 1] can be viewed as a plausible extension of the conventional Vector Quantization (VQ) framework. It replaces the point prototypes by linear manifolds, which can considerably improve generalization, especially in high-dimensional data spaces. In both the LPCA and the VQ framework the problem of overfitting quickly arises if we increase the number of prototypes to improve the approximation capabilities of the model. In the VQ case, one successful way to restrict the model complexity is to introduce couplings that constrain the flexibility of each prototype relative to a set of "neighbours". The well-known SOM [8, 10] imposes the relationships among the K prototypes by means of a neighbourhood function h_jk which determines the coupling strength between prototypes j and k. Usually, h_jk is related to the closeness of lattice points ("nodes") j, k on a fictitious grid, which is chosen to resemble the topology of the data in order to obtain good generalization. For a biological motivation of such an approach we refer to [8, 10].

In this paper we propose to extend the point-wise SOM prototypes to linear-manifold prototypes, which are able to extract some local subspace structure from the data. Optimization of the neighbourhood-coupled manifolds results in a special kind of feature map which we refer to as a "Topographic Local PCA" (T-LPCA) map. With respect to its representational capabilities, T-LPCA can be viewed as a natural generalization of the classical SOM, and in addition it can also realize the Adaptive Subspace SOM (ASSOM) [8], which turns out to be a special case of the T-LPCA framework as well. Therefore a probabilistic version of the SOM can easily be realized within the Topographic Local PCA framework, and indeed it plays an important role during learning via successive model refinement.

2 T-LPCA Prototypes

In essence, the definition of our T-LPCA model is based on the combination of two formalisms, which provide variability with respect to the kind of prototype and a certain kind of probabilistic neighbourhood, respectively. More specifically, the prototype variability is achieved by a parametrized distance function, which can provide a smooth transition from points to linear manifolds. The neighbourhood coupling is achieved by a set of assignment probabilities, which combine the squared distances w.r.t. distinct prototypes. This idea of a probabilistic notion of neighbourhood goes back to [9] and has been extended to "Soft Topographic Vector Quantization" by [4].
The combination of the above formalisms results in a function which measures the error of a d-dimensional point x with respect to the neighbourhood of prototype j:

    E_j(x; \alpha) = \frac{1}{2} \sum_{k=1}^{K} h_{jk} \, \|(I - \alpha V_k V_k^T)(x - w_k)\|^2    (1)

for j = 1, ..., K, with α ∈ [0, 1]. The error function sums up squared neighbourhood-weighted central (SOM) distances for α = 0 and orthogonal distances w.r.t. some linear manifolds for α = 1. While I is the d×d identity matrix, the V_k are d×q matrices with orthonormal columns which determine the directions of the manifolds. Each V_k V_k^T thus represents an orthogonal projection onto a q-dimensional subspace that becomes associated with node k. The w_k are points in data space which correspond to the SOM prototypes for α = 0 and which determine the distance of the linear manifolds from the origin for α = 1.

Within a statistical interpretation [9, 4] a data point is not deterministically assigned to a single prototype but instead to a set of probabilistically related prototypes. The final assignment depends on a discrete random variable r_j which assigns a point to the k-th prototype with probability h_jk. Therefore \sum_k h_{jk} = 1, and the distribution of r_j defines what we refer to as the "neighbourhood" of prototype j, or shortly the j-th neighbourhood. The error function (1) therefore defines the expected squared α-distance of a data point w.r.t. the distribution of r_j. The topological justification for this probabilistic notion of neighbourhood comes from the fact that within a neighbourhood j we require the probability h_jk to be a monotonic function of the distance between the corresponding nodes of prototypes j and k on a prespecified grid ("array") of chosen topology. The corresponding neighbourhood function, which maps array distances to probabilities, establishes the connection to the classical SOM. In subsection 3.3 we will propose a Gaussian realization of this function which leads to a convenient parametrization of the probabilities. For notational convenience the neighbourhood probabilities of all K random variables are collected in the prespecified neighbourhood matrix H = (h_jk), which encodes the topological relations between all prototypes. For the sake of simplified notation, in the following we suppress the H-dependency of E_j(·).

3 Optimization

Learning from the sample X = {x_1, ..., x_N} ⊂ R^d requires minimization of the following global error function:

    E(M, \Theta; \alpha) = \sum_{i=1}^{N} \sum_{j=1}^{K} m_{ij} \, E_j(x_i; \alpha)    (2)

where Θ comprises all model parameters w_1, V_1, ..., w_K, V_K and the N×K matrix M = (m_ij) contains a set of membership variables m_ij, which denote the membership of data point i w.r.t. a certain neighbourhood j. Within a hard-clustering framework the membership variables would be binary and would assign each data point to exactly one neighbourhood. In order to avoid poor local minima of the above objective function we do not start with a direct minimization of (2) by hard clustering, where a nearest-neighbour(hood) partitioning of the data and a subsequent re-estimation of the parameter values are iterated. Instead we use a homotopy-based method which gradually deforms an initial error function with a well-defined global minimum until the original error function is minimized in a final optimization step. This technique is usually referred to as deterministic annealing [11] and is the subject of the next subsection.

3.1 Deterministic Annealing

In the following the value of m_ij is viewed as the probability of data point i belonging to neighbourhood j, requiring

    \sum_{j} m_{ij} = 1, \quad m_{ij} \ge 0, \quad j = 1, ..., K    (3)

In that way we introduce a set of N random variables s_i with probability distributions P{s_i = j} = m_ij which randomize the data-to-neighbourhood assignments. Thus (2) denotes the expected error w.r.t. these s_i distributions. In contrast to the r_j of the previous section it does not make sense to prespecify the distributions of the s_i; the corresponding probabilities have to be estimated from the data.
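To make the quantities in (1)-(3) concrete, the following is a minimal sketch, not the authors' implementation; the NumPy realization and the array names (X, W, V, H, M) are assumptions introduced here for illustration.

```python
# Minimal sketch of the alpha-parametrized error (1) and the global error (2).
import numpy as np

def neighbourhood_errors(x, W, V, H, alpha):
    """E_j(x; alpha) for j = 1..K, eq. (1).

    x : (d,) data point
    W : (K, d) prototype means w_k
    V : (K, d, q) direction matrices V_k with orthonormal columns
    H : (K, K) neighbourhood matrix with unit row sums
    """
    K = W.shape[0]
    dist2 = np.empty(K)
    for k in range(K):
        r = x - W[k]                        # centred data point
        proj = V[k] @ (V[k].T @ r)          # component inside the k-th subspace
        dist2[k] = np.sum((r - alpha * proj) ** 2)
    # expectation of the squared alpha-distance over the j-th neighbourhood
    return 0.5 * H @ dist2

def global_error(X, M, W, V, H, alpha):
    """E(M, Theta; alpha) of eq. (2): membership-weighted sum over the sample."""
    return sum(M[i] @ neighbourhood_errors(x, W, V, H, alpha)
               for i, x in enumerate(X))
```

For α = 0 the projection term vanishes and E_j reduces to neighbourhood-weighted squared point distances, i.e. the soft-SOM error; for α = 1 only the components orthogonal to the local subspaces contribute.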
However, such an estimation scheme is not well-defined unless some further constraints are imposed on the m_ij. For that purpose a suitable approach is to constrain the entropy of the s_i distributions. This technique can be derived from the well-known "maximum entropy" principle [5] and can simply be implemented by adding a regularization term to the above error function (2):

    E(M, \Theta; \alpha, \beta) = E(M, \Theta; \alpha) + \frac{1}{\beta} \sum_{i=1}^{N} \sum_{j=1}^{K} m_{ij} \log m_{ij}    (4)

where β plays the role of an inverse temperature [11]. For an infinitely high temperature, i.e. β → 0, minimization of (4) w.r.t. the m_ij yields the maximum entropy solution for the s_i distributions, with all probabilities equal to 1/K. In this limit, optimization of the point prototypes yields \hat{w}_k = (1/N) \sum_i x_i for all k as the unique minimizers of (4), since the Hessian of the error function is positive definite in this case [11]. This means that for β → 0 all optimal prototypes coincide in the global sample mean. If β is increased, without neighbourhood constraint this state remains stable as long as the value of 1/β exceeds the largest eigenvalue of the sample covariance matrix, which is known as the "critical temperature" [11]. With the neighbourhood constraint this critical temperature also depends on H [4]. With further increasing β the prototype vectors undergo a series of splittings in order to minimize the regularized error function, and in the limit β → ∞ a hard clustering of the data is achieved. The technique described so far is well known as "deterministic annealing" and has the reputation of being rather robust against shallow local minima, provided an adequate annealing schedule is chosen.

3.2 Parameter Estimation

For given values of α, β and H, minimization of (4) can be achieved by a special version of the EM algorithm [2]. Thereby the following two steps are iterated until convergence.

E-Step

Given some parameter values for Θ, the optimal membership probabilities can be derived from the corresponding stationarity conditions (zero first derivatives) under the constraint (3), which yield

    \hat{m}_{ij} = \frac{\exp\{-\beta E_j(x_i; \alpha)\}}{\sum_k \exp\{-\beta E_k(x_i; \alpha)\}}    (5)

for i = 1, ..., N and j = 1, ..., K.

M-Step

Given some values for the membership probabilities, the optimal prototypes are derived from the corresponding stationarity conditions, which yield the following local means for k = 1, ..., K:

    \hat{w}_k = \frac{1}{n_k} \sum_{i=1}^{N} \sum_{j=1}^{K} \hat{m}_{ij} h_{jk} \, x_i    (6)

    n_k = \sum_{i=1}^{N} \sum_{j=1}^{K} \hat{m}_{ij} h_{jk}    (7)

For α > 0 the optimal direction matrices follow from (4) as

    \hat{V}_k = \arg\max_{V} \, \mathrm{tr}(S_k V V^T) \quad \text{subject to} \quad V^T V = I_q    (8)

with tr(·) denoting the trace operation, I_q being the q×q identity matrix and

    S_k = \frac{1}{n_k} \sum_{i=1}^{N} \sum_{j=1}^{K} \hat{m}_{ij} h_{jk} \, (x_i - \hat{w}_k)(x_i - \hat{w}_k)^T    (9)

being a local covariance matrix. Now it can be shown that a (non-unique) maximizer of the trace in (8) can be found from an eigenvalue decomposition of S_k, with \hat{V}_k containing as columns those eigenvectors which are associated with the q largest eigenvalues of S_k (see e.g. [6]). Thus estimation of the optimal direction matrices is achieved by performing K local PCAs.

3.3 Varying α and H

To obtain a good local minimum of the global error we combine the above deterministic annealing with two other deformations of the error function, which involve a gradual increase of the parameter α and a successive modification of H that realizes a "shrinking" neighbourhood. Since the splitting scheme of the previous subsection is not well-defined for general linear manifolds, it makes sense to first apply deterministic annealing to a set of initial point prototypes with α = 0. In order to control the extent of the neighbourhoods it is necessary to provide a suitable parametrization of the neighbourhood matrix H. A convenient choice is a Gaussian neighbourhood function, which for a 1-D array leads to the following probabilities:

    h_{jk} = \frac{1}{Z_j} \exp\left( -\frac{1}{2\sigma^2} |j - k|^2 \right)    (10)

where Z_j is chosen to provide unit row sums of H. Such a scheme easily generalizes to higher-dimensional arrays and provides a suitable control of the neighbourhood width by the variance σ² of the Gaussian neighbourhood function. With the specification of a neighbourhood function we can now extend the global error function to E(M, Θ; α, β, σ) in order to make the σ-dependency explicit. As in the SOM case, for the minimization of this function it is advisable to start with a large neighbourhood width and successively decrease it until, in the limit, all couplings between prototypes may vanish. Although other strategies are conceivable, the following overall optimization scheme has proven useful in all our experiments. We always start with α = 0 at some high temperature 1/β_min and with a large neighbourhood width. Then β is increased in a few optimization steps according to an exponential schedule.
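The inner optimization at each stage of this schedule is the EM iteration of subsection 3.2. As a minimal sketch, one EM cycle combining the Gaussian neighbourhood (10) with the E-step (5) and M-step (6)-(9) might look as follows; the NumPy realization and the function names are assumptions introduced here and this is not the authors' code.

```python
# One EM cycle for fixed alpha, finite beta and fixed H, following eqs. (5)-(10).
import numpy as np

def gaussian_neighbourhood(K, sigma):
    """Eq. (10) on a 1-D array: h_jk ~ exp(-|j-k|^2 / (2 sigma^2)), unit row sums."""
    idx = np.arange(K)
    D2 = (idx[:, None] - idx[None, :]) ** 2
    H = np.eye(K) if sigma == 0.0 else np.exp(-0.5 * D2 / sigma ** 2)
    return H / H.sum(axis=1, keepdims=True)

def em_step(X, W, V, H, alpha, beta):
    N, d = X.shape
    K, _, q = V.shape

    # E-step, eq. (5): soft memberships m_ij from the neighbourhood errors
    E = np.empty((N, K))
    for i, x in enumerate(X):
        r = x[None, :] - W                                  # (K, d) residuals
        coeff = np.einsum('kdq,kd->kq', V, r)               # V_k^T r_k
        proj = np.einsum('kdq,kq->kd', V, coeff)            # V_k V_k^T r_k
        dist2 = np.sum((r - alpha * proj) ** 2, axis=1)
        E[i] = 0.5 * H @ dist2                              # eq. (1)
    M = np.exp(-beta * (E - E.min(axis=1, keepdims=True)))  # shifted for stability
    M /= M.sum(axis=1, keepdims=True)

    # M-step, eqs. (6)-(9): local means and local PCAs
    R = M @ H                                               # R_ik = sum_j m_ij h_jk
    n = R.sum(axis=0)                                       # eq. (7)
    W_new = (R.T @ X) / n[:, None]                          # eq. (6)
    V_new = np.empty((K, d, q))
    for k in range(K):
        Xc = X - W_new[k]
        S = (Xc * R[:, k, None]).T @ Xc / n[k]              # eq. (9), local covariance
        evals, evecs = np.linalg.eigh(S)
        V_new[k] = evecs[:, np.argsort(evals)[::-1][:q]]    # eq. (8), top-q eigenvectors
    return M, W_new, V_new
```

One EM cycle leaves α, β and H fixed; the outer homotopy described in the text changes these quantities between cycles.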
After this initial deterministic annealing phase we continue with zero-temperature hard clustering and, according to a linear schedule, we increase α and decrease σ in a few steps until α = 1 and σ = 0, respectively. For the case σ = 0 the Gaussian neighbourhood function is replaced by a Dirac impulse. The overall optimization scheme is shown in table 1 in a more algorithmic fashion.
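For orientation, the schedule of table 1 (reproduced after this sketch) can be written as a small driver loop. The constants, the default step size delta_alpha and the minimize() stub are placeholders introduced here; in the paper the inner minimization is the EM iteration of subsection 3.2.

```python
# Schematic rendering of the homotopy schedule in table 1 (not the authors' code).

def minimize(alpha, beta, sigma):
    """Stand-in for minimizing E(M, Theta; alpha, beta, sigma), e.g. by EM."""
    pass  # run EM updates until convergence at the given (alpha, beta, sigma)

def homotopy_schedule(beta_min, beta_max, sigma_max, eta=2.0, delta_alpha=0.25):
    # Phase 1: deterministic annealing with point prototypes (alpha = 0)
    alpha, beta, sigma = 0.0, beta_min, sigma_max
    while beta < beta_max:
        minimize(alpha, beta, sigma)
        beta *= eta                        # exponential schedule
    # Phase 2: zero-temperature hard clustering; grow alpha, shrink sigma
    while alpha < 1.0:
        minimize(alpha, float("inf"), sigma)   # beta -> infinity: hard assignments
        alpha = min(1.0, alpha + delta_alpha)
        sigma = (1.0 - alpha) * sigma_max      # linear shrinking of the neighbourhood
    minimize(1.0, float("inf"), 0.0)           # final step: alpha = 1, sigma = 0
```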

Table 1: Homotopy-based optimization scheme; for example values of the constants β_min, β_max, σ_max, η and Δα see section 5.

➊ Define β_max > β_min > 0; σ_max > 0; η > 1; Δα > 0
➋ Initialize α = 0; β = β_min; σ = σ_max
➌ Minimize E(M, Θ; α, β, σ)
➍ Set β = η β
➎ If β < β_max Goto ➌
➏ Minimize E(M, Θ; α, ∞, σ)
➐ Set α = α + Δα; σ = (1 − α) σ_max
➑ If α ≤ 1 Goto ➏ Else Stop.

4 Relations to the ASSOM

Learning with the Adaptive Subspace SOM (ASSOM) [8] can be viewed as an online variant of our T-LPCA optimization scheme for the particular case β → ∞ and α = 1, with the linear manifolds passing through the origin, i.e. w_k = 0, k = 1, ..., K. Due to the latter constraint the ASSOM cannot formally be viewed as a generalization of the SOM, and in practice it would not be possible to build the ASSOM from an initially given SOM by simply extending the prototypes. In addition the constraint specializes the ASSOM to certain kinds of data distributions, as illustrated by the experiments of the next section. In cases where the local means w_k of the T-LPCA map are highly correlated with the main directions of the V_k, the ASSOM can be expected to yield a similarly good representation of the data. Our experimental results indicate that this might be the case for the handwritten digit image data which we used for T-LPCA training, since figure 3 shows that the main directions in the second row (from the bottom) are approximately scaled versions of the corresponding means in the bottom row. However, the noisy circle (see figures 1 and 2) provides an example which is better suited for the more general T-LPCA model, since the latter allows the linear manifolds to have arbitrary offsets w.r.t. the origin.

5 Experimental Results

In all experiments we applied the optimization scheme described in section 3 and table 1. The initial neighbourhood width σ_max was set to twice the grid spacing and the initial temperature 1/β_min was set to the largest eigenvalue of the sample covariance matrix. In the deterministic annealing phase we used a factor η = 2 to increase β over 10 iterations. During each iteration the EM optimization of subsection 3.2 was applied to reduce the global error. In the second, zero-temperature phase α was incremented by the fixed step size Δα.

5.1 Noisy Circle

In the first experiment, prototypes with one-dimensional subspaces (q = 1) were fitted to a synthetic data set of 100 points, which were generated by sampling from the unit circle and adding isotropic Gaussian noise with standard deviation 0.1. We used a model with K = 6 prototypes and a 1-D array of equally spaced nodes. We measured the average residual squared error per data point, and the resulting model is depicted in figure 1. For comparison we also fitted a model with zero local means in order to achieve an ASSOM-like representation, again measuring the average squared error; the result is shown in figure 2.

Figure 1: T-LPCA model with 1-D topology and K = 6 prototypes fitted to a 100-point sample of the noisy circle.

5.2 Feature Representation

In the second experiment we used a downsampled 1000-point subset of the MNIST database containing 8×8 images of handwritten "1" digits. In the 64-dimensional data space we fitted a T-LPCA model with K = 6 prototypes and q = 5 subspace dimensions.

Figure 2: T-LPCA map with w_k = 0, k = 1, ..., K, leading to an ASSOM-like model with all K = 6 lines passing through the origin.

Again the nodes were arranged on a regular 1-D array. From the result in figure 3 we see that most of the non-linear variation, in this case mainly due to rotation in the image plane, is captured along the horizontal array dimension. In addition, some linear feature filters emerge along the vertical subspace dimensions.

Figure 3: T-LPCA map from a 1000-point sample of 64-dimensional image vectors of handwritten "1" digits; the bottom row shows the local means w_k, the rows above show the column vectors of the direction matrices V_k for q = 5.

5.3 Visualization

A convenient visualization of a 2-D SOM can be achieved if the distance between neighbouring prototypes is mapped to the grey level of a corresponding image region, according to the topology of the underlying SOM array. In essence the resulting visualization resembles the so-called U-map [12], and one might argue that this concept easily carries over to linear-manifold prototypes. However, a suitable distance metric is not quite obvious. The distance between two subspaces S_j and S_k, represented by matrices V_j, V_k (see section 2), can be defined as [3]

    \mathrm{dist}(S_j, S_k) = \|V_j V_j^T - V_k V_k^T\|_2    (11)

which equals the largest singular value of V_j V_j^T − V_k V_k^T. However, this distance does not involve the local means, and the results we achieved by using the direction matrices only were rather poor. As a possible alternative we investigated an extension of the usual point-to-point distance to a sum of point-to-manifold distances:

    D_{jk} = \frac{1}{2} \|(2I - V_j V_j^T - V_k V_k^T)(w_j - w_k)\|    (12)

which is simply half the orthogonal distance of the local mean w_j to linear manifold k plus half the distance of w_k to manifold j. For zero-dimensional subspaces it reduces to the usual point-to-point distance normally used for U-map imaging. As an illustrative example we used this "pseudo"-distance to build a U-map from a T-LPCA model which we had trained on 8×8 images of the digits "0" to "4". The training set contained 1000 examples of each digit, which were used to optimize a model with 36 nodes arranged on a regular 6×6 grid. The resulting U-map is shown in figure 4.

6 Conclusion

We conclude that the T-LPCA map is a highly promising extension of the SOM, which is capable of capturing some high-dimensional local linear variation in addition to the global non-linear variation along the low-dimensional SOM array. We showed that T-LPCA maps can be formulated as a probabilistic generalization of the SOM by means of a suitable parametrization of a global error function which is minimized by homotopy-based optimization.
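As a supplement to the U-map construction of subsection 5.3, the subspace distance (11) and the pseudo-distance (12) can be sketched as follows; the function names are illustrative assumptions and this is not the authors' code.

```python
# Minimal sketch of the two distances discussed in subsection 5.3.
import numpy as np

def subspace_distance(Vj, Vk):
    """Eq. (11): largest singular value of V_j V_j^T - V_k V_k^T (spectral norm)."""
    return np.linalg.norm(Vj @ Vj.T - Vk @ Vk.T, ord=2)

def pseudo_distance(wj, Vj, wk, Vk):
    """Eq. (12): combines the offsets of the local means relative to both
    manifolds; reduces to the point-to-point distance for q = 0."""
    d = wj - wk
    P = 2 * np.eye(len(d)) - Vj @ Vj.T - Vk @ Vk.T
    return 0.5 * np.linalg.norm(P @ d)
```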

Figure 4: T-LPCA U-map for the 6×6 model built from the digit data; fields between units have a grey level proportional to the D_jk "distance" defined in the text; on diagonals the minimum of both distances is taken; fields of units take the median of their surrounding fields; labels indicate the most common digit class mapped to the corresponding prototype.

References

[1] Christoph Bregler and Stephen M. Omohundro. Surface learning with applications to lipreading. In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors, Advances in Neural Information Processing Systems, volume 6. Morgan Kaufmann Publishers, Inc.

[2] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1-38.

[3] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, third edition.

[4] T. Graepel, M. Burger, and K. Obermayer. Phase transitions in stochastic self-organizing maps. Physical Review E, 56(4).

[5] E. Jaynes. Information theory and statistical mechanics. Physical Review, 106(4).

[6] I. T. Jolliffe. Principal Component Analysis. Springer, New York.

[7] Nanda Kambhatla and Todd K. Leen. Fast nonlinear dimension reduction. In Advances in Neural Information Processing Systems, volume 6. Morgan Kaufmann Publishers, Inc.

[8] T. Kohonen. Self-Organizing Maps. Springer, Berlin.

[9] Stephen P. Luttrell. A Bayesian analysis of self-organizing maps. Neural Computation, 6(5).

[10] H. Ritter, T. Martinetz, and K. Schulten. Neural Computation and Self-Organizing Maps: An Introduction. Addison-Wesley, Reading, MA.

[11] K. Rose, E. Gurewitz, and G. C. Fox. Vector quantization by deterministic annealing. IEEE Transactions on Information Theory, 38(4).

[12] A. Ultsch. Self-organizing neural networks for visualization and classification. In O. Opitz, B. Lausen, and R. Klar, editors, Information and Classification. Springer, Berlin.
