Clustering: Classic Methods and Modern Views
1 Clustering: Classic Methods and Modern Views. Marina Meilă, University of Washington. June 22, 2015, Lorentz Center Workshop on Clusters, Games and Axioms.
2 Outline
- Paradigms for clustering
- Basic algorithms
- K-means clustering and the quadratic distortion
- Model based / soft clustering, the EM algorithm and maximizing likelihood
- Similarity based / graph clustering and the spectral clustering algorithm
- Further issues and current trends
3 Outline
- Paradigms for clustering
- Basic algorithms
- K-means clustering and the quadratic distortion
- Model based / soft clustering, the EM algorithm and maximizing likelihood
- Similarity based / graph clustering and the spectral clustering algorithm
- Further issues and current trends
4 What is clustering? Problem and Notation
Informal definition: Clustering = finding groups in data.
Notation:
- D = {x_1, x_2, ..., x_n} a data set; n = number of data points
- K = number of clusters (K << n)
- C = {C_1, C_2, ..., C_K} a partition of D into disjoint subsets
- k(i) = the label of point i
- L(C) = cost (loss) of C, to be minimized
Second informal definition: Clustering = given n data points, separate them into K clusters.
Hard vs. soft clusterings:
- Hard clustering C: an item belongs to only 1 cluster
- Soft clustering γ = {γ_ki}_{k=1:K, i=1:n}: γ_ki = the degree of membership of point i to cluster k, with ∑_k γ_ki = 1 for all i (usually associated with a probabilistic model)
5 (figure-only slide; no text recovered)
6 Paradigms
Depend on the type of data, type of clustering, type of cost (probabilistic or not), and constraints (on K, shape of clusters).
Data = vectors {x_i} in R^d:
- Parametric (K known): cost based [hard]; model based [soft]
- Non-parametric (K determined by the algorithm): Dirichlet process mixtures [soft]; information bottleneck [soft]; modes of the distribution [hard]; Gaussian blurring mean shift [hard]
Data = similarities between pairs of points [S_ij]_{i,j=1:n}, S_ij = S_ji ≥ 0 (called similarity based clustering):
- graph partitioning / spectral clustering [hard, K fixed, cost based]
- typical cuts [hard, non-parametric, cost based]
- affinity propagation [hard/soft, non-parametric]
7 Classification vs Clustering
- Performance criterion: expected error (classification) vs. a wide variety of criteria (clustering)
- Supervision: supervised vs. unsupervised
- Generalization: performance on new data is what matters vs. performance on the current data is what matters
- K: known vs. unknown
- Goal: prediction vs. exploration, etc.
- Stage of field: mature vs. young (new paradigms and theoretical results are emerging)
8 Outline
- Paradigms for clustering
- Basic algorithms
- K-means clustering and the quadratic distortion
- Model based / soft clustering, the EM algorithm and maximizing likelihood
- Similarity based / graph clustering and the spectral clustering algorithm
- Further issues and current trends
9 K-means clustering
Algorithm K-Means
Input: data D = {x_i}_{i=1:n}, number of clusters K
Initialize centers µ_1, µ_2, ..., µ_K ∈ R^d at random.
Iterate until convergence:
1. for i = 1:n (assign points to clusters, i.e. new clustering): k(i) = argmin_k ||x_i − µ_k||
2. for k = 1:K (recalculate centers): µ_k = (1/|C_k|) ∑_{i∈C_k} x_i   (1)
Convergence: if C doesn't change at iteration m, it will never change after that, so the algorithm converges in a finite number of steps to a local optimum of the cost L (defined next); therefore, initialization will matter.
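A minimal NumPy sketch of the algorithm above, assuming Euclidean data in rows of X; the function name kmeans and the stopping test on unchanged labels are illustrative choices, not from the slides:

```python
import numpy as np

def kmeans(X, K, max_iter=100, rng=None):
    """K-means: alternate label assignment (step 1) and center updates (step 2)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Initialize with K data points chosen at random (Idea 2 on the initialization slide)
    centers = X[rng.choice(n, size=K, replace=False)].copy()
    labels = np.full(n, -1)
    for _ in range(max_iter):
        # Step 1: assign each point to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):   # clustering unchanged => converged
            break
        labels = new_labels
        # Step 2: recompute each center as the mean of its cluster, eq. (1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels, centers
```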
10 The K-means cost
L(C) = ∑_{k=1}^K ∑_{i∈C_k} ||x_i − µ_k||²
K-means solves a least-squares problem; the cost L is called the quadratic distortion.
Proposition: The K-means algorithm decreases L(C) at every step.
Sketch of proof:
- step 1: reassigning the labels can only decrease L
- step 2: recomputing the centers µ_k can only decrease L, because µ_k as given by (1) is the solution of
µ_k = argmin_{µ∈R^d} ∑_{i∈C_k} ||x_i − µ||²   (2)
11 The K-means cost, rewritten
L as a sum of squared distances to centers:
L(C) = ∑_{k=1}^K ∑_{i∈C_k} ||x_i − µ_k||²
L as a sum of squared intracluster distances:
L(C) = ∑_{k=1}^K 1/(2|C_k|) ∑_{i,j∈C_k} ||x_i − x_j||²   (3)
L in matrix form: define the normalized assignment matrix Z(C) associated with C, whose column k is the indicator vector of C_k scaled by 1/√|C_k|. For example, for C = {C_1 = {1, 2, 3}, C_2 = {4, 5}}, Z_unnorm(C) has rows (1,0), (1,0), (1,0), (0,1), (0,1), and Z(C) has rows (1/√3, 0), (1/√3, 0), (1/√3, 0), (0, 1/√2), (0, 1/√2).
Then Z is an orthogonal matrix (its columns are orthonormal) and
L(C) = ½ trace(Z^T A Z) with A_ij = ||x_i − x_j||²   (4)
Proof of (3): replace µ_k as expressed in (1) in the expression of L, then rearrange the terms.
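A quick numeric check of the three equivalent expressions for L(C), on random data and the example partition above (a sketch; the factor 1/(2|C_k|) is reconstructed from the standard identity, since it was lost in transcription):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))                   # 5 points in R^2
C = [np.array([0, 1, 2]), np.array([3, 4])]   # C_1 = {1,2,3}, C_2 = {4,5}, 0-indexed

# Sum of squared distances to the cluster means
L1 = sum(((X[idx] - X[idx].mean(axis=0)) ** 2).sum() for idx in C)

# Sum of squared intracluster distances, eq. (3)
L2 = sum(((X[idx][:, None, :] - X[idx][None, :, :]) ** 2).sum() / (2 * len(idx))
         for idx in C)

# Matrix form, eq. (4)
Z = np.zeros((5, 2))
for k, idx in enumerate(C):
    Z[idx, k] = 1 / np.sqrt(len(idx))
A = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)   # A_ij = ||x_i - x_j||^2
L3 = 0.5 * np.trace(Z.T @ A @ Z)

print(L1, L2, L3)   # the three values agree
```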
12 Initialization of the centroids µ_{1:K}
- Idea 1: start with K points at random.
- Idea 2: start with K data points at random. What's wrong with choosing K data points at random? The probability of hitting all K clusters with K samples approaches 0 when K > 5.
- Idea 3: start with K data points chosen by fastest first traversal (a simple greedy approach that spreads out the centers).
- Idea 4: k-means++ (a randomized, theoretically backed approach to spread out the centers).
- Idea 5: the K-logK initialization (start with enough centers to hit all clusters, then prune down to K; see the next slide and the sketch after it).
13 The K-logK initialization
1. pick µ⁰_{1:K'} at random from the data set, where K' = O(K log K) (this ensures that each cluster has at least 1 center w.h.p.)
2. run 1 step of K-means
3. remove all centers µ⁰_k that have few points, e.g. |C_k| < n/(eK')
4. from the remaining centers, select K centers by fastest first traversal:
4.1 pick µ_1 at random from the remaining {µ⁰_{1:K'}}
4.2 for k = 2:K, take µ_k = argmax over the remaining µ⁰ of min_{j=1:k−1} ||µ⁰ − µ_j||, i.e. the next µ_k is furthest away from the already chosen centers
5. continue with the standard K-means algorithm
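A sketch of steps 1–4; the pruning threshold n/(eK') is my reading of the garbled slide, and the helper name klogk_init is hypothetical:

```python
import numpy as np

def klogk_init(X, K, rng=None):
    """K-logK initialization: overseed, prune small clusters, farthest-first down to K."""
    rng = np.random.default_rng(rng)
    n = len(X)
    K2 = max(K, int(np.ceil(K * np.log(K))))         # K' = O(K log K) initial centers
    centers = X[rng.choice(n, size=K2, replace=False)]
    # One step of K-means: assign points, then recompute the surviving centers
    labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
    sizes = np.bincount(labels, minlength=K2)
    keep = np.flatnonzero(sizes >= n / (np.e * K2))  # prune centers with few points (assumed threshold)
    centers = np.array([X[labels == k].mean(axis=0) for k in keep])
    # Fastest first traversal: repeatedly take the center furthest from those chosen
    chosen = [int(rng.integers(len(centers)))]
    for _ in range(1, K):
        d = np.linalg.norm(centers[:, None] - centers[chosen][None], axis=2).min(axis=1)
        chosen.append(int(d.argmax()))
    return centers[chosen]   # feed these to the standard K-means algorithm
```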
14 K-means clustering with the K-logK initialization
Example using a mixture of 7 Normal distributions with 100 outliers sampled uniformly (K = 7, n = 1100). (Figure: K-means cost vs. iteration, for the K-logK initialization and for naive initialization.)
15 Model based clustering: Mixture models
The mixture density: f(x) = ∑_{k=1}^K π_k f_k(x)
- f_k(x) = the components of the mixture; each is a density; f is called a mixture of Gaussians if f_k = Normal(µ_k, Σ_k)
- π_k = the mixing proportions, with ∑_{k=1}^K π_k = 1 and π_k ≥ 0
- model parameters: θ = (π_{1:K}, µ_{1:K}, Σ_{1:K})
The degree of membership of point i to cluster k:
γ_ki ≝ P[x_i ∈ C_k] = π_k f_k(x_i)/f(x_i), for i = 1:n, k = 1:K   (5)
It depends on x_i and on the model parameters.
16 The Expectation-Maximization (EM) Algorithm
Algorithm Expectation-Maximization (EM)
Input: data D = {x_i}_{i=1:n}, number of clusters K
Initialize parameters π_{1:K} ∈ R, µ_{1:K} ∈ R^d, Σ_{1:K} ∈ R^{d×d}.
Iterate until convergence:
E step (optimize the clustering): for i = 1:n, k = 1:K, compute
γ_ki = π_k f_k(x_i)/f(x_i)   (6)
M step (optimize the parameters): compute the number of points in cluster k,
Γ_k = ∑_{i=1}^n γ_ki, for k = 1:K (note: ∑_k Γ_k = n),
then estimate the parameters, for k = 1:K:
π_k = Γ_k/n, µ_k = (1/Γ_k) ∑_{i=1}^n γ_ki x_i, Σ_k = (1/Γ_k) ∑_{i=1}^n γ_ki (x_i − µ_k)(x_i − µ_k)^T
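A compact sketch of this iteration for a mixture of Gaussians; the function name, the random initialization, and the small ridge added to Σ_k for numerical stability are my additions:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, rng=None):
    """EM for a Gaussian mixture: alternate the E step (responsibilities, eq. (6))
    and the M step (parameter re-estimation)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(n, size=K, replace=False)].copy()
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(K)])
    for _ in range(n_iter):
        # E step: gamma[k, i] = pi_k f_k(x_i) / f(x_i)
        dens = np.array([pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
                         for k in range(K)])          # shape (K, n)
        gamma = dens / dens.sum(axis=0, keepdims=True)
        # M step: Gamma_k = sum_i gamma_ki, then re-estimate pi, mu, Sigma
        Gamma = gamma.sum(axis=1)                      # note: Gamma.sum() == n
        pi = Gamma / n
        mu = (gamma @ X) / Gamma[:, None]
        for k in range(K):
            Xc = X - mu[k]
            Sigma[k] = (gamma[k, None] * Xc.T) @ Xc / Gamma[k] + 1e-6 * np.eye(d)
    return pi, mu, Sigma, gamma
```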
17 EM versus K-means
- Both alternate between cluster assignments and parameter estimation
- In EM the cluster assignments γ_ki are probabilistic
- The cluster parametrization is more flexible
- EM converges to a local optimum of the log-likelihood
- Initialization by the K-logK method is recommended
- Modern algorithms with guarantees (e.g. for mixtures of Gaussians): random projections; projection on the principal subspace; two-step EM (= K-logK initialization + one more EM iteration)
18 Outline
- Paradigms for clustering
- Basic algorithms
- K-means clustering and the quadratic distortion
- Model based / soft clustering, the EM algorithm and maximizing likelihood
- Similarity based / graph clustering and the spectral clustering algorithm
- Further issues and current trends
19 Similarity based clustering
Paradigm: the features we observe are measures of similarity/dissimilarity between pairs of data points, e.g.:
- Image segmentation: points = pixels; features = distance in color space, location, whether the pixels are separated by a contour, whether they belong to the same texture
- Social networks: points = people; features = friendship, coauthorship, hyperlinks
- Text analysis: points = words; features = appearing in the same context
The features are summarized by a single similarity measure S_ij, e.g. S_ij = e^{−∑_k α_k feature_k(i,j)} for all points i, j, which is symmetric (S_ij = S_ji) and non-negative (S_ij ≥ 0).
Mathematically, we can see the data as an n×n matrix S = [S_ij], i.e. a (weighted) graph with points = graph nodes and similarity S_ij = weight of edge ij (meaningful because very few similarities are large).
Then clustering is cutting the graph.
20 (Figure: two example similarity matrices. Left: an artificial similarity matrix; right: UW Statistics co-authorship data, with S_ij = # tech reports co-authored.) Similarity based clustering is useful for R^d data as well, e.g. with S_ij = e^{−||x_i − x_j||²/σ²}.
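A minimal sketch of building the Gaussian similarity matrix above from vector data; the bandwidth sigma is a user choice, and the function name is hypothetical:

```python
import numpy as np

def gaussian_similarity(X, sigma):
    """S_ij = exp(-||x_i - x_j||^2 / sigma^2): symmetric, non-negative,
    and large only for nearby points."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / sigma**2)
```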
21 Criteria for clustering
- By graph cuts: remove some edges so that the graph becomes disconnected; the groups are the connected components
- By similar behavior: nodes i, j are in the same group iff i, j have the same pattern of transitions at the group level
- By embedding: map the graph nodes {1, 2, ..., n} to R^K, then use standard clustering methods (e.g. K-means)
- By diffusion distance
All are equivalent (approximately) when the data is clusterable.
22 Spectral clustering in a nutshell
23 Spectral clustering in a nutshell
Similarity S → preprocess (P = D^{−1} S) → top K eigenvectors v_{1:K} → data embedded by v (row i represents point i) → cluster with K-means
24 Definitions
- V = {1, 2, ..., n}
- node degree or volume: D_i = ∑_{j∈V} S_ij
- volume of a cluster C ⊆ V: D_C = ∑_{i∈C} D_i
- cut between subsets C, C' ⊆ V: Cut(C, C') = ∑_{i∈C} ∑_{j∈C'} S_ij
- Multiway Normalized Cut of a partition C = {C_{1:K}} of V:
MNCut(C) = ∑_{k=1}^K ∑_{k'≠k} Cut(C_k, C_{k'}) / D_{C_k}
In particular, for K = 2: MNCut({C, C'}) = Cut(C, C')/D_C + Cut(C, C')/D_{C'}
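A direct NumPy computation of these quantities for a partition given as a label vector; a small sketch, with the function name my own:

```python
import numpy as np

def mncut(S, labels):
    """Multiway normalized cut: sum over clusters of
    (total edge weight leaving the cluster) / (cluster volume)."""
    D = S.sum(axis=1)                       # node degrees D_i
    total = 0.0
    for k in np.unique(labels):
        in_k = labels == k
        cut = S[np.ix_(in_k, ~in_k)].sum()  # Cut(C_k, V \ C_k) = sum_{k' != k} Cut(C_k, C_k')
        total += cut / D[in_k].sum()        # divide by the cluster volume D_{C_k}
    return total
```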
25 Spectral clustering algorithm
Spectral Clustering Algorithm
Input: similarity matrix S, number of clusters K
1. Transform S: set D_i = ∑_{j=1}^n S_ij, i = 1:n (the node degrees), and form the transition matrix P = [P_ij]_{i,j=1:n} with P_ij = S_ij/D_i for i, j = 1:n
2. Compute the largest K eigenvalues λ_1 = 1 ≥ λ_2 ≥ ... ≥ λ_K and eigenvectors v_1, ..., v_K of P
3. Embed the data in the principal subspace: let V = [v_2 v_3 ... v_K] ∈ R^{n×(K−1)}, with x_i = the i-th row of V
4. (Orthogonal initialization) Find K initial centers:
4.1 take µ_1 randomly from x_1, ..., x_n
4.2 for k = 2, ..., K, set µ_k = argmin_{x_i} max_{k'<k} µ_{k'}^T x_i
5. Run the K-means algorithm on the data x_{1:n} starting from the centers µ_{1:K}
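A sketch of steps 1–3 and 5, reusing the kmeans sketch from the K-means slide and skipping the orthogonal initialization of step 4 for brevity; eigendecomposing the symmetric matrix D^{−1/2} S D^{−1/2} instead of P directly is my choice for numerical stability:

```python
import numpy as np

def spectral_clustering(S, K, rng=None):
    """Spectral clustering: embed points via the top eigenvectors of P = D^{-1} S,
    then run K-means on the embedded data."""
    D = S.sum(axis=1)                      # node degrees D_i
    # P = D^{-1} S is similar to the symmetric N = D^{-1/2} S D^{-1/2}:
    # if N u = lambda u, then v = D^{-1/2} u is an eigenvector of P.
    N = S / np.sqrt(np.outer(D, D))
    w, U = np.linalg.eigh(N)               # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:K]          # indices of the K largest eigenvalues
    V = U[:, top] / np.sqrt(D)[:, None]    # columns are eigenvectors v_1, ..., v_K of P
    X = V[:, 1:]                           # drop the constant v_1; row i embeds point i
    labels, _ = kmeans(X, K, rng=rng)      # step 5, using the earlier kmeans sketch
    return labels
```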
26 Properties of spectral clustering
- Arbitrary cluster shapes (main advantage)
- Mathematically elegant
- Practical up to medium sized problems; running time (by the Lanczos algorithm) is O(nK) per iteration
- Works well when K is known and not too large; estimating K is a problem
- Depends heavily on the similarity function (main problem); hence work on learning the similarities
- Outliers become separate clusters (the user must adjust K accordingly!)
- Very popular, with many variants that aim to improve on the above, e.g. diffusion maps, which normalize the eigenvectors as λ_k^t v_k
- Practical fix when K is large: only compute a fixed number d < K of eigenvectors; this avoids the effects of noise in the lower ranked eigenvectors
27 Understanding spectral clustering
- By graph cuts: spectral clustering minimizes MNCut(C) = ∑_{k=1}^K ∑_{k'≠k} Cut(C_k, C_{k'})/D_{C_k} (not the smallest K-way cut!)
- By similar behavior in the random walk on the graph
28 Understanding spectral clustering (2)
- By spectral embedding: the principal eigenvectors v_{1:K} form the orthogonal matrix Z with n rows and K columns that maximizes trace(Z^T L Z), with L = D^{−1/2} S D^{−1/2}. Compare with the K-means cost, which by (4) is ½ trace(Z^T A Z) with A_ij = ||x_i − x_j||² (minimized rather than maximized).
All are equivalent (approximately) when the data is clusterable.
29 Outline
- Paradigms for clustering
- Basic algorithms
- K-means clustering and the quadratic distortion
- Model based / soft clustering, the EM algorithm and maximizing likelihood
- Similarity based / graph clustering and the spectral clustering algorithm
- Further issues and current trends
30 Further issues and current trends (an incomplete list)
Selecting K:
- many ad-hoc methods
- BIC (a statistical model selection method) for mixture models; theoretically unsupported in clustering
- stability-based selection (a bootstrap-like method). Idea: if a clustering C is supported by the data, then it is stable to perturbations. Theoretical results are only preliminary, but the empirical evidence is promising.
Non-parametric clustering:
- mixture models with unbounded K (known as Dirichlet process mixtures, or Bayesian nonparametric clustering)
- methods based on a kernel density estimator that find the peaks of the density: Mean-Shift, Gaussian Blurring Mean-Shift
- level-set methods (find the high density regions)
- for similarity data: Affinity Propagation
Clusterability
Algorithms with guarantees for clusterable data
Scalable algorithms