SGN (4 cr) Chapter 11

Clustering
Jussi Tohka & Jari Niemi
Department of Signal Processing, Tampere University of Technology
February 25, 2014
J. Tohka & J. Niemi (TUT-SGN), SGN (4 cr) Chapter 11, February 25, 2014

Contents of This Lecture
1. Hierarchical Clustering
2. Quick Partitions
3. Mixture Models
4. Sum-of-squares
5. Spectral Clustering
6. Cluster Validity
7. Other Unsupervised Schemes

Material
Chapter 11 in WebCop:2011 and Section in HasTibFri:2009

What Should You Already Know?
Basics of hierarchical clustering, k-means, mixture models, and SOM, depending on the basic course you've taken.

Clustering
Given a set of points, divide these into c clusters based on their similarity.
- c may or may not be known.
- Different definitions of similarity give rise to different clustering algorithms.
More formally, given a set of points $D = \{x_1, \ldots, x_n\}$, the task is to place each of these into one of the c classes, i.e., to find c sets $D_1, \ldots, D_c$ so that
$$\bigcup_{i=1}^{c} D_i = D \quad \text{and} \quad D_j \cap D_i = \emptyset \ \text{for all } i \neq j.$$
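
The two conditions in the formal definition (the clusters cover D and are pairwise disjoint) can be checked mechanically. The following is a minimal sketch; the function name `is_partition` and the use of integer indices to stand for data points are assumptions made for illustration.

```python
def is_partition(data, clusters):
    """Return True if `clusters` is a partition of `data`:
    the clusters are pairwise disjoint and their union equals the data set."""
    seen = set()
    for cluster in clusters:
        for point in cluster:
            if point in seen:      # point already placed elsewhere: two clusters overlap
                return False
            seen.add(point)
    return seen == set(data)       # the union of all D_i must equal D


D = [0, 1, 2, 3, 4]                                 # indices of five data points
print(is_partition(D, [{0, 1}, {2}, {3, 4}]))       # a valid partition
print(is_partition(D, [{0, 1}, {1, 2}, {3, 4}]))    # point 1 appears in two clusters
print(is_partition(D, [{0, 1}, {3, 4}]))            # point 2 is not assigned
```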

Hierarchical Clustering: Hierarchical Methods
- Derive a clustering from a given dissimilarity matrix.
- A means for summarizing data structure via dendrograms.
- Divisive or agglomerative (the latter is much more common).
- Agglomerative clustering works by repeatedly merging the two closest clusters, starting from n clusters (one per data point).

Hierarchical Clustering: Hierarchical Methods 1 / 2
Different definitions of "the closest" yield different algorithms:
- Single-link or nearest neighbour
- Complete-link or farthest neighbour
- Average-link
- Ward's method
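
To make the role of the linkage definition concrete, here is a minimal sketch of naive agglomerative clustering from a dissimilarity matrix. It covers single-link, complete-link and average-link only; Ward's method is left out. The function name, the stopping rule (a target number of clusters) and the toy data are assumptions for illustration, not part of the lecture.

```python
def agglomerative(dist, n_clusters, linkage="single"):
    """Naive agglomerative clustering from a symmetric dissimilarity matrix.

    dist       : n x n list of lists of pairwise dissimilarities
    n_clusters : stop when this many clusters remain
    linkage    : "single", "complete" or "average"
    """
    def cluster_distance(a, b):
        pair_d = [dist[i][j] for i in a for j in b]
        if linkage == "single":        # nearest neighbour
            return min(pair_d)
        if linkage == "complete":      # farthest neighbour
            return max(pair_d)
        if linkage == "average":       # mean of all pairwise dissimilarities
            return sum(pair_d) / len(pair_d)
        raise ValueError("unknown linkage: " + linkage)

    clusters = [[i] for i in range(len(dist))]    # start from n singleton clusters
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):            # find the two closest clusters
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]   # merge cluster q into cluster p
        del clusters[q]
    return [sorted(c) for c in clusters]


# Five points on a line; the dissimilarity is the absolute difference.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]
print(agglomerative(dist, 2, "single"))
```

Recording the merge order and merge distances instead of stopping at a fixed number of clusters would give the information needed to draw a dendrogram.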

Hierarchical Clustering: Hierarchical Methods 2 / 2
- Single-link seeks isolated clusters but suffers from the chaining effect, which is usually undesirable.
- Complete-link, average-link and Ward's method tend to concentrate on internal cohesion, producing compact and often spherical clusters.
- A mathematical definition of clustering quality is difficult.

Quick Partitions
For initial partitions of data:
- Random k selection
- Variable division
- Leader algorithm
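
The slide only names the leader algorithm. The sketch below follows the commonly described one-pass version, which is an assumption about what the lecture intends: each point joins the first existing leader within a distance threshold, otherwise it becomes a new leader. The function name and the threshold value are illustrative.

```python
import math


def leader_clustering(points, threshold):
    """One-pass leader algorithm: returns the list of leaders and a label per point."""
    leaders = []
    labels = []
    for x in points:
        for j, leader in enumerate(leaders):
            if math.dist(x, leader) <= threshold:   # close enough to an existing leader
                labels.append(j)
                break
        else:                                       # no leader within the threshold
            leaders.append(x)
            labels.append(len(leaders) - 1)
    return leaders, labels


points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1), (0.1, 0.2)]
leaders, labels = leader_clustering(points, threshold=1.0)
print(leaders)
print(labels)
```

The result depends on the order of the points and on the threshold, which is why such a partition is typically used only as a starting point for another method.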

Mixture Models
A mixture density:
$$p(x) = \sum_{j=1}^{c} \pi_j \, p(x \mid \theta_j). \qquad (1)$$
The priors $\pi_j$ in mixture densities are called mixing parameters, and the class conditional densities $p(x \mid \theta_j)$ are called component densities. Most commonly the component densities are normal.

Mixture Models: EM Algorithm for Gaussian Mixture Models
1. Initialize $\mu_j^0$, $\Sigma_j^0$, $\pi_j^0 = P^0(\omega_j)$, set $t \leftarrow 0$.
2. (E-step) Compute the posterior probabilities
$$p_{ij}^{t+1} = \frac{\pi_j^t \, p_{\mathrm{normal}}(x_i \mid \mu_j^t, \Sigma_j^t)}{\sum_{k=1}^{c} \pi_k^t \, p_{\mathrm{normal}}(x_i \mid \mu_k^t, \Sigma_k^t)}.$$
3. (M-step) Update the parameter values
$$\pi_j^{t+1} = \frac{1}{n} \sum_i p_{ij}^{t+1} \qquad (2)$$
$$\mu_j^{t+1} = \frac{\sum_i p_{ij}^{t+1} x_i}{\sum_i p_{ij}^{t+1}} \qquad (3)$$
$$\Sigma_j^{t+1} = \frac{\sum_i p_{ij}^{t+1} (x_i - \mu_j^{t+1})(x_i - \mu_j^{t+1})^T}{\sum_i p_{ij}^{t+1}} \qquad (4)$$
4. Set $t \leftarrow t + 1$.
5. Stop if a convergence criterion is met, otherwise return to step 2.
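
The EM steps above can be sketched in NumPy as follows. Several choices here are assumptions that the slide leaves open: the initial means are supplied by the caller, all components start from the covariance of the whole data set with equal priors, a small ridge `reg` is added to each covariance to keep it invertible, and the convergence criterion is a small change in log-likelihood.

```python
import numpy as np


def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    solved = np.linalg.solve(cov, diff.T).T            # cov^{-1} (x - mean) for every row
    expo = -0.5 * np.sum(diff * solved, axis=1)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm


def em_gmm(X, init_means, n_iter=200, tol=1e-8, reg=1e-6):
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    means = np.array(init_means, dtype=float)          # step 1: initialisation
    c = len(means)
    covs = np.array([np.cov(X.T).reshape(d, d) + reg * np.eye(d) for _ in range(c)])
    priors = np.full(c, 1.0 / c)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability p_ij of component j given x_i
        weighted = np.column_stack(
            [priors[j] * gaussian_pdf(X, means[j], covs[j]) for j in range(c)])
        total = weighted.sum(axis=1, keepdims=True)
        post = weighted / total
        # M-step: equations (2), (3) and (4)
        nj = post.sum(axis=0)
        priors = nj / n
        means = (post.T @ X) / nj[:, None]
        for j in range(c):
            diff = X - means[j]
            covs[j] = (post[:, [j]] * diff).T @ diff / nj[j] + reg * np.eye(d)
        # Convergence check on the log-likelihood of the previous parameters
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return priors, means, covs, post


rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(5.0, 0.5, (100, 2))])
priors, means, covs, post = em_gmm(X, init_means=[[1.0, 1.0], [4.0, 4.0]])
print(priors)
print(means)
```

A hard clustering, if one is wanted, follows by assigning each point to the component with the largest posterior, e.g. `post.argmax(axis=1)`.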

Mixture Models: How Many Components?
Different information-theoretic model selection criteria:
- AIC
- BIC
- etc.
Model selection criteria evaluate the model fit while penalizing for the number of parameters in the model.
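
As a sketch of how such criteria trade fit against complexity, the standard definitions AIC = 2m - 2 ln L and BIC = m ln n - 2 ln L can be computed as below, where m is the number of free parameters and n the number of samples. The parameter count assumes a Gaussian mixture with full covariance matrices, and the log-likelihood values in the example are made up purely for illustration.

```python
import math


def gmm_num_params(c, d):
    """Free parameters of a c-component Gaussian mixture in d dimensions
    with full covariances: (c - 1) priors, c*d means, c*d*(d+1)/2 covariance entries."""
    return (c - 1) + c * d + c * d * (d + 1) // 2


def aic(log_likelihood, num_params):
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, n):
    return num_params * math.log(n) - 2 * log_likelihood


# Hypothetical maximised log-likelihoods for c = 1, 2, 3 on n = 200 two-dimensional samples.
hypothetical_ll = {1: -820.0, 2: -610.0, 3: -604.0}
for c, ll in hypothetical_ll.items():
    m = gmm_num_params(c, d=2)
    print(c, m, aic(ll, m), round(bic(ll, m, n=200), 1))
```

The model with the smallest criterion value is preferred; BIC penalizes additional components more heavily than AIC once n exceeds about 7, since ln n > 2.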

Mixture Models: Other Difficulties
- Switching problem
- Local minima (the risk can be reduced by a good initialization or a high-entropy initialization)
- The maximum likelihood solution might not exist

Sum-of-squares: Sum-of-Squares Criteria
Given a set of n data samples, partition the data into c clusters so that a clustering criterion is optimized. The simplest is the sum-of-squares (or k-means) criterion:
$$J(D_1, \ldots, D_c) = \sum_{i=1}^{c} \sum_{x \in D_i} \| x - \mu_i \|^2. \qquad (5)$$
- Various related criteria are based on scatter matrices.
- Produces sphere-like clusters.
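
Criterion (5) is simple to evaluate for any given partition. The sketch below takes $\mu_i$ to be the mean of cluster $D_i$ and compares two partitions of the same four points; the function name and the toy data are assumptions for illustration.

```python
def sum_of_squares(clusters):
    """Sum-of-squares criterion J: squared distances of points to their cluster mean."""
    total = 0.0
    for cluster in clusters:
        dim = len(cluster[0])
        mean = [sum(x[k] for x in cluster) / len(cluster) for k in range(dim)]
        total += sum(sum((x[k] - mean[k]) ** 2 for k in range(dim)) for x in cluster)
    return total


# Four points forming a tall, narrow rectangle.
compact = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]     # groups nearby points together
stretched = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]   # groups distant points together
print(sum_of_squares(compact))     # 4.0
print(sum_of_squares(stretched))   # 100.0
```

The partition with the smaller value of J is the better one under this criterion.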

Sum-of-squares: K-Means Algorithm
k-means algorithm
  Initialize $\mu_1(0), \ldots, \mu_c(0)$, set $t \leftarrow 0$
  repeat
    Classify each $x_1, \ldots, x_n$ to the class $D_j(t)$ whose mean vector $\mu_j(t)$ is the nearest to $x_i$.
    for k = 1 to c do
      update the mean vectors $\mu_k(t+1) = \frac{1}{|D_k(t)|} \sum_{x \in D_k(t)} x$
    end for
    Set $t \leftarrow t + 1$
  until clustering did not change
  Return $D_1(t-1), \ldots, D_c(t-1)$.
K-means demo
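
The pseudocode above translates almost line by line into NumPy. This is a minimal sketch: the initial means are passed in by the caller, and a cluster that becomes empty simply keeps its previous mean, which is an assumption since the slide does not cover that case.

```python
import numpy as np


def k_means(X, init_means, max_iter=100):
    X = np.asarray(X, dtype=float)
    means = np.array(init_means, dtype=float)
    labels = None
    for _ in range(max_iter):
        # Classify each x_i to the class whose mean vector is nearest
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                                   # clustering did not change
        labels = new_labels
        # Update each mean vector as the average of its current members
        for k in range(len(means)):
            members = X[labels == k]
            if len(members) > 0:                    # empty cluster: keep the old mean
                means[k] = members.mean(axis=0)
    return labels, means


X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, means = k_means(X, init_means=[[0, 0], [10, 10]])
print(labels)
print(means)
```

Different initial means can lead to different final clusterings, so in practice the algorithm is often run from several initializations and the result with the smallest sum-of-squares is kept.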

Sum-of-squares: K-Means Properties
[Figure: examples of problematic clustering tasks for k-means. Panels: TRUE, K-means, EM.]

Sum-of-squares: Fuzzy K-Means
Fuzzy k-means algorithm (here k is the number of clusters, c is a summation index, and r > 1 is the fuzziness exponent)
  Initialize $y_{ij}$, set $t \leftarrow 0$
  repeat
    for j = 1 to k do
      Compute the mean vectors $m_j = \frac{1}{\sum_{i=1}^{n} y_{ij}^r} \sum_{i=1}^{n} y_{ij}^r x_i$
    end for
    for j = 1 to k, i = 1 to n do
      Compute the distances as $d_{ij} = \| x_i - m_j \|$
    end for
    for j = 1 to k, i = 1 to n do
      Compute the membership function as $y_{ij} = \frac{1}{\sum_{c=1}^{k} (d_{ij} / d_{ic})^{2/(r-1)}}$
    end for
    Set $t \leftarrow t + 1$
  until clustering did not change
  Return membership values $y_{ij}$.
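
The fuzzy k-means loop above can be sketched in NumPy as follows. The initial memberships are supplied by the caller; the stopping rule (largest membership change below `tol`) and the small floor on distances, which avoids a division by zero when a point coincides with a mean, are assumptions added for the sketch.

```python
import numpy as np


def fuzzy_k_means(X, Y0, r=2.0, max_iter=100, tol=1e-6):
    """X: n x d data, Y0: n x k initial memberships (rows sum to one), r > 1."""
    X = np.asarray(X, dtype=float)
    Y = np.array(Y0, dtype=float)
    for _ in range(max_iter):
        W = Y ** r
        M = (W.T @ X) / W.sum(axis=0)[:, None]              # mean vectors m_j
        D = np.linalg.norm(X[:, None, :] - M[None, :, :], axis=2)
        D = np.maximum(D, 1e-12)                            # guard against d_ij = 0
        ratio = D[:, :, None] / D[:, None, :]               # ratio[i, j, c] = d_ij / d_ic
        Y_new = 1.0 / np.sum(ratio ** (2.0 / (r - 1.0)), axis=2)
        converged = np.max(np.abs(Y_new - Y)) < tol
        Y = Y_new
        if converged:
            break
    return Y, M


X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
Y0 = [[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3
Y, M = fuzzy_k_means(X, Y0)
print(np.round(Y, 3))
print(np.round(M, 3))
```

As r approaches 1 the memberships approach the hard 0/1 assignments of ordinary k-means, while larger r gives softer memberships.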

Sum-of-squares: Self-Organizing Feature Maps
The aim is to present high-dimensional data as a 1-D or 2-D array of numbers that captures the structure in the original data.

Spectral Clustering: Spectral Clustering 1 / 2
How to cluster?
Spectral clustering (normalized cuts,

Spectral Clustering: Spectral Clustering 2 / 2
- Idea/motivation: represent the data as a graph where the nodes are the data points and the edge weights are (inversely) proportional to the distances between the data points.
- Graphs do not have to be complete, i.e., every node pair does not have to be connected by an edge.
- The clustering is generated by making cuts to the graph (note that graph cuts in image segmentation are based on a similar idea, but the graph construction is different).
- Graph cuts are often impossible to compute exactly (NP-complete problems), so spectral clustering utilizes approximations based on linear algebra.
- There are other interpretations/motivations for spectral clustering.

Spectral Clustering: Graph Theory 1 / 2

Spectral Clustering: Graph Theory 2 / 2

Spectral Clustering: Selecting Edge Adjacencies
For example, the elements of the adjacency matrix A can be set to
$$a_{ij} = \exp\left(-\| x_i - x_j \|^2 / \sigma\right)$$
if $x_i$ is one of the k-nearest neighbours of $x_j$, and $a_{ij} = 0$ otherwise.
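
A minimal NumPy sketch of this adjacency construction is given below. The rule as stated is not symmetric (x_i can be a neighbour of x_j without the converse), so the sketch keeps an edge whenever either direction qualifies; that symmetrisation, the function name and the toy data are assumptions.

```python
import numpy as np


def knn_adjacency(X, k, sigma):
    X = np.asarray(X, dtype=float)
    n = len(X)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)   # squared distances
    A = np.zeros((n, n))
    for j in range(n):
        order = np.argsort(sq[j])
        neighbours = [i for i in order if i != j][:k]            # k nearest neighbours of x_j
        A[neighbours, j] = np.exp(-sq[neighbours, j] / sigma)
    return np.maximum(A, A.T)                                    # symmetrise (assumption)


X = [[0.0], [1.0], [2.5], [10.0]]
A = knn_adjacency(X, k=1, sigma=1.0)
print(np.round(A, 4))
```

Both k and σ control how sparse and how local the graph is, and they usually need tuning for the data at hand.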

Spectral Clustering: The Graph Laplacian
The nonsymmetric (Ncut) weighted graph Laplacian is
$$L = I - D^{-1} A,$$
where A is the adjacency matrix and D is the diagonal matrix with the entries
$$d_{ii} = \sum_{j=1}^{n} a_{ij}$$
on the diagonal.
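
As a small sketch, this Laplacian can be formed directly from an adjacency matrix. The code assumes every node has at least one edge, so that no $d_{ii}$ is zero; the three-node path graph is an illustrative example.

```python
import numpy as np


def ncut_laplacian(A):
    """Nonsymmetric (Ncut) graph Laplacian L = I - D^{-1} A."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                         # d_ii = sum_j a_ij
    return np.eye(len(A)) - A / d[:, None]    # dividing row i by d_ii applies D^{-1}


# Path graph 0 - 1 - 2 with unit edge weights.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
L = ncut_laplacian(A)
print(L)
print(L.sum(axis=1))   # each row sums to zero
```

Because every row of $D^{-1}A$ sums to one, the constant vector is always an eigenvector of L with eigenvalue zero.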

Spectral Clustering: Clustering Algorithm (k Clusters)
1. Form the graph adjacency matrix based on the data.
2. Compute the Laplacian.
3. Solve the generalized eigenvalue problem $(D - A)v = \lambda D v$ (equivalently, $Lv = \lambda v$ for $L = I - D^{-1}A$) and select the eigenvectors $v_1, \ldots, v_k$ corresponding to the k smallest eigenvalues. Datapoint $x_1$ is mapped to $\tilde{x}_1 = [v_{11}, \ldots, v_{k1}]^T$ and so on.
4. Perform k-means clustering of the mapped datapoints $\tilde{x}_i$.
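
The four steps can be sketched in NumPy as below, starting from a given adjacency matrix. Two implementation choices are assumptions: the eigenvectors are obtained from the symmetric matrix $I - D^{-1/2} A D^{-1/2}$ and rescaled by $D^{-1/2}$, which yields solutions of $(D - A)v = \lambda D v$ while allowing the symmetric solver `numpy.linalg.eigh`; and the small k-means helper uses a farthest-point initialisation to keep the example deterministic.

```python
import numpy as np


def spectral_embedding(A, k):
    """Rows are the mapped data points [v_1i, ..., v_ki]."""
    A = np.asarray(A, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, U = np.linalg.eigh(L_sym)              # eigenvalues in ascending order
    return d_inv_sqrt[:, None] * U[:, :k]     # v = D^{-1/2} u for the k smallest eigenvalues


def simple_k_means(Z, k, max_iter=100):
    means = [Z[0]]                            # farthest-point initialisation (assumption)
    while len(means) < k:
        nearest = np.min([np.linalg.norm(Z - m, axis=1) for m in means], axis=0)
        means.append(Z[nearest.argmax()])
    means = np.array(means)
    labels = None
    for _ in range(max_iter):
        new = np.linalg.norm(Z[:, None, :] - means[None, :, :], axis=2).argmin(axis=1)
        if labels is not None and np.array_equal(new, labels):
            break
        labels = new
        for j in range(k):
            if np.any(labels == j):
                means[j] = Z[labels == j].mean(axis=0)
    return labels


def spectral_clustering(A, k):
    return simple_k_means(spectral_embedding(A, k), k)


# Two triangles joined by one weak edge between nodes 2 and 3.
A = [[0, 1, 1, 0,    0, 0],
     [1, 0, 1, 0,    0, 0],
     [1, 1, 0, 0.01, 0, 0],
     [0, 0, 0.01, 0, 1, 1],
     [0, 0, 0,    1, 0, 1],
     [0, 0, 0,    1, 1, 0]]
print(spectral_clustering(A, 2))
```

For large sparse graphs a sparse eigensolver that computes only the k smallest eigenpairs would be the more practical choice.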

Cluster Validity
Evaluation of the clustering: is the cluster structure a property of the data (as it should be) or imposed by a particular clustering algorithm (as it should not be)?
Very, very difficult in high dimensions.
Different criteria:
1. Internal
2. External
3. Relative
Based on these criteria, we can statistically test different hypotheses on the cluster validity.

Cluster Validity: Evaluating Partitions: Rand Index
From Wikipedia
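
The slide's own content is not preserved in this transcription, so the following sketch uses the standard definition of the Rand index: the fraction of point pairs on which two partitions agree, i.e. pairs placed together in both or apart in both. The function name and the example labelings are illustrative.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which the two partitions agree."""
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:       # together in both, or apart in both
            agree += 1
        pairs += 1
    return agree / pairs


print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))   # same partition, labels renamed: 1.0
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))   # agreement on 2 of the 6 pairs
```

The index depends only on the partitions, not on the label names, which is why the first comparison gives a perfect score.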

Cluster Validity: Evaluating Partitions: Adjusted Rand Index
From Wikipedia
Application example: J.-P. Kauppi, I.P. Jaaskelainen, M. Sams, and J. Tohka. Clustering Inter-Subject Correlation Matrices in Functional Magnetic Resonance Imaging. IEEE-ITAB
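
As with the Rand index, the slide's content is not preserved here, so this sketch uses the standard contingency-table form of the adjusted Rand index, which corrects the pair-counting agreement for the value expected by chance. Returning 1.0 in the degenerate case where the denominator is zero is an assumption of this sketch.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))                    # contingency table n_ij
    sum_cells = sum(comb(v, 2) for v in cells.values())
    sum_a = sum(comb(v, 2) for v in Counter(labels_a).values()) # row sums
    sum_b = sum(comb(v, 2) for v in Counter(labels_b).values()) # column sums
    expected = sum_a * sum_b / comb(n, 2)                       # chance-level agreement
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:                                   # degenerate case (assumption)
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))   # identical partitions: 1.0
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))   # worse than chance: -0.5
```

Unlike the plain Rand index, the adjusted version is close to zero for random labelings and can be negative.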

Other Unsupervised Schemes
Various other unsupervised learning tasks exist in addition to clustering; we give just one example.

Other Unsupervised Schemes: PageRank 1 / 2
- We have N web pages and wish to rank them in terms of importance.
- The Google PageRank algorithm considers a webpage to be important if many other webpages point to it.
- The linking webpages that point to a given page are not treated equally: the algorithm also takes into account both the importance (PageRank) of the linking pages and the number of outgoing links that they have.

Other Unsupervised Schemes: PageRank 2 / 2
- $L_{ij} = 1$ if page j points to page i, and zero otherwise.
- $c_j = \sum_{i=1}^{N} L_{ij}$ (the number of outlinks of page j).
- The PageRanks $p_i$ are defined recursively via
$$p_i = (1 - d) + d \sum_{j=1}^{N} \frac{L_{ij}}{c_j} \, p_j,$$
where d = 0.85, or in matrix form
$$p = (1 - d)\mathbf{1} + d \, L \, \mathrm{diag}(c)^{-1} p.$$
- It can be shown that after proper normalization we get $p = Ap$, where A has one as its largest eigenvalue.
- Again an eigenvalue problem, this time one solvable by the power method.
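
The recursion above can be iterated directly, which amounts to the power method when the PageRanks are normalised to average one. The sketch below assumes every page has at least one outlink (so that no $c_j$ is zero), starts from $p = \mathbf{1}$, and runs a fixed number of iterations; the four-page link structure is an illustrative example.

```python
def pagerank(L, d=0.85, n_iter=100):
    """Iterate p_i = (1 - d) + d * sum_j (L_ij / c_j) p_j, starting from p = 1."""
    N = len(L)
    c = [sum(L[i][j] for i in range(N)) for j in range(N)]   # outlinks of page j
    p = [1.0] * N
    for _ in range(n_iter):
        p = [(1 - d) + d * sum(L[i][j] / c[j] * p[j] for j in range(N))
             for i in range(N)]
    return p


# L[i][j] = 1 if page j points to page i.
# Page 0 links to pages 1 and 2; page 1 links to page 2;
# page 2 links to page 0; page 3 links to page 2 and has no inlinks.
L = [[0, 0, 1, 0],
     [1, 0, 0, 0],
     [1, 1, 0, 1],
     [0, 0, 0, 0]]
p = pagerank(L)
print([round(v, 2) for v in p])
```

Page 3 has no incoming links and therefore receives the minimum score 1 - d = 0.15, while page 2, which every other page links to, ranks highest.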

Summary
1. Clustering is the partitioning (or classifying) of a given data set directly, without labeled training data.
2. The missing training data is replaced by a user-defined structure imposed on the pattern space.
3. Various approaches, e.g. dissimilarity measures, mixture models, spectral methods.
4. Cluster result evaluation: measures for cluster validity.
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Introduction Often, we have only a set of features x = x 1, x 2,, x n, but no associated response y. Therefore we are not interested in prediction nor classification,