CS 6140: Machine Learning Spring 2017
- Lionel Wood
- 6 years ago
1 CS 6140: Machine Learning Spring 2017 Instructor: Lu Wang College of Computer and Information Science Northeastern University Webpage:
2 Grades for Assignment 1 will be out next week. Assignment 3 is out and due on 03/30. Project progress report is due on 03/16. Hard copy in class.
3 Project progress report What changes have you made for the task? No change at all? Changed the data, or something else? Describe data preprocessing: What are the features? Numerical or categorical? Do you use all the data or part of it? What method have you tried? E.g., regression, SVM
4 Project progress report What results do you have now? Baselines? Metrics? Precision, recall, F1, accuracy. How do your results compare to the baselines?
5 What we learned Dimension (or feature) reduction Principal component analysis (PCA) Singular value decomposition (SVD)
6
7 What is Principal Component Analysis? Principal component analysis (PCA): reduce the dimensionality of a data set by finding a new set of variables, smaller than the original set of variables, that retains most of the sample's information. Useful for the compression and classification of data.
8 Geometric picture of principal components (PCs) z_1, the 1st PC, is a minimum-distance fit to a line in X space. z_2, the 2nd PC, is a minimum-distance fit to a line in the plane perpendicular to the 1st PC. PCs are a series of linear least-squares fits to a sample, each orthogonal to all the previous ones.
9 Algebraic definition of PCs The first PC is z_1 = a_1^T x. To find a_1, note that var[z_1] = E[(z_1 − E[z_1])^2] = (1/n) Σ_{i=1}^n (a_1^T x_i − a_1^T x̄)^2 = a_1^T S a_1, where S = (1/n) Σ_{i=1}^n (x_i − x̄)(x_i − x̄)^T is the covariance matrix and x̄ = (1/n) Σ_{i=1}^n x_i is the mean. In the following, we assume the data is centered: x̄ = 0.
10 Algebraic derivation of PCs We find that a_2 is also an eigenvector of S, whose eigenvalue λ_2 is the second largest. In general, var[z_k] = a_k^T S a_k = λ_k: the k-th largest eigenvalue of S is the variance of the k-th PC. z_k, the k-th PC in the sample, retains the k-th greatest fraction of the variation.
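The derivation above (var[z_k] = a_k^T S a_k = λ_k) can be sketched directly in NumPy. This is an illustrative example, not code from the slides; the function name `pca` and the synthetic data are mine.

```python
import numpy as np

def pca(X, k):
    """Project data onto the top-k principal components.

    Returns (Z, A, variances): the k-dimensional scores, the
    eigenvectors a_1..a_k as columns, and their eigenvalues.
    """
    X = X - X.mean(axis=0)                 # center the data (x-bar = 0)
    S = (X.T @ X) / len(X)                 # covariance matrix S
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]  # keep the k largest
    A = eigvecs[:, order]
    return X @ A, A, eigvals[order]

# Data stretched mainly along one axis: the first PC captures that direction.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])
Z, A, var = pca(X, 2)
```

The variance of each score column Z[:, k] equals the corresponding eigenvalue, matching var[z_k] = λ_k.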
11 PCA for image compression Reconstructions with d = 1, 2, 4, 8, 16, 32, 64, 100 components, compared with the original image.
12
13
14
15
16
17 CUR decomposition In large-data settings it is normal for the matrix A being decomposed (e.g., a documents × terms matrix) to be very sparse. With SVD, even if A is sparse, U and V will be dense.
18
19
20
21
22 Today's Outline Clustering K means Hierarchical clustering Spectral clustering [some of the slides are borrowed from David Blei]
23 Clustering Goal: segment data into groups of similar points
24 Clustering Goal: segment data into groups of similar points When and why would we want to do this?
25 Clustering Goal: segment data into groups of similar points When and why would we want to do this? Useful for: organizing data; understanding hidden structure in data; summarizing high-dimensional data in a low-dimensional space
26 Clustering Examples Grouping Facebook users according to their interests Image search Topic discovery in documents
27 Setup
28
29 Segment this data into k groups What is a good distance function?
30 Squared Euclidean distance: d(x, y) = ‖x − y‖² = Σ_j (x_j − y_j)²
31 segment this data into k groups What should k be?
32 segment this data into k groups For example, k is 4
33 K means
34 K means The basic idea is to describe each cluster by its mean value. Goal: assign data to clusters and define these clusters with their means.
35 K means algorithm
36 K means algorithm
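As a concrete illustration of the alternating assign/update structure of K means, here is a minimal NumPy sketch of Lloyd's algorithm. It is not the slides' code; the function name and toy data are mine.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # init from data points
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center's cluster.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center moves to the mean of its cluster.
        new_centers = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    objective = d2.min(axis=1).sum()  # within-cluster sum of squared distances
    return labels, centers, objective

# Two well-separated blobs: k-means recovers them.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
labels, centers, obj = kmeans(X, k=2)
```

Because the algorithm only finds a local minimum, a common practice is to rerun it with several seeds and keep the result with the lowest objective.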
37 Example: Start
38
39
40
41
42
43 K means evaluation
44 Coordinate descent
45 Coordinate descent However, it finds a local minimum. (Multiple restarts are often necessary.)
46 for the example data
47 Example: compressing images
48 Each pixel is associated with a red, green, and blue value. A 1024 × 1024 image is a collection of values <x1, x2, x3>, which requires 3 MB of storage. How can we use k-means to compress this image?
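The compression idea can be sketched as follows: cluster the RGB pixel vectors into k palette colors, then store one small palette index per pixel plus the k × 3 palette. This is an illustrative sketch with a synthetic "image" and a tiny inlined k-means; names and data are mine, not the slides'.

```python
import numpy as np

def compress_image(pixels, k, n_iter=50, seed=0):
    """Quantize RGB pixels to k palette colors with a tiny k-means.

    pixels: (n, 3) float array. Returns (indices, palette): per-pixel
    palette indices plus the k x 3 palette of cluster mean colors.
    """
    rng = np.random.default_rng(seed)
    palette = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((pixels[:, None] - palette[None]) ** 2).sum(-1)
        idx = d2.argmin(1)
        palette = np.array([pixels[idx == j].mean(0) if (idx == j).any()
                            else palette[j] for j in range(k)])
    return idx, palette

# Synthetic 64x64 "image": half reddish pixels, half bluish pixels.
rng = np.random.default_rng(2)
img = np.vstack([np.clip(rng.normal([200, 30, 30], 10, (2048, 3)), 0, 255),
                 np.clip(rng.normal([30, 30, 200], 10, (2048, 3)), 0, 255)])
idx, palette = compress_image(img, k=2)
reconstructed = palette[idx]
# Storage drops from 3 bytes per pixel to log2(k) bits per pixel plus the palette.
```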
49
50
51
52
53
54
55
56
57 Measure of distortion Less distorted with more clusters
58 K-medoids In many settings, Euclidean distance is not appropriate: discrete data, such as purchase histories or movie watching histories. k-medoids is an algorithm that only requires knowing distances between data points. No need to define the mean.
59 K-medoids In many settings, Euclidean distance is not appropriate: discrete data, such as purchase histories or movie watching histories. k-medoids is an algorithm that only requires knowing distances between data points. No need to define the mean. Each of the clusters is associated with its most typical example.
60 k-medoids algorithm
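One common alternating variant of k-medoids (assign each point to its nearest medoid, then replace each medoid by the cluster member minimizing total within-cluster distance) can be sketched as below. This is a hedged illustration, not necessarily the exact algorithm on the slide; it uses only a precomputed distance matrix, which is the point of k-medoids.

```python
import numpy as np

def kmedoids(D, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed n x n distance matrix D.

    Only pairwise distances are needed: no means are ever computed, so
    this also works for discrete data such as purchase histories.
    """
    n = len(D)
    rng = np.random.default_rng(seed)
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)        # nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if len(members):
                # The member minimizing total distance to its own cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[within.argmin()]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return D[:, medoids].argmin(axis=1), medoids

# Two separated groups on a line; D is the absolute-difference distance.
x = np.concatenate([np.arange(10), np.arange(100, 110)]).astype(float)
D = np.abs(x[:, None] - x[None, :])
labels, medoids = kmedoids(D, k=2)
```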
61 Choosing k Choosing k is a nagging problem in cluster analysis. Sometimes, the problem determines k: clustering customers for k salespeople in a business. Otherwise, it is not well-defined.
62 What happens as k increases?
70 A kink in the objective
71 Today's Outline Clustering K means Hierarchical clustering Spectral clustering
72 Hierarchical clustering Hierarchical clustering is a widely used data analysis tool. The idea is to build a binary tree of the data that successively merges similar groups of points. Visualizing this tree provides a useful summary of the data.
73 Hierarchical clustering vs. k-means Recall that k-means or k-medoids requires: a number of clusters k; an initial assignment of data to clusters; a distance measure between data points. Hierarchical clustering only requires a measure of similarity between groups of data points.
74 Agglomerative clustering Algorithm: place each data point into its own singleton group. Repeat: merge the two closest groups, until all the data are merged into a single cluster.
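The algorithm above can be written in a few lines: maintain a list of groups, repeatedly merge the closest pair under some inter-group distance. A naive O(n³)-ish sketch (illustrative names and toy data, not the slides' code):

```python
import numpy as np

def agglomerate(X, group_dist):
    """Greedy agglomerative clustering: start from singletons, repeatedly
    merge the two closest groups until one cluster remains.

    Returns the merge sequence as (members_a, members_b, distance) triples.
    """
    groups = [[i] for i in range(len(X))]
    merges = []
    while len(groups) > 1:
        # Find the closest pair of groups under the supplied distance.
        a, b = min(((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
                   key=lambda ij: group_dist(X[groups[ij[0]]], X[groups[ij[1]]]))
        merges.append((groups[a], groups[b], group_dist(X[groups[a]], X[groups[b]])))
        groups = [g for i, g in enumerate(groups) if i not in (a, b)] + [groups[a] + groups[b]]
    return merges

# Average linkage: mean pairwise Euclidean distance between the two groups.
def average_linkage(A, B):
    return np.sqrt(((A[:, None] - B[None]) ** 2).sum(-1)).mean()

X = np.array([[0.0], [0.1], [5.0], [5.1], [5.2]])
merges = agglomerate(X, average_linkage)
```

The merge sequence is exactly the dendrogram data: each triple is one level of the tree, and the last merge joins the two top-level clusters.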
75
76
77
78
79
80
81
82
83
84
85
86 Agglomerative clustering Each level of the tree is a segmentation of the data. The algorithm results in a sequence of groupings. It is up to the user to choose a natural clustering from this sequence.
87 Dendrogram Agglomerative clustering is monotonic: the similarity between merged clusters is monotone decreasing with the level of the merge. Dendrogram: plot each merge at the similarity between the two merged groups. Provides an interpretable visualization of the algorithm and data. A useful tool, part of why hierarchical clustering is popular.
88 [source: mmds.org]
89 Group similarity
90
91
92 Caveats of intergroup similarity Single linkage can produce chaining, where a sequence of close observations in different groups causes early merges of those groups.
93 Caveats of intergroup similarity Single linkage can produce chaining, where a sequence of close observations in different groups causes early merges of those groups. Complete linkage has the opposite problem: it might not merge close groups because of outlier members that are far apart.
94 Caveats of intergroup similarity Single linkage can produce chaining, where a sequence of close observations in different groups causes early merges of those groups. Complete linkage has the opposite problem: it might not merge close groups because of outlier members that are far apart. Group average represents a natural compromise, but depends on the scale of the similarities: applying a monotone transformation to the similarities can change the results.
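The three intergroup similarities can be computed side by side to see the outlier effect described above. A small NumPy sketch (illustrative data; the single outlier in group B dominates complete linkage but barely moves single linkage):

```python
import numpy as np

def pairwise(A, B):
    """All Euclidean distances between rows of A and rows of B."""
    return np.sqrt(((A[:, None] - B[None]) ** 2).sum(-1))

def single_linkage(A, B):   return pairwise(A, B).min()    # closest pair
def complete_linkage(A, B): return pairwise(A, B).max()    # farthest pair
def average_linkage(A, B):  return pairwise(A, B).mean()   # mean pairwise

# Two nearby groups, but group B contains one far-away outlier.
A = np.array([[0.0, 0.0], [0.2, 0.0]])
B = np.array([[1.0, 0.0], [1.2, 0.0], [9.0, 0.0]])

d_single = single_linkage(A, B)      # driven by the closest pair only
d_complete = complete_linkage(A, B)  # driven by the outlier alone
d_average = average_linkage(A, B)    # a compromise between the two
```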
95 Caveats Hierarchical clustering should be treated with caution. Different decisions about group similarity can lead to vastly different dendrograms. The algorithm imposes a hierarchical structure on the data, even on data for which such structure is not appropriate.
96 Today's Outline Clustering K means Hierarchical clustering Spectral clustering [Some slides are borrowed from Royi Itzhak]
97 Spectral Clustering Algorithms that cluster points using eigenvectors of matrices derived from the data. Obtain a representation of the data in a low-dimensional space that can be easily clustered. Difficult to understand, but easy to implement.
98 Elements of Graph Theory A graph G = (V, E) consists of a vertex set V and an edge set E. If G is a directed graph, each edge is an ordered pair of vertices. A bipartite graph is one in which the vertices can be divided into two groups, so that all edges join vertices in different groups.
99 Similarity Graph Represent the dataset as a weighted graph G(V, E). V = {x_i}: set of n vertices {v_1, v_2, ..., v_n} representing data points. E = {W_ij}: set of weighted edges indicating pair-wise similarity between points.
100 Similarity Graph W_ij represents the similarity between vertices x_i and x_j. If W_ij = 0, there is no similarity. Set W_ii = 0.
101 The idea Clustering can be viewed as a task on the similarity graph: divide V into two disjoint groups (A, B), with V = A ∪ B. Graph partitioning is NP-hard!
102 Clustering objectives Objectives of a good clustering: 1. Points assigned to the same cluster should be highly similar. 2. Points assigned to different clusters should be highly dissimilar.
103 Clustering objectives Objectives of a good clustering: 1. Points assigned to the same cluster should be highly similar. 2. Points assigned to different clusters should be highly dissimilar. Apply these objectives to our graph representation: minimize the weight of between-group connections.
104 Graph Cuts Express the partition quality as a function of the edge cut of the partition. Cut: set of edges with only one vertex in a group. We want to find the minimal cut between groups; the groups with the minimal cut give the partition. cut(A, B) = Σ_{i∈A, j∈B} w_ij
105 Graph Cut Criteria Criterion: minimum cut. Minimize the weight of connections between groups: min cut(A, B). Degenerate case: the minimum cut can simply separate a single outlying vertex. Problem: only considers external cluster connections; does not consider internal cluster density.
106 Graph Cut Criteria Criterion: normalized cut (Shi & Malik, '97). Consider the connectivity between groups relative to the density of each group: min Ncut(A, B), where Ncut(A, B) = cut(A, B)/vol(A) + cut(A, B)/vol(B). Normalize the association between groups by volume. vol(A): the total weight of the edges originating from group A. Why use this criterion? It produces more balanced partitions.
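Both criteria are straightforward to compute from the weight matrix. A small NumPy sketch (the toy graph and function names are mine): two triangles joined by one weak edge, where the balanced cut has a far smaller Ncut than splitting off a single vertex.

```python
import numpy as np

def cut(W, A, B):
    """Total weight of edges crossing between vertex sets A and B."""
    return W[np.ix_(A, B)].sum()

def ncut(W, A, B):
    """Normalized cut: cut(A,B)/vol(A) + cut(A,B)/vol(B)."""
    vol = lambda S: W[S].sum()  # total weight of edges originating from S
    c = cut(W, A, B)
    return c / vol(A) + c / vol(B)

# Toy graph: two triangles joined by one weak edge (vertex 2 -- vertex 3).
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w
A, B = [0, 1, 2], [3, 4, 5]
```

Comparing `ncut(W, A, B)` with `ncut(W, [0], [1, 2, 3, 4, 5])` shows why the normalized criterion prefers the balanced partition: the degenerate single-vertex cut is heavily penalized by its tiny volume.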
107 Example: 2 Spirals The dataset exhibits complex cluster shapes. K-means performs very poorly in this space due to its bias toward dense spherical clusters. In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.
108 Spectral Graph Theory Possible approach: represent the similarity graph as a matrix and apply knowledge from linear algebra. The eigenvalues and eigenvectors of a matrix provide global information about its structure: W x = λ x. Spectral Graph Theory: analyze the spectrum of the matrix representing a graph. Spectrum: the eigenvectors of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues, Λ = {λ_1, λ_2, ..., λ_n}.
109 Adjacency matrix (A) n × n matrix A = [w_ij]: edge weight between vertices x_i and x_j. Important properties: symmetric matrix; eigenvalues are real; eigenvectors form an orthogonal basis. (Example: a 6-vertex graph and its adjacency matrix over x_1, ..., x_6.)
110 Degree matrix (D) n × n diagonal matrix with D(i, i) = Σ_j w_ij: the total weight of edges incident to vertex x_i. Important application: normalizing the adjacency matrix. (Example: the degree matrix of the same 6-vertex graph.)
111 Laplacian matrix (L) n × n symmetric matrix L = D − A. Important properties: eigenvalues are non-negative real numbers; eigenvectors are real and orthogonal. The eigenvalues and eigenvectors provide insight into the connectivity of the graph. (Example: the Laplacian of the same 6-vertex graph.)
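The three matrices and the Laplacian's stated properties can be checked numerically. A minimal sketch (the 6-vertex toy graph is mine, standing in for the slide's example): the eigenvalues come out non-negative and the smallest is 0, with the constant vector as its eigenvector.

```python
import numpy as np

# Adjacency matrix W of a small weighted graph: two triangles
# (vertices 0-2 and 3-5) joined by one weak edge between 2 and 3.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

D = np.diag(W.sum(axis=1))          # degree matrix: row sums on the diagonal
L = D - W                           # (unnormalized) Laplacian
eigvals, eigvecs = np.linalg.eigh(L)  # real, ascending eigenvalues
```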
112 Another normalized Laplacian n × n symmetric matrix L' = D^{−1/2} L D^{−1/2}. Important properties: eigenvectors are real and normalized; each off-diagonal entry L'_ij (i ≠ j) equals L_ij / √(D_ii D_jj).
113 Finding a Min-Cut (Hall '70, Fiedler '73) Express a bi-partition (A, B) as a vector p: p_i = +1 if x_i ∈ A, p_i = −1 if x_i ∈ B. We can minimize the cut of the partition by finding a nontrivial vector p that minimizes the function f(p) = (1/2) Σ_{i,j∈V} w_ij (p_i − p_j)² = p^T L p (L: the Laplacian matrix).
114 Finding a Min-Cut (Hall '70, Fiedler '73) Express a bi-partition (A, B) as a vector p: p_i = +1 if x_i ∈ A, p_i = −1 if x_i ∈ B. We can minimize the cut of the partition by finding a nontrivial vector p that minimizes the function f(p) = (1/2) Σ_{i,j∈V} w_ij (p_i − p_j)² = p^T L p (L: the Laplacian matrix). The Rayleigh Theorem shows: the minimum value of f(p) is given by the 2nd smallest eigenvalue of the Laplacian L (the smallest eigenvalue is 0, which corresponds to the constant all-ones eigenvector). The optimal solution for p is given by the corresponding eigenvector, referred to as the Fiedler vector.
115 Spectral Clustering Algorithms Three basic stages: 1. Pre-processing: construct a matrix representation of the dataset. 2. Decomposition: compute eigenvalues and eigenvectors of the matrix; map each point to a lower-dimensional representation based on one or more eigenvectors. 3. Grouping: assign points to two or more clusters, based on the new representation.
116 Spectral Algorithm 1. Pre-processing: build the Laplacian matrix L of the graph. 2. Decomposition: find the eigenvalues and eigenvectors of L; map vertices to the corresponding components of the eigenvector for λ_2. (Example: the Laplacian of the 6-vertex graph, its spectrum Λ, eigenvector matrix X, and the λ_2-eigenvector component for each vertex.)
117 Spectral Algorithm The matrix X represents the eigenvectors of the Laplacian matrix.
118 Spectral Grouping Sort the components of the reduced 1-dimensional vector. Form clusters by splitting the sorted vector in two. How to choose a splitting point? Naïve approaches: split at 0, or at the mean or median value. More expensive approaches: attempt to minimize the normalized cut criterion in 1 dimension. (Example: split at 0; cluster A: points with positive components; cluster B: points with negative components.)
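The whole pipeline (Laplacian, Fiedler vector, naive split at 0) fits in a few lines. An illustrative sketch on a toy graph of my own choosing: two triangles joined by a weak edge, which the sign of the Fiedler vector separates cleanly.

```python
import numpy as np

# Two triangles (0-2 and 3-5) joined by one weak edge between 2 and 3.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

L = np.diag(W.sum(axis=1)) - W          # unnormalized Laplacian
eigvals, eigvecs = np.linalg.eigh(L)    # ascending eigenvalues
fiedler = eigvecs[:, 1]                 # eigenvector of the 2nd smallest eigenvalue
labels = (fiedler > 0).astype(int)      # naive grouping: split at 0
```

(The two sides of the split may come out as 0/1 or 1/0, since an eigenvector is only defined up to sign.)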
119 3-Clusters
120 K-Way Spectral Clustering How do we partition a graph into k clusters? Two basic approaches: 1. Recursive bi-partitioning (Hagen et al., '91): recursively apply the bi-partitioning algorithm in a hierarchical divisive manner. Disadvantages: inefficient, unstable. 2. Cluster multiple eigenvectors (Shi & Malik, '00): build a reduced space from multiple eigenvectors. Commonly used; a preferable approach, though it is much like doing PCA and then k-means.
121 Recursive bi-partitioning (Hagen et al., '91) Partition using only one eigenvector at a time; use the procedure recursively. Example: image segmentation. Uses the 2nd smallest eigenvector to define the optimal cut. Recursively generates two clusters with each cut.
122 Why use Eigenvectors? 1. Approximates the optimal cut (Shi & Malik, '00): can be used to approximate the k-way normalized cut. 2. Emphasizes cohesive clusters (Brand & Huang, '02): increases the unevenness in the distribution of the data; affinities between similar points are amplified, affinities between dissimilar points are attenuated. The data begins to approximate a clustering. 3. Well-separated space: transforms data to a new embedded space, consisting of k orthogonal basis vectors.
123 K-Eigenvector Clustering K-eigenvector algorithm (Ng et al., '01) 1. Pre-processing: construct the scaled adjacency matrix A' = D^{−1/2} A D^{−1/2}. 2. Decomposition: find the eigenvalues and eigenvectors of A'; build the embedded space from the eigenvectors corresponding to the k largest eigenvalues. 3. Grouping: apply k-means to the reduced n × k space to produce k clusters.
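The embedding stage of the Ng et al. algorithm can be sketched as below (the grouping stage would then run k-means on the rows of U). This is a hedged illustration: the row normalization follows the usual presentation of the algorithm, and the ring-of-cliques toy graph is mine. Rows from the same cluster land nearly on top of each other in the embedded space.

```python
import numpy as np

def spectral_embed(W, k):
    """Embedding stage of k-eigenvector clustering: take the eigenvectors
    of the scaled adjacency matrix D^{-1/2} W D^{-1/2} for the k largest
    eigenvalues, then normalize each row to unit length."""
    d = W.sum(axis=1)
    A = W / np.sqrt(np.outer(d, d))             # scaled adjacency matrix
    eigvals, eigvecs = np.linalg.eigh(A)        # ascending eigenvalues
    U = eigvecs[:, -k:]                         # k largest eigenvalues
    return U / np.linalg.norm(U, axis=1, keepdims=True)

# Three 4-vertex cliques, weakly connected in a ring.
n = 12
W = np.zeros((n, n))
for c in range(3):
    for i in range(4 * c, 4 * c + 4):
        for j in range(4 * c, 4 * c + 4):
            if i != j:
                W[i, j] = 1.0
for i, j in [(3, 4), (7, 8), (11, 0)]:
    W[i, j] = W[j, i] = 0.1
U = spectral_embed(W, k=3)
```

Running k-means on the rows of U would then produce the three clusters; because same-cluster rows nearly coincide, the clustering is trivial in the embedded space.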
124 Aside: How to select k? Eigengap: the difference between two consecutive eigenvalues, Δ_k = |λ_k − λ_{k−1}|. The most stable clustering is generally given by the value k that maximizes the eigengap: k* = argmax_k Δ_k. (Example: for the largest eigenvalues of the CISI/Medline data, max Δ_k = |λ_2 − λ_1|, so choose k = 2.)
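The eigengap heuristic can be sketched on the scaled adjacency spectrum of a graph with planted clusters (an illustrative toy example of mine, not the slide's CISI/Medline data): with two weakly coupled triangles, two eigenvalues sit near 1 and the largest gap falls right after them, so the heuristic picks k = 2.

```python
import numpy as np

# Two triangles (0-2 and 3-5) joined by one very weak edge.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.05

d = W.sum(axis=1)
A = W / np.sqrt(np.outer(d, d))                 # D^{-1/2} W D^{-1/2}
lam = np.sort(np.linalg.eigvalsh(A))[::-1]      # eigenvalues, descending
gaps = lam[:-1] - lam[1:]                       # consecutive eigengaps
k = gaps.argmax() + 1                           # k maximizing the eigengap
```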
125 Summary on Spectral Clustering Clustering as a graph partitioning problem. The quality of a partition can be determined using graph cut criteria. Identifying an optimal partition is NP-hard. Spectral clustering techniques: an efficient approach to calculate near-optimal bi-partitions and k-way partitions; based on well-known cut criteria and a strong theoretical background.
126 What we learned today Clustering K means Hierarchical clustering Spectral clustering
127 Homework Readings Murphy Ch. Mining of Massive Datasets, chapter 11: http://infolab.stanford.edu/~ullman/mmds/ch11.pdf A tutorial on Spectral Clustering: http:// files/publications/attachments/Luxburg07_tutorial_4488%5b0%5d.pdf
Dimension Reduction CS534 Why dimension reduction? High dimensionality large number of features E.g., documents represented by thousands of words, millions of bigrams Images represented by thousands of
More informationA Spectral-based Clustering Algorithm for Categorical Data Using Data Summaries (SCCADDS)
A Spectral-based Clustering Algorithm for Categorical Data Using Data Summaries (SCCADDS) Eman Abdu eha90@aol.com Graduate Center The City University of New York Douglas Salane dsalane@jjay.cuny.edu Center
More informationCSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection
CSE 258 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:
More informationInforma(on Retrieval
Introduc*on to Informa(on Retrieval Clustering Chris Manning, Pandu Nayak, and Prabhakar Raghavan Today s Topic: Clustering Document clustering Mo*va*ons Document representa*ons Success criteria Clustering
More informationEE 701 ROBOT VISION. Segmentation
EE 701 ROBOT VISION Regions and Image Segmentation Histogram-based Segmentation Automatic Thresholding K-means Clustering Spatial Coherence Merging and Splitting Graph Theoretic Segmentation Region Growing
More informationSocial-Network Graphs
Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities
More informationModularity CMSC 858L
Modularity CMSC 858L Module-detection for Function Prediction Biological networks generally modular (Hartwell+, 1999) We can try to find the modules within a network. Once we find modules, we can look
More informationClustering. So far in the course. Clustering. Clustering. Subhransu Maji. CMPSCI 689: Machine Learning. dist(x, y) = x y 2 2
So far in the course Clustering Subhransu Maji : Machine Learning 2 April 2015 7 April 2015 Supervised learning: learning with a teacher You had training data which was (feature, label) pairs and the goal
More informationCS 2750 Machine Learning. Lecture 19. Clustering. CS 2750 Machine Learning. Clustering. Groups together similar instances in the data sample
Lecture 9 Clustering Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Clustering Groups together similar instances in the data sample Basic clustering problem: distribute data into k different groups
More informationSTA 4273H: Sta-s-cal Machine Learning
STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! h0p://www.cs.toronto.edu/~rsalakhu/ Lecture 3 Parametric Distribu>ons We want model the probability
More informationCSE 255 Lecture 5. Data Mining and Predictive Analytics. Dimensionality Reduction
CSE 255 Lecture 5 Data Mining and Predictive Analytics Dimensionality Reduction Course outline Week 4: I ll cover homework 1, and get started on Recommender Systems Week 5: I ll cover homework 2 (at the
More informationCS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning
CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning Parallel sparse matrix-vector product Lay out matrix and vectors by rows y(i) = sum(a(i,j)*x(j)) Only compute terms with A(i,j) 0 P0 P1
More informationCS 664 Slides #11 Image Segmentation. Prof. Dan Huttenlocher Fall 2003
CS 664 Slides #11 Image Segmentation Prof. Dan Huttenlocher Fall 2003 Image Segmentation Find regions of image that are coherent Dual of edge detection Regions vs. boundaries Related to clustering problems
More information( ) =cov X Y = W PRINCIPAL COMPONENT ANALYSIS. Eigenvectors of the covariance matrix are the principal components
Review Lecture 14 ! PRINCIPAL COMPONENT ANALYSIS Eigenvectors of the covariance matrix are the principal components 1. =cov X Top K principal components are the eigenvectors with K largest eigenvalues
More informationClustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York
Clustering Robert M. Haralick Computer Science, Graduate Center City University of New York Outline K-means 1 K-means 2 3 4 5 Clustering K-means The purpose of clustering is to determine the similarity
More informationCS395T Visual Recogni5on and Search. Gautam S. Muralidhar
CS395T Visual Recogni5on and Search Gautam S. Muralidhar Today s Theme Unsupervised discovery of images Main mo5va5on behind unsupervised discovery is that supervision is expensive Common tasks include
More informationHypergraph Sparsifica/on and Its Applica/on to Par//oning
Hypergraph Sparsifica/on and Its Applica/on to Par//oning Mehmet Deveci 1,3, Kamer Kaya 1, Ümit V. Çatalyürek 1,2 1 Dept. of Biomedical Informa/cs, The Ohio State University 2 Dept. of Electrical & Computer
More informationNormalized Graph cuts. by Gopalkrishna Veni School of Computing University of Utah
Normalized Graph cuts by Gopalkrishna Veni School of Computing University of Utah Image segmentation Image segmentation is a grouping technique used for image. It is a way of dividing an image into different
More informationClustering. Subhransu Maji. CMPSCI 689: Machine Learning. 2 April April 2015
Clustering Subhransu Maji CMPSCI 689: Machine Learning 2 April 2015 7 April 2015 So far in the course Supervised learning: learning with a teacher You had training data which was (feature, label) pairs
More informationData Mining Learning from Large Data Sets
Data Mining Learning from Large Data Sets Lecture 8 Clustering large data sets 263-5200- 00L Andreas Krause Announcements! Homework 4 out tomorrow 2 Course organizapon! Retrieval! Given a query, find most
More informationBehavioral Data Mining. Lecture 18 Clustering
Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i
More informationClustering. Chapter 10 in Introduction to statistical learning
Clustering Chapter 10 in Introduction to statistical learning 16 14 12 10 8 6 4 2 0 2 4 6 8 10 12 14 1 Clustering ² Clustering is the art of finding groups in data (Kaufman and Rousseeuw, 1990). ² What
More informationImage Segmentation continued Graph Based Methods
Image Segmentation continued Graph Based Methods Previously Images as graphs Fully-connected graph node (vertex) for every pixel link between every pair of pixels, p,q affinity weight w pq for each link
More informationImage Segmentation. Srikumar Ramalingam School of Computing University of Utah. Slides borrowed from Ross Whitaker
Image Segmentation Srikumar Ramalingam School of Computing University of Utah Slides borrowed from Ross Whitaker Segmentation Semantic Segmentation Indoor layout estimation What is Segmentation? Partitioning
More informationHow and what do we see? Segmentation and Grouping. Fundamental Problems. Polyhedral objects. Reducing the combinatorics of pose estimation
Segmentation and Grouping Fundamental Problems ' Focus of attention, or grouping ' What subsets of piels do we consider as possible objects? ' All connected subsets? ' Representation ' How do we model
More informationINF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22
INF4820 Clustering Erik Velldal University of Oslo Nov. 17, 2009 Erik Velldal INF4820 1 / 22 Topics for Today More on unsupervised machine learning for data-driven categorization: clustering. The task
More informationUnsupervised learning, Clustering CS434
Unsupervised learning, Clustering CS434 Unsupervised learning and pattern discovery So far, our data has been in this form: We will be looking at unlabeled data: x 11,x 21, x 31,, x 1 m x 12,x 22, x 32,,
More informationPPI Network Alignment Advanced Topics in Computa8onal Genomics
PPI Network Alignment 02-715 Advanced Topics in Computa8onal Genomics PPI Network Alignment Compara8ve analysis of PPI networks across different species by aligning the PPI networks Find func8onal orthologs
More informationCluster analysis. Agnieszka Nowak - Brzezinska
Cluster analysis Agnieszka Nowak - Brzezinska Outline of lecture What is cluster analysis? Clustering algorithms Measures of Cluster Validity What is Cluster Analysis? Finding groups of objects such that
More informationExploratory data analysis for microarrays
Exploratory data analysis for microarrays Jörg Rahnenführer Computational Biology and Applied Algorithmics Max Planck Institute for Informatics D-66123 Saarbrücken Germany NGFN - Courses in Practical DNA
More informationCSE 258 Lecture 5. Web Mining and Recommender Systems. Dimensionality Reduction
CSE 258 Lecture 5 Web Mining and Recommender Systems Dimensionality Reduction This week How can we build low dimensional representations of high dimensional data? e.g. how might we (compactly!) represent
More informationWeek 7 Picturing Network. Vahe and Bethany
Week 7 Picturing Network Vahe and Bethany Freeman (2005) - Graphic Techniques for Exploring Social Network Data The two main goals of analyzing social network data are identification of cohesive groups
More informationToday s lecture. Clustering and unsupervised learning. Hierarchical clustering. K-means, K-medoids, VQ
Clustering CS498 Today s lecture Clustering and unsupervised learning Hierarchical clustering K-means, K-medoids, VQ Unsupervised learning Supervised learning Use labeled data to do something smart What
More informationSEMINAR: GRAPH-BASED METHODS FOR NLP
SEMINAR: GRAPH-BASED METHODS FOR NLP Organisatorisches: Seminar findet komplett im Mai statt Seminarausarbeitungen bis 15. Juli (?) Hilfen Seminarvortrag / Ausarbeitung auf der Webseite Tucan number for
More informationImage Segmentation continued Graph Based Methods. Some slides: courtesy of O. Capms, Penn State, J.Ponce and D. Fortsyth, Computer Vision Book
Image Segmentation continued Graph Based Methods Some slides: courtesy of O. Capms, Penn State, J.Ponce and D. Fortsyth, Computer Vision Book Previously Binary segmentation Segmentation by thresholding
More information