Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme, Nicolas Schilling


1 Machine Learning B. Unsupervised Learning B.1 Cluster Analysis
Lars Schmidt-Thieme, Nicolas Schilling
Information Systems and Machine Learning Lab (ISMLL), Institute for Computer Science, University of Hildesheim, Germany

2 Outline
2. Mixture Models & EM Algorithm

3 Syllabus
Wed. (1) 0. Introduction
A. Supervised Learning
Wed. (2) A.1 Linear Regression
Wed. (3) A.2 Linear Classification
Wed. (4) A.3 Regularization (given by Martin)
Wed. (5) A.4 High-dimensional Data
Wed. (6) A.5 Nearest-Neighbor Models
Wed. (7) A.6 Support Vector Machines
Wed. (8) A.7 Decision Trees
Wed. (9) A.8 A First Look at Bayesian and Markov Networks
Extra: Wed. (E) Invited Talk: Recommender Systems in Work at Volkswagen
B. Unsupervised Learning
Wed. (10) B.1 Clustering
Wed. (11) B.2 Dimensionality Reduction
Wed. (12) B.3 Frequent Pattern Mining
C. Reinforcement Learning
Wed. ??.??. (13) C.1 State Space Models
Wed. ??.??. (14) C.2 Markov Decision Processes

4 Outline
2. Mixture Models & EM Algorithm

5 Unsupervised Learning
For supervised learning problems, we were always given some training data $\mathcal{D}^{\text{train}} = \{(x_1, y_1), \dots, (x_N, y_N)\}$, where $x_i \in \mathcal{X}$ corresponds to a measurement (a data instance) and $y_i \in \mathcal{Y}$ is a label.
The goal then was to find a model $f : \mathcal{X} \to \mathcal{Y}$ with minimal training error and decent generalization ability.
In unsupervised learning, there are no labels given!

6 Cluster Analysis
Assume we have a dataset with no further information given: $\mathcal{D}^{\text{train}} = \{x_1, \dots, x_N\}$.
Cluster analysis tries to find commonalities among all data instances in order to group them into $K$ many groups, i.e., we have to find a partition of $X$.

7 Partitions
Let $X := \{x_1, \dots, x_N\}$ be a finite set. A set $\mathcal{P} := \{X_1, \dots, X_K\}$ of subsets $X_k \subseteq X$ is called a partition of $X$ if the subsets
1. are pairwise disjoint: $X_k \cap X_j = \emptyset$ for all $k, j \in \{1, \dots, K\}$, $k \neq j$,
2. cover $X$: $\bigcup_{k=1}^K X_k = X$, and
3. do not contain the empty set: $X_k \neq \emptyset$ for all $k \in \{1, \dots, K\}$.
The sets $X_k$ are also called clusters, a partition $\mathcal{P}$ a clustering. $K \in \mathbb{N}$ is called the number of clusters.

8 Partitions
Let $X$ be a finite set. A surjective function $p : X \to \{1, \dots, K\}$ is called a partition function of $X$. The sets $X_k := p^{-1}(k)$ form a partition $\mathcal{P} := \{X_1, \dots, X_K\}$.
Example:
x_i | p(x_i)
x_1 | 1
x_2 | 2
x_3 | 2
x_4 | 1
so $p^{-1}(1) = \{x_1, x_4\}$ and $p^{-1}(2) = \{x_2, x_3\}$.

9 Partitions
Let $X := \{x_1, \dots, x_N\}$ be a finite set. A binary $N \times K$ matrix $P \in \{0, 1\}^{N \times K}$ is called a partition matrix of $X$ if it
1. is row-stochastic: $\sum_{k=1}^K P_{i,k} = 1$ for all $i \in \{1, \dots, N\}$, and
2. does not contain a zero column: $P_{\cdot,k} \neq (0, \dots, 0)^T$ for all $k \in \{1, \dots, K\}$.
The sets $X_k := \{x_i \mid P_{i,k} = 1\}$ form a partition $\mathcal{P} := \{X_1, \dots, X_K\}$. $P_{\cdot,k}$ is called the membership vector of class $k$.

10 Partitions
For the example given through
x_i | p(x_i)
x_1 | 1
x_2 | 2
x_3 | 2
x_4 | 1
the partition matrix would look like:
$P = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}$
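As an illustration (not from the slides), a minimal NumPy sketch that builds such a partition matrix from cluster labels; all names are our own, and 0-based indices replace the slides' 1-based ones:

```python
import numpy as np

def partition_matrix(p, K):
    """Binary partition matrix from cluster labels p, with p[i] in {0,...,K-1}.
    Sketch only: uses 0-based cluster indices, unlike the 1-based slides."""
    P = np.zeros((len(p), K), dtype=int)
    P[np.arange(len(p)), p] = 1   # exactly one 1 per row -> row-stochastic
    return P

# the slides' example: p(x_1)=1, p(x_2)=2, p(x_3)=2, p(x_4)=1 (here as 0/1)
print(partition_matrix(np.array([0, 1, 1, 0]), K=2))
```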

11 The Cluster Analysis Problem
Given
a set $\mathcal{X}$ called data space, e.g., $\mathcal{X} := \mathbb{R}^m$,
a set $X \subseteq \mathcal{X}$ called data, and
a function $D : \operatorname{Part}(X) \to \mathbb{R}^+_0$ called distortion measure, where $D(\mathcal{P})$ measures how bad a partition $\mathcal{P} \in \operatorname{Part}(X)$ for the data set $X \subseteq \mathcal{X}$ is,
find a partition $\mathcal{P} = \{X_1, X_2, \dots, X_K\} \in \operatorname{Part}(X)$ with minimal distortion $D(\mathcal{P})$.

12 The Cluster Analysis Problem (given K)
Given
a set $\mathcal{X}$ called data space, e.g., $\mathcal{X} := \mathbb{R}^m$,
a set $X \subseteq \mathcal{X}$ called data,
a function $D : \operatorname{Part}(X) \to \mathbb{R}^+_0$ called distortion measure, where $D(\mathcal{P})$ measures how bad a partition $\mathcal{P} \in \operatorname{Part}(X)$ for the data set $X \subseteq \mathcal{X}$ is, and
a number $K \in \mathbb{N}$ of clusters,
find a partition $\mathcal{P} = \{X_1, X_2, \dots, X_K\} \in \operatorname{Part}_K(X)$ with $K$ clusters with minimal distortion $D(\mathcal{P})$.

13 Distortion Measures: Intuition
Assume we have the following data and two cluster centers $\mu_1$ and $\mu_2$: [figure]
We would assign the left points to the red cluster and the right points to the blue cluster.
We want a distortion measure that encourages this behaviour.

14 k-means: Distortion = Sum of Squared Distances to Cluster Centers
Find a partition $\mathcal{P}$ such that the sum of squared distances to cluster centers is minimal:
$D(\mathcal{P}) := \sum_{k=1}^K \sum_{i : P_{i,k} = 1} \| x_i - \mu_k \|^2$
with
$\mu_k := \operatorname{mean} \{ x_i \mid P_{i,k} = 1,\ i = 1, \dots, n \}$

15 k-means: Distortion = Sum of Squared Distances to Cluster Centers
Find a partition $\mathcal{P}$ such that the sum of squared distances to cluster centers is minimal:
$D(\mathcal{P}) := \sum_{i=1}^n \sum_{k=1}^K P_{i,k} \| x_i - \mu_k \|^2 = \sum_{k=1}^K \sum_{i : P_{i,k} = 1} \| x_i - \mu_k \|^2$
with
$\mu_k := \frac{\sum_{i=1}^n P_{i,k} x_i}{\sum_{i=1}^n P_{i,k}} = \operatorname{mean} \{ x_i \mid P_{i,k} = 1,\ i = 1, \dots, n \}$
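A hypothetical NumPy sketch of this distortion (the variable names are ours, not the lecture's): given data X, a binary partition matrix P, and centers mu, it evaluates $D(\mathcal{P})$ directly from the formula above:

```python
import numpy as np

def kmeans_distortion(X, P, mu):
    """D(P) = sum_i sum_k P_ik * ||x_i - mu_k||^2 (sketch, names are ours).
    X: (n, m) data, P: (n, K) binary partition matrix, mu: (K, m) centers."""
    sq_dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (n, K)
    return float((P * sq_dists).sum())
```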

16 On the Role of K
Minimizing $D$ over partitions with a varying number of clusters (varying $K$) does not make sense: a singleton clustering, where each point is its own cluster center and $K = N$, has minimal $D$. Only minimizing with a given $K$ makes sense.
Minimizing $D$ is not easy, as reassigning a point to a different cluster also shifts the cluster centers.

17 k-means: Minimizing Distances to Cluster Centers
Add cluster centers $\mu$ as auxiliary optimization variables:
$D(\mathcal{P}, \mu) := \sum_{i=1}^n \sum_{k=1}^K P_{i,k} \| x_i - \mu_k \|^2$

18 k-means: Minimizing Distances to Cluster Centers
Add cluster centers $\mu$ as auxiliary optimization variables:
$D(\mathcal{P}, \mu) := \sum_{i=1}^n \sum_{k=1}^K P_{i,k} \| x_i - \mu_k \|^2$
Block coordinate descent:
1. fix $\mu$, optimize $\mathcal{P}$, i.e., reassign data points to clusters:
$P_{i,k} := \delta(k = l_i), \quad l_i := \operatorname{arg\,min}_{k \in \{1, \dots, K\}} \| x_i - \mu_k \|^2$

19 k-means: Minimizing Distances to Cluster Centers
Add cluster centers $\mu$ as auxiliary optimization variables:
$D(\mathcal{P}, \mu) := \sum_{i=1}^n \sum_{k=1}^K P_{i,k} \| x_i - \mu_k \|^2$
Block coordinate descent:
1. fix $\mu$, optimize $\mathcal{P}$, i.e., reassign data points to clusters:
$P_{i,k} := \delta(k = l_i), \quad l_i := \operatorname{arg\,min}_{k \in \{1, \dots, K\}} \| x_i - \mu_k \|^2$
2. fix $\mathcal{P}$, optimize $\mu$, i.e., recompute cluster centers:
$\mu_k := \frac{\sum_{i=1}^n P_{i,k} x_i}{\sum_{i=1}^n P_{i,k}}$
Iterate until the partition is stable.

20 k-means: Initialization
k-means is usually initialized by picking $K$ data points as cluster centers:
1. pick the first cluster center $\mu_1$ out of the data points at random, and then
2. sequentially select the data point with the largest sum of distances to the already chosen cluster centers as the next cluster center:
$\mu_k := x_{i^*}, \quad i^* := \operatorname{arg\,max}_{i \in \{1, \dots, n\}} \sum_{l=1}^{k-1} \| x_i - \mu_l \|^2, \quad k = 2, \dots, K$

21 k-means: Initialization
k-means is usually initialized by picking $K$ data points as cluster centers:
1. pick the first cluster center $\mu_1$ out of the data points at random, and then
2. sequentially select the data point with the largest sum of distances to the already chosen cluster centers as the next cluster center:
$\mu_k := x_{i^*}, \quad i^* := \operatorname{arg\,max}_{i \in \{1, \dots, n\}} \sum_{l=1}^{k-1} \| x_i - \mu_l \|^2, \quad k = 2, \dots, K$
Different initializations may lead to different local minima. Therefore, run k-means with different random initializations and keep only the one with the smallest distortion (random restarts).
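A sketch of this initialization in NumPy (the helper name init_centers is ours); squared distances are used, as on the slide:

```python
import numpy as np

def init_centers(X, K, rng=np.random.default_rng()):
    """Pick mu_1 uniformly at random, then greedily pick the point with the
    largest sum of squared distances to all centers chosen so far (sketch)."""
    centers = [X[rng.integers(len(X))]]              # mu_1: random data point
    for _ in range(K - 1):
        d = sum(((X - mu) ** 2).sum(axis=1) for mu in centers)
        centers.append(X[np.argmax(d)])              # farthest point so far
    return np.stack(centers)
```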

22 k-means Algorithm
1: procedure cluster-kmeans($\mathcal{D} := \{x_1, \dots, x_N\} \subseteq \mathbb{R}^M$, $K \in \mathbb{N}$, $\epsilon \in \mathbb{R}^+$)
2:   $i_1 \sim \operatorname{unif}(\{1, \dots, N\})$, $\mu_1 := x_{i_1}$
3:   for $k := 2, \dots, K$ do
4:     $i_k := \operatorname{arg\,max}_{n \in \{1, \dots, N\}} \sum_{l=1}^{k-1} \| x_n - \mu_l \|$, $\mu_k := x_{i_k}$
5:   repeat
6:     $\mu^{\text{old}} := \mu$
7:     for $n := 1, \dots, N$ do
8:       $P_n := \operatorname{arg\,min}_{k \in \{1, \dots, K\}} \| x_n - \mu_k \|$
9:     for $k := 1, \dots, K$ do
10:      $\mu_k := \operatorname{mean} \{ x_n \mid P_n = k \}$
11:   until $\frac{1}{K} \sum_{k=1}^K \| \mu_k - \mu_k^{\text{old}} \| < \epsilon$
12:   return $P$
Note: In implementations, the two loops over the data (lines 7 and 9) can be combined into one loop.
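Assuming the init_centers helper sketched above, the whole procedure might look as follows in NumPy; this is an illustrative translation of the pseudocode, not the lecture's reference implementation, and it assumes no cluster ever runs empty:

```python
import numpy as np

def cluster_kmeans(X, K, eps=1e-6, rng=np.random.default_rng()):
    """k-means as block coordinate descent (sketch of the slide's pseudocode)."""
    mu = init_centers(X, K, rng)                     # farthest-first init
    while True:
        mu_old = mu.copy()
        # assignment step: P[n] is the index of the center nearest to x_n
        d = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        P = np.argmin(d, axis=1)
        # update step: recompute each center as its cluster mean
        # (assumes every cluster keeps at least one point)
        mu = np.stack([X[P == k].mean(axis=0) for k in range(K)])
        # stop when the centers move less than eps on average
        if np.linalg.norm(mu - mu_old, axis=1).mean() < eps:
            return P, mu
```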

23–31 Example
[Figure sequence: six data points $x_1, \dots, x_6$ are clustered with $K = 2$. The initialization picks $\mu_1 = x_4$ and, as the farthest point, $\mu_2 = x_1$; the points are then alternately assigned to their nearest center and the centers are recomputed until the partition is stable, ending with the centers at $x_5$ ($\mu_1$) and $x_2$ ($\mu_2$). The distortion decreases over the displayed configurations: $d = 33$, $d = 23.7$, $d = \dots$]

32 k-medoids: k-means for General Distances
One can generalize k-means to general distances $d$:
$D(\mathcal{P}, \mu) := \sum_{i=1}^n \sum_{k=1}^K P_{i,k}\, d(x_i, \mu_k)$

33 k-medoids: k-means for General Distances
One can generalize k-means to general distances $d$:
$D(\mathcal{P}, \mu) := \sum_{i=1}^n \sum_{k=1}^K P_{i,k}\, d(x_i, \mu_k)$
Step 1, assigning data points to clusters, remains the same:
$P_{i,k} := \delta(k = l_i), \quad l_i := \operatorname{arg\,min}_{k \in \{1, \dots, K\}} d(x_i, \mu_k)$
but step 2, finding the best cluster representatives $\mu_k$, is not solved by the mean and may be difficult in general.

34 k-medoids: k-means for General Distances
One can generalize k-means to general distances $d$:
$D(\mathcal{P}, \mu) := \sum_{i=1}^n \sum_{k=1}^K P_{i,k}\, d(x_i, \mu_k)$
Step 1, assigning data points to clusters, remains the same:
$P_{i,k} := \delta(k = l_i), \quad l_i := \operatorname{arg\,min}_{k \in \{1, \dots, K\}} d(x_i, \mu_k)$
but step 2, finding the best cluster representatives $\mu_k$, is not solved by the mean and may be difficult in general.
Idea of k-medoids: choose the cluster representatives out of the cluster's data points:
$\mu_k := x_{j^*}, \quad j^* := \operatorname{arg\,min}_{j \in \{1, \dots, n\} : P_{j,k} = 1} \sum_{i=1}^n P_{i,k}\, d(x_i, x_j)$
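A sketch of this medoid update, under the assumption that all pairwise distances $d(x_i, x_j)$ are precomputed in a matrix D_mat (the names are ours):

```python
import numpy as np

def update_medoids(D_mat, P, K):
    """For each cluster k, pick the member x_j minimizing the summed distance
    to all cluster members (sketch; D_mat[i, j] = d(x_i, x_j), P[i] = cluster)."""
    medoids = np.empty(K, dtype=int)
    for k in range(K):
        members = np.where(P == k)[0]
        # total distance of each member to all points of the same cluster
        costs = D_mat[np.ix_(members, members)].sum(axis=1)
        medoids[k] = members[np.argmin(costs)]       # index of the new mu_k
    return medoids
```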

35 2. Mixture Models & EM Algorithm
Outline
2. Mixture Models & EM Algorithm

36 2. Mixture Models & EM Algorithm Soft Partitions: Row-Stochastic Matrices
Let $X := \{x_1, \dots, x_N\}$ be a finite set. An $N \times K$ matrix $P \in [0, 1]^{N \times K}$ is called a soft partition matrix of $X$ if it
1. is row-stochastic: $\sum_{k=1}^K P_{i,k} = 1$ for all $i \in \{1, \dots, N\}$, and
2. does not contain a zero column: $P_{\cdot,k} \neq (0, \dots, 0)^T$ for all $k \in \{1, \dots, K\}$.
$P_{i,k}$ is called the membership degree of instance $i$ in class $k$, or the cluster weight of instance $i$ in cluster $k$. $P_{\cdot,k}$ is called the membership vector of class $k$. $\operatorname{SoftPart}(X)$ denotes the set of all soft partitions of $X$.
Note: Soft partitions are also called soft clusterings or fuzzy clusterings.

37 2. Mixture Models & EM Algorithm The Soft Clustering Problem
Given
a set $\mathcal{X}$ called data space, e.g., $\mathcal{X} := \mathbb{R}^m$,
a set $X \subseteq \mathcal{X}$ called data, and
a function $D : \operatorname{SoftPart}(X) \to \mathbb{R}^+_0$ called distortion measure, where $D(P)$ measures how bad a soft partition $P \in \operatorname{SoftPart}(X)$ for the data set $X \subseteq \mathcal{X}$ is,
find a soft partition $P \in \operatorname{SoftPart}(X)$ with minimal distortion $D(P)$.

38 2. Mixture Models & EM Algorithm The Soft Clustering Problem (with given K)
Given
a set $\mathcal{X}$ called data space, e.g., $\mathcal{X} := \mathbb{R}^m$,
a set $X \subseteq \mathcal{X}$ called data,
a function $D : \operatorname{SoftPart}(X) \to \mathbb{R}^+_0$ called distortion measure, where $D(P)$ measures how bad a soft partition $P \in \operatorname{SoftPart}(X)$ for the data set $X \subseteq \mathcal{X}$ is, and
a number $K \in \mathbb{N}$ of clusters,
find a soft partition $P \in \operatorname{SoftPart}_K(X) \subseteq [0, 1]^{|X| \times K}$ with $K$ clusters with minimal distortion $D(P)$.

39 2. Mixture Models & EM Algorithm Mixture Models
For our data, no clusters are given. But this does not mean that they do not exist; there is just no way for us to measure them.
Mixture models assume that there exists an unobserved nominal variable $Z$ with $K$ levels, which is distributed according to
$Z \sim \operatorname{Cat}(\pi)$, i.e., $p(Z = k) = \pi_k$
for some probabilities $\pi_k$ with $\sum_{k=1}^K \pi_k = 1$.

40 2. Mixture Models & EM Algorithm Mixture Models
Mixture models then model the joint probability of $X$ and $Z$:
$p(X, Z) = p(Z)\, p(X \mid Z) = \prod_{k=1}^K \pi_k^{\delta(Z = k)} \prod_{k=1}^K p(X \mid Z = k)^{\delta(Z = k)} = \prod_{k=1}^K \left( \pi_k\, p(X \mid Z = k) \right)^{\delta(Z = k)}$
And the marginal probability of a given $X$ is:
$p(X) = \sum_{k=1}^K p(Z = k)\, p(X \mid Z = k) = \sum_{k=1}^K \pi_k\, p(X \mid Z = k)$
All we need to specify is $p(X \mid Z)$!

41 2. Mixture Models & EM Algorithm Gaussian Mixture Models
Gaussian mixture models are mixture models where the probability of seeing an instance, given its cluster membership, is a Gaussian:
$p(X = x \mid Z = k) = \mathcal{N}(x; \mu_k, \Sigma_k)$
or equivalently
$p(x_i \mid z_i = k) = \frac{1}{\sqrt{(2\pi)^m |\Sigma_k|}} e^{-\frac{1}{2} (x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)}$
for a mean $\mu_k$ and a covariance matrix $\Sigma_k$.
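As a sketch (relying on SciPy's multivariate normal density; all other names are ours), the component densities and the marginal $p(x)$ from slide 40 can be evaluated as follows:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(x, pi, mus, Sigmas):
    """p(x) = sum_k pi_k * N(x; mu_k, Sigma_k) for a Gaussian mixture (sketch)."""
    return sum(pi_k * multivariate_normal.pdf(x, mean=mu_k, cov=Sigma_k)
               for pi_k, mu_k, Sigma_k in zip(pi, mus, Sigmas))
```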

42 2. Mixture Models & EM Algorithm Mixture Models: Intuition
[figure] Clearly, we see that the hidden variable $Z$ has two outcomes!

43 2. Mixture Models & EM Algorithm Mixture Models: Intuition
[figure] Data may come from a mixture of two Gaussians!

44 2. Mixture Models & EM Algorithm Mixture Models: Intuition
[figure] Contour lines of a mixture of two Gaussians!

45 2. Mixture Models & EM Algorithm Maximum Likelihood Estimate?
The complete-data log-likelihood of the completed data $(X, Z)$ then is
$\ell(\Theta; X, Z) := \sum_{i=1}^n \sum_{k=1}^K \delta(z_i = k) \left( \ln \pi_k + \ln p(X = x_i \mid Z = k; \theta_k) \right)$
with $\Theta := (\pi_1, \dots, \pi_K, \theta_1, \dots, \theta_K)$ and $\theta_k = (\mu_k, \Sigma_k)$.
$\ell$ cannot be computed because the $z_i$ are unobserved. We cannot learn this model by computing a maximum likelihood estimate!

46 2. Mixture Models & EM Algorithm Expected Complete Likelihood & EM Algorithm
We calculate the expected value of the log-likelihood with respect to the conditional distribution of $Z$ given $X$ under the currently estimated $\Theta^{t-1}$:
$Q(\Theta \mid \Theta^{t-1}) = \mathbb{E}_{Z \mid X, \Theta^{t-1}} [\ell(\Theta; X, Z)]$
From the old $\Theta^{t-1}$, we know the distribution of the $Z$; we then compute the expected value of $\ell$ with respect to $Z$ (expectation step).
This yields a quantity $Q$ that we can then maximize by optimizing $\Theta$ (maximization step).
From the new $\Theta$, we can then update $Z$ and repeat the process.

47 2. Mixture Models & EM Algorithm Expected Complete Likelihood
$Q(\Theta \mid \Theta^{t-1}) = \mathbb{E} \left[ \sum_{i=1}^N \log p(x_i, z_i \mid \Theta) \right]$
$= \mathbb{E} \left[ \sum_{i=1}^N \log \prod_{k=1}^K \left( \pi_k\, p(x_i \mid \theta_k) \right)^{\delta(z_i = k)} \right]$
$= \sum_{i=1}^N \sum_{k=1}^K \mathbb{E}[\delta(z_i = k)] \log \left[ \pi_k\, p(x_i \mid \theta_k) \right]$
$= \sum_{i=1}^N \sum_{k=1}^K p(z_i = k \mid x_i, \Theta^{t-1}) \log \left[ \pi_k\, p(x_i \mid \theta_k) \right]$
$= \sum_{i=1}^N \sum_{k=1}^K r_{ik} \log \pi_k + \sum_{i=1}^N \sum_{k=1}^K r_{ik} \log p(x_i \mid \theta_k)$

48 2. Mixture Models & EM Algorithm Expected Complete Likelihood (Expectation Step)
$Q(\Theta \mid \Theta^{t-1}) = \mathbb{E} \left[ \sum_{i=1}^N \log p(x_i, z_i \mid \Theta) \right] = \dots = \sum_{i=1}^N \sum_{k=1}^K r_{ik} \log \pi_k + \sum_{i=1}^N \sum_{k=1}^K r_{ik} \log p(x_i \mid \theta_k)$
with
$r_{ik} = p(z = k \mid x_i, \Theta) = \frac{\pi_k\, p(x_i \mid \theta_k)}{\sum_{k'} \pi_{k'}\, p(x_i \mid \theta_{k'})}$
The $r_{ik}$ are called the responsibilities of cluster $k$ for instance $i$. Computing the $r_{ik}$ yields the probabilities for $Z$ (expectation step).
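A hypothetical e_step helper (the name is ours) computing the responsibilities $r_{ik}$ for all instances at once; a numerically careful version would work in log space:

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pi, mus, Sigmas):
    """r_ik = pi_k N(x_i; mu_k, Sigma_k) / sum_k' pi_k' N(x_i; mu_k', Sigma_k')."""
    r = np.column_stack([pi[k] * multivariate_normal.pdf(X, mean=mus[k], cov=Sigmas[k])
                         for k in range(len(pi))])   # unnormalized, shape (n, K)
    return r / r.sum(axis=1, keepdims=True)          # normalize each row
```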

49 2. Mixture Models & EM Algorithm Maximization Step (I)
$Q(\Theta \mid \Theta^{t-1})$ needs to be maximized for all $\pi_k$ and for the parameters of the individual Gaussians, $\theta_k = (\mu_k, \Sigma_k)$.
For $\pi$, $Q$ is maximized by setting
$\pi_k = \frac{1}{N} \sum_i r_{ik}$
For $\mu_k$ and $\Sigma_k$, we only have to look at the second part of $Q$:
$\ell(\mu_k, \Sigma_k) = \sum_{i=1}^N \sum_{k=1}^K r_{ik} \log p(x_i \mid \theta_k) = -\frac{1}{2} \sum_i r_{ik} \left[ \log |\Sigma_k| + (x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k) \right] + \text{const}$

50 2. Mixture Models & EM Algorithm Maximization Step (II)
$\ell(\mu_k, \Sigma_k)$ is maximized for:
$\mu_k = \frac{\sum_{i=1}^n r_{i,k}\, x_i}{\sum_{i=1}^n r_{i,k}}$
and
$\Sigma_k = \frac{\sum_{i=1}^n r_{i,k} (x_i - \mu_k)(x_i - \mu_k)^T}{\sum_{i=1}^n r_{i,k}} = \frac{\sum_{i=1}^n r_{i,k}\, x_i x_i^T}{\sum_{i=1}^n r_{i,k}} - \mu_k \mu_k^T$
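The matching m_step sketch (again, the names are ours) turns the responsibilities back into parameter updates exactly as in the two formulas above:

```python
import numpy as np

def m_step(X, r):
    """Update (pi, mus, Sigmas) from responsibilities r of shape (n, K) (sketch)."""
    n, K = r.shape
    Nk = r.sum(axis=0)                      # effective number of points per cluster
    pi = Nk / n                             # pi_k = (1/N) sum_i r_ik
    mus = (r.T @ X) / Nk[:, None]           # weighted means
    Sigmas = []
    for k in range(K):
        diff = X - mus[k]                   # (x_i - mu_k) for all i
        Sigmas.append((r[:, k, None] * diff).T @ diff / Nk[k])
    return pi, mus, Sigmas
```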

51 2. Mixture Models & EM Algorithm Gaussian Mixtures for Soft Clustering
The responsibilities $r \in [0, 1]^{N \times K}$ are a soft partition: $P := r$.
The negative expected log-likelihood can be used as cluster distortion:
$D(P) := -\max_\Theta Q(\Theta; r)$
To optimize $D$, we can simply run EM.

52 2. Mixture Models & EM Algorithm Gaussian Mixtures for Soft Clustering
The responsibilities $r \in [0, 1]^{N \times K}$ are a soft partition: $P := r$.
The negative expected log-likelihood can be used as cluster distortion:
$D(P) := -\max_\Theta Q(\Theta; r)$
To optimize $D$, we can simply run EM.
For hard clustering: assign points to the cluster with the highest responsibility (hard EM):
$r^{(t-1)}_{i,k} = \delta\left(k = \operatorname{arg\,max}_{k' = 1, \dots, K} r^{(t-1)}_{i,k'}\right)$
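Putting the two hypothetical helpers together, EM for soft clustering plus a final hard assignment might look like this; a sketch with random initialization and a fixed iteration budget, not the lecture's implementation:

```python
import numpy as np

def gmm_cluster(X, K, iters=100, rng=np.random.default_rng()):
    """Run EM (e_step/m_step sketched above), return soft and hard clusterings."""
    n, m = X.shape
    pi = np.full(K, 1.0 / K)                          # uniform mixing weights
    mus = X[rng.choice(n, size=K, replace=False)]     # centers at random data points
    Sigmas = [np.eye(m) for _ in range(K)]            # unit covariances to start
    for _ in range(iters):
        r = e_step(X, pi, mus, Sigmas)                # expectation step
        pi, mus, Sigmas = m_step(X, r)                # maximization step
    return r, np.argmax(r, axis=1)                    # r is soft, argmax is hard
```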

53–55 2. Mixture Models & EM Algorithm Gaussian Mixtures for Soft Clustering / Example
[Figure sequence: Gaussian mixture fitted to example data over several EM iterations.]

56 2. Mixture Models & EM Algorithm Model-based Cluster Analysis
Different parametrizations of the covariance matrices $\Sigma_k$ restrict the possible cluster shapes:
full $\Sigma$: all sorts of ellipsoid clusters,
diagonal $\Sigma$: ellipsoid clusters with axis-parallel axes,
unit $\Sigma$: spherical clusters.
One also distinguishes
cluster-specific $\Sigma_k$: each cluster can have its own shape,
shared $\Sigma_k = \Sigma$: all clusters have the same shape.
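For reference (not from the lecture), scikit-learn's GaussianMixture exposes closely related parametrizations through its covariance_type argument ('spherical' uses $\sigma_k^2 I$ rather than the unit matrix, and 'tied' corresponds to a shared $\Sigma$); a brief sketch:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

X = np.random.default_rng(0).normal(size=(200, 2))   # toy data, names ours
for cov_type in ["full", "diag", "spherical", "tied"]:
    gm = GaussianMixture(n_components=3, covariance_type=cov_type).fit(X)
    print(cov_type, gm.covariances_.shape)            # shape reflects the constraint
```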
