Expectation-Maximization Algorithm and Image Segmentation


Daozheng Chen

In computer vision, the image segmentation problem is to partition a digital image into multiple parts. The goal is to change the representation of the image into one that is more meaningful and easier to analyze [11]. In this assignment, we show how an image segmentation algorithm works in a real application.

In the Electronic Field Guide (EFG) project, researchers want to segment the leaf region from an image and extract a set of points from its contour to represent the shape of the leaf [2]. A leaf image typically contains a single leaf on a surface with a rather uniform pattern, which makes the segmentation problem easier. Figure 1 shows an example of a leaf image.

Figure 1. A typical leaf image.

A leaf image is represented as a matrix of pixels. Each pixel is a 3-by-1 vector holding the red, green, and blue components, each an integer between 0 and 255. Instead of working with this 3-dimensional data, we transform each pixel into the hue, saturation, and value (HSV) domain [7] and discard the hue component. Our task is to group the resulting 2-dimensional data into two clusters: the leaf region and the non-leaf region.

This is a data clustering problem, and Chapter 11 of [10] discusses the popular K-means algorithm for it. Let us review this algorithm to see what it optimizes and how it works. We have a set $X$ of $N$ data points $x_i \in \mathbb{R}^m$ and $K$ clusters.

Each cluster $C_j$ has a center $c_j$, and each point is assigned to one cluster. We define the distance $d_i$ between a point $x_i$ and its cluster center as $d_i = \min_j \|x_i - c_j\|$, where $j = 1, \ldots, K$. Our objective is to find the $K$ cluster centers that minimize the sum of the distances between each point and its cluster center.

Given an initial guess of the centers, or the centers computed in the previous iteration, the K-means algorithm first assigns each data point to the cluster whose center is closest. Then, for each cluster, it updates the center using the data points assigned to it. It repeats these two steps until the centers no longer change. Algorithm 11.1 in [10] gives a detailed description of the algorithm. Let us do Challenge 1 to see how it works on our image data set.

CHALLENGE 1. For leaf1.jpg, leaf2.jpg, leaf3.jpg, and leaf4.jpg, generate the 2D data points of saturation and value. Use the MATLAB function kmeans to group the data points into two clusters, and display the binary segmentation image. Use the 2-norm to measure the distance $d_i$, with (0.4, 0.6) as the initial mean of the first cluster and (0.6, 0.4) as the initial mean of the second cluster. Each pixel in a binary image is either 0 or 1. If the data of a pixel in the original image belongs to cluster 1 (the pixels whose index returned by kmeans is 1), set the pixel value to 0; otherwise, set it to 1. MATLAB image processing functions such as imread, imshow, imwrite, and rgb2hsv may be useful for handling the images and performing the HSV transformation.

For leaf1.jpg, leaf2.jpg, and leaf3.jpg, use the MATLAB plot function to generate scatter plots of (1) all the 2D data points, (2) the data points belonging to cluster 1, and (3) those belonging to cluster 2. In addition, use hist2d and Plot2dHist from MATLAB Central [8] to generate 2D histograms for these three data sets. Display only the plots and histograms for leaf1.jpg in this Challenge, and save those for leaf2.jpg and leaf3.jpg for Challenge 4. You do not need to discuss the segmentation quality; we leave this to Challenge 4. Document your program so that it is easy to understand, and answer the following questions:

(a) Using the scatter plots and 2D histogram for leaf1.jpg, how many clusters can you see?
(b) What is the shape of the boundary that separates the clusters? Why does it have that shape?
(c) Discuss the advantages and disadvantages of visualizing our data using scatter plots and 2D histograms.

Expectation-Maximization (EM) algorithm

The K-means algorithm is simple, but it easily gets stuck in local optima. The EM algorithm tends to get stuck less often than K-means. The idea is to assign data points partially to different clusters instead of assigning each point to only one cluster. To do this partial assignment, we model each cluster using a probability distribution.

So a data point is associated with each cluster with a certain probability, and in the final assignment it belongs to the cluster with the highest probability [6]. We can use a mixture of Gaussian distributions to model this. The mixture model is a weighted sum of $K$ Gaussian distributions whose weights sum to 1. Let the parameters of the $j$th distribution be $\theta_j$ and its weight be $w_j$. The probability of a data point $x_i$ given this model is

$$p(x_i \mid \Theta) = \sum_{j=1}^{K} w_j \, p_j(x_i \mid \theta_j),$$

where $\Theta = \{w_1, \ldots, w_K, \theta_1, \ldots, \theta_K\}$. To do clustering, we want to determine the probability of the cluster $C_{y_i}$ for each data point $x_i$ given $\Theta$, that is, $p(y_i \mid x_i, \Theta)$. This requires us to know $\Theta$. We can use the common maximum likelihood approach to determine $\Theta$. Assuming that the data points are independent and identically distributed according to this mixture model, the log-likelihood of our data set $X$ is

$$\log p(X \mid \Theta) = \log \prod_{i=1}^{N} p(x_i \mid \Theta) = \sum_{i=1}^{N} \log\Bigl(\sum_{j=1}^{K} w_j \, p(x_i \mid \theta_j)\Bigr).$$

However, finding $\Theta$ by direct maximization of this function is difficult, and this is where EM comes in [4]. In this algorithm, we suppose that our data set $X$ is incomplete and that $Y$ is the set of missing data. Given the old (or initial) model parameter $\Theta^{\text{old}}$, we perform the following two steps repeatedly.

Expectation Step (E-step): we use $\Theta^{\text{old}}$ and the incomplete data $X$ to express the expected value of the log-likelihood of the complete data:

$$E(\Theta, \Theta^{\text{old}}) = E\bigl[\log p(X, Y \mid \Theta) \mid \Theta^{\text{old}}, X\bigr].$$

Maximization Step (M-step): we use the expression for $E(\Theta, \Theta^{\text{old}})$ from the E-step to find the new parameter $\Theta = \Theta^{\text{new}}$ that maximizes $E(\Theta, \Theta^{\text{old}})$. Then we use $\Theta^{\text{new}}$ as $\Theta^{\text{old}}$ in the next iteration.

The log-likelihood never decreases from one iteration to the next, and the parameter $\Theta^{\text{new}}$ from the M-step converges to a local maximum of the log-likelihood for the incomplete data $X$ [4]. Therefore, we can continue iterating until the increase in the log-likelihood falls below some threshold.
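As a small illustration of the mixture log-likelihood above, the following Python sketch evaluates it for a one-dimensional Gaussian mixture. The assignment itself uses MATLAB and 2D data; the function names (`gaussian_pdf`, `mixture_log_likelihood`) and the sample data are assumptions made only for this example.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance, evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_log_likelihood(xs, weights, means, variances):
    """log p(X | Theta) = sum_i log( sum_j w_j * p(x_i | theta_j) )."""
    total = 0.0
    for x in xs:
        # Inner sum over the K mixture components for one data point.
        p_x = sum(w * gaussian_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
        total += math.log(p_x)
    return total

# Made-up data with two visible groups, near 0 and near 5.
xs = [-0.2, 0.1, 0.3, 4.8, 5.1, 5.3]
good = mixture_log_likelihood(xs, [0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
poor = mixture_log_likelihood(xs, [0.5, 0.5], [2.0, 3.0], [1.0, 1.0])
print(good > poor)  # parameters that match the data give a larger log-likelihood
```

Comparing the value of this function before and after an update is one way to implement a stopping test based on a threshold.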
To apply the EM algorithm to our mixture of Gaussians, we let the missing data set $Y$ consist of the index $y_i$ of the cluster to which each data point $x_i$ belongs. (So $x_i$ belongs to cluster $C_{y_i}$.) Now let us work through Challenge 2 to see how the E-step works for our mixture model. (Challenge 2 is stated below.) After completing Challenge 2, we get a general expression for $E(\Theta, \Theta^{\text{old}})$. It turns out that we can simplify it to

$$E(\Theta, \Theta^{\text{old}}) = \sum_{j=1}^{K} \sum_{i=1}^{N} \log(w_j)\, p(j \mid x_i, \Theta^{\text{old}}) + \sum_{j=1}^{K} \sum_{i=1}^{N} \log\bigl(p(x_i \mid \theta_j)\bigr)\, p(j \mid x_i, \Theta^{\text{old}}), \tag{1}$$

assuming $0 < w_j < 1$ and $p(x_i \mid \theta_j) > 0$ for $j = 1, \ldots, K$ and $i = 1, \ldots, N$. Let $\mu_j^{\text{new}}$, $\Sigma_j^{\text{new}}$, and $w_j^{\text{new}}$ be the mean, covariance matrix, and weight of the $j$th Gaussian that maximize $E(\Theta, \Theta^{\text{old}})$. It can be shown that

$$w_j^{\text{new}} = \frac{1}{N} \sum_{i=1}^{N} p(j \mid x_i, \Theta^{\text{old}}) \tag{2}$$

$$\mu_j^{\text{new}} = \frac{\sum_{i=1}^{N} x_i \, p(j \mid x_i, \Theta^{\text{old}})}{\sum_{i=1}^{N} p(j \mid x_i, \Theta^{\text{old}})} \tag{3}$$

$$\Sigma_j^{\text{new}} = \frac{\sum_{i=1}^{N} p(j \mid x_i, \Theta^{\text{old}}) \, (x_i - \mu_j^{\text{new}})(x_i - \mu_j^{\text{new}})^T}{\sum_{i=1}^{N} p(j \mid x_i, \Theta^{\text{old}})} \tag{4}$$

Deriving formulas (2), (3), and (4) for more than two components and multivariate Gaussian distributions requires Lagrange multipliers and knowledge of vector calculus [4]. However, let us do Challenge 3 to work out the derivation in a simple case.

CHALLENGE 2. Before we go into the problem, we need to introduce two tools for the analysis. The product rule of probability states that $p(A, B) = p(A \mid B)\, p(B)$. Bayes's rule says that $p(A \mid B) = \dfrac{p(B \mid A)\, p(A)}{p(B)}$. In this problem, we assume that the $x_i$ are independently distributed. The same property holds for the $y_i$.

(a) Using the product rule of probability, verify that the log-likelihood of our complete data $(X, Y)$ is

$$\log p(X, Y \mid \Theta) = \sum_{i=1}^{N} \log\bigl(w_{y_i} \, p(x_i \mid \theta_{y_i})\bigr).$$

(b) Using Bayes's rule and the product rule of probability, verify that

$$p(y_i \mid x_i, \Theta^{\text{old}}) = \frac{w_{y_i}^{\text{old}} \, p(x_i \mid \theta_{y_i}^{\text{old}})}{\sum_{j=1}^{K} w_j^{\text{old}} \, p(x_i \mid \theta_j^{\text{old}})}.$$

(c) Let $y = [y_1, y_2, \ldots, y_N]$. We have

$$p(y \mid X, \Theta^{\text{old}}) = \prod_{i=1}^{N} p(y_i \mid x_i, \Theta^{\text{old}}).$$

Using this expression, we can show that

$$E(\Theta, \Theta^{\text{old}}) = E\bigl[\log p(X, Y \mid \Theta) \mid \Theta^{\text{old}}, X\bigr] = \sum_{y \in Q} \log\bigl(p(X, y \mid \Theta)\bigr)\, p(y \mid X, \Theta^{\text{old}}),$$

where $Q$ in the rightmost formula is the domain of $y$. How many different values of $y$ are in $Q$? Express this using $N$ and $K$. Is it practical to use this summation directly to evaluate $E(\Theta, \Theta^{\text{old}})$?
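To make the roles of the posterior in Challenge 2(b) and of formulas (2), (3), and (4) concrete, here is a minimal Python sketch of one EM iteration for a one-dimensional Gaussian mixture, where formula (4) reduces to a scalar variance. It is only an illustration under stated assumptions: the helper names (`gaussian_pdf`, `e_step`, `m_step`), the sample data, and the starting parameters are invented for this example, and the assignment itself asks for a MATLAB implementation on 2D data.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance, evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def e_step(xs, weights, means, variances):
    """Return resp[i][j] = p(j | x_i, Theta_old), the ratio from Challenge 2(b)."""
    resp = []
    for x in xs:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)  # denominator: sum over all K components
        resp.append([p / total for p in joint])
    return resp

def m_step(xs, resp):
    """Apply formulas (2), (3), and (4) in one dimension."""
    n, k = len(xs), len(resp[0])
    weights, means, variances = [], [], []
    for j in range(k):
        r_sum = sum(resp[i][j] for i in range(n))
        mean = sum(resp[i][j] * xs[i] for i in range(n)) / r_sum               # (3)
        var = sum(resp[i][j] * (xs[i] - mean) ** 2 for i in range(n)) / r_sum  # (4)
        weights.append(r_sum / n)                                              # (2)
        means.append(mean)
        variances.append(var)
    return weights, means, variances

# Made-up data and starting parameters, for illustration only.
xs = [-0.2, 0.1, 0.3, 4.8, 5.1, 5.3]
theta = ([0.5, 0.5], [1.0, 4.0], [1.0, 1.0])
for _ in range(20):
    theta = m_step(xs, e_step(xs, *theta))
weights, means, variances = theta
print(weights, means, variances)
```

Note that the E-step here only computes the probabilities $p(j \mid x_i, \Theta^{\text{old}})$; the expected log-likelihood itself is never evaluated.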

CHALLENGE 3. For $K = 2$ and 1-dimensional Gaussian distributions, verify that formulas (2), (3), and (4) are correct. You may assume that you know formula (1).

Now let us implement this method and see how it works on our data set. There is an important point to be aware of before the implementation. The description of the E-step asks for $E(\Theta, \Theta^{\text{old}})$. However, in our case, we have already worked out formulas for the parameters that maximize $E(\Theta, \Theta^{\text{old}})$. So in the E-step we only need to compute $p(j \mid x_i, \Theta^{\text{old}})$ for $j = 1, \ldots, K$ and $i = 1, \ldots, N$, which are used in formulas (2), (3), and (4) in the M-step.

CHALLENGE 4. For each image in Challenge 1, write a MATLAB program that uses the EM algorithm to do the image segmentation. Keep the same initial means as in Challenge 1, and use the identity matrix as the initial covariance matrix and 0.5 as the initial weight for each component. Set the maximum number of iterations to 100. Stop the iteration if

$$\left| \frac{L^{\text{new}} - L^{\text{old}}}{L^{\text{old}}} \right| \le 0.001,$$

where $L^{\text{new}}$ is the log-likelihood based on the new parameters and $L^{\text{old}}$ is the log-likelihood based on the old parameters in an iteration. Compute and display the log-likelihood using the new parameters in each iteration. As in Challenge 1, generate the binary segmentation image for each image, and produce the same set of scatter plots and 2D histograms for clusters 1 and 2 for leaf1.jpg, leaf2.jpg, and leaf3.jpg. Answer the following questions:

(a) For each image, does the log-likelihood increase in every iteration?
(b) Compare the scatter plots, 2D histograms, and segmentation images for leaf1.jpg using EM with those using K-means, and discuss the results. How different are the scatter plots for clusters 1 and 2? How different are the 2D histograms for clusters 1 and 2? How different are the final segmentations? What can you conclude?
(c) Compare the results of K-means and EM for leaf2.jpg and leaf3.jpg respectively, following the same approach as in part (b). In addition, how is the total data distribution different from that of leaf1.jpg?
(d) Compare the segmentation images for leaf4.jpg using EM with those using K-means, and discuss the results. In what region does EM perform badly? Do similar things happen in the other images?
(e) Based on your discussion for parts (b), (c), and (d), discuss the advantages and disadvantages of both methods in terms of segmentation quality. Which one is better in general for our images?

Relation between K-means and EM Algorithms

K-means and EM are very similar. Within one iteration of the K-means algorithm, we first assign each data point to the cluster whose center is closest; then, for each cluster, we update its center using the data points assigned to it in the previous step.
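The two steps of a K-means iteration described above can be sketched in Python as follows. This is an illustrative sketch, not the MATLAB kmeans function used in Challenge 1: the function names (`assign`, `update`, `kmeans`) and the sample (saturation, value) points are assumptions, clusters are numbered from 0 rather than 1, and an empty cluster simply keeps its old center.

```python
import math

def assign(points, centers):
    """Assignment step: index of the closest center (2-norm) for each point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def update(points, labels, centers):
    """Update step: each center becomes the mean of the points assigned to it."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:  # assumption: an empty cluster keeps its old center
            new_centers.append(old)
            continue
        new_centers.append(tuple(sum(p[d] for p in members) / len(members)
                                 for d in range(len(old))))
    return new_centers

def kmeans(points, centers, max_iter=100):
    """Alternate the two steps until the centers no longer change."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Made-up (saturation, value) pairs; initial means taken from Challenge 1.
points = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85),
          (0.8, 0.3), (0.9, 0.2), (0.85, 0.25)]
centers, labels = kmeans(points, [(0.4, 0.6), (0.6, 0.4)])
print(centers, labels)
```

Each point contributes to exactly one center in the update step, which is the hard assignment that the rest of this section contrasts with the probabilities used by EM.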

Within one iteration of the EM algorithm, we first compute, for each data point and each cluster, the probability that the data point comes from that cluster; then, for the distribution of each cluster, we update its parameters based on the probabilities from the previous step. Let us define a new set of probabilities $\hat{p}(j \mid x_i, \Theta^{\text{old}})$ for $i = 1, \ldots, N$ and $j = 1, \ldots, K$. We let

$$\hat{p}(j \mid x_i, \Theta^{\text{old}}) = \begin{cases} 1 & \text{if } j = \arg\max_s \, p(s \mid x_i, \Theta^{\text{old}}); \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$

Based on this new function, let us do Challenge 5 to see why the EM algorithm is a generalization of the K-means algorithm.

CHALLENGE 5.

(a) Suppose that in the E-step we replace $p(j \mid x_i, \Theta^{\text{old}})$ with $\hat{p}(j \mid x_i, \Theta^{\text{old}})$ and use $\hat{p}$ for the further optimization in the M-step, so that $\hat{p}(j \mid x_i, \Theta^{\text{old}})$ replaces $p(j \mid x_i, \Theta^{\text{old}})$ in formulas (2), (3), and (4). What do these new formulas tell you about the parameters computed in the M-step? In particular, can you tell which data points are used to update the parameters of the $j$th distribution?
(b) Using this new probability function $\hat{p}$ and your discovery in part (a), can you describe an iteration of the new EM algorithm in the sense of a K-means algorithm, like the description at the beginning of this section?

POINTER. For background on probability, please refer to a standard text such as [9]. Forsyth and Ponce [6, Chapters 16 and 17] give more description of K-means and EM image segmentation. A paper by Malik and his students [3] formulates the image segmentation problem using EM. The EM algorithm was explained and named in a classic paper by Arthur Dempster, Nan Laird, and Donald Rubin [5]. Bilmes [4] gives a very detailed description of the EM algorithm and discusses its application to Gaussian mixture and hidden Markov models. The derivation of EM in this project follows the derivation in that paper. For more information on the EFG project, please see [1] and [2]. See also http://herbarium.cs.columbia.edu/. For more information on HSV, please refer to the book by Gonzalez and Woods [7].

Bibliography

[1] Gaurav Agarwal, Haibin Ling, David Jacobs, Sameer Shirdhonkar, W. John Kress, Rusty Russell, Peter Belhumeur, An Dixit, Steve Feiner, Dhruv Mahajan, Kalyan Sunkavalli, Ravi Ramamoorthi, and Sean White. First steps toward an electronic field guide for plants. Taxon, 55.
[2] Peter N. Belhumeur, Daozheng Chen, Steven Feiner, David W. Jacobs, W. John Kress, Haibin Ling, Ida Lopez, Ravi Ramamoorthi, Sameer Sheorey, Sean White, and Ling Zhang. Searching the world's herbaria: A system for visual identification of plant species. In David A. Forsyth, Philip H. S. Torr, and Andrew Zisserman, editors, ECCV (4), volume 5305 of Lecture Notes in Computer Science. Springer.
[3] Serge Belongie, Chad Carson, Hayit Greenspan, and Jitendra Malik. Color- and texture-based image segmentation using EM and its application to content-based image retrieval.
[4] Jeff A. Bilmes. A gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Technical report.
[5] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1-38.
[6] David A. Forsyth and Jean Ponce. Computer Vision: A Modern Approach. Prentice Hall, August.
[7] Rafael C. Gonzalez and Richard E. Woods. Digital Image Processing (2nd Edition). Prentice Hall, January.
[8] Kangwon Lee. 2d histogram matrix.
[9] Alexander M. Mood, Franklin A. Graybill, and Duane C. Boes. Introduction to the Theory of Statistics. McGraw-Hill Companies.
[10] Dianne P. O'Leary. Scientific Computing with Case Studies. SIAM Press, Philadelphia.
[11] Linda G. Shapiro and George C. Stockman. Computer Vision. Prentice Hall, January 2001.


human vision: grouping k-means clustering graph-theoretic clustering Hough transform line fitting RANSAC COS 429: COMPUTER VISON Segmentation human vision: grouping k-means clustering graph-theoretic clustering Hough transform line fitting RANSAC Reading: Chapters 14, 15 Some of the slides are credited to:

More information

Texture. Texture is a description of the spatial arrangement of color or intensities in an image or a selected region of an image.

Texture. Texture is a description of the spatial arrangement of color or intensities in an image or a selected region of an image. Texture Texture is a description of the spatial arrangement of color or intensities in an image or a selected region of an image. Structural approach: a set of texels in some regular or repeated pattern

More information

Assignment 2. Unsupervised & Probabilistic Learning. Maneesh Sahani Due: Monday Nov 5, 2018

Assignment 2. Unsupervised & Probabilistic Learning. Maneesh Sahani Due: Monday Nov 5, 2018 Assignment 2 Unsupervised & Probabilistic Learning Maneesh Sahani Due: Monday Nov 5, 2018 Note: Assignments are due at 11:00 AM (the start of lecture) on the date above. he usual College late assignments

More information

More on Learning. Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization

More on Learning. Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization More on Learning Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization Neural Net Learning Motivated by studies of the brain. A network of artificial

More information

An EM-like algorithm for color-histogram-based object tracking

An EM-like algorithm for color-histogram-based object tracking An EM-like algorithm for color-histogram-based object tracking Zoran Zivkovic Ben Kröse Intelligent and Autonomous Systems Group University of Amsterdam The Netherlands email:{zivkovic,krose}@science.uva.nl

More information

9.913 Pattern Recognition for Vision. Class I - Overview. Instructors: B. Heisele, Y. Ivanov, T. Poggio

9.913 Pattern Recognition for Vision. Class I - Overview. Instructors: B. Heisele, Y. Ivanov, T. Poggio 9.913 Class I - Overview Instructors: B. Heisele, Y. Ivanov, T. Poggio TOC Administrivia Problems of Computer Vision and Pattern Recognition Overview of classes Quick review of Matlab Administrivia Instructors:

More information

Cluster Analysis. Jia Li Department of Statistics Penn State University. Summer School in Statistics for Astronomers IV June 9-14, 2008

Cluster Analysis. Jia Li Department of Statistics Penn State University. Summer School in Statistics for Astronomers IV June 9-14, 2008 Cluster Analysis Jia Li Department of Statistics Penn State University Summer School in Statistics for Astronomers IV June 9-1, 8 1 Clustering A basic tool in data mining/pattern recognition: Divide a

More information

CS 2750 Machine Learning. Lecture 19. Clustering. CS 2750 Machine Learning. Clustering. Groups together similar instances in the data sample

CS 2750 Machine Learning. Lecture 19. Clustering. CS 2750 Machine Learning. Clustering. Groups together similar instances in the data sample Lecture 9 Clustering Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Clustering Groups together similar instances in the data sample Basic clustering problem: distribute data into k different groups

More information

North Asian International Research Journal of Sciences, Engineering & I.T.

North Asian International Research Journal of Sciences, Engineering & I.T. North Asian International Research Journal of Sciences, Engineering & I.T. IRJIF. I.F. : 3.821 Index Copernicus Value: 52.88 ISSN: 2454-7514 Vol. 4, Issue-12 December-2018 Thomson Reuters ID: S-8304-2016

More information

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask Machine Learning and Data Mining Clustering (1): Basics Kalev Kask Unsupervised learning Supervised learning Predict target value ( y ) given features ( x ) Unsupervised learning Understand patterns of

More information

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering SYDE 372 - Winter 2011 Introduction to Pattern Recognition Clustering Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 5 All the approaches we have learned

More information

K-Means Clustering Using Localized Histogram Analysis

K-Means Clustering Using Localized Histogram Analysis K-Means Clustering Using Localized Histogram Analysis Michael Bryson University of South Carolina, Department of Computer Science Columbia, SC brysonm@cse.sc.edu Abstract. The first step required for many

More information

Latent Variable Models and Expectation Maximization

Latent Variable Models and Expectation Maximization Latent Variable Models and Expectation Maximization Oliver Schulte - CMPT 726 Bishop PRML Ch. 9 2 4 6 8 1 12 14 16 18 2 4 6 8 1 12 14 16 18 5 1 15 2 25 5 1 15 2 25 2 4 6 8 1 12 14 2 4 6 8 1 12 14 5 1 15

More information

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme, Nicolas Schilling

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme, Nicolas Schilling Machine Learning B. Unsupervised Learning B.1 Cluster Analysis Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University of Hildesheim,

More information

Content-based image and video analysis. Machine learning

Content-based image and video analysis. Machine learning Content-based image and video analysis Machine learning for multimedia retrieval 04.05.2009 What is machine learning? Some problems are very hard to solve by writing a computer program by hand Almost all

More information

Clustering & Dimensionality Reduction. 273A Intro Machine Learning

Clustering & Dimensionality Reduction. 273A Intro Machine Learning Clustering & Dimensionality Reduction 273A Intro Machine Learning What is Unsupervised Learning? In supervised learning we were given attributes & targets (e.g. class labels). In unsupervised learning

More information

Fall 09, Homework 5

Fall 09, Homework 5 5-38 Fall 09, Homework 5 Due: Wednesday, November 8th, beginning of the class You can work in a group of up to two people. This group does not need to be the same group as for the other homeworks. You

More information

A New Energy Model for the Hidden Markov Random Fields

A New Energy Model for the Hidden Markov Random Fields A New Energy Model for the Hidden Markov Random Fields Jérémie Sublime 1,2, Antoine Cornuéjols 1, and Younès Bennani 2 1 AgroParisTech, INRA - UMR 518 MIA, F-75005 Paris, France {jeremie.sublime,antoine.cornuejols}@agroparistech.fr

More information

Clustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford

Clustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford Department of Engineering Science University of Oxford January 27, 2017 Many datasets consist of multiple heterogeneous subsets. Cluster analysis: Given an unlabelled data, want algorithms that automatically

More information


PERSONALIZATION OF  MESSAGES PERSONALIZATION OF E-MAIL MESSAGES Arun Pandian 1, Balaji 2, Gowtham 3, Harinath 4, Hariharan 5 1,2,3,4 Student, Department of Computer Science and Engineering, TRP Engineering College,Tamilnadu, India

More information

6.801/866. Segmentation and Line Fitting. T. Darrell

6.801/866. Segmentation and Line Fitting. T. Darrell 6.801/866 Segmentation and Line Fitting T. Darrell Segmentation and Line Fitting Gestalt grouping Background subtraction K-Means Graph cuts Hough transform Iterative fitting (Next time: Probabilistic segmentation)

More information

Lecture 8: The EM algorithm

Lecture 8: The EM algorithm 10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 8: The EM algorithm Lecturer: Manuela M. Veloso, Eric P. Xing Scribes: Huiting Liu, Yifan Yang 1 Introduction Previous lecture discusses

More information

Unsupervised Learning. Clustering and the EM Algorithm. Unsupervised Learning is Model Learning

Unsupervised Learning. Clustering and the EM Algorithm. Unsupervised Learning is Model Learning Unsupervised Learning Clustering and the EM Algorithm Susanna Ricco Supervised Learning Given data in the form < x, y >, y is the target to learn. Good news: Easy to tell if our algorithm is giving the

More information

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014

More information

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme Machine Learning B. Unsupervised Learning B.1 Cluster Analysis Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University of Hildesheim, Germany

More information


ALTERNATIVE METHODS FOR CLUSTERING ALTERNATIVE METHODS FOR CLUSTERING K-Means Algorithm Termination conditions Several possibilities, e.g., A fixed number of iterations Objects partition unchanged Centroid positions don t change Convergence

More information

Computer vision: models, learning and inference. Chapter 13 Image preprocessing and feature extraction

Computer vision: models, learning and inference. Chapter 13 Image preprocessing and feature extraction Computer vision: models, learning and inference Chapter 13 Image preprocessing and feature extraction Preprocessing The goal of pre-processing is to try to reduce unwanted variation in image due to lighting,

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14

More information

CS 1675 Introduction to Machine Learning Lecture 18. Clustering. Clustering. Groups together similar instances in the data sample

CS 1675 Introduction to Machine Learning Lecture 18. Clustering. Clustering. Groups together similar instances in the data sample CS 1675 Introduction to Machine Learning Lecture 18 Clustering Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Clustering Groups together similar instances in the data sample Basic clustering problem:

More information

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010 STATS306B Jonathan Taylor Department of Statistics Stanford University June 3, 2010 Spring 2010 Outline K-means, K-medoids, EM algorithm choosing number of clusters: Gap test hierarchical clustering spectral

More information

Summer School in Statistics for Astronomers & Physicists June 15-17, Cluster Analysis

Summer School in Statistics for Astronomers & Physicists June 15-17, Cluster Analysis Summer School in Statistics for Astronomers & Physicists June 15-17, 2005 Session on Computational Algorithms for Astrostatistics Cluster Analysis Max Buot Department of Statistics Carnegie-Mellon University

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

CS Introduction to Data Mining Instructor: Abdullah Mueen

CS Introduction to Data Mining Instructor: Abdullah Mueen CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 8: ADVANCED CLUSTERING (FUZZY AND CO -CLUSTERING) Review: Basic Cluster Analysis Methods (Chap. 10) Cluster Analysis: Basic Concepts

More information

Speech Recognition Lecture 8: Acoustic Models. Eugene Weinstein Google, NYU Courant Institute Slide Credit: Mehryar Mohri

Speech Recognition Lecture 8: Acoustic Models. Eugene Weinstein Google, NYU Courant Institute Slide Credit: Mehryar Mohri Speech Recognition Lecture 8: Acoustic Models. Eugene Weinstein Google, NYU Courant Institute eugenew@cs.nyu.edu Slide Credit: Mehryar Mohri Speech Recognition Components Acoustic and pronunciation model:

More information

A Probabilistic Framework for Spatio-Temporal Video Representation & Indexing

A Probabilistic Framework for Spatio-Temporal Video Representation & Indexing A Probabilistic Framework for Spatio-Temporal Video Representation & Indexing Hayit Greenspan 1, Jacob Goldberger 2, and Arnaldo Mayer 1 1 Faculty of Engineering, Tel Aviv University, Tel Aviv 69978, Israel

More information

A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models

A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models A Visualization Tool to Improve the Performance of a Classifier Based on Hidden Markov Models Gleidson Pegoretti da Silva, Masaki Nakagawa Department of Computer and Information Sciences Tokyo University

More information

Normalized Texture Motifs and Their Application to Statistical Object Modeling

Normalized Texture Motifs and Their Application to Statistical Object Modeling Normalized Texture Motifs and Their Application to Statistical Obect Modeling S. D. Newsam B. S. Manunath Center for Applied Scientific Computing Electrical and Computer Engineering Lawrence Livermore

More information

Histograms. h(r k ) = n k. p(r k )= n k /NM. Histogram: number of times intensity level rk appears in the image

Histograms. h(r k ) = n k. p(r k )= n k /NM. Histogram: number of times intensity level rk appears in the image Histograms h(r k ) = n k Histogram: number of times intensity level rk appears in the image p(r k )= n k /NM normalized histogram also a probability of occurence 1 Histogram of Image Intensities Create

More information

Missing variable problems

Missing variable problems Missing variable problems In many vision problems, if some variables were known the maximum likelihood inference problem would be easy fitting; if we knew which line each token came from, it would be easy

More information

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin

Clustering K-means. Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, Carlos Guestrin Clustering K-means Machine Learning CSEP546 Carlos Guestrin University of Washington February 18, 2014 Carlos Guestrin 2005-2014 1 Clustering images Set of Images [Goldberger et al.] Carlos Guestrin 2005-2014

More information

Document image binarisation using Markov Field Model

Document image binarisation using Markov Field Model 009 10th International Conference on Document Analysis and Recognition Document image binarisation using Markov Field Model Thibault Lelore, Frédéric Bouchara UMR CNRS 6168 LSIS Southern University of

More information

Using Texture to Annotate Remote Sensed Datasets

Using Texture to Annotate Remote Sensed Datasets Using Texture to Annotate Remote Sensed Datasets S. Newsam, L. Wang, S. Bhagavathy, and B.S. Manjunath Electrical and Computer Engineering University of California at Santa Barbara {snewsam,lwang,sitaram,manj}@ece.ucsb.edu

More information

COMS 4771 Clustering. Nakul Verma

COMS 4771 Clustering. Nakul Verma COMS 4771 Clustering Nakul Verma Supervised Learning Data: Supervised learning Assumption: there is a (relatively simple) function such that for most i Learning task: given n examples from the data, find

More information

Computer Vision Lecture 6

Computer Vision Lecture 6 Course Outline Computer Vision Lecture 6 Segmentation Image Processing Basics Structure Extraction Segmentation Segmentation as Clustering Graph-theoretic Segmentation 12.11.2015 Recognition Global Representations

More information

Cluster Analysis. Debashis Ghosh Department of Statistics Penn State University (based on slides from Jia Li, Dept. of Statistics)

Cluster Analysis. Debashis Ghosh Department of Statistics Penn State University (based on slides from Jia Li, Dept. of Statistics) Cluster Analysis Debashis Ghosh Department of Statistics Penn State University (based on slides from Jia Li, Dept. of Statistics) Summer School in Statistics for Astronomers June 1-6, 9 Clustering: Intuition

More information