Introduction to Mobile Robotics

Introduction to Mobile Robotics: Clustering
Wolfram Burgard, Cyrill Stachniss, Giorgio Grisetti, Maren Bennewitz, Christian Plagemann

Clustering (1)
A common technique for statistical data analysis (machine learning, data mining, pattern recognition, ...)
Classification of a data set into subsets (clusters)
Ideally, the data in each subset share similar characteristics (proximity according to a distance function)

Clustering (2)
Needed: a distance (similarity / dissimilarity) function, e.g., the Euclidean distance
Clustering quality:
- inter-cluster distance maximized
- intra-cluster distance minimized
The quality depends on:
- the clustering algorithm
- the distance function
- the application (data)

Types of Clustering
Hierarchical Clustering
- Agglomerative Clustering (bottom-up)
- Divisive Clustering (top-down)
Partitional Clustering
- K-Means Clustering (hard & soft)
- Gaussian Mixture Models (EM-based)

K-Means Clustering
Partitions the data into k clusters (k is to be specified by the user)
Find k reference vectors $m_j$, $j = 1, \ldots, k$ which best explain the data X
Assign each data vector to the nearest (most similar) reference $m_i$:
$$\|x^t - m_i\| = \min_j \|x^t - m_j\|$$
where $x^t$ is an r-dimensional data vector in a real-valued space and $m_i$ is a reference vector (center of the cluster = mean)

Reconstruction Error (K-Means as Compression Alg.)
The total reconstruction error is defined as
$$E(\{m_i\}_{i=1}^{k} \mid X) = \sum_t \sum_i b_i^t \, \|x^t - m_i\|^2$$
with
$$b_i^t = \begin{cases} 1 & \text{if } \|x^t - m_i\| = \min_j \|x^t - m_j\| \\ 0 & \text{otherwise} \end{cases}$$
Find the reference vectors which minimize the error
Taking the derivative with respect to $m_i$ and setting it to 0 leads to
$$m_i = \frac{\sum_t b_i^t \, x^t}{\sum_t b_i^t}$$

K-Means Algorithm
Repeat until the assignments no longer change:
- Assign each $x^t$ to the closest cluster center
- Recompute the cluster centers $m_i$ using the current cluster memberships
(A minimal implementation sketch is given below.)
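
The following is a minimal NumPy sketch of the two alternating steps above; the function name, the initialization from randomly chosen data points, and the convergence test are illustrative choices, not taken from the lecture.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal hard k-means sketch: alternate assignment and mean update."""
    rng = np.random.default_rng(seed)
    # Initialize the k reference vectors with randomly chosen data points
    m = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Assignment step: b[t] is the index of the closest center for x^t
        dists = np.linalg.norm(X[:, None, :] - m[None, :, :], axis=2)
        b = dists.argmin(axis=1)
        # Update step: each center becomes the mean of the points assigned to it
        new_m = np.array([X[b == i].mean(axis=0) if np.any(b == i) else m[i]
                          for i in range(k)])
        if np.allclose(new_m, m):
            break
        m = new_m
    return m, b

# Example usage on synthetic 2-D data with three well-separated groups
X = np.vstack([np.random.randn(50, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
centers, labels = kmeans(X, k=3)
```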

K-Means Example
(Figure; image source: Alpaydin, Introduction to Machine Learning)

Strengths of K-Means
Easy to understand and to implement
Efficient: O(nkt) with n = #iterations, k = #clusters, t = #data points
Converges to a local optimum (the global optimum is hard to find)
The most popular clustering algorithm

Weaknesses of K-Means
The user needs to specify the number of clusters k
Sensitive to initialization (strategy: use different seeds)
Sensitive to outliers, since all data points contribute equally to the mean (strategy: try to eliminate outliers)

Soft Assignments
So far, each data point was assigned to exactly one cluster
A variant called soft k-means allows for fuzzy assignments
Data points are assigned to clusters with certain probabilities

Soft K-Means Clustering
Each data point is given a soft assignment to all means
β is a stiffness parameter and plays a crucial role
The means are updated using these soft assignments
Repeat the assignment and update steps until the assignments do not change anymore
(The update equations are not included in this transcription; a standard formulation is given below.)
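
The following is the standard soft k-means formulation consistent with the description above; the responsibility symbol $r_i^t$ and the squared-distance form inside the exponentials are assumptions on my part, not taken from the slides.

```latex
% Soft assignment (responsibility) of mean m_i for data point x^t,
% controlled by the stiffness parameter \beta:
r_i^t = \frac{\exp\left(-\beta\,\|x^t - m_i\|^2\right)}
             {\sum_j \exp\left(-\beta\,\|x^t - m_j\|^2\right)}

% Update step: each mean becomes the responsibility-weighted average of the data:
m_i = \frac{\sum_t r_i^t \, x^t}{\sum_t r_i^t}
```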

Soft K-Means Clustering
Points between clusters get assigned to both of them
Points near the cluster boundaries play a partial role in several clusters
Additional parameter β
Clusters with varying shapes can be treated in a probabilistic framework (mixtures of Gaussians)

Similarity of Soft K-Means and Expectation Maximization (EM)
Goal of EM: find the component parameters that maximize the likelihood of the data set
Two sets of random variables:
- the observed data set d
- hidden variables c (the assignment of data points to clusters)
Since the complete data likelihood cannot be determined, work with its expectation

Expected Data Likelihood
Observed data d
Correspondence variables c (hidden)
Joint likelihood of d and c given the model θ
Since the values of c are hidden, optimize the expected log likelihood

Expectation Maximization
Optimizing the expected log likelihood is usually not easy
EM iteratively maximizes log likelihood functions
EM generates a sequence of models with increasing log likelihood

Expectation Maximization
Use the so-called Q-function to find the model with the maximum expected data likelihood (see the sketch below)
Estimation (E) step:
- Compute the expected values for c given the current model
- Define the expected data log likelihood as a function of θ
Maximization (M) step:
- Maximize this expected likelihood
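
The Q-function itself is not reproduced in this transcription; the standard form, consistent with the E- and M-step description above, is sketched here. The superscript $[i]$ for the current model is my notation.

```latex
% Q-function: expected data log likelihood of a candidate model \theta,
% where the expectation over the hidden correspondences c is taken
% under the current model \theta^{[i]} and the observed data d
Q(\theta \mid \theta^{[i]}) = E_{c}\left[\, \ln p(d, c \mid \theta) \mid \theta^{[i]}, d \,\right]

% M-step: the next model maximizes the Q-function
\theta^{[i+1]} = \arg\max_{\theta} \, Q(\theta \mid \theta^{[i]})
```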

Iterating E and M Steps
Start with a random initial model
E-step: compute the expectations given the current model
M-step: compute the new model components given the expectations
Iterate both steps until convergence

Application: Trajectory Clustering
How can we learn typical motion patterns of people from observations?

Application: Trajectory Clustering
Input: a set of trajectories $d_1, \ldots, d_I$ with $d_i = \{x_i^1, x_i^2, \ldots, x_i^T\}$
(Figure: example environment with the locations entrance, sofa, TV, office, piano, kitchen, dining)

What We Are Looking For
A clustering of similar trajectories into motion patterns $\theta_1, \ldots, \theta_M$ (note: from now on, M = #clusters)
Binary correspondence variables $c_{im}$ indicating which trajectory $d_i$ belongs to which motion pattern $\theta_m$
Problem: how can we estimate the $c_{im}$?

Motion Patterns
Use T Gaussians with fixed variance to represent each motion pattern
If we knew the values of the $c_{im}$, the computation of the motion patterns would be easy
But: these values are hidden
Use Expectation Maximization to compute
- the expected values of the $c_{im}$
- the model θ (i.e., the set of motion patterns) which has the highest expected data likelihood

Likelihood of Trajectory $d_i$ under Motion Pattern $\theta_m = \{\theta_m^1, \ldots, \theta_m^T\}$
Each factor is the probability that the person is at location $x_i^t$ after t observations, given that it is engaged in motion pattern $\theta_m$
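
The likelihood formula itself is not in this transcription. Given that each motion pattern is represented by T Gaussians with fixed variance (one per time step), a consistent form is sketched below; the variance symbol $\sigma^2$ is an assumption.

```latex
% Likelihood of trajectory d_i = {x_i^1, ..., x_i^T} under motion pattern
% \theta_m = {\theta_m^1, ..., \theta_m^T}, with one fixed-variance Gaussian
% per time step
p(d_i \mid \theta_m) = \prod_{t=1}^{T} \mathcal{N}\left(x_i^t \,;\, \theta_m^t, \sigma^2\right)
```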

Data Likelihood
Joint likelihood of a single trajectory and its correspondence vector
Expected data log likelihood

Q-Function
(Expectation is a linear operator, so it can be moved inside the sum)

E-Step: Compute the Expectations Given the Current Model
Apply Bayes' rule with a uniform prior and a normalizer to obtain the likelihood that the i-th trajectory belongs to the m-th model component
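
The E-step equation is missing from this transcription. With a uniform prior over the motion patterns, the expectation consistent with the labels above (Bayes' rule, uniform prior, normalizer) would be:

```latex
% Expected value of the correspondence variable c_{im}: the posterior
% probability that trajectory d_i belongs to motion pattern \theta_m,
% assuming a uniform prior over the M patterns
E[c_{im}] = \frac{p(d_i \mid \theta_m)}{\sum_{m'=1}^{M} p(d_i \mid \theta_{m'})}
```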

M-Step: Maximize the Expected Likelihood
Compute the partial derivative of the expected log likelihood with respect to the model parameters $\theta_m^t$ and set it to zero

M-Step: Maximize the Expected Likelihood
The resulting update is the mean update of soft k-means: each $\theta_m^t$ becomes the expectation-weighted mean of the corresponding trajectory points (see the sketch below)
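
A compact NumPy sketch of the complete E/M loop for trajectory clustering follows, assuming 2-D positions, a fixed standard deviation sigma, and initialization from randomly chosen trajectories; none of these choices, nor the function name, are taken from the lecture.

```python
import numpy as np

def em_trajectory_clustering(D, M, sigma=1.0, n_iters=50, seed=0):
    """Illustrative EM for trajectory clustering.

    D: (I, T, 2) array of I trajectories, each with T observed 2-D positions.
    M: number of motion patterns.
    Returns the motion patterns theta (M, T, 2) and the expectations E[c_im] (I, M).
    """
    rng = np.random.default_rng(seed)
    I, T, _ = D.shape
    # Initialize the motion patterns with randomly chosen trajectories
    theta = D[rng.choice(I, size=M, replace=False)].astype(float)
    for _ in range(n_iters):
        # E-step: log-likelihood of each trajectory under each motion pattern
        # (product of T fixed-variance Gaussians, evaluated in log space)
        sq = ((D[:, None, :, :] - theta[None, :, :, :]) ** 2).sum(axis=(2, 3))
        loglik = -sq / (2.0 * sigma ** 2)
        # Posterior expectations E[c_im] with a uniform prior over the patterns
        loglik -= loglik.max(axis=1, keepdims=True)   # for numerical stability
        c = np.exp(loglik)
        c /= c.sum(axis=1, keepdims=True)
        # M-step: each theta_m^t is the expectation-weighted mean of the x_i^t,
        # i.e., the soft k-means mean update applied per time step
        theta = (c.T[:, :, None, None] * D[None, :, :, :]).sum(axis=1)
        theta /= c.sum(axis=0)[:, None, None]
    return theta, c
```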

EM Application Example: 9 Trajectories of 3 Motion Patterns
(Figure: trajectories between the locations A, B, and C)

EM: Example (step 1)
(Figure: the matrix of expectations $c_{im}$, with the trajectories as rows and the motion patterns A->B, C->A, A->C as columns, next to the current model $\theta^1$; the highlighted entry is the likelihood that the first trajectory belongs to the red motion pattern)

EM: Example (steps 2-9)
(Figures: the expectations $c_{im}$ and the models $\theta^2, \ldots, \theta^9$ after each further EM iteration)

Estimating the Number of Model Components
After convergence of EM, check whether the model can be improved
- by introducing a new model component for the trajectory which has the lowest likelihood, or
- by eliminating the model component which has the lowest utility
Select the model θ which has the highest evaluation under the Bayesian Information Criterion [Schwarz, 1978], where M = #model components and I = #trajectories (see the sketch below)
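
The evaluation formula is missing from this transcription. A BIC-style criterion consistent with the quantities named above (expected data log likelihood penalized by model complexity in M and I) would look like the following; the exact constants are an assumption.

```latex
% BIC-style evaluation of a model \theta with M components, fitted to
% I trajectories: expected data log likelihood minus a complexity penalty
E(\theta \mid d) = E_c\left[\ln p(d, c \mid \theta)\right] - \frac{M}{2}\,\ln I
```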

Application Example
(Figures)

Learned Motion Patterns
(Figures)

Summary
K-Means is the most popular clustering algorithm
It is efficient and easy to implement
It converges to a local optimum
A variant of hard k-means exists that allows soft assignments
Soft k-means corresponds to the EM algorithm, which is a general optimization procedure

Further Reading
E. Alpaydin: Introduction to Machine Learning