Unsupervised Learning


1 Unsupervised Learning

2 Unsupervised learning Until now, we have assumed that our training samples are labeled by their category membership. Methods that use labeled samples are said to be supervised. However, there are problems where the definition of the classes, and even their number, is unknown. Machine learning methods that deal with such data are said to be unsupervised. Questions: Why would one even be interested in learning from unlabeled samples? Is it even possible, in principle, to learn anything of value from unlabeled samples?

3 Why unsupervised learning 1. Collecting and labeling a large set of sample patterns can be surprisingly costly. E.g., videos are virtually free, but accurately labeling the video pixels is expensive and time-consuming. 2. To extend a small training set by semi-supervised learning: train a classifier on a small set of labeled samples, then tune it to run without supervision on a large, unlabeled set. Or, in the reverse direction, let a large set of unlabeled data group automatically, then label the groupings found. 3. To detect the gradual change of patterns over time. 4. To find features that will then be useful for categorization. 5. To gain insight into the nature or structure of the data during the early stages of an investigation.

4 Unsupervised learning: clustering In practice, unsupervised learning methods implement what is usually referred to as data clustering. Qualitatively and generally, the problem of data clustering can be defined as: Grouping of objects into meaningful categories Given a representation of N objects, find k clusters based on a measure of similarity.

5 Data clustering The problem can be tackled from several points of view: Statistics: represent the density function of all data as a mixture of a number of different distributions, p(y) = Σ_i w_i p(y | ω_i), and fit the set of weights w_i and component densities p(y | ω_i) to the given data.
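As a minimal illustration of this statistical view (not part of the original slides; the number of components and the synthetic data are assumptions for the example), a Gaussian mixture can be fit to unlabeled data with scikit-learn:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic 2-D data drawn from three clouds (an assumption for the example)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2))
               for c in ([0, 0], [4, 0], [2, 3])])

gm = GaussianMixture(n_components=3, random_state=0).fit(X)
print(gm.weights_)      # fitted mixture weights w_i
print(gm.means_)        # fitted component means
labels = gm.predict(X)  # hard assignment of each sample to a component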

6 Data clustering The problem can be tackled from several points of view: Geometry/topology: partition the pattern space such that data belonging to each partition are highly homogeneous (i.e., they are similar to one another). More directly related with classification: group (label) data such that the average intra-group distance is minimized and the average inter-group distance is maximized (yet another optimization problem!)

7 Data clustering Why data clustering? Natural Classification: degree of similarity among forms. Data exploration: discover underlying structure, generate hypotheses, detect anomalies. Compression: for organizing/indexing/storing/broadcasting data. Applications: can be used by any scientific field that collects data! Relevance: 2340 papers about data clustering indexed in Scopus in 2014!

8 Data clustering: examples

9 Data clustering Given a set of N unlabeled examples D = {x_1, x_2, ..., x_N} in a d-dimensional feature space, D is partitioned into a number of disjoint subsets D_j: D = ∪_{j=1..k} D_j, with D_i ∩ D_j = ∅ for i ≠ j, where the points in each subset are similar to each other according to a given criterion.

10 Data clustering A partition is denoted by π = (D_1, D_2, ..., D_k) and the problem of data clustering is thus formulated as π* = argmin_π f(π), where the criterion function f(·) is formulated according to the chosen measure of similarity.

11 Data clustering A general optimization (minimization) algorithm for a classification function J(Y, W) (Y being the dataset and W the ordered set of labels assigned to each sample) can be described as follows: choose an initial classification W_0; repeat (step i): change the classification such that J decreases; until the classification is the same as in the previous step. If the variables were continuous, a gradient method could be used.

12 Data clustering A reasonable algorithm (based on the simplifying assumption that the optimization problem is separable, i.e., that the minimum of an n-dimensional function can be found by minimizing it along each dimension separately) would assign to each sample the label that causes the largest decrease ΔJ. NB: since the problem is not separable, there is no guarantee that J decreases as the sum of the individual ΔJ's; it may even increase! A better but slower solution, which guarantees monotonicity, is to change, in each step, only the single label that causes the greatest decrease ΔJ (see the sketch below).
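A hedged sketch of the slower, monotonic variant: at each step, apply only the single label change that yields the largest decrease of J. Here J is taken to be the k-means-style sum of squared distances to the cluster means; the function names and the choice of J are assumptions for the illustration, and the brute-force re-evaluation of J is for clarity only, not efficiency.

import numpy as np

def J(X, labels, k):
    # Sum of squared distances of each sample to its cluster mean
    total = 0.0
    for c in range(k):
        members = X[labels == c]
        if len(members):
            total += ((members - members.mean(axis=0)) ** 2).sum()
    return total

def greedy_relabel(X, labels, k):
    # Apply the single best label change per step; stop at a local minimum.
    labels = labels.copy()
    while True:
        base = J(X, labels, k)
        best_delta, best_move = 0.0, None
        for i in range(len(X)):
            old = labels[i]
            for c in range(k):
                if c == old:
                    continue
                labels[i] = c                    # tentatively relabel sample i
                delta = J(X, labels, k) - base
                if delta < best_delta:           # strictly decreases J
                    best_delta, best_move = delta, (i, c)
            labels[i] = old                      # undo the tentative change
        if best_move is None:                    # no move decreases J
            return labels
        labels[best_move[0]] = best_move[1]      # commit the best move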

13 K-means clustering K-means clustering is obtained by choosing the Euclidean distance as the similarity criterion and J = Σ_{k=1..Nc} Σ_{y^(i)~ω_k} ||y^(i) − m_k||² as the function to be optimized, where y^(i) is the i-th sample, m_k is the center of cluster k, and y^(i) ~ ω_k denotes the samples y^(i) assigned to cluster k. J is minimized by choosing m_k as the sample mean of the data having label ω_k.

14 K-means clustering Randomly initialize the means {m_i}; repeat: classify the N samples according to the nearest m_i; recompute each m_i as the mean of the samples assigned to it; until there is no change in {m_i}.
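A minimal NumPy sketch of the loop above; k, the initialization scheme and the iteration cap are assumptions for the example, not part of the slides:

import numpy as np

def kmeans(X, k, rng=np.random.default_rng(0), max_iter=100):
    # Randomly initialize the means with k distinct samples
    m = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iter):
        # Classify each sample according to the nearest mean m_i
        d = np.linalg.norm(X[:, None, :] - m[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each m_i as the mean of the samples assigned to it
        new_m = np.array([X[labels == c].mean(axis=0) if (labels == c).any()
                          else m[c] for c in range(k)])
        if np.allclose(new_m, m):   # no change in {m_i}: converged
            return new_m, labels
        m = new_m
    return m, labels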


24 Improving on k-means The main problem with k-means is the need to set the number of desired clusters a priori. A large number of algorithms have been proposed that overcome this limitation by determining an optimal number of clusters at runtime. The basic idea behind these algorithms is splitting and merging: a cluster is split into two clusters when a measure of homogeneity falls below a certain threshold; two clusters are merged into one when a separation measure falls below a certain threshold.

25 Isodata N_D = approximate (desired) number of clusters; T = threshold on the number of samples in a cluster. Set Nc = N_D. 1. Cluster the data into Nc clusters, eliminating data and clusters with fewer than T members and decreasing Nc accordingly. Exit if the classification has not changed. 2. If Nc ≤ N_D/2 or (Nc < 2 N_D and the iteration is odd): a. Split clusters whose samples are sufficiently spread out and increase Nc accordingly. b. If any clusters have been split, go to step 1. 3. Merge any pair of clusters whose samples are sufficiently close and/or overlapping and decrease Nc accordingly. 4. Go to step 1.

26 Isodata (cluster computation) 1. Cluster the data into Nc clusters, eliminating data and clusters with fewer than T members and decreasing Nc accordingly. Exit if the classification has not changed. For each cluster the following quantities are computed: d_k = (1/N_k) Σ_{y^(i)~ω_k} ||y^(i) − m_k|| (average distance of samples from the mean for cluster k); σ_k² = max_j (1/N_k) Σ_{y^(i)~ω_k} (y_j^(i) − m_kj)² (largest variance along the coordinate axes); d̄ = (1/N) Σ_{k=1..Nc} N_k d_k (overall average distance of samples); N_k = number of samples in cluster k.

27 Isodata (split) σ_s² = maximum spread parameter for splitting. For k = 1, ..., Nc: if σ_k² > σ_s² and (d_k > d̄ and (N_k > 2T + 1 or Nc ≤ N_D/2 or (Nc < 2 N_D and the iteration is odd))): split cluster k and increase Nc accordingly. Splitting means replacing the original center with two new centers displaced slightly (usually by a fraction of σ_m) in opposite directions along the axis m of largest variance.

28 Isodata (merge) D_m = maximum distance separation for merging; N_max = maximum number of cluster pairs that can be merged. For all pairs i, j = 1, ..., Nc, i ≠ j: i. compute d_ij = ||m_i − m_j||; ii. sort the d_ij < D_m in ascending order. For all sorted d_ij, if neither cluster i nor cluster j has already been merged, and while the number of merges < N_max: merge clusters i and j and decrease Nc accordingly. The center m of the merged cluster is computed as m = (N_i m_i + N_j m_j)/(N_i + N_j).
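A hedged sketch of this merge step (the function name, the list-based data layout and the bookkeeping are assumptions for the illustration, not the original ISODATA code):

import numpy as np
from itertools import combinations

def isodata_merge(m, N, D_m, N_max):
    # m: list of cluster centers (1-D NumPy arrays); N: list of sample counts
    m, N = list(m), list(N)
    # i. Compute all pairwise center distances d_ij
    pairs = [(np.linalg.norm(m[i] - m[j]), i, j)
             for i, j in combinations(range(len(m)), 2)]
    # ii. Keep only d_ij < D_m, sorted in ascending order
    pairs = sorted(p for p in pairs if p[0] < D_m)
    touched, dead, merges = set(), set(), 0
    for _, i, j in pairs:
        if merges >= N_max:
            break
        if i in touched or j in touched:   # neither cluster may merge twice
            continue
        # New center: count-weighted mean of the two old centers
        m[i] = (N[i] * m[i] + N[j] * m[j]) / (N[i] + N[j])
        N[i] = N[i] + N[j]
        touched |= {i, j}
        dead.add(j)
        merges += 1
    # Drop the absorbed clusters, decreasing Nc accordingly
    m = [m[k] for k in range(len(m)) if k not in dead]
    N = [N[k] for k in range(len(N)) if k not in dead]
    return m, N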

29 Data clustering questions 1. What is a cluster? 2. How to define pair-wise similarity? 3. Which features and normalization scheme? 4. How many clusters? 5. Which clustering method? 6. Are the discovered clusters and partition valid? 7. Does the data have any clustering tendency?

30 Similarity Compact Clusters Within-cluster distance < between-cluster distance Connected Clusters Within-cluster connectivity > between-cluster connectivity Ideal cluster: compact and isolated.

31 Representation There's no universal representation; representation is domain dependent.

32 Representation quality A good representation leads to compact and isolated clusters.

33 Feature relevance Two different meaningful groupings produced by different weighting schemes.

34 Number of clusters? The samples are generated by 6 independent classes, yet different choices of K produce different partitions (figure: ground truth, K=2, K=5, K=6).

35 Meaning/validity of clusters Clustering algorithms find clusters even if there are no natural clusters in the data (figure: uniformly distributed data points clustered by k-means with k=3).

36 Clustering methods: which is the best?

37 There is no best clustering algorithm! Each algorithm imposes a structure on the data; a good fit between model and data leads to success (figure: GMM with K=3, GMM with K=2, spectral clustering with K=3, spectral clustering with K=2).

38 References C. W. Therrien, Decision, Estimation, and Classification. J. Corso and A. Chen, Clustering / Unsupervised Methods. A. K. Jain and R. C. Dubes, Algorithms for Clustering Data, Prentice Hall, 1988. Map of Science, Nature, 2006. D. Arthur and S. Vassilvitskii, k-means++: The Advantages of Careful Seeding. R. Dubes and A. K. Jain, Clustering Techniques: The User's Dilemma, Pattern Recognition, 1976.

39 Unsupervised Learning with Neural Networks

40 Clustering and Neural Networks Neural networks trained by an unsupervised learning algorithm can also be used for tasks such as: clustering; feature extraction; compression (pattern-space dimensionality reduction, vector quantization); etc.

41 Clustering and Neural Networks Regarding clustering, the Self-Organizing Maps theorized by Teuvo Kohonen in the late '80s, despite being modeled after some properties of the cerebral cortex, can be shown to be equivalent, under certain conditions, to the k-means algorithm. A SOM can also be seen as an implementation of vector quantization, and can lead to the definition of a supervised learning algorithm which is rather fast and extremely easy to implement.

42 Unsupervised learning Kohonen's Self-Organizing Map (SOM) Homunculus Biological Model Mappings (projections) of sensory stimuli onto specific nets of cortical neurons can be observed in the cerebral cortex. Sensory-motor neurons form a distorted map of the human body surface, as the extension of each region is proportional to the sensitivity of the corresponding body area, not to its size. However, adjacent parts of the cortex correspond to body areas which are also adjacent.

43 Unsupervised learning Kohonen's Self-Organizing Map (SOM) Lateral interactions between neurons: short-range excitatory action (of the order of mm); inhibitory action at intermediate distances (up to a few mm); weak long-range excitatory action (up to a few cm). The resulting interaction profile can be approximated by the Mexican hat function.

44 Unsupervised learning Kohonen's Self-Organizing Map (SOM) Kohonen SOMs are sensory maps made up of a single layer of neurons, each of which becomes specialized in responding to specific stimuli, such that: different types of inputs (stimuli) activate different neurons; neighboring neurons respond to similar stimuli.

45 Unsupervised learning Kohonen's Self-Organizing Map (SOM) A single layer of w·h neurons n_i, i = 1, ..., w·h (w = width, h = height). Each input X = {x_j, j = 1, ..., N} is connected to all neurons (therefore each neuron has N connections). Each connection is associated with a weight w_ij (i = neuron, j = input dimension). Each neuron's weight set is isomorphic with the input patterns. Activation function: f_i ∝ 1/d(W_i, X), where d is a distance. Lateral interactions between neurons exist, whose strength depends on the distance between neurons according to the Mexican-hat function.

46 Unsupervised learning Kohonen's Self-Organizing Map (SOM) When a pattern is input, each neuron's weights are modified: in an excitatory way, by an amount proportional to the value of its own and its neighbors' activation functions, and inversely proportional to their distance; in an inhibitory way, by an amount proportional to the value of the activation function of the neurons outside its neighborhood, and inversely proportional to their distance. This means that when, after the weight update, the same input is given to the net: strongly activated neurons and their neighbors will be activated even more intensely; weakly activated ones will be activated even less.

47 Unsupervised learning Kohonen's Self-Organizing Map (SOM) Remember that a neuron's activation is inversely proportional to the distance between its weights and the input pattern, and that the weight sets associated with neurons can be seen as points in the pattern space, being isomorphic with it. This implies that weights are updated such that, each time a pattern is input, weight sets that are close to the input pattern move even closer to it, and weight sets that are far from it move even farther away. If data well distributed over the input space are input in succession, each neuron's weight set tends to move to areas where data are more densely present, which means that the corresponding neuron specializes and tends to be activated by data belonging to a specific partition of the pattern space.

48 Unsupervised learning Kohonen's Self-Organizing Map (SOM) If we look at this process in the pattern space, this means that the weight set of each neuron tends to become the center of a data cluster. Also, due to the excitatory effect of the lateral connections, neighboring neurons tend to be activated by similar inputs. Thus, weight sets of neurons which are physically close in the map tend to be close in the pattern space. Because of this, the Self-Organizing Map (SOM) is also frequently called a Topology Preserving Map (TPM).

49 Unsupervised learning Kohonen's Self-Organizing Map (SOM) Once a SOM is trained, it can act as a classifier according to a minimum-distance criterion: L = argmin_i ||x − w_i||, L being a label (neuron index), x the input pattern and w_i the weight vector of neuron i. Thus, each input pattern is associated with the coordinates of the neuron onto which it is projected, i.e., the coordinates of the neuron with the highest activation (activation i being inversely proportional to the distance ||x − w_i||). In practice, the input space is projected onto the neuron layer, causing a dimensionality reduction of the data from N (input size) to m (size of the map), but the topological relationships among data are preserved.

50 Unsupervised learning Kohonen's Self-Organizing Map (SOM) In practice, a trained SOM partitions the input space S into as many subspaces as the number of neurons (as k-means does). Each subspace s_i of S is defined as: s_i = {x_j ∈ S : d(x_j, w_i) = min_t d(x_j, w_t)}. This induces a so-called Voronoi tessellation of the input space (example limited to 2D patterns); a minimal sketch of this assignment follows.
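An illustrative sketch (not from the slides) of assigning patterns to the Voronoi cell of the nearest weight vector; X and W are assumed to be NumPy arrays of shape (n_patterns, d) and (n_neurons, d) respectively:

import numpy as np

def voronoi_assign(X, W):
    # Distance of every pattern to every neuron's weight vector
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    # Each pattern falls in the cell s_i of the closest w_i
    return d.argmin(axis=1)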

51 Unsupervised learning Kohonen's Self-Organizing Map (SOM) For a more efficient computer implementation, the model is simplified: 1. Only the weights of the neighbors of the most activated neuron (a.k.a. the winning neuron; such algorithms are usually referred to as competitive learning algorithms) are updated, all by the same rule, since only the excitatory lateral interactions within a small neighborhood of the winning neuron are taken into account. 2. The update rule w_j(t+1) = w_j(t) + α(x_i − w_j(t)) does not depend on the distance from the winning neuron. NB: modifying weights in an excitatory way means making them more similar to the input (activation increases); modifying them in an inhibitory way means making them less similar to the input pattern (activation decreases).

52 Unsupervised learning Kohonen's Self-Organizing Map (SOM) The Mexican-hat function which models lateral interactions between neurons is approximated by a box function (a). The inhibitory long-range connections may also be modeled, applying the rule w_j(t+1) = w_j(t) − α(x_i − w_j(t)) within an external neighborhood (b). The neighborhood is usually square, but it can be of any shape (hexagonal ones are also common).

53 Unsupervised learning Kohonen's Self-Organizing Map (SOM) α = C (α = learning rate, C a small positive constant << 1). Given a training set X = {x_i, x_i = (x_i1, x_i2, ..., x_im), i = 1, ..., N}: initialize the weights with values compatible with the data (or randomly); then, for each training pattern x_i: 1. Find the winning neuron n_w. 2. Modify the weights of the winning neuron, as well as the weights of the neurons located within its neighborhood I(n_w) in the map, as follows: w_j(t+1) = w_j(t) + α(x_i − w_j(t)). 3. Update the learning rate: α(t+1) = α(t)(1 − γ) [γ a small positive constant << 1]. Repeat until the weights reach stable values or α = 0.
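A hedged NumPy sketch of this simplified training loop. The map size, the fixed square neighborhood radius, the epoch count and the decay constants C and gamma are all assumptions for the example:

import numpy as np

def train_som(X, width=10, height=10, C=0.5, gamma=1e-3, n_epochs=20,
              radius=1, rng=np.random.default_rng(0)):
    n_features = X.shape[1]
    # One weight vector per neuron, initialized within the data range
    W = rng.uniform(X.min(0), X.max(0), size=(height, width, n_features))
    alpha = C
    for _ in range(n_epochs):
        for x in X[rng.permutation(len(X))]:
            # 1. Winning neuron: smallest distance between x and the weights
            d = np.linalg.norm(W - x, axis=2)
            wi, wj = np.unravel_index(d.argmin(), d.shape)
            # 2. Update the winner and its square neighborhood I(n_w)
            i0, i1 = max(0, wi - radius), min(height, wi + radius + 1)
            j0, j1 = max(0, wj - radius), min(width, wj + radius + 1)
            W[i0:i1, j0:j1] += alpha * (x - W[i0:i1, j0:j1])
            # 3. Decay the learning rate: alpha(t+1) = alpha(t) * (1 - gamma)
            alpha *= (1 - gamma)
            if alpha <= 0:
                return W
    return W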

54 Unsupervised learning Kohonen's Self-Organizing Map (SOM) For the sake of simplicity the neighborhood is usually square, but it can be of any shape (hexagonal ones are also common). Sometimes the inhibitory long-range connections are also modeled, applying the rule w_j(t+1) = w_j(t) − α(x_i − w_j(t)) within an external neighborhood.

55 Unsupervised learning Kohonen's Self-Organizing Map (SOM): summary Kohonen SOMs cluster data, i.e., they identify clouds of data which are densely distributed in specific regions, according to which the input space can be partitioned. As in the k-means algorithm, each partition can be represented by the centroid of the cloud: in a SOM it corresponds to the weights associated with a neuron of the map. Conversely, each weight vector associated with a neuron of a trained SOM is a centroid of a data cloud. It is possible to classify data a posteriori based on the partition of the input space to which they belong. Supposing we have a labeled data set, each partition induced by the SOM can be labeled after the class for which the corresponding neuron is most frequently the winning neuron.

56 Unsupervised learning Kohonen's Self-Organizing Map (SOM): summary In practice, to label the neurons of a SOM a posteriori according to a labeled data set X (stacked generalization/majority vote): label(i) = argmax(histogram(L, winner(X, w_i))), where X is the dataset represented as a matrix (each row x_i is a pattern); W are the SOM weights represented as a matrix (row w_i is the weight set of the i-th neuron); histogram(L, v) is the frequency histogram of vector v, whose elements may take L different values (from 1 to L); winner(X, w_i) returns a vector J having the same number of elements as the rows of X (i.e., the number of data in the dataset), where element J_i is the label of the winning neuron when x_i is input.
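A hedged sketch of this majority-vote labeling. W is assumed to be a (n_neurons, d) weight matrix (one row per neuron, flattened map), X a (n_samples, d) data matrix and y the non-negative integer class labels of X; all names are assumptions for the example:

import numpy as np

def label_neurons(W, X, y, n_classes):
    # winner(X, W): index of the winning (nearest) neuron for every sample
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    winners = d.argmin(axis=1)
    labels = np.zeros(len(W), dtype=int)
    for i in range(len(W)):
        # Frequency histogram of the classes that win on neuron i
        hist = np.bincount(y[winners == i], minlength=n_classes)
        labels[i] = hist.argmax()   # majority class (0 if neuron never wins)
    return labels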

57 Unsupervised learning Kohonen's Self-Organizing Map (SOM): observations With respect to k-means, a SOM adds an ordering of the centroids which preserves the topological properties of the input space, of which the SOM can be considered a lower-dimensional projection. The fact that neighboring neurons respond to similar (neighboring) patterns creates a lattice whose nodes are located where data are actually found and whose arcs never cross, as if a net were cast over the pattern space and its nodes were orderly anchored in the middle of regions where data are densely present. In a way, it creates a distorted re-projection of the data onto a lower-dimensional space where, however, data are (not strictly, but surely more) uniformly distributed.

58 Unsupervised learning Kohonen's Self-Organizing Map (SOM): observations It can be demonstrated (Bishop) that the algorithm for training a SOM with a neighborhood radius decreasing with time is equivalent to the k-means algorithm. The weight update equation can be derived as the (gradient-based) solution of the vector quantization problem: given K, find a set of K centroids (a codebook, in telecommunications terms; each centroid is then termed a codebook vector) that minimizes the squared error (actually, this holds for any exponent) made when each sample in a (large) dataset is replaced by the closest centroid.
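An illustrative sketch (not from the slides) of the quantization error that a codebook of K centroids incurs when each sample is replaced by the closest codebook vector; the function name and array shapes are assumptions:

import numpy as np

def quantization_error(X, codebook):
    # Distance of each sample to every codebook vector
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    # Mean squared error of replacing each sample with its nearest centroid
    return (d.min(axis=1) ** 2).mean()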
