Cluster analysis. Agnieszka Nowak-Brzezinska



Outline of lecture What is cluster analysis? Clustering algorithms Measures of Cluster Validity

What is Cluster Analysis? Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups Intra-cluster distances are minimized Inter-cluster distances are maximized

Applications of Cluster Analysis Understanding Group genes and proteins that have similar functionality, or group stocks with similar price fluctuations Summarization Reduce the size of large data sets Clustering precipitation in Australia

Notion of a Cluster can be Ambiguous How many clusters? Six Clusters Two Clusters Four Clusters

Types of Clusterings A clustering is a set of clusters. Important distinction between hierarchical and partitional sets of clusters. Partitional clustering: a division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset. Hierarchical clustering: a set of nested clusters organized as a hierarchical tree.

Partitional Clustering (figure: Original Points and A Partitional Clustering)

Hierarchical Clustering (figure: a traditional hierarchical clustering of points p1-p4 and the corresponding traditional dendrogram)

Clustering Algorithms K-means Hierarchical clustering Graph based clustering

K-means Clustering Partitional clustering approach Each cluster is associated with a centroid (center point) Each point is assigned to the cluster with the closest centroid Number of clusters, K, must be specified The basic algorithm is very simple
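As an illustration (not part of the original slides), a minimal Python/NumPy sketch of this basic algorithm; the function name kmeans, the random initialization scheme, and all parameter names are my own choices.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Basic K-means: assign each point to the nearest centroid,
    then recompute each centroid as the mean of its points."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by picking k distinct points at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: distance from every point to every centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving: converged
        centroids = new_centroids
    return labels, centroids
```

Random initialization is only one option; the initial-centroids problem discussed below is exactly the sensitivity of the result to this choice.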

Illustration (figure): scatter plots of the same two-dimensional data set over six iterations of K-means, showing the cluster assignments and centroids converging.

Two different K-means Clusterings (figure): starting from the same original points, K-means can produce an optimal clustering or a sub-optimal clustering, depending on the initial centroids.

Solutions to Initial Centroids Problem Multiple runs Helps, but probability is not on your side Sample and use hierarchical clustering to determine initial centroids Select more than k initial centroids and then select among these initial centroids Select most widely separated Bisecting K-means Not as susceptible to initialization issues

Evaluating K-means Clusters The most common measure is the Sum of Squared Error (SSE). For each point, the error is its distance to the nearest cluster centroid; to get SSE, we square these errors and sum them: $SSE = \sum_{i=1}^{K} \sum_{x \in C_i} dist^2(m_i, x)$, where x is a data point in cluster $C_i$ and $m_i$ is the representative point for cluster $C_i$. It can be shown that $m_i$ corresponds to the center (mean) of the cluster. Given two clusterings, we can choose the one with the smaller error. One easy way to reduce SSE is to increase K, the number of clusters, but a good clustering with smaller K can have a lower SSE than a poor clustering with higher K.
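A minimal sketch of the SSE computation under the same notation, assuming the points, integer labels, and centroids are NumPy arrays (as returned by the kmeans sketch above); the function name is illustrative.

```python
import numpy as np

def sse(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    diffs = points - centroids[labels]   # vector from each point's centroid to the point
    return float((diffs ** 2).sum())     # sum of squared errors over all points
```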

Limitations of K-means K-means has problems when clusters are of differing Sizes Densities Non-globular shapes K-means has problems when the data contains outliers. The number of clusters (K) is difficult to determine.

Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree. Can be visualized as a dendrogram: a tree-like diagram that records the sequences of merges or splits (figure: a set of nested clusters and the corresponding dendrogram).

Strengths of Hierarchical Clustering Do not have to assume any particular number of clusters: any desired number of clusters can be obtained by cutting the dendrogram at the proper level. The clusters may correspond to meaningful taxonomies, e.g., in the biological sciences (animal kingdom, phylogeny reconstruction, ...).

Hierarchical Clustering Two main types of hierarchical clustering. Agglomerative: start with the points as individual clusters; at each step, merge the closest pair of clusters until only one cluster (or k clusters) is left. Divisive: start with one, all-inclusive cluster; at each step, split a cluster until each cluster contains a single point (or there are k clusters). Traditional hierarchical algorithms use a similarity or distance matrix and merge or split one cluster at a time.

Agglomerative Clustering Algorithm More popular hierarchical clustering technique Basic algorithm is straightforward 1. Compute the proximity matrix 2. Let each data point be a cluster 3. Repeat 4. Merge the two closest clusters 5. Update the proximity matrix 6. Until only a single cluster remains Key operation is the computation of the proximity of two clusters Different approaches to defining the distance between clusters distinguish the different algorithms
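A naive Python/NumPy sketch of steps 1-6 above, assuming Euclidean proximity between points; the function names and the single/complete-link switch are illustrative, not the author's code.

```python
import numpy as np

def agglomerative(points, k=1, linkage="single"):
    """Naive agglomerative clustering: start with one cluster per point,
    repeatedly merge the two closest clusters until k clusters remain."""
    # Steps 1-2: proximity matrix and one singleton cluster per point.
    prox = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    clusters = [[i] for i in range(len(points))]

    def cluster_dist(a, b):
        # Inter-cluster proximity derived from the point-to-point matrix.
        d = prox[np.ix_(a, b)]
        return d.min() if linkage == "single" else d.max()  # single vs complete link

    # Steps 3-6: merge the closest pair of clusters, then repeat.
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```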

Starting Situation Start with clusters of individual points and a proximity matrix (figure: points p1, ..., p12 and their proximity matrix).

Intermediate Situation After some merging steps, we have some clusters, C1 through C5, and a proximity matrix between them (figure).

Intermediate Situation We want to merge the two closest clusters (C2 and C5) and update the proximity matrix (figure).

After Merging The question is: how do we update the proximity matrix? (figure: the row and column for the merged cluster C2 U C5 are marked with question marks)

How to Define Inter-Cluster Similarity? Possible definitions of the proximity between two clusters: MIN, MAX, Group Average, Distance Between Centroids (figure: two clusters of points p1-p5, the question "Similarity?", and the proximity matrix).


Hierarchical Clustering: Group Average Compromise between Single and Complete Link Strengths Less susceptible to noise and outliers Limitations Biased towards globular clusters

Hierarchical Clustering: Time and Space Requirements O(N²) space, since it uses the proximity matrix (N is the number of points). O(N³) time in many cases: there are N steps, and at each step the proximity matrix, of size N², must be updated and searched. Complexity can be reduced to O(N² log N) time for some approaches.

Hierarchical Clustering: Problems and Limitations Once a decision is made to combine two clusters, it cannot be undone No objective function is directly minimized Different schemes have problems with one or more of the following: Sensitivity to noise and outliers (MIN) Difficulty handling different sized clusters and non-convex shapes (Group average, MAX) Breaking large clusters (MAX)

Types of data in clustering analysis: interval-scaled attributes; binary attributes; nominal, ordinal, and ratio attributes; attributes of mixed types.

Interval-scaled attributes Continuous measurements on a roughly linear scale, e.g., weight, height, temperature, etc. Standardize data in preprocessing so that all attributes have equal weight. Exceptions: sometimes one attribute should keep more weight, e.g., height may be a more important attribute when clustering basketball players.
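One common way to standardize (a sketch; the slides do not prescribe a particular method, and some texts use the mean absolute deviation instead of the standard deviation) is to z-score each attribute:

```python
import numpy as np

def standardize(X):
    """Z-score each column so that all interval-scaled attributes
    contribute with roughly equal weight to the distance computation."""
    X = np.asarray(X, dtype=float)
    # (assumes no constant column; otherwise the standard deviation would be zero)
    return (X - X.mean(axis=0)) / X.std(axis=0)
```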

Similarity and Dissimilarity Between Objects Distances are normally used to measure the similarity or dissimilarity between two data objects (objects = records). Minkowski distance: $d(i,j) = \left( |x_{i1}-x_{j1}|^q + |x_{i2}-x_{j2}|^q + \dots + |x_{ip}-x_{jp}|^q \right)^{1/q}$, where $i = (x_{i1}, x_{i2}, \dots, x_{ip})$ and $j = (x_{j1}, x_{j2}, \dots, x_{jp})$ are two p-dimensional data objects and q is a positive integer. If q = 1, d is the Manhattan distance: $d(i,j) = |x_{i1}-x_{j1}| + |x_{i2}-x_{j2}| + \dots + |x_{ip}-x_{jp}|$.

Similarity and Dissimilarity Between Objects (Cont.) If q = 2, d is the Euclidean distance: $d(i,j) = \sqrt{(x_{i1}-x_{j1})^2 + (x_{i2}-x_{j2})^2 + \dots + (x_{ip}-x_{jp})^2}$. Properties: $d(i,j) \ge 0$; $d(i,i) = 0$; $d(i,j) = d(j,i)$; $d(i,j) \le d(i,k) + d(k,j)$. Can also use weighted distance, or other dissimilarity measures.
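A small sketch of the Minkowski distance covering both special cases (the function name is illustrative):

```python
import numpy as np

def minkowski(x, y, q=2):
    """Minkowski distance of order q between two p-dimensional objects:
    q = 1 gives the Manhattan distance, q = 2 the Euclidean distance."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float((np.abs(x - y) ** q).sum() ** (1.0 / q))
```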

Binary Attributes A contingency table for binary data:

                 Object j
                 1      0      sum
Object i   1     a      b      a+b
           0     c      d      c+d
           sum   a+c    b+d    p

Simple matching coefficient (if the binary attribute is symmetric): $d(i,j) = \frac{b+c}{a+b+c+d}$. Jaccard coefficient (if the binary attribute is asymmetric): $d(i,j) = \frac{b+c}{a+b+c}$.

Example: Dissimilarity Between Binary Attributes

Name  Gender  Fever  Cough  Test-1  Test-2  Test-3  Test-4
Jack  M       Y      N      P       N       N       N
Mary  F       Y      N      P       N       P       N
Jim   M       Y      P      N       N       N       N

Gender is a symmetric attribute; the remaining attributes are asymmetric. Let the values Y and P be set to 1 and the value N be set to 0. Using the Jaccard coefficient: $d(\text{jack}, \text{mary}) = \frac{0+1}{2+0+1} = 0.33$, $d(\text{jack}, \text{jim}) = \frac{1+1}{1+1+1} = 0.67$, $d(\text{jim}, \text{mary}) = \frac{1+2}{1+1+2} = 0.75$.
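A sketch that reproduces the example above; the function name and the 0/1 encodings (Y/P mapped to 1, N to 0, with the symmetric Gender attribute excluded) follow the slide, everything else is illustrative.

```python
def binary_dissimilarity(x, y, asymmetric=True):
    """Dissimilarity between two 0/1 vectors using the contingency-table
    counts a, b, c, d defined above."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    if asymmetric:                        # Jaccard coefficient
        return (b + c) / (a + b + c)
    return (b + c) / (a + b + c + d)      # simple matching coefficient

# Asymmetric attributes only (Fever, Cough, Test-1..Test-4)
jack = [1, 0, 1, 0, 0, 0]
mary = [1, 0, 1, 0, 1, 0]
jim  = [1, 1, 0, 0, 0, 0]
print(round(binary_dissimilarity(jack, mary), 2))  # 0.33
print(round(binary_dissimilarity(jack, jim), 2))   # 0.67
print(round(binary_dissimilarity(jim, mary), 2))   # 0.75
```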

Nominal Attributes A generalization of the binary attribute in that it can take more than 2 states, e.g., red, yellow, blue, green. Method 1: simple matching. With m the number of attributes that match for both records and p the total number of attributes: $d(i,j) = \frac{p-m}{p}$. Method 2: rewrite the database and create a new binary attribute for each of the possible states. For an object with color yellow, the yellow attribute is set to 1, while the remaining attributes are set to 0.
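A corresponding sketch of the simple matching dissimilarity for nominal attributes (function name and example values are illustrative):

```python
def nominal_dissimilarity(x, y):
    """Simple matching for nominal attributes: d(i, j) = (p - m) / p,
    where m = number of matching attributes and p = total number of attributes."""
    p = len(x)
    m = sum(1 for xi, yi in zip(x, y) if xi == yi)
    return (p - m) / p

print(nominal_dissimilarity(["red", "small"], ["red", "large"]))  # 0.5
```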

Ordinal Attributes An ordinal attribute can be discrete or continuous. Order is important, e.g., rank. Can be treated like interval-scaled attributes: replace $x_{if}$ by its rank $r_{if} \in \{1, \dots, M_f\}$, map the range of each attribute onto [0, 1] by replacing the i-th object in the f-th attribute by $z_{if} = \frac{r_{if} - 1}{M_f - 1}$, and compute the dissimilarity using methods for interval-scaled attributes.
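A one-line sketch of the rank normalization (illustrative names and example values):

```python
def normalize_rank(r, M):
    """Map a rank r in {1, ..., M} onto [0, 1]: z = (r - 1) / (M - 1)."""
    return (r - 1) / (M - 1)

# e.g. an ordinal attribute with M = 3 states (low < medium < high)
print([normalize_rank(r, 3) for r in (1, 2, 3)])  # [0.0, 0.5, 1.0]
```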

Euclidean distance

Other measures: e.g., spherical and Manhattan distance (figure)

Different measures

The distance between two points Calculate the distance between A(2,3) and B(7,8): $D(A,B) = \sqrt{(7-2)^2 + (8-3)^2} = \sqrt{25 + 25} = \sqrt{50} \approx 7.07$ (figure: A and B plotted in the x-y plane).

Three points A(2,3), B(7,8), and C(5,1). Calculate: $D(A,B) = \sqrt{(7-2)^2 + (8-3)^2} = \sqrt{25 + 25} = \sqrt{50} \approx 7.07$; $D(A,C) = \sqrt{(5-2)^2 + (3-1)^2} = \sqrt{9 + 4} = \sqrt{13} \approx 3.61$; $D(B,C) = \sqrt{(7-5)^2 + (8-1)^2} = \sqrt{4 + 49} = \sqrt{53} \approx 7.28$ (figure: A, B, and C plotted in the x-y plane).

Many attributes

   V1   V2   V3   V4   V5
A  0.7  0.8  0.4  0.5  0.2
B  0.6  0.8  0.5  0.4  0.2
C  0.8  0.9  0.7  0.8  0.9

$D(A,B) = \sqrt{(0.7-0.6)^2 + (0.8-0.8)^2 + (0.4-0.5)^2 + (0.5-0.4)^2 + (0.2-0.2)^2} = \sqrt{0.03} \approx 0.17$
$D(A,C) = \sqrt{(0.7-0.8)^2 + (0.8-0.9)^2 + (0.4-0.7)^2 + (0.5-0.8)^2 + (0.2-0.9)^2} = \sqrt{0.69} \approx 0.83$
$D(B,C) = \sqrt{(0.6-0.8)^2 + (0.8-0.9)^2 + (0.5-0.7)^2 + (0.4-0.8)^2 + (0.2-0.9)^2} = \sqrt{0.74} \approx 0.86$
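The same three distances can be checked with a few lines of NumPy (a sketch; the values are copied from the table above):

```python
import numpy as np

A = np.array([0.7, 0.8, 0.4, 0.5, 0.2])
B = np.array([0.6, 0.8, 0.5, 0.4, 0.2])
C = np.array([0.8, 0.9, 0.7, 0.8, 0.9])

# Euclidean distances between the three 5-attribute objects
print(round(float(np.linalg.norm(A - B)), 2))  # 0.17
print(round(float(np.linalg.norm(A - C)), 2))  # 0.83
print(round(float(np.linalg.norm(B - C)), 2))  # 0.86
```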

Hierarchical Clustering Use the distance matrix as the clustering criterion. This method does not require the number of clusters k as an input, but needs a termination condition (figure: agglomerative clustering (AGNES) merges a, b, c, d, e step by step from five singleton clusters into one cluster {a, b, c, d, e}; divisive clustering (DIANA) performs the same steps in the reverse direction).

AGNES-Explored Given a set of N items to be clustered, and an NxN distance (or similarity) matrix, the basic process of Johnson's (1967) hierarchical clustering is this: Start by assigning each item to its own cluster, so that if you have N items, you now have N clusters, each containing just one item. Let the distances (similarities) between the clusters equal the distances (similarities) between the items they contain. Find the closest (most similar) pair of clusters and merge them into a single cluster, so that now you have one less cluster.

AGNES Compute distances (similarities) between the new cluster and each of the old clusters. Repeat steps 2 and 3 until all items are clustered into a single cluster of size N. Step 3 can be done in different ways, which is what distinguishes single-link from complete-link and average-link clustering.
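A sketch of the cluster-to-cluster distance used in step 3, with the three variants named above (the function names are illustrative; dist is any item-to-item distance function):

```python
def cluster_distance(A, B, dist, method="single"):
    """Distance between clusters A and B (lists of items), derived from an
    item-to-item distance function. The choice of method is what
    distinguishes single-, complete-, and average-link clustering."""
    pair_dists = [dist(a, b) for a in A for b in B]
    if method == "single":      # closest pair of items
        return min(pair_dists)
    if method == "complete":    # farthest pair of items
        return max(pair_dists)
    return sum(pair_dists) / len(pair_dists)  # average link
```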

Data

     VAR 1  VAR 2
 1   1      3
 2   1      8
 3   5      3
 4   1      1
 5   2      8
 6   5      2
 7   2      3
 8   4      8
 9   7      2
10   5      8

Distance matrix

       P_1   P_2   P_3   P_4   P_5   P_6   P_7   P_8   P_9   P_10
P_1    0     5.00  4.00  2.00  5.10  4.12  1.00  5.83  6.08  6.40
P_2    5.00  0     6.40  7.00  1.00  7.21  5.10  3.00  8.49  4.00
P_3    4.00  6.40  0     4.47  5.83  1.00  3.00  5.10  2.24  5.00
P_4    2.00  7.00  4.47  0     7.07  4.12  2.24  7.62  6.08  8.06
P_5    5.10  1.00  5.83  7.07  0     6.71  5.00  2.00  7.81  3.00
P_6    4.12  7.21  1.00  4.12  6.71  0     3.16  6.08  2.00  6.00
P_7    1.00  5.10  3.00  2.24  5.00  3.16  0     5.39  5.10  5.83
P_8    5.83  3.00  5.10  7.62  2.00  6.08  5.39  0     6.71  1.00
P_9    6.08  8.49  2.24  6.08  7.81  2.00  5.10  6.71  0     6.32
P_10   6.40  4.00  5.00  8.06  3.00  6.00  5.83  1.00  6.32  0
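The matrix above can be reproduced from the data table with SciPy (a sketch; the variable names are mine):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# The 10 two-attribute objects from the example (VAR 1, VAR 2)
data = np.array([[1, 3], [1, 8], [5, 3], [1, 1], [2, 8],
                 [5, 2], [2, 3], [4, 8], [7, 2], [5, 8]])

# Condensed pairwise Euclidean distances, then as a square matrix
dist_matrix = squareform(pdist(data, metric="euclidean"))
print(np.round(dist_matrix, 2))  # e.g. d(P_1, P_7) = 1.0, d(P_2, P_5) = 1.0
```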

Step 1 Find the minimal distance in the matrix above. The smallest entry is 1 (e.g., d(P_1, P_7) = 1), so P_1 and P_7 are merged into a new cluster, P_17.

Step 2 Recompute the distances to the new cluster using the single-link rule (the distance between two clusters is the smallest distance between their members) and find the minimal distance:

       P_17  P_2   P_3   P_4   P_5   P_6   P_8   P_9   P_10
P_17   0
P_2    5     0
P_3    3     6.40  0
P_4    2     7     4.47  0
P_5    5     1     5.83  7.07  0
P_6    3.16  7.21  1     4.12  6.71  0
P_8    5.39  3     5.10  7.62  2     6.08  0
P_9    5.10  8.49  2.24  6.08  7.81  2     6.71  0
P_10   5.83  4     5     8.06  3     6     1     6.32  0

The smallest entry is 1 (d(P_2, P_5) = 1): merge P_2 and P_5 into P_25.

Step 3 Find the minimal distance:

       P_17  P_25  P_3   P_4   P_6   P_8   P_9   P_10
P_17   0
P_25   5     0
P_3    3     5.83  0
P_4    2     7     4.47  0
P_6    3.16  6.71  1     4.12  0
P_8    5.39  2     5.10  7.62  6.08  0
P_9    5.10  7.81  2.24  6.08  2     6.71  0
P_10   5.83  3     5     8.06  6     1     6.32  0

The smallest entry is 1 (d(P_3, P_6) = 1): merge P_3 and P_6 into P_36.

Step 4 Find the minimal distance:

       P_17  P_25  P_36  P_4   P_8   P_9   P_10
P_17   0
P_25   5     0
P_36   3     5.83  0
P_4    2     7     4.12  0
P_8    5.39  2     5.10  7.62  0
P_9    5.10  7.81  2     6.08  6.71  0
P_10   5.83  3     5     8.06  1     6.32  0

The smallest entry is 1 (d(P_8, P_10) = 1): merge P_8 and P_10 into P_810.

Step 5 Find the minimal distance:

       P_17  P_25  P_36  P_4   P_810  P_9
P_17   0
P_25   5     0
P_36   3     5.83  0
P_4    2     7     4.12  0
P_810  5.39  2     5     7.62  0
P_9    5.10  7.81  2     6.08  6.32   0

The smallest entry is 2 (d(P_17, P_4) = 2): merge P_17 and P_4 into P_174.

Step 6 Find the minimal distance:

       P_174  P_25  P_36  P_810  P_9
P_174  0
P_25   5      0
P_36   3      5.83  0
P_810  5.39   2     5     0
P_9    5.10   7.81  2     6.32   0

The smallest entry is 2 (d(P_25, P_810) = 2): merge P_25 and P_810 into P_25810.

Step 7 Find the minimal distance:

         P_174  P_25810  P_36  P_9
P_174    0
P_25810  5      0
P_36     3      5        0
P_9      5.10   6.32     2     0

The smallest entry is 2 (d(P_36, P_9) = 2): merge P_36 and P_9 into P_369.

Step 8 Find the minimal distance:

         P_174  P_25810  P_369
P_174    0
P_25810  5      0
P_369    3      5        0

The smallest entry is 3 (d(P_174, P_369) = 3): merge P_174 and P_369 into P_174369.

Step 9 Only two clusters remain:

           P_174369  P_25810
P_174369   0
P_25810    5         0

The minimal distance is 5: merging P_174369 and P_25810 produces a single group, P_17436925810, containing all ten objects. The algorithm terminates after 9 steps.

Dendrogram
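A sketch of how the dendrogram for this example could be produced with SciPy's single-link agglomerative clustering (the use of matplotlib and the label format are my assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# The 10 objects from the worked example
data = np.array([[1, 3], [1, 8], [5, 3], [1, 1], [2, 8],
                 [5, 2], [2, 3], [4, 8], [7, 2], [5, 8]])

# Single-link (minimum-distance) agglomerative clustering, as in the trace above
Z = linkage(data, method="single", metric="euclidean")
dendrogram(Z, labels=[f"P_{i}" for i in range(1, 11)])
plt.show()
```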

Single Linkage, Complete Linkage, Average Linkage

The number of clusters
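The number of clusters is obtained by cutting the dendrogram at a chosen level; a sketch using the linkage matrix Z from the previous example (the fcluster criterion and threshold are my choices):

```python
from scipy.cluster.hierarchy import fcluster

# Cut the dendrogram so that at most 3 clusters remain;
# each point receives the label of the cluster it falls into.
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)  # should give the three groups {P_1,P_4,P_7}, {P_2,P_5,P_8,P_10}, {P_3,P_6,P_9}
```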