Data Mining. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of Computer Science

Similar documents
Clustering Part 3. Hierarchical Clustering

Data Mining. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of Computer Science

Cluster Analysis. Ying Shen, SSE, Tongji University

Clustering Lecture 3: Hierarchical Methods

5/15/16. Computational Methods for Data Analysis. Massimo Poesio UNSUPERVISED LEARNING. Clustering. Unsupervised learning introduction

Hierarchical Clustering

Lecture Notes for Chapter 7. Introduction to Data Mining, 2 nd Edition. by Tan, Steinbach, Karpatne, Kumar

Data Mining. Asso. Profe. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of CS (1)

Data Mining and Analysis: Fundamental Concepts and Algorithms

Hierarchical Clustering Lecture 9

Hierarchical clustering

Data Mining Concepts & Techniques

Associate Professor Dr. Raed Ibraheem Hamed

Hierarchical and Ensemble Clustering

Lesson 3. Prof. Enza Messina

Hierarchy. No or very little supervision Some heuristic quality guidances on the quality of the hierarchy. Jian Pei: CMPT 459/741 Clustering (2) 1

INF4820, Algorithms for AI and NLP: Hierarchical Clustering

CSE 5243 INTRO. TO DATA MINING

Hierarchical Clustering

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Slides From Lecture Notes for Chapter 8. Introduction to Data Mining

4. Ad-hoc I: Hierarchical clustering

INF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22

CSE 5243 INTRO. TO DATA MINING

CS 534: Computer Vision Segmentation and Perceptual Grouping

Data Mining & Data Warehouse

Hierarchical Clustering 4/5/17

CSE 5243 INTRO. TO DATA MINING

Unsupervised Learning

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 16

University of Florida CISE department Gator Engineering. Clustering Part 2

Cluster Analysis: Agglomerate Hierarchical Clustering

K-Means Clustering 3/3/17

Data Mining. Associate Professor Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology

Exploiting Parallelism to Support Scalable Hierarchical Clustering

Unsupervised Learning and Data Mining

Machine Learning and Data Mining. Clustering. (adapted from) Prof. Alexander Ihler

Notes. Reminder: HW2 Due Today by 11:59PM. Review session on Thursday. Midterm next Tuesday (10/09/2018)

Cluster Analysis. Angela Montanari and Laura Anderlucci

Objective of clustering

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi

DATABASE MANAGEMENT SYSTEMS

Lecture 15 Clustering. Oct

Chuck Cartledge, PhD. 23 September 2017

Cluster analysis. Agnieszka Nowak - Brzezinska

Hierarchical clustering

Unsupervised Learning Hierarchical Methods

Clustering CS 550: Machine Learning

CS570: Introduction to Data Mining

Clustering algorithms

Clustering Algorithm (DBSCAN) VISHAL BHARTI Computer Science Dept. GC, CUNY

Segmentation by Clustering

CLUSTERING ALGORITHMS

Hierarchical Clustering

Unsupervised: no target value to predict

Big Data Analytics! Special Topics for Computer Science CSE CSE Feb 9

UVA CS 6316/4501 Fall 2016 Machine Learning. Lecture 19: Unsupervised Clustering (I) Dr. Yanjun Qi. University of Virginia

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06

CSE 347/447: DATA MINING

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler

Clustering part II 1

Unsupervised learning, Clustering CS434

Clustering (COSC 416) Nazli Goharian. Document Clustering.

CS 534: Computer Vision Segmentation and Perceptual Grouping

CS7267 MACHINE LEARNING

A Review on Cluster Based Approach in Data Mining

Road map. Basic concepts

Comparative Study of Clustering Algorithms using R

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM.

An Unsupervised Technique for Statistical Data Analysis Using Data Mining

CS 1675 Introduction to Machine Learning Lecture 18. Clustering. Clustering. Groups together similar instances in the data sample

DS504/CS586: Big Data Analytics Big Data Clustering Prof. Yanhua Li

UVA CS 4501: Machine Learning. Lecture 22: Unsupervised Clustering (I) Dr. Yanjun Qi. University of Virginia. Department of Computer Science

Information Retrieval and Web Search Engines

Clustering (COSC 488) Nazli Goharian. Document Clustering.

Unsupervised Learning and Clustering

A Systematic Overview of Data Mining Algorithms

Bioinformatics Programming. EE, NCKU Tien-Hao Chang (Darby Chang)

Today s topic CS347. Results list clustering example. Why cluster documents. Clustering documents. Lecture 8 May 7, 2001 Prabhakar Raghavan

Applications. Foreground / background segmentation Finding skin-colored regions. Finding the moving objects. Intelligent scissors

Building a Concept Hierarchy from a Distance Matrix

Hierarchical Clustering

A Comparison of Document Clustering Techniques

CPSC 425: Computer Vision

Data Mining Chapter 9: Descriptive Modeling Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

4. Cluster Analysis. Francesc J. Ferri. Dept. d Informàtica. Universitat de València. Febrer F.J. Ferri (Univ. València) AIRF 2/ / 1

Clustering Results. Result List Example. Clustering Results. Information Retrieval

Data Clustering. Algorithmic Thinking Luay Nakhleh Department of Computer Science Rice University

Associate Professor Dr. Raed Ibraheem Hamed

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010

CS 2750 Machine Learning. Lecture 19. Clustering. CS 2750 Machine Learning. Clustering. Groups together similar instances in the data sample

Working with Unlabeled Data Clustering Analysis. Hsiao-Lung Chan Dept Electrical Engineering Chang Gung University, Taiwan

Image Analysis Lecture Segmentation. Idar Dyrdal

Introduction to Data Mining

Parallelization of Hierarchical Density-Based Clustering using MapReduce. Talat Iqbal Syed

Types of general clustering methods. Clustering Algorithms for general similarity measures. Similarity between clusters

Clustering algorithms and introduction to persistent homology

INF4820 Algorithms for AI and NLP. Evaluating Classifiers Clustering

3. Cluster analysis Overview

Finding Clusters 1 / 60

Transcription:

Data Mining Dr. Raed Ibraheem Hamed University of Human Development, College of Science and Technology Department of Computer Science 06 07

Department of CS - DM - UHD

Road map Cluster Analysis: Basic Concepts Partitioning Methods Hierarchical Methods What is Hierarchical Clustering General Steps Of Hierarchical Clustering Methods of Hierarchical Clustering Agglomerative (bottom up) Divisive (top down) Dendrogram Summary Department of CS - DM - UHD

Cluster Analysis: Basic Concepts and Methods Cluster Analysis: Basic Concepts Partitioning Methods Hierarchical Methods Evaluation of Clustering Summary

What is Hierarchical Clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis which seeks to build a hierarchy of clusters. The idea is to build a binary tree of the data that successively merges similar groups of points

General Steps Of Hierarchical Clustering Given a set of N items to be clustered, the basic process of hierarchical clustering is this:. Start by assigning each item to a cluster, so that if you have N items, you now have N clusters, each containing just one item.. Find the closest (most similar) pair of clusters and merge them into a single cluster, so that now you have one cluster less.. Compute similarities between the new cluster and each of the old clusters.. Repeat steps and until all items are clustered into K number of clusters 6

Methods of Hierarchical Clustering There are two main types of hierarchical clustering: group data objects into a tree of clusters Hierarchical methods can be Agglomerative: bottom-up approach Divisive: top-down approach 7

Hierarchical Clustering Agglomerative and Divisive Methods Agglomerative (bottom up). Start with point (singleton). Recursively add two or more appropriate clusters. Stop when k number of clusters is achieved. Divisive (top down). Start with a big cluster. Recursively divide into smaller clusters. Stop when k number of clusters is achieved. 8

Hierarchical Clustering Agglomerative and Divisive Methods Use similarity matrix as clustering criteria. Stop when k number of clusters is achieved. Step 0 Step Step Step Step a a b b a b c d e c c d e d d e e Step Step Step Step Step 0 Agglomerative Divisive 9

Hierarchical Clustering Agglomerative and Divisive Methods 0

Hierarchical Agglomerative General Algorithm In this example, after the second step of the agglomerative algorithm will yield clusters:- {a} {b c} {d e} {f}. In the third step will yield clusters {a} {b c} {d e f}, which is a clustering, in the fourth step will give a small number but larger clusters that are {a} {b c d e f} and finally will yield cluster of {a b c d e f} Raw data

Dendrogram Hierarchical Clustering Dendrogram: a tree data structure which illustrates hierarchical clustering techniques. Each level shows clusters for that level. Leaf individual clusters Root one cluster Department of CS - DM - UHD

Dendrogram: Shows How Clusters are Merged Show how to merge clusters hierarchically Decompose data objects into a multi-level of a tree of clusters A clustering of the data objects: giving the dendrogram at the desired level Each connected component forms a cluster

Levels of Clustering

Levels of Traditional Clustering Department of CS - DM - UHD

Traditional Hierarchical Clustering Divisive (top down) Divisive Method. Start with a big cluster. Recursively divide into smaller clusters. Stop when k number of clusters is achieved. 0 9 8 7 6 0 0 6 7 8 9 0 0 9 8 7 6 0 0 6 7 8 9 0 0 9 8 7 6 0 0 6 7 8 9 0 6

Traditional Hierarchical Clustering Agglomerative Method Agglomerative (bottom up). Start with point (singleton). Recursively add two or more appropriate clusters. Stop when k number of clusters is achieved. 0 0 0 9 9 9 8 8 8 7 7 7 6 6 6 0 0 0 0 6 7 8 9 0 0 6 7 8 9 0 0 6 7 8 9 0 7

Hierarchical Clustering p p p p p p p p Traditional Hierarchical Clustering Traditional Dendrogram p p p p p p p p Non-traditional Hierarchical Clustering Non-traditional Dendrogram 8

Hierarchical Clustering 0. 6 0. 0. 0.0 0 6 Nested Clusters Dendrogram 9

Hierarchical Clustering 0. 0. 0. 0. 6 0. 0. 0. 0.0 0 6 Nested Clusters Dendrogram 0

Hierarchical Clustering 0. 0. 6 0. 0. 0.0 0 6 Nested Clusters Dendrogram

Department of CS - DM - UHD