Non-exhaustive, Overlapping k-means


1 Non-exhaustive, Overlapping k-means. J. J. Whang, I. S. Dhillon, and D. F. Gleich. Presented by Teresa Lebair, University of Maryland, Baltimore County, October 29th, 2015. Teresa Lebair UMBC 1/38

2 Outline

- Introduction
- NEO-K-Means Objective Functional
- NEO-K-Means Algorithm
- Graph Clustering using NEO-K-Means
- Experimental Results
- Conclusions and Future Work

3 Introduction

The traditional k-means clustering algorithm is exhaustive (every point is put into a cluster) and non-overlapping (no point belongs to more than one cluster). However, in many applications there are outliers (points that do not belong to any cluster) and/or points that belong to multiple clusters. Application examples:
- Outlier detection in datasets.
- The popular EachMovie dataset: some movies belong to multiple genres.
- Biology: clustering genes by function yields overlapping clusters.

4 Introduction

Many papers on clustering consider either cluster outliers or overlapping clusters, but not both. This paper is one of the first to consider both cluster outliers and overlapping clusters in a unified manner. One popular application is community detection (e.g., detecting clusters or communities of users in social networks). In this paper, the objective functional for traditional k-means clustering is modified to produce the NEO-K-Means objective functional.

5 Introduction

A NEO-K-Means algorithm is proposed to minimize the NEO-K-Means objective functional. NEO-K-Means is also used to solve graph clustering problems, and is tested experimentally on both vector and graph data.

6 Traditional k-means

Let $X := \{x_1, x_2, \ldots, x_n\}$ be a set of data points. We want to partition X into k clusters $C_1, C_2, \ldots, C_k$, i.e., $\bigcup_j C_j = X$ and $i \neq j \implies C_i \cap C_j = \emptyset$. The goal of k-means is to pick the clusters to minimize the sum of the squared distances of the points from their cluster centroids, i.e., to solve the minimization problem
$$\min_{\{C_j\}_{j=1}^k} \sum_{j=1}^k \sum_{x_i \in C_j} \|x_i - m_j\|^2, \quad \text{where } m_j := \frac{\sum_{x_i \in C_j} x_i}{|C_j|}.$$
This is an NP-hard problem. However, the traditional k-means algorithm monotonically decreases the objective functional.
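The traditional algorithm referenced above alternates assignment and centroid steps; a minimal sketch follows. The farthest-point initialization and the empty-cluster guard are illustrative choices of this sketch, not prescribed by the slides.

```python
import numpy as np

def kmeans(X, k, iters=100):
    """Minimal sketch of Lloyd's k-means: alternately assign each point to
    its nearest centroid, then move each centroid to the mean of its
    assigned points. Each step can only decrease the objective."""
    # Deterministic farthest-point initialization (illustrative choice).
    centroids = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        # n x k matrix of point-to-centroid distances.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # guard against empty clusters
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

On two well-separated blobs this recovers the blobs exactly; on hard instances it only finds a local minimum of the NP-hard objective.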

7 k-means Extension

Define the assignment matrix $U = (u_{ij}) \in \mathbb{R}^{n \times k}$ by
$$u_{ij} = \begin{cases} 1 & \text{if } x_i \in C_j \\ 0 & \text{otherwise.} \end{cases}$$
In traditional disjoint, exhaustive clustering, each row of U contains exactly one entry equal to 1; all other entries are zero. Hence the trace of $U^T U$ equals the number of cluster assignments, n. To control the number of cluster assignments, we introduce the constraint that the trace of $U^T U$ equal $(1 + \alpha)n$, where $0 \le \alpha \le (k - 1)$.

8 k-means Extension

First, extend the k-means objective as follows to minimize over assignment matrices U:
$$\min_U \sum_{j=1}^k \sum_{i=1}^n u_{ij} \|x_i - m_j\|^2, \quad \text{where } m_j = \frac{\sum_{i=1}^n u_{ij} x_i}{\sum_{i=1}^n u_{ij}},$$
$$\text{s.t. } \operatorname{trace}(U^T U) = (1 + \alpha)n.$$
Take $\alpha \ll (k - 1)$ to avoid assigning every point to every cluster. This new objective functional allows for both outliers and overlapping clusters.

9 Testing the k-means Extension

Tested the first extension of the k-means objective on a synthetic data set: two overlapping clusters of ordered pairs, with several additional points as outliers. Each cluster is generated from a Gaussian distribution. An algorithm similar to k-means is used, with α set to 0.1, the ground-truth value of α.

10 Testing the k-means Extension

The first extension of k-means fails to recover the ground-truth clusters: too many points are labeled as outliers.

11 NEO-K-Means Objective

It is necessary to control the degree of non-exhaustiveness. Let $\mathbb{I}$ denote the indicator function:
$$\mathbb{I}(\text{expression}) = \begin{cases} 1 & \text{if the expression is true} \\ 0 & \text{otherwise.} \end{cases}$$
Let $\mathbf{1} \in \mathbb{R}^k$ denote the vector of all ones, and note that $(U\mathbf{1})_i$ is the number of clusters to which $x_i$ belongs. We update the objective functional by adding a non-exhaustiveness constraint.

12 NEO-K-Means Objective

The NEO-K-Means objective is defined as follows:
$$\min_U \sum_{j=1}^k \sum_{i=1}^n u_{ij} \|x_i - m_j\|^2, \quad \text{where } m_j = \frac{\sum_{i=1}^n u_{ij} x_i}{\sum_{i=1}^n u_{ij}},$$
$$\text{s.t. } \operatorname{trace}(U^T U) = (1 + \alpha)n, \qquad \sum_{i=1}^n \mathbb{I}\left((U\mathbf{1})_i = 0\right) \le \beta n.$$
Here $0 \le \beta \ll 1$ controls how many points can be labeled as outliers. The choice $\alpha = \beta = 0$ recovers the traditional k-means objective.

13 Testing the NEO-K-Means Objective

Tested the NEO-K-Means objective on the previous synthetic data set. The outcome is much better than that of the previous k-means extension.

14 NEO-K-Means Algorithm

At most βn data points have no cluster membership, so at least n - βn data points belong to a cluster.

Sketch of the NEO-K-Means algorithm:
- Initialize cluster centroids (use any traditional k-means initialization strategy).
- Compute d_ij, the distance between each point x_i and each cluster C_j, for all i = 1, ..., n and j = 1, ..., k.
- Sort the data points in ascending order by distance to their closest cluster.
- Assign the first n - βn data points in the sorted list to their closest clusters. Let $\bar{C}_j$ denote the assignments made by this step.

15 NEO-K-Means Algorithm

Sketch of the NEO-K-Means algorithm (continued):
- Note that $\sum_{j=1}^k |\bar{C}_j| = n - \beta n$.
- Make αn + βn additional assignments based on the αn + βn smallest entries of D = (d_ij) not already associated with assignments to the $\bar{C}_j$'s. Let $\hat{C}_j$ denote the assignments made by this step.
- Each cluster is then updated to be $C_j := \bar{C}_j \cup \hat{C}_j$.
- Repeat the process until the objective functional is sufficiently small, or the maximum number of iterations is reached.
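The two-step assignment above can be sketched as follows; this is my reading of the slides' description, with floors taken for the fractional counts as an assumption.

```python
import numpy as np

def neo_assign(D, alpha, beta):
    """One NEO-K-Means assignment step (sketch of the slides' description).
    D is the (n, k) matrix of point-to-centroid distances d_ij; returns a
    boolean assignment matrix U with (1 + alpha)n total assignments and at
    most floor(beta * n) unassigned points."""
    n, k = D.shape
    U = np.zeros((n, k), dtype=bool)
    n_out = int(np.floor(beta * n))
    # Step 1: sort points by distance to their closest cluster and assign
    # the first n - n_out of them to that closest cluster.
    nearest = D.argmin(axis=1)
    order = np.argsort(D.min(axis=1))
    for i in order[: n - n_out]:
        U[i, nearest[i]] = True
    # Step 2: make floor(alpha*n) + floor(beta*n) extra assignments from
    # the smallest entries of D not already used.
    extra = int(np.floor(alpha * n)) + n_out
    for idx in np.argsort(D, axis=None):
        if extra == 0:
            break
        i, j = divmod(idx, k)
        if not U[i, j]:
            U[i, j] = True
            extra -= 1
    return U
```

Note that (n - βn) + (αn + βn) = (1 + α)n, so the trace constraint holds by construction, while up to βn points may remain unassigned.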

16 NEO-K-Means Algorithm

17 NEO-K-Means Algorithm

Theorem. For fixed α and β, the NEO-K-Means objective functional monotonically decreases under the NEO-K-Means algorithm while satisfying the constraints.

Proof. Let $J^{(t)}$ denote the objective at the t-th iteration. The assignment step chooses, among all feasible assignments, the one minimizing the total distance to the current centroids, and the old assignment is feasible; the centroid update then replaces each $m_j$ by the mean of its cluster, which minimizes the sum of squared distances. Hence
$$J^{(t)} = \sum_{j=1}^k \sum_{x_i \in C_j^{(t)}} \|x_i - m_j^{(t)}\|^2 \;\ge\; \sum_{j=1}^k \sum_{x_i \in \bar{C}_j^{(t+1)}} \|x_i - m_j^{(t)}\|^2 + \sum_{j=1}^k \sum_{x_i \in \hat{C}_j^{(t+1)}} \|x_i - m_j^{(t)}\|^2$$
$$= \sum_{j=1}^k \sum_{x_i \in C_j^{(t+1)}} \|x_i - m_j^{(t)}\|^2 \;\ge\; \sum_{j=1}^k \sum_{x_i \in C_j^{(t+1)}} \|x_i - m_j^{(t+1)}\|^2 = J^{(t+1)}.$$

18 NEO-K-Means Algorithm: Choosing α and β

Choosing β:
- Run traditional k-means. Let d_i be the distance between data point x_i and its closest cluster.
- Compute the mean μ and standard deviation σ of this set of distances.
- Consider d_i an outlier if $d_i \notin [\mu - \delta\sigma, \mu + \delta\sigma]$ for some fixed δ > 0 (δ = 6 usually leads to a reasonable β).
- Set β equal to the proportion of d_i's that are outliers.

Choosing α: two different strategies can be used to choose α.
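The β recipe above is a one-liner; a sketch, assuming d already holds each point's distance to its closest k-means cluster.

```python
import numpy as np

def estimate_beta(d, delta=6.0):
    """Slide recipe for beta: call d_i an outlier when it falls outside
    [mu - delta*sigma, mu + delta*sigma], then return the outlier fraction."""
    mu, sigma = d.mean(), d.std()
    outliers = (d < mu - delta * sigma) | (d > mu + delta * sigma)
    return outliers.mean()
```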

19 NEO-K-Means Algorithm: Choosing α

First strategy for choosing α (for small overlap):
- For each cluster C_j, consider the distances between each $x_i \in C_j$ and C_j.
- Compute the mean μ_j and standard deviation σ_j of these distances.
- For each $x_l \notin C_j$, compute the distance d_lj between x_l and C_j. If d_lj is less than μ_j + δσ_j, consider x_l to be in the overlapped region.
- Count the points in overlapped regions to estimate α.

20 NEO-K-Means Algorithm: Choosing α

Second strategy for choosing α (for large overlap):
- Let d_ij denote the distance between point x_i and cluster C_j.
- Compute the normalized distance between each point x_i and cluster C_j: $\bar{d}_{ij} := d_{ij} / \sum_{l=1}^k d_{il}$.
- Count the number of $\bar{d}_{ij}$'s whose value is less than $\frac{1}{k+1}$, and divide by n to obtain α.

Note that if a point x_i is equidistant from all clusters C_l, then $\bar{d}_{il} = \frac{1}{k}$ for all l. Using the threshold $\frac{1}{k+1}$ gives a stronger bound.
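The second strategy, implemented exactly as the slide states it (count the small normalized distances, divide by n); whether a further adjustment is intended is not spelled out here.

```python
import numpy as np

def estimate_alpha(D):
    """Second slide strategy for alpha (large overlap): normalize each row
    of the point-to-cluster distance matrix D so it sums to 1, count the
    entries below 1/(k+1), and divide the count by n."""
    n, k = D.shape
    Dn = D / D.sum(axis=1, keepdims=True)  # row-normalized distances
    return (Dn < 1.0 / (k + 1)).sum() / n
```

For a point clearly inside one cluster, exactly one normalized distance is small; an equidistant point contributes no entries below the 1/(k+1) threshold.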

21 Weighted Kernel K-Means

In kernel k-means, each point is mapped into a higher-dimensional feature space via a mapping φ. Additionally, weights $\omega_i \ge 0$ can be introduced to differentiate each point's contribution to the objective functional. The weighted kernel k-means objective functional is
$$\min_{\{C_j\}_{j=1}^k} \sum_{j=1}^k \sum_{x_i \in C_j} \omega_i \|\phi(x_i) - m_j\|^2, \quad \text{where } m_j = \frac{\sum_{x_i \in C_j} \omega_i \phi(x_i)}{\sum_{x_i \in C_j} \omega_i}.$$

22 Weighted Kernel NEO-K-Means

We can define an analogous weighted kernel NEO-K-Means objective functional:
$$\min_U \sum_{c=1}^k \sum_{i=1}^n u_{ic} \omega_i \|\phi(x_i) - m_c\|^2, \quad \text{where } m_c = \frac{\sum_{i=1}^n u_{ic} \omega_i \phi(x_i)}{\sum_{i=1}^n u_{ic} \omega_i},$$
$$\text{s.t. } \operatorname{trace}(U^T U) = (1 + \alpha)n, \qquad \sum_{i=1}^n \mathbb{I}\left((U\mathbf{1})_i = 0\right) \le \beta n.$$
This NEO-K-Means extension allows us to consider non-exhaustive, overlapping graph clustering / overlapping community detection.

23 Graph Clustering Using NEO-K-Means

A graph G = (V, E) is a collection of vertices and edges. We can define an adjacency matrix A = (a_ij) such that a_ij equals the weight of the edge between vertices i and j. (Figure: an example graph on the vertices Alice, Bob, Eve, Linda, and Paul, with weighted edges.)

24 Graph Clustering Using NEO-K-Means

(Figure: the example graph again, together with its adjacency matrix A, whose rows and columns are ordered Alice, Bob, Linda, Paul, Eve.)

25 Graph Clustering Using NEO-K-Means

We assume that there are no connections between any vertex i and itself, so $a_{ii} = 0$ for all i = 1, ..., n. We also assume the graph is undirected (so A is symmetric). The traditional graph partitioning problem groups the vertices into k pairwise disjoint clusters with $C_1 \cup \cdots \cup C_k = V$. Define the links function between two clusters as the sum of the edge weights between the clusters:
$$\operatorname{links}(C_j, C_l) := \sum_{x_{i_1} \in C_j} \sum_{x_{i_2} \in C_l} a_{i_1 i_2}.$$
Example: if C_1 := {Alice, Bob} and C_2 := {Linda, Paul}, then links(C_1, C_2) = a_AL + a_BL + a_AP + a_BP = 3.
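The links function is a direct submatrix sum. A sketch, using an arbitrary illustrative adjacency matrix rather than the slide's example graph.

```python
import numpy as np

def links(A, Cj, Cl):
    """links(Cj, Cl): sum of the edge weights between vertex sets Cj and Cl,
    read off the adjacency matrix A. Cj and Cl are lists of vertex indices."""
    return A[np.ix_(Cj, Cl)].sum()
```

For an undirected graph, links(C, V) sums every edge incident to C, and links(V, V) counts each edge twice.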

26 Graph Clustering Using NEO-K-Means

The cut (a popular measure for evaluating a graph partitioning) of a partition of G is defined as
$$\operatorname{Cut}(G) = \sum_{j=1}^k \frac{\operatorname{links}(C_j, V \setminus C_j)}{\operatorname{links}(C_j, V)}.$$
The normalized cut objective seeks the partition of V that minimizes Cut(G) over all possible partitions, i.e.,
$$\operatorname{NCut}(G) = \min_{C_1, \ldots, C_k} \operatorname{Cut}(G).$$
Let D be the diagonal matrix with $d_{ii} = \sum_{j=1}^n a_{ij}$, i.e., the matrix of vertex degrees.

27 Graph Clustering Using NEO-K-Means

We can rewrite $\operatorname{NCut}(G) = \min_{C_1, \ldots, C_k} \sum_{j=1}^k \frac{\operatorname{links}(C_j, V \setminus C_j)}{\operatorname{links}(C_j, V)}$ as
$$\operatorname{NCut}(G) = \min_{y_1, \ldots, y_k} \sum_{j=1}^k \frac{y_j^T (D - A) y_j}{y_j^T D y_j},$$
which is equivalent to maximizing $\sum_{j=1}^k \frac{y_j^T A y_j}{y_j^T D y_j}$, where $y_j$ denotes the indicator vector for cluster C_j, i.e., $y_j(i) = 1$ if $v_i \in C_j$ and zero otherwise. This traditional normalized cut objective is for disjoint, exhaustive graph clustering.
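The indicator-vector form can be checked numerically: for each cluster, $y_j^T (D - A) y_j / (y_j^T D y_j)$ equals the ratio links(C_j, V \ C_j) / links(C_j, V). A small sketch:

```python
import numpy as np

def ncut(A, labels, k):
    """Normalized cut of a partition via indicator vectors: y_j is the 0/1
    indicator of cluster j, D the diagonal degree matrix, and each term
    y^T (D - A) y / (y^T D y) is the cluster's cut ratio."""
    D = np.diag(A.sum(axis=1))
    total = 0.0
    for j in range(k):
        y = (labels == j).astype(float)
        total += y @ (D - A) @ y / (y @ D @ y)
    return total
```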

28 Graph Clustering Using NEO-K-Means

For NEO-K-Means graph clustering, consider the maximization problem
$$\max_Y \sum_{j=1}^k \frac{y_j^T A y_j}{y_j^T D y_j} \quad \text{s.t. } \operatorname{trace}(Y^T Y) = (1 + \alpha)n, \qquad \sum_{i=1}^n \mathbb{I}\left((Y\mathbf{1})_i = 0\right) \le \beta n,$$
where Y is an assignment matrix whose jth column is $y_j$. Just as with vector data, we may adjust α and β to control the degree of non-exhaustiveness. This optimization problem for graph partitioning can be reformulated as a weighted kernel NEO-K-Means problem.

29 Graph Clustering Using NEO-K-Means

Let $W \in \mathbb{R}^{n \times n}$ be the diagonal matrix such that $w_{ii}$ equals the vertex degree/weight for each i = 1, ..., n. Let K be the kernel matrix given by $K_{ij} = \phi(x_i)^T \phi(x_j)$. Finally, let $u_c$ be the cth column of the assignment matrix U. The weighted kernel NEO-K-Means objective can be rewritten as
$$\min_U \sum_{c=1}^k \sum_{i=1}^n u_{ic} \omega_i \|\phi(x_i) - m_c\|^2 = \min_U \sum_{c=1}^k \left( \sum_{i=1}^n u_{ic} \omega_i K_{ii} - \frac{u_c^T W K W u_c}{u_c^T W u_c} \right).$$

30 Graph Clustering Using NEO-K-Means

Define the kernel as $K = \gamma W^{-1} + W^{-1} A W^{-1}$, where γ > 0 is chosen so that K is positive definite. Then $WKW = \gamma W + A$, and since $a_{ii} = 0$ and $\omega_i = w_{ii}$,
$$\min_U \sum_{c=1}^k \left( \sum_{i=1}^n u_{ic} \omega_i K_{ii} - \frac{u_c^T W K W u_c}{u_c^T W u_c} \right) = \min_U \left( \gamma(1 + \alpha)n - \gamma k - \sum_{c=1}^k \frac{u_c^T A u_c}{u_c^T W u_c} \right),$$
which, the first two terms being constant, is equivalent to
$$\max_U \sum_{c=1}^k \frac{u_c^T A u_c}{u_c^T W u_c}.$$
Letting W = D and noting that U = Y demonstrates that the extended normalized cut objective can be formulated as a weighted kernel NEO-K-Means objective.

31 Graph Clustering Using NEO-K-Means

The authors use the following distance function between a vertex $v_i$ and a cluster C_j:
$$\operatorname{dist}(v_i, C_j) = -\frac{2\operatorname{links}(v_i, C_j)}{\deg(v_i)\deg(C_j)} + \frac{\operatorname{links}(C_j, C_j)}{\deg(C_j)^2} + \frac{\gamma}{\deg(v_i)} + \frac{\gamma}{\deg(C_j)},$$
where $\deg(v_i)$ denotes the degree of the vertex $v_i$, and $\deg(C_j)$ denotes the sum of the degrees of the vertices in C_j. The NEO-K-Means algorithm can then be applied to the graph data using this distance function.

32 Experimental Metrics

To measure the effectiveness of the NEO-K-Means algorithm, we use the average $F_1$ score. Define the $F_1$ score of a ground-truth cluster $S_i$ as
$$F_1(S_i) = \frac{2 r_{S_i} p_{S_i}}{r_{S_i} + p_{S_i}}, \quad \text{where } p_{S_i} = \frac{|C_j \cap S_i|}{|C_j|}, \quad r_{S_i} = \frac{|C_j \cap S_i|}{|S_i|},$$
and j corresponds to the cluster $C_j$ that makes $F_1(S_i)$ as large as possible out of all the clusters. The average $F_1$ score is then
$$F_1 = \frac{1}{|S|} \sum_{S_i \in S} F_1(S_i),$$
where S is the set of ground-truth clusters. A higher $F_1 \in [0, 1]$ score indicates a better clustering.
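The metric above can be sketched directly with Python sets; the set-based representation of clusters is an assumption of this sketch.

```python
def avg_f1(truth, found):
    """Average F1 over ground-truth clusters: each truth set S is scored
    against its best-matching found cluster C, using precision |C & S|/|C|
    and recall |C & S|/|S|, and the best scores are averaged."""
    def f1(S, C):
        inter = len(S & C)
        if inter == 0:
            return 0.0
        p, r = inter / len(C), inter / len(S)
        return 2.0 * p * r / (p + r)
    return sum(max(f1(S, C) for C in found) for S in truth) / len(truth)
```

Because clusters may overlap and points may be unassigned, this matching-based score is well defined even when found clusters are neither disjoint nor exhaustive.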

33 Experimental Results

Tested the NEO-K-Means algorithm on several vector data sets.

34 Experimental Results

Compared the effectiveness of NEO-K-Means to that of several other algorithms. NEO-K-Means consistently outperforms the other algorithms on this data set.

35 Experimental Results

Additionally tested NEO-K-Means on large graph data sets: compared the average normalized cut of each algorithm when applied to large real-world datasets.

36 Experimental Results

Also considered (i) the average $F_1$ score of the different algorithms, and (ii) the average normalized cut and $F_1$ score of NEO-K-Means for different values of α and β.

37 Conclusions and Future Work

NEO-K-Means simultaneously considers non-exhaustive and overlapping clusters. The new method outperforms state-of-the-art methods in terms of finding ground-truth clusters; we conclude that NEO-K-Means is a useful algorithm. The authors plan to extend this type of clustering to other types of Bregman divergences.

38 References

J. Shi and J. Malik. Normalized Cuts and Image Segmentation. IEEE TPAMI, 2000.
J. J. Whang, D. F. Gleich, and I. S. Dhillon. Overlapping community detection using seed set expansion. CIKM, 2013.
J. J. Whang, I. S. Dhillon, and D. F. Gleich. Non-exhaustive, Overlapping k-means. SIAM International Conference on Data Mining, 2015.


More information

Chapter VIII.3: Hierarchical Clustering

Chapter VIII.3: Hierarchical Clustering Chapter VIII.3: Hierarchical Clustering 1. Basic idea 1.1. Dendrograms 1.2. Agglomerative and divisive 2. Cluster distances 2.1. Single link 2.2. Complete link 2.3. Group average and Mean distance 2.4.

More information

Scene Detection Media Mining I

Scene Detection Media Mining I Scene Detection Media Mining I Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} Overview Hierarchical structure of video sequence

More information

CSE 547: Machine Learning for Big Data Spring Problem Set 2. Please read the homework submission policies.

CSE 547: Machine Learning for Big Data Spring Problem Set 2. Please read the homework submission policies. CSE 547: Machine Learning for Big Data Spring 2019 Problem Set 2 Please read the homework submission policies. 1 Principal Component Analysis and Reconstruction (25 points) Let s do PCA and reconstruct

More information

Computing NodeTrix Representations of Clustered Graphs

Computing NodeTrix Representations of Clustered Graphs Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 22, no. 2, pp. 139 176 (2018) DOI: 10.7155/jgaa.00461 Computing NodeTrix Representations of Clustered Graphs Giordano Da Lozzo Giuseppe

More information

K-Means and Gaussian Mixture Models

K-Means and Gaussian Mixture Models K-Means and Gaussian Mixture Models David Rosenberg New York University June 15, 2015 David Rosenberg (New York University) DS-GA 1003 June 15, 2015 1 / 43 K-Means Clustering Example: Old Faithful Geyser

More information

Matrix-Vector Multiplication by MapReduce. From Rajaraman / Ullman- Ch.2 Part 1

Matrix-Vector Multiplication by MapReduce. From Rajaraman / Ullman- Ch.2 Part 1 Matrix-Vector Multiplication by MapReduce From Rajaraman / Ullman- Ch.2 Part 1 Google implementation of MapReduce created to execute very large matrix-vector multiplications When ranking of Web pages that

More information

Assignment 4 Solutions of graph problems

Assignment 4 Solutions of graph problems Assignment 4 Solutions of graph problems 1. Let us assume that G is not a cycle. Consider the maximal path in the graph. Let the end points of the path be denoted as v 1, v k respectively. If either of

More information

by conservation of flow, hence the cancelation. Similarly, we have

by conservation of flow, hence the cancelation. Similarly, we have Chapter 13: Network Flows and Applications Network: directed graph with source S and target T. Non-negative edge weights represent capacities. Assume no edges into S or out of T. (If necessary, we can

More information

CSE 5243 INTRO. TO DATA MINING

CSE 5243 INTRO. TO DATA MINING CSE 5243 INTRO. TO DATA MINING Cluster Analysis: Basic Concepts and Methods Huan Sun, CSE@The Ohio State University 09/25/2017 Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han 2 Chapter 10.

More information

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11 Big Data Analytics Special Topics for Computer Science CSE 4095-001 CSE 5095-005 Feb 11 Fei Wang Associate Professor Department of Computer Science and Engineering fei_wang@uconn.edu Clustering II Spectral

More information

CSCI-B609: A Theorist s Toolkit, Fall 2016 Sept. 6, Firstly let s consider a real world problem: community detection.

CSCI-B609: A Theorist s Toolkit, Fall 2016 Sept. 6, Firstly let s consider a real world problem: community detection. CSCI-B609: A Theorist s Toolkit, Fall 016 Sept. 6, 016 Lecture 03: The Sparsest Cut Problem and Cheeger s Inequality Lecturer: Yuan Zhou Scribe: Xuan Dong We will continue studying the spectral graph theory

More information

Minimum Cost Edge Disjoint Paths

Minimum Cost Edge Disjoint Paths Minimum Cost Edge Disjoint Paths Theodor Mader 15.4.2008 1 Introduction Finding paths in networks and graphs constitutes an area of theoretical computer science which has been highly researched during

More information

Discriminative Clustering for Image Co-segmentation

Discriminative Clustering for Image Co-segmentation Discriminative Clustering for Image Co-segmentation Armand Joulin Francis Bach Jean Ponce INRIA Ecole Normale Supérieure, Paris January 2010 Introduction Introduction Task: dividing simultaneously q images

More information

Graph Adjacency Matrix Automata Joshua Abbott, Phyllis Z. Chinn, Tyler Evans, Allen J. Stewart Humboldt State University, Arcata, California

Graph Adjacency Matrix Automata Joshua Abbott, Phyllis Z. Chinn, Tyler Evans, Allen J. Stewart Humboldt State University, Arcata, California Graph Adjacency Matrix Automata Joshua Abbott, Phyllis Z. Chinn, Tyler Evans, Allen J. Stewart Humboldt State University, Arcata, California Abstract We define a graph adjacency matrix automaton (GAMA)

More information

PERSONALIZED TAG RECOMMENDATION

PERSONALIZED TAG RECOMMENDATION PERSONALIZED TAG RECOMMENDATION Ziyu Guan, Xiaofei He, Jiajun Bu, Qiaozhu Mei, Chun Chen, Can Wang Zhejiang University, China Univ. of Illinois/Univ. of Michigan 1 Booming of Social Tagging Applications

More information

Clustering: Overview and K-means algorithm

Clustering: Overview and K-means algorithm Clustering: Overview and K-means algorithm Informal goal Given set of objects and measure of similarity between them, group similar objects together K-Means illustrations thanks to 2006 student Martin

More information

11/17/2009 Comp 590/Comp Fall

11/17/2009 Comp 590/Comp Fall Lecture 20: Clustering and Evolution Study Chapter 10.4 10.8 Problem Set #5 will be available tonight 11/17/2009 Comp 590/Comp 790-90 Fall 2009 1 Clique Graphs A clique is a graph with every vertex connected

More information

Spectral Clustering on Handwritten Digits Database

Spectral Clustering on Handwritten Digits Database October 6, 2015 Spectral Clustering on Handwritten Digits Database Danielle dmiddle1@math.umd.edu Advisor: Kasso Okoudjou kasso@umd.edu Department of Mathematics University of Maryland- College Park Advance

More information

INF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22

INF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22 INF4820 Clustering Erik Velldal University of Oslo Nov. 17, 2009 Erik Velldal INF4820 1 / 22 Topics for Today More on unsupervised machine learning for data-driven categorization: clustering. The task

More information

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010

STATS306B STATS306B. Clustering. Jonathan Taylor Department of Statistics Stanford University. June 3, 2010 STATS306B Jonathan Taylor Department of Statistics Stanford University June 3, 2010 Spring 2010 Outline K-means, K-medoids, EM algorithm choosing number of clusters: Gap test hierarchical clustering spectral

More information

Hard clustering. Each object is assigned to one and only one cluster. Hierarchical clustering is usually hard. Soft (fuzzy) clustering

Hard clustering. Each object is assigned to one and only one cluster. Hierarchical clustering is usually hard. Soft (fuzzy) clustering An unsupervised machine learning problem Grouping a set of objects in such a way that objects in the same group (a cluster) are more similar (in some sense or another) to each other than to those in other

More information

Community detection algorithms survey and overlapping communities. Presented by Sai Ravi Kiran Mallampati

Community detection algorithms survey and overlapping communities. Presented by Sai Ravi Kiran Mallampati Community detection algorithms survey and overlapping communities Presented by Sai Ravi Kiran Mallampati (sairavi5@vt.edu) 1 Outline Various community detection algorithms: Intuition * Evaluation of the

More information

Reflexive Regular Equivalence for Bipartite Data

Reflexive Regular Equivalence for Bipartite Data Reflexive Regular Equivalence for Bipartite Data Aaron Gerow 1, Mingyang Zhou 2, Stan Matwin 1, and Feng Shi 3 1 Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada 2 Department of Computer

More information

Convex Optimization / Homework 2, due Oct 3

Convex Optimization / Homework 2, due Oct 3 Convex Optimization 0-725/36-725 Homework 2, due Oct 3 Instructions: You must complete Problems 3 and either Problem 4 or Problem 5 (your choice between the two) When you submit the homework, upload a

More information

Traditional clustering fails if:

Traditional clustering fails if: Traditional clustering fails if: -- in the input space the clusters are not linearly separable; -- the distance measure is not adequate; -- the assumptions limit the shape or the number of the clusters.

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)

More information

Ma/CS 6b Class 26: Art Galleries and Politicians

Ma/CS 6b Class 26: Art Galleries and Politicians Ma/CS 6b Class 26: Art Galleries and Politicians By Adam Sheffer The Art Gallery Problem Problem. We wish to place security cameras at a gallery, such that they cover it completely. Every camera can cover

More information

Clustering and Visualisation of Data

Clustering and Visualisation of Data Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some

More information

Kernels for Structured Data

Kernels for Structured Data T-122.102 Special Course in Information Science VI: Co-occurence methods in analysis of discrete data Kernels for Structured Data Based on article: A Survey of Kernels for Structured Data by Thomas Gärtner

More information

A Memetic Heuristic for the Co-clustering Problem

A Memetic Heuristic for the Co-clustering Problem A Memetic Heuristic for the Co-clustering Problem Mohammad Khoshneshin 1, Mahtab Ghazizadeh 2, W. Nick Street 1, and Jeffrey W. Ohlmann 1 1 The University of Iowa, Iowa City IA 52242, USA {mohammad-khoshneshin,nick-street,jeffrey-ohlmann}@uiowa.edu

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)

More information

Flexible Coloring. Xiaozhou Li a, Atri Rudra b, Ram Swaminathan a. Abstract

Flexible Coloring. Xiaozhou Li a, Atri Rudra b, Ram Swaminathan a. Abstract Flexible Coloring Xiaozhou Li a, Atri Rudra b, Ram Swaminathan a a firstname.lastname@hp.com, HP Labs, 1501 Page Mill Road, Palo Alto, CA 94304 b atri@buffalo.edu, Computer Sc. & Engg. dept., SUNY Buffalo,

More information

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme

Machine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme Machine Learning B. Unsupervised Learning B.1 Cluster Analysis Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University of Hildesheim, Germany

More information

Hierarchical Clustering

Hierarchical Clustering Hierarchical Clustering Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram A tree-like diagram that records the sequences of merges

More information

Diffusion and Clustering on Large Graphs

Diffusion and Clustering on Large Graphs Diffusion and Clustering on Large Graphs Alexander Tsiatas Final Defense 17 May 2012 Introduction Graphs are omnipresent in the real world both natural and man-made Examples of large graphs: The World

More information

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate

More information

6 Randomized rounding of semidefinite programs

6 Randomized rounding of semidefinite programs 6 Randomized rounding of semidefinite programs We now turn to a new tool which gives substantially improved performance guarantees for some problems We now show how nonlinear programming relaxations can

More information

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate

More information

10-701/15-781, Fall 2006, Final

10-701/15-781, Fall 2006, Final -7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly

More information

COSC 6339 Big Data Analytics. Fuzzy Clustering. Some slides based on a lecture by Prof. Shishir Shah. Edgar Gabriel Spring 2017.

COSC 6339 Big Data Analytics. Fuzzy Clustering. Some slides based on a lecture by Prof. Shishir Shah. Edgar Gabriel Spring 2017. COSC 6339 Big Data Analytics Fuzzy Clustering Some slides based on a lecture by Prof. Shishir Shah Edgar Gabriel Spring 217 Clustering Clustering is a technique for finding similarity groups in data, called

More information

Gene Clustering & Classification

Gene Clustering & Classification BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering

More information

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi Unsupervised Learning Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi Content Motivation Introduction Applications Types of clustering Clustering criterion functions Distance functions Normalization Which

More information

Rank Minimization over Finite Fields

Rank Minimization over Finite Fields Rank Minimization over Finite Fields Vincent Y. F. Tan Laura Balzano, Stark C. Draper Department of Electrical and Computer Engineering, University of Wisconsin-Madison ISIT 2011 Vincent Tan (UW-Madison)

More information