An Incremental Hierarchical Clustering


Arnaud Ribert, Abdel Ennaji, Yves Lecourtier
P.S.I., Faculté des Sciences, Université de Rouen, Mont Saint Aignan Cédex, France

Abstract

This article describes a new algorithm for processing time-incremental data with a hierarchical clustering. Although hierarchical clustering techniques make it possible to determine the number of clusters in a data set automatically, they are rarely used in industrial applications, because a large amount of memory is required when treating more than 10,000 elements. To solve this problem, the proposed method updates the hierarchical representation of the data instead of re-computing the whole tree when new patterns have to be taken into account. Memory gains, evaluated on a real problem (handwritten digit recognition), make it possible to treat databases containing 7 times more data than the classical algorithm.

1. Introduction

Numerous techniques of pattern recognition and image segmentation require an efficient clustering method to understand, interpret and simplify large amounts of multidimensional data [7][10][9][1]. Most of the time, the k-means algorithm and other partitional methods are used in industrial applications [5][11][7][16]. However, they have the major drawback of requiring initial conditions close to the final solution, especially concerning the number of clusters. This a priori knowledge is rarely at the user's disposal: if one uses a clustering technique, it is precisely because the structure of the data is unknown. This constraint is therefore intractable in the general case. In fact, several authors recommend running a hierarchical clustering before starting the k-means algorithm, since it requires no a priori knowledge and provides a good representation of the data. But hierarchical clustering generally requires a great amount of memory, which grows with the square of the number of elements in the database.
So, treating 10,000 elements requires 200 MB of RAM, while 800 MB are needed to deal with 20,000 elements. Such a memory cost may explain why hierarchical clustering is rarely used in industrial applications. On the other hand, clustering techniques assume that the given database is completely representative of the problem, whereas this is rarely the case when dealing with complex problems. In other words, from a practical point of view, a complex problem is often an incremental one. Consequently, it would be very useful to be able to enrich a database with patterns which were not available when the database was built. The problem is then to update one's clustering. Another advantage of incremental approaches is that the original database can be treated as a small, growing one. In this scope, it might be possible to reduce the memory requirements of hierarchical clustering by building a first numerical taxonomy (i.e. the tree resulting from a hierarchical clustering) with the available memory and updating it as new elements are taken into account. The success of this strategy depends on the ability to update a numerical taxonomy while keeping the largest possible part of it. This is the objective of the algorithm described in this paper. In the next section, the main characteristics of hierarchical clustering are recalled. Sections 3 and 4 describe, respectively, how to determine the anchoring point of a new element in a numerical taxonomy and how to update the taxonomy starting from that point. Finally, section 5 presents experimental results on real databases.

2. A brief recall on hierarchical clustering

A hierarchical clustering method is a procedure to represent data as a nested sequence of partitions. An

example of the corresponding graphical representation, called a dendrogram, is shown in Figure 1. It is important to note that the height of a node is proportional to the distance between the groups it links. Consequently, the shape of a dendrogram gives information on the number of clusters in a data set. Thus, cutting a dendrogram horizontally generates a clustering (5 clusters appear in the example of Figure 1). Numerous methods have been proposed to determine the best cutting point, in order to find the number of clusters automatically [12]. Although these methods are often used, it has been shown that a multi-point cutting leads to better results on real data sets [14][15]. Note that the Euclidean distance can be replaced by any other dissimilarity measure, that is to say a metric d such that d(x,x) = 0, d(x,y) = d(y,x) and d(x,y) >= 0 for all x, y. Moreover, an additional metric has to be introduced to measure the distance between two groups. Commonly used metrics are the minimum or maximum Euclidean distance (called single and complete link) or the average distance between the groups. This choice has a great influence on the representation capabilities of the taxonomy. The algorithm described in this article makes no assumption on the metric used, so the choice is left to the user. However, a well-suited metric, which will be detailed further, enables one to obtain better performance. For simplicity, we will assume in sections 3 and 4 that the average link is used.

3. Determining the place of the new element in the hierarchy

When a new element has to be added to a taxonomy, one first has to determine its place in the tree. Several methods have been proposed to achieve this [8]. They are generally based on the estimation of a boundary layer between the two sub-trees (i.e. groups) encountered at each node of the taxonomy. Each boundary layer is used to place the new element (NE) like a decision tree would do.
However, to our knowledge, none of these methods is endowed with a stopping criterion allowing, for instance, a new element to be placed at the root of the taxonomy if required.

Figure 1: Example of dendrogram

Hierarchical clustering algorithms generally work by sequential groupings of clusters. A numerical taxonomy can be built using the following algorithm, where the initial points are considered as individual clusters.

    Compute the Euclidean distance between every pair of points;
    Merge the two closest points into a single cluster;
    While ( there is more than one group ) Do
        Compute the distance between the new cluster and every existing one;
        Merge the two closest clusters into a single one;
    EndWhile

It is of course impossible to assume that NE will be the brother of its nearest neighbour. Indeed, the minimum average distance between NE and a group can be reached for a group which does not contain NE's nearest neighbour. So, the search has to begin from the root of the taxonomy. This requires a descent procedure in the tree (including a stopping condition) so that the process ends up on top of the right sub-tree, which will be called R. The implemented principle uses the notion of Region Of Influence (ROI) of a group. Let D(G1, G2) be the distance between two groups G1 and G2. Then, as shown in Figure 2, a point X belongs to the ROI of G1 if D(X, G1) <= D(G1, G2).
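As a minimal illustration, the classical algorithm above can be sketched in Python. This is a naive O(n^3) sketch under the average link; the function and variable names are ours, not the paper's.

```python
import math

def average_link(c1, c2):
    """Average Euclidean distance over all pairs drawn from two clusters."""
    return sum(math.dist(p, q) for p in c1 for q in c2) / (len(c1) * len(c2))

def agglomerative(points):
    """Naive hierarchical clustering: start from singleton clusters and
    repeatedly merge the two closest ones until a single group remains.
    Returns the merge history as (cluster_a, cluster_b, level) triples,
    which is enough information to draw the dendrogram."""
    clusters = [[p] for p in points]
    history = []
    while len(clusters) > 1:
        # find the pair of clusters with the smallest average distance
        i, j = min(((a, b) for a in range(len(clusters))
                           for b in range(a + 1, len(clusters))),
                   key=lambda ab: average_link(clusters[ab[0]], clusters[ab[1]]))
        level = average_link(clusters[i], clusters[j])
        history.append((clusters[i], clusters[j], level))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history
```

A practical implementation would store the distance matrix instead of recomputing distances, which is precisely the quadratic memory cost discussed in the introduction.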

Figure 2: Finding the place of a new element in a numerical taxonomy

Let Ω be the data set used to build a first taxonomy T. Since NE is not present, R (the searched sub-tree) is aggregated with another cluster, H. When building a hierarchy for Ω ∪ {NE}, R is aggregated with NE when D(R, {NE}) is the smallest distance between two clusters. Consequently, D(R, {NE}) <= D(R, H); in other words, NE belongs to the ROI of R. During the building process, the creation of a cluster (composed of two groups Gi and Gj) before the merging of R and NE implies D(Gi, Gj) < D(R, {NE}). This holds in particular for every sub-tree of R. Consequently, NE does not belong to the ROI of any sub-tree of R. Moreover, as D(R, {NE}) is the smallest inter-cluster distance when R and NE are aggregated, among the groups of T verifying the first two properties, the searched one is the closest to NE. It can be shown that these three properties are necessary and sufficient to define (and find) R when dealing with binary hierarchies, since in this case only one minimum exists [15]. It can also be noticed that this algorithm can be used to perform a supervised classification or to detect exceptions.

4. Updating the numerical taxonomy

The second phase of the algorithm consists of updating the taxonomy nodes, starting from the previously determined point. The principle is to cut a sub-tree only when needed, and to update node levels in all other cases. The problem is to determine the conditions under which a sub-tree must be cut. The basic algorithm presented in the following paragraphs proposes criteria to solve this problem; a first optimisation is then proposed.

4.1. The basic algorithm

When a new element is added to a taxonomy, there is theoretically no guarantee that any part of it will be kept [13].
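The three properties above can be turned into a descent procedure. The sketch below is an illustrative interpretation of that procedure, not the paper's implementation: the Node class, the use of the average link for point-to-group distances, and all names are our assumptions.

```python
import math

class Node:
    """Taxonomy node: a leaf holds one point; an internal node records the
    level at which its two children were merged (its height in the tree)."""
    def __init__(self, points, level=0.0, children=()):
        self.points = points
        self.level = level
        self.children = list(children)

def avg_dist(x, group):
    """Average-link distance between the new element x and a group."""
    return sum(math.dist(x, p) for p in group) / len(group)

def in_roi(x, node, merge_level):
    """x is in the Region Of Influence of `node` when its distance to the
    group does not exceed the level at which that group was merged with
    its sibling (i.e. the height of the parent node)."""
    return avg_dist(x, node.points) <= merge_level

def roi_hit_inside(x, node):
    """True if x falls in the ROI of any strict sub-tree of `node`."""
    return any(in_roi(x, c, node.level) or roi_hit_inside(x, c)
               for c in node.children if c.children)

def find_anchor(x, root):
    """Return the sub-tree R satisfying the three properties: x lies in
    R's ROI, in no ROI of R's sub-trees, and among all such candidates
    R is the closest group to x."""
    candidates = []
    def walk(node, merge_level):
        if in_roi(x, node, merge_level) and not roi_hit_inside(x, node):
            candidates.append(node)
        for child in node.children:
            walk(child, node.level)
    walk(root, math.inf)  # the root can always receive the new element
    return min(candidates, key=lambda n: avg_dist(x, n.points))
```

Note that an element far from every group anchors at the root, which is exactly the stopping behaviour the boundary-layer methods of section 3 lack.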
However, in practice, experiments on a large number of synthetic databases (without any data structure in the representation space, in order to favour important changes after the addition of the new element) have shown that a complete reconstruction never occurred for databases of more than 50 elements. Two ways may be envisaged to update a numerical taxonomy. On the one hand, one can try to determine the exact changes that the presence of a new element will cause in the taxonomy. On the other hand, one can determine which part of the taxonomy can never be changed by the addition of a new element. Since the first solution appears particularly difficult, the second strategy has been retained. Thus, once the place of the new element (NE) has been determined, an ascending update of the hierarchy (starting from the anchoring point) is performed. The objective is to determine the groups inside the current sub-tree which cannot be changed by the integration of the new element. Depending on the scale considered, the insertion of NE may have two different effects on the data structure. At a given level (i.e. on top of a sub-tree), the new element can either bring two sub-trees closer together or, on the contrary, move them away from each other.

Figure 3: The new element brings two sub-trees closer
Figure 4: The new element moves two sub-trees apart

The first case is illustrated in Figure 3. At first, element 7 brings element 6 and group {4,5} closer together. Group {4,5} had been formed because d(4,5) was smaller than d(4,6) and d(6,5). Element 7 cannot call this into question because it is not close enough to element 5. On the contrary, it does call into question the merging of 3 with group {1,2}. Indeed, the only parts of the taxonomy that are necessarily kept are the sub-trees whose threshold is lower than d(3,7), which is the smallest distance between {1,2,3} and {4,5,6}. More generally, it can be proved [15] that when NE brings two sub-parts A and B closer, the smallest tree containing them has to be cut at the threshold Ct (for "Critical Threshold"), which is set to the minimum Euclidean distance between A and B (with NE counted in A). This cutting is followed by a classical taxonomy rebuilding. The second case is illustrated by Figure 4. Elements 3 and 6 are moved apart because of element 7. Consequently, no regrouping between 3 and 6 or 7 can occur. But if 3 is moved away from {6,7}, it may at the same time be moved closer to another group, particularly group {1,2}. Such a merging therefore has to be allowed during the reconstruction of the taxonomy, by cutting the tree at its critical threshold. More generally, it has been demonstrated [15] that when two sub-trees A and B are moved away from each other, their grandfather has to be cut at its critical threshold (Ct). Nevertheless, this is not necessary if the average distance between A and B does not exceed Ct: in that case, there is no risk of obtaining a grouping other than that of A and B, so their average distance is simply updated without any structural change. It can be observed that this algorithm is partly based on the non-incremental taxonomy building.
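The rules above repeatedly compare two quantities per node: the critical threshold Ct and the node's updated average-link level. These can be sketched as follows, with groups represented as plain lists of points (our representation, not the paper's); the decision of where the cut is applied follows the rules in the text.

```python
import math

def critical_threshold(ga, gb):
    """Critical threshold Ct of a node linking groups ga and gb: the
    minimum pointwise Euclidean distance between the two groups
    (the new element, if any, being already counted in ga)."""
    return min(math.dist(p, q) for p in ga for q in gb)

def updated_level(ga, gb, new_elt):
    """New height of the node once new_elt has joined ga, under the
    average link.  When this stays below the relevant Ct, the node is
    simply re-levelled; otherwise the tree is cut at Ct and rebuilt."""
    ga = ga + [new_elt]
    return sum(math.dist(p, q) for p in ga for q in gb) / (len(ga) * len(gb))
```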
Consequently, our algorithm can take advantage of every kind of optimisation found in the literature [3][6][4].

4.2. A major optimisation

The previous algorithm has been tested on a great number of synthetic and real databases, showing that it produced the correct tree, but at an important computational cost. Indeed, in many cases it was necessary to cut two consecutive trees even though the first threshold was higher than the second one. Consequently, many calculations were performed several times. The following optimisation avoids this.

1. Compute the cutting thresholds, considering that a cut will be needed at each node during the ascending phase, even for increasing average distances.

2. Starting from the anchoring node, select cutting thresholds such that each subsequent cutting threshold is strictly higher than the previous one.

3. Run the ascending phase, cutting only at the selected thresholds.

Note that we have to consider that a cut will occur even for increasing node levels, because it is impossible to know whether the critical threshold of a given sub-tree will be reached after the update of the sub-trees it contains. This potential loss is nevertheless negligible compared to the gains produced by the proposed optimisation. This version has therefore been used in the following experiments.

5. Experimental results

The aim of these experiments is to measure the gains provided by the incremental algorithm in comparison with the classical one (given in section 2). In order to favour good performance, an original inter-cluster measure has been developed. Indeed, the single link is the most favourable measure for the incremental algorithm, because the height of a node then equals its critical threshold; consequently, a cutting would often be replaced by a simple re-evaluation of the node height. Unfortunately, the data representation provided by the single link suffers from the "chaining effect", which tends to group all the patterns into a single cluster. A metric offering a compromise between the performance of the incremental algorithm and the representation capabilities of the numerical taxonomy has therefore been designed. This metric has been implemented using the Lance-Williams formula, an efficient way to compute the distance between a new cluster (formed by merging groups r and s) and an existing one (k). The metric used in these experiments is given by the expression below.
D[(k),(r,s)] = (1/2) D[(k),(r)] + (1/2) D[(k),(s)] - 0.4960 |D[(k),(r)] - D[(k),(s)]|

The numeric terms have been determined by a dichotomic procedure on a real database (handwritten digit recognition), such that a supervised analysis of the clusters reveals a good match between the groups of data in the representation space and reality. Tests have also been carried out on a handwritten digit recognition problem (the NIST database), for different fractions of the data set. The feature vector consisted of the 85 (1 + 4 + 16 + 64) grey levels of a 4-level resolution pyramid (see Figure 5 below) [2].

Figure 5: Representation of a digit using a 4-level resolution pyramid

The savings in memory and in number of computed distances appear in Figure 6 below. These curves represent averages over 50 different experiments.

Figure 6: Memory (number of floats) and distance savings for a real database, for the classical and incremental algorithms as a function of the number of elements
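This update rule can be sketched in Python. The coefficients (1/2, 1/2, and 0.4960 on the absolute-difference term) are those of the expression above; the β·D(r,s) term of the general Lance-Williams formula is absent from this metric. The function name is ours.

```python
def lw_update(d_kr, d_ks, alpha=0.5, gamma=-0.4960):
    """Lance-Williams update: distance between an existing cluster k and
    the cluster obtained by merging r and s, computed from the already
    known distances d(k,r) and d(k,s).  With gamma = -0.5 this reduces
    exactly to the single link min(d_kr, d_ks); -0.4960 keeps most of
    its behaviour while limiting the chaining effect."""
    return alpha * d_kr + alpha * d_ks + gamma * abs(d_kr - d_ks)
```

For example, lw_update(3.0, 5.0) gives 3.008, just above the single-link value of 3.0, so node heights stay close to critical thresholds while the chaining effect is damped.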

A precise analysis of these curves shows that, for a given memory size, the savings offered by the incremental algorithm make it possible to handle databases comprising 7 times more elements than the classical building method. On the other hand, the computational requirements are very high and have to be reduced. In spite of this, since the incremental algorithm can transform memory requirements into computational ones, it allows one to go beyond the material limitations of any computer.

6. Conclusion

This article describes an original algorithm to update a numerical taxonomy after the addition of new patterns to a database. Tests have shown that this algorithm makes it possible to progressively perform a hierarchical clustering of large data sets, which can then contain seven times more elements than with the classical algorithm. In spite of the computational cost of the method, the incremental algorithm described in this paper is a first step towards the use of hierarchical clustering in industrial applications, thus offering an interesting alternative to partitional clustering techniques. Further work should lead to a significant reduction of the computational cost of the method, by integrating already available optimisation algorithms and adding more efficient building rules.

Bibliography

[1] Arabie P., Hubert L.J., De Soete G. (eds.), "Clustering and Classification", World Scientific Publ., River Edge, NJ.
[2] Ballard D.H., Brown C.M., "Computer Vision", Prentice Hall, 1982.
[3] Bruynooghe M., "Recent Results in Hierarchical Clustering. I - The Reducible Neighborhoods Clustering Algorithm", IJPRAI, Vol. 7, No. 3.
[4] De Rham C., "La classification hiérarchique ascendante selon la méthode des voisins réciproques", Cahiers de l'Analyse des Données, Vol. 5.
[5] Forgy E., "Cluster analysis of multivariate data: efficiency versus interpretability of classifications", Biometrics, Vol. 21, p. 768, 1965.
[6] Hattori K., Torii Y., "Effective algorithms for the nearest neighbor method in the clustering problem", Pattern Recognition, Vol. 26, No. 5.
[7] Jain A.K., Dubes R.C., "Algorithms for Clustering Data", Prentice Hall, 1988.
[8] Jambu M., "Exploration informatique et statistique des données", Dunod, Paris.
[9] Jobson J.D., "Applied Multivariate Data Analysis, Volume II: Categorical and Multivariate Methods", Springer-Verlag, New York, 1992.
[10] Kohonen T., "Self-Organization and Associative Memory", Springer-Verlag.
[11] MacQueen J.B., "Some methods for classification and analysis of multivariate observations", Proceedings of the 5th Berkeley Symposium on Math. Statistics and Probability, Vol. 1, 1967.
[12] Milligan G.W., Cooper M.C., "An examination of procedures for determining the number of clusters in a data set", Psychometrika, Vol. 50, 1985.
[13] Régnier S., "Stabilité d'un opérateur de classification", Mathématiques et Sciences Humaines, No. 8.
[14] Ribert A., Ennaji A., Lecourtier Y., "Hiérarchies indicées et catégorisation multi-échelle", 6èmes Rencontres de la Société Francophone de Classification, Montpellier.
[15] Ribert A., "Structuration évolutive de données : application à la construction de classifieurs distribués", Ph.D. thesis, University of Rouen, France.
[16] Venkateswarlu N.B., Raju P.S.V.S.K., "Fast ISODATA clustering algorithms", Pattern Recognition, Vol. 25, No. 3, 1992.


More information

Novel Method to Generate and Optimize Reticulated Structures of a Non Convex Conception Domain

Novel Method to Generate and Optimize Reticulated Structures of a Non Convex Conception Domain , pp. 17-26 http://dx.doi.org/10.14257/ijseia.2017.11.2.03 Novel Method to Generate and Optimize Reticulated Structures of a Non Convex Conception Domain Zineb Bialleten 1*, Raddouane Chiheb 2, Abdellatif

More information

Fast Fuzzy Clustering of Infrared Images. 2. brfcm

Fast Fuzzy Clustering of Infrared Images. 2. brfcm Fast Fuzzy Clustering of Infrared Images Steven Eschrich, Jingwei Ke, Lawrence O. Hall and Dmitry B. Goldgof Department of Computer Science and Engineering, ENB 118 University of South Florida 4202 E.

More information

self-organizing maps and symbolic data

self-organizing maps and symbolic data self-organizing maps and symbolic data Aïcha El Golli, Brieuc Conan-Guez, Fabrice Rossi AxIS project, National Research Institute in Computer Science and Control (INRIA) Rocquencourt Research Unit Domaine

More information

Maxima and Minima: A Review

Maxima and Minima: A Review Maxima and Minima: A Review Serge Beucher CMM / Mines ParisTech (September 2013) 1. Introduction The purpose of this note is to clarify the notion of h-maximum (or h-minimum) which is introduced in the

More information

Hierarchy. No or very little supervision Some heuristic quality guidances on the quality of the hierarchy. Jian Pei: CMPT 459/741 Clustering (2) 1

Hierarchy. No or very little supervision Some heuristic quality guidances on the quality of the hierarchy. Jian Pei: CMPT 459/741 Clustering (2) 1 Hierarchy An arrangement or classification of things according to inclusiveness A natural way of abstraction, summarization, compression, and simplification for understanding Typical setting: organize

More information

Report: Reducing the error rate of a Cat classifier

Report: Reducing the error rate of a Cat classifier Report: Reducing the error rate of a Cat classifier Raphael Sznitman 6 August, 2007 Abstract The following report discusses my work at the IDIAP from 06.2007 to 08.2007. This work had for objective to

More information

arxiv:cs/ v1 [cs.ds] 11 May 2005

arxiv:cs/ v1 [cs.ds] 11 May 2005 The Generic Multiple-Precision Floating-Point Addition With Exact Rounding (as in the MPFR Library) arxiv:cs/0505027v1 [cs.ds] 11 May 2005 Vincent Lefèvre INRIA Lorraine, 615 rue du Jardin Botanique, 54602

More information

Automated Clustering-Based Workload Characterization

Automated Clustering-Based Workload Characterization Automated Clustering-Based Worload Characterization Odysseas I. Pentaalos Daniel A. MenascŽ Yelena Yesha Code 930.5 Dept. of CS Dept. of EE and CS NASA GSFC Greenbelt MD 2077 George Mason University Fairfax

More information

Distributed and clustering techniques for Multiprocessor Systems

Distributed and clustering techniques for Multiprocessor Systems www.ijcsi.org 199 Distributed and clustering techniques for Multiprocessor Systems Elsayed A. Sallam Associate Professor and Head of Computer and Control Engineering Department, Faculty of Engineering,

More information

S-Class, A Divisive Clustering Method, and Possible Dual Alternatives

S-Class, A Divisive Clustering Method, and Possible Dual Alternatives S-Class, A Divisive Clustering Method, and Possible Dual Alternatives Jean-Paul Rasson, François Roland, Jean-Yves Pirçon, Séverine Adans, and Pascale Lallemand Facultés Universitaires Notre-Dame de la

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Unsupervised learning Until now, we have assumed our training samples are labeled by their category membership. Methods that use labeled samples are said to be supervised. However,

More information

Hierarchical Clustering 4/5/17

Hierarchical Clustering 4/5/17 Hierarchical Clustering 4/5/17 Hypothesis Space Continuous inputs Output is a binary tree with data points as leaves. Useful for explaining the training data. Not useful for making new predictions. Direction

More information

Clustering and Visualisation of Data

Clustering and Visualisation of Data Clustering and Visualisation of Data Hiroshi Shimodaira January-March 28 Cluster analysis aims to partition a data set into meaningful or useful groups, based on distances between data points. In some

More information

INF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22

INF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22 INF4820 Clustering Erik Velldal University of Oslo Nov. 17, 2009 Erik Velldal INF4820 1 / 22 Topics for Today More on unsupervised machine learning for data-driven categorization: clustering. The task

More information

Hard clustering. Each object is assigned to one and only one cluster. Hierarchical clustering is usually hard. Soft (fuzzy) clustering

Hard clustering. Each object is assigned to one and only one cluster. Hierarchical clustering is usually hard. Soft (fuzzy) clustering An unsupervised machine learning problem Grouping a set of objects in such a way that objects in the same group (a cluster) are more similar (in some sense or another) to each other than to those in other

More information

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering

SYDE Winter 2011 Introduction to Pattern Recognition. Clustering SYDE 372 - Winter 2011 Introduction to Pattern Recognition Clustering Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 5 All the approaches we have learned

More information

Pattern Recognition Lecture Sequential Clustering

Pattern Recognition Lecture Sequential Clustering Pattern Recognition Lecture Prof. Dr. Marcin Grzegorzek Research Group for Pattern Recognition Institute for Vision and Graphics University of Siegen, Germany Pattern Recognition Chain patterns sensor

More information

Line Segment Based Watershed Segmentation

Line Segment Based Watershed Segmentation Line Segment Based Watershed Segmentation Johan De Bock 1 and Wilfried Philips Dep. TELIN/TW07, Ghent University Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium jdebock@telin.ugent.be Abstract. In this

More information

Lorentzian Distance Classifier for Multiple Features

Lorentzian Distance Classifier for Multiple Features Yerzhan Kerimbekov 1 and Hasan Şakir Bilge 2 1 Department of Computer Engineering, Ahmet Yesevi University, Ankara, Turkey 2 Department of Electrical-Electronics Engineering, Gazi University, Ankara, Turkey

More information

A Review on Cluster Based Approach in Data Mining

A Review on Cluster Based Approach in Data Mining A Review on Cluster Based Approach in Data Mining M. Vijaya Maheswari PhD Research Scholar, Department of Computer Science Karpagam University Coimbatore, Tamilnadu,India Dr T. Christopher Assistant professor,

More information

COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS

COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS Toomas Kirt Supervisor: Leo Võhandu Tallinn Technical University Toomas.Kirt@mail.ee Abstract: Key words: For the visualisation

More information

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06

Clustering. CS294 Practical Machine Learning Junming Yin 10/09/06 Clustering CS294 Practical Machine Learning Junming Yin 10/09/06 Outline Introduction Unsupervised learning What is clustering? Application Dissimilarity (similarity) of objects Clustering algorithm K-means,

More information

Graphical Knowledge Management in Graphics Recognition Systems

Graphical Knowledge Management in Graphics Recognition Systems Graphical Knowledge Management in Graphics Recognition Systems Mathieu Delalandre 1, Eric Trupin 1, Jacques Labiche 1, Jean-Marc Ogier 2 1 PSI Laboratory, University of Rouen, 76 821 Mont Saint Aignan,

More information

NUMERICAL COMPUTATION METHOD OF THE GENERAL DISTANCE TRANSFORM

NUMERICAL COMPUTATION METHOD OF THE GENERAL DISTANCE TRANSFORM STUDIA UNIV. BABEŞ BOLYAI, INFORMATICA, Volume LVI, Number 2, 2011 NUMERICAL COMPUTATION METHOD OF THE GENERAL DISTANCE TRANSFORM SZIDÓNIA LEFKOVITS(1) Abstract. The distance transform is a mathematical

More information

Clustering. Chapter 10 in Introduction to statistical learning

Clustering. Chapter 10 in Introduction to statistical learning Clustering Chapter 10 in Introduction to statistical learning 16 14 12 10 8 6 4 2 0 2 4 6 8 10 12 14 1 Clustering ² Clustering is the art of finding groups in data (Kaufman and Rousseeuw, 1990). ² What

More information

Enhanced Cluster Centroid Determination in K- means Algorithm

Enhanced Cluster Centroid Determination in K- means Algorithm International Journal of Computer Applications in Engineering Sciences [VOL V, ISSUE I, JAN-MAR 2015] [ISSN: 2231-4946] Enhanced Cluster Centroid Determination in K- means Algorithm Shruti Samant #1, Sharada

More information

Hierarchical Clustering

Hierarchical Clustering Hierarchical Clustering Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram A tree-like diagram that records the sequences of merges

More information

Model-based segmentation and recognition from range data

Model-based segmentation and recognition from range data Model-based segmentation and recognition from range data Jan Boehm Institute for Photogrammetry Universität Stuttgart Germany Keywords: range image, segmentation, object recognition, CAD ABSTRACT This

More information

Belief Hierarchical Clustering

Belief Hierarchical Clustering Belief Hierarchical Clustering Wiem Maalel, Kuang Zhou, Arnaud Martin and Zied Elouedi Abstract In the data mining field many clustering methods have been proposed, yet standard versions do not take base

More information

Clustering Lecture 3: Hierarchical Methods

Clustering Lecture 3: Hierarchical Methods Clustering Lecture 3: Hierarchical Methods Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced

More information

A Comparative Study of Conventional and Neural Network Classification of Multispectral Data

A Comparative Study of Conventional and Neural Network Classification of Multispectral Data A Comparative Study of Conventional and Neural Network Classification of Multispectral Data B.Solaiman & M.C.Mouchot Ecole Nationale Supérieure des Télécommunications de Bretagne B.P. 832, 29285 BREST

More information

A Study of Clustering Algorithms and Validity for Lossy Image Set Compression

A Study of Clustering Algorithms and Validity for Lossy Image Set Compression A Study of Clustering Algorithms and Validity for Lossy Image Set Compression Anthony Schmieder 1, Howard Cheng 1, and Xiaobo Li 2 1 Department of Mathematics and Computer Science, University of Lethbridge,

More information

Segmentation of Images

Segmentation of Images Segmentation of Images SEGMENTATION If an image has been preprocessed appropriately to remove noise and artifacts, segmentation is often the key step in interpreting the image. Image segmentation is a

More information

DEVELOPMENT OF NEURAL NETWORK TRAINING METHODOLOGY FOR MODELING NONLINEAR SYSTEMS WITH APPLICATION TO THE PREDICTION OF THE REFRACTIVE INDEX

DEVELOPMENT OF NEURAL NETWORK TRAINING METHODOLOGY FOR MODELING NONLINEAR SYSTEMS WITH APPLICATION TO THE PREDICTION OF THE REFRACTIVE INDEX DEVELOPMENT OF NEURAL NETWORK TRAINING METHODOLOGY FOR MODELING NONLINEAR SYSTEMS WITH APPLICATION TO THE PREDICTION OF THE REFRACTIVE INDEX THESIS CHONDRODIMA EVANGELIA Supervisor: Dr. Alex Alexandridis,

More information

UNSUPERVISED STATIC DISCRETIZATION METHODS IN DATA MINING. Daniela Joiţa Titu Maiorescu University, Bucharest, Romania

UNSUPERVISED STATIC DISCRETIZATION METHODS IN DATA MINING. Daniela Joiţa Titu Maiorescu University, Bucharest, Romania UNSUPERVISED STATIC DISCRETIZATION METHODS IN DATA MINING Daniela Joiţa Titu Maiorescu University, Bucharest, Romania danielajoita@utmro Abstract Discretization of real-valued data is often used as a pre-processing

More information

A Pretopological Approach for Clustering ( )

A Pretopological Approach for Clustering ( ) A Pretopological Approach for Clustering ( ) T.V. Le and M. Lamure University Claude Bernard Lyon 1 e-mail: than-van.le@univ-lyon1.fr, michel.lamure@univ-lyon1.fr Abstract: The aim of this work is to define

More information

Data Clustering. Algorithmic Thinking Luay Nakhleh Department of Computer Science Rice University

Data Clustering. Algorithmic Thinking Luay Nakhleh Department of Computer Science Rice University Data Clustering Algorithmic Thinking Luay Nakhleh Department of Computer Science Rice University Data clustering is the task of partitioning a set of objects into groups such that the similarity of objects

More information

IT-Dendrogram: A New Member of the In-Tree (IT) Clustering Family

IT-Dendrogram: A New Member of the In-Tree (IT) Clustering Family IT-Dendrogram: A New Member of the In-Tree (IT) Clustering Family Teng Qiu (qiutengcool@163.com) Yongjie Li (liyj@uestc.edu.cn) University of Electronic Science and Technology of China, Chengdu, China

More information

Data Mining. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of Computer Science

Data Mining. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of Computer Science Data Mining Dr. Raed Ibraheem Hamed University of Human Development, College of Science and Technology Department of Computer Science 06 07 Department of CS - DM - UHD Road map Cluster Analysis: Basic

More information

CS7267 MACHINE LEARNING

CS7267 MACHINE LEARNING S7267 MAHINE LEARNING HIERARHIAL LUSTERING Ref: hengkai Li, Department of omputer Science and Engineering, University of Texas at Arlington (Slides courtesy of Vipin Kumar) Mingon Kang, Ph.D. omputer Science,

More information

University of Florida CISE department Gator Engineering. Clustering Part 5

University of Florida CISE department Gator Engineering. Clustering Part 5 Clustering Part 5 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville SNN Approach to Clustering Ordinary distance measures have problems Euclidean

More information

An automatic correction of Ma s thinning algorithm based on P-simple points

An automatic correction of Ma s thinning algorithm based on P-simple points Author manuscript, published in "Journal of Mathematical Imaging and Vision 36, 1 (2010) 54-62" DOI : 10.1007/s10851-009-0170-1 An automatic correction of Ma s thinning algorithm based on P-simple points

More information

Texture Segmentation by Windowed Projection

Texture Segmentation by Windowed Projection Texture Segmentation by Windowed Projection 1, 2 Fan-Chen Tseng, 2 Ching-Chi Hsu, 2 Chiou-Shann Fuh 1 Department of Electronic Engineering National I-Lan Institute of Technology e-mail : fctseng@ccmail.ilantech.edu.tw

More information

Hierarchical Clustering

Hierarchical Clustering Hierarchical Clustering Build a tree-based hierarchical taxonomy (dendrogram) from a set animal of documents. vertebrate invertebrate fish reptile amphib. mammal worm insect crustacean One approach: recursive

More information

Rowena Cole and Luigi Barone. Department of Computer Science, The University of Western Australia, Western Australia, 6907

Rowena Cole and Luigi Barone. Department of Computer Science, The University of Western Australia, Western Australia, 6907 The Game of Clustering Rowena Cole and Luigi Barone Department of Computer Science, The University of Western Australia, Western Australia, 697 frowena, luigig@cs.uwa.edu.au Abstract Clustering is a technique

More information

arxiv: v1 [cs.dc] 2 Apr 2016

arxiv: v1 [cs.dc] 2 Apr 2016 Scalability Model Based on the Concept of Granularity Jan Kwiatkowski 1 and Lukasz P. Olech 2 arxiv:164.554v1 [cs.dc] 2 Apr 216 1 Department of Informatics, Faculty of Computer Science and Management,

More information

MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ)

MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ) 5 MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ) Contents 5.1 Introduction.128 5.2 Vector Quantization in MRT Domain Using Isometric Transformations and Scaling.130 5.2.1

More information

Lesson 3. Prof. Enza Messina

Lesson 3. Prof. Enza Messina Lesson 3 Prof. Enza Messina Clustering techniques are generally classified into these classes: PARTITIONING ALGORITHMS Directly divides data points into some prespecified number of clusters without a hierarchical

More information

A Fast Algorithm for Finding k-nearest Neighbors with Non-metric Dissimilarity

A Fast Algorithm for Finding k-nearest Neighbors with Non-metric Dissimilarity A Fast Algorithm for Finding k-nearest Neighbors with Non-metric Dissimilarity Bin Zhang and Sargur N. Srihari CEDAR, Department of Computer Science and Engineering State University of New York at Buffalo

More information

Symbol Detection Using Region Adjacency Graphs and Integer Linear Programming

Symbol Detection Using Region Adjacency Graphs and Integer Linear Programming 2009 10th International Conference on Document Analysis and Recognition Symbol Detection Using Region Adjacency Graphs and Integer Linear Programming Pierre Le Bodic LRI UMR 8623 Using Université Paris-Sud

More information

Medical images, segmentation and analysis

Medical images, segmentation and analysis Medical images, segmentation and analysis ImageLab group http://imagelab.ing.unimo.it Università degli Studi di Modena e Reggio Emilia Medical Images Macroscopic Dermoscopic ELM enhance the features of

More information

Clustering algorithms

Clustering algorithms Clustering algorithms Machine Learning Hamid Beigy Sharif University of Technology Fall 1393 Hamid Beigy (Sharif University of Technology) Clustering algorithms Fall 1393 1 / 22 Table of contents 1 Supervised

More information

TRANSACTIONAL CLUSTERING. Anna Monreale University of Pisa

TRANSACTIONAL CLUSTERING. Anna Monreale University of Pisa TRANSACTIONAL CLUSTERING Anna Monreale University of Pisa Clustering Clustering : Grouping of objects into different sets, or more precisely, the partitioning of a data set into subsets (clusters), so

More information

Lecture 10: Semantic Segmentation and Clustering

Lecture 10: Semantic Segmentation and Clustering Lecture 10: Semantic Segmentation and Clustering Vineet Kosaraju, Davy Ragland, Adrien Truong, Effie Nehoran, Maneekwan Toyungyernsub Department of Computer Science Stanford University Stanford, CA 94305

More information

9.913 Pattern Recognition for Vision. Class 8-2 An Application of Clustering. Bernd Heisele

9.913 Pattern Recognition for Vision. Class 8-2 An Application of Clustering. Bernd Heisele 9.913 Class 8-2 An Application of Clustering Bernd Heisele Fall 2003 Overview Problem Background Clustering for Tracking Examples Literature Homework Problem Detect objects on the road: Cars, trucks, motorbikes,

More information

5/15/16. Computational Methods for Data Analysis. Massimo Poesio UNSUPERVISED LEARNING. Clustering. Unsupervised learning introduction

5/15/16. Computational Methods for Data Analysis. Massimo Poesio UNSUPERVISED LEARNING. Clustering. Unsupervised learning introduction Computational Methods for Data Analysis Massimo Poesio UNSUPERVISED LEARNING Clustering Unsupervised learning introduction 1 Supervised learning Training set: Unsupervised learning Training set: 2 Clustering

More information

A Graph Theoretic Approach to Image Database Retrieval

A Graph Theoretic Approach to Image Database Retrieval A Graph Theoretic Approach to Image Database Retrieval Selim Aksoy and Robert M. Haralick Intelligent Systems Laboratory Department of Electrical Engineering University of Washington, Seattle, WA 98195-2500

More information

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi

Unsupervised Learning. Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi Unsupervised Learning Presenter: Anil Sharma, PhD Scholar, IIIT-Delhi Content Motivation Introduction Applications Types of clustering Clustering criterion functions Distance functions Normalization Which

More information