22 Hierarchical Clustering of Composite Objects with a Variable Number of Components

A. Ketterlin, P. Gancarski and J.J. Korczak
LSIIT, URA 1871, CNRS
7, rue Rene Descartes, Universite Louis Pasteur, F Strasbourg Cedex, France

ABSTRACT This paper examines the problem of clustering a sequence of objects that cannot be described with a predefined list of attributes (or variables). In many applications, a fixed list of attributes cannot be determined without substantial pre-processing. An extension of the traditional propositional formalism is thus proposed, which allows objects to be represented as a set of components, i.e. there is no mapping between attributes and values. The algorithm used for clustering is briefly illustrated, and mechanisms to handle sets are described. Some empirical evaluations are also provided to assess the validity of the approach.

22.1 Introduction

The basic process of clustering consists of finding coherent groupings of objects in a data set. Each grouping must exhibit sufficient internal cohesiveness, while maintaining enough distance to other members of the partition. This definition of clustering underlies many of the techniques and algorithms devoted to the automated discovery of clusters [KR93]. The area of statistics concerned with this problem, namely the field of cluster analysis, essentially deals with numeric and/or categorical data, where objects are represented as feature vectors [DLPT82]. This representation is equivalent, in essence, to propositional logic (or an attribute-value formalism). Derived clusters are usually represented as centroids in the instance space. Centroids are used for the classification of new data, with the help of a distance measure or some implicit probability distribution.
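The centroid-plus-distance scheme described above can be sketched as follows; the data, the two-cluster setup and all names are invented for illustration:

```python
import numpy as np

# Hypothetical feature vectors: each object is a fixed-length attribute list.
objects = np.array([[1.0, 2.0], [1.2, 1.9], [8.0, 8.5], [7.8, 8.2]])
labels = np.array([0, 0, 1, 1])  # two clusters

# A centroid summarizes each cluster in the instance space.
centroids = np.array([objects[labels == k].mean(axis=0) for k in (0, 1)])

def classify(x):
    """Assign a new object to the cluster with the nearest centroid."""
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

print(classify(np.array([7.9, 8.0])))  # nearest to the second centroid
```

This is the purely propositional baseline that the rest of the paper generalizes away from.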
On the other hand, machine learning research on conceptual clustering [MS83], conducted during the last decade, focuses on the representation of instances and concepts; the goal is to derive some effective knowledge, in the form of generalizations (or concepts), about the domain under study. Concept formation is defined as incremental conceptual clustering [FPL91]: incrementality is the ability to maintain a conceptual structure that is updated after each observation. Many machine learning clustering algorithms still use propositional logic to describe the data, but recent work enriches the language used to describe concepts discovered through clustering. Some systems use an extended form of propositional logic, where concepts are defined by logical formulae [MS83]. Others use probabilistic concepts [Fis87], where a concept is defined by a probability distribution on each dimension. Other work extends clustering techniques to higher-level languages: KBG [Bis92] deals with first-order logic, and Kluster [KM94] uses a Kl-one-like language that mitigates computational complexity while retaining considerable representational power. However, there are domains where numeric data must be considered, and in such domains logic-based formalisms are difficult to apply. In addition to our concern with numeric data, this paper describes an extension of attribute-value formalisms that allows objects to be represented as somewhat structured: in this framework an object is built up from any number of components. Earlier work on related extensions to propositional formalisms has been termed structured concept formation (see [TL91]), since objects exhibit several levels of structural detail. Section 22.2 briefly reviews a simple clustering algorithm that we extend to cluster over objects represented as sets. This extension is described in Section 22.3. Section 22.4 gives three examples of our clustering algorithm in action.

[1] Learning from Data: AI and Statistics V. Edited by D. Fisher and H.-J. Lenz. (c) 1996 Springer-Verlag.

22.2 Concept Formation

The underlying concept formation algorithm used throughout this paper is Cobweb.[2] This algorithm forms a hierarchy of clusters from a sequence of objects. Each object is an attribute-value list. Derived clusters (or concepts) are represented in a probabilistic form. Attributes are typed: the algorithm is able to handle nominal variables as well as numeric ones. In the case of numeric attributes, an underlying normal distribution is assumed, whose parameters are estimated. Therefore, for each numeric attribute, a cluster stores the estimated mean and variance with respect to the covered objects.
The variances help define a global predictivity score for one cluster, named π, which is an average of the inverses of the standard deviations over all attributes. Each object is incorporated into the cluster hierarchy in turn. When incorporating a new object, a search is performed by hill-climbing through a space of cluster hierarchies. The object is sorted down through the current hierarchy. At each level several operators are tentatively applied to the current partition, some of them having restructuring properties. The best of these operators is definitely applied, and the process restarts one level deeper (see [Fis87, GLF89] for a complete description of the algorithm). When several distinct partitions are generated, a heuristic called category utility is used to select between them. Category utility evaluates the global quality of a single partition. This evaluation is based on the individual predictivity of each cluster involved. The partition (C, {C_1, ..., C_K}) is evaluated by:

    (1/K) Σ_{k=1}^{K} P(C_k) [π(C_k) − π(C)]

where each π(C_k) measures the individual predictivity of C_k. This expression is an average of the predictivity gain obtained when stepping from the cluster C to one of its sub-clusters.

[2] In this paper, we consider Cobweb to be the algorithm described in [Fis87], with the extension to deal with numeric attributes developed in Classit (see [GLF89]), but without any other extension from Classit.
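Given the per-cluster probabilities P(C_k) and predictivity scores, category utility is a short computation. A minimal sketch (the function and its input encoding are ours, not the paper's code):

```python
def category_utility(parent_pi, children):
    """Average predictivity gain over a partition.

    children: list of (p_k, pi_k) pairs, where p_k = P(C_k) is the
    fraction of the parent's objects falling in sub-cluster C_k and
    pi_k its individual predictivity.
    """
    K = len(children)
    return sum(p * (pi - parent_pi) for p, pi in children) / K

# A partition whose sub-clusters are more predictive than the parent
# scores higher, and is preferred by the hill-climbing search.
flat = category_utility(0.5, [(1.0, 0.5)])             # no gain
split = category_utility(0.5, [(0.5, 0.9), (0.5, 0.9)])
print(flat, split)
```

The hill-climbing search simply evaluates this score for each candidate partition produced by the operators and keeps the best.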

The individual predictivity of a cluster is defined as:

    π(C_k) = (1/I) Σ_{i=1}^{I} π(A_i, C_k)

where π(A_i, C_k) quantifies how predictive the attribute A_i is in C_k (i.e. how precisely values of A_i can be predicted for members of C_k). In this framework, a learning problem is defined on a set of attributes. Observations must be represented by a value for all (or some) of the attributes. Concepts (or clusters; both terms are used interchangeably in this context) must represent some distribution of values for each attribute of the learning problem. Attributes may be of different types. The definition of a type of attribute must thus include a value space, a way to represent a distribution of such values, and a predictivity measure for such a distribution. As originally described in [Fis87], Cobweb provides the definitions for nominal (categorical) attributes. Classit [GLF89] provides the definitions for numeric (continuous) attributes. The Labyrinth system [TL91] further extends Cobweb to deal with `composite' objects by allowing `structured' attributes. The next section introduces a new type of attribute, called a `set' attribute, defines the representation of values and distributions for this attribute type, and gives a predictivity evaluation measure for distributions over the values of a set attribute.

22.3 Clustering Sets

Representation of Objects

The aim of this paper is to describe a conceptual clustering system that allows objects to be represented as sets of components (i.e. sub-objects). The representation formalism is inspired by propositional logic. In that framework each object is a list of attribute-value pairs. Our system allows a value to be a set of objects (instead of being a single numeric or categorical quantity); all objects of a set are described with the same set of attributes.
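For numeric attributes the per-attribute predictivity is the inverse of the estimated standard deviation, and the cluster score averages it over attributes. A minimal sketch (function names are ours; Classit additionally bounds the standard deviation from below by an `acuity' parameter to keep 1/σ finite, which is not modelled here):

```python
import statistics

def attribute_predictivity(values):
    """1/sigma for one numeric attribute: tighter distributions predict better."""
    sigma = statistics.pstdev(values)
    return 1.0 / sigma if sigma > 0 else float("inf")

def cluster_predictivity(columns):
    """pi(C_k): average attribute predictivity over the I attributes."""
    return sum(attribute_predictivity(col) for col in columns) / len(columns)

# Two numeric attributes observed over the members of one cluster:
print(cluster_predictivity([[4.0, 6.0], [9.0, 11.0]]))
```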
Members of such a set value are less abstract objects, each of which describes some aspect of the more global (abstract) object. To illustrate this representation scheme, consider an example domain of quadruped mammals found in the literature: each object consists of several sub-objects (cylinders), each of which is described along several identical numeric attributes. An object of that domain may be written as:

    Dog1 = [ cyls={[r=13.9,l=25.1], [r=6.5,l=10.0], [r=4.7,l=6.3]} ]

In this case the object Dog1 is described by only one attribute, named cyls, whose value is a set containing three sub-objects. Each of these sub-objects is described with the same two numeric attributes (r and l).[3]

[3] In this paper only single-attribute domains will be considered. The mechanisms described here apply to one attribute whose values are sets of objects, but, with no loss of generality, the method can be extended to domains described by several attributes, each of them having sets as values.
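In a programming language, such a set-valued object might be encoded as follows (a hypothetical encoding, not the paper's implementation):

```python
# The Dog1 example: one 'set' attribute, cyls, whose value is a set of
# components, each an ordinary attribute-value list.
dog1 = {
    "cyls": [
        {"r": 13.9, "l": 25.1},
        {"r": 6.5,  "l": 10.0},
        {"r": 4.7,  "l": 6.3},
    ]
}

# All components share the same attribute names, but different objects may
# carry any number of components: there is no fixed attribute-to-value mapping.
print(len(dog1["cyls"]))
```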

Clustering Strategy

Section 22.2 briefly explained how to build a cluster hierarchy. We noted that this problem can be cast as a problem of quantifying how predictive a cluster is. In the case of the domain described above, the question is: "How can one quantify the predictivity of a set of animals?" The answer lies in the fact that a separate cluster hierarchy can be built for each level of abstraction. In the animal domain this means that two cluster hierarchies are maintained: one for the cylinders (the component-cluster hierarchy) and one for the animals (the composite-cluster hierarchy). The results of incorporating the cylinders forming an animal are used during the incorporation of the animal. That is, the same process is repeated at each level of structural abstraction. This is a component-first strategy, where parts are clustered before the whole [TL91]. The algorithm may be summarized as follows:

1. for each component:
   1a. incorporate the component into the component-cluster hierarchy
   1b. update the description of the composite (replace the component by a reference to the cluster it reached)
2. incorporate the composite object into the composite-cluster hierarchy

Each `incorporate' operation is a call to Cobweb. The first call operates at a lower level (i.e. in the component space) than the last one, which operates in the composite space. It follows from this sketch that the integration of a composite object (e.g. an animal) is done with each component (e.g. a cylinder) replaced by a reference to a component-cluster. What is thus needed is a way to quantify the predictivity of a set of sets of component-cluster references. This problem divides itself into two distinct phases, explained in the next two sections.

Representation of Clusters

The first step is to find an adequate representation for clusters covering objects whose description includes set-valued attributes. Since members of values (i.e.
sub-objects or components) are hierarchically clustered, they can be replaced by their most specific covering cluster: the initial instance space is replaced by a cluster hierarchy (which is a partially ordered space). A conceptual (or intensional) representation of a set of composite objects must rely on a conceptual `vocabulary' for the components. This language is provided by the component-cluster hierarchy: clusters of components will be used to characterize clusters of composite objects. The second step is to select an adequate subset of component clusters. An intuitive way to characterize a set of sets is by way of their mutual intersection (or overlap). However, instead of having simple members, one may apply the same line of reasoning to complex (structured) objects. Generalizing a set of sets means finding a set of clusters describing all the sets of components, i.e. for each set under consideration, there exists a mapping from that set to the generalization. Hierarchically organized clusters of components clearly help to find such a description.
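The component-first strategy of steps 1a, 1b and 2 above can be sketched as follows, with a toy stand-in for the real Cobweb incorporation routine (the real routine sorts the object down a probabilistic hierarchy; here it merely records the object and returns a cluster id):

```python
def cobweb_incorporate(tree, obj):
    """Toy stand-in for Cobweb: record the object, return a cluster reference."""
    tree.append(obj)
    return len(tree) - 1

def incorporate_composite(composite, component_tree, composite_tree):
    """Component-first strategy: cluster the parts, then the whole."""
    references = []
    for component in composite["components"]:
        # 1a. incorporate the component into the component-cluster hierarchy
        ref = cobweb_incorporate(component_tree, component)
        # 1b. replace the component by a reference to the cluster it reached
        references.append(ref)
    # 2. incorporate the composite object into the composite-cluster hierarchy
    return cobweb_incorporate(composite_tree, {"components": references})

comp_tree, compo_tree = [], []
incorporate_composite({"components": [{"r": 13.9}, {"r": 6.5}]},
                      comp_tree, compo_tree)
print(len(comp_tree), len(compo_tree))
```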

The conceptual representation of a set of composite objects will thus be a set of component-clusters: these clusters are called central clusters, and must have the following properties:

1. Each central cluster must cover at least one member of each value. This means that each central cluster must characterize the intersection between all the sets under consideration.
2. All the members of the sets must be covered by central clusters. This means that central clusters must cover all the elements of all the sets, thus avoiding `marginal' intersections between only parts of the sets (which would leave some other components uncovered).
3. Central clusters have to be as specific as possible. This means that the set of central clusters is the most `precise' conceptual characterization of the intersection between all the sets.

It is important to note that in cases where only one composite object is covered, the set of central clusters used to represent the composite cluster is the set of component-clusters covering the components of the object.

Predictivity Evaluation

The second step is to evaluate the predictivity of the set of central clusters. Since each central cluster can be found in the component-cluster hierarchy, it is labeled with its individual predictivity (named π). A straightforward way to compute the predictivity of a set of clusters is to average their individual predictivity scores. This was the approach taken in our implementation, even though other methods of combination could be explored. However, since central clusters do not all cover the same proportion of objects, the contribution of each central cluster is weighted by the proportion of components it covers. The predictivity of a set of L central clusters {c_1, ..., c_L} is thus:

    Σ_{l=1}^{L} (n_o(c_l) / N_o) π(c_l)

where n_o(c_l) is the number of members covered by c_l, and N_o = Σ_{l=1}^{L} n_o(c_l) is the total number of components.
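The coverage-weighted average above is a one-liner; a sketch (the pair encoding of a central cluster is ours):

```python
def set_predictivity(central_clusters):
    """Coverage-weighted predictivity of a set of central clusters.

    central_clusters: list of (n_o, pi) pairs, i.e. the number of
    components a central cluster covers and its individual predictivity.
    """
    total = sum(n for n, _ in central_clusters)
    return sum(n / total * pi for n, pi in central_clusters)

# A broad but weak cluster versus a narrow but sharp one:
print(set_predictivity([(30, 0.2), (10, 1.0)]))
```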
Given the definition of central clusters, n_o(c_l) is guaranteed to be at least equal to the number of covered objects. The weight associated with a central cluster will be called its coverage. The overall predictivity is thus a tradeoff between the predictive ability and the coverage of central clusters.

Incrementality

The original Cobweb algorithm, which worked with a purely propositional formalism, addressed the problem of concept formation, which is defined as incremental conceptual clustering. Incremental ability is crucial in most applications. It is thus important to check that incrementality is preserved when extending the knowledge representation language. The problem may be stated as: "Having a composite-cluster description and a new composite object to incorporate, how can one compute the new description of the composite-cluster?" The process can be divided into two phases:

- generalize any central cluster that does not cover at least one of the components of the new object;
- if any component of the incoming object remains uncovered by a central cluster, generalize the "nearest" central cluster.

This is only a sketch of the procedure; importantly, any generalization must be performed carefully. A generalization is the replacement of one central cluster by its parent in the component-cluster hierarchy. This process is equivalent to the application of a "climb-generalization-tree" operator [Mic83], with the property that the generalization tree is itself built and maintained by the system. Nevertheless, this simple procedure ensures that the definition of central clusters is maintained. The two phases correspond to the first two conditions in the definition of central clusters. It has to be noted that the addition of an object to a composite cluster may only generalize the central clusters describing it. This does not imply that the predictivity of that composite cluster decreases, since a loss in predictivity may be compensated by an increase in coverage.

Complexity

This section investigates the computational cost of the set-clustering process, and focuses particularly on the integration of a new composite object into an existing composite cluster. The problem of structured concept formation has already been addressed by the Labyrinth system [TL91].[4] The difference between Labyrinth and the system described in this paper is that Labyrinth is designed to find a binding between the components and a predefined set of attributes. Once the binding is determined, the task is reduced to composite-object clustering. But this binding process has a high computational cost: O(d!) if an exhaustive search is performed (d being the number of components per object, which is fixed in Labyrinth), and O(d^2) or O(d^3) if heuristic solutions are used (see [TL91]).
This binding is computed each time an object is added to a cluster. Note that this cost may be penalizing when d is high (see the next section for examples of such situations). In contrast, our system solves the mapping by finding a mutual intersection between all of the sets. Hence, the mapping process is replaced by the search for central clusters. Let us consider a composite cluster C covering N objects S_1, ..., S_N, and let n_i = |S_i|. It is easy to see that C will be represented by at most ω = max_i {n_i} central clusters. Hence, the integration of the next composite object into C may lead to at most ω generalizations during the first step of the updating process. The second step may apply only |S_{N+1}| times. The whole updating process is thus linear in the average number of components per composite object. This notable decrease in complexity may be explained by the fact that instead of considering all possible bindings between the components and a predefined set of attributes, the set-clustering mechanism uses the component-cluster hierarchy to solve the partial-matching problem. Moreover, the binding searched for by Labyrinth appears as an epiphenomenon: a global structure emerges by way of the representation of composite clusters in terms of central clusters, rather than by searching for a fixed number of components.

[4] Labyrinth's formalism also includes relational information between components, which is not discussed here.
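The two-phase update of a composite cluster's central clusters can be sketched over a toy component-cluster hierarchy. The inner loop that climbs one parent link at a time is the climb-generalization-tree operator; picking an arbitrary central cluster in phase 2 stands in for the paper's "nearest" choice, which is not modelled here:

```python
# Toy component-cluster hierarchy over component ids.
members = {"root": {1, 2, 3, 4}, "a": {1, 2}, "b": {3, 4},
           "a1": {1}, "a2": {2}}
parent = {"a1": "a", "a2": "a", "a": "root", "b": "root"}

def generalize_until_covering(cluster, comps):
    """Climb parent links until the cluster covers at least one component."""
    while not (members[cluster] & comps):
        cluster = parent[cluster]  # one climb-generalization-tree step
    return cluster

def update_central(central, new_components):
    # Phase 1: every central cluster must cover a component of the new object.
    central = {generalize_until_covering(c, new_components) for c in central}
    # Phase 2: every component of the new object must be covered; keep
    # generalizing central clusters until nothing is left uncovered.
    uncovered = new_components - set().union(*(members[c] for c in central))
    while uncovered:
        c = central.pop()
        central.add(generalize_until_covering(c, uncovered))
        uncovered -= set().union(*(members[c] for c in central))
    return central

print(update_central({"a1"}, {2, 3}))
```

Note how the only possible move is upward: adding an object can only generalize the central clusters, exactly as stated above.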

22.4 Empirical Results

The Simplified Quadruped Mammals Domain

This domain has already been used in the literature on structured concept formation to demonstrate the abilities of the Classit algorithm [GLF89]. It has been simplified here for explanatory purposes. In this domain objects represent a perceptual sketch of an animal: each object is composed of three cylinders (instead of eight in the original domain), representing the head, the torso and one leg of the animal. Each cylinder is described by two numeric attributes (the radius and length), instead of nine in the original domain. This domain illustrates a special case of set clustering: since all the objects have the same number of components (three here), the problem is equivalent to clustering objects with no knowledge of the binding between attributes and values. Since these objects are artificially generated from four available models (cat, dog, horse or giraffe),[5] the goal is to discover these classes from the unlabeled data. In our experiment, 20 objects were randomly generated (five from each model), each one being made of three cylinders. The system constructed a cluster hierarchy of animals: it thus built two hierarchies (one for the cylinders and one for the animals). Results are sketched in Figure 1. Note that each terminal class of Figure 1 is, in fact, further developed.

FIGURE 1. Results in the quadruped mammals domain. [Tree diagram: a root cluster `All' splits into `Horse', `Giraffe' and `Small'; `Small' splits into `Cat' and `Dog'.]

The system found the four classes of animals even though not all of them are at the first level. More important is the fact (not shown in Figure 1) that the conceptual representation of each of these classes included three central clusters, corresponding to the three cylinders. The cluster labelled `Small' covers all small animals (dogs and cats), and its conceptual representation includes only two cylinder clusters.
The interpretation of such a conceptual representation is that a small animal is an animal made of two cylinders of one type (described by the first central cluster) and one cylinder of another type. Such a representation could not have been obtained if a fixed structure (i.e., a predetermined number of attributes) had been used.

Object Recognition Domains

Character Recognition. The second experiment involves a simplified form of pattern recognition. The basic idea is to consider a pattern as a set of pixels. To compensate for the loss of information on the spatial arrangement of pixels, the representation of each individual pixel is based on several convolutions.

[5] The instance generator is available at the "UCI Repository of Machine Learning Databases and Domain Theories" at Irvine, at ftp.ics.uci.edu, as /pub/machine-learning-databases/quadrapeds
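A sketch of how such pixel components might be built. To stay dependency-free, a separable Gaussian blur followed by a discrete Laplacian stands in for the true Laplacian-of-a-Gaussian convolution, and the pattern is invented:

```python
import numpy as np

def gaussian_kernel(sigma):
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def log_response(img, sigma):
    """Gaussian blur (separable), then a 5-point discrete Laplacian."""
    k = gaussian_kernel(sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, "same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, "same"), 0, blurred)
    lap = -4 * blurred
    lap += np.roll(blurred, 1, 0) + np.roll(blurred, -1, 0)
    lap += np.roll(blurred, 1, 1) + np.roll(blurred, -1, 1)
    return lap

pattern = np.zeros((16, 16))
pattern[4:12, 4:12] = 1.0  # a filled square standing in for a character

scales = (2.5, 3.0, 3.5)  # the paper's values 5/2, 3 and 7/2
responses = [log_response(pattern, s) for s in scales]

# One component per black pixel, described by its responses at each scale.
components = [[resp[y, x] for resp in responses]
              for y, x in zip(*np.nonzero(pattern))]
print(len(components))
```

Each component is then an ordinary numeric attribute-value list, so the set-clustering machinery of Section 22.3 applies unchanged.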

Data are drawn from a set of four alphabetic characters (shown in Figure 2), each of them printed in four directions. Each pattern was convolved with the Laplacian of a Gaussian at several different scales. Each pixel shown in black in the original pattern is considered a component of that pattern. Each component is described by the values of the convolutions at its position. These values can be seen as local characteristics of the pixel in the original pattern. In fact, as shown in [MH80], convolution with the Laplacian of a Gaussian can be used as an edge detector (a value near zero means that the pixel is on an edge in the original image). In the experiments reported here, three convolutions were used, with the σ parameter respectively equal to 5/2, 3 and 7/2 (see [MH80] for details about the meaning of this parameter). The convolutions were directly applied to the iconic images. Note also that the four sets representing one character (one set for each `direction') contain exactly the same objects. Thus, objects from the same class have the same number of components, which is not true between classes. The data are shown on the left side of Figure 2. The system was thus given 16 composite objects with a total of 1488 components. Results are shown on the right side of Figure 2.

FIGURE 2. The character-recognition data set: (a) the data, with n = 103, 80, 102 and 87 components per character class; (b) the resulting hierarchy.

As expected, the system discovered one cluster for each class of characters. As noted earlier, a fixed structure (i.e. a predefined set of attributes) cannot model such a problem.

Rotated Polygons. The third experiment was similar to the second, but with more realistic rotation angles. The design was as follows: each pattern was computed from an underlying polygon, to which a rotation was applied, whose angle was randomly drawn. The rotated figure was then discretized, leading to a grey-level icon.
The icon was then treated the same way as the alphabetic characters in the previous experiment. Each non-white pixel was considered a component, and local characteristics (i.e. values of the convolution with the Laplacian of a Gaussian) were used to describe it. The parameters of the convolutions were set to the same values as in the previous experiment. Three `models' were used to generate the data. Each model was randomly rotated by three different angles, and each time expressed as a set of pixels. Figure 3 shows the data with the angle of rotation and the number of pixels. This domain is an illustration of the most complex representational case, in which objects do not have the same number of

components, even if they are from the same class.

FIGURE 3. Nine rotated polygons, with rotation angle θ and pixel count n: (θ = 122, n = 44), (θ = 46, n = 51), (θ = 190, n = 47), (θ = 25, n = 46), (θ = 306, n = 47), (θ = 21, n = 47), (θ = 357, n = 44), (θ = 219, n = 50), (θ = 47, n = 50).

The result of the clustering is shown in Figure 4. The results are as good as in the naive case, and suggest that the set-clustering algorithm is able to discover rotation-invariant clusters of patterns.

FIGURE 4. Results in the pattern recognition domain.

22.5 Conclusion

The set-clustering mechanism described in this paper allows concept formation to take place in domains where a structure can be extracted. In particular, it resolves the task of concept formation from composite objects. In the case where each set is restricted to one member, and several sets are used to describe each object, the problem reduces to composite-object clustering. In the general case, the overlap between sets is used to form generalizations. Experimental results illustrate the performance of the algorithm and demonstrate abilities that cannot be attained by purely propositional algorithms. The experiments described in the previous section raise some questions about the nature of the discovered clusters, particularly in the case of the pattern-recognition problems. The representation of a pattern (as a set of pixels) is somewhat unusual, and the clusters that are derived are expressed in terms of pixel classes. It is thus impossible to ask the system to `illustrate' what a cross is, for instance. But a new cross will be recognized as such. More precisely, a new cross will be recognized as being similar to previous ones. It seems that this is a case of a knowledge structure with no explicit, internal representation of known clusters. The component-cluster hierarchy may be said to reside at a sub-symbolic level (assuming that patterns are at the symbol level).
The work presented in this paper is part of a project on learning and image processing. Earlier work on this project established the adequacy of concept formation techniques for image segmentation [KBK94]. The next step is to study the abilities of the same algorithm

on an object recognition problem. Preliminary results are described in this paper. The important point is that the same algorithm is used for both tasks (image segmentation and object recognition), and that this algorithm is totally unsupervised. Future work will likely include the integration of both phases (i.e. object recognition in conjunction with segmentation).

References

[Bis92] G. Bisson. Conceptual clustering in a first order logic representation. In Proceedings of the Tenth European Conference on Artificial Intelligence, pages 458-462. J. Wiley and Sons, 1992.

[DLPT82] E. Diday, J. Lemaire, J. Pouget, and F. Testu. Elements d'analyse de donnees. Dunod, Paris, 1982.

[Fis87] D. H. Fisher. Knowledge acquisition via incremental conceptual clustering. Machine Learning, 2:139-172, 1987.

[FPL91] D. H. Fisher, M. J. Pazzani, and P. Langley, editors. Concept Formation: Knowledge and Experience in Unsupervised Learning. Morgan Kaufmann, 1991.

[GLF89] J. H. Gennari, P. Langley, and D. H. Fisher. Models of incremental concept formation. Artificial Intelligence, 40:11-61, 1989.

[KBK94] J. J. Korczak, D. Blamont, and A. Ketterlin. Thematic image segmentation by a concept formation algorithm. In Proceedings of the European Symposium on Satellite Remote Sensing, 1994.

[KM94] J.-U. Kietz and K. Morik. A polynomial approach to the constructive induction of structural knowledge. Machine Learning, 14:193-217, 1994.

[KR93] J. J. Korczak and M. Rymarczyk. Application of classical clustering methods to digital image analysis. Technical report, Universite Louis Pasteur, Strasbourg, France, 1993.

[MH80] D. Marr and E. Hildreth. Theory of edge detection. Proceedings of the Royal Society of London, B 207:187-217, 1980.

[Mic83] R. S. Michalski. A theory and methodology of inductive learning. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach. Morgan Kaufmann, 1983.

[MS83] R. S. Michalski and R. E. Stepp.
Learning from observation: Conceptual clustering. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, editors, Machine Learning: An Artificial Intelligence Approach. Morgan Kaufmann, 1983.

[TL91] K. Thompson and P. Langley. Concept formation in structured domains. In Fisher et al. [FPL91].


Algebraic Properties of CSP Model Operators? Y.C. Law and J.H.M. Lee. The Chinese University of Hong Kong. Algebraic Properties of CSP Model Operators? Y.C. Law and J.H.M. Lee Department of Computer Science and Engineering The Chinese University of Hong Kong Shatin, N.T., Hong Kong SAR, China fyclaw,jleeg@cse.cuhk.edu.hk

More information

A Formal Analysis of Solution Quality in. FA/C Distributed Sensor Interpretation Systems. Computer Science Department Computer Science Department

A Formal Analysis of Solution Quality in. FA/C Distributed Sensor Interpretation Systems. Computer Science Department Computer Science Department A Formal Analysis of Solution Quality in FA/C Distributed Sensor Interpretation Systems Norman Carver Victor Lesser Computer Science Department Computer Science Department Southern Illinois University

More information

Object classes. recall (%)

Object classes. recall (%) Using Genetic Algorithms to Improve the Accuracy of Object Detection Victor Ciesielski and Mengjie Zhang Department of Computer Science, Royal Melbourne Institute of Technology GPO Box 2476V, Melbourne

More information

The Global Standard for Mobility (GSM) (see, e.g., [6], [4], [5]) yields a

The Global Standard for Mobility (GSM) (see, e.g., [6], [4], [5]) yields a Preprint 0 (2000)?{? 1 Approximation of a direction of N d in bounded coordinates Jean-Christophe Novelli a Gilles Schaeer b Florent Hivert a a Universite Paris 7 { LIAFA 2, place Jussieu - 75251 Paris

More information

Classifier C-Net. 2D Projected Images of 3D Objects. 2D Projected Images of 3D Objects. Model I. Model II

Classifier C-Net. 2D Projected Images of 3D Objects. 2D Projected Images of 3D Objects. Model I. Model II Advances in Neural Information Processing Systems 7. (99) The MIT Press, Cambridge, MA. pp.949-96 Unsupervised Classication of 3D Objects from D Views Satoshi Suzuki Hiroshi Ando ATR Human Information

More information

Minoru SASAKI and Kenji KITA. Department of Information Science & Intelligent Systems. Faculty of Engineering, Tokushima University

Minoru SASAKI and Kenji KITA. Department of Information Science & Intelligent Systems. Faculty of Engineering, Tokushima University Information Retrieval System Using Concept Projection Based on PDDP algorithm Minoru SASAKI and Kenji KITA Department of Information Science & Intelligent Systems Faculty of Engineering, Tokushima University

More information

BRACE: A Paradigm For the Discretization of Continuously Valued Data

BRACE: A Paradigm For the Discretization of Continuously Valued Data Proceedings of the Seventh Florida Artificial Intelligence Research Symposium, pp. 7-2, 994 BRACE: A Paradigm For the Discretization of Continuously Valued Data Dan Ventura Tony R. Martinez Computer Science

More information

Kalev Kask and Rina Dechter. Department of Information and Computer Science. University of California, Irvine, CA

Kalev Kask and Rina Dechter. Department of Information and Computer Science. University of California, Irvine, CA GSAT and Local Consistency 3 Kalev Kask and Rina Dechter Department of Information and Computer Science University of California, Irvine, CA 92717-3425 fkkask,dechterg@ics.uci.edu Abstract It has been

More information

Vector Regression Machine. Rodrigo Fernandez. LIPN, Institut Galilee-Universite Paris 13. Avenue J.B. Clement Villetaneuse France.

Vector Regression Machine. Rodrigo Fernandez. LIPN, Institut Galilee-Universite Paris 13. Avenue J.B. Clement Villetaneuse France. Predicting Time Series with a Local Support Vector Regression Machine Rodrigo Fernandez LIPN, Institut Galilee-Universite Paris 13 Avenue J.B. Clement 9343 Villetaneuse France rf@lipn.univ-paris13.fr Abstract

More information

WEIGHTED K NEAREST NEIGHBOR CLASSIFICATION ON FEATURE PROJECTIONS 1

WEIGHTED K NEAREST NEIGHBOR CLASSIFICATION ON FEATURE PROJECTIONS 1 WEIGHTED K NEAREST NEIGHBOR CLASSIFICATION ON FEATURE PROJECTIONS 1 H. Altay Güvenir and Aynur Akkuş Department of Computer Engineering and Information Science Bilkent University, 06533, Ankara, Turkey

More information

Department of. Computer Science. Remapping Subpartitions of. Hyperspace Using Iterative. Genetic Search. Keith Mathias and Darrell Whitley

Department of. Computer Science. Remapping Subpartitions of. Hyperspace Using Iterative. Genetic Search. Keith Mathias and Darrell Whitley Department of Computer Science Remapping Subpartitions of Hyperspace Using Iterative Genetic Search Keith Mathias and Darrell Whitley Technical Report CS-4-11 January 7, 14 Colorado State University Remapping

More information

and moving down by maximizing a constrained full conditional density on sub-trees of decreasing size given the positions of all facets not contained i

and moving down by maximizing a constrained full conditional density on sub-trees of decreasing size given the positions of all facets not contained i Image Feature Identication via Bayesian Hierarchical Models Colin McCulloch, Jacob Laading, and Valen Johnson Institute of Statistics and Decision Sciences, Duke University Colin McCulloch, Box 90251,

More information

ing data (clouds) and solar spots. Once reliable sets of pixels are computed by contour extraction algorithms, a choice of orientation must be superim

ing data (clouds) and solar spots. Once reliable sets of pixels are computed by contour extraction algorithms, a choice of orientation must be superim Modeling Oceanographic Structures in Sea Surface Temperature Satellite Image Sequences I. L. Herlin, H. M. Yahia INRIA-Rocquencourt BP 105, 78153 Le Chesnay Cedex, France E-mail : Isabelle.Herlin@inria.fr,

More information

CHAPTER 6 IDENTIFICATION OF CLUSTERS USING VISUAL VALIDATION VAT ALGORITHM

CHAPTER 6 IDENTIFICATION OF CLUSTERS USING VISUAL VALIDATION VAT ALGORITHM 96 CHAPTER 6 IDENTIFICATION OF CLUSTERS USING VISUAL VALIDATION VAT ALGORITHM Clustering is the process of combining a set of relevant information in the same group. In this process KM algorithm plays

More information

Enumeration of Full Graphs: Onset of the Asymptotic Region. Department of Mathematics. Massachusetts Institute of Technology. Cambridge, MA 02139

Enumeration of Full Graphs: Onset of the Asymptotic Region. Department of Mathematics. Massachusetts Institute of Technology. Cambridge, MA 02139 Enumeration of Full Graphs: Onset of the Asymptotic Region L. J. Cowen D. J. Kleitman y F. Lasaga D. E. Sussman Department of Mathematics Massachusetts Institute of Technology Cambridge, MA 02139 Abstract

More information

Hyperplane Ranking in. Simple Genetic Algorithms. D. Whitley, K. Mathias, and L. Pyeatt. Department of Computer Science. Colorado State University

Hyperplane Ranking in. Simple Genetic Algorithms. D. Whitley, K. Mathias, and L. Pyeatt. Department of Computer Science. Colorado State University Hyperplane Ranking in Simple Genetic Algorithms D. Whitley, K. Mathias, and L. yeatt Department of Computer Science Colorado State University Fort Collins, Colorado 8523 USA whitley,mathiask,pyeatt@cs.colostate.edu

More information

Lazy Acquisition of Place Knowledge Pat Langley Institute for the Study of Learning and Expertise 2164 Staunton Court, Palo

Lazy Acquisition of Place Knowledge Pat Langley Institute for the Study of Learning and Expertise 2164 Staunton Court, Palo Lazy Acquisition of Place Knowledge Pat Langley (Langley@cs.stanford.edu) Institute for the Study of Learning and Expertise 2164 Staunton Court, Palo Alto, CA 94306 Karl Pfleger (KPfleger@hpp.stanford.edu)

More information

Supervised Variable Clustering for Classification of NIR Spectra

Supervised Variable Clustering for Classification of NIR Spectra Supervised Variable Clustering for Classification of NIR Spectra Catherine Krier *, Damien François 2, Fabrice Rossi 3, Michel Verleysen, Université catholique de Louvain, Machine Learning Group, place

More information

Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques

Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques M. Lazarescu 1,2, H. Bunke 1, and S. Venkatesh 2 1 Computer Science Department, University of Bern, Switzerland 2 School of

More information

Relational Knowledge Discovery in Databases Heverlee. fhendrik.blockeel,

Relational Knowledge Discovery in Databases Heverlee.   fhendrik.blockeel, Relational Knowledge Discovery in Databases Hendrik Blockeel and Luc De Raedt Katholieke Universiteit Leuven Department of Computer Science Celestijnenlaan 200A 3001 Heverlee e-mail: fhendrik.blockeel,

More information

Automatic Group-Outlier Detection

Automatic Group-Outlier Detection Automatic Group-Outlier Detection Amine Chaibi and Mustapha Lebbah and Hanane Azzag LIPN-UMR 7030 Université Paris 13 - CNRS 99, av. J-B Clément - F-93430 Villetaneuse {firstname.secondname}@lipn.univ-paris13.fr

More information

Adaptive Estimation of Distributions using Exponential Sub-Families Alan Gous Stanford University December 1996 Abstract: An algorithm is presented wh

Adaptive Estimation of Distributions using Exponential Sub-Families Alan Gous Stanford University December 1996 Abstract: An algorithm is presented wh Adaptive Estimation of Distributions using Exponential Sub-Families Alan Gous Stanford University December 1996 Abstract: An algorithm is presented which, for a large-dimensional exponential family G,

More information

Object perception by primates

Object perception by primates Object perception by primates How does the visual system put together the fragments to form meaningful objects? The Gestalt approach The whole differs from the sum of its parts and is a result of perceptual

More information

Data Analytics and Boolean Algebras

Data Analytics and Boolean Algebras Data Analytics and Boolean Algebras Hans van Thiel November 28, 2012 c Muitovar 2012 KvK Amsterdam 34350608 Passeerdersstraat 76 1016 XZ Amsterdam The Netherlands T: + 31 20 6247137 E: hthiel@muitovar.com

More information

Empirical transfer function determination by. BP 100, Universit de PARIS 6

Empirical transfer function determination by. BP 100, Universit de PARIS 6 Empirical transfer function determination by the use of Multilayer Perceptron F. Badran b, M. Crepon a, C. Mejia a, S. Thiria a and N. Tran a a Laboratoire d'oc anographie Dynamique et de Climatologie

More information

P ^ 2π 3 2π 3. 2π 3 P 2 P 1. a. b. c.

P ^ 2π 3 2π 3. 2π 3 P 2 P 1. a. b. c. Workshop on Fundamental Structural Properties in Image and Pattern Analysis - FSPIPA-99, Budapest, Hungary, Sept 1999. Quantitative Analysis of Continuous Symmetry in Shapes and Objects Hagit Hel-Or and

More information

Inital Starting Point Analysis for K-Means Clustering: A Case Study

Inital Starting Point Analysis for K-Means Clustering: A Case Study lemson University TigerPrints Publications School of omputing 3-26 Inital Starting Point Analysis for K-Means lustering: A ase Study Amy Apon lemson University, aapon@clemson.edu Frank Robinson Vanderbilt

More information

Reconciling Dierent Semantics for Concept Denition (Extended Abstract) Giuseppe De Giacomo Dipartimento di Informatica e Sistemistica Universita di Ro

Reconciling Dierent Semantics for Concept Denition (Extended Abstract) Giuseppe De Giacomo Dipartimento di Informatica e Sistemistica Universita di Ro Reconciling Dierent Semantics for Concept Denition (Extended Abstract) Giuseppe De Giacomo Dipartimento di Informatica e Sistemistica Universita di Roma \La Sapienza" Via Salaria 113, 00198 Roma, Italia

More information

Feature-weighted k-nearest Neighbor Classifier

Feature-weighted k-nearest Neighbor Classifier Proceedings of the 27 IEEE Symposium on Foundations of Computational Intelligence (FOCI 27) Feature-weighted k-nearest Neighbor Classifier Diego P. Vivencio vivencio@comp.uf scar.br Estevam R. Hruschka

More information

E-Companion: On Styles in Product Design: An Analysis of US. Design Patents

E-Companion: On Styles in Product Design: An Analysis of US. Design Patents E-Companion: On Styles in Product Design: An Analysis of US Design Patents 1 PART A: FORMALIZING THE DEFINITION OF STYLES A.1 Styles as categories of designs of similar form Our task involves categorizing

More information

AST: Support for Algorithm Selection with a CBR Approach

AST: Support for Algorithm Selection with a CBR Approach AST: Support for Algorithm Selection with a CBR Approach Guido Lindner 1 and Rudi Studer 2 1 DaimlerChrysler AG, Research &Technology FT3/KL, PO: DaimlerChrysler AG, T-402, D-70456 Stuttgart, Germany guido.lindner@daimlerchrysler.com

More information

Cluster quality 15. Running time 0.7. Distance between estimated and true means Running time [s]

Cluster quality 15. Running time 0.7. Distance between estimated and true means Running time [s] Fast, single-pass K-means algorithms Fredrik Farnstrom Computer Science and Engineering Lund Institute of Technology, Sweden arnstrom@ucsd.edu James Lewis Computer Science and Engineering University of

More information

Advances in Neural Information Processing Systems, in press, 1996

Advances in Neural Information Processing Systems, in press, 1996 Advances in Neural Information Processing Systems, in press, 1996 A Framework for Non-rigid Matching and Correspondence Suguna Pappu, Steven Gold, and Anand Rangarajan 1 Departments of Diagnostic Radiology

More information

INDUCTIVE LEARNING WITH BCT

INDUCTIVE LEARNING WITH BCT INDUCTIVE LEARNING WITH BCT Philip K Chan CUCS-451-89 August, 1989 Department of Computer Science Columbia University New York, NY 10027 pkc@cscolumbiaedu Abstract BCT (Binary Classification Tree) is a

More information

has to choose. Important questions are: which relations should be dened intensionally,

has to choose. Important questions are: which relations should be dened intensionally, Automated Design of Deductive Databases (Extended abstract) Hendrik Blockeel and Luc De Raedt Department of Computer Science, Katholieke Universiteit Leuven Celestijnenlaan 200A B-3001 Heverlee, Belgium

More information

Support vector machines

Support vector machines Support vector machines Cavan Reilly October 24, 2018 Table of contents K-nearest neighbor classification Support vector machines K-nearest neighbor classification Suppose we have a collection of measurements

More information

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanfordedu) February 6, 2018 Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 In the

More information

Images Reconstruction using an iterative SOM based algorithm.

Images Reconstruction using an iterative SOM based algorithm. Images Reconstruction using an iterative SOM based algorithm. M.Jouini 1, S.Thiria 2 and M.Crépon 3 * 1- LOCEAN, MMSA team, CNAM University, Paris, France 2- LOCEAN, MMSA team, UVSQ University Paris, France

More information

Gen := 0. Create Initial Random Population. Termination Criterion Satisfied? Yes. Evaluate fitness of each individual in population.

Gen := 0. Create Initial Random Population. Termination Criterion Satisfied? Yes. Evaluate fitness of each individual in population. An Experimental Comparison of Genetic Programming and Inductive Logic Programming on Learning Recursive List Functions Lappoon R. Tang Mary Elaine Cali Raymond J. Mooney Department of Computer Sciences

More information

Concept Tree Based Clustering Visualization with Shaded Similarity Matrices

Concept Tree Based Clustering Visualization with Shaded Similarity Matrices Syracuse University SURFACE School of Information Studies: Faculty Scholarship School of Information Studies (ischool) 12-2002 Concept Tree Based Clustering Visualization with Shaded Similarity Matrices

More information

1) Give decision trees to represent the following Boolean functions:

1) Give decision trees to represent the following Boolean functions: 1) Give decision trees to represent the following Boolean functions: 1) A B 2) A [B C] 3) A XOR B 4) [A B] [C Dl Answer: 1) A B 2) A [B C] 1 3) A XOR B = (A B) ( A B) 4) [A B] [C D] 2 2) Consider the following

More information

Structural and Syntactic Pattern Recognition

Structural and Syntactic Pattern Recognition Structural and Syntactic Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent

More information

6. Learning Partitions of a Set

6. Learning Partitions of a Set 6. Learning Partitions of a Set Also known as clustering! Usually, we partition sets into subsets with elements that are somewhat similar (and since similarity is often task dependent, different partitions

More information

Based on Raymond J. Mooney s slides

Based on Raymond J. Mooney s slides Instance Based Learning Based on Raymond J. Mooney s slides University of Texas at Austin 1 Example 2 Instance-Based Learning Unlike other learning algorithms, does not involve construction of an explicit

More information

2 Keywords Backtracking Algorithms, Constraint Satisfaction Problem, Distributed Articial Intelligence, Iterative Improvement Algorithm, Multiagent Sy

2 Keywords Backtracking Algorithms, Constraint Satisfaction Problem, Distributed Articial Intelligence, Iterative Improvement Algorithm, Multiagent Sy 1 The Distributed Constraint Satisfaction Problem: Formalization and Algorithms IEEE Trans. on Knowledge and DATA Engineering, vol.10, No.5 September 1998 Makoto Yokoo, Edmund H. Durfee, Toru Ishida, and

More information

SIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING

SIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING SIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING TAE-WAN RYU AND CHRISTOPH F. EICK Department of Computer Science, University of Houston, Houston, Texas 77204-3475 {twryu, ceick}@cs.uh.edu

More information

Constrained Clustering with Interactive Similarity Learning

Constrained Clustering with Interactive Similarity Learning SCIS & ISIS 2010, Dec. 8-12, 2010, Okayama Convention Center, Okayama, Japan Constrained Clustering with Interactive Similarity Learning Masayuki Okabe Toyohashi University of Technology Tenpaku 1-1, Toyohashi,

More information

Localization in Graphs. Richardson, TX Azriel Rosenfeld. Center for Automation Research. College Park, MD

Localization in Graphs. Richardson, TX Azriel Rosenfeld. Center for Automation Research. College Park, MD CAR-TR-728 CS-TR-3326 UMIACS-TR-94-92 Samir Khuller Department of Computer Science Institute for Advanced Computer Studies University of Maryland College Park, MD 20742-3255 Localization in Graphs Azriel

More information

An Evaluation of Information Retrieval Accuracy. with Simulated OCR Output. K. Taghva z, and J. Borsack z. University of Massachusetts, Amherst

An Evaluation of Information Retrieval Accuracy. with Simulated OCR Output. K. Taghva z, and J. Borsack z. University of Massachusetts, Amherst An Evaluation of Information Retrieval Accuracy with Simulated OCR Output W.B. Croft y, S.M. Harding y, K. Taghva z, and J. Borsack z y Computer Science Department University of Massachusetts, Amherst

More information

Ratios and Proportional Relationships (RP) 6 8 Analyze proportional relationships and use them to solve real-world and mathematical problems.

Ratios and Proportional Relationships (RP) 6 8 Analyze proportional relationships and use them to solve real-world and mathematical problems. Ratios and Proportional Relationships (RP) 6 8 Analyze proportional relationships and use them to solve real-world and mathematical problems. 7.1 Compute unit rates associated with ratios of fractions,

More information

GSAT and Local Consistency

GSAT and Local Consistency GSAT and Local Consistency Kalev Kask Computer Science Department University of California at Irvine Irvine, CA 92717 USA Rina Dechter Computer Science Department University of California at Irvine Irvine,

More information

Exemplar Learning in Fuzzy Decision Trees

Exemplar Learning in Fuzzy Decision Trees Exemplar Learning in Fuzzy Decision Trees C. Z. Janikow Mathematics and Computer Science University of Missouri St. Louis, MO 63121 Abstract Decision-tree algorithms provide one of the most popular methodologies

More information

5. Compare the volume of a three dimensional figure to surface area.

5. Compare the volume of a three dimensional figure to surface area. 5. Compare the volume of a three dimensional figure to surface area. 1. What are the inferences that can be drawn from sets of data points having a positive association and a negative association. 2. Why

More information

CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008

CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008 CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008 Prof. Carolina Ruiz Department of Computer Science Worcester Polytechnic Institute NAME: Prof. Ruiz Problem

More information

Andrew Davenport and Edward Tsang. fdaveat,edwardgessex.ac.uk. mostly soluble problems and regions of overconstrained, mostly insoluble problems as

Andrew Davenport and Edward Tsang. fdaveat,edwardgessex.ac.uk. mostly soluble problems and regions of overconstrained, mostly insoluble problems as An empirical investigation into the exceptionally hard problems Andrew Davenport and Edward Tsang Department of Computer Science, University of Essex, Colchester, Essex CO SQ, United Kingdom. fdaveat,edwardgessex.ac.uk

More information

SOME TYPES AND USES OF DATA MODELS

SOME TYPES AND USES OF DATA MODELS 3 SOME TYPES AND USES OF DATA MODELS CHAPTER OUTLINE 3.1 Different Types of Data Models 23 3.1.1 Physical Data Model 24 3.1.2 Logical Data Model 24 3.1.3 Conceptual Data Model 25 3.1.4 Canonical Data Model

More information

Efficient integration of data mining techniques in DBMSs

Efficient integration of data mining techniques in DBMSs Efficient integration of data mining techniques in DBMSs Fadila Bentayeb Jérôme Darmont Cédric Udréa ERIC, University of Lyon 2 5 avenue Pierre Mendès-France 69676 Bron Cedex, FRANCE {bentayeb jdarmont

More information

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture

More information

Local qualitative shape from stereo. without detailed correspondence. Extended Abstract. Shimon Edelman. Internet:

Local qualitative shape from stereo. without detailed correspondence. Extended Abstract. Shimon Edelman. Internet: Local qualitative shape from stereo without detailed correspondence Extended Abstract Shimon Edelman Center for Biological Information Processing MIT E25-201, Cambridge MA 02139 Internet: edelman@ai.mit.edu

More information

Citation for published version (APA): He, J. (2011). Exploring topic structure: Coherence, diversity and relatedness

Citation for published version (APA): He, J. (2011). Exploring topic structure: Coherence, diversity and relatedness UvA-DARE (Digital Academic Repository) Exploring topic structure: Coherence, diversity and relatedness He, J. Link to publication Citation for published version (APA): He, J. (211). Exploring topic structure:

More information

Coarse-to-fine image registration

Coarse-to-fine image registration Today we will look at a few important topics in scale space in computer vision, in particular, coarseto-fine approaches, and the SIFT feature descriptor. I will present only the main ideas here to give

More information

Image Mining: frameworks and techniques

Image Mining: frameworks and techniques Image Mining: frameworks and techniques Madhumathi.k 1, Dr.Antony Selvadoss Thanamani 2 M.Phil, Department of computer science, NGM College, Pollachi, Coimbatore, India 1 HOD Department of Computer Science,

More information

International Journal of Computer Engineering and Applications, Volume IX, Issue X, Oct. 15 ISSN

International Journal of Computer Engineering and Applications, Volume IX, Issue X, Oct. 15  ISSN DIVERSIFIED DATASET EXPLORATION BASED ON USEFULNESS SCORE Geetanjali Mohite 1, Prof. Gauri Rao 2 1 Student, Department of Computer Engineering, B.V.D.U.C.O.E, Pune, Maharashtra, India 2 Associate Professor,

More information

Progress in Image Analysis and Processing III, pp , World Scientic, Singapore, AUTOMATIC INTERPRETATION OF FLOOR PLANS USING

Progress in Image Analysis and Processing III, pp , World Scientic, Singapore, AUTOMATIC INTERPRETATION OF FLOOR PLANS USING Progress in Image Analysis and Processing III, pp. 233-240, World Scientic, Singapore, 1994. 1 AUTOMATIC INTERPRETATION OF FLOOR PLANS USING SPATIAL INDEXING HANAN SAMET AYA SOFFER Computer Science Department

More information

2. CNeT Architecture and Learning 2.1. Architecture The Competitive Neural Tree has a structured architecture. A hierarchy of identical nodes form an

2. CNeT Architecture and Learning 2.1. Architecture The Competitive Neural Tree has a structured architecture. A hierarchy of identical nodes form an Competitive Neural Trees for Vector Quantization Sven Behnke and Nicolaos B. Karayiannis Department of Mathematics Department of Electrical and Computer Science and Computer Engineering Martin-Luther-University

More information

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph.

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph. Trees 1 Introduction Trees are very special kind of (undirected) graphs. Formally speaking, a tree is a connected graph that is acyclic. 1 This definition has some drawbacks: given a graph it is not trivial

More information

Regularity Analysis of Non Uniform Data

Regularity Analysis of Non Uniform Data Regularity Analysis of Non Uniform Data Christine Potier and Christine Vercken Abstract. A particular class of wavelet, derivatives of B-splines, leads to fast and ecient algorithms for contours detection

More information

Data Preprocessing. Slides by: Shree Jaswal

Data Preprocessing. Slides by: Shree Jaswal Data Preprocessing Slides by: Shree Jaswal Topics to be covered Why Preprocessing? Data Cleaning; Data Integration; Data Reduction: Attribute subset selection, Histograms, Clustering and Sampling; Data

More information

Handwritten Digit Recognition with a. Back-Propagation Network. Y. Le Cun, B. Boser, J. S. Denker, D. Henderson,

Handwritten Digit Recognition with a. Back-Propagation Network. Y. Le Cun, B. Boser, J. S. Denker, D. Henderson, Handwritten Digit Recognition with a Back-Propagation Network Y. Le Cun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel AT&T Bell Laboratories, Holmdel, N. J. 07733 ABSTRACT

More information

An Average-Case Analysis of the k-nearest Neighbor Classifier for Noisy Domains

An Average-Case Analysis of the k-nearest Neighbor Classifier for Noisy Domains An Average-Case Analysis of the k-nearest Neighbor Classifier for Noisy Domains Seishi Okamoto Fujitsu Laboratories Ltd. 2-2-1 Momochihama, Sawara-ku Fukuoka 814, Japan seishi@flab.fujitsu.co.jp Yugami

More information

Latest development in image feature representation and extraction
