Circle Graphs: New Visualization Tools for Text-Mining
|
|
- Leo Bond
- 5 years ago
- Views:
Transcription
1 Circle Graphs: New Visualization Tools for Text-Mining Yonatan Aumann, Ronen Feldman, Yaron Ben Yehuda, David Landau, Orly Liphstat, Yonatan Schler Department of Mathematics and Computer Science Bar-Ilan University Ramat-Gan, ISRAEL Tel: Fax: Abstract. The proliferation of digitally available textual data necessitates automatic tools for analyzing large textual collections. Thus, in analogy to data mining for structured databases, text mining is defined for textual collections. A central tool in text-mining is the analysis of concept relationship, which discovers connections between different concepts, as reflected in the collection. However, discovering these relationships is not sufficient, as they also have to be presented to the user in a meaningful and manageable way. In this paper we introduce a new family of visualization tools, which we coin circle graphs, which provide means for visualizing concept relationships mined from large collections. Circle graphs allow for instant appreciation of multiple relationships gathered from the entire collection. A special type of circle-graphs, called Trend Graphs, allows tracking of the evolution of relationships over time. 1 Introduction Most informal definitions [2] introduce knowledge discovery in databases (KDD) as the extraction of useful information from databases by large-scale search for interesting patterns. The vast majority of existing KDD applications and methods deal with structured databases, for example, client data stored in a relational database, and thus exploits data organized in records structured by categorical, ordinal, and continuous variables. However, a tremendous amount of information is stored in documents that are essentially unstructured. The availability of document collections and especially of online information is rapidly growing, so that an analysis bottleneck often arises also in this area. Thus, in analogy to data mining for structured data, text mining is defined for textual data. Text mining is the science of extracting information from hidden patterns in large textual collections. Text mining shares many characteristics with classical data mining, but also differs in some. Thus, it is necessary to provide special tools geared specifically to text mining. A central tool, found valuable in text mining, is the analysis of concept J.M. Zytkow and J. Rauch (Eds.): PKDD 99, LNAI 1704, pp , Springer Verlag Berlin Heidelberg 1999
2 278 Y. Aumann et al. relationship [3,4], defined as follows. Large textual corpuses are most commonly composed of a collection of separate documents (e.g. news articles, web pages). Each document refers to a set of concepts (terms). Text mining operations consider the distribution of concepts on the inter-document level, seeking to discover the nature and relationships of concepts as reflected in the collection as a whole. For example, in a collection of news articles, a large number of articles on politician X and "scandal" may indicate a negative image of the character, and alert for a new PR campaign. Or, for another example, a growing number of articles on both company Y and product Z may indicate a shift of focus in the company s interests, a shift which should be noted by its competitors. Notice that in both of these cases, the information is not provided by any single document, but rather from the totality of the collection. Thus, concept relationship analysis seeks to discover the relationship between concepts, as reflected by the totality of the corpus at hand. Clearly, discovering the concept relationships is only useful insofar as this information can be conveyed to the end-user. In practice, even medium-sized collections tend to give rise to a very large number of relationships. Thus, a mere listing of the discovered relationships is of little practical use for the end-user, as it is too large to comprehend. In addition, a linear list fails to show the structure arising from the entirety of relationships. Thus, we find that in order for mining of concept-relationship to be a useful tool for text-analysis, proper visualization techniques are necessary for presenting the results to the end user in a meaningful and manageable form. In this paper we introduce a new family of visualization tools for text-mining, which we call Circle Graphs. Circle graphs prove to be an effective tool for visualizing concept relationships discovered in text-mining. The graphs provide the user with an instant overall view of many relationships at once. Thus, circle graphs provide the extra benefit of surfacing the overall structure emerging from the multitude of relationships. We describe two specific types of circle graphs: 1. Category Connection Graphs: Provide a graphic representation of relationships between concept in different categories. 2. Context Connection Circle Graphs: Provide the user with a graphic representation of the connection between entities in a give context. 2 Circle Graphs We now describe the circle graphs visualization. We first give some basic definitions and notations.
3 Circle Graphs: New Visualization Tools for Text Mining Definitions and Notations Let T a be taxonomy. T is represented as a DAG (Directed Acyclic Graph), with the terms at the leaves. For a given node v T, we denote by Terms(v) the terms which are decedents of v. Let D be a collection of documents. For terms e 1 and e 2 we denote sup D (e 1,e) the number of documents in D which indicate a relationship between e 1 and e 2. The nature of the indication can be defined individually according to the context. In the current implementation we say that a document indicates a relationship if both terms appear in the document in the same sentence. This has proved to be a strong indicator. Similarly, for a collection D and terms e 1, e 2 and c, we denote by sup D (e 1,e 2,c) the number of documents which indicate a relationship between e 1 and e 2 in the context of c (e.g. relationship between the UK and Ireland in the context of peace talks). Again, the nature of indication may be determined in many ways. In the current implementation we require that they all appear in the same sentence. 2.2 Category Connection Maps Category Connection Maps provide a means for concise visual representation of connections between different categories, e.g. between companies and technologies, countries and people, or drugs and diseases. In order to define a category connection map, the user chooses any number of categories from the taxonomy. The system finds all the connections between the terms in the different categories. To visualize the output, all the terms in the chosen categories are depicted on a circle, with each category placed on a separate part on the circle. A line is depicted between terms of different categories which are related. A color coding scheme represents stronger links with darker colors. Formally, given a set C={v 1,v 2,,v k } of taxonomy nodes, and a document collection D, the category connection map is the weighted G defined as follows. The nodes of the graph are the set V=terms(v 1 ) terms(v 2 ) terms(v k ). Nodes u,w V are connected by an edge if: 1. u and w are from different categories 2. sup D (u,w)>0. 3. The weight of the edge (u,w) is sup D (u,w). An important point to notice regarding Category Connection Maps is that the map presents in a single picture information from the entire collection of documents. In the specific example, there is no single document that has the relationship between all the companies and the technologies. Rather, the graphs depicts aggregate knowledge from hundreds of documents. Thus, the user is provided with a bird s-eye summary view of data from across the collection.
4 280 Y. Aumann et al. The Category Connection Maps are dynamic in several ways. Firstly, the user can choose any node in the graph and the links from this node are highlighted. In addition, a double-click on any of the edges brings the list of documents which support the given relationship, together with the most relevant sentence in each document. Thus, in a way, the system is the opposite of search engines. Search engines point to documents, in the hope that the user will be able to find the necessary information. Circle category connection maps present the user with the information itself, which can then be backed by a list of documents. Figure 1 Context Circle Graph. The graph presents the connections between companies in the context of joint venture. Clusters are depicted separately. Color coded lines represent the strength of the connection. The information is based on 5,413 news articles obtained from Marketwatch.com. Only connection with weight 2 or more are depicted. 2.3 Context Circle-Graphs Context Circle-Graphs provide a visual means for concise representation of the relationship between many terms in a given context. In order to define a context circle graph the user defines: 1. A taxonomy category (e.g. companies ), which determines the nodes of the circle graph (e.g. companies) 2. An optional context node (e.g. joint venture ): which will determine the type of connection we wish to find among the graph nodes. Formally, for a set of taxonomy nodes vs, and a context node C, the Context Circle Graph is a weighted graph on the node set V=terms(vs). For each pair u,w V there is an edge between u and w, if there exists a context term c C, such that sup D (u,w,c)>0. In this case the weight of the edge is Σ c C sup D (u,w,c). If no
5 Circle Graphs: New Visualization Tools for Text Mining 281 context node is defined, then the connection can be in any context. Formally, in this case the root of the taxonomy is considered as the context. A Context Circle Graph for companies in the context of joint venture is depicted in Figure 1. In this case, the graph is clustered, as described below. The graph is based on 5,413 news documents downloaded from Marketwatch.com. The graph gives the user a summary of the entire collection in one visualization. The user can appreciate the overall structure of the connections between companies in this context, even before reading a single document! Clustering For Context Circle Graph we use clustering to identify clusters of nodes which are strongly inter-related in the given context. In the example of figure 1, the system identified six separate clusters. The edges between members of each cluster are depicted in a separate small Context Circle Graph, adjacent to the center graph. The center graph shows connections between terms of different clusters, and those with terms which are not in any cluster. We now describe the algorithm for determining the clusters. Note that the clustering problem here is different from the classic clustering problem. In the classic problem we are given points in some space, and seek to find clusters of points which are close to each other. Here, we are given a graph in which we are seeking to find dense sub-graphs of the graph. Thus, a different type of clustering algorithm is necessary. The algorithm is composed of two main steps. In the first step we assign weights to edges in the graph. The weight of an edge reflects the strength of the connection between the vertices. Edges incident to vertices which are in the same cluster should be associated with high weights. In the next step we identify sets of vertices which are dense-subgraphs. This step uses the weights assigned to the edges in the previous one. We first describe the weight-assignment method. In order to evaluate the strength of a link between a pair of vertices u and v, we consider two criteria: Let u be a vertex in the graph. We use the notation Γ(u) to represent the neighborhood of u. The cluster weight of (u,v) is affected by the similarity of Γ(u) and Γ(v). We assume that vertices within the same clusters have many common neighbors. Existence of many common neighbors is not a sufficient condition, since in dense graphs any two vertices may have some common neighbors. Thus, we emphasize the neighbors which are close to u and v in the sense of cluster weight. Suppose x Γ(u) Γ(u), if the cluster weights of (x,u) and (x,v) are high, there is a good chance that x belongs to the same cluster as u and v.
6 282 Y. Aumann et al. We can now define an update operation on an edge (u,v) which takes into account both criteria: w(u,v) = w(x,u) + w(x,v) x Γ (u) I Γ(v) x Γ (u) I Γ (v) The algorithm starts with initializing all weights to be equal, w(u,v)=1 for all u,v. Next, the update operation is applied to all edges iteratively. After a small number of iterations (set to 5 in the current implementation) it stops and outputs the values associated with each edge. We call this the cluster weight of the edge. The cluster weight has the following characteristic. Consider two vertices u and v within the same dense sub-graph. The edges within this sub-graph mutually affect each other. Thus the iterations drive cluster weight w(u,v) up. If, however, u and v do not belong to the dense sub-graph, the majority of edges affecting w(u,v) will have lower weights, resulting in a low cluster weight assigned to (u,v). After computing the weights, the second step of the algorithm finds the clusters. We define a new graph with the same set of vertices. In the new graph we consider only a small subset of the original edges, whose weights were the highest. In our experiments we took the top 10% of the edges. Since now almost all of the edges are likely to connect vertices within the same dense sub-graph, we thus separate the vertices into clusters by computing the connected components of the new graph and considering each component as a cluster. Figure 1 shows a circle context graph with six clusters. The cluster are depicted around the center graph. Each cluster is depicted in a different color. Nodes which are not in any cluster are colored gray. Note that the nodes of cluster appear both in the central circle and in the separate cluster graph. Edges within the cluster are depicted in the separate cluster graph. Edges between clusters are depicted in the central circle. References 1. Eick, S.G. and Wills, G.J.; Navigating Large Networks with Hierarchies. Visualization 93, pp , Fayyad, U,; Piatetsky-Shapiro, G.; and Smyth P. Knowledge Discover and Data Mining: Towards a Unifying Framework. In Proceedings of the 2 nd International Conference of Knowledge Discovery and Data Mining, 82-88, Feldman, R.; and Dagan, I.; KDT - Knowledge Discovery in Texts. In Proceedings of the 1sr International Conference of Knowledge Discovery and Data Mining Feldman, R.; Klosgen, W.; and Zilberstein, A.; Visualization Techniques to Explore Data Mining Results for Document Collections. In Proceedings of the 3rd International Conference of Knowledge Discovery and Data Mining, Hendley, R.J., Drew, N.S., Wood, A.M., and Beale R.; Narcissus: Visualizing Information. In Proc. Int. Symp. On Information Visualization, pp , 1995.
Text Mining via Information Extraction
Ronen Feldman, Yonatan Aumann, Moshe Fresko, Orly Liphstat, Binyamin Rosenfeld, Yonatan Schler Department of Mathematics and Computer Science Bar-Ilan University Ramat-Gan, ISRAEL Tel: 972-3-5326611 Fax:
More informationVisualization Techniques to Explore Data Mining Results for Document Collections
Visualization Techniques to Explore Data Mining Results for Document Collections Ronen Feldman Math and Computer Science Department Bar-Ilan University Ramat-Gan, ISRAEL 52900 feldman@macs.biu.ac.il Willi
More informationOverview. Data-mining. Commercial & Scientific Applications. Ongoing Research Activities. From Research to Technology Transfer
Data Mining George Karypis Department of Computer Science Digital Technology Center University of Minnesota, Minneapolis, USA. http://www.cs.umn.edu/~karypis karypis@cs.umn.edu Overview Data-mining What
More informationTheme Identification in RDF Graphs
Theme Identification in RDF Graphs Hanane Ouksili PRiSM, Univ. Versailles St Quentin, UMR CNRS 8144, Versailles France hanane.ouksili@prism.uvsq.fr Abstract. An increasing number of RDF datasets is published
More informationarxiv: v2 [cs.ds] 25 Jan 2017
d-hop Dominating Set for Directed Graph with in-degree Bounded by One arxiv:1404.6890v2 [cs.ds] 25 Jan 2017 Joydeep Banerjee, Arun Das, and Arunabha Sen School of Computing, Informatics and Decision System
More informationDiscovering interesting rules from financial data
Discovering interesting rules from financial data Przemysław Sołdacki Institute of Computer Science Warsaw University of Technology Ul. Andersa 13, 00-159 Warszawa Tel: +48 609129896 email: psoldack@ii.pw.edu.pl
More informationMulti-relational Decision Tree Induction
Multi-relational Decision Tree Induction Arno J. Knobbe 1,2, Arno Siebes 2, Daniël van der Wallen 1 1 Syllogic B.V., Hoefseweg 1, 3821 AE, Amersfoort, The Netherlands, {a.knobbe, d.van.der.wallen}@syllogic.com
More informationDatabase and Knowledge-Base Systems: Data Mining. Martin Ester
Database and Knowledge-Base Systems: Data Mining Martin Ester Simon Fraser University School of Computing Science Graduate Course Spring 2006 CMPT 843, SFU, Martin Ester, 1-06 1 Introduction [Fayyad, Piatetsky-Shapiro
More informationCommunity Detection. Community
Community Detection Community In social sciences: Community is formed by individuals such that those within a group interact with each other more frequently than with those outside the group a.k.a. group,
More informationVisualisation of Temporal Interval Association Rules
Visualisation of Temporal Interval Association Rules Chris P. Rainsford 1 and John F. Roddick 2 1 Defence Science and Technology Organisation, DSTO C3 Research Centre Fernhill Park, Canberra, 2600, Australia.
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationAssociation Rule Selection in a Data Mining Environment
Association Rule Selection in a Data Mining Environment Mika Klemettinen 1, Heikki Mannila 2, and A. Inkeri Verkamo 1 1 University of Helsinki, Department of Computer Science P.O. Box 26, FIN 00014 University
More informationCIS 121 Data Structures and Algorithms Minimum Spanning Trees
CIS 121 Data Structures and Algorithms Minimum Spanning Trees March 19, 2019 Introduction and Background Consider a very natural problem: we are given a set of locations V = {v 1, v 2,..., v n }. We want
More informationThe Interestingness Tool for Search in the Web
The Interestingness Tool for Search in the Web e-print, Gilad Amar and Ran Shaltiel Software Engineering Department, The Jerusalem College of Engineering Azrieli POB 3566, Jerusalem, 91035, Israel iaakov@jce.ac.il,
More informationK-Mean Clustering Algorithm Implemented To E-Banking
K-Mean Clustering Algorithm Implemented To E-Banking Kanika Bansal Banasthali University Anjali Bohra Banasthali University Abstract As the nations are connected to each other, so is the banking sector.
More informationMinimum Spanning Trees My T. UF
Introduction to Algorithms Minimum Spanning Trees @ UF Problem Find a low cost network connecting a set of locations Any pair of locations are connected There is no cycle Some applications: Communication
More informationKDD, SEMMA AND CRISP-DM: A PARALLEL OVERVIEW. Ana Azevedo and M.F. Santos
KDD, SEMMA AND CRISP-DM: A PARALLEL OVERVIEW Ana Azevedo and M.F. Santos ABSTRACT In the last years there has been a huge growth and consolidation of the Data Mining field. Some efforts are being done
More informationHardness of Subgraph and Supergraph Problems in c-tournaments
Hardness of Subgraph and Supergraph Problems in c-tournaments Kanthi K Sarpatwar 1 and N.S. Narayanaswamy 1 Department of Computer Science and Engineering, IIT madras, Chennai 600036, India kanthik@gmail.com,swamy@cse.iitm.ac.in
More informationA New Approach To Graph Based Object Classification On Images
A New Approach To Graph Based Object Classification On Images Sandhya S Krishnan,Kavitha V K P.G Scholar, Dept of CSE, BMCE, Kollam, Kerala, India Sandhya4parvathy@gmail.com Abstract: The main idea of
More informationThe strong chromatic number of a graph
The strong chromatic number of a graph Noga Alon Abstract It is shown that there is an absolute constant c with the following property: For any two graphs G 1 = (V, E 1 ) and G 2 = (V, E 2 ) on the same
More informationMatching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.
18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have
More informationSequences Modeling and Analysis Based on Complex Network
Sequences Modeling and Analysis Based on Complex Network Li Wan 1, Kai Shu 1, and Yu Guo 2 1 Chongqing University, China 2 Institute of Chemical Defence People Libration Army {wanli,shukai}@cqu.edu.cn
More informationPROJECT PERIODIC REPORT
PROJECT PERIODIC REPORT Grant Agreement number: 257403 Project acronym: CUBIST Project title: Combining and Uniting Business Intelligence and Semantic Technologies Funding Scheme: STREP Date of latest
More informationMichał Dębski. Uniwersytet Warszawski. On a topological relaxation of a conjecture of Erdős and Nešetřil
Michał Dębski Uniwersytet Warszawski On a topological relaxation of a conjecture of Erdős and Nešetřil Praca semestralna nr 3 (semestr letni 2012/13) Opiekun pracy: Tomasz Łuczak On a topological relaxation
More informationInformation Retrieval using Pattern Deploying and Pattern Evolving Method for Text Mining
Information Retrieval using Pattern Deploying and Pattern Evolving Method for Text Mining 1 Vishakha D. Bhope, 2 Sachin N. Deshmukh 1,2 Department of Computer Science & Information Technology, Dr. BAM
More informationStraight-Line Drawings of 2-Outerplanar Graphs on Two Curves
Straight-Line Drawings of 2-Outerplanar Graphs on Two Curves (Extended Abstract) Emilio Di Giacomo and Walter Didimo Università di Perugia ({digiacomo,didimo}@diei.unipg.it). Abstract. We study how to
More informationarxiv: v1 [math.co] 3 Apr 2016
A note on extremal results on directed acyclic graphs arxiv:1604.0061v1 [math.co] 3 Apr 016 A. Martínez-Pérez, L. Montejano and D. Oliveros April 5, 016 Abstract The family of Directed Acyclic Graphs as
More informationPreimages of Small Geometric Cycles
Preimages of Small Geometric Cycles Sally Cockburn Department of Mathematics Hamilton College, Clinton, NY scockbur@hamilton.edu Abstract A graph G is a homomorphic preimage of another graph H, or equivalently
More informationInternational Journal of Scientific & Engineering Research Volume 8, Issue 5, May ISSN
International Journal of Scientific & Engineering Research Volume 8, Issue 5, May-2017 106 Self-organizing behavior of Wireless Ad Hoc Networks T. Raghu Trivedi, S. Giri Nath Abstract Self-organization
More informationWeb Structure Mining Community Detection and Evaluation
Web Structure Mining Community Detection and Evaluation 1 Community Community. It is formed by individuals such that those within a group interact with each other more frequently than with those outside
More informationRealizing the 2-Associahedron
Realizing the 2-Associahedron Patrick Tierney Satyan L. Devadoss, Advisor Dagan Karp, Reader Department of Mathematics May, 2016 Copyright 2016 Patrick Tierney. The author grants Harvey Mudd College and
More informationStrongly Connected Components
Strongly Connected Components Let G = (V, E) be a directed graph Write if there is a path from to in G Write if and is an equivalence relation: implies and implies s equivalence classes are called the
More informationAs an example of possible application we compute these invariants for connectome graphs which are studied in neuroscience.
STRONG CONNECTIVITY AND ITS APPLICATIONS arxiv:1609.07355v2 [cs.dm] 20 Oct 2016 PETERIS DAUGULIS Abstract. Directed graphs are widely used in modelling of nonsymmetric relations in various sciences and
More informationCS473-Algorithms I. Lecture 13-A. Graphs. Cevdet Aykanat - Bilkent University Computer Engineering Department
CS473-Algorithms I Lecture 3-A Graphs Graphs A directed graph (or digraph) G is a pair (V, E), where V is a finite set, and E is a binary relation on V The set V: Vertex set of G The set E: Edge set of
More informationApproximation of Frequency Queries by Means of Free-Sets
Approximation of Frequency Queries by Means of Free-Sets Jean-François Boulicaut, Artur Bykowski, and Christophe Rigotti Laboratoire d Ingénierie des Systèmes d Information INSA Lyon, Bâtiment 501 F-69621
More informationCommunication Networks I December 4, 2001 Agenda Graph theory notation Trees Shortest path algorithms Distributed, asynchronous algorithms Page 1
Communication Networks I December, Agenda Graph theory notation Trees Shortest path algorithms Distributed, asynchronous algorithms Page Communication Networks I December, Notation G = (V,E) denotes a
More informationText Mining. Representation of Text Documents
Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,
More informationA SOM-view of oilfield data: A novel vector field visualization for Self-Organizing Maps and its applications in the petroleum industry
A SOM-view of oilfield data: A novel vector field visualization for Self-Organizing Maps and its applications in the petroleum industry Georg Pölzlbauer, Andreas Rauber (Department of Software Technology
More informationJoint Entity Resolution
Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute
More informationToward the integration of informatic tools and GRID infrastructure for Assyriology text analysis
58 Rencontre Assyriologique Internationale (RAI) Private and State 16-20 July 2012 - Leiden Toward the integration of informatic tools and GRID infrastructure for Assyriology text analysis Giovanni Ponti,
More informationA Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA)
International Journal of Innovation and Scientific Research ISSN 2351-8014 Vol. 12 No. 1 Nov. 2014, pp. 217-222 2014 Innovative Space of Scientific Research Journals http://www.ijisr.issr-journals.org/
More informationof optimization problems. In this chapter, it is explained that what network design
CHAPTER 2 Network Design Network design is one of the most important and most frequently encountered classes of optimization problems. In this chapter, it is explained that what network design is? The
More informationDischarging and reducible configurations
Discharging and reducible configurations Zdeněk Dvořák March 24, 2018 Suppose we want to show that graphs from some hereditary class G are k- colorable. Clearly, we can restrict our attention to graphs
More informationInformation mining and information retrieval : methods and applications
Information mining and information retrieval : methods and applications J. Mothe, C. Chrisment Institut de Recherche en Informatique de Toulouse Université Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse
More informationSolutions to Midterm 2 - Monday, July 11th, 2009
Solutions to Midterm - Monday, July 11th, 009 CPSC30, Summer009. Instructor: Dr. Lior Malka. (liorma@cs.ubc.ca) 1. Dynamic programming. Let A be a set of n integers A 1,..., A n such that 1 A i n for each
More informationBinding Number of Some Special Classes of Trees
International J.Math. Combin. Vol.(206), 76-8 Binding Number of Some Special Classes of Trees B.Chaluvaraju, H.S.Boregowda 2 and S.Kumbinarsaiah 3 Department of Mathematics, Bangalore University, Janana
More informationDistributed minimum spanning tree problem
Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with
More informationGraph Structure Over Time
Graph Structure Over Time Observing how time alters the structure of the IEEE data set Priti Kumar Computer Science Rensselaer Polytechnic Institute Troy, NY Kumarp3@rpi.edu Abstract This paper examines
More informationA Data Mining Case Study
A Data Mining Case Study Mark-André Krogel Otto-von-Guericke-Universität PF 41 20, D-39016 Magdeburg, Germany Phone: +49-391-67 113 99, Fax: +49-391-67 120 18 Email: krogel@iws.cs.uni-magdeburg.de ABSTRACT:
More informationResearch Question Presentation on the Edge Clique Covers of a Complete Multipartite Graph. Nechama Florans. Mentor: Dr. Boram Park
Research Question Presentation on the Edge Clique Covers of a Complete Multipartite Graph Nechama Florans Mentor: Dr. Boram Park G: V 5 Vertex Clique Covers and Edge Clique Covers: Suppose we have a graph
More informationApplying Data Mining Techniques to Wafer Manufacturing
Applying Data Mining Techniques to Wafer Manufacturing Elisa Bertino 1, Barbara Catania 2, and Eleonora Caglio 3 1 Università degli Studi di Milano Via Comelico 39/41 20135 Milano, Italy bertino@dsi.unimi.it
More informationEvolving SQL Queries for Data Mining
Evolving SQL Queries for Data Mining Majid Salim and Xin Yao School of Computer Science, The University of Birmingham Edgbaston, Birmingham B15 2TT, UK {msc30mms,x.yao}@cs.bham.ac.uk Abstract. This paper
More informationThe Design Space of Software Development Methodologies
The Design Space of Software Development Methodologies Kadie Clancy, CS2310 Term Project I. INTRODUCTION The success of a software development project depends on the underlying framework used to plan and
More informationInternational Journal of Computer Engineering and Applications, ICCSTAR-2016, Special Issue, May.16
The Survey Of Data Mining And Warehousing Architha.S, A.Kishore Kumar Department of Computer Engineering Department of computer engineering city engineering college VTU Bangalore, India ABSTRACT: Data
More informationExtremal Graph Theory: Turán s Theorem
Bridgewater State University Virtual Commons - Bridgewater State University Honors Program Theses and Projects Undergraduate Honors Program 5-9-07 Extremal Graph Theory: Turán s Theorem Vincent Vascimini
More informationGenerating Topology on Graphs by. Operations on Graphs
Applied Mathematical Sciences, Vol. 9, 2015, no. 57, 2843-2857 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2015.5154 Generating Topology on Graphs by Operations on Graphs M. Shokry Physics
More informationON DECOMPOSITION OF FUZZY BԐ OPEN SETS
ON DECOMPOSITION OF FUZZY BԐ OPEN SETS 1 B. Amudhambigai, 2 K. Saranya 1,2 Department of Mathematics, Sri Sarada College for Women, Salem-636016, Tamilnadu,India email: 1 rbamudha@yahoo.co.in, 2 saranyamath88@gmail.com
More informationRedefining and Enhancing K-means Algorithm
Redefining and Enhancing K-means Algorithm Nimrat Kaur Sidhu 1, Rajneet kaur 2 Research Scholar, Department of Computer Science Engineering, SGGSWU, Fatehgarh Sahib, Punjab, India 1 Assistant Professor,
More informationAn ICA-Based Multivariate Discretization Algorithm
An ICA-Based Multivariate Discretization Algorithm Ye Kang 1,2, Shanshan Wang 1,2, Xiaoyan Liu 1, Hokyin Lai 1, Huaiqing Wang 1, and Baiqi Miao 2 1 Department of Information Systems, City University of
More informationC-NBC: Neighborhood-Based Clustering with Constraints
C-NBC: Neighborhood-Based Clustering with Constraints Piotr Lasek Chair of Computer Science, University of Rzeszów ul. Prof. St. Pigonia 1, 35-310 Rzeszów, Poland lasek@ur.edu.pl Abstract. Clustering is
More informationLecture 22 Tuesday, April 10
CIS 160 - Spring 2018 (instructor Val Tannen) Lecture 22 Tuesday, April 10 GRAPH THEORY Directed Graphs Directed graphs (a.k.a. digraphs) are an important mathematical modeling tool in Computer Science,
More informationSome Strong Connectivity Concepts in Weighted Graphs
Annals of Pure and Applied Mathematics Vol. 16, No. 1, 2018, 37-46 ISSN: 2279-087X (P), 2279-0888(online) Published on 1 January 2018 www.researchmathsci.org DOI: http://dx.doi.org/10.22457/apam.v16n1a5
More informationSIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING
SIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING TAE-WAN RYU AND CHRISTOPH F. EICK Department of Computer Science, University of Houston, Houston, Texas 77204-3475 {twryu, ceick}@cs.uh.edu
More informationIsometric Diamond Subgraphs
Isometric Diamond Subgraphs David Eppstein Computer Science Department, University of California, Irvine eppstein@uci.edu Abstract. We test in polynomial time whether a graph embeds in a distancepreserving
More informationTadeusz Morzy, Maciej Zakrzewicz
From: KDD-98 Proceedings. Copyright 998, AAAI (www.aaai.org). All rights reserved. Group Bitmap Index: A Structure for Association Rules Retrieval Tadeusz Morzy, Maciej Zakrzewicz Institute of Computing
More informationPerformance Analysis of Data Mining Classification Techniques
Performance Analysis of Data Mining Classification Techniques Tejas Mehta 1, Dr. Dhaval Kathiriya 2 Ph.D. Student, School of Computer Science, Dr. Babasaheb Ambedkar Open University, Gujarat, India 1 Principal
More informationLecture 17 November 7
CS 559: Algorithmic Aspects of Computer Networks Fall 2007 Lecture 17 November 7 Lecturer: John Byers BOSTON UNIVERSITY Scribe: Flavio Esposito In this lecture, the last part of the PageRank paper has
More informationGraphs. Data Structures and Algorithms CSE 373 SU 18 BEN JONES 1
Graphs Data Structures and Algorithms CSE 373 SU 18 BEN JONES 1 Warmup Discuss with your neighbors: Come up with as many kinds of relational data as you can (data that can be represented with a graph).
More informationImage Access and Data Mining: An Approach
Image Access and Data Mining: An Approach Chabane Djeraba IRIN, Ecole Polythechnique de l Université de Nantes, 2 rue de la Houssinière, BP 92208-44322 Nantes Cedex 3, France djeraba@irin.univ-nantes.fr
More informationCMSC 380. Graph Terminology and Representation
CMSC 380 Graph Terminology and Representation GRAPH BASICS 2 Basic Graph Definitions n A graph G = (V,E) consists of a finite set of vertices, V, and a finite set of edges, E. n Each edge is a pair (v,w)
More informationLecture Notes. char myarray [ ] = {0, 0, 0, 0, 0 } ; The memory diagram associated with the array can be drawn like this
Lecture Notes Array Review An array in C++ is a contiguous block of memory. Since a char is 1 byte, then an array of 5 chars is 5 bytes. For example, if you execute the following C++ code you will allocate
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 2 Sajjad Haider Spring 2010 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric
More informationLecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1
CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture
More informationGenerating edge covers of path graphs
Generating edge covers of path graphs J. Raymundo Marcial-Romero, J. A. Hernández, Vianney Muñoz-Jiménez and Héctor A. Montes-Venegas Facultad de Ingeniería, Universidad Autónoma del Estado de México,
More informationTowards Semantic Data Mining
Towards Semantic Data Mining Haishan Liu Department of Computer and Information Science, University of Oregon, Eugene, OR, 97401, USA ahoyleo@cs.uoregon.edu Abstract. Incorporating domain knowledge is
More informationIJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, ISSN:
IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 20131 Improve Search Engine Relevance with Filter session Addlin Shinney R 1, Saravana Kumar T
More informationGraph Algorithms Using Depth First Search
Graph Algorithms Using Depth First Search Analysis of Algorithms Week 8, Lecture 1 Prepared by John Reif, Ph.D. Distinguished Professor of Computer Science Duke University Graph Algorithms Using Depth
More informationScenario-based Refactoring Selection
BABEŞ-BOLYAI University of Cluj-Napoca Faculty of Mathematics and Computer Science Proceedings of the National Symposium ZAC2014 (Zilele Academice Clujene, 2014), p. 32-41 Scenario-based Refactoring Selection
More informationAlessandro Del Ponte, Weijia Ran PAD 637 Week 3 Summary January 31, Wasserman and Faust, Chapter 3: Notation for Social Network Data
Wasserman and Faust, Chapter 3: Notation for Social Network Data Three different network notational schemes Graph theoretic: the most useful for centrality and prestige methods, cohesive subgroup ideas,
More informationTheoretical Computer Science. Testing Eulerianity and connectivity in directed sparse graphs
Theoretical Computer Science 412 (2011) 6390 640 Contents lists available at SciVerse ScienceDirect Theoretical Computer Science journal homepage: www.elsevier.com/locate/tcs Testing Eulerianity and connectivity
More informationLearning Graph Grammars
Learning Graph Grammars 1 Aly El Gamal ECE Department and Coordinated Science Laboratory University of Illinois at Urbana-Champaign Abstract Discovery of patterns in a given graph - in the form of repeated
More informationIntroduction to Trajectory Clustering. By YONGLI ZHANG
Introduction to Trajectory Clustering By YONGLI ZHANG Outline 1. Problem Definition 2. Clustering Methods for Trajectory data 3. Model-based Trajectory Clustering 4. Applications 5. Conclusions 1 Problem
More informationText Mining: A Burgeoning technology for knowledge extraction
Text Mining: A Burgeoning technology for knowledge extraction 1 Anshika Singh, 2 Dr. Udayan Ghosh 1 HCL Technologies Ltd., Noida, 2 University School of Information &Communication Technology, Dwarka, Delhi.
More informationData Mining. Introduction. Piotr Paszek. (Piotr Paszek) Data Mining DM KDD 1 / 44
Data Mining Piotr Paszek piotr.paszek@us.edu.pl Introduction (Piotr Paszek) Data Mining DM KDD 1 / 44 Plan of the lecture 1 Data Mining (DM) 2 Knowledge Discovery in Databases (KDD) 3 CRISP-DM 4 DM software
More informationThe Rainbow Connection of a Graph Is (at Most) Reciprocal to Its Minimum Degree
The Rainbow Connection of a Graph Is (at Most) Reciprocal to Its Minimum Degree Michael Krivelevich 1 and Raphael Yuster 2 1 SCHOOL OF MATHEMATICS, TEL AVIV UNIVERSITY TEL AVIV, ISRAEL E-mail: krivelev@post.tau.ac.il
More informationAn Apriori-like algorithm for Extracting Fuzzy Association Rules between Keyphrases in Text Documents
An Apriori-lie algorithm for Extracting Fuzzy Association Rules between Keyphrases in Text Documents Guy Danon Department of Information Systems Engineering Ben-Gurion University of the Negev Beer-Sheva
More information[8] that this cannot happen on the projective plane (cf. also [2]) and the results of Robertson, Seymour, and Thomas [5] on linkless embeddings of gra
Apex graphs with embeddings of face-width three Bojan Mohar Department of Mathematics University of Ljubljana Jadranska 19, 61111 Ljubljana Slovenia bojan.mohar@uni-lj.si Abstract Aa apex graph is a graph
More informationLecture 6: Graph Properties
Lecture 6: Graph Properties Rajat Mittal IIT Kanpur In this section, we will look at some of the combinatorial properties of graphs. Initially we will discuss independent sets. The bulk of the content
More informationEVALUATING GENERALIZED ASSOCIATION RULES THROUGH OBJECTIVE MEASURES
EVALUATING GENERALIZED ASSOCIATION RULES THROUGH OBJECTIVE MEASURES Veronica Oliveira de Carvalho Professor of Centro Universitário de Araraquara Araraquara, São Paulo, Brazil Student of São Paulo University
More informationBipartite Graph Partitioning and Content-based Image Clustering
Bipartite Graph Partitioning and Content-based Image Clustering Guoping Qiu School of Computer Science The University of Nottingham qiu @ cs.nott.ac.uk Abstract This paper presents a method to model the
More informationPACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS
PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS PAUL BALISTER Abstract It has been shown [Balister, 2001] that if n is odd and m 1,, m t are integers with m i 3 and t i=1 m i = E(K n) then K n can be decomposed
More informationA THREE AND FIVE COLOR THEOREM
PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 52, October 1975 A THREE AND FIVE COLOR THEOREM FRANK R. BERNHART1 ABSTRACT. Let / be a face of a plane graph G. The Three and Five Color Theorem
More informationMINING OPERATIONAL DATA FOR IMPROVING GSM NETWORK PERFORMANCE
MINING OPERATIONAL DATA FOR IMPROVING GSM NETWORK PERFORMANCE Antonio Leong, Simon Fong Department of Electrical and Electronic Engineering University of Macau, Macau Edison Lai Radio Planning Networks
More informationGraph Theory S 1 I 2 I 1 S 2 I 1 I 2
Graph Theory S I I S S I I S Graphs Definition A graph G is a pair consisting of a vertex set V (G), and an edge set E(G) ( ) V (G). x and y are the endpoints of edge e = {x, y}. They are called adjacent
More informationBasic relations between pixels (Chapter 2)
Basic relations between pixels (Chapter 2) Lecture 3 Basic Relationships Between Pixels Definitions: f(x,y): digital image Pixels: q, p (p,q f) A subset of pixels of f(x,y): S A typology of relations:
More informationNumber Theory and Graph Theory
1 Number Theory and Graph Theory Chapter 6 Basic concepts and definitions of graph theory By A. Satyanarayana Reddy Department of Mathematics Shiv Nadar University Uttar Pradesh, India E-mail: satya8118@gmail.com
More information2. Background. 2.1 Clustering
2. Background 2.1 Clustering Clustering involves the unsupervised classification of data items into different groups or clusters. Unsupervised classificaiton is basically a learning task in which learning
More informationChapter 9 Graph Algorithms
Introduction graph theory useful in practice represent many real-life problems can be if not careful with data structures Chapter 9 Graph s 2 Definitions Definitions an undirected graph is a finite set
More informationDiscovering Paths Traversed by Visitors in Web Server Access Logs
Discovering Paths Traversed by Visitors in Web Server Access Logs Alper Tugay Mızrak Department of Computer Engineering Bilkent University 06533 Ankara, TURKEY E-mail: mizrak@cs.bilkent.edu.tr Abstract
More information5 Matchings in Bipartite Graphs and Their Applications
5 Matchings in Bipartite Graphs and Their Applications 5.1 Matchings Definition 5.1 A matching M in a graph G is a set of edges of G, none of which is a loop, such that no two edges in M have a common
More information