PROJECT PROPOSALS: COMMUNITY DETECTION AND ENTITY RESOLUTION. Donatella Firmani
1 PROJECT PROPOSALS: COMMUNITY DETECTION AND ENTITY RESOLUTION Donatella Firmani
2 PROJECT 1: COMMUNITY DETECTION
3 What is Community Detection? What is Social Network Analysis? Network analysis is the keyword for the 21st century: researchers, politicians, and people in general talk about social networks. Community detection: discovering groups in a network where individuals' group memberships are not explicitly given.
4 Subjectivity of Community Definition. Two possible views: a densely connected community; each connected component is a community. The definition of a community can be subjective.
5 Node-Centric Community Detection. Node-centric community: each node in a group satisfies certain properties. Sample properties: complete mutuality (cliques); reachability of members (k-clique, k-clan, k-club); nodal degrees (k-plex, k-core); relative frequency of within-outside ties.
6 Complete Mutuality: Cliques. Clique: a maximal complete subgraph, i.e., one in which all nodes are adjacent to each other. Nodes 5, 6, 7 and 8 form a clique. Finding the maximum clique in a network is NP-hard. A straightforward implementation to find cliques is very expensive in time complexity.
7 Enumerating all Maximal Cliques [CDMPT16] (figure: example graph and its maximal cliques)
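A minimal sketch of maximal-clique enumeration, using the classic Bron-Kerbosch recursion with pivoting. This is not the algorithm of [CDMPT16]; the example graph and the names `build_graph` and `maximal_cliques` are assumptions made for illustration only.

```python
def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique of the graph as a frozenset (Bron-Kerbosch with pivoting)."""

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend r, x: already explored vertices
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Hypothetical graph: nodes 5, 6, 7, 8 are pairwise adjacent, plus a few extra edges.
example = build_graph([(5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8),
                       (4, 5), (4, 6), (7, 9)])
cliques = set(maximal_cliques(example))
```

On this small graph the enumeration returns three maximal cliques: {5, 6, 7, 8}, {4, 5, 6} and {7, 9}. The number of maximal cliques can grow exponentially with the graph size, which is why naive enumeration is expensive.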
8 Clique is Very Strict. Clique is a very strict definition: vertices of a clique are at distance 1 from each other; the diameter of the induced subgraph is 1; the min-degree of the induced subgraph is s-1 (for clique size s). Normally, relaxations of cliques are used as the definition of communities. Clique relaxations include: k-clique: vertices at distance* no greater than k from each other; k-club / k-clan: subgraphs of diameter no greater than k; k-plex: subgraphs of size s with min-degree no less than s-k. 1-clique = 1-club = 1-clan = 1-plex = clique. (*) Distance is computed on the input graph, so shortest paths can use edges outside the subgraph.
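The k-plex relaxation can be checked directly from its degree condition. The sketch below assumes the usual definition (every vertex of a set of size s has at least s-k neighbours inside the set); the function name `is_k_plex` and the 4-cycle example are illustrative assumptions.

```python
def is_k_plex(adj, nodes, k):
    """Return True if `nodes` induces a k-plex: each vertex has >= len(nodes) - k neighbours in `nodes`."""
    s = len(nodes)
    return all(len(adj[v] & nodes) >= s - k for v in nodes)


# Hypothetical example: a 4-cycle 1-2-3-4-1 (each vertex misses exactly one other vertex).
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
whole = {1, 2, 3, 4}

not_a_clique = is_k_plex(cycle, whole, 1)   # 1-plex = clique: needs degree >= 3
is_2_plex = is_k_plex(cycle, whole, 2)      # 2-plex: needs degree >= 2
```

The 4-cycle is not a clique (1-plex) but it is a 2-plex, since every vertex is adjacent to all but one of the other members.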
9 Enumerating large k-plexes [CFMPT17]
10 Enumerating large k-plexes [CFMPT17]
11 Graph Databases. Store data as nodes and relationships: a database full of linked nodes.
12 Sample Graph DBs: AllegroGraph, Bitsy, Cayley, GraphBase, Graphd, HyperGraphDB, IBM System G, imgraph, InfiniteGraph, InfoGrid, Neo4j, Sparksee/DEX, Trinity, TurboGraph.
13 Sample GraphDB queries. Pattern matching query: nodes with first name James. Adjacency query: nodes that James knows directly, i.e., nodes adjacent to James in the knows relationship. Reachability query: nodes that James knows directly or indirectly, i.e., nodes reachable from James in the knows relationship. Graph analytical query.
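To make the three query types concrete, here is a small in-memory sketch in Python rather than in any graph database's query language. The node properties, the directed `knows` edges and all function names are hypothetical.

```python
from collections import deque

# Hypothetical data: node id -> properties, and a directed "knows" relationship.
people = {
    1: {"first_name": "James"},
    2: {"first_name": "Anna"},
    3: {"first_name": "Luca"},
    4: {"first_name": "Maria"},
}
knows = {1: {2}, 2: {3}, 3: set(), 4: {1}}


def match_first_name(nodes, name):
    """Pattern matching query: ids of nodes whose first_name property equals `name`."""
    return {n for n, props in nodes.items() if props.get("first_name") == name}


def adjacent(edges, source):
    """Adjacency query: nodes one hop away from `source`."""
    return set(edges.get(source, ()))


def reachable(edges, source):
    """Reachability query: nodes reachable from `source` in one or more hops (BFS)."""
    seen, queue = set(), deque([source])
    while queue:
        u = queue.popleft()
        for v in edges.get(u, ()):
            if v not in seen and v != source:
                seen.add(v)
                queue.append(v)
    return seen


james = match_first_name(people, "James")
direct = adjacent(knows, 1)
indirect = reachable(knows, 1)
```

Here James (node 1) knows node 2 directly and reaches nodes 2 and 3 transitively; node 4 knows James but is not reachable from him because the relationship is directed.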
14 Single-sql-query for #connected components (for FUN)
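The SQL text of this slide is not part of the transcript. As a stand-in, the following Python sketch counts connected components with a union-find structure; it illustrates the quantity being computed and is not a reconstruction of the slide's query. The function name and the sample graph are assumptions.

```python
def count_connected_components(nodes, edges):
    """Count connected components of an undirected graph using union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent pointers to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = len(parent)
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv      # merge the two components
            components -= 1
    return components


# Hypothetical graph with components {1, 2, 3}, {4, 5} and {6}.
n_components = count_connected_components(range(1, 7), [(1, 2), (2, 3), (4, 5)])
```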
15 Neo4j query for #connected components mponents/follows. Via the Mazerunner REST API: integrates Apache Spark, GraphX and Neo4j for large-scale graph analysis. GraphX: Apache Spark's API for graphs and graph-parallel computation.
16 Performance
17 Summary and open problems. Network analysis is the keyword for the 21st century: researchers, politicians, and people in general talk about social networks. Problems: communities; analysis of structure and social space. Technologies: graph databases; Big Data technologies.
18 PROJECT 2: ENTITY RESOLUTION
19 What is Entity Resolution (ER)? Input data: modeled as a graph. Graph node = data record. Graph edge label = probability that the record pair represents the same entity. Output: a set of clusters, each of which corresponds to an entity. Two nodes are in the same cluster iff their records represent the same entity. Traditional problems [EIV07, GM12]. Pairwise match: what is the probability that two records match? Clustering: how to partition records into an unknown number of entities? Blocking: how to perform ER in sub-quadratic time?
20 What is ER Using an Oracle? Input data: modeled as a graph. Output: a set of clusters = entities. Formal problem [WL+13, VBD14, FSS16]: given an oracle that can correctly answer whether a record pair is a match, what is an optimal strategy for asking oracle queries so as to minimize the number of queries needed to resolve the entire graph? Motivation: reduce the crowdsourcing cost of ER for a data set.
21 What is Online ER Using an Oracle? Formal problem [FSS16]: given an oracle that can correctly answer whether a record pair is a match, what is an optimal strategy for asking oracle queries so as to maximize progressive recall with respect to the sequence of oracle queries? Progressive recall = area under the recall vs. query sequence curve. Motivation: limited resolution time, early user termination.
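Progressive recall can be illustrated by tracking recall after every oracle query and summing the curve. The sketch below is an assumption-laden illustration, not the exact metric of [FSS16]: recall is taken as the fraction of truly matching record pairs already known to match (directly or by transitivity), the area is the unnormalized sum of the curve, and the ground-truth map `entity_of` stands in for the oracle.

```python
from itertools import combinations


def progressive_recall(queries, entity_of):
    """Return (recall after each query, area under that curve) for an ordered query sequence."""
    records = list(entity_of)
    true_pairs = sum(1 for a, b in combinations(records, 2)
                     if entity_of[a] == entity_of[b])
    cluster = {r: {r} for r in records}   # record -> set of records known to match it
    found = 0                             # matching pairs recovered so far
    curve = []
    for u, v in queries:
        if entity_of[u] == entity_of[v] and cluster[u] is not cluster[v]:
            # A positive answer links every pair across the two known clusters.
            found += len(cluster[u]) * len(cluster[v])
            merged = cluster[u] | cluster[v]
            for r in merged:
                cluster[r] = merged
        curve.append(found / true_pairs if true_pairs else 0.0)
    return curve, sum(curve)


# Hypothetical data: records a, b, c are one entity; d is another.
truth = {"a": 1, "b": 1, "c": 1, "d": 2}
curve_early, area_early = progressive_recall([("a", "b"), ("b", "c"), ("a", "d")], truth)
curve_late, area_late = progressive_recall([("a", "d"), ("a", "b"), ("b", "c")], truth)
```

Both sequences ask the same three queries and end at recall 1.0, but asking the likely matches first yields the larger area, which is what an online strategy tries to maximize.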
22 Example: DB of Handwritten Characters. Data from the Vatican Secret Archives. Registri Vaticani: papal letters throughout the 13th century. Linkage problem: entities = characters.
23 Example: DB of Handwritten Characters (figure)
24 Strategy 1: Edge Ordering [WL+13]. The optimal strategy needs to ask N - K + (K choose 2) oracle queries (N records, K entities). Takes advantage of (matching and non-matching) transitivity. EO: ask oracle queries in edge probability order. Can grow multiple clusters and sub-clusters in parallel. Worst-case approximation ratio of O(N) [VBD14].
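A minimal sketch of edge ordering with transitivity, under stated assumptions: `prob` maps each candidate record pair to its match probability, `oracle(u, v)` always answers correctly, and the data and names are hypothetical. It is an illustration of the idea, not the implementation from [WL+13].

```python
def edge_ordering(records, prob, oracle):
    """Resolve records by querying pairs in decreasing probability, skipping inferable pairs."""
    cluster_of = {r: i for i, r in enumerate(records)}   # record -> cluster id
    members = {i: {r} for i, r in enumerate(records)}    # cluster id -> records
    distinct = set()   # frozenset({c1, c2}): clusters known to be different entities
    queries = 0
    for u, v in sorted(prob, key=prob.get, reverse=True):
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv or frozenset((cu, cv)) in distinct:
            continue   # answer follows from matching / non-matching transitivity
        queries += 1
        if oracle(u, v):
            # Merge cluster cv into cu and relabel the non-match facts about cv.
            for r in members[cv]:
                cluster_of[r] = cu
            members[cu] |= members.pop(cv)
            distinct = {frozenset(cu if c == cv else c for c in pair)
                        for pair in distinct}
        else:
            distinct.add(frozenset((cu, cv)))
    return list(members.values()), queries


# Hypothetical data: a, b, c are one entity; d is another (N = 4, K = 2).
truth = {"a": 1, "b": 1, "c": 1, "d": 2}
prob = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.7,
        ("a", "d"): 0.3, ("b", "d"): 0.2, ("c", "d"): 0.1}
clusters, n_queries = edge_ordering(list(truth), prob,
                                    lambda u, v: truth[u] == truth[v])
```

With these well-ordered probabilities only three of the six pairs are asked, which equals N - K + (K choose 2) = 4 - 2 + 1; the other three answers are inferred.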
25 Strategy 2: Node Ordering [VBD14]. The optimal strategy needs to ask N - K + (K choose 2) oracle queries (N records, K entities). Takes advantage of (matching and non-matching) transitivity. NO: process nodes in order of their expected cluster sizes. Ask oracle queries in edge probability order to processed nodes. Can grow similar-sized clusters (but not sub-clusters) in parallel. Worst-case approximation ratio of O(K) [VBD14].
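A rough sketch of node ordering, with several assumptions that the slide does not spell out: a node's expected cluster size is approximated by the sum of its incident edge probabilities, each existing cluster is probed once through its most probable member, and the oracle is always correct. Names and data are hypothetical; this is not the implementation from [VBD14].

```python
def node_ordering(records, prob, oracle):
    """Process nodes by decreasing expected cluster size, probing existing clusters one by one."""

    def p(u, v):
        return prob.get((u, v), prob.get((v, u), 0.0))

    expected_size = {u: sum(p(u, v) for v in records if v != u) for u in records}
    clusters, queries = [], 0
    for u in sorted(records, key=expected_size.get, reverse=True):
        # Best representative of each existing cluster, most probable cluster first.
        reps = sorted(((max(c, key=lambda v: p(u, v)), c) for c in clusters),
                      key=lambda rc: p(u, rc[0]), reverse=True)
        for rep, c in reps:
            queries += 1
            if oracle(u, rep):
                c.append(u)       # one positive answer places u; the rest is inferred
                break
        else:
            clusters.append([u])  # no existing cluster matched: u starts a new one
    return clusters, queries


# Hypothetical data: a, b, c are one entity; d is another.
truth = {"a": 1, "b": 1, "c": 1, "d": 2}
prob = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.7,
        ("a", "d"): 0.3, ("b", "d"): 0.2, ("c", "d"): 0.1}
clusters, n_queries = node_ordering(list(truth), prob,
                                    lambda u, v: truth[u] == truth[v])
```

Each node costs at most one query per existing cluster, which is the intuition behind a bound that depends on the number of clusters K rather than on N.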
26 Oracle Strategy for Progressive Recall. Edge ordering: use a benefit metric instead of edge probability. Iteratively query the oracle with the pair (u, v) having the highest value of b_e(u, v). Initially, the edge with the highest value of p(u, v) is queried. Subsequently, a lower-probability, higher-benefit edge can be queried.
27 Strategy 3: Hybrid Ordering [FSS16]. Hybrid ordering: use node ordering, then edge ordering. Iteratively: select the node u with the highest value of b_n(u), then query the oracle with (u, v), v ∈ C, in decreasing order of b_n(u, C). Heuristic: use a threshold on the benefit b_n(u, C). Finally, process non-inferable edges (u, v) in order of b_e(u, v).
28 Errors in Oracle Answers. Input data: modeled as a graph. Output: a set of noisy clusters. Formal problem: given an oracle that answers whether a record pair is a match with some error probability, what is an optimal strategy for asking oracle queries so as to minimize the number of queries needed to resolve the entire graph while maximizing precision?
29 Example: DB of Handwritten Characters (figure)
30 Errors and Graph Cuts. Vertex cut: a partition of the nodes (vertices) of a graph into two disjoint subsets. Cut-set: the set of edges that have one endpoint in each subset of the partition. Which would you trust more?
31 Errors and Graph Cuts. Vertex cut: a partition of the nodes (vertices) of a graph into two disjoint subsets. Cut-set: the set of edges that have one endpoint in each subset of the partition. Formal problem: build graphs with large cuts using as few edges as possible, so-called expander graphs. Technical contribution: prove that the output graph consists of expanders.
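The cut-set definition translates into a one-line check. The sketch below compares two hypothetical clusters of positive oracle answers, a path and a complete graph on the same four records, under the same partition; the function name and data are illustrative assumptions.

```python
def cut_set(edges, side_a):
    """Edges with exactly one endpoint in `side_a`, i.e. the cut-set of (side_a, rest)."""
    return {(u, v) for u, v in edges if (u in side_a) != (v in side_a)}


# Hypothetical positive-answer graphs over records 1..4.
path = [(1, 2), (2, 3), (3, 4)]
complete = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

path_cut = cut_set(path, {1, 2})
complete_cut = cut_set(complete, {1, 2})
```

In the path, a single wrong answer on edge (2, 3) would be enough to wrongly join {1, 2} and {3, 4}; in the complete graph four independent answers support the same link. Large cuts therefore make a cluster more robust to oracle errors, and expanders achieve large cuts with comparatively few edges (queries).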
32 Strategy 4: Hybrid Ordering with Expanders. Hybrid ordering with expanders: use node ordering, assigning a node to a cluster only if more than K answers are positive, then edge ordering.
33 Summary and open problems. Formal study of maximizing progressive recall in online ER. Problem is NP-complete. Formal study of maximizing progressive recall and precision in the presence of errors in oracle answers. Open problems: design robust, online strategies for errors in oracle answers; design a more powerful interface for queries than pairwise; scalability (e.g., blocking).