Graph Mining: Introduction


1 Graph Mining: Introduction. Davide Mottin, Konstantina Lazaridou. Hasso Plattner Institute. Graph Mining course, Winter Semester 2016.

2 Lecture road: Course information; Introduction to graph mining; Graphs: models and basic concepts.

3 Organization of the lecture: Lecture and slides in English; lectures on Tuesday. Two mandatory assignments: [Individual] one presentation of a paper chosen from a list of papers on topics covered in the lectures, in one of two slots: 13/12 (first part) or 07/02 (second part); [Individual] one small graph analytics project, to be completed before the end of the course. Examination (in English!): oral exam in the first three weeks after the lecture period. Grading scheme: 20% presentation, 10% project, 70% exam. Lectures will be recorded and made available online (tele-TASK). There is no official textbook for the course. Registration is required for this lecture: notify the Studienreferat and myself at davide.mottin@hpi.de.

4 About the lecturers: Davide Mottin: postdoctoral researcher, Knowledge Discovery and Data Mining; PhD in 2015, University of Trento; research interests: graph mining, data mining, graph databases, preference models, query paradigms. Konstantina Lazaridou: PhD student, Information Systems Research - Web Science; MSc in 2015, University of Ioannina; research interests: graph mining, social network analysis, Web data mining, opinion and sentiment analysis, data stream mining.

5 Course Web site: Lecture material (slides, papers, books, tutorials, assignments, ...) is available online (hing/aktuelle-vorlesung/ws-1617/graph-mining.html). The slides are also available on the intranet!

6 Objectives: Understand where graphs occur, why they are important, and what the new applications are, as well as the main challenges from a data mining perspective. Learn how to efficiently query and store a graph using graph mining techniques, and how to analyze networks to understand the properties and behaviors of individuals. Think from a research perspective (novelty, clarity, ...). Solve practical problems: work on real-scale data with existing tools.

7 Prerequisites: Basic computer science and programming. Data mining knowledge is a plus but is not strictly required. Basic probability theory and linear algebra are beneficial, although a short recap of the main concepts will be given at the beginning of the lectures that require them.

8 Schedule (tentative): Introduction to graph mining; Social network analysis: diffusion; Graph querying: exact, approximate, and reachability; Frequent subgraph mining; Graph indexing; HPI-Kolloquium, invited speaker: Prof. Danai Koutra; Node classification; Some practical graph mining frameworks; Project assignment; Link prediction; Student paper presentations [first part]; Christmas break (two sessions); Non-overlapping communities; Overlapping communities; Anomaly detection; Graph summarization; Report handover; Summary of algorithms for different graph models; Student paper presentations [second part].

9 Course Material - 1: There is no official book for the course. However, the slides are based on material from these books: Aggarwal, C.C. and Wang, H., eds., 2010. Managing and Mining Graph Data (Vol. 40). New York: Springer. Chakrabarti, D. and Faloutsos, C. Graph Mining: Laws, Tools, and Case Studies. Synthesis Lectures on Data Mining and Knowledge Discovery, 7(1). Easley, D. and Kleinberg, J. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press.

10 Course Material - 2: Some material is inspired by, imported from, or adapted from several existing courses: Graph Mining and Exploration at Scale (Prof. Danai Koutra); Social and Information Network Analysis (Prof. Jure Leskovec); Online Social Networks and Media (Prof. Evaggelia Pitoura, Prof. Panayotis Tsaparas); Data Mining meets Graph Mining (Prof. Leman Akoglu).

11 How to send emails: Bad example. Subject: Problem Help. Text: Dear Dr. Davide Mottin, I'm a student at the third year, attending the course, number of shoes, quantity of food eaten yesterday. The slides are not clear. I don't understand the things there. Yours sincerely, BigBug92. Good example. To: davide.mottin@hpi.de. Subject: [GraphMining] Subgraph isomorphism. Text: Hi Davide, the subgraph isomorphism concept is not entirely clear to me. Why is the function bijective? Thanks, [First Name-Last Name].

12 Some rules of thumb: I'm available for any kind of concern. Use the mailing list; email me directly only for very important concerns. Be quick and precise in your emails. Ask me questions during the course, or right before/after the lecture. If a question requires more time, ask for a meeting with me; it is better to come as a group rather than alone, so that I can answer many questions at the same time. If you think the course load or organization is unfair, please let me know before the end of the semester. After that there will be NO possibility for discussion.

13 Feedback: The course is being taught for the first time, so any feedback is appreciated, including comments on the slides and their clarity. There might be some mistakes here and there (but we will do our best). Ask questions if you don't understand something. Better a question in class than a doubt during the exam!

14 (There's) no such thing as a stupid question

15 Content of the course (in two parts, first and second): Background concepts: probability theory/statistics, basic linear algebra, basic graph concepts (morphisms, degrees, matrix representation, ...). Social network analysis: diffusion, power laws, influence propagation. Graph querying and indexing: exact and approximate queries, reachability queries, frequent subgraph mining, graph indexing. Node classification and node similarity. Link prediction. Communities and anomalies: overlapping/non-overlapping communities, anomaly detection. Graph summarization. Summary of algorithms for different models (graph streams, evolving graphs, probabilistic graphs, colored graphs). Graph mining frameworks.

16 About the presentations: Each presentation takes 15 minutes in total: 10 minutes of presentation and 5 minutes of questions. The group will be divided into two halves: one half will present on December 13 papers regarding the first part of the course; the other half will present on February 7 papers regarding the second part of the course. Every person presents one paper. First come, first served: if two people ask to present the same paper, the second has to choose a different one. Paper list for the first part of the course:

17 Questions?

18 Lecture road: Course information; Introduction to graph mining; Graphs: models and basic concepts.

19 The web (August 2016): ≥ 50 billion pages; at least 4.73 billion pages indexed by search engines.

20 Social graphs: Facebook: 1.5 billion users, 450 billion relationships, 600 million groups, 10.5 USD per user. Twitter: 313 million users, 500 million tweets/day, 208 followers per user on average. They are complex: groups, links, preferences, attributes.

21 Knowledge graphs: 20 million entities, 100 million relationships, 2500 types of relationships. Other knowledge graphs: YAGO, DBpedia, DBLP, PubMed, LinkedMDB. They connect entities such as persons, organizations, countries, and objects through semantic relationships (e.g., "owns a company").

22 Biological networks: Protein-protein interaction networks: nodes are proteins, edges are physical interactions. Metabolic networks: nodes are metabolites and enzymes, edges are chemical reactions.

23 What else? Anything that involves relationships (implicit or explicit) can be modeled as a graph!

24 Graphs are everywhere: social networks, road networks, recommendation graphs, knowledge graphs. They are complex, ubiquitous, large, and valuable.

25 Why graphs? Why now? Describe complex data with a simple structure: nature, society, concepts, roads, circuits. Same representation for many disciplines: computer science, biology, physics, economics, ... Availability of (BIG) data: large networks are now available and require complex algorithms; networks evolve over time (e.g., new users/friends on Facebook). Usefulness: analysis discovers non-trivial patterns and allows simple, smooth exploration; graphs reveal user behaviors; they are valuable (Facebook, Twitter, Amazon, ... all of them are based on graphs!).

26 Graph mining: Graph mining is the process of discovering, retrieving, and analyzing non-trivial patterns in graph-shaped data.

27 What can we do with graph mining? Compress graphs without losing information; find complex structures fast; recognize communities and social patterns; study the propagation of viruses; predict whether two people will become friends; understand which nodes are important; show how the network will evolve; help visualize complex structures; find roles and predict positive and negative influence.

28 What is involved in graph mining? Basic graph algorithms (shortest paths, BFS, DFS, isomorphisms, traversals, random walks, ...); storage and indexing; smart representations for compactness; modeling of problems as graphs; distance metrics and similarity measures; exact, approximate, and heuristic algorithms; evolving structures; interactivity and online updates; complexity (many of the problems are NP-hard, so no polynomial-time algorithm is known for them).

29 Practical applications of graph mining

30 Finding substructures

31 Community detection

32 Influence propagation

33 Link prediction

34 Graph evolution

35 Detecting frauds

36 Visualization: Several visualization tools. General: Gephi, GraphViz, ... Biological: Cytoscape, Network Workbench. Social: EgoNet, NodeXL, ... Relational: Tulip.

37 Lost in the graph? Hopefully not after this course ;)

38 Current: Query languages. SPARQL: SELECT ?name ?email WHERE { ?person a foaf:Person . ?person foaf:name ?name . ?person foaf:mbox ?email . } GREMLIN: g.V().hasLabel('movie').as('a','b').where(inE('rated').count().is(gt(10))).select('a','b').by('name').by(inE('rated').values('stars').mean()).order().by(select('b'),decr).limit(10) CYPHER: MATCH (node1:Label1)-->(node2:Label2) WHERE node1.propertyA = {value} RETURN node2.propertyA, node2.propertyB. Query languages ARE expressive, powerful, scalable, and compact, but not user friendly and not interactive.

39 Lecture road: Course information; Introduction to graph mining; Graphs: models and basic concepts.

40 Network or graph? Network refers to real systems (Web, social, biological, ...); terminology: network, node, link/relationship. Graph is an abstract mathematical model of a network (Web graph, social graph); terminology: graph, vertex/node, edge. BUT we often use both without distinction.

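41 Graphs: A graph is $G = (V, E)$, extended to $G = (V, E, l)$ for labeled graphs and to $G = (V, E, p)$ for probabilistic graphs, where $V$ is the set of vertices, $E \subseteq V \times V$ is the set of edges, $l: V \cup E \to \Sigma$ is the labeling function, and $p$ is the probability function. Undirected graphs: co-authorship, roads, biological networks. Directed graphs: follows, ... Labeled (or colored) graphs: knowledge graphs, ... Probabilistic graphs: causal graphs. (Figure: small example graphs with node labels a, b, c and edge probabilities 0.2.)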
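The graph model above can be sketched as a small data structure. This is only an illustration, not the course's implementation: the class name `Graph`, its field names, and the reading of $p$ as a map from edges to probabilities in [0, 1] are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical container for G = (V, E, l, p)."""
    vertices: set = field(default_factory=set)   # V
    edges: set = field(default_factory=set)      # E, a set of (u, v) pairs
    labels: dict = field(default_factory=dict)   # l: vertex or edge -> label
    prob: dict = field(default_factory=dict)     # p: edge -> probability (assumed form)

    def add_edge(self, u, v, label=None, p=1.0, directed=True):
        """Add edge (u, v); an undirected edge is stored in both directions."""
        self.vertices.update((u, v))
        pairs = [(u, v)] if directed else [(u, v), (v, u)]
        for e in pairs:
            self.edges.add(e)
            self.prob[e] = p
            if label is not None:
                self.labels[e] = label


g = Graph()
g.add_edge("a", "b", label="follows", p=0.2)   # directed, labeled, probabilistic
g.add_edge("b", "c", directed=False)           # undirected
```

Storing an undirected edge as two directed pairs keeps a single edge set for all four graph flavors, at the cost of doubling the storage for undirected graphs.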

42 Graph databases (set of graphs): $D = \{G_1, G_2, \ldots, G_n\}$, with $G_i = (V_i, E_i, l_i)$ and $l_i: E_i \cup V_i \to \Sigma$. A set of small labeled graphs: chemical compounds, business models, 3D objects. (Figure: three small labeled graphs $G_1$, $G_2$, $G_3$.)

43 An example? Give me an example of a network you know. What are the nodes? What are the edges? What shape does it have?

44 Important terminology: Degree of a node: number of neighbors of the node. In directed graphs, in-degree is the number of inbound links and out-degree is the number of outgoing links. Adjacent node: a node $u$ is adjacent to a node $v$ if there is an edge between $u$ and $v$, i.e., $(u, v) \in E$. Path: sequence of adjacent, non-repeating nodes in a graph; the length of a path is its number of edges. Diameter of a graph: length of the longest shortest path between any two nodes. Example (figure): node $v$ has degree 3, in-degree 1, and out-degree 2.
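As a minimal sketch of these definitions, the following computes in-/out-degree from an edge list and the diameter with breadth-first search. The function names and the two tiny example graphs are made up for illustration; the diameter helper assumes every node appears as a key of the adjacency dictionary and only considers reachable pairs.

```python
from collections import deque


def degrees(edges, v):
    """Return (in_degree, out_degree) of v in a directed edge list."""
    in_deg = sum(1 for (_, t) in edges if t == v)
    out_deg = sum(1 for (s, _) in edges if s == v)
    return in_deg, out_deg


def bfs_distances(adj, source):
    """Shortest path lengths (number of edges) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Length of the longest shortest path over all reachable pairs."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


# Directed example: v has one inbound and two outgoing links (degree 3).
directed_edges = [("u", "v"), ("v", "x"), ("v", "y")]
in_deg, out_deg = degrees(directed_edges, "v")

# Undirected path a - b - c - d, stored as symmetric adjacency lists.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
```

Running one BFS per node costs O(n(n + m)) time, which is fine for small graphs but too slow for the large networks discussed earlier.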

45 Graph representation: Adjacency matrix: $A = [a_{ij}]$, with $a_{ij} = 1$ if $(i, j) \in E$ and $a_{ij} = 0$ otherwise. Adjacency list: 1 => {2}, 2 => {4}, 3 => {1,2,4,6}. What are the advantages/disadvantages of one representation over the other?
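The two representations can be converted into each other, as the sketch below shows for the adjacency list on the slide. The helper names and the choice of 6 nodes (the largest node id that appears in the list) are assumptions made for the example.

```python
def list_to_matrix(adj_list, n):
    """Build an n x n 0/1 adjacency matrix from a 1-indexed adjacency list."""
    A = [[0] * n for _ in range(n)]
    for i, neighbors in adj_list.items():
        for j in neighbors:
            A[i - 1][j - 1] = 1
    return A


def matrix_to_list(A):
    """Recover the adjacency list, keeping only nodes with at least one out-edge."""
    return {
        i + 1: {j + 1 for j, a in enumerate(row) if a}
        for i, row in enumerate(A)
        if any(row)
    }


adj_list = {1: {2}, 2: {4}, 3: {1, 2, 4, 6}}
A = list_to_matrix(adj_list, 6)
```

The matrix uses O(n^2) space regardless of the number of edges but answers "is (i, j) an edge?" in constant time; the list uses space proportional to the number of nodes and edges, which usually matters more for large sparse graphs.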

46 Static vs evolving graph: Static graph: adjacency matrix $A$. Dynamic (temporal) graph: 3D matrix (tensor) that stacks one adjacency matrix per time step $t_1, \ldots, t_n$.
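A minimal sketch of the tensor view, assuming the temporal graph is given as one edge list per time step over a fixed set of `n` nodes with 0-indexed ids (both the input format and the function name are assumptions for illustration):

```python
def temporal_tensor(snapshots, n):
    """Stack one n x n adjacency matrix per time step: T[t][i][j]."""
    T = []
    for edges in snapshots:
        A = [[0] * n for _ in range(n)]
        for i, j in edges:
            A[i][j] = 1
        T.append(A)
    return T


# Two time steps: edge (1, 2) appears only at the second step.
snapshots = [[(0, 1)], [(0, 1), (1, 2)]]
T = temporal_tensor(snapshots, 3)
```

Each slice `T[t]` is an ordinary adjacency matrix, so static algorithms can be applied per time step, while comparisons across `t` expose how the graph evolves.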

47 Graph isomorphism: Given two graphs $G_1 = (V_1, E_1, l_1)$ and $G_2 = (V_2, E_2, l_2)$, $G_1$ is isomorphic to $G_2$ iff there exists a bijective function $f: V_1 \to V_2$ s.t.: 1. for each $v_1 \in V_1$, $l_1(v_1) = l_2(f(v_1))$; 2. $(v_1, u_1) \in E_1$ iff $(f(v_1), f(u_1)) \in E_2$.
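The definition can be checked directly by trying every bijection, as in the brute-force sketch below. It is meant only to make the two conditions concrete: it treats edges as directed pairs, labels only the nodes, and takes factorial time, so it is not a practical algorithm. The function name and the example graphs are hypothetical.

```python
from itertools import permutations


def is_isomorphic(V1, E1, l1, V2, E2, l2):
    """Brute-force isomorphism test for node-labeled directed graphs."""
    V1, V2 = list(V1), list(V2)
    E1, E2 = set(E1), set(E2)
    if len(V1) != len(V2) or len(E1) != len(E2):
        return False
    for image in permutations(V2):
        f = dict(zip(V1, image))  # candidate bijection f: V1 -> V2
        labels_ok = all(l1[v] == l2[f[v]] for v in V1)          # condition 1
        edges_ok = all(
            ((u, v) in E1) == ((f[u], f[v]) in E2)              # condition 2
            for u in V1 for v in V1
        )
        if labels_ok and edges_ok:
            return True
    return False


V1, E1, l1 = [1, 2, 3], {(1, 2), (2, 3)}, {1: "a", 2: "b", 3: "a"}
V2, E2, l2 = ["x", "y", "z"], {("z", "y"), ("y", "x")}, {"x": "a", "y": "b", "z": "a"}
```

Here the mapping 1 -> z, 2 -> y, 3 -> x satisfies both conditions, so the two graphs are isomorphic; changing any single label breaks condition 1 for every candidate mapping.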

48 Subgraph isomorphism: A graph $Q = (V_Q, E_Q, l_Q)$ is subgraph isomorphic to a graph $G = (V, E, l)$ if there exists a subgraph $G' \subseteq G$ isomorphic to $Q$.
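A brute-force sketch of this test follows. It assumes the non-induced reading of "subgraph" (every edge of Q must map to an edge of G, while extra edges in G are allowed), directed edges, and node labels only; the function name and example graphs are hypothetical, and the running time is exponential in the size of Q.

```python
from itertools import permutations


def subgraph_isomorphic(VQ, EQ, lQ, V, E, l):
    """Return True if Q maps injectively into G preserving labels and edges."""
    VQ, E = list(VQ), set(E)
    for image in permutations(V, len(VQ)):
        f = dict(zip(VQ, image))  # injective candidate mapping V_Q -> V
        if all(lQ[q] == l[f[q]] for q in VQ) and all(
            (f[u], f[v]) in E for (u, v) in EQ
        ):
            return True
    return False


V = [1, 2, 3, 4]
E = {(1, 2), (2, 3), (3, 1), (3, 4)}
l = {1: "a", 2: "b", 3: "c", 4: "a"}

# Query: a node labeled c with an edge to a node labeled a.
VQ, EQ, lQ = ["p", "q"], {("p", "q")}, {"p": "c", "q": "a"}
```

In this example the query matches twice (via edges (3, 1) and (3, 4)), but the function stops at the first match because the definition only asks for existence.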

49 Frequent subgraph mining: Problem: find all subgraphs of $G$ that appear at least $\sigma$ times. Suppose $\sigma = 2$; the frequent subgraphs (only edge labels) are: a, b, c; a-a, a-c, b-c, c-c; a-c-a. Exponential number of patterns! (Figure: example labeled graph $G$.)

50 Questions?
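To make the problem statement concrete, the sketch below counts only the smallest patterns, single edges identified by the labels of their endpoints, and keeps those that occur at least $\sigma$ times. It is not a full frequent subgraph miner: larger patterns would have to be grown from these and re-counted, which is where the exponential blow-up comes from. The example graph is made up and is not the one in the slide's figure; counting support as the number of occurrences in a single graph is an assumption.

```python
from collections import Counter


def frequent_edge_patterns(edges, label, sigma):
    """Count undirected edges by the sorted pair of endpoint labels; keep counts >= sigma."""
    counts = Counter(tuple(sorted((label[u], label[v]))) for u, v in edges)
    return {pattern: c for pattern, c in counts.items() if c >= sigma}


# Hypothetical node-labeled graph.
label = {1: "a", 2: "a", 3: "c", 4: "c", 5: "b"}
edges = [(1, 2), (1, 3), (2, 4), (3, 4), (3, 5), (4, 5)]

frequent = frequent_edge_patterns(edges, label, sigma=2)
```

With $\sigma = 2$, the patterns a-c and b-c are kept, while a-a and c-c occur only once and are discarded.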


More information

Mining and Analyzing Online Social Networks

Mining and Analyzing Online Social Networks The 5th EuroSys Doctoral Workshop (EuroDW 2011) Salzburg, Austria, Sunday 10 April 2011 Mining and Analyzing Online Social Networks Emilio Ferrara eferrara@unime.it Advisor: Prof. Giacomo Fiumara PhD School

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data: Part I Instructor: Yizhou Sun yzsun@ccs.neu.edu November 12, 2013 Announcement Homework 4 will be out tonight Due on 12/2 Next class will be canceled

More information

Graph Analytics in the Big Data Era

Graph Analytics in the Big Data Era Graph Analytics in the Big Data Era Yongming Luo, dr. George H.L. Fletcher Web Engineering Group What is really hot? 19-11-2013 PAGE 1 An old/new data model graph data Model entities and relations between

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data

More information

COMP Data Structures

COMP Data Structures COMP 2140 - Data Structures Shahin Kamali Topic 1 - Introductions University of Manitoba Based on notes by S. Durocher. COMP 2140 - Data Structures 1 / 35 Introduction COMP 2140 - Data Structures 1 / 35

More information

Graph analytics approach to analyse Enterprise Architecture models

Graph analytics approach to analyse Enterprise Architecture models Nikhitha Rajashekar nikhita.rajashekar@rwth-aachen.de Graph analytics approach to analyse Enterprise Architecture models Master Thesis Proposal Supervisor: Simon Hacks Overview 1. Enterprise Architecture

More information

Jure Leskovec Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah

Jure Leskovec Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah Jure Leskovec (@jure) Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah 2 My research group at Stanford: Mining and modeling large social and information networks

More information

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 258 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 258 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Extracting Information from Complex Networks

Extracting Information from Complex Networks Extracting Information from Complex Networks 1 Complex Networks Networks that arise from modeling complex systems: relationships Social networks Biological networks Distinguish from random networks uniform

More information

Database and Knowledge-Base Systems: Data Mining. Martin Ester

Database and Knowledge-Base Systems: Data Mining. Martin Ester Database and Knowledge-Base Systems: Data Mining Martin Ester Simon Fraser University School of Computing Science Graduate Course Spring 2006 CMPT 843, SFU, Martin Ester, 1-06 1 Introduction [Fayyad, Piatetsky-Shapiro

More information

DS504/CS586: Big Data Analytics Data Pre-processing and Cleaning Prof. Yanhua Li

DS504/CS586: Big Data Analytics Data Pre-processing and Cleaning Prof. Yanhua Li Welcome to DS504/CS586: Big Data Analytics Data Pre-processing and Cleaning Prof. Yanhua Li Time: 6:00pm 8:50pm R Location: AK 232 Fall 2016 The Data Equation Oceans of Data Ocean Biodiversity Informatics,

More information

CPSC 2380 Data Structures and Algorithms

CPSC 2380 Data Structures and Algorithms CPSC 2380 Data Structures and Algorithms Spring 2014 Department of Computer Science University of Arkansas at Little Rock 2801 South University Avenue Little Rock, Arkansas 72204-1099 Class Hours: Tuesday

More information

University of Virginia Department of Computer Science. CS 4501: Information Retrieval Fall 2015

University of Virginia Department of Computer Science. CS 4501: Information Retrieval Fall 2015 University of Virginia Department of Computer Science CS 4501: Information Retrieval Fall 2015 2:00pm-3:30pm, Tuesday, December 15th Name: ComputingID: This is a closed book and closed notes exam. No electronic

More information

Design and Analysis of Algorithms. Comp 271. Mordecai Golin. Department of Computer Science, HKUST

Design and Analysis of Algorithms. Comp 271. Mordecai Golin. Department of Computer Science, HKUST Design and Analysis of Algorithms Revised 05/02/03 Comp 271 Mordecai Golin Department of Computer Science, HKUST Information about the Lecturer Dr. Mordecai Golin Office: 3559 Email: golin@cs.ust.hk http://www.cs.ust.hk/

More information

Last week: Breadth-First Search

Last week: Breadth-First Search 1 Last week: Breadth-First Search Set L i = [] for i=1,,n L 0 = {w}, where w is the start node For i = 0,, n-1: For u in L i : For each v which is a neighbor of u: If v isn t yet visited: - mark v as visited,

More information

TOTAL CREDIT UNITS L T P/ S SW/F W. Course Title: Analysis & Design of Algorithm. Course Level: UG Course Code: CSE303 Credit Units: 5

TOTAL CREDIT UNITS L T P/ S SW/F W. Course Title: Analysis & Design of Algorithm. Course Level: UG Course Code: CSE303 Credit Units: 5 Course Title: Analysis & Design of Algorithm Course Level: UG Course Code: CSE303 Credit Units: 5 L T P/ S SW/F W TOTAL CREDIT UNITS 3 1 2-5 Course Objectives: The designing of algorithm is an important

More information

Modeling the Language of Risk in Activism. Hailey Reeves and Evan Brown Mentor: William Lippitt

Modeling the Language of Risk in Activism. Hailey Reeves and Evan Brown Mentor: William Lippitt Modeling the Language of Risk in Activism Hailey Reeves and Evan Brown Mentor: William Lippitt Project Motivation Modeling perceived risks and possible influences that discourage or negatively influence

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Queries on streams

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Lecture #6: Mining Data Streams Seoul National University 1 Outline Overview Sampling From Data Stream Queries Over Sliding Window 2 Data Streams In many data mining situations,

More information

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul 1 CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul Introduction Our problem is crawling a static social graph (snapshot). Given

More information

Compressed representations for web and social graphs. Cecilia Hernandez and Gonzalo Navarro Presented by Helen Xu 6.

Compressed representations for web and social graphs. Cecilia Hernandez and Gonzalo Navarro Presented by Helen Xu 6. Compressed representations for web and social graphs Cecilia Hernandez and Gonzalo Navarro Presented by Helen Xu 6.886 April 6, 2018 Web graphs and social networks Web graphs represent the link structure

More information

Network visualization techniques and evaluation

Network visualization techniques and evaluation Network visualization techniques and evaluation The Charlotte Visualization Center University of North Carolina, Charlotte March 15th 2007 Outline 1 Definition and motivation of Infovis 2 3 4 Outline 1

More information

Biological Networks Analysis

Biological Networks Analysis Biological Networks Analysis Introduction and Dijkstra s algorithm Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein The clustering problem: partition genes into distinct

More information

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection

CSE 158 Lecture 6. Web Mining and Recommender Systems. Community Detection CSE 158 Lecture 6 Web Mining and Recommender Systems Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1

Graph Theory Review. January 30, Network Science Analytics Graph Theory Review 1 Graph Theory Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ January 30, 2018 Network

More information

CE4031 and CZ4031 Database System Principles

CE4031 and CZ4031 Database System Principles CE4031 and CZ4031 Database System Principles Academic AY1819 Semester 1 CE/CZ4031 Database System Principles s CE/CZ2001 Algorithms; CZ2007 Introduction to Databases CZ4033 Advanced Data Management (not

More information

Mining Data that Changes. 17 July 2015

Mining Data that Changes. 17 July 2015 Mining Data that Changes 17 July 2015 Data is Not Static Data is not static New transactions, new friends, stop following somebody in Twitter, But most data mining algorithms assume static data Even a

More information

Gotcha! Network Analytics to augment Fraud Detection Big Data in the Food Chain: the un(der)explored goldmine?

Gotcha! Network Analytics to augment Fraud Detection Big Data in the Food Chain: the un(der)explored goldmine? Gotcha! Network Analytics to augment Fraud Detection Big Data in the Food Chain: the un(der)explored goldmine? December 4th, 2018 Author: Véronique Van Vlasselaer SAS Pre-Sales Analytical Consultant Introduction

More information

Week 8: The fundamentals of graph theory; Planar Graphs 25 and 27 October, 2017

Week 8: The fundamentals of graph theory; Planar Graphs 25 and 27 October, 2017 (1/25) MA284 : Discrete Mathematics Week 8: The fundamentals of graph theory; Planar Graphs 25 and 27 October, 2017 1 Definitions 1. A graph 2. Paths and connected graphs 3. Complete graphs 4. Vertex degree

More information

COMP Data Structures

COMP Data Structures Shahin Kamali Topic 1 - Introductions University of Manitoba Based on notes by S. Durocher. 1 / 35 Introduction Introduction 1 / 35 Introduction In a Glance... Data structures are building blocks for designing

More information

Graph Theory for Network Science

Graph Theory for Network Science Graph Theory for Network Science Dr. Natarajan Meghanathan Professor Department of Computer Science Jackson State University, Jackson, MS E-mail: natarajan.meghanathan@jsums.edu Networks or Graphs We typically

More information

Volume 2, Issue 11, November 2014 International Journal of Advance Research in Computer Science and Management Studies

Volume 2, Issue 11, November 2014 International Journal of Advance Research in Computer Science and Management Studies Volume 2, Issue 11, November 2014 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com

More information

CS2 Algorithms and Data Structures Note 9

CS2 Algorithms and Data Structures Note 9 CS2 Algorithms and Data Structures Note 9 Graphs The remaining three lectures of the Algorithms and Data Structures thread will be devoted to graph algorithms. 9.1 Directed and Undirected Graphs A graph

More information

CS490W: Web Information Search & Management. CS-490W Web Information Search and Management. Luo Si. Department of Computer Science Purdue University

CS490W: Web Information Search & Management. CS-490W Web Information Search and Management. Luo Si. Department of Computer Science Purdue University CS490W: Web Information Search & Management CS-490W Web Information Search and Management Luo Si Department of Computer Science Purdue University Overview Web: Growth of the Web The world produces between

More information

745: Advanced Database Systems

745: Advanced Database Systems 745: Advanced Database Systems Yanlei Diao University of Massachusetts Amherst Outline Overview of course topics Course requirements Database Management Systems 1. Online Analytical Processing (OLAP) vs.

More information

Text Analytics (Text Mining)

Text Analytics (Text Mining) CSE 6242 / CX 4242 Apr 1, 2014 Text Analytics (Text Mining) Concepts and Algorithms Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer,

More information

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection

CSE 255 Lecture 6. Data Mining and Predictive Analytics. Community Detection CSE 255 Lecture 6 Data Mining and Predictive Analytics Community Detection Dimensionality reduction Goal: take high-dimensional data, and describe it compactly using a small number of dimensions Assumption:

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms CSE 101, Winter 2018 Design and Analysis of Algorithms Lecture 6: BFS, SPs, Dijkstra Class URL: http://vlsicad.ucsd.edu/courses/cse101-w18/ Distance in a Graph The distance between two nodes is the length

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms CS4335: Design and Analysis of Algorithms Who we are: Dr. Lusheng WANG Dept. of Computer Science office: B6422 phone: 2788 9820 e-mail: lwang@cs.cityu.edu.hk Course web site: http://www.cs.cityu.edu.hk/~lwang/ccs3335.html

More information