Graph and Link Mining


1 Graph and Link Mining

2 Graphs - Basics
A graph is a powerful abstraction for modeling entities and their pairwise relationships.
$G = (V, E)$
Set of nodes: $V = \{v_1, \ldots, v_5\}$
Set of edges: $E = \{(v_1, v_2), \ldots, (v_4, v_5)\}$
Examples: social networks, Twitter followers, the Web, collaboration graphs.
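A minimal sketch of how such a graph might be stored in code, assuming an undirected graph kept as an adjacency dictionary. The slide names only the first and last edge, so the edge list below is hypothetical and chosen purely for illustration; `build_adjacency` is an illustrative helper name, not something from the slides.

```python
# Hypothetical edge list: the slide only shows the first and last edge,
# so the edges in between are invented for this illustration.
nodes = ["v1", "v2", "v3", "v4", "v5"]
edges = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"), ("v4", "v5")]


def build_adjacency(nodes, edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)  # undirected: record the edge in both directions
    return adjacency


adjacency = build_adjacency(nodes, edges)
```

An adjacency dictionary keeps neighbor lookups cheap, which is what most of the link-analysis computations later in the slides need.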

3 Undirected Graphs
Undirected graph: the edges are unordered pairs, so they can be traversed in either direction.
Degree of a node: the number of edges incident on the node.
Path: a sequence of edges leading from one node to another.
Connected component: a maximal set of nodes such that there is a path between any two nodes in the set.
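The definitions of degree and connected component can be sketched as follows. This is one possible implementation using breadth-first search; the `example` graph (a triangle plus a separate edge) and the function names are assumptions made for illustration.

```python
from collections import deque


def degree(adjacency, node):
    """Number of edges incident on the node."""
    return len(adjacency[node])


def connected_components(adjacency):
    """Group nodes into components using breadth-first search."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical undirected graph: a triangle (a, b, c) and a separate edge (d, e).
example = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"},
    "d": {"e"}, "e": {"d"},
}
components = connected_components(example)
```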

4 Directed Graphs
Directed graph: the edges are ordered pairs, so they can be traversed only in the direction from the first node to the second.
In-degree and out-degree of a node: the number of edges entering and leaving the node.
Path: a sequence of directed edges leading from one node to another.
Strongly connected component: a maximal set of nodes such that there is a directed path from any node in the set to any other node in the set.
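A small sketch of these directed-graph notions, assuming the graph is stored as a dictionary from each node to the set of nodes it points to. The strongly connected component of a node is computed here as the intersection of what the node can reach and what can reach it; this is a simple approach chosen for clarity, not the fastest known algorithm. The `digraph` example and all function names are hypothetical.

```python
def reachable(successors, start):
    """All nodes reachable from start by following directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reverse(successors):
    """Flip every edge, giving a map from each node to its predecessors."""
    predecessors = {node: set() for node in successors}
    for u, targets in successors.items():
        for v in targets:
            predecessors.setdefault(v, set()).add(u)
    return predecessors


def strongly_connected_component(successors, node):
    """Nodes that can both reach `node` and be reached from it."""
    return reachable(successors, node) & reachable(reverse(successors), node)


# Hypothetical directed graph: a cycle a -> b -> c -> a, plus an edge c -> d.
digraph = {"a": {"b"}, "b": {"c"}, "c": {"a", "d"}, "d": set()}
out_degree = {n: len(targets) for n, targets in digraph.items()}
in_degree = {n: len(preds) for n, preds in reverse(digraph).items()}
```

In this example `d` is reachable from the cycle but cannot get back to it, so it forms a strongly connected component on its own.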

5 Examples of Graphs We Might Mine
Airline route maps are useful: the information can tell you about both history and politics.
Call detail records tell us about relationships between people. Who got in trouble about a decade ago for using this information?
The Web is based on (hyper)links between documents.
Social networks form graphs.
Link analysis is the data mining technique that addresses relationships and connections.

6 6 Degrees of Separation
Claim: there are at most 6 degrees of separation between any two people.
This is important in social networks: LinkedIn tells you how you connect to others, and your network expands with each link.
Stanley Milgram wasn't the first to note the small-world effect, but he popularized it with a famous experiment: how close are two random people?
He picked source people in Omaha, Nebraska or Wichita, Kansas, and a target person in Boston.
Each source person was asked to send a letter to the target person or, if they did not know the target, to send it to someone more likely to know them.
The average path length was 5.5 or 6.
But only 64 of 296 letters arrived (this is often not highlighted).
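The "degrees of separation" between two people correspond to the shortest-path length between their nodes in the friendship graph. A minimal sketch using breadth-first search, assuming an undirected graph stored as an adjacency dictionary; the `friends` data and the function name are hypothetical.

```python
from collections import deque


def degrees_of_separation(adjacency, source, target):
    """Shortest number of links between source and target, or None if unreachable."""
    if source == target:
        return 0
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                if neighbor == target:
                    return distance[neighbor]
                queue.append(neighbor)
    return None


# Hypothetical friendship graph: a chain ann - bob - cara - dev, and an isolated eve.
friends = {
    "ann": {"bob"}, "bob": {"ann", "cara"}, "cara": {"bob", "dev"},
    "dev": {"cara"}, "eve": set(),
}
```

Unlike the letter-forwarding experiment, breadth-first search sees the whole graph, so it finds the true shortest path rather than the path people happen to choose.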

7 Examples of Applications
Identifying authoritative sources of information on the WWW by analyzing page links: Google and PageRank (we will come back to this).
Understanding physician referral patterns.
Analyzing telephone call patterns.
MCI Friends and Family: you call Mary Smith, also on MCI, so ask her to join MCI. But your wife does not know Mary Smith! Oops! Far-fetched? Facebook does it all of the time!
Identifying fraud: in the past, one would purchase several stolen calling cards and use them to call the same person. That is a clue.

8 Mining the Graph Structure
A graph is a combinatorial object with a certain structure.
Mining the structure of the graph reveals information about the entities in the graph.
E.g., if in the Facebook graph I find a group of people who are all linked to each other, then these people are likely to be a community. This is the community discovery problem.
By measuring the number of friends in the Facebook graph I can find the most important nodes. This is the node importance problem.

9 Importance Problem
What are the most important nodes in the graph?
What are the most authoritative pages on the web?
Who are the important users on Facebook?
What are the most influential Twitter accounts?

10 Link Analysis
First-generation search engines viewed documents as flat text files and could not cope with size, spamming, and user needs.
Second-generation search engines: ranking becomes critical.
Shift from relevance to authoritativeness.
Authoritativeness: the static importance of the page.
A success story for network analysis and a huge commercial success.
It all started with two graduate students at Stanford. Everyone knows the company, right?

11 Link Analysis: Intuition
A link from page p to page q denotes endorsement: page p considers page q an authority on a subject.
Use the graph of recommendations to assign an authority value to every page.
The same idea applies to other graphs as well, e.g. the Twitter graph, where user p follows user q.

12 Constructing the Graph
(Figure: example graph with an authority weight w on each node.)
Goal: output an authority weight for each node.
This weight is also known as centrality or importance.

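13 Rank by Popularity
Rank pages according to the number of incoming edges (in-degree, degree centrality).
(Figure: example graph with each page labeled by its in-degree weight w, with values such as w=3 and w=2.)
Resulting ranking:
1. Red Page
2. Yellow Page
3. Blue Page
4. Purple Page
5. Green Page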
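Ranking by in-degree can be sketched in a few lines. The page names and links below are hypothetical and do not reproduce the slide's figure, whose edges are not recoverable from the transcript; ties are broken alphabetically, which is an arbitrary choice made here.

```python
from collections import Counter


def rank_by_in_degree(pages, links):
    """Return (pages sorted by decreasing in-degree, in-degree per page)."""
    in_degree = Counter({page: 0 for page in pages})
    for _source, target in links:
        in_degree[target] += 1
    ranking = sorted(pages, key=lambda page: (-in_degree[page], page))
    return ranking, dict(in_degree)


# Hypothetical web graph: each pair (source, target) is a link source -> target.
pages = ["A", "B", "C", "D"]
links = [("B", "A"), ("C", "A"), ("D", "A"), ("A", "B"), ("C", "B"), ("A", "C")]
ranking, in_degree = rank_by_in_degree(pages, links)
```

This measure counts every incoming link equally, regardless of who it comes from.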

14 Popularity
What matters is not only how many pages link to you, but also how important those pages are.
Good authorities are pointed to by good authorities.
This gives a recursive definition of importance.

15 PageRank
Good authorities are pointed to by good authorities.
The value of a page is the value of the pages that link to it.
How do we implement that?
Each node distributes its authority value equally among the nodes it points to.
The authority value of each node is the sum of the authority fractions it collects from the nodes that point to it.
Solving the system of equations gives the authority values of the nodes:
$w_1 + w_2 + w_3 = 1$
$w_1 = w_2 + w_3$
$w_2 = \frac{1}{2} w_1$
$w_3 = \frac{1}{2} w_1$
Solution: $w_1 = \frac{1}{2}$, $w_2 = \frac{1}{4}$, $w_3 = \frac{1}{4}$
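As a worked check of the three-node example, the system can be solved by substitution. The subscripts $w_1, w_2, w_3$ and the normalization to a total weight of 1 are assumptions here, since the transcript lost the indices; they are chosen so that the result matches the stated values ½, ¼, ¼.

```latex
\begin{align*}
w_2 &= \tfrac{1}{2} w_1, \qquad w_3 = \tfrac{1}{2} w_1 \\
w_1 + w_2 + w_3 &= 1
  \;\Longrightarrow\; w_1 + \tfrac{1}{2} w_1 + \tfrac{1}{2} w_1 = 2 w_1 = 1 \\
w_1 &= \tfrac{1}{2}, \qquad w_2 = w_3 = \tfrac{1}{4} \\
\text{check: } w_2 + w_3 &= \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2} = w_1
\end{align*}
```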

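16 A More Complex Example
(Figure: directed graph on nodes $v_1, \ldots, v_5$.)
$w_1 = \frac{1}{3} w_4 + \frac{1}{2} w_5$
$w_2 = \frac{1}{2} w_1 + w_3 + \frac{1}{3} w_4$
$w_3 = \frac{1}{2} w_1 + \frac{1}{3} w_4$
$w_4 = \frac{1}{2} w_5$
$w_5 = w_2$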
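One way to solve this five-node system exactly is Gauss-Jordan elimination over rational numbers. The sketch below assumes the dropped digits in the transcript are 1s (so "/3" reads as 1/3 and "w" as w_1) and adds the normalization w_1 + ... + w_5 = 1, which this slide does not state but the previous example uses. The equation for w_2 is left out of the matrix because it is implied by the others. `solve_exact` is an illustrative helper, not a library function.

```python
from fractions import Fraction as F


def solve_exact(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    rows = [[F(x) for x in row] + [F(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


# Columns are w1..w5; each row is one equation moved to the form "... = rhs".
equations = [
    [1, 0, 0, F(-1, 3), F(-1, 2)],   # w1 - 1/3 w4 - 1/2 w5 = 0
    [F(-1, 2), 0, 1, F(-1, 3), 0],   # w3 - 1/2 w1 - 1/3 w4 = 0
    [0, 0, 0, 1, F(-1, 2)],          # w4 - 1/2 w5 = 0
    [0, -1, 0, 0, 1],                # w5 - w2 = 0
    [1, 1, 1, 1, 1],                 # assumed normalization: weights sum to 1
]
rhs = [0, 0, 0, 0, 1]
weights = solve_exact(equations, rhs)
```

Under these assumptions the weights come out as 2/11, 3/11, 3/22, 3/22 and 3/11 for w_1 through w_5, and they also satisfy the omitted equation for w_2.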

17 Random Walks on Graphs
What we described is equivalent to a random walk on the graph.
Random walk:
Start from a node chosen uniformly at random.
Pick one of the outgoing edges uniformly at random and follow it.
Repeat.
Some nodes will be visited more often than others; those are the more important nodes.
Importance is based not only on the number of incoming links, but also on how often the predecessor nodes are visited.
A value like Google's PageRank indicates how often a node would be visited.
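The random walk described above can be simulated directly. This sketch assumes the five-node directed graph implied by the equations of the earlier "more complex example" (an edge list reconstructed from those equations, so treat it as an assumption) and assumes every node has at least one outgoing edge. The function name and the fixed seed are illustrative choices.

```python
import random

# Assumed out-links, reconstructed from the earlier five-node example:
# node 1 points to 2 and 3, node 2 to 5, node 3 to 2, node 4 to 1, 2, 3, node 5 to 1 and 4.
out_links = {1: [2, 3], 2: [5], 3: [2], 4: [1, 2, 3], 5: [1, 4]}


def visit_frequencies(out_links, steps, seed=0):
    """Simulate a random walk and return the fraction of steps spent at each node."""
    rng = random.Random(seed)
    node = rng.choice(list(out_links))  # start from a node chosen uniformly at random
    visits = {n: 0 for n in out_links}
    for _ in range(steps):
        visits[node] += 1
        node = rng.choice(out_links[node])  # follow a random outgoing edge
    return {n: count / steps for n, count in visits.items()}


frequencies = visit_frequencies(out_links, steps=200_000)
```

With enough steps, the visit frequencies should approach the authority weights obtained by solving the equations, which is the sense in which the two views are equivalent.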

18 Random Walks on Graphs
Question: what is the probability of being at a specific node?
$p_i^{t}$: probability of being at node i at this step
$p_i^{t+1}$: probability of being at node i at the next step
$p_1^{t+1} = \frac{1}{3} p_4^{t} + \frac{1}{2} p_5^{t}$
$p_2^{t+1} = \frac{1}{2} p_1^{t} + p_3^{t} + \frac{1}{3} p_4^{t}$
$p_3^{t+1} = \frac{1}{2} p_1^{t} + \frac{1}{3} p_4^{t}$
$p_4^{t+1} = \frac{1}{2} p_5^{t}$
$p_5^{t+1} = p_2^{t}$
After many steps the probabilities converge to the stationary distribution of the random walk.
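The update from one step's probabilities to the next can be applied repeatedly until the values stop changing. A minimal sketch, assuming the five-node graph implied by the equations above (reconstructed, so an assumption), a uniform starting distribution, and that every node has at least one outgoing edge. The names `step` and `stationary_distribution` and the tolerance are illustrative choices.

```python
# Assumed out-links, reconstructed from the update equations on this slide.
out_links = {1: [2, 3], 2: [5], 3: [2], 4: [1, 2, 3], 5: [1, 4]}


def step(p, out_links):
    """One step of the walk: each node splits its probability among its out-links."""
    nxt = {node: 0.0 for node in out_links}
    for node, targets in out_links.items():
        share = p[node] / len(targets)
        for target in targets:
            nxt[target] += share
    return nxt


def stationary_distribution(out_links, tol=1e-12, max_steps=10_000):
    """Iterate the update from a uniform start until the values stop changing."""
    n = len(out_links)
    p = {node: 1.0 / n for node in out_links}
    for _ in range(max_steps):
        nxt = step(p, out_links)
        if max(abs(nxt[k] - p[k]) for k in p) < tol:
            return nxt
        p = nxt
    return p


p_star = stationary_distribution(out_links)
```

Convergence here relies on the example graph being strongly connected and aperiodic; for graphs without those properties this plain iteration may not settle.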

19 How Does PageRank Work?
Arbitrarily initialize all pages to a PageRank of 1.
Repeatedly perform the calculations for each page.
Eventually the values will converge.
PageRank is what caused Google to succeed; prior to that only content mattered, not link structure.

20 Benefits of PageRank
It is not trivial to fool PageRank.
You can create dummy pages that point to your page, but since no one points to those dummy pages, they will have low PageRank and will not help much.
You can also make the dummy pages point to one another, but without being pointed to by an outside authority, the impact will be limited.
Still, Google must have many tweaks to catch cases like this, known as link spam or link farms.

21 Social Network Analysis
Social Network Analysis Overview (5 minutes)
What is Social Network Analysis (4 minutes)

Social Networks 2015 Lecture 10: The structure of the web and link analysis

Social Networks 2015 Lecture 10: The structure of the web and link analysis 04198250 Social Networks 2015 Lecture 10: The structure of the web and link analysis The structure of the web Information networks Nodes: pieces of information Links: different relations between information

More information

Structure of Social Networks

Structure of Social Networks Structure of Social Networks Outline Structure of social networks Applications of structural analysis Social *networks* Twitter Facebook Linked-in IMs Email Real life Address books... Who Twitter #numbers

More information

Introduction To Graphs and Networks. Fall 2013 Carola Wenk

Introduction To Graphs and Networks. Fall 2013 Carola Wenk Introduction To Graphs and Networks Fall 203 Carola Wenk On the Internet, links are essentially weighted by factors such as transit time, or cost. The goal is to find the shortest path from one node to

More information

Lecture #3: PageRank Algorithm The Mathematics of Google Search

Lecture #3: PageRank Algorithm The Mathematics of Google Search Lecture #3: PageRank Algorithm The Mathematics of Google Search We live in a computer era. Internet is part of our everyday lives and information is only a click away. Just open your favorite search engine,

More information

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena

CSE 190 Lecture 16. Data Mining and Predictive Analytics. Small-world phenomena CSE 190 Lecture 16 Data Mining and Predictive Analytics Small-world phenomena Another famous study Stanley Milgram wanted to test the (already popular) hypothesis that people in social networks are separated

More information

Degree Distribution: The case of Citation Networks

Degree Distribution: The case of Citation Networks Network Analysis Degree Distribution: The case of Citation Networks Papers (in almost all fields) refer to works done earlier on same/related topics Citations A network can be defined as Each node is a

More information

HW 4: PageRank & MapReduce. 1 Warmup with PageRank and stationary distributions [10 points], collaboration

HW 4: PageRank & MapReduce. 1 Warmup with PageRank and stationary distributions [10 points], collaboration CMS/CS/EE 144 Assigned: 1/25/2018 HW 4: PageRank & MapReduce Guru: Joon/Cathy Due: 2/1/2018 at 10:30am We encourage you to discuss these problems with others, but you need to write up the actual solutions

More information

Graphs / Networks. CSE 6242/ CX 4242 Feb 18, Centrality measures, algorithms, interactive applications. Duen Horng (Polo) Chau Georgia Tech

Graphs / Networks. CSE 6242/ CX 4242 Feb 18, Centrality measures, algorithms, interactive applications. Duen Horng (Polo) Chau Georgia Tech CSE 6242/ CX 4242 Feb 18, 2014 Graphs / Networks Centrality measures, algorithms, interactive applications Duen Horng (Polo) Chau Georgia Tech Partly based on materials by Professors Guy Lebanon, Jeffrey

More information

Einführung in Web und Data Science Community Analysis. Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme

Einführung in Web und Data Science Community Analysis. Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme Einführung in Web und Data Science Community Analysis Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme Today s lecture Anchor text Link analysis for ranking Pagerank and variants

More information

Information Networks: PageRank

Information Networks: PageRank Information Networks: PageRank Web Science (VU) (706.716) Elisabeth Lex ISDS, TU Graz June 18, 2018 Elisabeth Lex (ISDS, TU Graz) Links June 18, 2018 1 / 38 Repetition Information Networks Shape of the

More information

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov

Algorithms and Applications in Social Networks. 2017/2018, Semester B Slava Novgorodov Algorithms and Applications in Social Networks 2017/2018, Semester B Slava Novgorodov 1 Lesson #1 Administrative questions Course overview Introduction to Social Networks Basic definitions Network properties

More information

How to explore big networks? Question: Perform a random walk on G. What is the average node degree among visited nodes, if avg degree in G is 200?

How to explore big networks? Question: Perform a random walk on G. What is the average node degree among visited nodes, if avg degree in G is 200? How to explore big networks? Question: Perform a random walk on G. What is the average node degree among visited nodes, if avg degree in G is 200? Questions from last time Avg. FB degree is 200 (suppose).

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information

Copyright 2000, Kevin Wayne 1

Copyright 2000, Kevin Wayne 1 Chapter 3 - Graphs Undirected Graphs Undirected graph. G = (V, E) V = nodes. E = edges between pairs of nodes. Captures pairwise relationship between objects. Graph size parameters: n = V, m = E. Directed

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

Link Analysis and Web Search

Link Analysis and Web Search Link Analysis and Web Search Moreno Marzolla Dip. di Informatica Scienza e Ingegneria (DISI) Università di Bologna http://www.moreno.marzolla.name/ based on material by prof. Bing Liu http://www.cs.uic.edu/~liub/webminingbook.html

More information

Information Retrieval and Web Search

Information Retrieval and Web Search Information Retrieval and Web Search Link analysis Instructor: Rada Mihalcea (Note: This slide set was adapted from an IR course taught by Prof. Chris Manning at Stanford U.) The Web as a Directed Graph

More information

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge

Centralities (4) By: Ralucca Gera, NPS. Excellence Through Knowledge Centralities (4) By: Ralucca Gera, NPS Excellence Through Knowledge Some slide from last week that we didn t talk about in class: 2 PageRank algorithm Eigenvector centrality: i s Rank score is the sum

More information

Lecture 27: Learning from relational data

Lecture 27: Learning from relational data Lecture 27: Learning from relational data STATS 202: Data mining and analysis December 2, 2017 1 / 12 Announcements Kaggle deadline is this Thursday (Dec 7) at 4pm. If you haven t already, make a submission

More information

How to organize the Web?

How to organize the Web? How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second try: Web Search Information Retrieval attempts to find relevant docs in a small and trusted set Newspaper

More information

Absorbing Random walks Coverage

Absorbing Random walks Coverage DATA MINING LECTURE 3 Absorbing Random walks Coverage Random Walks on Graphs Random walk: Start from a node chosen uniformly at random with probability. n Pick one of the outgoing edges uniformly at random

More information

THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS. Summer semester, 2016/2017

THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS. Summer semester, 2016/2017 THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS Summer semester, 2016/2017 SOCIAL NETWORK ANALYSIS: THEORY AND APPLICATIONS 1. A FEW THINGS ABOUT NETWORKS NETWORKS IN THE REAL WORLD There are four categories

More information

Absorbing Random walks Coverage

Absorbing Random walks Coverage DATA MINING LECTURE 3 Absorbing Random walks Coverage Random Walks on Graphs Random walk: Start from a node chosen uniformly at random with probability. n Pick one of the outgoing edges uniformly at random

More information

Introduction Types of Social Network Analysis Social Networks in the Online Age Data Mining for Social Network Analysis Applications Conclusion

Introduction Types of Social Network Analysis Social Networks in the Online Age Data Mining for Social Network Analysis Applications Conclusion Introduction Types of Social Network Analysis Social Networks in the Online Age Data Mining for Social Network Analysis Applications Conclusion References Social Network Social Network Analysis Sociocentric

More information

World Wide Web has specific challenges and opportunities

World Wide Web has specific challenges and opportunities 6. Web Search Motivation Web search, as offered by commercial search engines such as Google, Bing, and DuckDuckGo, is arguably one of the most popular applications of IR methods today World Wide Web has

More information

Using! to Teach Graph Theory

Using! to Teach Graph Theory !! Using! to Teach Graph Theory Todd Abel Mary Elizabeth Searcy Appalachian State University Why Graph Theory? Mathematical Thinking (Habits of Mind, Mathematical Practices) Accessible to students at a

More information

Web Search Ranking. (COSC 488) Nazli Goharian Evaluation of Web Search Engines: High Precision Search

Web Search Ranking. (COSC 488) Nazli Goharian Evaluation of Web Search Engines: High Precision Search Web Search Ranking (COSC 488) Nazli Goharian nazli@cs.georgetown.edu 1 Evaluation of Web Search Engines: High Precision Search Traditional IR systems are evaluated based on precision and recall. Web search

More information

CS6200 Information Retreival. The WebGraph. July 13, 2015

CS6200 Information Retreival. The WebGraph. July 13, 2015 CS6200 Information Retreival The WebGraph The WebGraph July 13, 2015 1 Web Graph: pages and links The WebGraph describes the directed links between pages of the World Wide Web. A directed edge connects

More information

CS 6604: Data Mining Large Networks and Time-Series

CS 6604: Data Mining Large Networks and Time-Series CS 6604: Data Mining Large Networks and Time-Series Soumya Vundekode Lecture #12: Centrality Metrics Prof. B Aditya Prakash Agenda Link Analysis and Web Search Searching the Web: The Problem of Ranking

More information

MAE 298, Lecture 9 April 30, Web search and decentralized search on small-worlds

MAE 298, Lecture 9 April 30, Web search and decentralized search on small-worlds MAE 298, Lecture 9 April 30, 2007 Web search and decentralized search on small-worlds Search for information Assume some resource of interest is stored at the vertices of a network: Web pages Files in

More information

Algorithms, Games, and Networks February 21, Lecture 12

Algorithms, Games, and Networks February 21, Lecture 12 Algorithms, Games, and Networks February, 03 Lecturer: Ariel Procaccia Lecture Scribe: Sercan Yıldız Overview In this lecture, we introduce the axiomatic approach to social choice theory. In particular,

More information

Brief (non-technical) history

Brief (non-technical) history Web Data Management Part 2 Advanced Topics in Database Management (INFSCI 2711) Textbooks: Database System Concepts - 2010 Introduction to Information Retrieval - 2008 Vladimir Zadorozhny, DINS, SCI, University

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second

More information

Part 1: Link Analysis & Page Rank

Part 1: Link Analysis & Page Rank Chapter 8: Graph Data Part 1: Link Analysis & Page Rank Based on Leskovec, Rajaraman, Ullman 214: Mining of Massive Datasets 1 Graph Data: Social Networks [Source: 4-degrees of separation, Backstrom-Boldi-Rosa-Ugander-Vigna,

More information

MODULE 5 BLOG PROMOTION AND MARKETING STRATEGIES

MODULE 5 BLOG PROMOTION AND MARKETING STRATEGIES MODULE 5 BLOG PROMOTION AND MARKETING STRATEGIES RANKING YOUR CONTENT IN THE SEARCH ENGINES In order to successfully rank your content in the search engines, you must make sure you ve optimized your blog

More information

1 Starting around 1996, researchers began to work on. 2 In Feb, 1997, Yanhong Li (Scotch Plains, NJ) filed a

1 Starting around 1996, researchers began to work on. 2 In Feb, 1997, Yanhong Li (Scotch Plains, NJ) filed a !"#$ %#& ' Introduction ' Social network analysis ' Co-citation and bibliographic coupling ' PageRank ' HIS ' Summary ()*+,-/*,) Early search engines mainly compare content similarity of the query and

More information

CSI 445/660 Part 10 (Link Analysis and Web Search)

CSI 445/660 Part 10 (Link Analysis and Web Search) CSI 445/660 Part 10 (Link Analysis and Web Search) Ref: Chapter 14 of [EK] text. 10 1 / 27 Searching the Web Ranking Web Pages Suppose you type UAlbany to Google. The web page for UAlbany is among the

More information

Undirected Graphs. V = { 1, 2, 3, 4, 5, 6, 7, 8 } E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6 } n = 8 m = 11

Undirected Graphs. V = { 1, 2, 3, 4, 5, 6, 7, 8 } E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6 } n = 8 m = 11 Chapter 3 - Graphs Undirected Graphs Undirected graph. G = (V, E) V = nodes. E = edges between pairs of nodes. Captures pairwise relationship between objects. Graph size parameters: n = V, m = E. V = {

More information

Promoting Your Small Business with and Social Media

Promoting Your Small Business with  and Social Media How To Guide: Promoting Your Small Business with Email and Social Media Connect with Constant Contact. Everywhere. v1.0 06.27.2016 Market Your Email Socially Did you know that social media and email work

More information

CENTRALITIES. Carlo PICCARDI. DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy

CENTRALITIES. Carlo PICCARDI. DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy CENTRALITIES Carlo PICCARDI DEIB - Department of Electronics, Information and Bioengineering Politecnico di Milano, Italy email carlo.piccardi@polimi.it http://home.deib.polimi.it/piccardi Carlo Piccardi

More information

Link Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material.

Link Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material. Link Analysis from Bing Liu. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer and other material. 1 Contents Introduction Network properties Social network analysis Co-citation

More information

Social Network Analysis

Social Network Analysis Social Network Analysis Giri Iyengar Cornell University gi43@cornell.edu March 14, 2018 Giri Iyengar (Cornell Tech) Social Network Analysis March 14, 2018 1 / 24 Overview 1 Social Networks 2 HITS 3 Page

More information

Lecture 9: I: Web Retrieval II: Webology. Johan Bollen Old Dominion University Department of Computer Science

Lecture 9: I: Web Retrieval II: Webology. Johan Bollen Old Dominion University Department of Computer Science Lecture 9: I: Web Retrieval II: Webology Johan Bollen Old Dominion University Department of Computer Science jbollen@cs.odu.edu http://www.cs.odu.edu/ jbollen April 10, 2003 Page 1 WWW retrieval Two approaches

More information

F. Aiolli - Sistemi Informativi 2007/2008. Web Search before Google

F. Aiolli - Sistemi Informativi 2007/2008. Web Search before Google Web Search Engines 1 Web Search before Google Web Search Engines (WSEs) of the first generation (up to 1998) Identified relevance with topic-relateness Based on keywords inserted by web page creators (META

More information

Large-Scale Networks. PageRank. Dr Vincent Gramoli Lecturer School of Information Technologies

Large-Scale Networks. PageRank. Dr Vincent Gramoli Lecturer School of Information Technologies Large-Scale Networks PageRank Dr Vincent Gramoli Lecturer School of Information Technologies Introduction Last week we talked about: - Hubs whose scores depend on the authority of the nodes they point

More information

CSE 158 Lecture 11. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CSE 158 Lecture 11. Web Mining and Recommender Systems. Triadic closure; strong & weak ties CSE 158 Lecture 11 Web Mining and Recommender Systems Triadic closure; strong & weak ties Triangles So far we ve seen (a little about) how networks can be characterized by their connectivity patterns What

More information

Network Mathematics - Why is it a Small World? Oskar Sandberg

Network Mathematics - Why is it a Small World? Oskar Sandberg Network Mathematics - Why is it a Small World? Oskar Sandberg 1 Networks Formally, a network is a collection of points and connections between them. 2 Networks Formally, a network is a collection of points

More information

Lesson Three: False Claims Act and Health Insurance Portability and Accountability Act (HIPAA)

Lesson Three: False Claims Act and Health Insurance Portability and Accountability Act (HIPAA) Lesson Three: False Claims Act and Health Insurance Portability and Accountability Act (HIPAA) Introduction: Welcome to Honesty and Confidentiality Lesson Three: The False Claims Act is an important part

More information

PARTICIPANT CENTER GUIDE TEAMRAISER 2016 GUIDE

PARTICIPANT CENTER GUIDE TEAMRAISER 2016 GUIDE TEAMRAISER 06 GUIDE Participant Center Customer Service Guide September 05 EVERY RIDE. EVERY RIDER. EVERY CONTRIBUTION MATTERS. Every day we come one step closer to our goal a world free of MS. Every day

More information

A Guide to using Social Media (Facebook and Twitter)

A Guide to using Social Media (Facebook and Twitter) A Guide to using Social Media (Facebook and Twitter) Facebook 1. Visit www.facebook.com 2. Click the green Sign up button on the top left-hand corner (see diagram below) 3. Enter all the information required

More information

PARTICIPANT CENTER GUIDE 1 TEAMRAISER 2016 GUIDE

PARTICIPANT CENTER GUIDE 1 TEAMRAISER 2016 GUIDE PARTICIPANT CENTER GUIDE TEAMRAISER 06 GUIDE PARTICIPANT CENTER GUIDE EVERY RIDE. EVERY RIDER. EVERY CONTRIBUTION MATTERS. Every day we come one step closer to our goal a world free of MS. Every day we

More information

Social-Network Graphs

Social-Network Graphs Social-Network Graphs Mining Social Networks Facebook, Google+, Twitter Email Networks, Collaboration Networks Identify communities Similar to clustering Communities usually overlap Identify similarities

More information

The main things to note here are that:

The main things to note here are that: The MadeUpCompany Link Building Kit This is an internal link building template. One best practice for companies hoping to attain high search rankings is to get company employees to link to your site. This

More information

Graph Theory. Network Science: Graph theory. Graph theory Terminology and notation. Graph theory Graph visualization

Graph Theory. Network Science: Graph theory. Graph theory Terminology and notation. Graph theory Graph visualization Network Science: Graph Theory Ozalp abaoglu ipartimento di Informatica Scienza e Ingegneria Università di ologna www.cs.unibo.it/babaoglu/ ranch of mathematics for the study of structures called graphs

More information

A Survey of Google's PageRank

A Survey of Google's PageRank http://pr.efactory.de/ A Survey of Google's PageRank Within the past few years, Google has become the far most utilized search engine worldwide. A decisive factor therefore was, besides high performance

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Queries on streams

More information

walkinfo@aidatlanta.org What is my Personal Page? How do I set it Your Personal Page is your very own webpage dedicated to your fundraising efforts for AIDS Walk Atlanta & 5K Run. Setting up your Personal

More information

An Improved Computation of the PageRank Algorithm 1

An Improved Computation of the PageRank Algorithm 1 An Improved Computation of the PageRank Algorithm Sung Jin Kim, Sang Ho Lee School of Computing, Soongsil University, Korea ace@nowuri.net, shlee@computing.ssu.ac.kr http://orion.soongsil.ac.kr/ Abstract.

More information

Week 5 Video 5. Relationship Mining Network Analysis

Week 5 Video 5. Relationship Mining Network Analysis Week 5 Video 5 Relationship Mining Network Analysis Today s Class Network Analysis Network Analysis Analysis of anything that can be seen as connections between nodes Most common social networks Connections

More information

Filtering Unwanted Messages from (OSN) User Wall s Using MLT

Filtering Unwanted Messages from (OSN) User Wall s Using MLT Filtering Unwanted Messages from (OSN) User Wall s Using MLT Prof.Sarika.N.Zaware 1, Anjiri Ambadkar 2, Nishigandha Bhor 3, Shiva Mamidi 4, Chetan Patil 5 1 Department of Computer Engineering, AISSMS IOIT,

More information

Using Non-Linear Dynamical Systems for Web Searching and Ranking

Using Non-Linear Dynamical Systems for Web Searching and Ranking Using Non-Linear Dynamical Systems for Web Searching and Ranking Panayiotis Tsaparas Dipartmento di Informatica e Systemistica Universita di Roma, La Sapienza tsap@dis.uniroma.it ABSTRACT In the recent

More information

Strongly connected: A directed graph is strongly connected if every pair of vertices are reachable from each other.

Strongly connected: A directed graph is strongly connected if every pair of vertices are reachable from each other. Directed Graph In a directed graph, each edge (u, v) has a direction u v. Thus (u, v) (v, u). Directed graph is useful to model many practical problems (such as one-way road in traffic network, and asymmetric

More information

Administrative. Web crawlers. Web Crawlers and Link Analysis!

Administrative. Web crawlers. Web Crawlers and Link Analysis! Web Crawlers and Link Analysis! David Kauchak cs458 Fall 2011 adapted from: http://www.stanford.edu/class/cs276/handouts/lecture15-linkanalysis.ppt http://webcourse.cs.technion.ac.il/236522/spring2007/ho/wcfiles/tutorial05.ppt

More information

Chapter 1. Social Media and Social Computing. October 2012 Youn-Hee Han

Chapter 1. Social Media and Social Computing. October 2012 Youn-Hee Han Chapter 1. Social Media and Social Computing October 2012 Youn-Hee Han http://link.koreatech.ac.kr 1.1 Social Media A rapid development and change of the Web and the Internet Participatory web application

More information

AAG Mobile App User Manual

AAG Mobile App User Manual AAG Mobile App User Manual Tired of carrying a large printed program around the AAG Annual Meeting? Want to easily organize your AAG session schedule in a digital calendar format? Looking to save some

More information
