Decentralized Search

1 Link Analysis and Decentralized Search
Markus Strohmaier, Denis Helic
Multimediale Informationssysteme II

2 The Memex (1945)
The Memex [Bush 1945]: a mechanized private library for individual use. It mimics associative memory: users can insert documents, navigate documents, retrieve documents, and build trails through documents.
Operated and maintained individually, but trails can be shared socially, e.g.
(i) user A sends a trail to user B
(ii) user B modifies it and shares it with user C
(iii) user C uses the trail for navigation
C's interaction with documents is mediated by users A and B.
[Bush 1945] V. Bush. As We May Think. Atlantic Monthly.

3 Web based Retrieval: Challenges
Working with an enormous amount of data:
10 billion pages at 500 kB each (estimated in [...])
[...] pages per person on the globe
20 times larger than the Library of Congress (LoC) print collection (estimated in 2003)
Furthermore, there is a Deep Web: 550 billion pages (estimated in [...])

4 Web based Retrieval: Challenges
Example for the amount of web pages: searching for Star Trek yielded about 11 million results on Google [Nov 2007].
Ordinary users investigate [...] result list entries.
Which web page is the most interesting?
How to store an index (inverted file) of this size?

5 Web based Retrieval: Challenges
The Web is highly dynamic.
Study by Cho & Garcia-Molina (2002): 40% of the web pages in their dataset changed within a week; 23% of the .com pages changed on a daily basis.
Study by Fetterly et al. (2003): 35% of the pages changed during the investigation; larger web pages change more often.

6 Web based Retrieval: Challenges
The Web is self-organized.
No central authority (for the WWW) or main index
Everyone can add (even edit) pages
Pages disappear on a regular basis: a US study claimed that in 2 investigated tech. journals 50% of the cited links were inaccessible after four years.
Lots of errors and falsehoods, no quality control

7 Web based Retrieval: Challenges
The Web is hyperlinked.
Based on HTML markup tags and URIs
Pages are interconnected
Unidirectional links (in-link, out-link, self-link)
Network structures emerge from the links
Link analysis is possible

8 Common Architecture

9 The World Wide Web ( )
A user's interaction with the web is mediated by (a few) editors and publishers.

10 The World Wide Web Today (2010)
Interaction between individuals and computational systems is mediated by the aggregate behavior of massive numbers (millions) of users.

11 Social Computation influences system properties (X)
X = Findability, X = Utility, X = Navigability, X = Relevance
It is through the process of social computation, i.e. the combination of social behavior and algorithmic computation, that desired and undesired system properties and functions emerge.
Emergent system properties are beyond the direct control of engineers.
New methods and algorithms for designing and shaping social-computational systems are needed.

12 Example: X = Connectivity (of the web graph)
Questions: What is X like? What causes X?
Bow-tie architecture of the web [Broder et al. 2000]

13 Example: X = Connectivity (of the web graph)
Questions:
What is X like? Bow-tie architecture of the web [Broder et al. 2000]
What causes X? Social mechanisms, such as preferential attachment [Barabasi and Albert 1999]
How can we improve X? An open problem
Preferential attachment: the probability of a new vertex attaching to a vertex i with degree k_i is
$P(k_i) = \frac{k_i}{\sum_j k_j}$
where k_i is the degree of vertex i and the denominator is the sum of all vertices' degrees.
[Barabasi and Albert 1999] A.-L. Barabási, R. Albert. Emergence of Scaling in Random Networks. Science, no. 5439, 15 October.
[Broder et al. 2000] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener. Graph structure in the web. In 9th International WWW Conference.
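The preferential attachment rule can be made concrete with a short Python sketch. It is only an illustration under simplifying assumptions: each new vertex adds exactly one edge, and the function names (`attachment_probabilities`, `grow`) and the toy degrees are hypothetical, not taken from the slides or from Barabási and Albert's paper.

```python
import random


def attachment_probabilities(degrees):
    """Return P(k_i) = k_i / sum_j k_j for every vertex i."""
    total = sum(degrees.values())
    return {vertex: k / total for vertex, k in degrees.items()}


def grow(edges, new_vertices, seed=0):
    """Attach each new vertex to one existing vertex, chosen with
    probability proportional to that vertex's current degree."""
    rng = random.Random(seed)
    edges = list(edges)
    for new in new_vertices:
        # Recompute degrees from the current edge list.
        degrees = {}
        for a, b in edges:
            degrees[a] = degrees.get(a, 0) + 1
            degrees[b] = degrees.get(b, 0) + 1
        vertices = list(degrees)
        weights = [degrees[v] for v in vertices]
        target = rng.choices(vertices, weights=weights, k=1)[0]
        edges.append((new, target))
    return edges


# Toy star: "a" holds 3 of the 6 edge endpoints, so P("a") = 3/6.
probs = attachment_probabilities({"a": 3, "b": 1, "c": 1, "d": 1})
grown = grow([("a", "b")], range(10))
```

Because well-connected vertices are more likely to receive new edges, repeated growth tends to concentrate degree on a few hubs.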

14 Analysis of Dynamic Links in Social Tagging Systems
How can navigability in social tagging systems be described and improved?
D. Helic, C. Trattner, M. Strohmaier and K. Andrews. On the Navigability of Social Tagging Systems. The 2nd IEEE International Conference on Social Computing (SocialCom 2010), Minneapolis, Minnesota, USA (acceptance rate 33/245, 13.47%; nominated for Best Paper).

15 Structure of Social Tagging Systems: Definition
A folksonomy is a tuple F := (U, T, R, Y), where the three disjoint, finite sets U, T, R correspond to
a set of persons or users u ∈ U,
a set of tags t ∈ T, and
a set of resources or objects r ∈ R,
and Y ⊆ U × T × R is called the set of tag assignments [Hotho et al. 2006].
Figure: users, tags and resources (e.g. user 1, tag 1, res. 1) linked for navigation.
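As a minimal sketch, the tuple F = (U, T, R, Y) can be represented in Python by storing Y as a set of (user, tag, resource) triples and deriving U, T and R from it. The user, tag and resource names are hypothetical toy data, and the two lookup tables are just one convenient way to obtain the tag-resource links used for navigation.

```python
from collections import defaultdict

# Hypothetical tag assignments Y: each element is a (user, tag, resource) triple.
Y = {
    ("alice", "graphs", "r1"),
    ("alice", "search", "r1"),
    ("bob", "graphs", "r2"),
    ("bob", "web", "r2"),
    ("carol", "search", "r3"),
}

# The three sets of the folksonomy, derived from the assignments.
U = {user for user, _, _ in Y}
T = {tag for _, tag, _ in Y}
R = {resource for _, _, resource in Y}

# Tag-resource links (users projected away), indexed in both directions.
resources_by_tag = defaultdict(set)
tags_by_resource = defaultdict(set)
for _, tag, resource in Y:
    resources_by_tag[tag].add(resource)
    tags_by_resource[resource].add(tag)
```

With these tables, selecting a tag yields its resource list, and selecting a resource yields the tags shown in its tag cloud.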

16 Tag Clouds are Assumed to be Efficient Tools for Navigation
The Navigability Assumption: an implicit assumption among designers of social tagging systems that tag clouds are specifically useful to support navigation. This has hardly been tested or critically reflected on in the past.
Navigating tagging systems via tag clouds (navigating Y ⊆ T × R):
1. The system presents a tag cloud to the user.
2. The user selects a tag from the tag cloud.
3. The system presents a list of resources tagged with the selected tag.
4. The user selects a resource from the list of resources.
5. The system transfers the user to the selected resource, and the process potentially starts anew.

17 Navigability
Informal description: whether and how quickly one can get from document A to document B in a hypertext system (a more precise definition follows later).
Designing for navigability: in traditional hypertext systems, this property used to be within the control of system designers.

18 Defining Navigability
A network is navigable iff there is a path between all or almost all pairs of nodes in the network [Kleinberg 1999].
Formally:
1. There exists a giant component, i.e. a single connected component that accounts for a significant fraction of all nodes: size(gc) > 0.9 * n
2. The effective diameter d_eff is low: d_eff < log n
Here d_eff is the distance at which 90% of pairs of nodes are reachable, and n is the number of nodes in the network.
[Kleinberg 1999] J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing.
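The two conditions can be checked mechanically on a small undirected graph. The following Python sketch rests on several assumptions that the slide does not state: the logarithm is taken to base 2 (as in the log_2(9) examples on the next slides), d_eff is the smallest integer distance covering 90% of connected ordered pairs (no interpolation), and the helper names are hypothetical.

```python
import math
from collections import deque


def undirected(nodes, edges):
    """Build an adjacency dict for an undirected graph."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def giant_component_size(adj):
    """Size of the largest connected component."""
    seen, best = set(), 0
    for v in adj:
        if v not in seen:
            component = bfs_distances(adj, v)
            seen.update(component)
            best = max(best, len(component))
    return best


def effective_diameter(adj, fraction=0.9):
    """Smallest distance within which `fraction` of connected pairs lie."""
    dists = sorted(d for v in adj
                   for w, d in bfs_distances(adj, v).items() if w != v)
    if not dists:
        return math.inf
    return dists[math.ceil(fraction * len(dists)) - 1]


def is_navigable(adj):
    n = len(adj)
    return (giant_component_size(adj) > 0.9 * n
            and effective_diameter(adj) < math.log2(n))


# Two toy graphs on 9 nodes: a star (hub 0) and a simple path.
star = undirected(range(9), [(0, i) for i in range(1, 9)])
path = undirected(range(9), [(i, i + 1) for i in range(8)])
```

Both graphs are connected, but only the star keeps distances below log_2(9), about 3.17; the path has an effective diameter of 6.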

19 Navigability: Examples
Example 1: Not navigable: no giant component.
Example 2: Not navigable: giant component, BUT avg. shortest path > log_2(9).

20 Navigability: Examples
Example 3: Navigable: giant component AND avg. shortest path 2 < log_2(9).
Is this efficiently navigable? There are short paths between all nodes, but can an agent or algorithm find them with local knowledge only?

21 Efficiently navigable
A network is efficiently navigable iff there is an algorithm that can find a short path with only local knowledge, and the delivery time of the algorithm is bounded by a polynomial in log n, i.e. O(log^k n).
Example 4 (nodes A, B, C): efficiently navigable if the algorithm knows it needs to go through A.
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing. Also appears as Cornell Computer Science Technical Report (October 1999).
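To illustrate what "local knowledge only" means, here is a small Python sketch of greedy routing. It is not Kleinberg's lattice model: it assumes nodes sit on a ring of n positions, that each node knows only its own neighbours, and that the ring distance to the target is available as a hint. The names `ring_distance` and `greedy_route` and the single shortcut edge are hypothetical.

```python
def ring_distance(a, b, n):
    """Distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def greedy_route(adj, start, target, n, max_hops=100):
    """Forward to the neighbour closest to the target; return the path,
    or None if no neighbour is closer or the hop budget runs out."""
    path = [start]
    current = start
    while current != target and len(path) <= max_hops:
        # Only the current node's own neighbours are inspected.
        nxt = min(adj[current], key=lambda w: ring_distance(w, target, n))
        if ring_distance(nxt, target, n) >= ring_distance(current, target, n):
            return None  # stuck: no neighbour makes progress
        path.append(nxt)
        current = nxt
    return path if current == target else None


# Ring of 12 nodes with one long-range shortcut between 0 and 6.
n = 12
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
adj[0].add(6)
adj[6].add(0)

route = greedy_route(adj, 0, 7, n)
```

The agent never sees the whole graph; it finds the shortcut from node 0 only because node 6 happens to be the neighbour closest to the target.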

22 Navigability of Social Tagging Systems
But: how do (i) the size of tag clouds and (ii) the number of resources per tag influence the navigability (X_1) of social tagging systems?
Datasets:
Dataset          Annotations    Resources
Austria Forum    32,245         12,837
Bibsonomy        916,[...]      [...],339
CiteULike        6,328,021      1,697,365
Austria Forum: new system, few users. Bibsonomy and CiteULike: established systems, many users.
Navigable in theory (for Y ⊆ T × R): a giant component exists and the effective diameter is low.
Shrinking diameter over time, cf. [Leskovec et al. 2005].
[Leskovec et al. 2005] J. Leskovec, J.M. Kleinberg, C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. KDD 2005.

23 Modeling UI constraints
Tag cloud size n: the number of tags displayed per resource (with a top-n algorithm).
Pagination k: the number of resources displayed per tag (with reverse-chronological ordering).

24 How UI constraints affect Navigability
Tag cloud size: limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does not influence navigability (this is not very surprising).
Pagination: BUT limiting the out-degree of high-frequency tags to k (e.g. through pagination with resources sorted in reverse-chronological order) leaves the network vulnerable to fragmentation. This destroys the navigability of prevalent approaches to tag clouds.
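The fragmentation effect of pagination can be reproduced on a toy tag-resource network. This Python sketch assumes each tag assignment carries a timestamp and that a tag exposes only its k most recent resources; the data, the function names and the choice of "fraction of resources in the largest component" as the measure are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def paginate(assignments, k):
    """Keep, per tag, only the k most recent resources
    (reverse-chronological ordering)."""
    by_tag = defaultdict(list)
    for tag, resource, timestamp in assignments:
        by_tag[tag].append((timestamp, resource))
    kept = []
    for tag, items in by_tag.items():
        for _, resource in sorted(items, reverse=True)[:k]:
            kept.append((tag, resource))
    return kept


def largest_component_fraction(links, resources):
    """Fraction of all resources in the largest connected component
    of the bipartite tag-resource network."""
    adj = defaultdict(set)
    for tag, resource in links:
        adj[("tag", tag)].add(("res", resource))
        adj[("res", resource)].add(("tag", tag))
    seen, best = set(), 0
    for r in resources:
        start = ("res", r)
        if start in seen:
            continue
        seen.add(start)
        queue, count = deque([start]), 0
        while queue:
            node = queue.popleft()
            if node[0] == "res":
                count += 1
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, count)
    return best / len(resources)


# Hypothetical data: a popular tag "web" on six resources, "graph" on two.
resources = ["r1", "r2", "r3", "r4", "r5", "r6"]
assignments = [("web", r, t) for t, r in enumerate(resources, start=1)]
assignments += [("graph", "r5", 5), ("graph", "r6", 6)]

full = largest_component_fraction(paginate(assignments, len(assignments)), resources)
limited = largest_component_fraction(paginate(assignments, 2), resources)
```

With k = 2, the four older resources are no longer linked from any tag, so only r5 and r6 remain connected.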

25 Findings
1. For certain specific, but popular, tag cloud scenarios, the so-called Navigability Assumption does not hold.
2. While we could confirm that tag-resource networks have efficient navigational properties in theory, we found that popular user interface decisions significantly impair navigability.
These results make a theoretical and an empirical argument against existing approaches to tag cloud construction.
How can we recover navigability of social tagging systems?

26 Recovering Navigability in Social Tagging Systems
Instead of reverse-chronological ordering of resources, we apply a naive random ordering.
Based on this observation, we have developed ordering algorithms that balance semantic and navigational aspects, e.g. [Trattner et al. 2010].
[Trattner et al. 2010] C. Trattner, M. Strohmaier, and D. Helic. Improving navigability of hierarchically-structured encyclopedias through effective tag cloud construction. In 10th International Conference on Knowledge Management and Knowledge Technologies (I-KNOW 2010), Graz, Austria.

27 Navigating Networks
How can we model user navigation on networks?

28 Experiment [Milgram]
Goal:
Define a single target person and a group of starting persons.
Generate an acquaintance chain from each starter to the target.
Experimental set-up:
Each starter received a document and was asked to begin moving it by mail toward the target.
Information about the target: name, address, occupation, company, college, year of graduation, wife's name and hometown.
Information about the relationship (friend/acquaintance) [Granovetter 1973].
Constraints:
Starters were only allowed to send the document to people they knew, and were urged to choose the next recipient so as to advance the progress of the document toward the target.

29 Introduction
The simplest way of formulating the small-world problem is: Starting with any two people in the world, what is the likelihood that they will know each other? A somewhat more sophisticated formulation, however, takes account of the fact that while persons X and Z may not know each other directly, they may share a mutual acquaintance, that is, a person who knows both of them. One can then think of an acquaintance chain with X knowing Y and Y knowing Z. Moreover, one can imagine circumstances in which X is linked to Z not by a single link, but by a series of links, X-A-B-C-D...Y-Z. That is to say, person X knows person A, who in turn knows person B, who knows C, ..., who knows Y, who knows Z. [Milgram 1967]

30 An Experimental Study of the Small World Problem [Travers and Milgram 1969]
A social network experiment tailored towards demonstrating, defining, and measuring inter-connectedness in a large society (USA).
A test of the modern idea of six degrees of separation, which states that every person on earth is connected to any other person through a chain of acquaintances not longer than 6.

31 Results I
How many of the starters would be able to establish contact with the target?
64 out of 296 reached the target.
How many intermediaries would be required to link starters with the target?
Well, that depends: the overall mean was 5.2 links.
Through hometown: 6.1 links
Through business: 4.6 links
Boston group faster than Nebraska groups
Nebraska stockholders not faster than Nebraska random
What form would the distribution of chain lengths take?

32 Results III: Common paths
Also see: Gladwell's Law of the Few

33 Follow-up work (2008)
Horvitz and Leskovec study [...] billion conversations among 240 million people on Microsoft Messenger.
Communication graph with 180 million nodes and 1.3 billion undirected edges.
Largest social network constructed and analyzed to date (2008).

34 Decentralized Search
Goal: navigate from START to TARGET using local and background knowledge only.
Background knowledge: a folksonomy (a tag hierarchy).
Idea: use folksonomies as background knowledge. Then, the performance of decentralized search depends on the suitability of the folksonomies. In other words, we can evaluate the suitability of folksonomies (Folksonomy 1, ..., Folksonomy n) for decentralized search through simulations.
Example on a (tag-tag) network:
shortest path found with local knowledge: p_LK = 4
shortest path with global knowledge: p_GK = 3
Δ = p_LK − p_GK
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing. Also appears as Cornell Computer Science Technical Report (October 1999).
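The following Python sketch shows one way such a simulation step could look. It assumes the folksonomy is given as a child-to-parent map over tags and that the agent always moves to the unvisited neighbour whose distance to the target in that hierarchy is smallest; the tags, edges, tie-breaking rule and function names are hypothetical. The toy network is built so that it reproduces the slide's numbers p_LK = 4 and p_GK = 3.

```python
from collections import deque


def ancestors(node, parent):
    """The node followed by its chain of ancestors up to the root."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def tree_distance(a, b, parent):
    """Number of hierarchy edges between a and b."""
    depth_from_b = {n: i for i, n in enumerate(ancestors(b, parent))}
    for i, n in enumerate(ancestors(a, parent)):
        if n in depth_from_b:
            return i + depth_from_b[n]
    return float("inf")


def decentralized_search(adj, parent, start, target, max_hops=20):
    """Greedy walk using only neighbours and the hierarchy; returns the
    path or None if the agent gets stuck or exceeds max_hops."""
    path, visited, current = [start], {start}, start
    while current != target and len(path) - 1 < max_hops:
        candidates = [w for w in adj[current] if w not in visited]
        if not candidates:
            return None
        current = min(candidates,
                      key=lambda w: (tree_distance(w, target, parent), w))
        visited.add(current)
        path.append(current)
    return path if current == target else None


def shortest_path_length(adj, start, target):
    """Global-knowledge baseline: BFS hop count, or None if unreachable."""
    dist, queue = {start: 0}, deque([start])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


# Hypothetical folksonomy (child -> parent) and tag-tag network.
parent = {"science": "root", "art": "root", "physics": "science",
          "biology": "science", "music": "art", "painting": "art"}
edges = [("physics", "science"), ("science", "root"), ("root", "art"),
         ("art", "painting"), ("physics", "biology"),
         ("biology", "music"), ("music", "painting")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

found = decentralized_search(adj, parent, "physics", "painting")
p_lk = len(found) - 1
p_gk = shortest_path_length(adj, "physics", "painting")
delta = p_lk - p_gk
```

Here the hierarchy steers the agent up through "science" and "root", which is one hop longer than the shortcut via "biology" and "music" that only global knowledge reveals.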

35 Evaluating Hierarchical Structures in Networks
How can we measure the efficiency of hierarchical structures for navigation?

36 The World Wide Web ( )
How efficient is this as a navigational aid?

37 Construction of hierarchies from unstructured tagging data
From tag centrality to tag generality:
high tag centrality: more abstract
low tag centrality: more specific
Other existing folksonomy algorithms: k-means, affinity propagation, [Heymann and Garcia-Molina 2006]

38 Evaluation Framework
Input: Folksonomy 1, ..., Folksonomy n
Decentralized search simulation: performance evaluation (which folksonomy performs best on a given navigational task)
Click data: explanatory evaluation (which folksonomy explains actual user behavior best)

39 Success Rates Across Different Folksonomies
Flickr dataset. Compared: tag generality approaches, k-means / affinity propagation, and a random folksonomy.
Success rate: the number of times an agent is successful in finding a path using a particular folksonomy as background knowledge.
Max hops n: the maximal number of steps an agent is allowed to perform before stopping (a tunable parameter, e.g. an agent only follows n links).
All approaches outperform a random folksonomy.
Tag generality approaches outperform k-means / affinity propagation.

40 Success Rates Across Different Datasets
Holds for all datasets (to different extents).
But how efficient are those folksonomies during search?
Efficiency: how often does an agent not find the global shortest path, but some other, longer path?

41 Stretch Δ = p_LK − p_GK
Shortest paths found with local knowledge (Bibsonomy, k-means):
Finds no path: Δ = infinite
Finds a path that is +1 longer: Δ = 1
Finds the shortest possible path: Δ = 0
Holds for all datasets (to different extents).
Tag generality approaches (d+e) find much shorter paths!
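Success rate and stretch can be aggregated over many simulated searches with a few lines of Python. This sketch assumes each run is recorded as a pair (p_LK, p_GK), with p_LK set to None when the agent found no path, and it reports the success rate as a fraction of runs rather than a raw count; the run data and function names are hypothetical.

```python
import math
from collections import Counter


def stretch(p_lk, p_gk):
    """Delta = p_LK - p_GK; infinite when no path was found."""
    return math.inf if p_lk is None else p_lk - p_gk


def summarize(runs):
    """Return (success rate, histogram of stretch values)."""
    deltas = [stretch(p_lk, p_gk) for p_lk, p_gk in runs]
    successes = sum(1 for d in deltas if d != math.inf)
    return successes / len(runs), Counter(deltas)


# Hypothetical outcomes of four search runs: (p_LK, p_GK).
runs = [(3, 3), (4, 3), (None, 2), (5, 5)]
success_rate, histogram = summarize(runs)
```

A folksonomy that supports navigation well pushes the histogram's mass toward Δ = 0 and keeps the Δ = infinite bucket small.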

42 Conclusions
Decentralized search is a natural model of user navigation on the web.
The emergence of dynamic, user-generated links reduces control.
Empirical studies and new algorithms are needed to recover important system properties.

43 End of Presentation
Acknowledgements


More information

Graph Mining and Social Network Analysis

Graph Mining and Social Network Analysis Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References q Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann

More information

beyond social networks

beyond social networks beyond social networks Small world phenomenon: high clustering C network >> C random graph low average shortest path l network ln( N)! neural network of C. elegans,! semantic networks of languages,! actor

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/4/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

More information

Tie strength, social capital, betweenness and homophily. Rik Sarkar

Tie strength, social capital, betweenness and homophily. Rik Sarkar Tie strength, social capital, betweenness and homophily Rik Sarkar Networks Position of a node in a network determines its role/importance Structure of a network determines its properties 2 Today Notion

More information

Wednesday, March 8, Complex Networks. Presenter: Jirakhom Ruttanavakul. CS 790R, University of Nevada, Reno

Wednesday, March 8, Complex Networks. Presenter: Jirakhom Ruttanavakul. CS 790R, University of Nevada, Reno Wednesday, March 8, 2006 Complex Networks Presenter: Jirakhom Ruttanavakul CS 790R, University of Nevada, Reno Presented Papers Emergence of scaling in random networks, Barabási & Bonabeau (2003) Scale-free

More information

Complex networks: A mixture of power-law and Weibull distributions

Complex networks: A mixture of power-law and Weibull distributions Complex networks: A mixture of power-law and Weibull distributions Ke Xu, Liandong Liu, Xiao Liang State Key Laboratory of Software Development Environment Beihang University, Beijing 100191, China Abstract:

More information

Lesson 4. Random graphs. Sergio Barbarossa. UPC - Barcelona - July 2008

Lesson 4. Random graphs. Sergio Barbarossa. UPC - Barcelona - July 2008 Lesson 4 Random graphs Sergio Barbarossa Graph models 1. Uncorrelated random graph (Erdős, Rényi) N nodes are connected through n edges which are chosen randomly from the possible configurations 2. Binomial

More information

Minimizing the Spread of Contamination by Blocking Links in a Network

Minimizing the Spread of Contamination by Blocking Links in a Network Minimizing the Spread of Contamination by Blocking Links in a Network Masahiro Kimura Deptartment of Electronics and Informatics Ryukoku University Otsu 520-2194, Japan kimura@rins.ryukoku.ac.jp Kazumi

More information

An Empirical Validation of Growth Models for Complex Networks

An Empirical Validation of Growth Models for Complex Networks An Empirical Validation of Growth Models for Complex Networks Alan Mislove, Hema Swetha Koppula, Krishna P. Gummadi, Peter Druschel, and Bobby Bhattacharjee 1 Introduction Complex networks arise in a variety

More information

On Degree-Based Decentralized Search in Complex Networks

On Degree-Based Decentralized Search in Complex Networks 1 On Degree-Based Decentralized Search in Complex Networks Shi Xiao Gaoxi Xiao Division of Communication Engineering School of Electrical and Electronic Engineering Nanyang technological University, Singapore

More information

Graph Theory and Network Measurment

Graph Theory and Network Measurment Graph Theory and Network Measurment Social and Economic Networks MohammadAmin Fazli Social and Economic Networks 1 ToC Network Representation Basic Graph Theory Definitions (SE) Network Statistics and

More information

Example 1: An algorithmic view of the small world phenomenon

Example 1: An algorithmic view of the small world phenomenon Lecture Notes: Social Networks: Models, Algorithms, and Applications Lecture 1: Jan 17, 2012 Scribes: Preethi Ambati and Azar Aliyev Example 1: An algorithmic view of the small world phenomenon The story

More information

Drawing power law graphs

Drawing power law graphs Drawing power law graphs Reid Andersen Fan Chung Linyuan Lu Abstract We present methods for drawing graphs that arise in various information networks. It has been noted that many realistic graphs have

More information

Degree Distribution: The case of Citation Networks

Degree Distribution: The case of Citation Networks Network Analysis Degree Distribution: The case of Citation Networks Papers (in almost all fields) refer to works done earlier on same/related topics Citations A network can be defined as Each node is a

More information

Small-world networks

Small-world networks Small-world networks c A. J. Ganesh, University of Bristol, 2015 Popular folklore asserts that any two people in the world are linked through a chain of no more than six mutual acquaintances, as encapsulated

More information

Ermergent Semantics in BibSonomy

Ermergent Semantics in BibSonomy Ermergent Semantics in BibSonomy Andreas Hotho Robert Jäschke Christoph Schmitz Gerd Stumme Applications of Semantic Technologies Workshop. Dresden, 2006-10-06 Agenda Introduction Folksonomies BibSonomy

More information

Do TREC Web Collections Look Like the Web?

Do TREC Web Collections Look Like the Web? Do TREC Web Collections Look Like the Web? Ian Soboroff National Institute of Standards and Technology Gaithersburg, MD ian.soboroff@nist.gov Abstract We measure the WT10g test collection, used in the

More information

Large Scale Graph Algorithms

Large Scale Graph Algorithms Large Scale Graph Algorithms A Guide to Web Research: Lecture 2 Yury Lifshits Steklov Institute of Mathematics at St.Petersburg Stuttgart, Spring 2007 1 / 34 Talk Objective To pose an abstract computational

More information

COMMUNITY SHELL S EFFECT ON THE DISINTEGRATION OF SOCIAL NETWORKS

COMMUNITY SHELL S EFFECT ON THE DISINTEGRATION OF SOCIAL NETWORKS Annales Univ. Sci. Budapest., Sect. Comp. 43 (2014) 57 68 COMMUNITY SHELL S EFFECT ON THE DISINTEGRATION OF SOCIAL NETWORKS Imre Szücs (Budapest, Hungary) Attila Kiss (Budapest, Hungary) Dedicated to András

More information

A STUDY ON THE EVOLUTION OF THE WEB

A STUDY ON THE EVOLUTION OF THE WEB A STUDY ON THE EVOLUTION OF THE WEB Alexandros Ntoulas, Junghoo Cho, Hyun Kyu Cho 2, Hyeonsung Cho 2, and Young-Jo Cho 2 Summary We seek to gain improved insight into how Web search engines should cope

More information

AS Connectedness Based on Multiple Vantage Points and the Resulting Topologies

AS Connectedness Based on Multiple Vantage Points and the Resulting Topologies AS Connectedness Based on Multiple Vantage Points and the Resulting Topologies Steven Fisher University of Nevada, Reno CS 765 Steven Fisher (UNR) CS 765 CS 765 1 / 28 Table of Contents 1 Introduction

More information

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul 1 CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul Introduction Our problem is crawling a static social graph (snapshot). Given

More information

Nick Hamilton Institute for Molecular Bioscience. Essential Graph Theory for Biologists. Image: Matt Moores, The Visible Cell

Nick Hamilton Institute for Molecular Bioscience. Essential Graph Theory for Biologists. Image: Matt Moores, The Visible Cell Nick Hamilton Institute for Molecular Bioscience Essential Graph Theory for Biologists Image: Matt Moores, The Visible Cell Outline Core definitions Which are the most important bits? What happens when

More information

(Social) Networks Analysis III. Prof. Dr. Daning Hu Department of Informatics University of Zurich

(Social) Networks Analysis III. Prof. Dr. Daning Hu Department of Informatics University of Zurich (Social) Networks Analysis III Prof. Dr. Daning Hu Department of Informatics University of Zurich Outline Network Topological Analysis Network Models Random Networks Small-World Networks Scale-Free Networks

More information

Overview of Web Mining Techniques and its Application towards Web

Overview of Web Mining Techniques and its Application towards Web Overview of Web Mining Techniques and its Application towards Web *Prof.Pooja Mehta Abstract The World Wide Web (WWW) acts as an interactive and popular way to transfer information. Due to the enormous

More information

Peer-to-Peer Networks 15 Self-Organization. Christian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg

Peer-to-Peer Networks 15 Self-Organization. Christian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg Peer-to-Peer Networks 15 Self-Organization Christian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg Gnutella Connecting Protokoll - Ping Ping participants query

More information

Economics of Information Networks

Economics of Information Networks Economics of Information Networks Stephen Turnbull Division of Policy and Planning Sciences Lecture 4: December 7, 2017 Abstract We continue discussion of the modern economics of networks, which considers

More information

CHAPTER THREE INFORMATION RETRIEVAL SYSTEM

CHAPTER THREE INFORMATION RETRIEVAL SYSTEM CHAPTER THREE INFORMATION RETRIEVAL SYSTEM 3.1 INTRODUCTION Search engine is one of the most effective and prominent method to find information online. It has become an essential part of life for almost

More information

Flat Routing on Curved Spaces

Flat Routing on Curved Spaces Flat Routing on Curved Spaces Dmitri Krioukov (CAIDA/UCSD) dima@caida.org Berkeley April 19 th, 2006 Clean slate: reassess fundamental assumptions Information transmission between nodes in networks that

More information

An Introduction to Search Engines and Web Navigation

An Introduction to Search Engines and Web Navigation An Introduction to Search Engines and Web Navigation MARK LEVENE ADDISON-WESLEY Ал imprint of Pearson Education Harlow, England London New York Boston San Francisco Toronto Sydney Tokyo Singapore Hong

More information

Response Network Emerging from Simple Perturbation

Response Network Emerging from Simple Perturbation Journal of the Korean Physical Society, Vol 44, No 3, March 2004, pp 628 632 Response Network Emerging from Simple Perturbation S-W Son, D-H Kim, Y-Y Ahn and H Jeong Department of Physics, Korea Advanced

More information

HYBRIDIZED MODEL FOR EFFICIENT MATCHING AND DATA PREDICTION IN INFORMATION RETRIEVAL

HYBRIDIZED MODEL FOR EFFICIENT MATCHING AND DATA PREDICTION IN INFORMATION RETRIEVAL International Journal of Mechanical Engineering & Computer Sciences, Vol.1, Issue 1, Jan-Jun, 2017, pp 12-17 HYBRIDIZED MODEL FOR EFFICIENT MATCHING AND DATA PREDICTION IN INFORMATION RETRIEVAL BOMA P.

More information

Link Analysis: Web Structure and Search

Link Analysis: Web Structure and Search Link Analysis: Web Structure and Search Web Science (VU) (706716) Elisabeth Lex ISDS, TU Graz June 12, 2017 Elisabeth Lex (ISDS, TU Graz) Links June 12, 2017 1 / 69 Outline 1 Information Networks 2 Paths

More information

External influence on Bitcoin trust network structure

External influence on Bitcoin trust network structure External influence on Bitcoin trust network structure Edison Alejandro García, garcial@stanford.edu SUNetID: 6675 CS4W - Analysis of Networks Abstract Networks can express how much people trust or distrust

More information

Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing

Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing Gautam Bhat, Rajeev Kumar Singh Department of Computer Science and Engineering Shiv Nadar University Gautam Buddh Nagar,

More information

CSCI5070 Advanced Topics in Social Computing

CSCI5070 Advanced Topics in Social Computing CSCI5070 Advanced Topics in Social Computing Irwin King The Chinese University of Hong Kong king@cse.cuhk.edu.hk!! 2012 All Rights Reserved. Outline Graphs Origins Definition Spectral Properties Type of

More information

Network Theory: Social, Mythological and Fictional Networks. Math 485, Spring 2018 (Midterm Report) Christina West, Taylor Martins, Yihe Hao

Network Theory: Social, Mythological and Fictional Networks. Math 485, Spring 2018 (Midterm Report) Christina West, Taylor Martins, Yihe Hao Network Theory: Social, Mythological and Fictional Networks Math 485, Spring 2018 (Midterm Report) Christina West, Taylor Martins, Yihe Hao Abstract: Comparative mythology is a largely qualitative and

More information

Alain Barrat CPT, Marseille, France ISI, Turin, Italy

Alain Barrat CPT, Marseille, France ISI, Turin, Italy Link creation and profile alignment in the anobii social network Alain Barrat CPT, Marseille, France ISI, Turin, Italy Social networks Huge field of research Data: mostly small samples, surveys Multiplexity

More information

An Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network. Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu

An Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network. Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu An Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu ACM SAC 2010 outline Social network Definition and properties Social

More information

Models and Algorithms for Complex Networks

Models and Algorithms for Complex Networks Models and Algorithms for Complex Networks with network with parametrization, elements maintaining typically characteristic local profiles with categorical attributes [C. Faloutsos MMDS8] Milena Mihail

More information

Web 2.0 Social Data Analysis

Web 2.0 Social Data Analysis Web 2.0 Social Data Analysis Ing. Jaroslav Kuchař jaroslav.kuchar@fit.cvut.cz Structure(1) Czech Technical University in Prague, Faculty of Information Technologies Software and Web Engineering 2 Contents

More information

Efficient, Scalable, and Provenance-Aware Management of Linked Data

Efficient, Scalable, and Provenance-Aware Management of Linked Data Efficient, Scalable, and Provenance-Aware Management of Linked Data Marcin Wylot 1 Motivation and objectives of the research The proliferation of heterogeneous Linked Data on the Web requires data management

More information

Beyond Ten Blue Links Seven Challenges

Beyond Ten Blue Links Seven Challenges Beyond Ten Blue Links Seven Challenges Ricardo Baeza-Yates VP of Yahoo! Research for EMEA & LatAm Barcelona, Spain Thanks to Andrei Broder, Yoelle Maarek & Prabhakar Raghavan Agenda Past and Present Wisdom

More information

The Web: Concepts and Technology. 1 CS 584: Information Retrieval. Math & Computer Science Department, Emory University

The Web: Concepts and Technology. 1 CS 584: Information Retrieval. Math & Computer Science Department, Emory University The Web: Concepts and Technology January 15: Course Overview 1 CS 584: Information Retrieval. Math & Computer Science Department, Emory University Today s Plan Who am I? What is this course about? Logistics

More information

CS-E5740. Complex Networks. Scale-free networks

CS-E5740. Complex Networks. Scale-free networks CS-E5740 Complex Networks Scale-free networks Course outline 1. Introduction (motivation, definitions, etc. ) 2. Static network models: random and small-world networks 3. Growing network models: scale-free

More information

Accessing Web Archives

Accessing Web Archives Accessing Web Archives Web Science Course 2017 Helge Holzmann 05/16/2017 Helge Holzmann (holzmann@l3s.de) Not today s topic http://blog.archive.org/2016/09/19/the-internet-archive-turns-20/ 05/16/2017

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 3/6/2012 Jure Leskovec, Stanford CS246: Mining Massive Datasets, http://cs246.stanford.edu 2 In many data mining

More information