Distributed Data Management. Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig


1 Distributed Data Management Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig

2 7.0 Network Models. Outline: 7.0 Introduction; 7.1 Graph Model Basics; 7.2 Random Graph Models; 7.3 Small-World Graph Models; 7.4 Scale-Free Graph Models; 7.5 Network Examples; 7.6 Network Models in P2P. Distributed Data Management Christoph Lofi IfIS TU Braunschweig

3 7.0 Network Models. Basic motivation for this lecture: Can we show that a given P2P network really has some desired properties? How can a P2P network be designed so that it will, with high probability, show those desired properties? Large P2P networks are hard to evaluate: in the production phase, usually no global view of the network is available; in the design phase, no large number of peers is available.

4 7.0 Network Models. Desirable system properties for P2P. Decentralized and self-organized network: no single point of failure or central bottleneck; maintaining the network (joining, leaving, publishing new content) should be performed without any central authority or global view. Scalability: the network should scale for any (possibly large) number of nodes; the structure of the network supports searching and retrieving information efficiently, an obvious demand in information exchange systems. Reliability despite dynamic changes: the network should be robust with respect to network and node failures. Book: P2P Systems and applications, pp

5 7.0 Network Models. To examine the properties of a P2P network, good models are needed. In this lecture, we focus on graph models for unstructured P2P networks. This allows easy statistical analysis of network properties: peers are represented by vertices in a graph; entries in routing tables are represented by edges of the graph; peers are ego-centered and do not have global knowledge about all other peers and the data stored at those peers. More complex P2P network protocols require dynamic simulation of networks to evaluate their properties.

6 7.0 Network Models. Outline for this lecture. Network graph basics: How can P2P networks be represented as graphs? Which properties can network graphs have? What are desirable properties for a P2P graph? Network models: many different network models have been studied during the last 60 years; some of them are useful to evaluate or design P2P networks.

7 7.0 Network Models. Random networks: a simple network model to represent pure P2P networks like Gnutella. Small-world networks: naturally occurring networks showing very desirable properties which can be exploited by P2P systems. Scale-free networks: naturally occurring networks in large infrastructures, e.g. the Internet or power grids.

8 7.1 Graph Theory. A directed graph $G$ is defined as $G = (V, E)$. $V$: a set of nodes or vertices. $E$: a set of directed edges between elements of $V$, $E \subseteq V \times V$. For P2P networks, $V$ represents the set of peers, $|V| = n$, and $E$ represents all directed links in the P2P overlay network, i.e. the union of the entries in the routing tables of all peers, $|E| = m$. If later examples use undirected links, it is assumed that directed links in both directions exist.

9 7.1 Graph Theory. The node outdegree of a node $v$ is denoted $\deg^+(v)$, i.e. the number of vertices $w$ it is connected to by an edge $(v, w)$: $\deg^+(v) = |N(v)| = |\{w \in V \mid (v, w) \in E\}|$. The node indegree of a node $v$ is denoted $\deg^-(v)$, i.e. the number of vertices $w$ that are connected to $v$ by an edge $(w, v)$: $\deg^-(v) = |\{w \in V \mid (w, v) \in E\}|$. The node degree of a node $v$ is denoted $\deg(v)$: $\deg(v) = \deg^+(v) + \deg^-(v)$. For undirected graphs, only the node degree is defined, with no in- or outdegree. The neighbor set of a node $v$ is denoted $N(v)$: $N(v) = \{w \in V \mid (v, w) \in E\}$. For every neighbor $w \in N(v)$ there exists an edge $(v, w) \in E$.

10 7.1 Graph Theory. Example: $V = \{1, 2, 3, 4, 5\}$, $E = \{(1,5), (5,4), (4,5), (2,4), (2,1), (2,3)\}$. $N(2) = \{1, 3, 4\}$, $N(4) = \{5\}$. $\deg^+(2) = 3$, $\deg^-(2) = 0$. $\deg^+(4) = 1$, $\deg^-(4) = 2$.
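The degree and neighbor-set definitions can be checked mechanically on this example graph. The following is a minimal Python sketch; the function names are illustrative and not part of the lecture.

```python
# Example graph from the slide: vertices 1..5 and six directed edges.
V = {1, 2, 3, 4, 5}
E = {(1, 5), (5, 4), (4, 5), (2, 4), (2, 1), (2, 3)}

def neighbors(v):
    """N(v): all w such that the directed edge (v, w) is in E."""
    return {w for (u, w) in E if u == v}

def out_degree(v):
    """deg+(v): number of outgoing edges of v, equal to |N(v)|."""
    return len(neighbors(v))

def in_degree(v):
    """deg-(v): number of incoming edges of v."""
    return sum(1 for (_, w) in E if w == v)

def degree(v):
    """deg(v) = deg+(v) + deg-(v)."""
    return out_degree(v) + in_degree(v)

for v in sorted(V):
    print(v, sorted(neighbors(v)), out_degree(v), in_degree(v), degree(v))
```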

11 7.1 Graph Theory. Path $P(v, w)$: a path $P(v, w)$ is a sequence of vertices $(v_0, v_1, \dots, v_k)$ with $v_0 = v$, $v_k = w$, and $(v_i, v_{i+1}) \in E$ for all $0 \le i \le k-1$. The path length $|P(v, w)|$ is defined as the number of edges in path $P$. The distance $d(v, w)$ is defined as the length of a shortest path between $v$ and $w$. (Figure: a path between $v$ and $w$ with length 6 and a shortest path between $v$ and $w$ with length 4; thus, the distance between $v$ and $w$ is 4.)

12 7.1 Graph Theory. Metrics describing whole graphs. Connectedness: a graph is connected if there is a path from any node to any other node. k-connectedness: a graph is k-connected if the removal of any $k-1$ nodes still leaves the graph connected. Bisection width $bsw(G)$: the bisection width of a graph $G$ is the minimal number of edges which must be removed to split the graph into two equally sized unconnected subgraphs; it represents the minimal cohesion of the graph.

13 7.1 Graph Theory. Graph diameter $d(G)$: represents the maximum extent (path length) of a graph. The diameter of a graph is the maximal distance between any pair of vertices: $d(G) = \max \{ d(v, w) : v, w \in V \}$. Average path length $d_{avg}(G)$: the sum of all distances between each pair of nodes divided by the number of all pairs of nodes in a connected graph: $d_{avg}(G) = \frac{1}{n(n-1)} \sum_{(i,j) \in V \times V} d(i, j)$.
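For small graphs, distance, diameter, and average path length can be computed directly with breadth-first search. The sketch below assumes a connected graph given as an adjacency dictionary; the helper names and the four-node path graph are illustrative choices, not taken from the lecture.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return d(source, w) for every vertex w reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter(adj):
    """d(G): the largest distance between any pair of vertices."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)

def average_path_length(adj):
    """d_avg(G): sum of all pairwise distances divided by n(n-1)."""
    n = len(adj)
    total = sum(sum(bfs_distances(adj, v).values()) for v in adj)
    return total / (n * (n - 1))

# Hypothetical example: an undirected path 0 - 1 - 2 - 3,
# stored with links in both directions.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(diameter(path_graph), average_path_length(path_graph))
```

Running one breadth-first search per vertex is fine for graphs of the size shown on the slides; very large networks usually need sampling or approximation instead.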

14 7.1 Graph Theory. Graph outdegree $\deg^+(G)$: the average outdegree of all nodes of $G$. Graph indegree $\deg^-(G)$: the average indegree of all nodes of $G$. For undirected graphs, there is just the degree $\deg(G)$: the average degree of all nodes.

15 7.1 Graph Theory. The clustering coefficient $C(v)$ of a vertex $v$ in a directed graph is given by the number of links between the vertices within its neighborhood divided by the number of links that could possibly exist between them. The number of neighbors of $v$ is $\deg^+(v)$. The maximum number of connections between all neighboring nodes is $\deg^+(v) (\deg^+(v) - 1)$, i.e. each neighbor is connected with each other neighbor. The clustering coefficient describes how densely the neighbors of a vertex are interconnected.

16 7.1 Graph Theory. If $e(N(v))$ denotes the actual number of connections that the neighbors of $v$ have with each other, the clustering coefficient is $C(v) = \frac{e(N(v))}{\deg^+(v) (\deg^+(v) - 1)}$. Wasserman, S., and Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge: Cambridge University Press. (Figure: three example neighborhoods of a vertex $v$ with $C(v) = 0$, $C(v) = 0.66$, and $C(v) = 1$; the numerator counts the links between neighbors of $v$, the denominator is the maximum number of neighbor links, with 4 neighbors having at most 3 links each.)
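The clustering coefficient formula translates almost literally into code. The sketch below stores the directed graph as a dictionary of out-neighbor sets, with undirected links entered in both directions; returning 0.0 for vertices with fewer than two neighbors is a convention assumed here, since the formula is undefined in that case.

```python
def clustering_coefficient(adj, v):
    """C(v) = e(N(v)) / (deg+(v) * (deg+(v) - 1)) for a directed graph."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention: the formula has a zero denominator here
    # e(N(v)): directed links whose endpoints are both neighbors of v
    links = sum(1 for a in nbrs for b in adj[a] if b in nbrs and b != a)
    return links / (k * (k - 1))

# Hypothetical examples around vertex 0 with neighbors 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
one_link = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
clique = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}

for name, g in [("star", star), ("one_link", one_link), ("clique", clique)]:
    print(name, clustering_coefficient(g, 0))
```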

17 7.1 Graph Theory. Which properties should a good P2P graph have? Connectedness: each node should be reachable; if not, some information is not accessible to all peers. k-connectedness with large k: removing nodes should not immediately disconnect the graph. Low diameter $d(G)$: low diameters are necessary to ensure reachability and reduce message load; with a low diameter, a smaller TTL is possible when flooding.

18 7.1 Graph Theory. Low average path length $d_{avg}(G)$: most messages should quickly reach their target. Low average node degree $\deg(G)$: the higher the node degree is, the more node state must be stored at nodes, which increases the size of routing tables. High average cluster coefficient: densely connected neighborhoods increase the failure resilience of networks and make distributed routing possible (see later: Kleinberg model).

19 7.2 Random Graphs. Random graphs provide the easiest model for any network: simple underlying assumptions, analyzable with statistical methods, and the first family of network models studied (1950s). Multiple models for generating a random graph have been developed. The most prominent generation models are the Erdős-Rényi random graph and the Gilbert random graph.

20 7.2 Random Graphs. A random graph is usually denoted as $g_{n,m}$: a random graph with $n$ nodes and $m$ edges. For simplicity, we just consider undirected graphs. Basic idea for constructing a random graph: graph construction starts with $n$ vertices without any connections; $m$ edges are then added one by one between the vertices using some random process.

21 7.2 Random Graphs. Pure peer-to-peer networks like Gnutella 0.4 can be modeled by a random graph: peers choose their neighbors more or less randomly (random bootstrapping, random Ping-Pong). Unfortunately, real Gnutella 0.4 networks are usually not really random. Bootstrapping is not random: special bootstrap nodes or bootstrap caches are used. Ping-Pong strengthens the connectedness of the neighborhood and favors strong nodes: nodes prefer more popular and stronger nodes (see later: scale-free networks).

22 7.2 Random Graphs. The behavior of random graphs is often studied for cases where the number of vertices diverges to infinity, i.e. $n \to \infty$. In the context of P2P, think of scalability! While the number $m$ of edges could be fixed, it is usually assumed that $m$ grows with $n$, e.g. new nodes in a P2P network will also lead to new connections. A fixed $m$ would quickly lead to mostly unconnected graphs. Thus, $m$ is usually a function of $n$.

23 7.2 Erdős-Rényi Graphs. Erdős-Rényi graphs are the most popular family of random graphs (1959). There are two predominant models, which are equivalent for large graphs. $g_{n,m}$ models are based on randomly selecting an instance from all graphs with $n$ nodes and $m$ edges. In $g_{n,p}$ models, each possible edge is added to the graph with a certain probability $p$; these are also known as Gilbert graphs (1959).

24 7.2 Erdős-Rényi Graphs. Constructing $g_{n,m}$ graphs: let $G_{n,m}$ be the set of all labeled graphs with $n$ nodes and $m$ edges. In labeled graphs, nodes are identifiable; unlabeled random graphs only consider the shape of graphs. The number of possible edges between $n$ nodes is $N = \binom{n}{2}$. The number of all such graphs is given by the binomial coefficient $|G_{n,m}| = \binom{N}{m} = \binom{\binom{n}{2}}{m}$. For generating an instance $g_{n,m}$, any instance of $G_{n,m}$ is selected with equal probability. Erdős, P.; Rényi, A. (1959). "On Random Graphs. I.". Publicationes Mathematicae 6.

25 7.2 Erdős-Rényi Graphs. Example: constructing $g_{3,2}$ graphs. There are 3 possible graphs $g_{3,2}$ in $G_{3,2}$. Each graph is selected with probability $\frac{1}{3}$.
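The count of labeled graphs with $n$ nodes and $m$ edges is a binomial coefficient over the number of possible edges, which is easy to evaluate exactly. The sketch below uses Python's `math.comb`; the function name is illustrative.

```python
from math import comb

def num_labeled_graphs(n, m):
    """Number of labeled undirected graphs with n nodes and m edges."""
    possible_edges = comb(n, 2)  # N = n choose 2
    return comb(possible_edges, m)

# The small example: 3 nodes and 2 edges give 3 graphs,
# so each one is selected with probability 1/3.
print(num_labeled_graphs(3, 2))

# The count explodes quickly; print only its number of decimal digits.
print(len(str(num_labeled_graphs(150, 75))))
```

The second call illustrates why enumerating all graphs and picking one uniformly is not practical for large $n$ and $m$.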

26 7.2 Erdős-Rényi Graphs. The $g_{n,m}$ model of random graphs is not suitable for actually generating large random graphs: the number of possible graphs for given $n$ and $m$ is extremely high. (Figure: plot with axes #Nodes and #Edges.)

27 7.2 Gilbert Model. For generative models, use the probabilistic $g_{n,p}$ model of Erdős-Rényi random graphs, the so-called Gilbert graphs. Gilbert, E.N. (1959). "Random Graphs". Annals of Mathematical Statistics 30. The number of nodes $n$ is fixed. Each possible edge in $V \times V$ has the fixed probability $p$ of being added to the graph, i.e. the underlying assumption is that adding an edge is fully independent of all existing edges. A larger $p$ will generate graphs with more edges, a smaller $p$ will generate graphs with fewer edges.
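Generating a $g_{n,p}$ graph needs only one independent coin flip per possible edge. The following sketch assumes undirected graphs without self-loops, so it iterates over unordered vertex pairs; the function name and the `seed` parameter are illustrative.

```python
import random
from itertools import combinations

def gilbert_graph(n, p, seed=None):
    """Return the edge list of a g(n, p) graph on vertices 0..n-1."""
    rng = random.Random(seed)
    # Each of the n-choose-2 possible edges is kept independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = gilbert_graph(150, 1 / 150, seed=42)
print(len(edges))  # fluctuates around the expected value 74.5 for other seeds
```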

28 7.2 Gilbert Model. Both models $g_{n,m}$ and $g_{n,p}$ behave asymptotically equivalently for large $n$. The expected number of edges is $m = \binom{n}{2} p$. The law of large numbers guarantees equivalence for $p n^2 \to \infty$. Thus, for large $p n^2$, statements about properties can be made like: property $\mathcal{P}$ holds for most graphs in $g_{n,p}$; property $\mathcal{P}$ holds for most graphs in $g_{n,m}$ with $m = \binom{n}{2} p$.

29 7.2 Random Graph Properties. Randomly generated graphs can be used to approximate properties of large random P2P networks. Many basic properties of random graphs were established by Erdős & Rényi in 1960 using large $g_{n,p}$ with $n \to \infty$ (asymptotic observations). Many properties depend directly on the probability $p$ (or the number of edges $m$). Graphs show several phase transitions depending on the node/edge ratio. Each phase transition has a threshold at which certain properties suddenly become extremely probable. Before or after the threshold, the probability of a property $\mathcal{P}$ is either $P(\mathcal{P}) \to 0$ or $P(\mathcal{P}) \to 1$ for $n \to \infty$.

30 7.2 Random Graph Properties. Predicting connected components: for $n \cdot p < 1$, a $g_{n,p}$ graph will rarely have any connected component larger than $O(\log n)$. The graph is mainly unconnected, and each of its components is very small. E.g. for a graph $g_{n,m}$ with 150 nodes, this threshold is roughly around 74 edges.

31 7.2 Random Graph Properties. Example graphs. Statistical prediction: most components will be of logarithmic size with respect to the number of nodes (i.e. will be small). (Figures: example graphs $g_{150,25}$ and $g_{150,50}$.)

32 7.2 Random Graph Properties. Giant connected component: for $n \cdot p = 1$, a graph $g_{n,p}$ will very probably have a giant connected component of size in $O(n^{2/3})$. E.g. for a graph $g_{n,m}$ with 150 nodes, giant components should be observable for 75 edges and more. Surprisingly, the giant component appears when the average node degree is 1! For $n \cdot p > 1$, all other components will be of size $O(\log n)$.

33 7.2 Random Graph Properties. Example graphs (giant component appears). Statistical prediction: for $m = 75$ ($n \cdot p = 1$), there is a largest component of size 28. (Figures: example graphs $g_{150,m=75}$ and $g_{150,m=100}$.)

34 7.2 Random Graph Properties. Example graphs (other components diminish). Statistical prediction: no other component will be large. (Figures: example graphs $g_{150,m=150}$ and $g_{150,m=300}$.)

35 7.2 Random Graph Properties. Connectedness: for $p < \frac{\ln n}{n}$, the graph will almost surely contain isolated vertices and will thus be disconnected. For $p > \frac{\ln n}{n}$, the graph will almost surely be connected. E.g. for a graph $g_{n,m}$ with 150 nodes, this threshold is around 374 edges.
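The edge counts quoted for 150 nodes follow from converting each threshold on $p$ into an expected number of edges, $m = \binom{n}{2} p$. The sketch below redoes that arithmetic; the variable and function names are illustrative.

```python
from math import comb, log

def expected_edges(n, p):
    """Expected number of edges of g(n, p): (n choose 2) * p."""
    return comb(n, 2) * p

n = 150
# Giant-component threshold n * p = 1, i.e. p = 1 / n.
giant_component_edges = expected_edges(n, 1 / n)
# Connectivity threshold p = ln(n) / n.
connectivity_edges = expected_edges(n, log(n) / n)

print(round(giant_component_edges, 1))  # about 74.5 edges
print(round(connectivity_edges, 1))     # about 373.3 edges
```

Both values agree with the rounded figures of roughly 74 to 75 edges and around 374 edges used in the examples.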

36 7.2 Random Graph Properties. Example graphs (connectedness). Statistical prediction: for $p = \frac{\ln n}{n}$, the graph is almost surely connected. (Figure: example graph $g_{150,374}$, i.e. $g_{150,p=0.033}$.)

37 7.2 Random Graph Properties. Degree distribution: the node degree of large random graphs can be modeled with a Poisson distribution. Let $\lambda$ be a constant with $\lambda = (n-1) p$. Then the probability distribution of the node degrees $k = 0, 1, 2, 3, 4, \dots$ can be approximated for $n \to \infty$ by the Poisson density $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$.
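The Poisson approximation yields an estimated degree histogram that can be compared against a measured one. The following sketch evaluates the density and scales it by the number of nodes; the function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) = lam^k * e^(-lam) / k!"""
    return lam ** k * exp(-lam) / factorial(k)

def expected_degree_histogram(n, p, max_degree):
    """Estimated number of nodes with degree 0..max_degree in g(n, p)."""
    lam = (n - 1) * p
    return [n * poisson_pmf(k, lam) for k in range(max_degree + 1)]

# Estimated histogram for 150 nodes and p = 1/150 (lambda close to 1).
for k, count in enumerate(expected_degree_histogram(150, 1 / 150, 5)):
    print(k, round(count, 1))
```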

38 7.2 Random Graph Properties. This degree distribution falls faster than an exponential distribution in the degree; hence it is not a power-law distribution. For larger $\lambda$, it behaves approximately like a normal distribution.

39 7.2 Random Graph Properties. Degree distribution for $g_{150,p=1/150}$ with 69 edges and $\lambda = 1$. (Figure: measured vs. estimated degree distribution.)

40 7.2 Random Graph Properties. Degree distribution for a $g_{150,p}$ graph with $\lambda = 2$. (Figure: estimated vs. measured degree distribution.)

41 7.2 Random Graph Properties. Diameter: if $g$ is connected, the expected diameter of $g_{n,m}$ is in $O(\log n)$ with high probability, i.e. the diameter of a connected random graph grows only logarithmically. $g_{n,p}$ is almost surely connected for $p > \frac{\ln n}{n}$, or equivalently $g_{n,m}$ is almost surely connected for $m > \binom{n}{2} \frac{\ln n}{n}$, for $n \to \infty$. (Figure: example graph with diameter $d(g) = 7$.)

42 7.2 Random Graph Properties. Clustering coefficient: the clustering coefficient of a random graph $g_{n,p}$ is, with high probability, asymptotically equal to $p$ for $n \to \infty$. This is a rather low clustering coefficient. (Figures: example graphs $g_{100,p=0.03}$ and $g_{100,p=0.06}$ with their average clustering coefficients $C_{avg}$; nodes colored by $C$.)

43 7.3 Small-World Graphs. Observation: real and natural networks are not random, but have some inherent structure. Many naturally occurring networks are very robust and efficient: social networks among people, neural networks, power lines, the Internet, streets, etc. What properties do real-life networks have? Why are they stable and efficient?

44 7.3 Small-World Graphs. First real networks to be studied: social networks among people. Six Degrees of Separation: first mentioned in 1929 by the Hungarian star author Karinthy Frigyes in his short story "Chains". Claim: all ½ billion people in the world (sic) know Frigyes via at most five acquaintances (friend-of-a-friend connections). The claim was motivated by two examples.

45 7.3 Small-World Graphs. Example 1: some 1929 Nobel Prize laureate knows King Gustav of Sweden, who passionately plays tennis and knows a famous tennis champion, who is a friend of Frigyes. Example 2: an unknown factory worker at a Ford plant knows his boss, who knows Ford personally, who knows the director of the media house Hearst Publications, who knows the writer Árpád Pásztor, who is a friend of Frigyes.

46 7.3 Small-World Graphs. This idea was scientifically examined in 1967 by the sociologist Stanley Milgram, Yale University. Persons chosen at random in Kansas and Nebraska were asked to deliver a letter to a certain stock broker in Cambridge, MA. This was the only information about the target person. Constraint: the letter can only be given to persons one knows on a first-name basis (acquaintances). 1967: no Internet, transportation really expensive and cumbersome, close local communities. (Photo: S. Milgram.)

47 7.3 Small-World Graphs. (Figure: letters used in the Milgram experiment.)

48 7.3 Small-World Graphs. Those letters that reached the target person were passed on via only 6 mediators on average: 6 degrees of separation. This was far less than originally assumed! Thus, social graphs were termed small-world graphs. The original experiment was later criticized: only 50 persons took part in the original experiment, and only 5% of the letters were actually received by the target person. But one letter was received within only 4 days. The small-world effect has been experimentally observed in a vast variety of other sciences.

49 7.3 Small-World Graphs. Interesting trivia: Six Degrees of Kevin Bacon. Kevin Bacon once claimed that he has worked with everybody in Hollywood or someone who has worked with them. College students built a party game out of that statement, based on Milgram's ideas. Basic idea: link actors via a minimum number of movies to the actor Kevin Bacon, e.g. Val Kilmer was in Top Gun with Tom Cruise, and Tom Cruise was in A Few Good Men with Kevin Bacon. Only approximately 12% of all actors cannot be linked to Bacon.

50 7.3 Small-World Graphs. However, it took a while until such naturally occurring networks were formally understood. Erdős-Rényi random graphs are bad models for natural networks. Natural networks often show hubs: there are some nodes with a very high node degree, and the node degree is better described by a power-law distribution than by a Poisson distribution. Natural networks often show a very high degree of local clustering: high average cluster coefficients, e.g. due to local communities, friend cliques, co-worker networks, local transportation networks, etc. Natural networks often have a low average path length.

51 7.3 Small-World Graphs. First models for natural graphs were proposed by Duncan Watts and Steven Strogatz in 1998. Watts, D.J.; Strogatz, S.H. (1998). "Collective dynamics of 'small-world' networks". Nature 393 (6684). doi: /30918. They examined three real-world networks: the simple neural network of the roundworm (nematode) Caenorhabditis elegans (a natural network), power grid networks (a man-made network), and collaboration networks between movie actors (a semi-natural network).

52 7.3 Small-World Graphs. Watts and Strogatz mainly examined the average path length and the cluster coefficient, in comparison with equally sized random graphs (similar number of nodes and edges). Result: real networks have a much higher degree of local clustering (10x to 1000x higher) than random graphs, while the average path length is more or less similar.

53 7.3 Watts-Strogatz Graphs. Definition of small-world graphs: a small-world network is a network with a dense local structure and a diameter comparable to that of a random graph with the same number of nodes and edges. Additionally, the node degree is homogeneous. Watts and Strogatz also proposed the first generative model for a certain class of small-world graphs, the so-called Watts-Strogatz graphs. There are other small-world classes.

54 7.3 Watts-Strogatz Graphs. Properties of Watts-Strogatz graphs: low average path length, high average clustering coefficients, homogeneously distributed node degrees. They are a good model for e.g. social or neural networks and most other natural networks. They are not a good model for most man-made grid-like networks, e.g. the Internet, airline routes, train lines, etc.: those show power-law distributed node degrees and, by definition, are not small-world graphs. Watts-Strogatz graphs lie between random and scale-free networks.

55 7.3 Watts-Strogatz Graphs. The generative model (Watts-Strogatz model): the graph is denoted as $g\_ws_{n,k,p}$, where $n$ is the number of nodes (integer), $k$ is the neighborhood degree (integer), and $p$ is the rewire probability (float in [0, 1]). Build a ring of $n$ vertices and connect each vertex with its $k$ clockwise neighbors on the ring. Draw a random number between 0 and 1 for each edge. Rewire each edge with probability $p$: if the random number is larger than $p$, do nothing; else rewire. Rewiring: keep the source vertex of the edge fixed, and choose a new target vertex uniformly at random from all other vertices.
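The construction steps can be sketched in a few lines of Python. Two details below are assumptions that the slide does not state: the sketch requires $n > 2k$ so that the initial ring has no duplicate edges, and a rewired edge never duplicates an existing edge (the new target is drawn uniformly from the remaining admissible vertices). The function name and the `seed` parameter are illustrative.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Return the edge list of a Watts-Strogatz graph on vertices 0..n-1."""
    rng = random.Random(seed)
    # Ring lattice: every vertex is linked to its k clockwise neighbors.
    edges = [(v, (v + j) % n) for v in range(n) for j in range(1, k + 1)]
    existing = {frozenset(e) for e in edges}  # undirected view, blocks duplicates
    result = []
    for src, dst in edges:
        if rng.random() < p:  # rewire this edge with probability p
            candidates = [w for w in range(n)
                          if w != src and frozenset((src, w)) not in existing]
            if candidates:
                new_dst = rng.choice(candidates)
                existing.discard(frozenset((src, dst)))
                existing.add(frozenset((src, new_dst)))
                dst = new_dst  # the source vertex stays fixed
        result.append((src, dst))
    return result

ring = watts_strogatz(50, 3, 0.0, seed=1)         # regular ring, 150 edges
small_world = watts_strogatz(50, 3, 0.05, seed=1)
print(len(ring), sum(1 for a, b in zip(ring, small_world) if a != b))
```

With $n = 50$ and $k = 3$ this yields the 150 edges used in the comparison figures; the second printed number is how many edges were rewired in this sample.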

56 7.3 Watts-Strogatz Graphs. For $p = 0$, the resulting network is totally regular, with a clustering coefficient approaching $\frac{3}{4}$ for large $k$; the diameter is in $O(n)$. For $p = 1$, the resulting network is a kind of random graph (regular random graph) with a diameter in $O(\log n)$. (Figure: example graphs with $k = 2$ and increasing randomness from $p = 0$ to $p = 1$.)

57 7.3 Watts-Strogatz Graphs. Comparing Watts-Strogatz graphs with $n = 50$, $m = 150$, $k = 3$, colored by cluster coefficient. (Figures: Erdős-Rényi graph; Watts-Strogatz graph with $p = 0.0$.)

58 7.3 Watts-Strogatz Graphs. Comparing Watts-Strogatz graphs with $n = 50$, $m = 150$, $k = 3$. (Figures: Watts-Strogatz graphs with $p = 0.01$ and $p = 0.03$.)

59 7.3 Watts-Strogatz Graphs. Comparing Watts-Strogatz graphs with $n = 50$, $m = 150$, $k = 3$. (Figures: Watts-Strogatz graphs with $p = 0.05$ and $p = 0.1$.)

60 7.3 Watts-Strogatz Graphs. Histogram of cluster coefficients (single sample). Random: generally lower coefficient. Small world: homogeneous, higher coefficient. (Figure: number of nodes over cluster coefficient for $p = 0.00$, $p = 0.02$, $p = 0.05$, $p = 0.10$, and a random graph.)

61 7.3 Watts-Strogatz Graphs. Histogram of node degrees (same sample). Random: homogeneous degree, higher variance. Small world: homogeneous degree, low variance. (Figure: number of nodes over node degree for $p = 0.00$, $p = 0.02$, $p = 0.05$, $p = 0.10$, and a random graph.)

62 7.3 Watts-Strogatz Graphs. Investigating clustering coefficients and average path lengths as a function of $p$, for a graph with 5000 nodes, normalized by the clustering coefficient and the path length at $p = 0$. The clustering coefficient is still high for small $p$, but the average path length decreases extremely fast due to shortcuts. (Figure: both quantities plotted over $p$.)

63 7.3 Kleinberg Navigability Model. The Watts-Strogatz model explains how a small-world graph can be constructed, i.e. how locally densely connected graphs with shortcuts can be constructed. But navigating a small world can be very difficult! Assume Six Degrees of Separation is true and route a message to an arbitrary person: all people on earth would be reachable via just six acquaintances. But which ones? Random navigation or flooding won't help: there are exponentially many possibilities! Solution: use clues and heuristics to quickly route the message into the correct neighborhood!

64 7.3 Kleinberg Navigability Model. Challenging question: how can we find short paths in a distributed fashion in a small world? Why should arbitrary pairs of strangers be able to find short chains of acquaintances that link them together? J.M. Kleinberg, Navigation in a Small-World, Nature, 2000. Some routing information is necessary: enough, but not too much information!

65 7.3 Kleinberg Navigability Model. Nodes see only local parts of the network (their neighborhood), i.e. they route the letter in a decentralized fashion. In social networks, additional information (same profession, address, hobbies, etc.) is used to decide which neighbor is closest to the recipient. Milgram showed that the first steps of the letter were the geographically largest, while later steps were closing in on the target area.

66 7.3 Kleinberg Navigability Model. A decentralized routing algorithm can be modeled as follows. Let every node $v$ have a position $Pos(v)$ on a toroidal grid in a $d$-dimensional space: $Pos(v) = (x_1, x_2, \dots, x_d)$ with all $x_i$ being integers. $Pos(v)$ is a $d$-dimensional vector, and $x_i(v)$ is the position of $v$ in dimension $i$. Every node knows some basic information about the underlying grid structure, i.e. its own position in the grid, its neighbors, and the target node: no global knowledge, only local information.

67 7.3 Kleinberg Navigability Model. Each node $v$ hands the message (i.e. the letter) to the one of its neighbors that is closest to the target $t$. The distance measure $d_M(v, w)$ is the Manhattan distance, i.e. the sum over the absolute differences in each dimension: $d_M(v, w) = \sum_i |x_i(v) - x_i(w)|$. Let the routing algorithm take place on the following network model: start with a $d$-dimensional grid and add random edges between vertices $v$ and $w$ with a probability of $P(v, w) \sim d_M(v, w)^{-\alpha}$ (inverse $\alpha$-th power distribution).

68 7.3 Kleinberg Navigability Model. (Figure: node $u$ is connected to all its neighbors $a$, $b$, $c$, and $d$ and has a long-range link to some randomly chosen node $v$, with a probability proportional to $dist(u, v)^{-\alpha}$.) The higher the distance, the lower the link probability.
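The network model and the greedy forwarding rule can be sketched for a small 2-dimensional torus. Several choices below are assumptions made for illustration: each node gets exactly one long-range link, the Manhattan distance wraps around the torus edges, and all function names are hypothetical.

```python
import random

def torus_manhattan(a, b, size):
    """Manhattan distance on a torus: each dimension may wrap around."""
    return sum(min(abs(x - y), size - abs(x - y)) for x, y in zip(a, b))

def grid_neighbors(v, size):
    """The four direct grid neighbors of v on a 2-dimensional torus."""
    x, y = v
    return [((x + 1) % size, y), ((x - 1) % size, y),
            (x, (y + 1) % size), (x, (y - 1) % size)]

def sample_long_range_target(u, nodes, size, alpha, rng):
    """Pick a long-range contact w with probability proportional to d(u, w)^-alpha."""
    others = [w for w in nodes if w != u]
    weights = [torus_manhattan(u, w, size) ** -alpha for w in others]
    return rng.choices(others, weights=weights, k=1)[0]

def greedy_route(source, target, size, long_links):
    """Forward to the known contact closest to the target until it is reached."""
    path = [source]
    current = source
    while current != target:
        contacts = grid_neighbors(current, size) + [long_links[current]]
        current = min(contacts, key=lambda w: torus_manhattan(w, target, size))
        path.append(current)
    return path

size, alpha = 10, 2.0  # alpha equals the grid dimension d = 2
rng = random.Random(0)
nodes = [(x, y) for x in range(size) for y in range(size)]
long_links = {u: sample_long_range_target(u, nodes, size, alpha, rng) for u in nodes}
route = greedy_route((0, 0), (5, 5), size, long_links)
print(len(route) - 1)  # number of hops, at most the grid distance of 10
```

Greedy forwarding always terminates here because some grid neighbor is strictly closer to the target, so a long-range link can only shorten the route.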

69 7.3 Kleinberg Navigability Model. Theorem: the routing algorithm will find short paths if and only if $\alpha = d$. "Short" means that the lengths of arbitrary paths are in $O(\log n)$. (Figure: simulation results for the greedy routing algorithm on a 2-dimensional toroidal grid with 20,000 × 20,000 nodes, averaged over 1000 runs.)

70 7.3 Kleinberg Navigability Model. The idea behind the proof is that for any $\alpha < d$ there are too few random edges to form shortcuts. For $\alpha > d$ there are too many random edges, and hence too many choices to which the message could be passed on; the routing will degenerate into a random walk. Kleinberg small worlds thus provide a way of building a peer-to-peer overlay network that allows for a simple, greedy, and distributed routing protocol. But: how are nodes mapped to the $d$-dimensional space such that the distance measure is meaningful?

71 7.4 Scale-Free Networks. Small-world and random graphs show homogeneous node degree distributions. For small-world graphs, the distribution looks similar to a normal distribution with $\mu = 2k$ for non-extreme $p$, where $k$ is the number of neighbors in the initial ring (the actual model is more complicated). Random graphs are Poisson distributed; for larger $m$, this also approximates a normal distribution. But many (especially artificial) real-life networks show extreme node degree distributions, e.g. strong hub topologies.

72 7.4 Scale-Free Networks. In 1999, Albert-László Barabási (Univ. of Notre Dame) crawled parts of the WWW to investigate its actual structure. The node degree is power-law distributed, i.e. the probability that a node in the network is connected to $k$ other nodes is $P(k) \sim k^{-\gamma}$ (usually with $2 < \gamma \le 3$). Most nodes have a small degree of around 1 to 2. Few nodes have an extremely high node degree; these high-degree vertices are called hubs. Albert-László Barabási. Linked: How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life. Plume.

73 7.4 Scale-Free Networks. Definition: graphs with a power-law node degree distribution form scale-free networks, also called power-law networks. What kind of network model can generate this more realistic degree distribution? The Barabási-Albert model builds a certain subset of scale-free networks. Albert-László Barabási & Réka Albert. "Emergence of scaling in random networks". Science, 1999. doi: /science

74 7.4 Barabási-Albert Graphs. Barabási-Albert model, basic idea: in its simplest form denoted as $g\_ba_{n,m}$, where $n$ is the number of nodes in the graph and $m$ is the number of edges added per time step. The total number of edges is thus $n \cdot m$. Start with any initial graph of size $n_0$ with $n_0 \ge 2$ and degree $\deg(v) \ge 1$ for every node. Often, just $m$ connected nodes are used as the default initial network. If the initial network is not connected, the resulting network cannot be guaranteed to be connected. The Barabási-Albert graph is constructed iteratively by adding new nodes one by one until the target size $n$ is reached. Each addition represents one time step in a simulated network growth (i.e. discrete-time modeling). Each new node is connected to $m$ existing nodes.

75 7.4 Barabási-Albert Graphs: New edges are not added randomly, but favor higher-degree nodes ("the rich get richer"). This is preferential attachment to higher-degree nodes: the higher the degree of a possible target node, the higher the probability that the new node will attach to it. Preferential attachment defines the probability $\Pi(v)$ for vertex v to get an edge to a new node. In general, $\Pi$ is proportional to the node degree, i.e., $\Pi(v) \sim \deg(v)$. The most common definition is $\Pi(v) = \frac{\deg(v)}{\sum_{w \in V} \deg(w)}$.
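The growth rule and the degree-proportional attachment probability can be sketched in a few lines of Python. This is a minimal illustration, not the generator used for the figures in this lecture. The function name `barabasi_albert`, the `seed` parameter, and the choice of a path over m + 1 nodes as the initial graph are assumptions made here. Drawing the m targets of one new node by repeated sampling until they are distinct is also a simplification.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; every new node attaches to m distinct old nodes."""
    if m < 1 or n <= m:
        raise ValueError("need m >= 1 and n > m")
    rng = random.Random(seed)
    # Assumed initial graph: a path over nodes 0..m, so every node has degree >= 1.
    edges = [(i, i + 1) for i in range(m)]
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` selects node v with probability deg(v) / sum of all degrees.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


edges = barabasi_albert(n=200, m=2, seed=42)
degree = Counter(v for edge in edges for v in edge)
hubs = degree.most_common(3)  # the three highest-degree nodes with their degrees
```

With this particular initial graph the result has m · (n − m) edges, which is close to the n · m stated above for n much larger than m.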

76 7.4 Barabási-Albert Graphs: Example: $G^{BA}_{5,1}$. t = 0: initial graph consisting of the two connected nodes $v_1$ and $v_2$. t = 1: add new node $v_3$. The probability of connecting any old node v to $v_3$ is $\Pi(v) = \frac{\deg(v)}{\sum_{w \in V} \deg(w)}$, here $\Pi(v_1) = \frac{1}{2}$ and $\Pi(v_2) = \frac{1}{2}$. The random decision is steered by preferential attachment; e.g., connect to $v_1$.

77 7.4 Barabási-Albert Graphs: Example: $G^{BA}_{5,1}$ (continued). t = 2: add new node $v_4$ and evaluate preferential attachment: $\Pi(v_1) = \frac{1}{2}$, $\Pi(v_2) = \frac{1}{4}$, $\Pi(v_3) = \frac{1}{4}$; e.g., connect to $v_1$. t = 3: add new node $v_5$ and evaluate preferential attachment: $\Pi(v_1) = \frac{1}{2}$, $\Pi(v_2) = \frac{1}{6}$, $\Pi(v_3) = \frac{1}{6}$, $\Pi(v_4) = \frac{1}{6}$; e.g., connect to $v_2$.
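The attachment probabilities in this example can be recomputed from the degree-proportional definition. The sketch below uses exact fractions; the helper name `attachment_probabilities` and the string labels for the nodes are chosen here for illustration.

```python
from fractions import Fraction


def attachment_probabilities(edges):
    """Return deg(v) / (sum of all degrees) for every node, as exact fractions."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total = sum(degree.values())
    return {v: Fraction(d, total) for v, d in degree.items()}


edges = [("v1", "v2")]                    # t = 0: initial graph
p_t1 = attachment_probabilities(edges)    # probabilities seen by v3 at t = 1
edges.append(("v3", "v1"))                # v3 happens to connect to v1
p_t2 = attachment_probabilities(edges)    # probabilities seen by v4 at t = 2
edges.append(("v4", "v1"))                # v4 happens to connect to v1
p_t3 = attachment_probabilities(edges)    # probabilities seen by v5 at t = 3
```

Because v1 receives an edge in both steps, its share stays at 1/2 while the shares of the other nodes shrink, which is the "rich get richer" effect in miniature.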

78 7.4 Barabási-Albert Graphs: Comparing Barabási-Albert graphs. [Figure: Erdős-Rényi graph next to Barabási-Albert graph; n = 50, ~50 edges, coloring by node degree]

79 7.4 Barabási-Albert Graphs: Comparing Barabási-Albert graphs. [Figure: Erdős-Rényi graph next to Barabási-Albert graph; n = 100, ~100 edges]

80 7.4 Barabási-Albert Graphs: Comparing Barabási-Albert graphs. [Figure: Erdős-Rényi graph next to Barabási-Albert graph; n = 100, ~150 edges]

81 7.4 Barabási-Albert Graphs: Histogram of node degrees for a single sample with 100 nodes and 300 edges. Random: generally lower degree. Small world: homogeneous degree. Scale-free: power-law, hubs visible. The parameter pa is a dampening factor for decreasing the strength of preferential attachment. [Figure: histogram; x-axis: node degree, y-axis: number of nodes; series: Barabási (pa = 0.5), Watts-Strogatz (p = 0.05), Random]

82 7.4 Barabási-Albert Graphs: Node degree for larger Barabási-Albert graphs (200k nodes, 400k edges). [Figure: relative frequency over degree, logarithmic scale]

83 7.4 Barabási-Albert Graphs: Histogram of cluster coefficients (C) for the same sample. Random: low C. Small world: homogeneous, high C. Scale-free: also power-law, lower than small world. [Figure: histogram; x-axis: cluster coefficient, y-axis: number of nodes; series: Barabási (pa = 0.5), Watts-Strogatz (p = 0.05), Random]

84 7.4 Scale-Free Networks: An important property of scale-free networks is robustness against random failures. Removing a random vertex v will likely hit a low-degree node, so the expected damage to the network is small. A failing high-degree node, however, can severely damage the network; better fail-safety is necessary for high-degree nodes to ensure overall robustness. Thus, scale-free networks are very sensitive to attacks: if a malevolent attacker explicitly targets the highest-degree nodes, the network can easily decompose. Note: random graphs are not resilient against random failures, but also not particularly prone to attacks, since most vertices have more or less the same degree.
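The difference between random failures and targeted attacks can be illustrated on a deliberately extreme toy topology. The sketch below is not a measurement on a real scale-free network. The hub-and-leaf graph (three linked hubs with ten leaves each), the node names, and the helper `largest_component` are assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def largest_component(nodes, edges, removed=()):
    """Size of the largest connected component after deleting `removed` nodes."""
    removed = set(removed)
    adj = defaultdict(set)
    for u, v in edges:
        if u not in removed and v not in removed:
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in removed or start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


# Hypothetical hub topology: three linked hubs, each serving ten leaves.
hubs = ["h0", "h1", "h2"]
edges = [("h0", "h1"), ("h1", "h2")]
for h in hubs:
    edges += [(h, f"{h}_leaf{i}") for i in range(10)]
nodes = sorted({v for edge in edges for v in edge})
degree = Counter(v for edge in edges for v in edge)

attack = [v for v, _ in degree.most_common(3)]   # remove the three hubs
failure = random.Random(0).sample(nodes, 3)      # remove three random nodes

after_attack = largest_component(nodes, edges, attack)
after_failure = largest_component(nodes, edges, failure)
```

Removing the three hubs leaves only isolated leaves, whereas three random removals most likely hit leaves and leave the bulk of the graph connected.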

85 7.4 Scale-Free Networks: Example: Airline Networks (Ryanair). [Figure]

86 7.4 Scale-Free Networks: Example: Airline Networks (Ryanair), continued. [Figure]

87 7.4 Scale-Free Networks: Example: Internet (2009), measured by the CAIDA skitter monitor in London; ca. 535k nodes and 600k links. [Figure]

88 7.4 Scale-Free Networks: Example: Internet (2005). [Figure]

89 7.4 Scale-Free Networks: Example: Internet, by geographic location. [Figure]

90 7.4 Scale-Free Networks :-)

91 7.5 Comparing Graphs: Random Graph, 50 nodes, 50 edges (color by degree). Connected: No; Diameter (conn.): 9; Avg. Path Length: 4.39; #Clusters: 6; Largest Cluster: 39; k-connectedness: 0; Avg. Cluster Coeff.: (not given); Avg. Degree: 2.

92 7.5 Comparing Graphs: Watts-Strogatz Graph (p = 0.05), 50 nodes, 50 edges. Connected: No; Diameter (conn.): 35; Avg. Path Length: (not given); #Clusters: 2; Largest Cluster: 38; k-connectedness: 0; Avg. Cluster Coeff.: 0; Avg. Degree: 2.

93 7.5 Comparing Graphs: Barabási-Albert Graph (pa = 0.8), 50 nodes, 49 edges. Connected: Yes; Diameter: 12; Avg. Path Length: 5.14; k-connectedness: 1; Avg. Cluster Coeff.: 0; Avg. Degree: 1.96.

94 7.5 Comparing Graphs: Random Graph, 50 nodes, 100 edges. Connected: No; Diameter (conn.): 6; Avg. Path Length: 2.88; #Clusters: 2; Largest Cluster: 49; k-connectedness: 0; Avg. Cluster Coeff.: (not given); Avg. Degree: 4.

95 7.5 Comparing Graphs: Watts-Strogatz Graph (p = 0.05), 50 nodes, 100 edges. Connected: Yes; Diameter (conn.): 10; Avg. Path Length: 4.6; k-connectedness: 2; Avg. Cluster Coeff.: (not given); Avg. Degree: 4.

96 7.5 Comparing Graphs: Barabási-Albert Graph (pa = 0.8), 50 nodes, 98 edges. Connected: Yes; Diameter: 4; Avg. Path Length: 2.55; k-connectedness: 2; Avg. Cluster Coeff.: (not given); Avg. Degree: 3.92.
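Several of the properties compared above can be computed directly from an edge list. The following sketch shows one possible set of definitions and is not the tool that produced the reported values. The function names are chosen here. It is also an assumption that nodes with fewer than two neighbors contribute a local clustering coefficient of 0 and that path statistics are taken over connected pairs only.

```python
from collections import deque
from itertools import combinations


def adjacency(nodes, edges):
    """Build an undirected adjacency map; isolated nodes get an empty set."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def average_degree(adj):
    # Equals 2 * |E| / |V|, e.g. 2 * 49 / 50 = 1.96.
    return sum(len(nb) for nb in adj.values()) / len(adj)


def average_clustering(adj):
    """Mean over all nodes of (links among neighbors) / (possible such links)."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue  # assumed convention: coefficient 0 for degree < 2
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def path_stats(adj):
    """Return (average shortest path length, diameter) over connected pairs."""
    total = pairs = diameter = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for d in dist.values():
            if d > 0:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return (total / pairs if pairs else 0.0), diameter


triangle = adjacency(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
path = adjacency(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

The triangle and the three-node path are the two extremes for clustering: every neighbor pair is linked in the former, none in the latter.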

97 7.6 Models in P2P: What do real peer-to-peer networks look like? This depends on the protocols used. Some P2P networks, e.g., Freenet, evolve on their own into a small-world network with a high clustering coefficient and a small diameter. Analogously, some protocols, e.g., Gnutella, implicitly generate a scale-free degree distribution; this is implied by bootstrapping and Ping-Pong.

98 7.6 Models in P2P: Freenet converges to a small-world network under medium load. This is achieved by routing table updates. Every file is associated with a key (by a hash function), and a file is stored at some node with a similar key. At each peer, a request is forwarded to the node in its routing table that has the key closest to the requested one. If the request's time-to-live expires or a node has no neighbors to forward the request to, a backtracking "request failed" message is sent. If the request is successful, the file is sent back via the routing nodes; each node saves the file and adds the sending node's address to its local routing table, i.e., frequently requested files are replicated. If the routing table is full, the least recently used (LRU) entry is evicted.
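The routing rules above can be sketched as a small simulation. This is a strongly simplified illustration and not the Freenet protocol implementation. The class `Peer`, integer keys compared by absolute difference, the `capacity` parameter, and the six-peer topology are all assumptions made here. Backtracking is modeled by simply trying the next-closest routing entry after a failed attempt.

```python
from collections import OrderedDict


class Peer:
    def __init__(self, name, capacity=4):
        self.name = name
        self.capacity = capacity
        self.store = {}                # key -> file held locally
        self.routing = OrderedDict()   # key -> neighbor, least recently used first

    def learn(self, key, neighbor):
        """Add or refresh a routing entry; evict the LRU entry if the table is full."""
        self.routing[key] = neighbor
        self.routing.move_to_end(key)
        if len(self.routing) > self.capacity:
            self.routing.popitem(last=False)

    def request(self, key, ttl, visited=None):
        """Return the file for `key`, or None to signal 'request failed'."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        if key in self.store:
            return self.store[key]
        if ttl <= 0:
            return None
        # Closest key first; later entries are only tried after a failure.
        by_closeness = sorted(self.routing.items(), key=lambda item: abs(item[0] - key))
        for entry_key, neighbor in by_closeness:
            if neighbor.name in visited:
                continue
            data = neighbor.request(key, ttl - 1, visited)
            if data is not None:
                self.routing.move_to_end(entry_key)  # the entry was just used
                self.store[key] = data               # replicate on the return path
                self.learn(key, neighbor)            # remember the sending node
                return data
        return None


# Hypothetical topology: A knows B; B knows C and D; D knows E and F; F holds key 9.
a, b, c, d, e, f = (Peer(name) for name in "ABCDEF")
a.learn(8, b)
b.learn(6, c)
b.learn(15, d)
d.learn(1, e)
d.learn(9, f)
f.store[9] = "file-9"

result = a.request(9, ttl=5)
```

In this run B first tries C, whose routing table is empty, then falls back to D, which reaches F. Afterwards A, B, and D each hold a replica of the file and a routing entry for key 9 that points to the peer they received it from.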

99 7.6 Models in P2P: Example of Freenet routing. [Figure: a request for key 9 is routed among the peers A to F. The routing tables (columns: Key, Pointer) of B and D are shown; C's routing table is empty, and the request fails there ("Sorry!").]

100 7.6 Models in P2P: What should peer-to-peer networks look like? It depends. If the network should be navigable in a decentralized fashion, make it a small-world network and implement Kleinberg's routing algorithm (or a variant, e.g., Symphony). If the peer-to-peer network could come under attack, also make it a small-world network, where most vertices have the same (low) degree. If it is a peer-to-peer network in a small and secure context, e.g., a company intranet, make it a scale-free network. This makes it possible to buy only a small number of high-bandwidth servers, which will work as the 'hubs' of the network.

101 7.6 Models in P2P: The network structure of a peer-to-peer system influences the average necessary number of hops (path length), the possibility of greedy, decentralized routing algorithms, stability against random failures, sensitivity to attacks, redundancy of routing table entries (edges), and many other properties of the system built onto this network. Important measures of a network structure are the average path length, the clustering coefficient, and the degree distribution. Influence the edge generation rules such that a network structure arises that shows the desired properties.

102 Next Lecture: Content Distribution, Swarming, BitTorrent, Error Correction, Privacy, Dark Nets.


More information

Exercise set #2 (29 pts)

Exercise set #2 (29 pts) (29 pts) The deadline for handing in your solutions is Nov 16th 2015 07:00. Return your solutions (one.pdf le and one.zip le containing Python code) via e- mail to Becs-114.4150@aalto.fi. Additionally,

More information

When Network Embedding meets Reinforcement Learning?

When Network Embedding meets Reinforcement Learning? When Network Embedding meets Reinforcement Learning? ---Learning Combinatorial Optimization Problems over Graphs Changjun Fan 1 1. An Introduction to (Deep) Reinforcement Learning 2. How to combine NE

More information

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma

Overlay and P2P Networks. Unstructured networks. Prof. Sasu Tarkoma Overlay and P2P Networks Unstructured networks Prof. Sasu Tarkoma 20.1.2014 Contents P2P index revisited Unstructured networks Gnutella Bloom filters BitTorrent Freenet Summary of unstructured networks

More information

beyond social networks

beyond social networks beyond social networks Small world phenomenon: high clustering C network >> C random graph low average shortest path l network ln( N)! neural network of C. elegans,! semantic networks of languages,! actor

More information

Mathematics of networks. Artem S. Novozhilov

Mathematics of networks. Artem S. Novozhilov Mathematics of networks Artem S. Novozhilov August 29, 2013 A disclaimer: While preparing these lecture notes, I am using a lot of different sources for inspiration, which I usually do not cite in the

More information

Ian Clarke Oskar Sandberg

Ian Clarke Oskar Sandberg Ian Clarke is the architect and coordinator of The Freenet Project, and the Chief Executive Officer of Cematics Ltd, a company he founded to realise commercial applications for the Freenet technology.

More information

Erdős-Rényi Model for network formation

Erdős-Rényi Model for network formation Network Science: Erdős-Rényi Model for network formation Ozalp Babaoglu Dipartimento di Informatica Scienza e Ingegneria Università di Bologna www.cs.unibo.it/babaoglu/ Why model? Simpler representation

More information

Behavioral Data Mining. Lecture 9 Modeling People

Behavioral Data Mining. Lecture 9 Modeling People Behavioral Data Mining Lecture 9 Modeling People Outline Power Laws Big-5 Personality Factors Social Network Structure Power Laws Y-axis = frequency of word, X-axis = rank in decreasing order Power Laws

More information

Flat Routing on Curved Spaces

Flat Routing on Curved Spaces Flat Routing on Curved Spaces Dmitri Krioukov (CAIDA/UCSD) dima@caida.org Berkeley April 19 th, 2006 Clean slate: reassess fundamental assumptions Information transmission between nodes in networks that

More information

Chapter 8 DOMINATING SETS

Chapter 8 DOMINATING SETS Chapter 8 DOMINATING SETS Distributed Computing Group Mobile Computing Summer 2004 Overview Motivation Dominating Set Connected Dominating Set The Greedy Algorithm The Tree Growing Algorithm The Marking

More information

TELCOM2125: Network Science and Analysis

TELCOM2125: Network Science and Analysis School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 Figures are taken from: M.E.J. Newman, Networks: An Introduction 2

More information

Signal Processing for Big Data

Signal Processing for Big Data Signal Processing for Big Data Sergio Barbarossa 1 Summary 1. Networks 2.Algebraic graph theory 3. Random graph models 4. OperaGons on graphs 2 Networks The simplest way to represent the interaction between

More information

6 Structured P2P Networks

6 Structured P2P Networks 6 Structured P2P Networks Distributed Data Management Wolf-Tilo Balke Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 6.1 Hash Tables 6.3

More information

Chapter 8 DOMINATING SETS

Chapter 8 DOMINATING SETS Distributed Computing Group Chapter 8 DOMINATING SETS Mobile Computing Summer 2004 Overview Motivation Dominating Set Connected Dominating Set The Greedy Algorithm The Tree Growing Algorithm The Marking

More information

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties

CSE 255 Lecture 13. Data Mining and Predictive Analytics. Triadic closure; strong & weak ties CSE 255 Lecture 13 Data Mining and Predictive Analytics Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

Simplicial Complexes of Networks and Their Statistical Properties

Simplicial Complexes of Networks and Their Statistical Properties Simplicial Complexes of Networks and Their Statistical Properties Slobodan Maletić, Milan Rajković*, and Danijela Vasiljević Institute of Nuclear Sciences Vinča, elgrade, Serbia *milanr@vin.bg.ac.yu bstract.

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2 Observations Models

More information

Graph-theoretic Properties of Networks

Graph-theoretic Properties of Networks Graph-theoretic Properties of Networks Bioinformatics: Sequence Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Graphs A graph is a set of vertices, or nodes, and edges that connect pairs

More information

Topic II: Graph Mining

Topic II: Graph Mining Topic II: Graph Mining Discrete Topics in Data Mining Universität des Saarlandes, Saarbrücken Winter Semester 2012/13 T II.Intro-1 Topic II Intro: Graph Mining 1. Why Graphs? 2. What is Graph Mining 3.

More information

Web Structure Mining Community Detection and Evaluation

Web Structure Mining Community Detection and Evaluation Web Structure Mining Community Detection and Evaluation 1 Community Community. It is formed by individuals such that those within a group interact with each other more frequently than with those outside

More information

Introduction to Peer-to-Peer Systems

Introduction to Peer-to-Peer Systems Introduction Introduction to Peer-to-Peer Systems Peer-to-peer (PP) systems have become extremely popular and contribute to vast amounts of Internet traffic PP basic definition: A PP system is a distributed

More information

CS 322: (Social and Information) Network Analysis Jure Leskovec Stanford University

CS 322: (Social and Information) Network Analysis Jure Leskovec Stanford University CS 322: (Social and Information) Network Analysis Jure Leskovec Stanford University Course website: http://snap.stanford.edu/na09 Slides will be available online Reading material will be posted online:

More information

Navigation in Networks. Networked Life NETS 112 Fall 2017 Prof. Michael Kearns

Navigation in Networks. Networked Life NETS 112 Fall 2017 Prof. Michael Kearns Navigation in Networks Networked Life NETS 112 Fall 2017 Prof. Michael Kearns The Navigation Problem You are an individual (vertex) in a very large social network You want to find a (short) chain of friendships

More information

Distributed Data Management. Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig

Distributed Data Management. Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig Distributed Data Management Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 8.0 Content Provisioning 8.0 Content Distribution 8.1 Swarming

More information

CSE 258 Lecture 12. Web Mining and Recommender Systems. Social networks

CSE 258 Lecture 12. Web Mining and Recommender Systems. Social networks CSE 258 Lecture 12 Web Mining and Recommender Systems Social networks Social networks We ve already seen networks (a little bit) in week 3 i.e., we ve studied inference problems defined on graphs, and

More information

Lecture Note: Computation problems in social. network analysis

Lecture Note: Computation problems in social. network analysis Lecture Note: Computation problems in social network analysis Bang Ye Wu CSIE, Chung Cheng University, Taiwan September 29, 2008 In this lecture note, several computational problems are listed, including

More information

Summary: What We Have Learned So Far

Summary: What We Have Learned So Far Summary: What We Have Learned So Far small-world phenomenon Real-world networks: { Short path lengths High clustering Broad degree distributions, often power laws P (k) k γ Erdös-Renyi model: Short path

More information

Characteristics of Preferentially Attached Network Grown from. Small World

Characteristics of Preferentially Attached Network Grown from. Small World Characteristics of Preferentially Attached Network Grown from Small World Seungyoung Lee Graduate School of Innovation and Technology Management, Korea Advanced Institute of Science and Technology, Daejeon

More information

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties

CSE 158 Lecture 13. Web Mining and Recommender Systems. Triadic closure; strong & weak ties CSE 158 Lecture 13 Web Mining and Recommender Systems Triadic closure; strong & weak ties Monday Random models of networks: Erdos Renyi random graphs (picture from Wikipedia http://en.wikipedia.org/wiki/erd%c5%91s%e2%80%93r%c3%a9nyi_model)

More information

Understanding Disconnection and Stabilization of Chord

Understanding Disconnection and Stabilization of Chord Understanding Disconnection and Stabilization of Chord Zhongmei Yao Joint work with Dmitri Loguinov Internet Research Lab Department of Computer Science Texas A&M University, College Station, TX 77843

More information