CS 111: Program Design I
Lecture 21: Network Analysis
Robert H. Sloan & Richard Warner
University of Illinois at Chicago
April 10, 2018
NETWORK ANALYSIS
Which displays a graph in the sense of graph/network analysis?
A. Left
B. Right
C. Both
Graphs
A graph (or network) is a collection of nodes (aka vertices) and links (aka edges).
For the formally minded, a link is a pair of nodes. A CS 151 (Discrete Math) textbook might give a formal definition like "A graph is a finite set of nodes N together with a set L of pairs of nodes."
Default case: links are bidirectional (undirected). This is what we assume if we just see "graph" or "network".
Links can also be directed (e.g., a road network with one-way streets): "directed graph".
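The formal definition maps fairly directly onto Python sets. The following is a minimal plain-Python sketch, not how networkx stores graphs; the names N, L, directed_L, and is_valid_graph are illustrative only. It assumes an undirected link can be modeled as a frozenset (order ignored) and a directed link as a tuple (order kept).

```python
# A graph as a finite set of nodes N plus a set L of pairs of nodes.
N = {"a", "b", "c"}

# Undirected links: a frozenset has no order, so {a, b} equals {b, a}.
L = {frozenset({"a", "b"}), frozenset({"b", "c"})}

# Directed links: a tuple keeps order, so ("a", "b") differs from ("b", "a").
directed_L = {("a", "b"), ("b", "c")}


def is_valid_graph(nodes, links):
    """Return True if every link is a pair of distinct nodes drawn from `nodes`."""
    return all(len(set(link)) == 2 and set(link) <= nodes for link in links)


print(is_valid_graph(N, L))                             # True
print(frozenset({"a", "b"}) == frozenset({"b", "a"}))   # True  (undirected)
print(("a", "b") == ("b", "a"))                         # False (directed)
```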
Example
Network with 6 nodes and 7 links (undirected)
Links are: (6, 4), (4, 5), (1, 5), (1, 2), (2, 3), (2, 5), (3, 4)
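As a quick check on the example, the link list alone is enough to recover the nodes and each node's neighbors. This is a plain-Python sketch; the variable names links, nodes, and neighbors are chosen here for illustration.

```python
# The seven undirected links from the example network.
links = [(6, 4), (4, 5), (1, 5), (1, 2), (2, 3), (2, 5), (3, 4)]

# Every node in this example appears in at least one link,
# so the links determine the node set.
nodes = set()
for u, v in links:
    nodes.add(u)
    nodes.add(v)

# Adjacency dictionary: node -> set of neighbors.
# Each link is recorded in both directions because the graph is undirected.
neighbors = {node: set() for node in nodes}
for u, v in links:
    neighbors[u].add(v)
    neighbors[v].add(u)

print(len(nodes), len(links))   # 6 7
print(sorted(neighbors[5]))     # [1, 2, 4]
```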
Examples
Social networks: people as nodes; friend = (undirected) link
Web pages (directed): page = node; link = hyperlink
    Web crawler: crawls around this network
Computer networks: nodes = computers; links = 2 computers that can communicate directly
Chicago El: nodes = stops; links between adjacent stops
Collection of phone calls: nodes = phone numbers; links = phone calls
Networkx
To work with graphs in Python, especially for network analysis: import networkx (as nx)
Learn more at: https://networkx.github.io/documentation/networkx-1.11/tutorial/index.html
I believe that Spyder will have given you version 1.11, which is almost identical to version 1.10. networkx is in the process of migrating to version 2.
networkx provides Graph as its basic data type, plus ways to add nodes and edges and do all sorts of things, including visualization.
Simple graph example
    import networkx
    g = networkx.Graph()  # Create an empty graph object
    # Add several nodes
    g.add_node("Alice")
    g.add_node("Bob")
    g.add_node("Charlie")
    g.number_of_nodes()  # → 3
    g.number_of_edges()  # → 0
Simple graph example continued
    # Add a single edge
    >>> g.add_edge("Alice", "Bob")  # undirected
    >>> g.number_of_edges()
    1
    >>> g.nodes()
    ['Alice', 'Charlie', 'Bob']
    >>> g.edges()
    [('Alice', 'Bob')]
Drawing
networkx can do simple drawing (working with matplotlib.pyplot under the hood):
    >>> import matplotlib.pyplot as plt
    >>> networkx.draw_networkx(g)
    >>> plt.show()
will produce a drawing of our graph (with label names!)
(The tutorial says draw, but that is now deprecated.)
Drawing without node labels
    import networkx as nx
    nx.draw(g, with_labels=False)
Adding a bit to the graph
    # Add some more edges and nodes
    g.add_node("David")
    g.add_edges_from([("Alice", "Charlie"), ("Alice", "David")])
add_edges_from is a method whose argument should be a list of pairs of node names, each pair in ()s.
networkx.draw(g) (as updated)
Key graph statistic 1: Degree
Degree of a node = number of neighbors the node has
The range of degrees in a graph has been discovered, over the last 20-30 years, to vary with the nature of the graph.
networkx has a graph method degree that gives us a dictionary with all the degree information for the graph
Degrees from networkx
    In [5]: d = networkx.degree(g)
    In [6]: d
    Out[6]: {'Alice': 3, 'Bob': 1, 'Charlie': 1, 'David': 1}
Will explain that dictionary data structure later. For now, it's something easy to give to our old friend pandas:
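To see where those numbers come from, degree can be counted by hand from the edge list. This is a plain-Python sketch of the idea, not networkx's implementation; the names edges and degree are illustrative. Note that a node with no edges would not show up in this count.

```python
from collections import Counter

# The three undirected edges of our little graph.
edges = [("Alice", "Bob"), ("Alice", "Charlie"), ("Alice", "David")]

# Each edge adds one neighbor to both of its endpoints.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

print(dict(degree))
# {'Alice': 3, 'Bob': 1, 'Charlie': 1, 'David': 1}
```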
networkx degree data → pandas
    In [7]: degree_data = pandas.Series(networkx.degree(g))
    In [5]: degree_data
    Out[5]:
    Alice      3
    Bob        1
    Charlie    1
    David      1
And we know from earlier in the semester how to plot graphs from a pandas Series: a pandas Series has a method .plot()
Plotting degree data
We want a histogram: a bar plot where the things on the x-axis have a specific meaningful order (e.g., numbers), as opposed to being categorical (e.g., names of justices)
    degree_data.plot(kind='hist')
(Unimportant side note: There is an abbreviation for that. Can write .hist() instead of .plot(kind='hist'))
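What the histogram counts can be sketched without any plotting at all: for each degree value, how many nodes have it. This is a plain-Python illustration using the degrees of our little graph; it assumes one bar per distinct degree value, whereas a plotted histogram may group values into bins. The names degree and histogram are illustrative.

```python
from collections import Counter

degree = {"Alice": 3, "Bob": 1, "Charlie": 1, "David": 1}

# How many nodes have each degree value?
histogram = Counter(degree.values())

# Print the bars in numeric order of degree; that ordered x-axis
# is what distinguishes a histogram from a categorical bar plot.
for value in sorted(histogram):
    print(value, "#" * histogram[value])
# 1 ###
# 3 #
```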
To make plot of series look nice
pandas put in stuff automatically for some of the plots we did before from dataframes, but it doesn't always. If need be:
    plt.xlabel('string I want to see below x axis')
    plt.ylabel('similarly for y')
    plt.title('string I want up top in title position')
Plot with some appropriate labels
Remember what our graph looks like
[Figure: drawing of the graph with node labels Bob, Charlie, Alice, David]
Centrality
Alice is connected to everybody else; Bob, Charlie, and David are connected to only one node each (Alice)
Alice is obviously the most central
Various centrality measures tell which nodes are most central
(Prof. Philip Yu of UIC CS found to be "most central" computer science author by one such measure)
What is the maximum degree of a node in a graph with n nodes?
A. n
B. n - 1
C. n - 2
D. n(n - 1) / 2
E. 42
One simple measure of centrality of a node
degree of node / maximum possible degree of any node in that node's graph
For an n-node graph: degree of node / (n - 1)
networkx will give us (a dictionary of) the centrality of every node in graph g:
    cent = networkx.degree_centrality(g)
For our little graph
    In [11]: cent = networkx.degree_centrality(g)
    In [12]: cent
    Out[12]:
    {'Alice': 1.0,
     'Bob': 0.3333333333333333,
     'Charlie': 0.3333333333333333,
     'David': 0.3333333333333333}
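The same numbers fall out of the formula degree / (n - 1) applied by hand. This is a plain-Python sketch of that formula; the function name degree_centrality_by_hand is hypothetical, and it assumes the graph has at least 2 nodes so that n - 1 is not zero.

```python
def degree_centrality_by_hand(degree):
    """Divide each node's degree by n - 1, the maximum possible degree.

    Assumes `degree` maps every node of the graph to its degree
    and that the graph has at least 2 nodes.
    """
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}


degree = {"Alice": 3, "Bob": 1, "Charlie": 1, "David": 1}
cent = degree_centrality_by_hand(degree)

print(cent["Alice"])  # 1.0
print(cent["Bob"])    # 0.3333333333333333
```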
Path lengths
How many edges do we need to walk over to get from one node to another?
0 to get from a node to itself
1 to get to an immediate neighbor
> 1 to get to all other nodes
Getting all path lengths
networkx has an operator for this; it gives a somewhat complex data structure back
pandas to the rescue: It knows how to handle that data structure and turn it into a dataframe, which we already know about:
    pandas.DataFrame(networkx.all_pairs_shortest_path_length(g))
All path lengths in our graph
    >>> p = pandas.DataFrame(networkx.all_pairs_shortest_path_length(g))
    >>> p
             Alice  Bob  Charlie  David
    Alice        0    1        1      1
    Bob          1    0        2      2
    Charlie      1    2        0      2
    David        1    2        2      0
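One standard way to get these numbers for an unweighted graph is breadth-first search: visit the start node's neighbors, then their neighbors, and so on, recording how many edges each node is from the start. The sketch below is plain Python and is offered as an illustration of the idea rather than as networkx's code; the function name shortest_path_lengths and the neighbors dictionary are illustrative.

```python
from collections import deque


def shortest_path_lengths(neighbors, source):
    """Breadth-first search: edges needed to reach each node from `source`."""
    dist = {source: 0}          # 0 edges to get from a node to itself
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:             # first visit is the shortest route
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Our little graph: Alice is linked to everyone else.
neighbors = {
    "Alice": ["Bob", "Charlie", "David"],
    "Bob": ["Alice"],
    "Charlie": ["Alice"],
    "David": ["Alice"],
}

all_lengths = {node: shortest_path_lengths(neighbors, node) for node in neighbors}
print(all_lengths["Bob"])
# {'Bob': 0, 'Alice': 1, 'Charlie': 2, 'David': 2}
```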
Another stat: Average shortest path length
networkx will calculate the average over all the shortest path lengths for you:
    # Get the average path length
    print(networkx.average_shortest_path_length(g))
    1.5
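The 1.5 can be checked by hand from the table of path lengths for our little graph. The sketch below hard-codes that table as a dictionary and averages over all ordered pairs of distinct nodes, i.e., it divides the total by n(n - 1). The variable names are illustrative.

```python
# Shortest path lengths for our 4-node graph, written out as a dictionary.
lengths = {
    "Alice":   {"Alice": 0, "Bob": 1, "Charlie": 1, "David": 1},
    "Bob":     {"Alice": 1, "Bob": 0, "Charlie": 2, "David": 2},
    "Charlie": {"Alice": 1, "Bob": 2, "Charlie": 0, "David": 2},
    "David":   {"Alice": 1, "Bob": 2, "Charlie": 2, "David": 0},
}

n = len(lengths)

# Add up the lengths between distinct nodes (skip the 0s on the diagonal).
total = sum(
    d
    for src, row in lengths.items()
    for dst, d in row.items()
    if src != dst
)

# There are n * (n - 1) ordered pairs of distinct nodes.
average = total / (n * (n - 1))
print(average)  # 1.5
```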
Our 4-node graph is kinda dull
Point is to apply these sorts of techniques to, e.g., graphs of various types of social networks with thousands to 1 billion+ nodes
Our example data (real data):
    nodes = twitter users
    edge = follows relationship (could be directed; could ignore direction)
    ~40,000 pairs of follower, followee
(This particular bit of twitter formed by a technique called snowball sampling, starting at Computational Legal Analytics)
Large networks
Stored as text files
One line for each link, with the line containing the names (string or number) of the nodes
Notice that if we know all the links then we know what the nodes are
Both comma and space are common delimiters between the two nodes of an edge in large network work
Both are, broadly speaking, CSV
We'll use pandas to read these in
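To make the file format concrete, here is a small sketch that parses both delimiter styles and recovers the nodes from the links. It uses Python's standard csv module rather than pandas, purely to keep the example self-contained; the sample text, and the names read_links and nodes_from_links, are made up for illustration and are not the course's Twitter data.

```python
import csv
import io

# Hypothetical sample contents of two tiny edge-list files.
comma_text = "alice,bob\nalice,charlie\n"
space_text = "1 2\n2 3\n"


def read_links(text, delimiter):
    """Parse one link per line; each line holds the two node names."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [tuple(row) for row in reader if row]


def nodes_from_links(links):
    """If we know all the links, collecting their endpoints gives the nodes."""
    return {node for link in links for node in link}


comma_links = read_links(comma_text, ",")
space_links = read_links(space_text, " ")

print(comma_links)                            # [('alice', 'bob'), ('alice', 'charlie')]
print(sorted(nodes_from_links(space_links)))  # ['1', '2', '3']
```

Node names read this way are strings, even when they look like numbers.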