Archipelago Measurement Infrastructure: Updates and Analyses
Young Hyun, CAIDA
ISMA 2009 AIMS Workshop, Feb 12, 2009
2 Outline
- Focus and Architecture
- Monitor Deployment
- Measurements
- Future Work
3 Introduction
- Archipelago (Ark) is CAIDA's next-generation active measurement infrastructure
- evolution of the skitter infrastructure
- in production since Sep 12, 2007
4 Focus
- easy development and rapid prototyping
  - lower barriers => implement better measurements faster and at lower cost
  - measurement infrastructures notoriously lack funding
  - raise the level of abstraction with a high-level API and scripting language
  - inspiration from Scriptroute, Metasploit, Scapy, Racket
5 Focus
- dynamic and coordinated measurements
  - take advantage of multiple distributed measurement nodes in sophisticated ways
  - one measurement triggers another measurement
  - use multiple nodes to divide and conquer
  - synchronize measurements
  - for example: Doubletree; tomography; Rocketfuel-like targeted discovery of a single network's topology
6 Focus
- measurement services
  - build upon the work of others; share services between measurement activities
  - for example: on-demand traceroute/ping service; IP-to-AS mapping service
  - similar in goal to service-oriented architecture (SOA), but at finer granularity and without the complexity
7 Architecture
- Ark is composed of measurement nodes (machines) located in various networks worldwide
  - many thanks to the organizations hosting Ark boxes
  - please contact us if you want to host an Ark box
- Ark employs a tuple space to enable communication and coordination
  - a tuple space is a distributed shared memory combined with a small number of easy-to-use operations
  - a tuple space stores tuples, which are arrays of simple values (strings and numbers), and clients retrieve tuples by pattern matching
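
The tuple space idea can be illustrated with a toy, single-process version. This is only a sketch of the general concept (store tuples, retrieve by pattern); the class name, the method names `write`/`read`/`take`, and the use of `None` as a wildcard are assumptions made for illustration and are not Ark's actual API.

```python
import threading


class TupleSpace:
    """Toy in-memory tuple space (illustrative only; not Ark's API)."""

    def __init__(self):
        self._tuples = []
        self._lock = threading.Lock()

    def write(self, tup):
        """Store a tuple of simple values (strings and numbers)."""
        with self._lock:
            self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern is a wildcard; other fields must be equal.
        return len(pattern) == len(tup) and all(
            p is None or p == v for p, v in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        with self._lock:
            for tup in self._tuples:
                if self._matches(pattern, tup):
                    return tup
        return None

    def take(self, pattern):
        """Remove and return the first matching tuple, or None."""
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if self._matches(pattern, tup):
                    return self._tuples.pop(i)
        return None


space = TupleSpace()
space.write(("traceroute-request", "192.0.2.1"))
space.write(("ping-request", "198.51.100.7"))

# A worker retrieves any pending traceroute request by pattern.
request = space.take(("traceroute-request", None))
```

Because producers and consumers only agree on the shape of the tuples, neither side needs to know which node wrote or will read a given tuple.
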
8 Architecture
- use tuple space for decentralized (that is, peer-to-peer) communication, interaction, and coordination
[Figure: diagram showing monitor1 through monitor5 and a central server]
9 Monitor Deployment
- 33 monitors in 22 countries

Continent         Monitors
North America     12
South America     2
Europe            11
Africa            1
Asia              5
Oceania           2

Organization              Monitors
academic                  19
research network          9
network infrastructure    2
commercial network        1
community network         1
military research         1
10 Measurements
- IPv4 Routed /24 Topology
- IPv4 Routed /24 AS Links
- IPv6 Topology
- DNS Names
- DNS Query/Response Traffic
- Spoofer Project Collaboration
11 IPv4 Routed /24 Topology
- ongoing large-scale topology measurements
- ICMP Paris traceroute to every routed /24 (7.4 million)
  - running scamper, written by Matthew Luckie of WAND, University of Waikato
- group monitors into teams and dynamically divide up the measurement work among team members
  - a 13-member team probes every /24 in 48 hrs at 100 pps
  - only one monitor probes each /24 per cycle
  - 3 teams active
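
A rough sketch of the team idea: every /24 is handed to exactly one team member per cycle. The round-robin queue below is a stand-in chosen for illustration; how Ark actually coordinates the division is not described here, and the function names are hypothetical. The second function works out the average packet budget per /24, assuming the 100 pps figure applies to each monitor (the slide does not say whether it is per monitor or per team).

```python
from collections import deque


def run_cycle(prefixes, monitors):
    """Hand out each /24 to exactly one monitor.

    Hypothetical sketch: monitors take turns pulling the next prefix from a
    shared queue. In a real deployment, faster monitors could pull more often.
    """
    queue = deque(prefixes)
    assignment = {m: [] for m in monitors}
    while queue:
        for m in monitors:
            if not queue:
                break
            assignment[m].append(queue.popleft())
    return assignment


def packets_per_target(num_targets, team_size, cycle_hours, pps):
    """Average probe-packet budget per target for one cycle (pps per monitor)."""
    total_packets = team_size * pps * cycle_hours * 3600
    return total_packets / num_targets


prefixes = [f"192.0.{i}.0/24" for i in range(10)]
team = ["monitor1", "monitor2", "monitor3"]
assignment = run_cycle(prefixes, team)

# With the slide's figures: 7.4 million /24s, 13 monitors, 48 hours, 100 pps.
budget = packets_per_target(7_400_000, 13, 48, 100)
```

Under that per-monitor assumption, the budget comes to roughly 30 packets per /24 per cycle.
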
12 IPv4 Routed /24 Topology
[Figure: plot over time, x-axis Sep 07 to Mar 09, y-axis 0 to 30]
Legend: nap-it (3), eug-us (3), dfw-us (3), pna-es (3), her-gr (3), scl-cl (3), she-cn (3), yow-ca (2), tpe-tw (2), ams-nl (2), bwi-us (2), zrh-ch (2), sjc-us (2), gig-br (2), cmn-ma (2), hnl-us (2), lax-us (2), cbg-uk (2*), vie-at (2), iad-us (2*), yto-ca (1*), bcn-es (1*), hlz-nz (1), lej-de (1), laf-us (1*), syd-au (1*), san-us (1*), nrt-jp (1*), mnl-ph (1*), hel-fi (1*), dub-ie (1), cjj-kr (1*), amw-us (1)
Annotations added to the plot on slides 13 to 15: software failure; hardware failure; power supply died; replacement power supply died
- Sep 2007 to Jan 2009 (17 months): 2.5 billion traceroutes; 1.0 TB data
16 IPv4 Routed /24 AS Links
- AS links from Routed /24 Topology traces
- map IP addresses to ASes with RouteViews BGP table
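
As a minimal sketch of the mapping step, assume the BGP table has been reduced to a prefix-to-origin-AS dictionary. The table below is made up (documentation prefixes and private-use AS numbers), and the helper names `ip_to_as` and `as_links` are hypothetical. Real traces also need decisions about unresponsive hops and unrouted addresses that this sketch handles only by skipping them.

```python
import ipaddress

# Hypothetical prefix-to-origin-AS table (documentation prefixes, private ASNs).
BGP_TABLE = {
    "192.0.2.0/24": 64512,
    "198.51.100.0/24": 64513,
    "198.51.100.128/25": 64514,
    "203.0.113.0/24": 64515,
}
NETWORKS = [(ipaddress.ip_network(p), asn) for p, asn in BGP_TABLE.items()]


def ip_to_as(addr):
    """Longest-prefix match of addr against the table; None if unrouted."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, asn in NETWORKS:
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None


def as_links(trace):
    """AS links implied by consecutive hops of one traceroute path."""
    links = set()
    ases = [ip_to_as(hop) for hop in trace]
    for a, b in zip(ases, ases[1:]):
        if a is not None and b is not None and a != b:
            links.add((min(a, b), max(a, b)))
    return links


trace = ["192.0.2.1", "192.0.2.254", "198.51.100.1", "198.51.100.200", "203.0.113.9"]
links = as_links(trace)
```

The linear scan keeps the example short; a production mapper would normally use a radix tree or similar structure for the lookup.
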
17 IPv4 Routed /24 AS Links
- statistics for 1 month of AS links from three sources (Dec 2008):

                          Ark      DIMES    RouteViews (rv2)
nodes                     23,425   22,995   30,760
links                     56,760   74,140   65,775
max degree                2,509    3,590    2,328
average degree            4.85     6.45     4.28
average neighbor degree   467.3    705.4    487.2
mean clustering           0.354    0.446    0.241

- avg neighbor deg = avg neighbor degree of the avg k-degree node, averaged over all k
- mean clustering = (avg number of links between neighbors of k-degree nodes) / (max possible such links for k), averaged over all k
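
A sketch of how these metrics can be computed from a set of undirected links. Two assumptions are made here: "averaged over all k" is read as weighting each degree class by its number of nodes (which equals a plain average over nodes), and nodes with fewer than two neighbors contribute a clustering of 0. The analysis behind the table may treat either point differently.

```python
from collections import defaultdict
from itertools import combinations


def graph_metrics(links):
    """Summary metrics for an undirected graph given as (u, v) links."""
    nbrs = defaultdict(set)
    for u, v in links:
        nbrs[u].add(v)
        nbrs[v].add(u)
    n = len(nbrs)
    degree = {node: len(ns) for node, ns in nbrs.items()}
    num_links = sum(degree.values()) // 2

    # Per-node average neighbor degree and clustering coefficient.
    knn, clust = {}, {}
    for node, ns in nbrs.items():
        k = len(ns)
        knn[node] = sum(degree[x] for x in ns) / k
        if k < 2:
            clust[node] = 0.0  # assumption: fewer than 2 neighbors counts as 0
        else:
            closed = sum(1 for a, b in combinations(ns, 2) if b in nbrs[a])
            clust[node] = closed / (k * (k - 1) / 2)

    return {
        "nodes": n,
        "links": num_links,
        "max degree": max(degree.values()),
        "average degree": 2 * num_links / n,
        "average neighbor degree": sum(knn.values()) / n,
        "mean clustering": sum(clust.values()) / n,
    }


# Toy graph: a triangle (1, 2, 3) with one pendant node (4) attached to 3.
metrics = graph_metrics({(1, 2), (2, 3), (1, 3), (3, 4)})

# Average degree is 2 * links / nodes, which can be checked against the table.
table = {"Ark": (23425, 56760), "DIMES": (22995, 74140), "RouteViews (rv2)": (30760, 65775)}
avg_degree = {name: round(2 * l / n, 2) for name, (n, l) in table.items()}
```

The last two lines reproduce the table's average degree row (4.85, 6.45, 4.28) from its node and link counts.
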
18 3 AS Links Sources: 1 Month
[Figure: CCDF of node degree, log-log axes, for DIMES, Ark, and RouteViews (rv2) AS links (2008-12)]
19 3 AS Links Sources: 1 Month
[Figure: average neighbor degree vs. node degree, log-log axes, for DIMES, Ark, and RouteViews (rv2) AS links (2008-12)]
20 3 AS Links Sources: 1 Month
[Figure: clustering vs. node degree, log-log axes, for DIMES, Ark, and RouteViews (rv2) AS links (2008-12)]
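
For readers unfamiliar with the degree CCDF used in these comparisons, here is a minimal sketch of how one can be computed from a list of node degrees. It assumes the convention P(degree >= k); some analyses use P(degree > k) instead, and the plots do not state which was used.

```python
from collections import Counter


def degree_ccdf(degrees):
    """Return [(k, fraction of nodes with degree >= k)] for each observed k."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf = []
    remaining = n
    for k in sorted(counts):
        ccdf.append((k, remaining / n))
        remaining -= counts[k]
    return ccdf


ccdf = degree_ccdf([1, 1, 1, 2, 2, 3, 8])
```

Plotting the resulting pairs on log-log axes gives a curve of the kind shown in the CCDF figures.
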
21 AS Links Growth
- AS links seem to accumulate linearly without bound in skitter, Ark, DIMES; possibly in BGP
  - even with fixed traceroute sources and destination list (which happened with skitter for 4 years)
- AS graph densification: average degree increases
- for example:
  - 1 year of Ark (2008): 104k AS links, 28k ASes
  - 2 years of DIMES: 356k AS links, 29k ASes
  - 7.5 years of skitter: 209k AS links, 27k ASes
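
The densification effect can be seen with simple arithmetic: average degree is 2 * links / nodes, so if links keep accumulating while the number of ASes stays nearly flat, average degree must rise. The sketch below accumulates made-up monthly link sets and then applies the same formula to the rounded aggregate figures above; the function name and the toy data are illustrative only.

```python
def accumulate(monthly_links):
    """Cumulative (nodes, links, average degree) after each month."""
    seen = set()
    history = []
    for links in monthly_links:
        seen |= {tuple(sorted(link)) for link in links}
        nodes = {a for link in seen for a in link}
        history.append((len(nodes), len(seen), 2 * len(seen) / len(nodes)))
    return history


# Made-up monthly AS link sets: links keep accumulating faster than nodes.
months = [
    {(1, 2), (2, 3)},
    {(1, 2), (1, 3)},
    {(2, 3), (3, 4), (1, 4)},
]
history = accumulate(months)

# Average degree implied by the rounded aggregate figures (links, ASes).
aggregates = {
    "Ark": (104_000, 28_000),
    "DIMES": (356_000, 29_000),
    "skitter": (209_000, 27_000),
}
implied = {name: round(2 * l / n, 1) for name, (l, n) in aggregates.items()}
```

With the rounded inputs, the implied average degrees are about 7.4 (Ark), 24.6 (DIMES), and 15.5 (skitter), compared with 4.85 and 6.45 for one month of Ark and DIMES.
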
22 AS Links Growth
- hard to determine the natural time period to aggregate AS links
  - 1 month? 6 months? years?
  - when do we get a representative AS graph?
23 AS Links Growth
- hard to compare different infrastructures
  - you can always make the AS graph bigger by aggregating
  - in fact, got spam on this...
24 Ark AS Links Growth
[Figure: number of nodes (axis 23,000 to 29,000) and number of links (axis 55,000 to 105,000) vs. months of accumulation (1 to 12)]
25 Ark AS Links Growth
[Figure: max degree (axis 2,250 to 4,500) and average degree (axis 4.5 to 7.5) vs. months of accumulation (1 to 12)]
26 Ark AS Links Growth
[Figure: average neighbor degree (axis 450 to 850) and clustering (axis 0.34 to 0.54) vs. months of accumulation (1 to 12)]
27 Ark AS Links: 1, 6, 12 Months
[Figure: CCDF of node degree, log-log axes, for Ark AS links accumulated over 2008 months 1 to 12, 7 to 12, and 12 to 12]
28 Ark AS Links: 1, 6, 12 Months
[Figure: average neighbor degree vs. node degree, log-log axes, for Ark AS links accumulated over 2008 months 1 to 12, 7 to 12, and 12 to 12]
29 Ark AS Links: 1, 6, 12 Months
[Figure: clustering vs. node degree, log-log axes, for Ark AS links accumulated over 2008 months 1 to 12, 7 to 12, and 12 to 12]
30 Ark IPv6 Topology
- ongoing large-scale IPv6 measurements since Dec 12, 2008
- 6 monitors: 3 in US, 3 in Europe
  - 2 IPv6 boxes down
  - 3 more IPv6 boxes coming Real Soon Now
- ICMP Paris traceroute to every routed prefix
  - each monitor probes a random destination in every routed prefix in every cycle; 1,553 prefixes <= /48
  - reduced probing rate to take 2 days per cycle
  - running scamper
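
Choosing a random destination inside a routed prefix can be sketched with Python's standard `ipaddress` module. The uniform draw and the function name `random_destination` are assumptions for illustration; how destinations are actually drawn is not specified here. The prefixes are from the IPv6 documentation range.

```python
import ipaddress
import random


def random_destination(prefix, rng=random):
    """Pick a uniformly random address inside an IPv6 (or IPv4) prefix."""
    net = ipaddress.ip_network(prefix)
    offset = rng.randrange(net.num_addresses)
    return net.network_address + offset


rng = random.Random(2009)  # seeded only so the example is repeatable
prefixes = ["2001:db8::/32", "2001:db8:1234::/48"]
targets = {p: random_destination(p, rng) for p in prefixes}
```

Each monitor would repeat the draw every cycle, so over time different addresses in the same prefix get probed.
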
31 Ark IPv6 Topology
- statistics for 8 weeks of AS links from six sources: Dec 12, 2008 to Feb 7, 2009

                          IPv6 (8 weeks)   IPv4 (4 weeks)
nodes                     520              23,425
links                     1,181            56,760
max degree                94               2,509
average degree            4.54             4.85
average neighbor degree   36.3             467.3
mean clustering           0.265            0.354
32 Ark IPv6 AS Links
[Figure: CCDF of node degree, log-log axes, for Ark IPv4 AS links (2008-12, 4 weeks) and Ark IPv6 AS links (2008-12, 8 weeks)]
33 Ark IPv6 AS Links
[Figure: average neighbor degree vs. node degree, log-log axes, for Ark IPv4 AS links (2008-12, 4 weeks) and Ark IPv6 AS links (2008-12, 8 weeks)]
34 Ark IPv6 AS Links
[Figure: clustering vs. node degree, log-log axes, for Ark IPv4 AS links (2008-12, 4 weeks) and Ark IPv6 AS links (2008-12, 8 weeks)]
35 DNS Names
- automated ongoing DNS lookup of IP addresses seen in the Routed /24 Topology traces
  - all intermediate addresses and responding destinations
- using our in-house bulk DNS lookup service (HostDB)
  - can look up millions of addresses per day
- 213M lookups since March 2008
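
The first step of such a pipeline, collecting each address once and forming its reverse-DNS (PTR) query name, can be sketched as follows. This says nothing about how HostDB works internally; the function name and the trace representation (a list of hop addresses with `None` for unresponsive hops) are assumptions, and no DNS queries are sent.

```python
import ipaddress


def ptr_names(traces):
    """Map each unique address seen in the traces to its PTR query name."""
    seen = {}
    for trace in traces:
        for hop in trace:
            if hop is None:  # unresponsive hop
                continue
            if hop not in seen:
                seen[hop] = ipaddress.ip_address(hop).reverse_pointer
    return seen


traces = [
    ["192.0.2.1", None, "198.51.100.7"],
    ["192.0.2.1", "203.0.113.9"],
]
names = ptr_names(traces)
```

Deduplicating before lookup matters at this scale, since the same intermediate addresses appear in many traces.
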
36 DNS Traffic
- tcpdump capture of DNS query/response traffic
  - only for lookups of Routed /24 Topology addresses
- continuous collection of 3-5M packets per day
  - can download most recent 30 days of pcap files
- a broad sampling of the nameservers on the Internet
  - due to the broad coverage of the routed space in traces
  - how many nameservers have IPv6 glue records? DNSSEC records? support EDNS? typical TTLs?
37 Alias Resolution
- Goal: collapse interfaces observed in traceroute paths into routers
  - toward a router-level map of the Internet
- alias resolution work led by Ken Keys
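
Once some technique has produced pairs of interfaces believed to belong to the same router, the collapsing step itself is a grouping problem. The sketch below uses a union-find structure for that grouping; it is a generic illustration, not the method used in the alias resolution work, and the alias pairs are made up.

```python
def collapse_interfaces(interfaces, alias_pairs):
    """Group interface addresses into routers given pairwise alias evidence."""
    parent = {i: i for i in interfaces}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in alias_pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    routers = {}
    for i in parent:
        routers.setdefault(find(i), set()).add(i)
    return sorted(sorted(r) for r in routers.values())


interfaces = ["192.0.2.1", "192.0.2.5", "198.51.100.1", "203.0.113.9"]
pairs = [("192.0.2.1", "198.51.100.1"), ("198.51.100.1", "192.0.2.5")]
routers = collapse_interfaces(interfaces, pairs)
```

Interfaces with no alias evidence remain as single-interface routers, which is why coverage of the pairwise tests largely determines the quality of the router-level map.
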
38 Spoofer Project
- collaboration with Rob Beverly on MIT Spoofer Project
  - how many networks allow packets with spoofed IP addresses to leave their network?
- Ark monitors act as targets for spoofed probes sent by willing participants
  - monitors forward received probe data to the MIT server
39 Spoofer Project
[Figure: diagram with labels: UDP port 53; three monitors; tuple space; CAIDA; MIT]
40 Ark Statistics Pages
- per-monitor analysis of IPv4 topology data
  - RTT, path length, RTT vs. distance
- www.caida.org/projects/ark/statistics
41 Future Work
- release Marinda tuple space under GPL
- implement large-scale RadarGun measurements
- more in-depth analysis of data for stats pages
- investigate AS link densification
- DNS open resolver surveys?
- high-level packet generation, capture, and analysis API
- allow semi-trusted 3rd parties to conduct measurements
42 Thanks! For more information and to request data: www.caida.org/projects/ark