Democratizing Content Publication with Coral

Transcription:

Democratizing Content Publication with Coral
Mike Freedman, Eric Freudenthal, David Mazières
New York University
NSDI 2004

A problem
Feb 3: Google linked a banner to a page of Julia fractals. Users who clicked were directed to an Australian university web site; the university's network link was overloaded, and the web server was taken down temporarily.

The problem strikes again!
Feb 4: Slashdot ran the story about Google. The site was taken down temporarily, again.

The response from down under
Feb 4, later: Paul Bourke asks: "They have hundreds (thousands?) of servers worldwide that distribute their traffic load. If even a small percentage of that traffic is directed to a single server, what chance does it have?" Help the little guy.

Existing approaches
- Client-side proxying: Squid, Summary Cache, hierarchical caches, CoDeeN, Squirrel, Backslash, PROOFS, ... Problem: not 100% coverage.
- Throw money at the problem: load-balanced servers, fast network connections. Problem: can't afford them, or don't anticipate the need.
- Content Distribution Networks (CDNs): Akamai, Digital Island, Mirror Image. Centrally managed, and need to recoup costs.

Coral's solution
Pool resources to dissipate flash crowds by implementing an open CDN:
- Anybody can contribute.
- Works with unmodified clients.
- The CDN fetches only once from the origin server.

Coral's solution (continued)
- Strong locality without a priori knowledge.
- No hotspots in the CDN.
- Should all work automatically, with nobody in charge.

Contributions
- Self-organizing clusters of nodes: NYU and Columbia prefer one another to Germany.
- Rate-limiting mechanism: everybody caching and fetching the same URL does not overload any node in the system.
- Decentralized DNS redirection: works with unmodified clients.
No centralized management or a priori knowledge of proxies' locations or network configurations.

Using CoralCDN
Rewrite URLs into Coralized URLs: www.x.com becomes www.x.com.nyud.net:8090. This directs clients to Coral, which absorbs the load. Who might Coralize URLs?
- Web server operators Coralize their own URLs.
- Coralized URLs are posted to portals and mailing lists.
- Users explicitly Coralize URLs.
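The URL rewriting above is purely mechanical; a minimal sketch, using only the www.x.com → www.x.com.nyud.net:8090 mapping shown on the slide (the function name and port handling here are illustrative):

```python
# Sketch: rewrite a URL into its Coralized form by appending the
# Coral DNS suffix and proxy port to the hostname.
from urllib.parse import urlsplit, urlunsplit

def coralize(url: str) -> str:
    """Rewrite a URL's host so requests are routed through Coral."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    coral_netloc = f"{host}.nyud.net:8090"
    return urlunsplit((parts.scheme, coral_netloc, parts.path,
                       parts.query, parts.fragment))

print(coralize("http://www.x.com/page.html"))
# -> http://www.x.com.nyud.net:8090/page.html
```

Because the rewrite happens entirely in the URL, any unmodified browser that follows a Coralized link is served through Coral.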

CoralCDN components
(Diagram: the browser's resolver looks up www.x.com.nyud.net via Coral's DNS redirection, which returns a proxy near the client, e.g., 216.165.108.10. Cooperative web-caching proxies fetch data from nearby proxies, going to the origin server only when necessary.)

Functionality needed
- DNS: given the network location of the resolver, return a proxy near the client.
  put(network info, self); get(resolver info) returns {proxies}
- HTTP: given a URL, find a proxy caching the object, preferably one nearby.
  put(URL, self); get(URL) returns {proxies}
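Both uses share one put/get interface. A toy sketch against an in-memory index standing in for Coral's distributed one (the key formats and proxy names here are illustrative, not Coral's actual encoding):

```python
# Toy in-memory index standing in for Coral's put/get.
index: dict = {}

def put(key: str, value: str) -> None:
    """Advertise a value (a proxy) under a key."""
    index.setdefault(key, set()).add(value)

def get(key: str) -> set:
    """Retrieve the proxies advertised under a key."""
    return index.get(key, set())

# DNS use: each proxy advertises itself under nearby networks...
put("net:192.168.1.0/24", "proxy-A")
# ...so a resolver in that network can find a nearby proxy.
assert "proxy-A" in get("net:192.168.1.0/24")

# HTTP use: a proxy that cached a URL advertises itself under it.
put("http://www.x.com/page.html", "proxy-B")
assert "proxy-B" in get("http://www.x.com/page.html")
```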

Use a DHT?
A DHT supports the put/get interface using key-based routing. But there are problems with using DHTs as given: lookup latency, transfer latency, and hotspots. (Diagram: nodes in NYC, e.g., NYU and Columbia, may be forced to route lookups and transfers via distant nodes in Japan or Germany.)

Coral's distributed index
Insight: we don't need hash-table semantics, just one well-located proxy.
- put(key, value, ttl): avoid hotspots.
- get(key): retrieves some subset of the values put under the key, preferring values put by nodes near the requestor.
Hierarchical clustering groups nearby nodes and exposes the hierarchy to applications. A rate-limiting mechanism distributes puts.
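A minimal sketch of these "sloppy" semantics, assuming an illustrative cap on how many values a get returns (Coral's real index is distributed and proximity-aware; this only captures TTL expiry and subset retrieval):

```python
import time

MAX_VALUES_RETURNED = 4  # get need not return everything under a key

class SloppyIndex:
    """Sloppy put/get: values expire, and get returns only a subset."""

    def __init__(self):
        self.store = {}  # key -> list of (expiry_time, value)

    def put(self, key, value, ttl):
        """Store a value under a key with a time-to-live in seconds."""
        self.store.setdefault(key, []).append((time.time() + ttl, value))

    def get(self, key):
        """Return some still-live values under the key, not all of them."""
        now = time.time()
        live = [v for exp, v in self.store.get(key, []) if exp > now]
        # One good proxy suffices, so cap the result set.
        return live[-MAX_VALUES_RETURNED:]
```

In Coral itself, get additionally prefers values put by nearby nodes, via the cluster hierarchy described next.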

Key-based XOR routing
Keys live in an ID space (000 ... 111), and the distance between two IDs is their XOR. Coral routes within clusters bounded by RTT thresholds: none (global), < 60 ms, and < 20 ms. Preferring values stored by nodes within faster clusters minimizes lookup latency.
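The XOR metric above is the same one Kademlia uses; a small sketch with the 3-bit IDs from the slide:

```python
# XOR-based key distance: the "closer" node is the one whose ID
# XORs with the key to a smaller value.
def xor_distance(a: int, b: int) -> int:
    return a ^ b

def closest_node(node_ids, key):
    """Pick the node whose ID is XOR-closest to the key."""
    return min(node_ids, key=lambda n: xor_distance(n, key))

# With 3-bit IDs: 0b011 is closer to key 0b010 than 0b101 or 0b110,
# since 0b011 ^ 0b010 = 1 while the others give 7 and 4.
assert closest_node([0b011, 0b101, 0b110], 0b010) == 0b011
```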

Preventing insertion hotspots
Store the value once in each level's cluster. Always storing at the node closest to the key causes a hotspot: that node (e.g., NYU) receives O(log n) requests/min. Instead, halt put routing at a node that is both full and loaded:
- Full: the node already stores M values for this key with TTLs greater than ½ the insertion TTL.
- Loaded: puts for this key have traversed the node in the past minute.
Store at the furthest, non-full node seen.
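A sketch of the full-and-loaded stopping rule; the M-values and one-minute-window conditions come from the slide, while the specific value M = 4 and the data layout are illustrative assumptions:

```python
import time

M = 4  # illustrative: max fresh values per key before a node is "full"

def is_full(stored, insertion_ttl, now):
    """Full: already holds M values for this key whose remaining TTL
    exceeds half the TTL of the incoming put."""
    fresh = [v for exp, v in stored if exp - now > insertion_ttl / 2]
    return len(fresh) >= M

def is_loaded(recent_put_times, now):
    """Loaded: puts for this key traversed this node in the past minute."""
    return any(now - t < 60 for t in recent_put_times)

def should_halt_put(stored, recent_put_times, insertion_ttl):
    """Halt put routing here only when the node is both full and loaded;
    the put is then stored at the furthest non-full node seen."""
    now = time.time()
    return is_full(stored, insertion_ttl, now) and is_loaded(recent_put_times, now)
```

Requiring both conditions means an idle node near the key still accepts puts, while a node under put pressure deflects them outward, spreading the load across the cluster.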

Challenges for DNS redirection
Coral lacks central management, a priori knowledge of network topology (anybody can join the system), and any special tools (e.g., BGP feeds). Coral has a large number of vantage points from which to probe the topology, and a distributed index in which to store network hints: each node maps nearby networks to itself.

Coral's DNS redirection
The Coral DNS server probes the resolver. Once local, stay local: when serving requests from a nearby DNS resolver, respond with nearby proxies and nearby DNS servers, ensuring future requests remain local. Otherwise, help the resolver find a local DNS server.

DNS measurement mechanism
The server probes the client's resolver (2 RTTs), then returns servers within the appropriate cluster; e.g., for a resolver RTT of 19 ms, return servers from the < 20 ms cluster. Use network hints to find nearby servers (i.e., client and server on the same subnet); otherwise, take a random walk within the cluster.
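Mapping a measured RTT to a cluster can be sketched with the thresholds from the slides (< 20 ms, < 60 ms, then the global cluster); the indexing convention (0 = fastest) is illustrative, not Coral's own:

```python
# Cluster RTT bounds, fastest first: < 20 ms, < 60 ms, unbounded (global).
CLUSTER_THRESHOLDS_MS = [20.0, 60.0, float("inf")]

def cluster_for_rtt(rtt_ms: float) -> int:
    """Return the index of the fastest cluster whose RTT bound
    covers this resolver."""
    for level, limit in enumerate(CLUSTER_THRESHOLDS_MS):
        if rtt_ms < limit:
            return level
    return len(CLUSTER_THRESHOLDS_MS) - 1

print(cluster_for_rtt(19))  # a 19 ms resolver falls in the < 20 ms cluster
```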

Experimental results
Consider requests to the Australian web site:
- Does Coral absorb flash crowds?
- Does clustering help latency?
- Does Coral form sensible clusters?
- Does Coral prevent hotspots?
Experimental setup: 166 PlanetLab hosts, with a Coral node and a client on each. Twelve 41-KB files served from a 384 Kb/sec (DSL) web server. Each client issues 0.6 reqs/sec, for 32,800 Kb/sec of aggregate demand.

Coral solves the flash-crowd problem
(Plot: hits within the < 20 ms cluster; local caches begin to handle most requests, and hits to the origin web server fall off.)

Coral benefits end-to-end client latency
(Plot of client-perceived latency.)

Coral finds natural clusters
(Map: nodes sharing a letter are in the same < 60 ms cluster; the size of the letter indicates the number of collocated nodes in that cluster.)

Coral prevents put hotspots
With 494 nodes, the aggregate put/get rate was ~12 million/min, against a per-node rate limit of 12/min. The node closest to the key saw only 83 RPCs/min, with puts leaked through to 7 other nodes.

Conclusions
Coral's indexing infrastructure provides a non-standard P2P storage abstraction: it stores network hints, forms clusters, exposes the hierarchy and hints to applications, and prevents hotspots. We use it to build a fully decentralized CDN that solves the Slashdot effect: popular data is widely replicated and highly available. Coral democratizes content publication.

For more information www.scs.cs.nyu.edu/coral www.scs.cs.nyu.edu.nyud.net:8090/coral