Handling Flash Crowds from your Garage

Similar documents
Handling Flash Crowds from your Garage

From Internet Data Centers to Data Centers in the Cloud

Large-Scale Web Applications

Middle East Technical University. Jeren AKHOUNDI ( ) Ipek Deniz Demirtel ( ) Derya Nur Ulus ( ) CENG553 Database Management Systems

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo

Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04

Zero to Microservices in 5 minutes using Docker Containers. Mathew Lodge Weaveworks

Cloud Computing. What is cloud computing. CS 537 Fall 2017

CIT 668: System Architecture. Amazon Web Services

CLOUD COMPUTING It's about the data. Dr. Jim Baty Distinguished Engineer Chief Architect, VP / CTO Global Sales & Services, Sun Microsystems

Alibaba Cloud CDN. FAQs

Traffic is coming! OMG moments

ARCHITECTING WEB APPLICATIONS FOR THE CLOUD: DESIGN PRINCIPLES AND PRACTICAL GUIDANCE FOR AWS

Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL

DISTRIBUTED SYSTEMS [COMP9243] Lecture 8a: Cloud Computing WHAT IS CLOUD COMPUTING? 2. Slide 3. Slide 1. Why is it called Cloud?

Yahoo Traffic Server -a Powerful Cloud Gatekeeper

ECE Enterprise Storage Architecture. Fall ~* CLOUD *~. Tyler Bletsch Duke University

what is cloud computing?

Next-Generation Cloud Platform

Running at 99%: Surviving an Application DoS Ryan

Demystifying the Cloud With a Look at Hybrid Hosting and OpenStack

FAST, FLEXIBLE, RELIABLE SEAMLESSLY ROUTING AND SECURING BILLIONS OF REQUESTS PER MONTH

To Kill a Monolith: Slaying the Demons of a Monolith with Node.js Microservices on CloudFoundry. Tony Erwin,

Advanced Computer Networks Exercise Session 7. Qin Yin Spring Semester 2013


jetnexus Virtual Load Balancer

F5 Synthesis Information Session. April, 2014

How can you implement this through a script that a scheduling daemon runs daily on the application servers?

Send me up to 5 good questions in your opinion, I ll use top ones Via direct message at slack. Can be a group effort. Try to add some explanation.

jetnexus Virtual Load Balancer

Apica ZebraTester. Advanced Load Testing Tool and Cloud Platform

Department of Computer Science Institute for System Architecture, Chair for Computer Networks. Caching, Content Distribution and Load Balancing

Scaling DreamFactory

Chapter 6: Distributed Systems: The Web. Fall 2012 Sini Ruohomaa Slides joint work with Jussi Kangasharju et al.

Lecture 8: Internet and Online Services. CS 598: Advanced Internetworking Matthew Caesar March 3, 2011

NoSQL Databases MongoDB vs Cassandra. Kenny Huynh, Andre Chik, Kevin Vu

DIGGING INTO THE SOFTWARE DEFINED DATA CENTER

CLOUD-BASED DDOS PROTECTION FOR HOSTING PROVIDERS

Enterprise Overview. Benefits and features of Cloudflare s Enterprise plan FLARE

Big Data in OpenStack Storage

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla

Architekturen für die Cloud

Deployment. Chris Wilson, AfNOG / 26

Data Centers and Cloud Computing

Data Centers and Cloud Computing. Slides courtesy of Tim Wood

Drupal Frontend Performance & Scalability

CS 61C: Great Ideas in Computer Architecture. MapReduce

GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures

Cloud Computing. Technologies and Types

CSE/EE 461 HTTP and the Web

Data Centers and Cloud Computing. Data Centers

The DNS of Things. A. 2001:19b8:10 1:2::f5f5:1d Q. WHERE IS Peter Silva Sr. Technical Marketing

TechNote AltitudeCDN Multicast+ and OmniCache Support for Citrix

Scalability of web applications

Lesson 14: Cloud Computing

Scaling Internet TV Content Delivery ALEX GUTARIN DIRECTOR OF ENGINEERING, NETFLIX

Cloud Computing Technologies and Types

Cloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe

Developing Enterprise Cloud Solutions with Azure

Reactive Microservices Architecture on AWS

Persistence. SWE 432, Fall 2017 Design and Implementation of Software for the Web

How the Cloud is Enabling the Disruption of the Construction Industry. AWS Case Study Construction Industry. Abstract

! Design constraints. " Component failures are the norm. " Files are huge by traditional standards. ! POSIX-like

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

IIS Installation for.net Application. Md. Saifullah Al Azad

Important DevOps Technologies (3+2+3days) for Deployment

Case Study Virtual Expo. Virtual Expo, a leader in online B2B exhibitions, used Varnish Plus Cloud to build and enhance their own in-house CDN

Real Life Web Development. Joseph Paul Cohen

90 Minute Optimization Life Cycle

Data Centers. Tom Anderson

COMP6218: Content Caches. Prof Leslie Carr

IaaS. IaaS. Virtual Server

Content Distribution. Today. l Challenges of content delivery l Content distribution networks l CDN through an example

Fixed Size Ad Specifications

Distributed Systems Principles and Paradigms. Chapter 12: Distributed Web-Based Systems

Flexible and LEAN Ads

Hybride Cloud Szenarien HHochverfügbar mit KEMP Loadbalancern. Köln am 10.Oktober 2017

Aerospike Scales with Google Cloud Platform

Performance Is Not a Commodity Nathan Day Chief Scientist and Co - Founder

A Peer-to-Peer System to Bring Content Distribution Networks to the Masses Guillaume Pierre Vrije Universiteit, Amsterdam

7/22/2008. Transformations

Cache Management for TelcoCDNs. Daphné Tuncer Department of Electronic & Electrical Engineering University College London (UK)

Content Distribu-on Networks (CDNs)

Overview Computer Networking Lecture 16: Delivering Content: Peer to Peer and CDNs Peter Steenkiste

Federated Array of Bricks Y Saito et al HP Labs. CS 6464 Presented by Avinash Kulkarni

Overview Content Delivery Computer Networking Lecture 15: The Web Peter Steenkiste. Fall 2016

CCDP-ARCH. Section 11. Content Networking

Provisioning IT at the Speed of Need with Microsoft Azure. Presented by Mark Gordon and Larry Kuhn Hashtag: #HAND5

Building a Scalable Architecture for Web Apps - Part I (Lessons Directi)

Issues in Distributed Architecture

Using Alluxio to Improve the Performance and Consistency of HDFS Clusters

Elastic Efficient Execution of Varied Containers. Sharma Podila Nov 7th 2016, QCon San Francisco

Composable Infrastructure for Public Cloud Service Providers

Mission-Critical Enterprise Cloud Applications

Pulse Secure Application Delivery

Tech Brief Wasabi for Consumers & Small Businesses

WHITEPAPER AMAZON ELB: Your Master Key to a Secure, Cost-Efficient and Scalable Cloud.

Web Architecture Review Sheet

Distributed Data Store

Cloud Storage with AWS: EFS vs EBS vs S3 AHMAD KARAWASH

Transcription:

Handling Flash Crowds from your Garage Jeremy Elson, Jon Howell Microsoft Research presented by: Vrije Universiteit Amsterdam March 2, 2012

Overview Flash crowds Building blocks of a scalable system Scaling architectures How do we actually scale? What are the implications? Real life examples Conclusions

What is a Flash Crowd? 1 Large masses of readers at your HTTPD http://en.wikipedia.org/wiki/slashdot_effect

What is a Flash Crowd? 2 Reasons: Being mentioned on a popular site (/. effect) Advertisement campaign Other reasons

Why is this bad? Our goal!

Why is this bad? Our goal! Overloads our website: Loss of readers (potential clients)

Why is this bad? Our goal! Overloads our website: Loss of readers (potential clients) Unpredictable (mostly) Sudden Temporary (several hours)

Let s make a scalable web service!

Essential properties Low overhead during the lean times Highly scalable Quickly scalable Cheap!

Introducing utility computing Utility computing: Packaging of computing resources as a metered service Related to Cloud computing Relies on statistical multiplexing

SDN Storage Delivery Networks aka Cloud Storage Amazon S3, Nirvanix, Google Cloud Storage Similar, but more attractive than CDNs

Compute Clouds Amazon EC2 Simple programmatic API Billed by the hour EC2 has a free tier!

A missing piece: relational databases Hardly scalable... unless we abandon full transactionality Replace relation databases altogether: NoSQL Amazon S3 Amazon SimpleDB MongoDB Many others

But how do we actually scale?

Using bare SDN Unlikely failures (not our problem) Scaling up and down is not our problem The capacity is virtually infinite Static HTTP

HTTP redirection Redirects users to a specific machine Frontend only involved in new session establishment Single point of failure, but only for new sessions Scale up time: minutes

EC2 HTTP redirection experiment If CPU capacity < 50%, launch a new server

L4/L7 Load balancing L4 aka DNAT L7 aka reverse HTTP proxy Bandwidth is limited by the frontend Single point of failure Scale up time: minutes

L4/L7 Load balancing L4 aka DNAT L7 aka reverse HTTP proxy Bandwidth is limited by the frontend Single point of failure Scale up time: minutes Need to worry about client-server affinity L7: HTTP cookies

DNS load-balanced clusters Multiple A records (max. 25) No single point of failure Unpredictable client-server affinity Badly-behaved resolvers Slow scale-up time: tens of minutes Server startup time TTL, Cache

Hybrid approaches?!

What about real life?

MapCruncher 1 AJAX-style interactive maps Output: a set of static content (images, js) Single powerful webserver Sample data: 25 GB of maps

MapCruncher 2 Press release from Microsoft (2006) Unresponsive and failing service

MapCruncher 2 Press release from Microsoft (2006) Unresponsive and failing service Little locality of reference - fs cache is useless Disk-IO bound Solution: publish data to Amazon S3 Problem solved! $4/month for 25GB storage Flash crowd: one time charge of $200 Statistical multiplexing rules!

Assira CAPTCHA: requires image identification Many servers on Amazon EC2 Uses DNS load balancing BerkeleyDB file is pushed to every server Servers store session data locally, on disk client-server affinity issues session request are forwarded to the right server Survives flash crowds Survives DoS attacks

Lessons learned from Assira Poor client affinity is not a big problem EC2 nodes fail: And get different IP on a reboot Already fixed!

InkblotPassword.com 1 Similar to Assira Laziness as the new philosophy: virtually no time spent on optimizing it is cheaper to just rent a new server Store everything in S3, no local data Slow S3 writes are hidden (write-behind)

InkblotPassword.com 2 An article on several popular tech sites No automatic expansion = unresponsive system Fixed by adding more servers (20 min.) Marginal cost < $150

4 different scaling strategies there discussed All of them are available to the garage innovator

No single scaling strategy dominates! Questions?