SEDA: An Architecture for Well-Conditioned, Scalable Internet Services

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services
Matt Welsh, David Culler, and Eric Brewer
Computer Science Division, University of California, Berkeley
Symposium on Operating Systems Principles (SOSP-18), Chateau Lake Louise, Canada, October 21-24, 2001.

Motivation
- Millions of Internet users; demand for Internet services keeps growing
- High variation in service load; load spikes are expected
- Web services are becoming more complex: no longer static content, but dynamic content that requires extensive computation and I/O

Problem statement
- Services must support millions of users (massive concurrency) while being responsive, robust, and highly available
- Well-conditioned service: throughput scales with the request rate up to saturation
- Excessive demand does not degrade throughput, and all clients experience a response-time penalty proportional to the length of the queue (graceful degradation): a notion of fairness

Thread-based concurrency
- Contention for resources and context switches cause high overhead; with a large number of threads, throughput degrades and response time increases sharply
- Partial solution: bound the number of threads (sketched below). Throughput is maintained, but what about maximum response time? Some clients experience long waits
- Overcommitting resources: transparent resource virtualization prevents the application from adapting to load changes and spotting bottlenecks
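A minimal sketch (not from the paper) of the bounded-thread-pool approach in Java; the handle() method and the pool size of 150 are illustrative assumptions. Once all pool threads are busy, further connections wait in the executor's queue, which preserves throughput but can give queued clients long response times.

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BoundedThreadServer {
    public static void main(String[] args) throws IOException {
        // Bound the number of threads, as in the partial solution above
        // (150 mirrors Apache's process pool size mentioned later).
        ExecutorService pool = Executors.newFixedThreadPool(150);
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();
                pool.execute(() -> handle(client)); // queued once all 150 threads are busy
            }
        }
    }

    private static void handle(Socket client) {
        try (Socket c = client) {
            // Read the request, perform computation and I/O, write the response.
        } catch (IOException ignored) {
        }
    }
}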

Event-driven concurrency
- Efficient and scalable concurrency
- But difficult to engineer and tune: how to order the processing of events? Scheduling is challenging
- Hard to follow the flow of events through the code (see the event-loop sketch below)
- Little support from the OS
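For illustration only (not code from the paper, and using java.nio rather than the select() interface it discusses): a single-threaded event loop over non-blocking sockets. All per-request state must live in per-connection objects and hand-written state machines, which is what makes large event-driven servers hard to structure and debug.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class EventLoopServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select(); // block until some channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // Parse available bytes, advance this connection's state
                    // machine, and register OP_WRITE when a response is ready.
                }
            }
        }
    }
}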

Staged Event-Driven Architecture (SEDA)
- Hybrid approach: thread-based concurrency models for ease of programming, event-based models for extensive concurrency
- Main idea: decompose the service into stages separated by event queues
- Each stage performs a subset of the request processing

SEDA - Stage
- Event queues allow various control policies to be imposed (a generic stage is sketched below)
- Modularity: each stage is implemented and managed independently
- Explicit event delivery makes it easier to trace the flow of events, and thus to spot bottlenecks and debug
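As an illustration of the stage concept (hypothetical names, not the actual Sandstorm API): a stage couples an application-supplied event handler with an incoming event queue, and is driven by a small thread pool private to the stage.

import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// An application-supplied event handler: processes a batch of events and
// emits follow-on events by enqueueing them onto other stages' queues.
interface EventHandler<E> {
    void handleEvents(List<E> batch);
}

// A stage: an event handler plus its incoming queue. The stage's private
// worker threads repeatedly dequeue a batch and invoke the handler.
class Stage<E> {
    final String name;
    final BlockingQueue<E> incoming = new LinkedBlockingQueue<>();
    final EventHandler<E> handler;

    Stage(String name, EventHandler<E> handler) {
        this.name = name;
        this.handler = handler;
    }

    // Called by upstream stages to deliver an event; the offer may be
    // rejected once a queue threshold or admission policy is in place.
    boolean enqueue(E event) {
        return incoming.offer(event);
    }
}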

Controllers
- Thread pool controller: determines the ideal degree of concurrency for a stage. It adjusts the number of threads by observing the incoming queue length, and idle threads are removed (a sketch follows below)
- Batching controller: aims at low response time and high throughput. The batching factor is the number of events consumed at each iteration of the event handler. A large batching factor gives more locality and higher throughput; a small batching factor gives lower response time
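A hedged sketch of the thread pool controller idea, operating directly on a stage's incoming queue (a java.util.concurrent.BlockingQueue); the threshold, cap, and sampling period are made-up values, not the paper's. It only adjusts a target worker count; a real controller would start and stop stage threads accordingly.

import java.util.concurrent.BlockingQueue;

class ThreadPoolController implements Runnable {
    private final BlockingQueue<?> stageQueue;  // the stage's incoming event queue
    private final int queueThreshold = 100;     // hypothetical backlog threshold
    private final int maxThreads = 20;          // hypothetical per-stage cap
    private int targetThreads = 1;

    ThreadPoolController(BlockingQueue<?> stageQueue) {
        this.stageQueue = stageQueue;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            int backlog = stageQueue.size();
            if (backlog > queueThreshold && targetThreads < maxThreads) {
                targetThreads++;    // backlog too long: raise the target thread count
            } else if (backlog == 0 && targetThreads > 1) {
                targetThreads--;    // stage is idle: let a worker retire
            }
            try {
                Thread.sleep(1000); // sampling period (made up)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}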

SEDA Prototype: Sandstorm
- Implemented in Java, which provides software engineering benefits: built-in threading, automatic memory management
- APIs are provided for naming, creating, and destroying stages, performing queue operations, controlling queue thresholds, and profiling and debugging
- Asynchronous I/O primitives are implemented using existing OS primitives. The asynchronous sockets interface consists of three stages: read, write, and listen. Asynchronous file I/O operations are also provided (a hypothetical wiring example follows below).
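A hypothetical wiring example (illustrative only; Sandstorm's real API differs), reusing the EventHandler and Stage types sketched earlier: an application stage receives raw packets from the socket read stage, parses them, and hands the resulting requests to the next stage in the pipeline.

import java.util.List;

// Hypothetical application stage handler: consumes raw packets (byte[])
// delivered by the socket read stage and emits parsed requests (here just
// Strings) to the next stage via explicit event delivery.
class HttpParseHandler implements EventHandler<byte[]> {
    private final Stage<String> nextStage;  // e.g. a cache or file-read stage

    HttpParseHandler(Stage<String> nextStage) {
        this.nextStage = nextStage;
    }

    @Override
    public void handleEvents(List<byte[]> packets) {
        for (byte[] packet : packets) {
            String request = new String(packet);  // placeholder for HTTP parsing
            nextStage.enqueue(request);           // hand off to the next stage's queue
        }
    }
}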

Evaluation: setup
- Evaluated Haboob, a high-performance SEDA-based HTTP server
- Workload: the static file load from the SPECweb99 benchmark, a realistic, industry-standard benchmark
- 1 to 1024 clients making repeated requests; file sizes range from 102 bytes to 921,600 bytes; total file set size is 3.31 GB; in-memory cache of 200 MB
- Server: 4-way SMP 500 MHz Pentium III system with 2 GB of RAM; 32 machines of similar configuration used for load generation

Other designs
- Apache web server: thread-based concurrency with a fixed-size process pool of 150 processes
- Flash web server: event-based concurrency, with a single process handling most request-processing tasks; limited to 506 simultaneous connections (due to limitations of the select system call)
- Haboob: hybrid approach combining event-based and thread-based concurrency; up to 1024 concurrent requests

Evaluation: fairness
- SEDA provides a degree of fairness across clients
- The other servers suffer from long TCP retransmit backoff times: requests are rejected and re-submitted

Evaluation: overload
- Overload generated by requests with high computation and I/O demands from 1024 clients
- Admission control policy applied at the queues: the queue size is adjusted according to the observed response time to keep response time low (sketched below)
- Queues can also perform prioritization or filtering during heavy load
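A hedged sketch of the queue-size adjustment described above; the response-time target and the adjustment constants are invented for illustration, not taken from the paper.

class AdmissionController {
    private final double targetResponseMillis = 100.0;  // hypothetical response-time target
    private int queueThreshold = 1024;                   // current admission limit

    // Called periodically with the stage's observed response time.
    void adjust(double observedResponseMillis) {
        if (observedResponseMillis > targetResponseMillis) {
            queueThreshold = Math.max(1, queueThreshold / 2);     // over target: back off quickly
        } else {
            queueThreshold = Math.min(4096, queueThreshold + 16); // under target: recover slowly
        }
    }

    // Upstream stages consult this before enqueueing a new request;
    // rejected requests can be filtered or given a low-priority path.
    boolean admit(int currentQueueLength) {
        return currentQueueLength < queueThreshold;
    }
}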

Summary
- New designs are needed for the ever-increasing demands of web services; SEDA is proposed to reach the desired performance
- Combines thread-based and event-based concurrency models
- Splits an application into a network of stages with event queues in between
- Dynamic resource controllers manage each stage
- Simplifies building highly concurrent services by decoupling load management from core application logic

Strengths
- High concurrency: ability to scale to large numbers of concurrent requests
- Ease of engineering: simplifies construction of well-conditioned services
- Modularity: each stage is implemented and managed independently
- Adaptation to load variations: resource management is adjusted dynamically; low variance in response time

Weaknesses
- Increased latency: a request traverses many stages and experiences multiple context switches plus additional queuing delays. On a lightly loaded server, the worst-case context-switching overhead can dominate
- What about average-case performance? Is the worst-case response time the most important metric to consider?
- Programming is still harder than with thread-based concurrency models

Thank you