Jinho Hwang (IBM Research) Wei Zhang, Timothy Wood, H. Howie Huang (George Washington Univ.) K.K. Ramakrishnan (Rutgers University)

Background: Memory Caching

Two orders of magnitude more reads than writes. Solution: deploy memcached hosts to handle the read capacity.

Read path:
1. HTTP Request -> Web Server
2. Get(key) -> Memcache
3. Miss(key) on a cache miss
4. DB Lookup(key)
5. Data returned to the Web Server
6. (key, data) stored in Memcache; HTTP Response sent to the client

6/17/14
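The read path above is the classic cache-aside pattern, which can be sketched as follows. The dicts standing in for the memcached tier and the database, and the key "user:42", are illustrative only:

```python
# Stand-ins for the real tiers: in production, `cache` would be a
# memcached client and `database` a real datastore driver.
cache = {}
database = {"user:42": "Ada"}   # "user:42" is an illustrative key

def get(key):
    value = cache.get(key)
    if value is not None:
        return value             # 2. Get(key) hit: no DB work needed
    value = database[key]        # 3./4. cache miss, so DB Lookup(key)
    cache[key] = value           # 6. store (key, data) for later reads
    return value                 # data flows back in the HTTP response
```

After the first miss fills the cache, subsequent reads for the same key never touch the database, which is what lets memcached absorb the read-heavy load.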

Memcached at Scale

Databases are hard to scale; memcached is easy.
o Facebook has 10,000+ memcached servers
Partition data and divide the key space among all nodes.
o Simple data model. Stupid nodes.
The web application must track where each object is stored
o Or use a proxy (load balancer) like moxi

[Figure: clients -> load balancer -> web servers -> memcached nodes and DB]
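A minimal sketch of how clients might divide the key space without a proxy, with made-up node names. Each web server hashes the key the same way, so they all agree on placement with no coordination:

```python
import hashlib

NODES = ["mc0", "mc1", "mc2", "mc3"]   # hypothetical memcached node names

def node_for(key: str) -> str:
    # Stable hash: every web server maps a given key to the same node.
    # (md5 here is for placement only, not security.)
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]
```

The catch with plain modulo placement is that adding or removing a node remaps almost every key, which is one reason the design in later slides uses consistent hashing instead.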

Scales easily, but loads are imbalanced
o Random placement
o Skewed popularity distributions

[Figure: load on Wikipedia's memcached servers]

Motivation

Cache clusters are typically provisioned to support peak load, in terms of both request-processing capability and cache storage size.
o This worst-case provisioning can be very expensive
o It does not take advantage of dynamic resource allocation and VM provisioning capabilities
There can be great diversity in both the workloads and the types of nodes.
o This makes cluster management difficult
Solution: large-scale self-managing caches

Contributions

A high-speed memcached load balancer
o Can forward millions of requests per second
A hot-spot detection algorithm
o Uses stream data mining techniques to efficiently determine the hottest keys
A two-level key mapping system
o Combines consistent hashing with a lookup table to flexibly control the placement and replication level of hot keys
An automated server management system
o Determines the number (and types) of servers in the caching cluster

Outline

Background and Motivation
Workload & Server Heterogeneity
Memcached Load Balancer Design
Hotspot Detection Algorithm
Conclusions

Workload Heterogeneity

Key popularity varies by application, as do read/write rates, churn rates, costs for cache misses, and quality-of-service demands.

[Figure: key popularity over the hash space (0 to 2^32) for 5 hours, based on a Wikipedia 2008 database dump and access trace; a popular region covers 35% of the hash space, an unpopular region the remaining 65%]

Memcached Cluster Grouping

Traditionally, many applications share memcached servers. Today, to handle different applications, memcached instances are clustered manually.

[Figure: applications (media streaming, static files, DB queries) served by web and application servers, each mapped to its own group of memcached servers]

Need a smarter way to manage cache clusters

Server Heterogeneity

Software and hardware can be diverse: several memory-cache software packages exist, and many research projects have prototyped memcached on different hardware architectures:
o GP-GPU [Hetherington:ISPASS12]
o RDMA (remote direct memory access) [Stuedi:ATC12]
o FPGA [Maysam:Computer_Architecture_Letter13]
o Intel DPDK [Lim:NSDI14]
Goal: dynamically adapt to both workload heterogeneity and server heterogeneity

Outline

Background and Motivation
Workload & Server Heterogeneity
Memcached Load Balancer Design
Hotspot Detection Algorithm
Conclusions

Self-Managing Cache Clustering System

The cache clustering system accounts for heterogeneous workloads and servers:
o Identifies server capabilities to adjust each server's share of the key space
o Finds hot items based on item frequency and handles them differently
Dynamically increases/decreases the number of memcached servers.
Uses a two-level key mapping system:
o Consistent hashing applies to warm/cold items (LRU)
o A forward table applies to hot items
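The two-level mapping could be sketched as below: check the forward table first, and fall back to a consistent hashing ring with virtual nodes. Class and method names here are invented for illustration, not the system's actual API:

```python
import bisect
import hashlib

def _h(s: str) -> int:
    # Stable 128-bit hash for ring placement (not for security)
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class TwoLevelDirector:
    """Forward table for hot keys; consistent hashing ring for the rest."""

    def __init__(self, servers, vnodes=16):
        self.forward = {}  # hot key -> explicit list of replica servers
        # Each server gets `vnodes` points on the ring for smoother balance
        self.ring = sorted((_h(f"{s}#{i}"), s)
                           for s in servers for i in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def promote(self, key, replicas):
        """Pin a hot key to an explicit replica set."""
        self.forward[key] = list(replicas)

    def demote(self, key):
        """Hot key cooled off: fall back to the hash ring."""
        self.forward.pop(key, None)

    def lookup(self, key):
        if key in self.forward:            # level 1: forward table
            return self.forward[key]
        i = bisect.bisect(self.points, _h(key)) % len(self.points)
        return [self.ring[i][1]]           # level 2: consistent hashing
```

Keeping the forward table small (hot keys only) is what makes this workable: the common case stays a cheap hash-ring lookup, while the few hottest keys get flexible placement and replication.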

Automated Load-Balancer Architecture

[Figure: clients -> automated load balancer (Server Manager, Hot Spot Detector, Key Replicator, and Key Director holding a forward table, e.g. k1 -> S1,S14; k2 -> S3,S6; k3 -> S2,S8,S9, plus a consistent hashing ring) -> heterogeneous cache servers]

Uses UDP to send response packets directly from the memcached servers to clients, lowering load on the load balancer.
o The Hot Spot Detector runs a streaming algorithm to find hot items
o The Server Manager manages the memcached servers
o The Key Replicator manages key replication
o The Key Director manages the forwarding table and the consistent hashing ring

Outline

Background and Motivation
Workload & Server Heterogeneity
Memcached Load Balancer Design
Hotspot Detection Algorithm
Conclusions

Lossy Counting based Hot Spot Detection

Lossy counting is a one-pass deterministic streaming algorithm with user-defined parameters:
o Support threshold s: given stream length N, every item with true frequency at least sN is returned (no false negatives)
o Error rate ε (ε << s): no item with true frequency below (s - ε)N is returned, and estimated counts undercount true counts by at most εN
o Memory use is bounded (O((1/ε) log(εN)) entries), not monotonically increasing in the number of distinct items

Pipeline: data stream of <key, val> pairs (length N) -> counting -> frequency estimates -> tracking of items above the sN threshold
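A compact implementation of the classic bucket-based lossy counting algorithm, which gives the guarantees listed above, might look like this (a sketch, not the paper's code):

```python
from math import ceil

class LossyCounter:
    """One-pass lossy counting over a stream of keys."""

    def __init__(self, epsilon):
        self.w = ceil(1 / epsilon)   # bucket width
        self.n = 0                   # stream length N so far
        self.counts = {}             # key -> (freq, max possible undercount)

    def add(self, key):
        self.n += 1
        bucket = ceil(self.n / self.w)
        # A key first seen in this bucket may have been dropped before,
        # so its count could be undercounted by up to bucket - 1.
        f, d = self.counts.get(key, (0, bucket - 1))
        self.counts[key] = (f + 1, d)
        if self.n % self.w == 0:     # prune at each bucket boundary
            self.counts = {k: (f, d) for k, (f, d) in self.counts.items()
                           if f + d > bucket}

    def hot_keys(self, s, epsilon):
        # Every key with true frequency >= s*N is guaranteed to appear here;
        # no key with true frequency < (s - epsilon)*N can appear.
        return {k for k, (f, _) in self.counts.items()
                if f >= (s - epsilon) * self.n}
```

The pruning step is what keeps memory small: long-tail keys are evicted as soon as their count plus maximum undercount falls behind the current bucket index, while genuinely hot keys accumulate counts faster than the pruning threshold grows.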

Find Hot Items and Groups of Hot Items

Estimated frequency (i.e., request rate) seen for the top keys.
The goal is two-fold:
o Find hot items
o Find groups of items that can balance the load across servers

[Figure: estimated frequency (x1000) vs. item number for the top ~120 items]

Adaptively Sizing the Number of Hot Items

Given the number of servers and the discovered hot-spot groups, the load balancer decides to increase or decrease the number of hot keys.
In each scheduling window, the support parameter s is adjusted to tune the number of hot items.

[Figure: number of hot items vs. stream length N (x1000), comparing autonomic lossy counting and generic lossy counting against a target level]
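One plausible shape for the per-window adjustment of s (an assumed control rule for illustration; the step size and bounds are not from the talk):

```python
def adjust_support(s, num_hot, target, step=0.1, s_min=1e-4, s_max=0.5):
    """After each scheduling window, nudge the support threshold s so the
    number of detected hot items tracks the target level.
    step, s_min, and s_max are assumed illustrative values."""
    if num_hot > target:      # too many hot items -> raise the bar
        s *= 1 + step
    elif num_hot < target:    # too few hot items -> lower the bar
        s *= 1 - step
    return min(max(s, s_min), s_max)
```

Raising s shrinks the set of keys that qualify as hot, and lowering it grows the set, so repeating this each window drives the hot-item count toward the target.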

Conclusion

Summary
o Self-managing cache clustering system that handles workload and server heterogeneity
o Adaptive hot item groups: the number of keys and groups adapts to the servers
o High-speed load balancer (~10 million requests per second on a single core), leveraging high-performance NICs and processors
Ongoing work
o Optimize the number of servers and replicas of hot items in a cloud environment