Jinho Hwang and Timothy Wood George Washington University

1 Jinho Hwang and Timothy Wood George Washington University

2 Background: Memory Caching
Two orders of magnitude more reads than writes
Solution: Deploy memcached hosts to handle the read capacity
[Figure: request flow between Web Server, Memcache, and DB. Labels: 1. HTTP Request, 2. Get(key), 3. Miss(key), 4. DB Lookup(key), 5. Data, 6. (key, data), 6. HTTP Response]
6/26/13 The George Washington University
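The read path in the figure can be sketched as a look-aside cache. This is a minimal illustration, not the deck's implementation: the class name `CachedStore`, the plain dicts standing in for memcached and the database, and the hit/miss counters are all assumptions made for the example.

```python
class CachedStore:
    """Look-aside cache in front of a slower backing store (illustrative)."""

    def __init__(self, db):
        self.db = db      # stand-in for the database (a plain dict here)
        self.cache = {}   # stand-in for a memcached node
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # Step 2: Get(key) from the cache.
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Steps 3 to 5: Miss(key), DB Lookup(key), Data.
        self.misses += 1
        data = self.db[key]
        # Step 6: store (key, data) in the cache for later reads.
        self.cache[key] = data
        return data


store = CachedStore({"page:1": "<html>...</html>"})
store.get("page:1")  # miss, filled from the database
store.get("page:1")  # hit, served from the cache
```

With reads outnumbering writes by two orders of magnitude, most requests take the short path at step 2 once the cache is warm.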

3 Memcached at Scale
Databases are hard to scale; Memcached is easy
  o Facebook has 10,000+ memcached servers
Partition data and divide key space among all nodes
  o Simple data model. Stupid nodes.
Web application must track where each object is stored
  o Or use a proxy like moxi
[Figure: Clients, Web Servers, moxi, Memcached nodes, and DB]

4 Scales easily, but loads are imbalanced
Random placement
Skewed popularity distributions
[Figure: Load on Wikipedia's memcached servers]

5 Motivation
Consistent hashing does not evenly load data across memory cache servers
  o Variation in the number of keys assigned to each server
  o Key popularity is skewed and changes over time
[Figure: hash space divided into an unpopular region (65%) and a popular region (35%), based on the Wikipedia 2008 database dump and access trace]
Solution: dynamically balance load according to performance

6 Contributions
A hash space allocation scheme
  o allows for targeted load shifting between unbalanced servers
Adaptive partitioning of the cache's hash space
  o automatically meets hit rate and server utilization goals
An automated replica management system
  o adds or removes cache replicas based on overall cache performance

7 Outline
Background and Motivation
Initial Hash Space Partitioning
Dynamic Adaptation
Evaluation
Conclusions

8 Background: Hash Space Allocation
Simple hashing
  o hash(key) % (# of servers)
  o Once assigned, never changes
  o If a node is added or removed, all objects need to be rearranged
  [Figure: Load Balancer choosing among three Memory Servers with server[key % 3]]
Consistent hashing
  o Treat hash space as a ring with nodes assigned to each region
  o Node addition / removal only affects adjacent nodes
  o Used in P2P systems and by the popular memcached proxy Moxi
  [Figure: hash ring with nodes N1 to N4 over a key hash space of 2^32]
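The difference between the two schemes shows up when a server is added. The sketch below compares how many keys change owner under modulo placement and under a consistent-hash ring. It is an illustration only: the MD5-based hash function, the `Ring` class, the 100 virtual nodes per server, and the synthetic key names are assumptions, not details taken from Moxi or from the deck.

```python
import bisect
import hashlib


def h(value):
    """Map a string to a point in the 2^32 hash space (MD5 is an assumption)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % (2 ** 32)


def modulo_owner(key, servers):
    # Simple hashing: hash(key) % (# of servers).
    return servers[h(key) % len(servers)]


class Ring:
    def __init__(self, servers, vnodes=100):
        # Each server is placed at `vnodes` points on the ring.
        self.points = sorted(
            (h(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self.hashes = [p for p, _ in self.points]

    def owner(self, key):
        # First virtual node clockwise from the key, wrapping at the end.
        i = bisect.bisect_right(self.hashes, h(key)) % len(self.points)
        return self.points[i][1]


def moved_fraction(before, after, keys):
    """Fraction of keys whose owner differs between two placements."""
    return sum(before(k) != after(k) for k in keys) / len(keys)


keys = [f"key{i}" for i in range(10000)]
old, new = ["N1", "N2", "N3"], ["N1", "N2", "N3", "N4"]

mod_moved = moved_fraction(lambda k: modulo_owner(k, old),
                           lambda k: modulo_owner(k, new), keys)
r_old, r_new = Ring(old), Ring(new)
ring_moved = moved_fraction(r_old.owner, r_new.owner, keys)
print(f"modulo: {mod_moved:.0%} moved, ring: {ring_moved:.0%} moved")
```

Going from three to four servers, modulo placement is expected to remap about three quarters of the keys, while the ring only hands the new server its own share, and every remapped key lands on the new server.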

9 Initial Assignment
To enable efficient repartitioning of the hash space:
  o Every node is adjacent to every other node
  o This allows a simple transfer of load between two nodes by adjusting just one boundary
Required number of duplicate nodes and total number of nodes: in the five-server example, each server appears 4 times, for 5 x 4 = 20 nodes in total
This initial set is then multiplied by the number of virtual nodes
[Figure: ring of 20 positions for servers N1 to N5, each server appearing four times]
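One way to obtain a ring in which every server borders every other server is to walk an Eulerian circuit of the complete directed graph on the servers. This is a sketch of that idea under stated assumptions, not necessarily the construction the authors use: the function name `all_pairs_ring` and the choice of Hierholzer's algorithm are illustrative. For n servers it yields n x (n - 1) ring positions, which matches the 5 x 4 = 20 figure used in the evaluation.

```python
from collections import Counter


def all_pairs_ring(servers):
    """Return a cyclic ordering in which every ordered pair of distinct
    servers appears exactly once as clockwise neighbours."""
    # Unused outgoing edges of the complete directed graph on the servers.
    out = {a: [b for b in servers if b != a] for a in servers}
    stack, circuit = [servers[0]], []
    # Hierholzer's algorithm: follow unused edges, backtrack when stuck.
    while stack:
        v = stack[-1]
        if out[v]:
            stack.append(out[v].pop())
        else:
            circuit.append(stack.pop())
    circuit.reverse()
    return circuit[:-1]  # drop the repeated start; the ring wraps around


servers = ["N1", "N2", "N3", "N4", "N5"]
ring = all_pairs_ring(servers)
neighbours = set(zip(ring, ring[1:] + ring[:1]))
print(len(ring), Counter(ring)["N1"], len(neighbours))  # prints: 20 4 20
```

Because each pair of servers shares a boundary somewhere on the ring, load can be shifted between any two of them by moving that single boundary.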

10 Dynamic Hash Space Scheduling
Two factors to measure server performance:
  o Hit rate: enough memory for popular data
  o Usage ratio: server processing
Minimize {cost = hit rate + usage ratio}, with the two terms weighted by α (α = 1.0 uses only hit rate, α = 0 uses only usage ratio)
Scheduling decision:
  o Find the two memory servers that differ the most
  o Find the two adjacent virtual nodes that differ the most
Size of hash space moved at each scheduling decision:
  o Determines the speed of adaptation; larger moves adapt faster but cause more fluctuation
  o Set using a ratio value (β)
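A single scheduling decision might look like the sketch below. The exact cost formula is not recoverable from the slide, so several pieces are assumptions: the hit-rate term is expressed as a miss rate (1 - hit rate) so that a lower cost is better, α weights the two terms, β is applied to the hottest server's current share, and the sketch tracks only each server's aggregate share rather than individual virtual-node boundaries. The names `cost` and `schedule_step` are hypothetical.

```python
def cost(hit_rate, usage_ratio, alpha):
    # Assumption: penalise misses and high utilisation, weighted by alpha.
    return alpha * (1.0 - hit_rate) + (1.0 - alpha) * usage_ratio


def schedule_step(stats, shares, alpha=0.5, beta=0.01):
    """Shift a beta-sized slice of hash space from the most costly
    server to the least costly one and return the new shares."""
    costs = {s: cost(hit, usage, alpha) for s, (hit, usage) in stats.items()}
    hot = max(costs, key=costs.get)
    cold = min(costs, key=costs.get)
    moved = beta * shares[hot]  # assumption: beta is relative to hot's share
    new_shares = dict(shares)
    new_shares[hot] -= moved
    new_shares[cold] += moved
    return hot, cold, new_shares


# (hit rate, usage ratio) per server; the values are made up for illustration.
stats = {"host1": (0.60, 0.90), "host2": (0.80, 0.50), "host3": (0.90, 0.20)}
shares = {"host1": 1 / 3, "host2": 1 / 3, "host3": 1 / 3}
hot, cold, new_shares = schedule_step(stats, shares)
print(hot, cold, new_shares)
```

Repeating the step on fresh measurements moves the partition gradually; a larger β converges faster but overshoots more easily, which is the trade-off the slide describes.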

11 Node Addition / Removal
Balance out the requests across replicas so that overall performance improves
Server(s) that remain highly overloaded for a sustained period of time should be backed by new server(s)
Node Addition: find the most costly memory server and its virtual node si
Node Removal: find the least costly memory server and its virtual node sj
[Figure labels: Migrate, sk, new node (Node Addition); si, Set si moved, Set sk removed, sj (Node Removal)]

12 Outline
Background and Motivation
Initial Hash Space Partitioning
Dynamic Adaptation
Evaluation
Conclusions

13 Experimental Setup
Lab setup
  o Five experimental servers (4 Intel Xeon X GHz processor, 16GB, and a 500GB 7200RPM hard drive)
Amazon setup
  o 15 medium instances
All workloads are from Wikipedia data and access traces
[Figure: testbed with Clients, web, Proxy, memcd nodes, Elastic Decision (+/-), and Memory Pool]

14 Initial Hash Space Assignment
5 memory servers used (total 500 virtual nodes)
  o For consistent hashing, 100 virtual nodes per server
  o For our scheme, the initial set is 5 x 4 = 20, and 25 virtual nodes per node
The largest gap between the biggest hash size and the smallest hash size is 381,114,554 (20% more)
[Figure: hash space layout by Server Number for Consistent and Adaptive; Hash Space Size (x10^6) per Server Number for Consistent and Adaptive]

15 Dynamic Partitioning
α = 1.0 (only hit rate)
[Figure: Hit Rate, # of Requests (per min), and Hash Space share for Host 1, Host 2, and Host 3; each host starts with 33.3% of the hash space]
α = 0 (only usage ratio)
[Figure: Hit Rate, # of Requests (per min), and Hash Space share for Host 1, Host 2, and Host 3; each host starts with 33.3% of the hash space]

16 α Behavior
When α = 0.5, β = 0.01
[Figure: Hit Rate, Cost, # of Reqs per min (x10^3), and Hash Space share for Host 1, Host 2, and Host 3; each host starts with 33.3% of the hash space]

17 Node Addition / Removal
Addition
  o A new node reduces load on the overloaded server
  [Figure: # of Reqs per min (x10^3) and Hash Space share over Time (3 hours); shares start at 33.3% each and read 10.7%, 26.7%, 17.2%, and 45.3% after the host is added]
Removal
  o Removing an underloaded server gives cost benefits while maintaining performance
  [Figure: # of Reqs per min (x10^3) and Hash Space share over Time (3 hours); shares start at 20% each and read 25.1%, 24.7%, 27.8%, and 22.2% after the host is removed]

18 β Behavior
Ratio of hash space moved at each scheduling decision
Determines the speed of adaptation
Use β = 0.01 (1%) to show the behavior
[Figure: # of Reqs per min (x10^3) for Host 1, Host 2, and Host 3 as traffic changes over 5 hours; Moved Hash Space Size (x10^6) per scheduling decision]

19 Scaling Up / Down
Dynamically add / remove server(s) depending on load intensity
Watch each server for a period of time (5 min) to check whether high load is sustained
To maximize variation, α = 1 (hit rate only)
5 Wikipedia traffic generators used
[Figure: # of Reqs Per Min (x10^3) and # of Servers over time]
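The "watch for 5 minutes" rule can be sketched as a sliding window over per-minute load samples. The class name `SustainedLoadWatcher`, the one-sample-per-minute cadence, and the 0.8 / 0.2 thresholds are hypothetical choices for illustration; the deck does not state the thresholds it uses.

```python
from collections import deque


class SustainedLoadWatcher:
    """Signal a scaling action only when load stays extreme for a full window."""

    def __init__(self, window=5, high=0.8, low=0.2):
        self.samples = deque(maxlen=window)  # one sample per minute (assumed)
        self.high, self.low = high, low      # hypothetical thresholds

    def observe(self, load):
        self.samples.append(load)
        if len(self.samples) < self.samples.maxlen:
            return "hold"                    # not enough history yet
        if min(self.samples) > self.high:
            return "add"                     # overloaded for the whole window
        if max(self.samples) < self.low:
            return "remove"                  # underloaded for the whole window
        return "hold"


watcher = SustainedLoadWatcher()
decisions = [watcher.observe(x) for x in [0.9, 0.9, 0.9, 0.9, 0.95, 0.5]]
print(decisions)
```

Requiring the whole window to be above or below a threshold keeps a short burst from triggering a server addition that would be undone a minute later.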

20 QoE Improvement
Wikipedia workload achieves better response time as hit rate increases (45% increase)
But the number of servers used increases as well
As a recommendation, the combination of hit rate and usage rate (α = 0.5) is a good administrative choice
[Figure: Avg. Response Time (ms) and # of Used Memory Servers for Ketama and for values in [0.0, 1.0], with usage rate and hit rate marked]

21 Related Work
[Stoica, ToN 03] Chord Peer-to-Peer architecture
[Nishtala, NSDI 13] Scaling Memcached at Facebook
[Zhu, HotCloud 12] Shrinking memcached to save $$
Ideas may apply to many other key-value based storage systems: Couchbase, Redis, SILT, FAWN, etc.

22 Conclusion
Summary
  o A hash space allocation scheme: carefully place nodes to ensure adjacency
  o Adaptive partitioning of the cache's hash space: maximize hit rate and minimize difference in utilization rate
  o An automated replica management system: detect sustained overload and add or remove nodes
Future work
  o Automatic α value adjustment to minimize response time
  o Targeted management of hot objects without impacting application performance


Quantifying Load Imbalance on Virtualized Enterprise Servers Quantifying Load Imbalance on Virtualized Enterprise Servers Emmanuel Arzuaga and David Kaeli Department of Electrical and Computer Engineering Northeastern University Boston MA 1 Traditional Data Centers

More information

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure Mario Beck (mario.beck@oracle.com) Principal Sales Consultant MySQL Session Agenda Requirements for

More information

A Comparative Study of Microsoft Exchange 2010 on Dell PowerEdge R720xd with Exchange 2007 on Dell PowerEdge R510

A Comparative Study of Microsoft Exchange 2010 on Dell PowerEdge R720xd with Exchange 2007 on Dell PowerEdge R510 A Comparative Study of Microsoft Exchange 2010 on Dell PowerEdge R720xd with Exchange 2007 on Dell PowerEdge R510 Incentives for migrating to Exchange 2010 on Dell PowerEdge R720xd Global Solutions Engineering

More information

GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores

GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores Conglong Li Carnegie Mellon University conglonl@cs.cmu.edu Alan L. Cox Rice University alc@rice.edu Abstract Memory-based key-value stores,

More information

V-Cache: Towards Flexible Resource Provisioning for Multi-tier Applications in IaaS Clouds

V-Cache: Towards Flexible Resource Provisioning for Multi-tier Applications in IaaS Clouds : Towards Flexible Resource Provisioning for Multi-tier Applications in IaaS Clouds Yanfei Guo, Palden Lama, Jia Rao and Xiaobo Zhou Department of Computer Science University of Colorado, Colorado Springs,

More information

ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System

ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System Xiaodong Shi Email: shixd.hust@gmail.com Dan Feng Email: dfeng@hust.edu.cn Wuhan National Laboratory for Optoelectronics,

More information

A Fast and High Throughput SQL Query System for Big Data

A Fast and High Throughput SQL Query System for Big Data A Fast and High Throughput SQL Query System for Big Data Feng Zhu, Jie Liu, and Lijie Xu Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing, China 100190

More information

Cataclysm: Policing Extreme Overloads in Internet Applications

Cataclysm: Policing Extreme Overloads in Internet Applications Cataclysm: Policing Extreme Overloads in Internet Applications Bhuvan Urgaonkar Dept. of Computer Science University of Massachusetts Amehrst, MA bhuvan@cs.umass.edu Prashant Shenoy Dept. of Computer Science

More information

Cloud Computing Architecture

Cloud Computing Architecture Cloud Computing Architecture 1 Contents Workload distribution architecture Dynamic scalability architecture Cloud bursting architecture Elastic disk provisioning architecture Redundant storage architecture

More information

SRCMap: Energy Proportional Storage using Dynamic Consolidation

SRCMap: Energy Proportional Storage using Dynamic Consolidation SRCMap: Energy Proportional Storage using Dynamic Consolidation By: Akshat Verma, Ricardo Koller, Luis Useche, Raju Rangaswami Presented by: James Larkby-Lahet Motivation storage consumes 10-25% of datacenter

More information

Volley: Automated Data Placement for Geo-Distributed Cloud Services

Volley: Automated Data Placement for Geo-Distributed Cloud Services Volley: Automated Data Placement for Geo-Distributed Cloud Services Authors: Sharad Agarwal, John Dunagen, Navendu Jain, Stefan Saroiu, Alec Wolman, Harbinder Bogan 7th USENIX Symposium on Networked Systems

More information

Normalized cuts and image segmentation

Normalized cuts and image segmentation Normalized cuts and image segmentation Department of EE University of Washington Yeping Su Xiaodan Song Normalized Cuts and Image Segmentation, IEEE Trans. PAMI, August 2000 5/20/2003 1 Outline 1. Image

More information

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc Fuxi Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc {jiamang.wang, yongjun.wyj, hua.caihua, zhipeng.tzp, zhiqiang.lv,

More information

Big data, little time. Scale-out data serving. Scale-out data serving. Highly skewed key popularity

Big data, little time. Scale-out data serving. Scale-out data serving. Highly skewed key popularity /7/6 Big data, little time Goal is to keep (hot) data in memory Requires scale-out approach Each server responsible for one chunk Fast access to local data The Case for RackOut Scalable Data Serving Using

More information

Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell Storage PS Series Arrays

Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell Storage PS Series Arrays Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell Storage PS Series Arrays Dell EMC Engineering December 2016 A Dell Best Practices Guide Revisions Date March 2011 Description Initial

More information

Distributed Systems. 16. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2017

Distributed Systems. 16. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2017 Distributed Systems 16. Distributed Lookup Paul Krzyzanowski Rutgers University Fall 2017 1 Distributed Lookup Look up (key, value) Cooperating set of nodes Ideally: No central coordinator Some nodes can

More information

CS 147: Computer Systems Performance Analysis

CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis Test Loads CS 147: Computer Systems Performance Analysis Test Loads 1 / 33 Overview Overview Overview 2 / 33 Test Load Design Test Load Design Test Load Design

More information

One Server Per City: Using TCP for Very Large SIP Servers. Kumiko Ono Henning Schulzrinne {kumiko,

One Server Per City: Using TCP for Very Large SIP Servers. Kumiko Ono Henning Schulzrinne {kumiko, One Server Per City: Using TCP for Very Large SIP Servers Kumiko Ono Henning Schulzrinne {kumiko, hgs}@cs.columbia.edu Goal Answer the following question: How does using TCP affect the scalability and

More information

Shen, Tang, Yang, and Chu

Shen, Tang, Yang, and Chu Integrated Resource Management for Cluster-based Internet s About the Authors Kai Shen Hong Tang Tao Yang LingKun Chu Published on OSDI22 Presented by Chunling Hu Kai Shen: Assistant Professor of DCS at

More information

Dell Reference Configuration for Large Oracle Database Deployments on Dell EqualLogic Storage

Dell Reference Configuration for Large Oracle Database Deployments on Dell EqualLogic Storage Dell Reference Configuration for Large Oracle Database Deployments on Dell EqualLogic Storage Database Solutions Engineering By Raghunatha M, Ravi Ramappa Dell Product Group October 2009 Executive Summary

More information

08 Distributed Hash Tables

08 Distributed Hash Tables 08 Distributed Hash Tables 2/59 Chord Lookup Algorithm Properties Interface: lookup(key) IP address Efficient: O(log N) messages per lookup N is the total number of servers Scalable: O(log N) state per

More information

PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads

PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads Ran Xu (Purdue), Subrata Mitra (Adobe Research), Jason Rahman (Facebook), Peter Bai (Purdue),

More information

Tuesday, June 22, JBoss Users & Developers Conference. Boston:2010

Tuesday, June 22, JBoss Users & Developers Conference. Boston:2010 JBoss Users & Developers Conference Boston:2010 Infinispan s Hot Rod Protocol Galder Zamarreño Senior Software Engineer, Red Hat 21st June 2010 Who is Galder? Core R&D engineer on Infinispan and JBoss

More information

RobinHood: Tail Latency Aware Caching Dynamic Reallocation from Cache-Rich to Cache-Poor

RobinHood: Tail Latency Aware Caching Dynamic Reallocation from Cache-Rich to Cache-Poor RobinHood: Tail Latency Aware Caching Dynamic Reallocation from Cache-Rich to Cache-Poor Daniel S. Berger and Benjamin Berg, Carnegie Mellon University; Timothy Zhu, Pennsylvania State University; Siddhartha

More information

BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding. Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University

BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding. Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University Outline Introduction and Motivation Our Design System and Implementation

More information

CONTENT DISTRIBUTION. Oliver Michel University of Illinois at Urbana-Champaign. October 25th, 2011

CONTENT DISTRIBUTION. Oliver Michel University of Illinois at Urbana-Champaign. October 25th, 2011 CONTENT DISTRIBUTION Oliver Michel University of Illinois at Urbana-Champaign October 25th, 2011 OVERVIEW 1. Why use advanced techniques for content distribution on the internet? 2. CoralCDN 3. Identifying

More information

Modeling and Caching of P2P Traffic

Modeling and Caching of P2P Traffic School of Computing Science Simon Fraser University, Canada Modeling and Caching of P2P Traffic Mohamed Hefeeda Osama Saleh ICNP 06 15 November 2006 1 Motivations P2P traffic is a major fraction of Internet

More information

IBM Tivoli Storage Manager for AIX Version Installation Guide IBM

IBM Tivoli Storage Manager for AIX Version Installation Guide IBM IBM Tivoli Storage Manager for AIX Version 7.1.3 Installation Guide IBM IBM Tivoli Storage Manager for AIX Version 7.1.3 Installation Guide IBM Note: Before you use this information and the product it

More information