Jinho Hwang and Timothy Wood George Washington University

1 Jinho Hwang and Timothy Wood George Washington University

2 Background: Memory Caching
Two orders of magnitude more reads than writes
Solution: Deploy memcached hosts to handle the read capacity
Request flow (figure: web server, memcache, and DB):
1. HTTP Request arrives at the web server
2. Get(key) is sent to memcache
3. Miss(key)
4. DB Lookup(key)
5. Data is returned from the DB
6. (key, data) is stored in memcache and the HTTP Response is returned
6/26/13 The George Washington University
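The request flow on this slide is the usual cache-aside pattern. The sketch below illustrates it in Python, using plain dictionaries as stand-ins for memcached and the database. The names `cache`, `database`, `stats`, and `get_page` are hypothetical and do not come from the talk.

```python
# Hypothetical stand-ins: a dict for memcached and a dict for the database.
cache = {}
database = {"page:Main": "<html>Main page</html>"}
stats = {"db_lookups": 0}  # counts how often a read falls through to the DB


def get_page(key):
    """Cache-aside read: try the cache first, fall back to the DB on a miss."""
    data = cache.get(key)            # 2. Get(key)
    if data is None:                 # 3. Miss(key)
        stats["db_lookups"] += 1
        data = database[key]         # 4. DB lookup(key), 5. data comes back
        cache[key] = data            # 6. store (key, data) in the cache
    return data                      # 6. returned in the HTTP response


first = get_page("page:Main")   # miss: served from the database
second = get_page("page:Main")  # hit: served from the cache
```

Only the first read touches the database, which is why a read-heavy workload benefits from adding cache hosts.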

3 Memcached at Scale
Databases are hard to scale; memcached is easy
o Facebook has 10,000+ memcached servers
Partition data and divide key space among all nodes
o Simple data model. Stupid nodes.
Web application must track where each object is stored
o Or use a proxy like moxi
[Figure: clients, web servers, the moxi proxy, memcached nodes, and the DB]

4 Scales easily, but loads are imbalanced
Random placement
Skewed popularity distributions
[Figure: load on Wikipedia's memcached servers]

5 Motivation
Consistent hashing does not evenly load data across memory cache servers
o Variation in the number of keys assigned to each server
o Key popularity is skewed and changes over time
[Figure: hash space split into an unpopular region (65%) and a popular region (35%), based on the Wikipedia 2008 database dump and access trace]
Solution: dynamically balance load according to observed performance

6 Contributions
A hash space allocation scheme
o allows for targeted load shifting between unbalanced servers
Adaptive partitioning of the cache's hash space
o automatically meets hit rate and server utilization goals
An automated replica management system
o adds or removes cache replicas based on overall cache performance

7 Outline
Background and Motivation
Initial Hash Space Partitioning
Dynamic Adaptation
Evaluation
Conclusions

8 Background: Hash Space Allocation
Simple hashing
o hash(key) % (number of servers)
o Once assigned, never changes
o If a node is added or removed, all objects need to be rearranged
[Figure: a load balancer selects among three memory servers with server[key % 3]]
Consistent hashing
o Treat the hash space as a ring with nodes assigned to each region
o Node addition / removal only affects adjacent nodes
o Used in P2P systems and by the popular memcached proxy system Moxi
[Figure: a ring over a key hash space of 2^32 with nodes N1 to N4; each key belongs to a node on the ring]
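The difference between the two schemes shows up when a node is added. The following Python sketch compares them under some assumptions: MD5 truncated to 32 bits stands in for the hash function, each server gets a single point on the ring, and the names `h32`, `modulo_server`, and `Ring` are hypothetical. It is an illustration, not the implementation used by Moxi or by the authors.

```python
import bisect
import hashlib


def h32(value):
    """Map a string to a point in a 2^32 hash space (MD5 is an assumption)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def modulo_server(key, servers):
    """Simple hashing: hash(key) % number of servers."""
    return servers[h32(key) % len(servers)]


class Ring:
    """Consistent hashing: a key belongs to the next node point on the ring."""

    def __init__(self, servers):
        self.points = sorted((h32(s), s) for s in servers)
        self.hashes = [p for p, _ in self.points]

    def server(self, key):
        # Wrap around to the first point when the key hashes past the last one.
        i = bisect.bisect_left(self.hashes, h32(key)) % len(self.points)
        return self.points[i][1]


keys = [f"key{i}" for i in range(10000)]
old, new = ["N1", "N2", "N3", "N4"], ["N1", "N2", "N3", "N4", "N5"]

# Keys whose owner changes when N5 joins, under each scheme.
moved_modulo = sum(modulo_server(k, old) != modulo_server(k, new) for k in keys)
old_ring, new_ring = Ring(old), Ring(new)
moved_ring = [k for k in keys if old_ring.server(k) != new_ring.server(k)]
```

With modulo hashing most keys change owner. On the ring, the only keys that move are the ones the new node takes over from its neighbor.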

9 Initial Assignment
To enable efficient repartitioning of the hash space:
o Every node is adjacent to every other node
o This allows a simple transfer of load between two nodes by adjusting just one boundary
Quantities that define the layout:
o Required number of duplicate nodes
o Total number of nodes
o Multiplier for the number of virtual nodes
[Figure: example ring for five nodes N1 to N5, with each node repeated so that it borders the other nodes]
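One way to build a ring in which every node borders every other node is to walk an Eulerian circuit of the complete directed graph over the nodes. For n nodes this gives n(n-1) positions, which is consistent with the 5 x 4 = 20 initial set quoted on the initial hash space assignment slide. The Python sketch below is an illustrative assumption and not necessarily the authors' construction; `all_pairs_ring` is a hypothetical name.

```python
from itertools import permutations


def all_pairs_ring(nodes):
    """Return a cyclic sequence of length n*(n-1) in which every ordered
    pair of distinct nodes appears as neighbors exactly once."""
    # Unused outgoing edges of the complete directed graph.
    unused = {a: [b for b in nodes if b != a] for a in nodes}
    stack, circuit = [nodes[0]], []
    # Hierholzer's algorithm, iterative form.
    while stack:
        v = stack[-1]
        if unused[v]:
            stack.append(unused[v].pop())
        else:
            circuit.append(stack.pop())
    circuit.reverse()
    return circuit[:-1]  # drop the repeated start; the ring wraps around


nodes = ["N1", "N2", "N3", "N4", "N5"]
ring = all_pairs_ring(nodes)
# Neighbor pairs, including the wrap-around from the last slot to the first.
neighbors = [(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]
```

Because every pair of servers shares a boundary somewhere on the ring, load can be shifted between any two of them by moving a single boundary.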

10 Dynamic Hash Space Scheduling
Two factors to measure server performance:
o Hit rate: enough memory for popular data
o Usage ratio: server processing
Minimize {cost = α · hit rate + (1 − α) · usage ratio}
Scheduling decision:
o Find the two memory servers whose costs differ the most
o Find the two adjacent virtual nodes that differ the most
Size of hash space moved at each scheduling decision
o Determines the speed of adaptability; larger moves adapt faster but cause more fluctuation
o Set using a ratio value (β)
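A single scheduling decision can be sketched in Python as follows. Several parts are assumptions made for illustration: the hit-rate term is expressed as a miss rate so that a lower cost is better, the weight α blends the two terms as the later α = 1.0 and α = 0 experiments suggest, β is treated as a fraction of the whole hash space, and the sketch tracks only each server's total share rather than individual virtual nodes. The names `cost` and `schedule_step` are hypothetical.

```python
def cost(miss_rate, usage_ratio, alpha):
    """Weighted cost: alpha = 1.0 uses only the hit-rate term,
    alpha = 0.0 only the usage ratio (assumed form)."""
    return alpha * miss_rate + (1.0 - alpha) * usage_ratio


def schedule_step(costs, shares, beta):
    """Shift a beta fraction of the hash space from the costliest
    server to the cheapest one and return the new shares."""
    hot = max(costs, key=costs.get)
    cold = min(costs, key=costs.get)
    updated = dict(shares)
    if hot != cold:
        moved = min(beta, updated[hot])  # never move more than the server owns
        updated[hot] -= moved
        updated[cold] += moved
    return updated


# Hypothetical measurements: (miss rate, usage ratio) per host.
measured = {"host1": (0.40, 0.90), "host2": (0.20, 0.50), "host3": (0.10, 0.20)}
costs = {h: cost(m, u, alpha=0.5) for h, (m, u) in measured.items()}
shares = schedule_step(costs, {h: 1 / 3 for h in measured}, beta=0.01)
```

Repeating this step moves hash space gradually from expensive servers to cheap ones. A small β keeps each move small, so the system adapts more slowly but fluctuates less.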

11 Node Addition / Removal
Balance out the requests across replicas so that overall performance improves
Servers that remain highly overloaded for a sustained period should be backed by new servers
Node addition
o Find the most costly memory server and its virtual node si
o Migrate sk to the new node
Node removal
o Find the least costly memory server and its virtual node sj
[Figure labels: si moved, sk removed]

12 Outline
Background and Motivation
Initial Hash Space Partitioning
Dynamic Adaptation
Evaluation
Conclusions

13 Experimental Setup
Lab setup
o Five experimental servers (4 Intel Xeon X… …67GHz processors, 16GB, and a 500GB 7200RPM hard drive)
Amazon setup
o 15 medium instances
[Figure: clients, web servers, and a proxy in front of the memcached nodes; an elastic decision (+/-) adds or removes memcached nodes from a memory pool]
All workloads are from Wikipedia data and access traces

14 Initial Hash Space Assignment
5 memory servers used (total 500 virtual nodes)
o For consistent hashing, 100 virtual nodes per server
o For our scheme, the initial set is 5 x 4 = 20, and 25 virtual nodes per node
The largest gap between the biggest hash size and the smallest hash size is 381,114,554 (20% more)
[Figures: hash space layout by server number, and hash space size (x10^6) per server number, for consistent and adaptive hashing]

15 Dynamic Partitioning
α = 1.0 (only hit rate)
α = 0 (only usage ratio)
[Figures for each setting: hit rate, number of requests per minute, and hash space share for Hosts 1 to 3; each host starts with 33.3% of the hash space]

16 α Behavior
When α = 0.5, β = 0.01
[Figures: hit rate, cost, number of requests per minute (x10^3), and hash space share for Hosts 1 to 3; each host starts with 33.3% of the hash space]

17 Node Addition / Removal
Addition: a new node reduces load on the overloaded server
o Hash space shares: 33.3% / 33.3% / 33.3% before; 10.7% / 26.7% / 17.2% / 45.3% after the host is added
Removal: removing an underloaded server gives cost benefits while maintaining performance
o Hash space shares: 20% for each of five hosts before; 25.1% / 24.7% / 27.8% / 22.2% after the host is removed
[Figures: number of requests per minute (x10^3) and hash space share over 3 hours, for addition and for removal]

18 β Behavior
β is the ratio of hash space moved at each scheduling decision
Determines the speed of adaptability
Use β = 0.01 (1%) to show the behavior
[Figures: number of requests per minute (x10^3) for Hosts 1 to 3 as traffic changes over 5 hours; moved hash space size (x10^6) at each scheduling decision]

19 Scaling Up / Down
Dynamically add / remove server(s) depending on load intensity
Watch each server for a period of time (5 min) to check whether high load is sustained
To maximize variation, α = 1 (hit rate only)
5 Wikipedia traffic generators used
[Figure: number of requests per minute (x10^3) and number of servers over time]
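The "watch for a period, then act" rule can be sketched in Python as a sliding window over load samples. This is a minimal illustration under stated assumptions: the thresholds, the one-sample-per-minute interval, and the name `SustainedLoadWatcher` are hypothetical and are not taken from the talk.

```python
from collections import deque


class SustainedLoadWatcher:
    """Signal a scaling action only when load stays beyond a threshold
    for the whole observation window."""

    def __init__(self, window, high, low):
        self.samples = deque(maxlen=window)  # most recent `window` samples
        self.high, self.low = high, low

    def observe(self, load):
        """Record one load sample and return 'add', 'remove', or 'hold'."""
        self.samples.append(load)
        if len(self.samples) < self.samples.maxlen:
            return "hold"  # not enough history yet
        if all(s > self.high for s in self.samples):
            return "add"
        if all(s < self.low for s in self.samples):
            return "remove"
        return "hold"


# Assumed: one sample per minute, so window=5 approximates the 5 min watch.
watcher = SustainedLoadWatcher(window=5, high=0.8, low=0.2)
decisions = [watcher.observe(load) for load in [0.9, 0.9, 0.9, 0.9, 0.9]]
```

Requiring the whole window to exceed the threshold avoids adding or removing servers in response to short spikes.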

20 QoE Improvement
Wikipedia workload achieves better response time as hit rate increases (45% increase)
But the number of servers used increases as well
As a recommendation, combining hit rate and usage rate (α = 0.5) is a good administrative choice
[Figures: average response time (ms) and number of memory servers used, for Ketama and for α values in [0.0, 1.0], ranging from usage rate to hit rate]

21 Related Work
[Stoica, ToN 03] Chord Peer-to-Peer architecture
[Nishtala, NSDI 13] Scaling Memcached at Facebook
[Zhu, HotCloud 12] Shrinking memcached to save $$
Ideas may apply to many other key-value based storage systems: Couchbase, Redis, SILT, FAWN, etc.

22 Conclusion
Summary
o A hash space allocation scheme: carefully place nodes to ensure adjacency
o Adaptive partitioning of the cache's hash space: maximize hit rate and minimize difference in utilization rate
o An automated replica management system: detect sustained overload and add or remove nodes
Future work
o Automatic α value adjustment to minimize response time
o Targeted management of hot objects without impacting application performance
