Distributed caching for cloud computing

Size: px

Start display at page:

Download "Distributed caching for cloud computing"

Allan Cole
5 years ago
Views:

1 Distributed caching for cloud computing Maxime Lorrillere, Julien Sopena, Sébastien Monnet et Pierre Sens February 11, 2013 Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

2 Introduction Context Cloud computing Host Host VM VM VM Cloud computing Computing resources as a service On-demand self-service Elastically provisioned QoS guarantees Google S3 Building a virtual platform Deal with different resources From different providers With different properties We need a common interface Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

3 Introduction Context Cloud computing Host Host VM VM VM Cache Cloud computing Computing resources as a service On-demand self-service Elastically provisioned QoS guarantees Google S3 Building a virtual platform Deal with different resources From different providers With different properties We need a common interface Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

4 Introduction Context Cloud computing Host Host VM VM VM Cache Cloud computing Computing resources as a service On-demand self-service Elastically provisioned QoS guarantees Google S3 Building a virtual platform Deal with different resources From different providers With different properties We need a common interface Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

5 Introduction Caching Distributed caching 100ms Remote hard drive CPU Main memory 10ms 100ns Local hard drive 100µs Main memory Local hard drive High capacity High latency Bottleneck Network Low latency Capacity? Main memory Very low latency Small capacity Principle of locality Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

6 Introduction Related works Distributed caching Related works Operating system layer: Application level [Memcached] Existing applications have to be updated Filesystem level [xfs, PAFS, Ceph] Guest operating system have to use a specific file system Bloc level [XHive, dm-cache] Incompatible with distributed file systems Applications Virtual File System Cache File systems Bloc level Existing solutions are not cloud aware Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

7 Introduction Related works Distributed caching Related works Operating system layer: Application level [Memcached] Existing applications have to be updated Filesystem level [xfs, PAFS, Ceph] Guest operating system have to use a specific file system Bloc level [XHive, dm-cache] Incompatible with distributed file systems Applications Virtual File System Cache File systems Bloc level Existing solutions are not cloud aware Our contribution: a generic approach to develop ditributed caches for cloud computing Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

8 Constraints Development of a distributed cache Implementation constraints Ensure genericity Integration into the Linux kernel Be non-intrusive Performance constraints Limit overhead Minimise memory footprint Memory management Applications Virtual File System Cache File systems Bloc level Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

9 Remote caches Remote cache Direct client cooperation Client p res miss req Remote memory extends local memory Easy localisation of data No data sharing hit Remote cache Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

10 Remote caches Remote cache Direct client cooperation Client p IO req res miss req Remote memory extends local memory Easy localisation of data No data sharing miss Remote cache Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

11 Remote caches Remote cache Direct client cooperation Client p IO res IO req res miss req Remote memory extends local memory Easy localisation of data No data sharing miss Remote cache Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

12 Architecture Architecture client server p miss Client Basic operations: get and put Blocking get Executed by the process in kernel-space Serveur Dedicated kernel thread Request-response Red-black tree Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

13 Architecture Architecture client server p miss get() M get M res Client Basic operations: get and put Blocking get Executed by the process in kernel-space Serveur Dedicated kernel thread Request-response Red-black tree Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

14 Architecture Architecture client server p miss IO res miss? get() IO req M get M res Client Basic operations: get and put Blocking get Executed by the process in kernel-space Serveur Dedicated kernel thread Request-response Red-black tree Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

15 Architecture Architecture alloc p Memory manager Client Basic operations: get and put put() called inside critical section client server Serveur Dedicated kernel thread Request-response Red-black tree Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

16 Architecture Architecture alloc p Memory manager put() Client Basic operations: get and put put() called inside critical section client server Serveur Dedicated kernel thread Request-response Red-black tree Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

17 Architecture Architecture alloc p Memory manager Client Basic operations: get and put M put M put put() put() called inside critical section Executed in a dedicated thread client server Serveur Dedicated kernel thread Request-response Red-black tree Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

18 Details and optimizations Metadata management Problem: metadata management efficiency. 1 hit? test(key 0) Solution: Bloom filter [Bloom 1970] miss hit? test(key 2) test(key 3) Probabilistic data structure Compact No false negative False positive possible Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

19 Details and optimizations Cache accesses management Problem: sequential access detection get(0) 0 get(1) 1 get(2) get(3) get(4) get(5) get(6) Solution: prefetching Sequential read detection Read prediction Read ahead of data Amortized network latency get(7) Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

20 Details and optimizations Communications management Problem: network buffers memory footprint Memory manager alloc p put() Solution: zero-copy Avoid copying into the network stack Decrease memory allocations Avoid deadlocks Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

21 Details and optimizations Communications management Problem: network buffers memory footprint Memory manager alloc alloc p put() Solution: zero-copy Avoid copying into the network stack Decrease memory allocations Avoid deadlocks Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

22 Evaluation Experiment setup Evaluation Experiment setup Virtualized platform Intel Core i (4 hyper-threaded cores), 8GB memory Cache server (2 cores, 4GB) Client (2 cores, 512MB) Reads from local virtual hard drive 1Gbit/s virtual network ( 600µs RTT) Micro-benchmark 32MB read Each read is split into fragments from 512 bytes to 8MB Each fragment is read at a random position from a file Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

23 Evaluation Performance evaluation Remote miss overhead Empty remote cache Throughput(MB/s) Classic Remotecache Fragment size (bytes) Bloom filter avoids remote miss Code execution has a negligible Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

24 Evaluation Performance evaluation Performance peak Data preloaded in remote cache Throughput (MB/s) Classic Remotecache Speed-up Speed-up Fragment size (bytes) Fragment size (bytes) Up to 8x performance improvement with small fragments (1KB) Performance drop above 128KB Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

25 Evaluation Performance evaluation Performance with local memory full of data Data preloaded in remotecache, full local memory Throughput (MB/s) Classic Remotecache Fragment size (bytes) Speed-up Fragment size (bytes) Speed-up Up to 6x performance improvement with small fragments (1KB) Performance drop above 64K Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

26 Conclusion Conclusion Summary Existing distributed caches are not cloud aware We propose an approach to develop distributed caches for the cloud Working non-intrusive prototype Promising: up to 8x performance improvement in random read Future works Realistics benchmarks: Memcached, dm-cache, bcache,... Sequential read performance improvements Consistency guarantees Maxime Lorrillere (LIP6/UPMC/CNRS) February 11, / 16

Buffer Management for XFS in Linux. William J. Earl SGI

Buffer Management for XFS in Linux William J. Earl SGI XFS Requirements for a Buffer Cache Delayed allocation of disk space for cached writes supports high write performance Delayed allocation main memory