Warming up Storage-level Caches with Bonfire

Size: px

Start display at page:

Download "Warming up Storage-level Caches with Bonfire"

Theodora Booth
5 years ago
Views:

1 Warming up Storage-level Caches with Bonfire Yiying Zhang Gokul Soundararajan Mark W. Storer Lakshmi N. Bairavasundaram Sethuraman Subbiah Andrea C. Arpaci-Dusseau Remzi H. Arpaci-Dusseau

2 2 Does on-demand cache warmup still work?

Memory (Cache) Size (GB) 3 65536.0 10s of TBs 4096.0 256.0 16.0 1 TB Cache... 1.0 0.

3 Memory (Cache) Size (GB) s of TBs TB Cache Sequential: 2.5 hours < 10 GB Random: 6 days + idle time Year

4 Read Hit Rate (%) 4 How Long Does On-demand Warmup Take? Read hit rate difference between warm cache and on-demand On-demand warmup takes hours to days 20 Always-warm Cold On-demand Time (hour) * Simulation results from a project server trace

5 5 To Make Things Worse Caches are critical Key component to meet application SLAs Reduce storage server I/O load Cache warmup happens often Storage server restart Storage server take-over Dynamic caching [Narayanan 08, Bairavasundaram 12]

6 6 On-demand Warmup Doesn t Work Anymore What Can We Do? Bonfire Monitors and logs I/Os Load warmup data in bulk I/O Bonfire Monitor Warmup Information Challenges What to monitor & log? Effective How to monitor & log? Efficient How to load warmup data? Fast General solution Storage System Logging Volume

7 7 Summary of Contributions Trace analysis for storage-level cache warmup Temporal and spatial patterns of reaccesses Cache warmup algorithm design and simulation Implementation and evaluation of Bonfire Up to 100% warmup time improvement over on-demand Up to 200% more server I/O load reduction Up to 5 times lower read latency Low overhead

8 8 Outline Introduction Trace analysis for cache warmup Cache warmup algorithm study with simulation Bonfire architecture Evaluation results Conclusion

9 9 Workload Study Trace Selection MSR-Cambridge [Narayanan 08] 36 one-week block-level traces from MSR-Cambridge data center servers Filter out write-intensive, small working set, and low reaccess-rate Server Function #volumes mds Media server 1 prn Print server 1 proj Project directories 3 src1 Source control 1 usr User home directories 2 web Web/SQL server 1 Reaccesses: Read After Reads and Read After Writes

10 LBA 10 Questions for Trace Study Time Space Relative Absolute Q1: What s the temporal distance? Q2: When do reaccesses happen (in terms of wall clock time)? Q3: What s the spatial distance? Any clustering of reaccesses? Q4: Where do reaccesses happen (in terms of LBA)? hours 1 hour AM 1 AM 2 AM 3 AM 4 AM 5 AM 6 AM

11 Amount of Reaccesses (%) 11 Q1: What is the Temporal Distance? Time b/w Reaccesses for All Traces Within an hour: Hourly hours: Daily Temporal Distance (Hour)

12 Amount of Reacceses (%) 12 Q1: What is the Temporal Distance? Hourly Dominated Hourly Daily Other Daily Dominated Other 20 0 mds_1 proj_1 proj_4 usr_1 src1_1 web_2 prn_1 proj_2 usr_2 A1: Two main reaccess patterns: Hourly & Daily In an hour, recent blocks more likely reaccessed

13 Daily Reaccesses (%) 13 Q2: When Do Reaccesses Happen (Wall Clock Time)? src11 web Time (day) A2: Daily reaccesses at same time every day

14 Amount of Reaccesses (%) 14 Q3: What is the Spatial Distance? 100% Hourly Dominated Daily Dominated Other 80% 60% 40% 20% 0% mds1 proj1 proj4 usr1 src11 web2 prn1 proj2 usr2 <10MB 10MB-1GB 1-10GB GB >100GB A3: Spatial distance usually small for hourly sometimes small for other reaccesses

15 Pcentage of Reacceses in 1MB Region (%) 15 Q3: Any spatial clustering among reaccesses? Percentage of 1MB regions that have reaccesses Hourly Daily Other mds_1 proj_1 proj_4 usr_1 src1_1 web_2 prn_1 proj_2 usr_2 A3: Daily reaccesses more spatially clustered

16 16 Trace Analysis Summary and Implications Relative Absolute Time A1: Reaccesses have two main temporal patterns: within 1 hour, around 1 day A2: Daily reaccesses correlate with wall clock time Space A3: Hourly reaccesses are close in spatial distance. Daily reaccesses exhibit spatial clustering. A4: No hot spot of reaccesses in LBA space A1 Hourly: Use recently accessed blocks A1 and A2 Daily: Use same period from previous day A3 Small spatial distance: Size of monitoring buffer is small

17 17 Outline Introduction Trace analysis for cache warmup Cache warmup algorithm study with simulation Bonfire architecture Evaluation results Conclusion

18 Read Hit Rate (%) 18 Metrics: Warmup Time Warmup period: Hit-rate convergence time 100 Converge time loose Converge time strict Always-warm cache New cache Time

19 19 Metrics: Server I/O Reduction Storage server I/O load reduction Amount of I/Os going to cache Total I/Os Improvement in server I/O load reduction Server I/O load reduction of Bonfire Server I load reduction of On demand O during convergence time

20 20 Cache Warmup Algorithms Last-K: Last K regions accessed in the trace First-K: First K regions in the past 24 hours Top-K: K most frequent regions Random-K: Random K regions T R Region: granularity of monitoring and logging e.g., 1MB I/Os F L 24 Hours cache starts Time

21 21 Simulation Results - Overall LRU cache simulator with four warmup algorithms Convergence time Improves 14% to 100% Server I/O load reduction Improves 44% to 228% In general, Last-K is the best First-K works for special case (known patterns)

22 22 Outline Introduction Trace analysis for cache warmup Cache warmup algorithm study with simulation Bonfire architecture Evaluation results Conclusion

23 23 Bonfire Design Design principles Low overhead monitoring and logging (efficient) Bulk loading useful warmup data (effective and fast) General design applicable to a range of scenarios Techniques Last-K Monitors I/O below the server buffer cache Performance snapshot

24 24 Bonfire Architecture: Monitoring I/O Buffer Cache I/O Bonfire Monitor I/O In-memory Staging Buffer Data Volumes Warmup Data Warmup Metadata. n Performance Snapshot Logging Volume Only store warmup metadata: metadata-only Store warmup metadata and data: metadata+data Storage System

25 25 Bonfire Architecture: Bulk Cache Warmup I/O Buffer Cache New Cache Warmup Data Bonfire Monitor In-mem n n-1.. k I/O Sorted by LBA Data Volumes Warmup Data Warmup Metadata. n Performance Snapshot Logging Volume Storage System

26 26 Outline Introduction Trace analysis for cache warmup Cache warmup algorithm study with simulation Bonfire architecture Evaluation results Conclusion

27 27 Evaluation Set Up Implemented Bonfire as a trace replayer Always-warm, on-demand, and Bonfire Metadata-only and metadata+data Replay traces using sync I/Os Workloads Synthetic workloads MSR-Cambridge traces Metrics Benefits and overheads week On-demand Bonfire Always-warm

28 Read Hit Rate (%) 28 Benefit Results - Read Hit Rate of MSR Trace Higher read hit rate => less server I/O load 100 On-demand converges Cache Starts Bonfire converges Always-warm Cold On-demand Bonfire Num of I/Os (x ) Day 5 Day 6 * Results of a project server trace from MSR-Cambridge trace set

5 Cache Starts Always-warm Cold On-demand Bonfire 1 0.

29 Read Latency (ms) 29 Benefit Results - Read Latency of MSR Trace Lower read latency => better application-perceived performance Cache Starts Always-warm Cold On-demand Bonfire Num of I/Os (x ) Day 5 Day 6 * Results of a project server trace from MSR-Cambridge trace set

30 30 Overhead Results I/O Buffer Cache Bonfire Monitor I/O In-memory Staging Buffer 256KB & 128MB Data Volumes Warmup Data Warmup Metadata Storage System 9KB/s to 476KB/s 4.6MB/s to 238MB/s 19MB to 71MB n Performance Snapshot 9.5GB to 36GB Logging Volume

31 Overhead Results Buffer Cache I/O New Cache 2 to 20 minutes Warmup Data Bonfire Monitor I/O In-memory Staging Buffer 256KB & 128MB Data Volumes Warmup Data 9KB/s to

31 31 Overhead Results Buffer Cache I/O New Cache 2 to 20 minutes Warmup Data Bonfire Monitor I/O In-memory Staging Buffer 256KB & 128MB Data Volumes Warmup Data 9KB/s to 476KB/s Warmup Metadata 4.6MB/s to 238MB/s Proper Bonfire scheme and configuration 19MB to 71MB n Performance Snapshot 9.5GB to 36GB Logging Volume Storage System

32 32 Summary of Results Faster cache warmup 59% to 100% improvement over on-demand Less storage server I/O load 38% to 200% more reduction than on-demand Better application-perceived latency Avg read latency 1/5 to 2/3 of on-demand Small controllable overhead

33 33 Conclusion On-demand warmup doesn t work anymore Warm up terabytes of caches take days Bonfire and beyond Client-side cache warmup Application-aware warmup In need for more long big public traces!

34 34 Thank You Questions?

Warped Mirrors for Flash Yiying Zhang

Warped Mirrors for Flash Yiying Zhang Andrea C. Arpaci-Dusseau Remzi H. Arpaci-Dusseau 2 3 Flash-based SSDs in Storage Systems Using commercial SSDs in storage layer Good performance Easy to use Relatively