Manylogs: Improving CMR/SMR Disk Bandwidth & Latency
Tiratat Patana-anake, Vincentius Martin, Nora Sandler, Cheng Wu, and Haryadi S. Gunawi
Manylogs @ MSST '16
User 1 (Big Read): got 100% of the read bandwidth
Fair share: User 1 (Big Read) and User 2 (Big Read) each got 50% of the read bandwidth
Still fair! With N users (1, 2, 3, up to N), each gets 1/N of the bandwidth
Oh no! 5% bandwidth?! User 1: Big Read. User 2: Small Durable Writes.
More Bandwidth Please! Faster Latency Please! User 1: Big Read. User 2: Small Durable Writes.
Ordered Journaling vs. Data Journaling [diagram: Journal, Big Read, Small Writes, Seek]
Problems with Current Journaling: Ordered Journaling; Data Journaling (First Write, Second Write)
Introducing Manylogs: Single Log vs. Manylogs [diagram labels: 10 MB, 100 MB]
Manylogs: small writes are made durable to the nearest log without seeking [diagram: Journal, Big Read, Small Writes, Seek]
Manylogs
- Reserved log spaces uniformly across the disk: 10 MB every 100 MB
- Follow the disk head (last big I/O)
- Redirect small writes (e.g. 256 KB) to the nearest log, i.e. the log closest to the last big I/O
- Sequential writes are left untouched
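The redirection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the placement of each log at the start of its 100 MB region, and the use of byte distance as a stand-in for seek distance are all assumptions. Only the 10 MB / 100 MB layout and the 256 KB example threshold come from the slide.

```python
MB = 1024 * 1024
REGION_SIZE = 100 * MB        # one log is reserved in every 100 MB (from the slide)
LOG_SIZE = 10 * MB            # each reserved log is 10 MB (from the slide)
SMALL_WRITE_MAX = 256 * 1024  # example small-write threshold (from the slide)


def log_span(region_index):
    """(start, end) byte offsets of a region's log.

    Assumption: the log sits at the start of its region.
    """
    start = region_index * REGION_SIZE
    return start, start + LOG_SIZE


def nearest_log(last_big_io_offset, disk_size):
    """Start offset of the log closest to the last big I/O (the disk head)."""
    num_regions = disk_size // REGION_SIZE
    idx = min(last_big_io_offset // REGION_SIZE, num_regions - 1)
    # Only the log of the current region and of the next one can be closest.
    candidates = [i for i in (idx, idx + 1) if i < num_regions]
    starts = [log_span(i)[0] for i in candidates]
    return min(starts, key=lambda s: abs(s - last_big_io_offset))


def route_write(offset, size, is_sequential, last_big_io_offset, disk_size):
    """Return ("log", log_offset) for redirected writes, else ("in_place", offset)."""
    if is_sequential or size > SMALL_WRITE_MAX:
        return ("in_place", offset)  # sequential and large writes are untouched
    return ("log", nearest_log(last_big_io_offset, disk_size))
```

For example, with the head last seen at 260 MB on a 1000 MB disk, a 4 KB random write anywhere on the disk would be routed to the log starting at 300 MB under these assumptions.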
Manylogs: Increased Read Throughput, Reduced Write Latency
Where are logs on the disk? [diagram labels: 10 MB, 100 MB] The log space = whole platter
Same cylinder = No seek! Log for others in the same cylinder
User 1: 128 MB sequential reads (metric: ratio of max read bandwidth). User 2: 4 KB random writes at different intensities, 40, 80, 160, and 320 writes/s (metric: latency in ms). Ordered vs. Data vs. Adaptive vs. Manylogs.
Adaptive Journaling
- Middle ground between ordered journaling and data journaling
- Single-log design
- Prabhakaran et al., ATC '05
Results [figure: ratio of max bandwidth (0% to 100%) vs. random writes per second (40, 80, 160, 320 IOPS) for Ordered, Adaptive, Data, and Manylogs]
Callouts on the plot: 57% at 40 IOPS and 5% at 320 IOPS; 73% at 40 IOPS and 9% at 320 IOPS
Manylogs gives the most bandwidth: 50% of max bandwidth at extreme IOPS
Hey! What about my latency?
Results [figure: average sync latency (0 to 600 ms) vs. random writes per second (40, 80, 160, 320 IOPS) for Ordered, Adaptive, Data, and Manylogs]
WOW! Fast latency at extreme IOPS!
Results [figure: ratio of max bandwidth (0% to 100%) and average sync latency (0 to 600 ms) vs. random writes per second (40, 80, 160, 320 IOPS) for Ordered, Adaptive, Data, and Manylogs]
User 1: 128 MB sequential reads (metric: ratio of max read bandwidth). User 2: fileserver using Filebench, multi-threaded, with 2, 4, and 8 instances (metric: latency in ms).
Manylogs provides the best outcomes! [figure: ratio of max bandwidth (0% to 100%) and operation latency (0 to 4000 ms) vs. fileserver instances (2, 4, 8) for Ordered, Data, and Manylogs]
Checkpointing
Data Journaling
- Periodically, usually every 5 secs
- Journal can get filled fast because all writes are in the journal!
Manylogs
- Lazy or off-hours
- Rarely full because just small writes are redirected
- Log Swapping
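The contrast between the two checkpointing policies can be sketched as two small predicates. This is a hedged illustration only: the function names are hypothetical, "off-hours" is modeled here as a simple `disk_idle` flag, and log swapping is not modeled. The 5-second period is the one figure taken from the slide.

```python
CHECKPOINT_PERIOD_S = 5.0  # data journaling: "usually every 5 secs" (from the slide)


def data_journaling_should_checkpoint(now_s, last_checkpoint_s):
    """Periodic policy: checkpoint whenever the period has elapsed."""
    return now_s - last_checkpoint_s >= CHECKPOINT_PERIOD_S


def manylogs_should_checkpoint(disk_idle, logged_bytes):
    """Lazy policy (assumed form): checkpoint only when the disk is idle
    and there is logged data waiting to be moved to its final location."""
    return disk_idle and logged_bytes > 0
```

Under this sketch, a busy disk never pays for Manylogs checkpointing in the foreground, whereas data journaling checkpoints on a timer regardless of load.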
Log Swapping: Hot Area! Cold Area! [diagram: Journal, Big Read, Small Writes, Seek]
Integrations
- File System (MLFS): Durability-Only Mode (O_DUR)
- SMR Disk (MLSMR)
- RAID
Cassandra Write Path [diagram: Writes go to the Memtable (memory) and the Commit Log (disk); SSTable (disk)]
Cassandra Write Path
- Commit Log requires fast durability: ideally a SYNC write
- Random writes are the problem
- But in practice, a background write, e.g. every 10 seconds
- Temp file: no location needed
[diagram: Writes, Memtable, Commit Log, SSTable; Flush Triggered; Memory vs. Disk]
open(file, O_DUR);
- Need fast durability but not location constraints
- Content of files will be put in Manylogs regardless of the write size
- Never checkpoint their content
- Random writes are not a problem anymore!
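The durability-only semantics listed above can be expressed as a small policy sketch. Everything here is illustrative: `OpenFile`, `goes_to_log`, and `needs_checkpoint` are hypothetical names, the `o_dur` field merely stands in for a file opened with `O_DUR`, and the rule for regular files is reduced to the small-write threshold (sequential-write detection is left out).

```python
from dataclasses import dataclass

SMALL_WRITE_MAX = 256 * 1024  # example small-write threshold (from the slides)


@dataclass
class OpenFile:
    name: str
    o_dur: bool = False  # hypothetical stand-in for open(file, O_DUR)


def goes_to_log(f, write_size):
    """O_DUR files are logged regardless of write size;
    regular files are logged only for small writes."""
    return f.o_dur or write_size <= SMALL_WRITE_MAX


def needs_checkpoint(f):
    """O_DUR content is never checkpointed; regular logged data
    must eventually be moved to its final location."""
    return not f.o_dur


commit_log = OpenFile("commitlog", o_dur=True)
data_file = OpenFile("datafile")
print(goes_to_log(commit_log, 8 * 1024 * 1024))  # large write, still logged
print(goes_to_log(data_file, 8 * 1024 * 1024))   # large write, written in place
```

The design point is that a file such as a commit log needs durability but no fixed location, so it can live in the logs permanently and skip checkpointing altogether.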
User 1: HDFS (metric: ratio of max read bandwidth). User 2: MongoDB with 1, 2, and 4 instances (metric: latency in ms).
Most Bandwidth & Lowest Latency with Manylogs [figure: ratio of max bandwidth and MongoDB write latency (0 to 250 ms) vs. MongoDB instances (1DB, 2DB, 4DB, w/ default flush period) for Data and Manylogs]
Manylogs & SMR: One non-shingled surface = log space
Manylogs & SMR [diagram: Non-Shingled Tracks, Shingled Band]
User 1: Build Server (WBS) trace (metric: latency in ms). User 2: Back-end Server (LM-TBE) trace (metric: latency in ms).
SMR: Manylogs SMR (MLSMR) vs. Single-log SMR (SLSMR) [figure: percentile (0 to 100) vs. latency (0 to 50 ms) for ML-SMR and SL-SMR]
Manylogs & RAID: User 1: Big Read. User 2: Small Durable Writes. [diagram: Disk #1, Disk #2, Disk #3, Disk #4] Max Bandwidth! Bandwidth drops up to 50%
Mingzhe Hao, Gokul Soundararajan, Deepak Kenchammana-Hosekote, Andrew A. Chien, and Haryadi S. Gunawi. "The Tail at Store: A Revelation from Millions of Hours of Disk and SSD Deployments." FAST '16.
User 1: 128 MB sequential reads (metric: ratio of max read bandwidth). User 2: 4 KB random writes at different intensities, 40, 80, 160, and 320 writes/s (metric: latency in ms).
10x Bandwidth Speed-up, 14x Latency Speed-up at 320 IOPS! [figure: ratio of max bandwidth and average sync latency (0 to 600 ms) vs. random writes per second per disk (40, 80, 160, 320 IOPS) for Ordered, Data, and Manylogs]
More in the paper
- Block-Level Manylogs
- Other workloads: Sequential Writes, varmail, More Traces
- Log Size
- Logged Write Size
- Mapping Table
Manylogs
- Reserved log spaces uniformly across the disk
- Redirect small writes to the nearest log
- Can help with NoSQL, SMR, RAID, and more!
- Provide up to 5x speed-up on average
Manylogs speed-ups (Bandwidth Speed-up, Latency Speed-up)
vs. Ordered: 3.7x, 5.7x
vs. Adaptive: 2.7x, 2.0x
vs. Single-log SMR: 1.3x
Thank you! Questions? http://ucare.cs.uchicago.edu