ThyNVM. Enabling So1ware- Transparent Crash Consistency In Persistent Memory Systems

Size: px

Start display at page:

Download "ThyNVM. Enabling So1ware- Transparent Crash Consistency In Persistent Memory Systems"

Milo Walters
5 years ago
Views:

1 ThyNVM Enabling So1ware- Transparent Crash Consistency In Persistent Memory Systems Jinglei Ren, Jishen Zhao, Samira Khan, Jongmoo Choi, Yongwei Wu, and Onur Mutlu

2 TWO- LEVEL STORAGE MODEL MEMORY CPU STORAGE Ld/St DRAM FILE I/O VOLATILE FAST BYTE ADDR NONVOLATILE SLOW BLOCK ADDR 2

3 TWO- LEVEL STORAGE MODEL MEMORY STORAGE CPU Ld/St DRAM FILE I/O NVM PCM, STT-RAM VOLATILE FAST BYTE ADDR NONVOLATILE SLOW BLOCK ADDR Non- volamle memories combine characterismcs of memory and storage 3

4 PERSISTENT MEMORY CPU Ld/St NVM PERSISTENT MEMORY Provides an opportunity to manipulate persistent data directly 4

5 CHALLENGE: CRASH CONSISTENCY Persistent Memory System System crash can result in permanent data corrupmon in NVM 5

6 CURRENT SOLUTIONS Explicit interfaces to manage consistency NV- Heaps [ASPLOS 11], BPFS [SOSP 09], Mnemosyne [ASPLOS 11] AtomicBegin { Insert a new node; } AtomicEnd; Limits adopmon of NVM Have to rewrite code with clear parmmon between volamle and non- volamle data Burden on the programmers 6

7 OUR APPROACH: ThyNVM Goal: Software transparent consistency in persistent memory systems 7

ThyNVM: Summary A new hardware-based checkpointing mechanism Checkpoints at mul$ple granulari$es to reduce both checkpoineng latency and metadata overhead Overlaps

8 ThyNVM: Summary A new hardware-based checkpointing mechanism Checkpoints at mul$ple granulari$es to reduce both checkpoineng latency and metadata overhead Overlaps checkpoin$ng and execu$on to reduce checkpoineng latency Adapts to DRAM and NVM characterisecs 4.9% Performs within of an idealized DRAM with zero cost consistency 8

9 OUTLINE Crash Consistency Problem Current Solutions ThyNVM Evaluation Conclusion 9

10 CRASH CONSISTENCY PROBLEM Add a node to a linked list 2. Link to prev 1. Link to next System crash can result in inconsistent memory state 10

11 OUTLINE Crash Consistency Problem Current Solutions ThyNVM Evaluation Conclusion 11

12 CURRENT SOLUTIONS Explicit interfaces to manage consistency NV- Heaps [ASPLOS 11], BPFS [SOSP 09], Mnemosyne [ASPLOS 11] Example Code update a node in a persistent hash table void hashtable_update(hashtable_t* ht, void *key, void *data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) list_find(chain, &updatepair); pair->second = data; 12 }

13 CURRENT SOLUTIONS void TMhashtable_update(TMARCGDECL hashtable_t* ht, void *key, void*data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) TMLIST_FIND(chain, &updatepair); pair->second = data; } 13

14 CURRENT SOLUTIONS Manual declaramon of persistent components void TMhashtable_update(TMARCGDECL hashtable_t* ht, void *key, void*data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) TMLIST_FIND(chain, &updatepair); pair->second = data; } 14

15 CURRENT SOLUTIONS Manual declaramon of persistent components void TMhashtable_update(TMARCGDECL hashtable_t* ht, void *key, void*data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) TMLIST_FIND(chain, &updatepair); pair->second = data; } Need a new implementamon 15

16 CURRENT SOLUTIONS Manual declaramon of persistent components void TMhashtable_update(TMARCGDECL hashtable_t* ht, void *key, void*data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) TMLIST_FIND(chain, &updatepair); pair->second = data; Third party code } can be inconsistent Need a new implementamon 16

$CURRENT SOLUTIONS Manual declaramon of persistent components void TMhashtable_update(TMARCGDECL hashtable_t* ht, void *key, void*data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t$

17 CURRENT SOLUTIONS Manual declaramon of persistent components void TMhashtable_update(TMARCGDECL hashtable_t* ht, void *key, void*data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) TMLIST_FIND(chain, Prohibited &updatepair); pair->second = data; Third party code } OperaMon can be inconsistent Need a new implementamon Burden on the programmers 17

18 OUTLINE Crash Consistency Problem Current Solutions ThyNVM Evaluation Conclusion 18

19 OUR GOAL Software transparent consistency in persistent memory systems Execute legacy applica$ons Reduce burden on programmers Enable easier integra$on of NVM 19

20 NO MODIFICATION IN THE CODE void hashtable_update(hashtable_t* ht, void *key, void *data) { list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) list_find(chain, &updatepair); pair->second = data; }

$*data){ list_t* chain = get_chain(ht, key); pair_t*$ first = key; pair = (pair_t*) list_find(chain,

21 RUN THE EXACT SAME CODE void hashtable_update(hashtable_t* ht, void *key, void *data){ list_t* chain = get_chain(ht, key); pair_t* pair; pair_t updatepair; updatepair.first = key; pair = (pair_t*) list_find(chain, &updatepair); pair->second = data; } Persistent Memory System So1ware transparent memory crash consistency 21

22 ThyNVM APPROACH Periodic checkpointing of data managed by hardware time Running CheckpoinMng Running CheckpoinMng Epoch 0 Epoch 1 Transparent to applicamon and system 22

23 CHECKPOINTING OVERHEAD 1. Metadata overhead Metadata Table Working locamon Checkpoint locamon X X Y Y time Running CheckpoinMng Running CheckpoinMng Epoch 0 Epoch 1 2. Checkpointing latency 23

24 1. METADATA AND CHECKPOINTING GRANULARITY Working locamon Checkpoint locamon X X Y Y PAGE CACHE BLOCK PAGE GRANULARITY One Entry Per Page Small Metadata BLOCK GRANULARITY One Entry Per Block Huge Metadata 24

25 2. LATENCY AND LOCATION DRAM- BASED WRITEBACK Working locamon 2. Update the metadata table Checkpoint locamon 1. Writeback data from DRAM X X W DRAM NVM Long latency of wrimng back data to NVM 25

26 2. LATENCY AND LOCATION NVM- BASED REMAPPING Working locamon Y 2. Update the metadata table Checkpoint locamon Write No copying a new X of locaeon data DRAM NVM Short latency in NVM- based remapping 26

DRAM: long latency Remap in NVM: short latency Based on these we propose two

27 ThyNVM KEY MECHANISMS CheckpoinMng granularity Small granularity: large metadata Large granularity: small metadata Latency and locamon Writeback from DRAM: long latency Remap in NVM: short latency Based on these we propose two key mechanisms 1. Dual granularity checkpoinmng 2. Overlap of execumon and checkpoinmng

28 1. DUAL GRANULARITY CHECKPOINTING Page Writeback in DRAM Block Remapping in NVM DRAM GOOD FOR STREAMING WRITES NVM GOOD FOR RANDOM WRITES High write locality pages in DRAM, low write locality pages in NVM 28

29 2. OVERLAPPING CHECKOINTING AND EXECUTION time Running CheckpoinMng Running CheckpoinMng Epoch 0 Epoch 1 Epoch 0 Epoch 1 time Running CheckpoinMng Running CheckpoinMng Running CheckpoinMng Epoch 0 Epoch 1 Epoch 2 Hides the long latency of Page Writeback

30 OUTLINE Crash Consistency Problem Current Solutions ThyNVM Evaluation Conclusion 30

31 METHODOLOGY Cycle accurate x86 simulator Gem5 Comparison Points: Ideal DRAM: DRAM- based, no cost for consistency Lowest latency system Ideal NVM: NVM- based, no cost for consistency NVM has higher latency than DRAM Journaling: Hybrid, commit dirty cache blocks Leverages DRAM to buffer dirty blocks Shadow Paging: Hybrid, copy- on- write pages Leverages DRAM to buffer dirty pages 31

ADAPTIVITY TO ACCESS PATTERN RANDOM SEQUENTIAL B ET T E R Normalized Write Traffic To NVM 3 2 1 0 Journal Shadow ThyNVM Normalized Write Traffic

32 ADAPTIVITY TO ACCESS PATTERN RANDOM SEQUENTIAL B ET T E R Normalized Write Traffic To NVM Journal Shadow ThyNVM Normalized Write Traffic To NVM Journal Shadow ThyNVM Journaling is beher for Random and Shadow paging is beher for SequenMal ThyNVM adapts to both access paherns 32

Journal Shadow ThyNVM ThyNVM Can spend spends 35-45% only of 2.4%/5.

33 OVERLAPPING CHECKPOINTING AND EXECUTION 60 RANDOM 60 SEQUENTIAL B ET T E R Percentage of ExecuMon Time Percentage of ExecuMon Time Journal Shadow ThyNVM 0 Journal Shadow ThyNVM ThyNVM Can spend spends 35-45% only of 2.4%/5.5% the execumon of the execumon Mme checkpoinmng on checkpoinmng Stalls the applicamon for a negligible Mme 33

PERFORMANCE OF LEGACY CODE Normalized IPC 1 0.8 0.6 0.4 0.

34 PERFORMANCE OF LEGACY CODE Normalized IPC Ideal DRAM Ideal NVM ThyNVM gcc bwaves milc leslie. soplex Gems lbm omnet Within - 4.9%/+2.7% of an idealized DRAM/NVM system Provides consistency without significant performance overhead 34

35 OUTLINE Crash Consistency Problem Current Solutions ThyNVM Evaluation Conclusion 35

36 ThyNVM A new hardware-based checkpointing mechanism, with no programming effort Checkpoints at mul$ple granulari$es to minimize both latency and metadata Overlaps checkpoin$ng and execu$on Adapts to DRAM and NVM characterisecs Can enable widespread adop$on of persistent memory 36

37 Available at hhp://persper.com/thynvm ThyNVM Enabling So1ware- transparent Crash Consistency In Persistent Memory Systems Jinglei Ren, Jishen Zhao, Samira Khan, Jongmoo Choi, Yongwei Wu, and Onur Mutlu

SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS

SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS Samira Khan MEMORY IN TODAY S SYSTEM Processor DRAM Memory Storage DRAM is critical for performance