Relaxing Persistent Memory Constraints with Hardware-Driven Undo+Redo Logging

Matheus A. Ogleari, Ethan L. Miller, Jishen Zhao
University of California, Santa Cruz
November 19, 2016

Abstract

Persistent memory is a new tier of memory that functions as a hybrid of traditional storage systems and main memory. It combines the benefits of both: the data persistence property of storage with the fast load/store interface of memory. Yet, efficiently supporting data persistence in memory requires non-trivial effort. In particular, logging is a widely used data persistence scheme due to its portability and flexibility benefits compared with logless mechanisms. However, traditional software logging mechanisms are expensive to adopt in persistent memory, due to their overhead on performance, energy, and implementation. Additionally, previously proposed hardware-based logging schemes do not fully address the issues with software. To address these challenges, we propose a hardware-driven undo+redo logging scheme, which maintains data persistence by leveraging the otherwise largely wasted hardware information available in the cache hierarchy. Our method allows persistent memory to exploit undo+redo logging to relax data persistence constraints on caching without nonvolatile on-chip buffering or caching components. Our evaluation across persistent memory microbenchmarks and real workloads demonstrates that our design leads to significant system throughput improvement and reductions in both dynamic energy and memory traffic.

1 Introduction

Persistent memory is a new technology that bridges the decades-long divide between memory and storage systems by having the properties of both. Recent studies implement persistent memory by placing emerging nonvolatile memory devices, such as phase-change memory [44], resistive RAM [7], spin-transfer torque RAM [51], and 3D XPoint memory [17], on the memory bus. Persistent memory provides data persistence support in memory (§2). One of the mechanisms that provides such support is logging (§2.1). This paper focuses on devising efficient logging support in persistent memory systems.

Software-based logging is one of the most commonly used schemes for maintaining data persistence in persistent memory systems. It gives programmers substantial control over data persistence. However, it is inefficient for a number of reasons. First, software logging requires changes to existing legacy code. Second, it does not guarantee data persistence in multithreaded programs in the face of context switching, as discussed further in §2.5. Third, software logging inflicts tight ordering constraints, because proper use of undo+redo logging in memory systems is expensive and unfeasible in software (discussed in §2.3 and §2.4). Other works have proposed hardware-based logging solutions (see §8). However, they either give up the advantages of eliminating memory fence instructions and relaxing ordering constraints, or are too costly in hardware. More importantly, they implement either undo or redo logging, but not both. Doing both comes at an especially high hardware cost in those designs because they do not utilize the key information from the cache hierarchy that our design does, which is what allows us to implement this feature efficiently. Our goal in this paper is to design an efficient scheme that both enables high-performance, energy-efficient hardware-based undo+redo logging in persistent memory and relaxes data persistence constraints, without adding nonvolatile buffers or caches to the processor.
We propose a persistent memory undo+redo logging scheme that enables steal/no-force (a concept borrowed from database design) [31] of persistent data updates across caches and memory controllers at low cost.

Our scheme incorporates both undo and redo logging. The undo log allows cache blocks of uncommitted transactions to steal (i.e., write back) into NVRAM, eliminating memory barriers. The redo log allows transactions to restore their latest state without forcing (i.e., no-force) cache blocks of committed transactions out of the processor, reducing cache flushes and forced write-backs. Supporting both undo and redo logging may seem expensive, but we found that the undo and redo information naturally exists in the cache hierarchy. This information, which is wasted in most previous persistent memory designs, allows us to enable practical undo+redo logging support at low performance and implementation cost. As a result, combining undo and redo logging provides both steal and no-force benefits, which significantly relaxes the ordering constraints that persistent memory has placed on caching of original data updates. In order to enable undo+redo logging in persistent memory at low cost, we propose two hardware mechanisms: WAL and FWB. WAL implements automatic write-ahead logging in hardware for persistent data. FWB implements automatic, periodic forced write-back in the caches.

This paper makes the following contributions:

- We identify the critical overheads in traditional software-based logging mechanisms, imposed by under-utilization of the information already available in the CPU cache hierarchy of persistent memory systems.
- We propose a hardware-driven undo+redo logging scheme, which substantially reduces the implementation overheads and relaxes the data persistence constraints. This reduces the number of software instructions executed in the pipeline and simplifies the software interface.
- We ensure the order between original data and its log updates in NVRAM without explicit cache flushes or memory barriers.
- Our design exploits detailed hardware information that can secure data persistence in multithreading scenarios. Previous software-based logging schemes can risk data persistence with multithreaded workloads.
- We devise a set of efficient hardware design techniques that enable our logging scheme with lightweight software support and modifications to processors. We update the log by leveraging the hardware information naturally available in caches.

2 Background and Motivation

Modern computer systems decouple fast data access and data persistence by adopting separate main memory (DRAM) and storage (disks and flash) components. This decades-long model is being enriched by a new tier of memory, persistent memory, which offers a single-level, flat data storage model by incorporating the persistence property of storage and the fast load/store access of memory [1, 4, 30, 24]. Persistent memory is enabled by attaching byte-addressable nonvolatile memories (NVRAMs), such as spin-transfer torque RAM [51], phase-change memory [44], resistive RAM [7], battery-backed DRAM [6], and 3D XPoint memory [17], directly to the processor-memory bus. As NVRAMs are expected to be on the market in 2017 [17], persistent memory can radically change the landscape of software and hardware design in future computer systems.

Persistent memory is fundamentally different from traditional DRAM-based main memory or its NVRAM replacements, due to its persistence (a.k.a. crash consistency) property inherited from storage systems.
It needs to ensure the integrity of in-memory data despite system failures, such as system crashes and power outages [1, 46, 37, 33, 36, 20, 26, 27, 19, 25, 8]. This persistence property is not guaranteed by the memory consistency of traditional memory systems: memory consistency only ensures a consistent global view of the caches and main memory, while hardware can still reorder loads and stores within this hierarchy, and data in volatile caches can be lost during system failures. This leaves NVRAM in a partially updated, inconsistent state. Consequently, persistent memory systems need to carefully control the writes across the processor-NVRAM interface and ensure that the data in NVRAM, standing alone, is consistent [36, 46, 20].

One key data persistence scheme is logging, which maintains a historical record of critical data/metadata changes available for system recovery after failures. Software logging is widely used not only because it is compatible with most log-based legacy code, but also due to several benefits that logless schemes [45, 11, 34, 33] do not support well. First, flexible recovery options: a log allows systems to either recover only the information recorded immediately prior to a failure or roll back to older, recorded states. Second, less fragmented memory space: systems typically allocate logs in one or a few contiguous address ranges, leading to less memory fragmentation.

Yet, traditional software-driven logging schemes can impose substantial performance and energy overhead on persistent memory. These schemes require the processor pipeline to execute logging-related instructions in addition to application operations. They can also impose substantial ordering constraints on cache and memory controller operations: they typically employ memory barriers and cache flushes (or forced write-backs) to ensure that log updates are committed to NVRAM before the corresponding updates to the original data; otherwise, system failures can leave the original data partially updated without log protection [46, 10, 49]. Furthermore, adapting traditional logging mechanisms designed for block-level storage systems to be memory-friendly can substantially alter system software or library designs [46, 10, 45, 42, 9, 43, 1, 33]. These changes also introduce new programming models and complex APIs [46, 10, 43, 1]. As a result, previous research focused on reducing the software overhead of logging by introducing expensive nonvolatile on-chip buffers or caches to store persistent data [49, 47, 21]. Since NVRAM technologies are difficult to integrate with CMOS (traditional processor technology) on the same die, such designs can be costly to fabricate, if not unfeasible. In addition, on-chip nonvolatile buffers/caches can be susceptible to persistent data overflow issues [49], which render the persistence optimization mechanisms unusable. We motivate our design by discussing the background of logging and transactions in persistent memory, the benefit of undo+redo logging, and the inefficiencies of traditional schemes.

2.1 Logging and Transactions

Logging is an important class of data persistence schemes used in persistent memory [46, 10, 20]. It maintains a record of changes to critical data in a reserved NVRAM region separate from the original data structures. The log can store undo (old data values) and redo (new data values) information. In the case where system failures leave the original data in NVRAM in a partially updated, inconsistent state, the data can be recovered to a consistent state in an old version (using the undo log) or a new version (using the redo log). Ensuring the persistence of continuous data updates is difficult, especially when manipulating complex data structures with fine-grained accesses. One way to ease this difficulty is to access persistent data through atomic, durable transactions. Similar to the traditional storage transaction concept, each persistent memory transaction makes a group of updates appear as one atomic unit when a system failure happens. Software can define the critical data that requires a persistence guarantee by including it in transactions. Transactions have been used as a powerful and convenient software interface in most databases and file systems [15, 41] and in many prior persistent memory designs [46, 10, 20, 49].

2.2 Persistent Memory Write-Order Control

Logging alone cannot secure persistent updates of transactions. CPU caches and memory controllers typically optimize system performance (e.g., to exploit data locality in cache and memory interface accesses) by reordering the writes issued by software. As a result, the order of software-issued log and original data updates in a transaction does not necessarily match the order in which they are written into NVRAM. In order to guarantee persistence, most logging-based persistent memory systems enforce write-order control in two ways. First, complete log updates in NVRAM before updating the original data. Otherwise, system failures can leave both the log and the original data partially updated and unrecoverable. Prior persistent memory systems typically employ memory barriers (e.g., mfence, sfence, and pcommit instructions [16]) between logging and original data updates to enforce this ordering. Second, flush original data updates into NVRAM. Undo-logging-based systems need to perform this cache flush before committing each transaction, in order to ensure that the data can be recovered to the new state of the committed transaction. Because logs typically have limited size, redo-logging-based systems need to flush the original data cache lines before recycling the associated log record, to avoid losing persistent data. Both memory barriers and cache flushes can degrade system performance by canceling out the performance optimization provided by the reordering in caches and memory controllers, as shown by our experiments (§6).
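As a concrete illustration of these two ordering requirements, the sketch below shows the kind of write-order control a software undo-logging store has to perform on an x86 machine with CLWB support. This is not the paper's code: the UndoEntry layout and the persistent_store helper are hypothetical, and only the clwb/sfence ordering pattern between the log write and the data write is the point.

    #include <immintrin.h>
    #include <cstdint>

    // Hypothetical undo-log entry: address and old value of one word.
    struct UndoEntry {
        uint64_t addr;
        uint64_t old_val;
    };

    // Update one persistent word with software undo logging.
    // The log entry must become durable in NVRAM *before* the data update,
    // hence the clwb + sfence pair between the two writes.
    void persistent_store(uint64_t* obj, uint64_t new_val, UndoEntry* log_slot) {
        log_slot->addr    = reinterpret_cast<uint64_t>(obj);
        log_slot->old_val = *obj;        // undo information (old value)
        _mm_clwb(log_slot);              // push the log entry toward NVRAM
        _mm_sfence();                    // order: log before original data
        *obj = new_val;                  // original data update
        _mm_clwb(obj);                   // needed before the transaction commits
        _mm_sfence();
    }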
2.3 Relaxing Ordering Constraints

Most previous work employs either undo [20] or redo [46, 10] logging in persistent memory, but not both, due to the high performance and implementation costs of maintaining the logging. Yet undo and redo logging provide different benefits. Undo logging allows updates to original data cache lines to steal their way into NVRAM before all log updates of a transaction complete: once the undo information of a piece of data (e.g., the old value of a cache line) is written into NVRAM, that piece of data is recoverable; the CPU can issue the corresponding original data updates without waiting for the completion of all log updates in the transaction, and the caches can naturally evict the new updates using traditional cache eviction policies.

Redo logging maintains the latest states of transactions, while relaxing the cache force-write-back constraints of persistent memory ("no-force"): with the redo log stored in NVRAM, dirty (updated) cache lines can remain in the caches for as long as traditional cache eviction policies determine; without cache line flushes, the transaction can still commit and be recovered after system failures using the redo log.

(a) A transaction with software-based uncacheable (UC) logging:
    Tx_begin(TxID)
    do some reads
    do some computation
    UC_log(addr(A), new_val(A), old_val(A))
        micro-ops: load A1, load A2, ...
                   store log_A1, store log_A2, ...
    Memory_barrier
    Write A
    clwb            // may be delayed
    Tx_commit

(b) The transaction code with our design:
    Tx_begin(TxID)
    do some reads
    do some computation
    Write A         // triggers hardware-based logging;
                    // the memory barrier gets a free ride
    Tx_commit

Figure 1: Example code that updates an object A in a transaction. (a) A transaction with software-based uncacheable (UC) logging that supports steal/no-force. (b) The transaction code with our design.

2.4 Undo+Redo with Software-based Logging

Undo+redo logging has been used in many traditional database systems stored on disk/flash [32, 15]. But it may appear too costly to adopt in memory systems, which provide much higher access bandwidth than disk/flash. Figure 1(a) shows example code for a transaction that updates a data object A protected by software-based logging. The memory barrier between the logging and the original data update (Write A) ensures that the persistent memory state is protected by completed logging before the original data can be updated. Undo logging consists of a set of load and store instructions that read the old values of the data and then write them out to the log in NVRAM; redo logging consists of the store instructions that write the new values out to the log. As shown in the figure, the logging operations not only increase memory read/write traffic, but also introduce extra load and store instructions in the CPU pipeline. Worse yet, these instructions can block subsequent instruction execution in the pipeline due to the memory barrier instruction between the logging and the original data updates. Due to such performance, energy, and implementation overhead, software-based logging is unfeasible for implementing undo+redo logging.

2.5 Software-based Logging and Multithreading

Concurrent workload execution can further complicate software-based logging in persistent memory. Even if persistent memory software issues clflush or clwb instructions in each transaction, a context switch by the OS can occur before a clwb instruction is executed. Such context switches interrupt the control flow of transactions/epochs and divert the program to other threads, which reintroduces the risk of prematurely overwriting log records in NVRAM. Implementing per-thread circular log buffers can mitigate this risk, but doing so introduces new persistent memory APIs and complicates recovery. Such risks and software implementation complexities further increase the cost of software-based logging.

3 Our Design

Our hardware-driven logging design also enables a progressive cache force-write-back scheme, in which we invoke force-write-backs on individual cache lines only when necessary.
Furthermore, by invoking force-write-backs in hardware, we can address the persistence risk in multithreading scenarios. This section describes our design principles; detailed implementation methods and the required software support are described in §4.

(Figure 2 assumes an update to a data object A, where the old value of A consists of word-sized values {A1, A2, ...}, the undo information, and the new value consists of {A1', A2', ...}, the redo information. Log entries are 64B plus a torn bit, organized as a circular buffer in NVRAM with head and tail pointers and fed through a volatile FIFO log buffer.)

Figure 2: Overview of the proposed hardware-driven logging in persistent memory. (a) Architecture overview (NVRAM components are shaded) and log record format. Hardware-driven logging (b) on a write hit in the L1 cache and (c) on a write miss in the L1 cache.

3.1 Architecture and System Design Overview

Figure 2(a) depicts an overview of our processor and memory architecture. The figure also shows the log structure that is stored in NVRAM. We assume that all storage components in the processor are volatile, that the caches adopt a write-back caching scheme, and that main memory consists of DRAM and NVRAM, both deployed on the processor-memory bus with separate memory controllers [46, 49].

Failure Model. The data in DRAM or caches is lost across system reboots, while data in NVRAM remains. Our design focuses on maintaining the persistence of user-defined critical data stored in NVRAM. After failures, the system can recover critical data structures by replaying the log in NVRAM. DRAM can be used to store data without persistence requirements, such as stacks and data transfer buffers [46, 49].

Persistent Memory Transactions. Similar to most prior persistent memory designs [46, 20], we employ persistent memory transactions (§2.1) as a software abstraction to enforce the persistence of critical data. The writes included in persistent memory transactions are the ones that require a persistence guarantee. Figure 1 illustrates simple code examples of a transaction implemented with traditional logging-based persistent memory transactions (Figure 1(a)) and with our design (Figure 1(b); the detailed software interface design is discussed in §4). The transaction defines object A as a piece of critical data that needs a persistence guarantee. Compared with traditional logging-based persistent memory transactions, our transactions eliminate logging functions, cache force-write-back instructions, and memory barrier instructions.

An Uncacheable Log in NVRAM. We employ a single-consumer/single-producer Lamport circular buffer [22] as the log. It can be allocated and truncated by system software (§4). Our hardware mechanisms perform the log appends. We adopt a circular buffer because it allows simultaneous appends and truncates without locking [22, 46]. As illustrated in Figure 2(a), each log record maintains the undo and redo information of a single data object (e.g., A). Each entry in a log record stores either a) a record header or b) word-sized undo/redo values.
A record header in the log indicates the starting point of a log record; it consists of a 128-bit transaction ID, the 64-bit address of the data object, and a special character (which can be reserved through our software interface) indicating that the entry is a header. We also adopt a torn bit per log entry to indicate the completion of each log update [46]. Torn bits have the same value for all writes in one pass over the log, and the value reverses when a log entry is overwritten in the next pass. Thus, completely written log entries have the same torn bit value, while incomplete entries have mixed values [46]. The log is shared by all threads of an application to simplify system recovery. The log needs to accommodate all the write requests of undo+redo logging, but the lifetime of this NVRAM region is not an issue, as we discuss in §7. The log is typically only used during system recovery, hence it is unlikely to be reused during application execution. In addition, forcing log updates from caches to NVRAM in store order would require cache flushes and memory barriers; therefore, we make the log uncacheable. This is in line with most prior works, which directly write log updates into the write-combining buffer (WCB) of commodity processors [46, 50], which coalesces multiple stores to the same cache line.

An Optional Volatile Log Buffer in the Processor. To improve the performance of log updates written to NVRAM, we provide an optional log buffer (a volatile FIFO, similar to the WCB) in the memory controller to buffer and coalesce log updates. Note that the log buffer is not required for ensuring data persistence; it is only a performance optimization. §6 evaluates system performance across various log buffer sizes.
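To make the log layout concrete, the sketch below writes it down as C++ structures. This is an illustration, not the hardware format: the field names, the pairing of one undo word with one redo word per value entry, and the software-visible CircularLog wrapper are assumptions; only the header contents, the torn bit, and the head/tail pointers come from the description above.

    #include <cstdint>

    // Illustrative layouts only; not the exact on-NVRAM encoding.
    struct LogRecordHeader {           // starts a log record (one per data object)
        uint64_t tx_id_lo, tx_id_hi;   // transaction ID (128 bits in the text)
        uint64_t data_addr;            // 64-bit address of the data object
        uint8_t  header_indicator;     // reserved special character
    };

    struct LogValueEntry {             // one word of the object's undo/redo data
        uint64_t undo_word;            // old value (for rollback)
        uint64_t redo_word;            // new value (for replay)
    };

    // Every entry, header or value, also carries a torn bit that flips on each
    // pass over the circular log, marking incompletely written entries [46].
    struct CircularLog {               // allocated/truncated by system software (§4)
        void*    entries;              // contiguous NVRAM region
        uint64_t capacity_bytes;
        uint64_t head;                 // mirrored in a 64-bit special register
        uint64_t tail;                 // mirrored in a 64-bit special register
    };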

Natural Ordering Guarantee on Data and Log Updates. Our design does not require write-order control mechanisms (e.g., cache flushes and memory barriers) to enforce the order between original data and the corresponding log updates. This order is naturally enforced by our architecture design. Without the log buffer, the uncacheable writing of the log guarantees that log updates arrive at NVRAM earlier than the corresponding original data updates (which need to traverse the cache hierarchy); no memory barrier is needed. With the log buffer, we need to carefully configure the log buffer size so that we can still enforce this guarantee without memory barriers. §4.3 discusses the bound on the log buffer size needed to guarantee data persistence.

3.2 Enabling Undo+Redo Logging with Caches

The goal of our logging mechanism is to enable feasible undo+redo logging of persistent data, which relaxes the ordering constraints on caching to a point that neither undo nor redo logging alone can achieve. To this end, our logging mechanism leverages information naturally available in the commodity cache hierarchy, without the performance overhead of unnecessary data movement or of executing logging, cache flush, and memory barrier instructions in the CPU pipeline. Commodity processors employ a write-back, write-allocate policy in the cache hierarchy [35]. On a write hit, the cache hierarchy only updates the cache line in the hitting level, with old values remaining in lower levels; a dirty bit in the cache tag indicates this inconsistency across cache levels. On a write miss, the write-allocate (also called fetch-on-write) policy makes the cache hierarchy first load the missing cache line from a lower level, followed by a write hit operation. In order to enable efficient steal/no-force support in persistent memory, we propose a hardware-driven logging mechanism that leverages this write-back, write-allocate scheme. In our design, a write in a transaction triggers hardware-based log appends that record both redo and undo information in NVRAM (as shown in Figure 1(b)). We acquire the redo information from the write operation itself, which is currently in flight. We acquire the undo information from the cache line where the current write will be placed. If the write request hits in the L1 cache, we read the old value before overwriting the cache line and use that for the undo logging. If the write request misses in the L1 cache, we acquire the undo information (the old value) from the cache line that will be write-allocated by the cache hierarchy. The log buffer, if adopted, handles log entries (transaction ID, data object address, and undo and redo cache line values) in a FIFO manner. The entries are written out to NVRAM using the head and tail pointer information, which is stored in special registers (§4).

3.3 Progressive Cache Force-Write-Back

It may appear that writes are persistent once their logs are written into NVRAM; indeed, our design allows us to commit a transaction once our hardware-driven logging of that transaction completes. Yet this alone does not secure data persistence, due to the circular-buffering nature of the log in NVRAM (§2.2).
Inserting force-write-back instructions (such as clflush and clwb) in transactions can impose substantial performance overhead (§2.2) and complicates data persistence support in multithreading scenarios (§2.5). In order to guarantee persistence in multithreaded applications while eliminating the need for force-write-back instructions, we design a hardware-based, progressive force-write-back mechanism that only forces certain original data blocks out of the caches when necessary. To this end, we introduce a force-write-back bit (fwb) into the cache tag of each cache line. With the fwb bit and the dirty bit already present in traditional cache tags, we maintain a finite state machine for each cache block (§4.4). The dirty bit is maintained by traditional caching schemes: a cache line update sets the bit and a cache eviction (write-back) resets it. The fwb bit is maintained by the cache controllers, which periodically scan all cache tags. Over every two scans, the cache controllers set the fwb bit of the dirty cache lines in one scan, and force-write-back all cache lines with {fwb, dirty} = {1,1} in the other scan. If the dirty bit of a cache line changes between the two scans, the cache line is not force-written-back later on. Our mechanism performs force-write-backs in a hardware-controlled manner, decoupled from software multithreading mechanisms. As such, it is robust to interruption by software context switches. The frequency of these force-write-backs can vary, but force-write-backs need to happen faster than the rate at which the log wraps around and overwrites its first entry. In fact, we can determine the force-write-back frequency (associated with the scanning frequency) based on the log size and the NVRAM write bandwidth, as we discuss in §4.4. Our evaluation shows that the frequency can be as low as once every three million cycles (§6).

3.4 A Free Ride for Transaction Commits

Most previous designs require software or hardware memory barriers (and/or cache flushes) at transaction commits to enforce the order of writing log updates (or persistent data) into NVRAM across consecutive transactions [49, 25]. Instead, our design gives transaction commits a free ride: no actual instructions need to be executed in the CPU pipeline. In the log, we identify the commit of a transaction by the presence of the log record header of its immediately subsequent transaction. Our mechanisms also naturally enforce the order of intra- and inter-transaction log updates: we issue log updates in the order of the writes to the corresponding original data, and we write the log updates into NVRAM in the order in which they are issued (the log buffer is a FIFO, so it does not affect this order). As such, log updates of subsequent transactions can only be written into NVRAM after the current log updates are written.

3.5 Putting It All Together

Figure 2(b) and (c) illustrate a high-level summary of how our hardware-driven logging works. Hardware treats all writes encompassed in transactions, e.g., Write A in the transaction delimited by tx_begin and tx_commit in Figure 1(b), as persistent writes. Those writes automatically trigger the logging mechanisms described in the sections above. Note that the log updates are sent directly to the WCB or NVRAM if the system does not adopt the log buffer. The processor core sends the writes of a data object A (a variable or other data structure), which consist of the new values of one or multiple cache lines {A1', A2', ...}, to the L1 cache. Upon updating an L1 cache line (e.g., from an old value A1 to a new value A1'), the mechanisms work together as follows:

1. Update the redo information of the cache line (➊). (a) If the update is the first cache line update of the data object A, the CPU core, which has the transaction ID and the address of A, writes a log record header into the log buffer. (b) Otherwise, the L1 cache controller writes the new value of the cache line (e.g., A1') into the log buffer.

2. Obtain the undo information of the cache line (➋). This step can be performed in parallel with Step 1. (a) If the cache line write request hits in the L1 cache (Figure 2(b)), the L1 cache controller extracts the old value (e.g., A1) directly from the cache line before writing it. Rather than using an additional read instruction, the L1 cache controller reads the old value from the hitting line through the L1 cache read port and writes it into the log buffer in Step 3. Note that the read port is typically not used when servicing write requests, so it would otherwise be idle. (b) If the write request misses in the L1 cache (Figure 2(c)), the cache hierarchy naturally performs a write-allocate on that cache line. The cache controller of the lower-level cache that owns that cache line extracts the old value (e.g., A1) and sends it to the log buffer in Step 3.

3. Update the undo information of the cache line: the cache controller writes the old value of the cache line (e.g., A1) to the log buffer (➌).

4. The L1 cache controller can now update the cache line in the L1 cache (➍). The cache line can later be evicted by the natural cache eviction policy without being subject to data persistence constraints. In addition, our log buffer is small enough to guarantee that log updates traverse the log buffer faster than a cache line traverses the cache hierarchy (§4.3). Therefore, this step can be performed without waiting for the corresponding log entries to arrive at NVRAM.

5. The memory controller evicts the log buffer entries to NVRAM in a first-in-first-out manner (➍). This step is independent of the other steps.

6. Repeat Steps 1(b) through 5 if the data object A consists of multiple cache line writes. The log buffer also coalesces the log updates of any writes to the same cache line.

7. After the log updates of all the writes in the transaction have been issued, the transaction can commit (➎).

8. Original data object updates can remain in the caches until they are written back to NVRAM by a natural eviction or by our progressive cache force-write-backs.
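The sketch below condenses this per-write flow into plain C++ so the data movement is easier to follow. It is a software model, not RTL: LogBuffer stands in for the volatile FIFO in the memory controller, and the function arguments (in particular how the old word reaches the L1 controller on a hit versus a miss) are simplifications of the behavior described in Steps 1 through 4.

    #include <cstdint>
    #include <deque>

    struct LogBuffer {
        std::deque<uint64_t> fifo;             // head/tail come from special registers
        void push(uint64_t w) { fifo.push_back(w); }
    };

    // Invoked by the L1 cache controller for each store inside a transaction.
    // old_word is the pre-write value: on an L1 hit it is read from the hitting
    // line via the otherwise-idle read port (Step 2a); on a miss it arrives with
    // the write-allocated line from a lower-level cache (Step 2b).
    void log_persistent_write(LogBuffer& log, uint64_t tx_id, uint64_t obj_addr,
                              bool first_write_of_object,
                              uint64_t new_word, uint64_t old_word,
                              uint64_t& cached_word) {
        if (first_write_of_object) {           // Step 1a: record header
            log.push(tx_id);
            log.push(obj_addr);
        }
        log.push(new_word);                    // Step 1b: redo info (in-flight value)
        log.push(old_word);                    // Step 3:  undo info (old value)
        cached_word = new_word;                // Step 4:  update the L1 line itself
        // Step 5: the memory controller drains the FIFO to the NVRAM log
        // independently; commit (Step 7) needs no extra instructions.
    }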

void persistent_update( int threadid ) {
    tx_begin( threadid );
    // Persistent data updates
    write A[threadid];
    tx_commit();
}
// ...
int main() {
    // Executes one persistent
    // transaction per thread
    for ( int i = 0; i < nthreads; i++ )
        thread t( persistent_update, i );
}

Figure 3: Pseudocode showing an example use of the tx_begin and tx_commit functions, in which the thread ID is assigned as the transaction ID to perform one persistent transaction per thread.

4 Implementation

In this section, we describe the implementation details of our design and its hardware overhead (the impact of the hardware overhead, NVRAM space consumption, and lifetime/endurance are discussed in §7).

4.1 Software Support

Our design employs software support for defining persistent memory transactions, allocating and truncating the circular log in NVRAM, and reserving a special character as the log header indicator.

Transaction Interface. We employ a pair of transaction functions, tx_begin( txid ) and tx_commit(), to define transactions that encompass the writes in the program that need persistence support. The txid argument provides the transaction ID information used by our hardware-based logging scheme; this ID is used to group writes from the same transaction together. Such a transaction interface has been used by many previous persistent memory designs [49, 10]. Figure 3 shows a multithreaded example in pseudocode with these transaction functions in use.

System Library Functions that Maintain the Log. Our hardware mechanisms perform the log updates, whereas the log structure itself is maintained by system software. In particular, we employ a pair of system library functions, log_create() and log_truncate() (similar to the ones used in prior work [46]), to allocate and truncate the log, respectively. The memory controller obtains the log maintenance information by reading special registers (§4.2) that hold the head and tail pointers of the log. Furthermore, a single transaction that exceeds the originally allocated log size could corrupt persistent data. We provide two options to prevent such overflows: 1) the log_create() function allocates a large-enough log by taking a hint of the maximum transaction size from the program interface (e.g., #define MAX_TX_SIZE N); 2) an additional library function, log_grow(), allocates additional log regions when the log is filled by an uncommitted transaction.

4.2 Special Registers

The txid argument provided through the tx_begin() function is translated into an 8-bit unsigned integer, a physical transaction ID, that is stored in a special register in the processor. Because the transaction IDs are only used to group the writes of the same transaction, we can simply pick a not-in-use physical transaction ID to represent a newly received txid. An 8-bit ID can accommodate 256 unique active persistent memory transactions at a time, and a physical transaction ID can be reused after its transaction commits. We also employ two 64-bit special registers to store the head and tail pointers of the log. The system library initializes the pointer values when allocating the log with log_create(). During log updates, the pointers can be updated by the memory controller and by the log_truncate() function. If log_grow() is used, we employ additional registers to store the head and tail pointers of the newly allocated log regions and an indicator of the active log region.
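As a concrete illustration of option 1 above, the fragment below shows how a program might size the log from a maximum-transaction-size hint. The function prototypes and the hint value are hypothetical; the text names log_create(), log_truncate(), and log_grow() but does not give their signatures.

    #include <cstddef>

    // Hypothetical prototypes; the text names these functions but not their signatures.
    void* log_create(size_t log_bytes);   // allocate the circular log in NVRAM
    void  log_truncate(void* log);        // reclaim committed log entries
    void  log_grow(void* log);            // extend the log for an oversized transaction

    #define MAX_TX_SIZE (4u << 20)        // program-supplied hint; value is illustrative

    void init_persistent_log() {
        // Option 1: size the log from the maximum-transaction hint so that a single
        // uncommitted transaction cannot wrap around and overwrite its own entries.
        void* log = log_create(2 * (size_t)MAX_TX_SIZE);
        (void)log;   // head/tail pointers are placed in the special registers of §4.2
    }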

Figure 4: The finite-state machine in the cache controller for implementing the progressive cache write-back. Each cache line moves between the IDLE ({fwb, dirty} = {0,0}), FLAG ({0,1}), and FWB ({1,1}) states on cache line writes, write-backs, force-write-backs, and tag scans.

4.3 Determining the Log Buffer Size

We employ a log buffer in the memory controller to improve the write performance of log updates in NVRAM. Data persistence requires that a log record arrive at NVRAM before the corresponding cache line (original data) update. To ensure this ordering, we bound the log buffer size such that the time for an undo+redo log record to traverse the log buffer is shorter than the minimum time for a cache line to traverse the cache hierarchy, i.e., the time for an L1 cache line to be evicted directly all the way out of the last-level cache. As such, the maximum allowable log buffer size is 15 cache-line-sized entries based on our processor configuration (Table 2); each log entry also employs two additional bits, one indicating whether the entry stores a record header and the other being a torn bit [46]. Further increasing the number of log buffer entries can improve the performance of log updates until we hit the write bandwidth provided by NVRAM, as we demonstrate in §6, but risks losing data persistence.

4.4 Cache Modifications

To implement our progressive cache write-back scheme, we add one fwb bit to the tag of each cache line, alongside the dirty bit that exists in conventional cache implementations. We also maintain three states, IDLE, FLAG, and FWB, for each cache block using these tag bits.

Cache Block State Transition. Figure 4 shows the finite-state machine that we implement in the cache controller of each level. When an application starts to execute, the cache controllers initialize (reset) each cache line to the IDLE state by setting the fwb bit to 0 (conventional cache implementations also initialize the dirty and valid bits to 0). During application execution, existing cache controller implementations set the dirty and valid bits to 1 whenever a cache line is written, and set the dirty bit back to 0 after the cache line is written back to the lower level. To implement our state machine, the cache controllers also periodically scan the tag of each cache line and act as follows:

- A cache line with {fwb, dirty} = {0,0} is in the IDLE state; the cache controller does nothing to such lines.
- A cache line with {fwb, dirty} = {0,1} is in the FLAG state; the cache controller sets fwb to 1. This indicates that the cache line needs to be force-written-back during the next scanning iteration, in case it still remains in the cache.
- A cache line with {fwb, dirty} = {1,1} is in the FWB state; the cache controller force-writes-back this line. After the force-write-back, the cache controller changes the line back to the IDLE state by resetting {fwb, dirty} = {0,0}.

If a cache line is evicted out of the cache at any point, the cache controller resets that cache line to IDLE.
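A behavioral sketch of this state machine in C++ follows. It models one periodic tag scan over a set of {fwb, dirty} tag bits; the container and the write-back hook are illustrative stand-ins for the cache controller logic, not a hardware description.

    #include <vector>

    struct TagBits {
        bool fwb   = false;   // force-write-back bit added by our design
        bool dirty = false;   // maintained by the conventional cache
    };

    void force_write_back_line(TagBits& line);   // stand-in: issue the write-back

    // One periodic tag scan by a cache controller (Figure 4).
    void scan_tags(std::vector<TagBits>& lines) {
        for (TagBits& l : lines) {
            if (l.fwb && l.dirty) {          // FWB state: second scan, still dirty
                force_write_back_line(l);
                l.fwb = false; l.dirty = false;   // back to IDLE
            } else if (!l.fwb && l.dirty) {  // FLAG state: first scan sees it dirty
                l.fwb = true;                     // write back on the next scan if needed
            } else {                         // IDLE, or written back since it was flagged
                l.fwb = false;
            }
        }
    }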
Determining the Cache Force-Write-Back Frequency. The tag scanning frequency determines the frequency of our cache force-write-back operations. The force-write-backs (i.e., the scans) need to be performed frequently enough to ensure that the original data is written back to NVRAM before the corresponding log records are overwritten by newer updates. The more frequently write requests are serviced, the more frequently the log is overwritten; and the larger the log, the less frequently it is overwritten. As such, the scanning frequency can be determined by the maximum log update frequency (bounded by the NVRAM write bandwidth, because applications cannot write to NVRAM faster than its bandwidth) and the log size (see the sensitivity study in §6).
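The back-of-the-envelope calculation below captures this relationship. It is our reading of the constraint rather than the paper's formula: it assumes the log can wrap no faster than (log size) / (NVRAM write bandwidth) and that a flagged line needs two scans before it is force-written-back; the example bandwidth figure is assumed.

    // Upper bound on the tag-scan period that still keeps original data ahead
    // of log wrap-around (sketch, see assumptions above).
    double max_scan_period_seconds(double log_size_bytes,
                                   double nvram_write_bw_bytes_per_sec) {
        double min_wrap_time = log_size_bytes / nvram_write_bw_bytes_per_sec;
        return min_wrap_time / 2.0;    // two scans per force-write-back
    }
    // Example: a 4 MB log and a write bandwidth of a few GB/s (assumed) give a
    // period on the order of a millisecond, i.e., millions of cycles at 2.5 GHz,
    // consistent with the roughly three-million-cycle interval reported in §6.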

4.5 Summary of Hardware Overhead

Table 1 summarizes the hardware overhead of our mechanisms in the processor. Note that these values may vary depending on the native processor and ISA. Our implementation assumes a 64-bit machine, which is why the circular log head and tail pointers are 8 bytes each; only half of these bytes are required on a 32-bit machine. The size of the log buffer varies with the size of the cache line. The size of the overhead needed for the fwb state varies with the total number of cache lines across all levels of cache; our value in the table was computed based on the specifications of the system caches described in §5. Also note that these are the major logic components that make up on-chip storage. Additional gates for logic operations are also necessary to implement the design mechanisms described in the previous sections. However, the simplicity of the design means the gates used are primarily small and medium-sized gates, on the same complexity level as a multiplexer or decoder. Getting the exact impact and count of these gates would require further, lower-level study of this design; we discuss this further in §7.

Mechanism                     Logic Type    Size
Transaction ID register       flip-flops    1 Byte
Log head pointer register     flip-flops    8 Bytes
Log tail pointer register     flip-flops    8 Bytes
Log buffer (optional)         SRAM          964 Bytes
Fwb tag bit                   SRAM          768 Bytes

Table 1: Summary of major hardware overhead.

4.6 Recovery

We outline the steps for recovering the critical in-memory data in systems that adopt our design.

Step 1: Following a power failure, the first step is to obtain the head and tail pointers of the log in NVRAM. These pointers are part of the log structure, and they allow the system to correctly order the log entries. We employ only one circular buffer as the log of all transactions of all threads, so a single reference suffices to look up these pointers.

Step 2: The system recovery handler fetches log entries from NVRAM and uses the address, old value, and new value fields to generate writes to NVRAM at the specified addresses. We identify which writes did not commit by going backwards from the tail pointer; each log entry that does not match its value in NVRAM is considered a non-committed entry. The address stored with each entry corresponds to the address of the data member in the original persistent data structure. Aside from the head and tail pointers, we use the torn bit to correctly order these writes [46].

Step 3: The generated writes go directly to NVRAM and do not use the cache hierarchy. Because the caches are volatile, their state is reset after the failure, and all the writes generated during recovery are persistent, so they can bypass the caches without issue.

Step 4: With each persistent write generated from the circular log, the head and tail pointers are updated accordingly. After all the updates from the log have been redone (or undone), the head and tail pointers of the log point to the same location, and the entries are invalidated.
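A simplified software rendering of this recovery walk is sketched below. It is not the paper's recovery handler: the entry layout reuses the illustrative structures from earlier sketches, torn-bit ordering is omitted, and the rule for detecting non-committed entries is a paraphrase of Step 2.

    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    // NvramImage is a stand-in for direct, cache-bypassing NVRAM access.
    struct LogEntry { bool is_header; uint64_t addr, old_val, new_val; };
    using NvramImage = std::unordered_map<uint64_t, uint64_t>;   // addr -> word

    void recover(const std::vector<LogEntry>& log, size_t head, size_t tail,
                 NvramImage& nvram) {
        // Step 1: head and tail were read from the log structure in NVRAM.
        // Steps 2/3: walk backwards from the tail; an entry whose new value is
        // not yet in NVRAM is treated as non-committed and rolled back with its
        // old value, using writes that bypass the (volatile, now-empty) caches.
        size_t i = tail;
        while (i != head) {
            i = (i == 0) ? log.size() - 1 : i - 1;
            const LogEntry& e = log[i];
            if (e.is_header) continue;            // headers carry no data words
            if (nvram[e.addr] != e.new_val)
                nvram[e.addr] = e.old_val;
        }
        // Step 4: reset head == tail and invalidate the replayed entries (omitted).
    }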
5 Experimental Setup

We evaluate our design by implementing our mechanisms in McSimA+ [2], a Pin-based [28] cycle-level multi-core simulator. The simulator is configured to model a multi-core out-of-order processor and an NVRAM DIMM as described in Table 2. We adopt phase-change memory parameters for the NVRAM DIMM [23]. Because all of the performance numbers shown in §6 are relative, the same observations should hold across various NVRAM latencies and access energies. Our work focuses on improving persistent memory access, so we do not evaluate DRAM accesses in our experiments. We evaluate both microbenchmarks and real workloads. The microbenchmarks repeatedly update persistent memory storing various data structures, including a hash table, a red-black tree, an array, a B+tree, and a graph. These are data structures widely used in applications such as databases and file systems [10]. Table 3 describes the details of these benchmarks. Our experiments use several versions of each benchmark and vary the data type between integers and strings. Data structures with integer elements pack less data (smaller than a cache line) per element, whereas those with strings span multiple cache lines per element and allow us to explore the more complex scenarios found in real-world applications. We compile these benchmarks to native x86 and run them on the McSimA+ simulator. We simulate both single-threaded and multithreaded executions of each benchmark. In addition, we evaluate a persistent version of memcached [29], a real-world in-memory workload studied in prior work [10].
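For concreteness, the loop below sketches what an SPS-style microbenchmark transaction looks like when written against the tx_begin/tx_commit interface of §4.1. The element type, random-index selection, and swap count are illustrative; they are not taken from the benchmark sources listed in Table 3.

    #include <cstdint>
    #include <cstdlib>

    extern void tx_begin(int txid);   // provided by the proposed interface (§4.1)
    extern void tx_commit();

    void sps_like(uint64_t* vec, size_t n, size_t swaps, int threadid) {
        for (size_t s = 0; s < swaps; ++s) {
            size_t i = std::rand() % n, j = std::rand() % n;
            tx_begin(threadid);       // the writes below become persistent writes
            uint64_t tmp = vec[i];
            vec[i] = vec[j];          // hardware logging is triggered automatically
            vec[j] = tmp;
            tx_commit();              // commits get a "free ride" (§3.4)
        }
    }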

Processor            Similar to Intel Core i7 / 22 nm
Cores                4 cores, 2.5GHz, 2 threads/core
IL1 Cache            32KB, 8-way set-associative, 64B cache lines, 1.6ns latency
DL1 Cache            32KB, 8-way set-associative, 64B cache lines, 1.6ns latency
L2 Cache             8MB, 16-way set-associative, 64B cache lines, 4.4ns latency
Memory Controller    64-/64-entry read/write queues
NVRAM DIMM           8GB, 8 banks, 2KB row, 36ns row-buffer hit, 100/300ns read/write row-buffer conflict [23]
Power and Energy     Processor: 149W (peak); NVRAM: row buffer read (write): 0.93 (1.02) pJ/bit, array read (write): 2.47 (16.82) pJ/bit [23]

Table 2: Processor and memory configurations.

Name          Memory Footprint   Description
Hash [10]     256 MB             Searches for a value in an open-chain hash table. Insert if absent, remove if found.
RBTree [49]   256 MB             Searches for a value in a red-black tree. Insert if absent, remove if found.
SPS [49]      1 GB               Random swaps between entries in a 1 GB vector of values.
BTree [5]     256 MB             Searches for a value in a B+ tree. Insert if absent, remove if found.
SSCA2 [3]     16 MB              A transactional implementation of SSCA 2.2, performing several analyses of a large, scale-free graph.

Table 3: A list of evaluated benchmarks.

6 Results

To demonstrate the benefits of our design, we study instructions per cycle (IPC), operational throughput, NVRAM traffic reduction, and dynamic energy. The operational throughput measures the number of executed application-level operations (e.g., node inserts, deletes, swaps, etc.) per second.

6.1 Microbenchmark Results

In our microbenchmarks, the number of operations is proportional to the data structure size, listed as the memory footprint in Table 3. We observe that processor dynamic energy does not change significantly across workloads, so we only show our memory dynamic energy results. All results are speedups (higher is better) normalized to an unsafe baseline (dashed line), which performs logging but does not employ cache flushing or write-backs; this unsafe baseline does not attempt to guarantee data persistence. The threshold line in the figures shows the best case of the unsafe baseline (i.e., the best value achieved by employing redo or undo logging). Each cluster of bars (from left to right along the x-axis) shows the other three baselines (non-pers, which employs NVRAM as a working memory without any data persistence guarantee or logging; redo-clwb-base and undo-clwb-base, which issue clwb instructions to ensure persistence on top of the unsafe baseline), our logging mechanism with steal/no-force support (wal), and our progressive cache write-back on top of our logging mechanism (fwb), which is the full implementation of all our proposed mechanisms. Note that the non-pers bar may go beyond the displayed axis.

Figure 5: Operational throughput speedup of microbenchmarks (normalized to the unsafe baseline, dashed line).

Figure 6: IPC speedup of microbenchmarks (normalized to the unsafe baseline, dashed line).

Figure 7: Dynamic energy reduction across all microbenchmarks (normalized to the unsafe baseline, dashed line).

This configuration can yield an ideal yet unachievable performance target for persistent memory systems [49]. We execute our microbenchmarks with one, two, and four threads on one, two, and four cores, respectively; the corresponding sets of bars are labeled by appending -1c, -2c, and -4c to each benchmark name. We make the following major observations from our microbenchmark results.

Overall Performance and Energy Improvement. In general, wal alone already significantly improves IPC, dynamic energy consumption, system throughput, and write memory traffic. With all our proposed mechanisms, fwb extends the improvements in all measured metrics further. Our design improves IPC by up to 1.73x, with an average of 1.49x, compared with the unsafe baseline. Operational throughput is improved by up to 2.12x, with an average of 1.51x, over the unsafe baseline. Our design also significantly reduces NVRAM traffic, by an average factor of 1.79x against the unsafe baseline. Due to the memory traffic reduction, fwb also reduces dynamic memory energy consumption; the reduction can be as high as 2.38x over the baselines with clwb.

Figure 8: Memory write traffic reduction across all microbenchmarks (normalized to the unsafe baseline, dashed line).

Figure 9: Sensitivity studies of (a) system throughput with various log buffer sizes (showing the persistence constraint and the NVRAM write bandwidth limitation) and (b) cache force-write-back frequency with various NVRAM log sizes.

Figure 10: Single- and multithreaded memcached results on performance, energy, throughput, and memory write traffic (for one, two, and four threads).

The net decrease in memory accesses, shown in Figure 8, leads to a decrease in the average memory access latency (AMAL), which directly improves IPC and system throughput.

Performance Sensitivity to the Number of Cores. Our results map the change across the microbenchmarks as we increase the number of processor cores (and threads). They show that the IPC and throughput of wal appear insensitive to the number of cores. In all multi-core, multithreaded experimental cases, our design always yields significant improvement over all baselines.

Performance Sensitivity to Log Buffer Size. As discussed in §4.3, the log buffer size is bounded by the data persistence requirement: log updates need to arrive at NVRAM before the corresponding original data updates. This bound is 15 entries for our processor configuration. Larger log buffers do improve throughput, as we studied using the Hash benchmark (Figure 9(a)): an 8-entry log buffer improves system throughput by 10%, and our implementation with a 15-entry log buffer improves throughput by 18%. Further increasing the log buffer size, although it no longer guarantees data persistence, continues to improve system throughput until hitting the NVRAM write bandwidth limitation (64 entries for our NVRAM configuration). Note that the system throughput results with 128 and 256 entries are generated assuming infinite NVRAM write bandwidth.

The Relationship Between Force-Write-Back Frequency and Log Size. As discussed in §4.4, the force-write-back frequency is determined by the NVRAM write bandwidth and the log size. For a given NVRAM write bandwidth, we studied the relationship between the required force-write-back frequency and the log size. As shown in Figure 9(b), we only need to perform force-write-backs every three million cycles once we have a 4MB log.

The Impact of Application Complexity. Note, however, that results for the SSCA2 and BTree benchmarks do not follow the pattern of clear improvement over the baselines; rather, their metrics fall within 5-10% of the baselines. We attribute this to the fact that SSCA2 and BTree employ data structures whose complexities far outweigh the


More information

High Performance Transactions in Deuteronomy

High Performance Transactions in Deuteronomy High Performance Transactions in Deuteronomy Justin Levandoski, David Lomet, Sudipta Sengupta, Ryan Stutsman, and Rui Wang Microsoft Research Overview Deuteronomy: componentized DB stack Separates transaction,

More information

Michael Adler 2017/05

Michael Adler 2017/05 Michael Adler 2017/05 FPGA Design Philosophy Standard platform code should: Provide base services and semantics needed by all applications Consume as little FPGA area as possible FPGAs overcome a huge

More information

Chapter 8. Virtual Memory

Chapter 8. Virtual Memory Operating System Chapter 8. Virtual Memory Lynn Choi School of Electrical Engineering Motivated by Memory Hierarchy Principles of Locality Speed vs. size vs. cost tradeoff Locality principle Spatial Locality:

More information

Chapter 8: Virtual Memory. Operating System Concepts

Chapter 8: Virtual Memory. Operating System Concepts Chapter 8: Virtual Memory Silberschatz, Galvin and Gagne 2009 Chapter 8: Virtual Memory Background Demand Paging Copy-on-Write Page Replacement Allocation of Frames Thrashing Memory-Mapped Files Allocating

More information

The Google File System

The Google File System October 13, 2010 Based on: S. Ghemawat, H. Gobioff, and S.-T. Leung: The Google file system, in Proceedings ACM SOSP 2003, Lake George, NY, USA, October 2003. 1 Assumptions Interface Architecture Single

More information

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance 6.823, L11--1 Cache Performance and Memory Management: From Absolute Addresses to Demand Paging Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Cache Performance 6.823,

More information

Journaling. CS 161: Lecture 14 4/4/17

Journaling. CS 161: Lecture 14 4/4/17 Journaling CS 161: Lecture 14 4/4/17 In The Last Episode... FFS uses fsck to ensure that the file system is usable after a crash fsck makes a series of passes through the file system to ensure that metadata

More information

The Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2)

The Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2) The Memory Hierarchy Cache, Main Memory, and Virtual Memory (Part 2) Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Cache Line Replacement The cache

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading Review on ILP TDT 4260 Chap 5 TLP & Hierarchy What is ILP? Let the compiler find the ILP Advantages? Disadvantages? Let the HW find the ILP Advantages? Disadvantages? Contents Multi-threading Chap 3.5

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address

More information

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) ) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Goal A Distributed Transaction We want a transaction that involves multiple nodes Review of transactions and their properties

More information

Using Memory-Style Storage to Support Fault Tolerance in Data Centers

Using Memory-Style Storage to Support Fault Tolerance in Data Centers Using Memory-Style Storage to Support Fault Tolerance in Data Centers Xiao Liu Qing Yi Jishen Zhao University of California at Santa Cruz, University of Colorado Colorado Springs {xiszishu,jishen.zhao}@ucsc.edu,

More information

Recovery Techniques. The System Failure Problem. Recovery Techniques and Assumptions. Failure Types

Recovery Techniques. The System Failure Problem. Recovery Techniques and Assumptions. Failure Types The System Failure Problem Recovery Techniques It is possible that the stable database: (because of buffering of blocks in main memory) contains values written by uncommitted transactions. does not contain

More information

Memory Technology. Chapter 5. Principle of Locality. Chapter 5 Large and Fast: Exploiting Memory Hierarchy 1

Memory Technology. Chapter 5. Principle of Locality. Chapter 5 Large and Fast: Exploiting Memory Hierarchy 1 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 5 Large and Fast: Exploiting Memory Hierarchy 5 th Edition Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic

More information

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3.

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3. 5 Solutions Chapter 5 Solutions S-3 5.1 5.1.1 4 5.1.2 I, J 5.1.3 A[I][J] 5.1.4 3596 8 800/4 2 8 8/4 8000/4 5.1.5 I, J 5.1.6 A(J, I) 5.2 5.2.1 Word Address Binary Address Tag Index Hit/Miss 5.2.2 3 0000

More information

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class. Today s Class. Faloutsos/Pavlo CMU /615

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class. Today s Class. Faloutsos/Pavlo CMU /615 Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos A. Pavlo Lecture#23: Crash Recovery Part 1 (R&G ch. 18) Last Class Basic Timestamp Ordering Optimistic Concurrency

More information

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed McGraw-Hill by

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed McGraw-Hill by Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan. Original slides are available

More information

CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS Assist. Prof. Dr. Volkan TUNALI PART 1 2 RECOVERY Topics 3 Introduction Transactions Transaction Log System Recovery Media Recovery Introduction

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address

More information

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) ) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Goal A Distributed Transaction We want a transaction that involves multiple nodes Review of transactions and their properties

More information

CS Computer Architecture

CS Computer Architecture CS 35101 Computer Architecture Section 600 Dr. Angela Guercio Fall 2010 An Example Implementation In principle, we could describe the control store in binary, 36 bits per word. We will use a simple symbolic

More information

Lecture 18: Reliable Storage

Lecture 18: Reliable Storage CS 422/522 Design & Implementation of Operating Systems Lecture 18: Reliable Storage Zhong Shao Dept. of Computer Science Yale University Acknowledgement: some slides are taken from previous versions of

More information

Introduction. Storage Failure Recovery Logging Undo Logging Redo Logging ARIES

Introduction. Storage Failure Recovery Logging Undo Logging Redo Logging ARIES Introduction Storage Failure Recovery Logging Undo Logging Redo Logging ARIES Volatile storage Main memory Cache memory Nonvolatile storage Stable storage Online (e.g. hard disk, solid state disk) Transaction

More information

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability Topics COS 318: Operating Systems File Performance and Reliability File buffer cache Disk failure and recovery tools Consistent updates Transactions and logging 2 File Buffer Cache for Performance What

More information

DHTM: Durable Hardware Transactional Memory

DHTM: Durable Hardware Transactional Memory DHTM: Durable Hardware Transactional Memory Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas ISCA 2018 is here!2 is here!2 Systems LLC!3 Systems - Non-volatility over the memory bus - Load/Store

More information

Views of Memory. Real machines have limited amounts of memory. Programmer doesn t want to be bothered. 640KB? A few GB? (This laptop = 2GB)

Views of Memory. Real machines have limited amounts of memory. Programmer doesn t want to be bothered. 640KB? A few GB? (This laptop = 2GB) CS6290 Memory Views of Memory Real machines have limited amounts of memory 640KB? A few GB? (This laptop = 2GB) Programmer doesn t want to be bothered Do you think, oh, this computer only has 128MB so

More information

Chapter 8 Virtual Memory

Chapter 8 Virtual Memory Operating Systems: Internals and Design Principles Chapter 8 Virtual Memory Seventh Edition William Stallings Operating Systems: Internals and Design Principles You re gonna need a bigger boat. Steven

More information

Virtual Memory - Overview. Programmers View. Virtual Physical. Virtual Physical. Program has its own virtual memory space.

Virtual Memory - Overview. Programmers View. Virtual Physical. Virtual Physical. Program has its own virtual memory space. Virtual Memory - Overview Programmers View Process runs in virtual (logical) space may be larger than physical. Paging can implement virtual. Which pages to have in? How much to allow each process? Program

More information

EXAM 1 SOLUTIONS. Midterm Exam. ECE 741 Advanced Computer Architecture, Spring Instructor: Onur Mutlu

EXAM 1 SOLUTIONS. Midterm Exam. ECE 741 Advanced Computer Architecture, Spring Instructor: Onur Mutlu Midterm Exam ECE 741 Advanced Computer Architecture, Spring 2009 Instructor: Onur Mutlu TAs: Michael Papamichael, Theodoros Strigkos, Evangelos Vlachos February 25, 2009 EXAM 1 SOLUTIONS Problem Points

More information

Operating System Concepts

Operating System Concepts Chapter 9: Virtual-Memory Management 9.1 Silberschatz, Galvin and Gagne 2005 Chapter 9: Virtual Memory Background Demand Paging Copy-on-Write Page Replacement Allocation of Frames Thrashing Memory-Mapped

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Caching and reliability

Caching and reliability Caching and reliability Block cache Vs. Latency ~10 ns 1~ ms Access unit Byte (word) Sector Capacity Gigabytes Terabytes Price Expensive Cheap Caching disk contents in RAM Hit ratio h : probability of

More information

High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc.

High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc. High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc. Table of Contents Section I: The Need for Warm Standby...2 The Business Problem...2 Section II:

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system

More information

CS6401- Operating System UNIT-III STORAGE MANAGEMENT

CS6401- Operating System UNIT-III STORAGE MANAGEMENT UNIT-III STORAGE MANAGEMENT Memory Management: Background In general, to rum a program, it must be brought into memory. Input queue collection of processes on the disk that are waiting to be brought into

More information

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic

More information

Memory Management Outline. Operating Systems. Motivation. Paging Implementation. Accessing Invalid Pages. Performance of Demand Paging

Memory Management Outline. Operating Systems. Motivation. Paging Implementation. Accessing Invalid Pages. Performance of Demand Paging Memory Management Outline Operating Systems Processes (done) Memory Management Basic (done) Paging (done) Virtual memory Virtual Memory (Chapter.) Motivation Logical address space larger than physical

More information

Operating Systems. File Systems. Thomas Ropars.

Operating Systems. File Systems. Thomas Ropars. 1 Operating Systems File Systems Thomas Ropars thomas.ropars@univ-grenoble-alpes.fr 2017 2 References The content of these lectures is inspired by: The lecture notes of Prof. David Mazières. Operating

More information

Chapter 8 & Chapter 9 Main Memory & Virtual Memory

Chapter 8 & Chapter 9 Main Memory & Virtual Memory Chapter 8 & Chapter 9 Main Memory & Virtual Memory 1. Various ways of organizing memory hardware. 2. Memory-management techniques: 1. Paging 2. Segmentation. Introduction Memory consists of a large array

More information

Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Highly-Associative Caches

Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Highly-Associative Caches Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging 6.823, L8--1 Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Highly-Associative

More information

Memory management. Last modified: Adaptation of Silberschatz, Galvin, Gagne slides for the textbook Applied Operating Systems Concepts

Memory management. Last modified: Adaptation of Silberschatz, Galvin, Gagne slides for the textbook Applied Operating Systems Concepts Memory management Last modified: 26.04.2016 1 Contents Background Logical and physical address spaces; address binding Overlaying, swapping Contiguous Memory Allocation Segmentation Paging Structure of

More information

RISC-V Support for Persistent Memory Systems

RISC-V Support for Persistent Memory Systems RISC-V Support for Persistent Memory Systems Matheus Ogleari, Storage Architecture Engineer June 26, 2018 7/6/2018 Persistent Memory State-of-the-art, hybrid memory + storage properties Supported by hardware

More information

Advanced Computer Architecture

Advanced Computer Architecture Advanced Computer Architecture 1 L E C T U R E 4: D A T A S T R E A M S I N S T R U C T I O N E X E C U T I O N I N S T R U C T I O N C O M P L E T I O N & R E T I R E M E N T D A T A F L O W & R E G I

More information

CSE 120. Translation Lookaside Buffer (TLB) Implemented in Hardware. July 18, Day 5 Memory. Instructor: Neil Rhodes. Software TLB Management

CSE 120. Translation Lookaside Buffer (TLB) Implemented in Hardware. July 18, Day 5 Memory. Instructor: Neil Rhodes. Software TLB Management CSE 120 July 18, 2006 Day 5 Memory Instructor: Neil Rhodes Translation Lookaside Buffer (TLB) Implemented in Hardware Cache to map virtual page numbers to page frame Associative memory: HW looks up in

More information

First-In-First-Out (FIFO) Algorithm

First-In-First-Out (FIFO) Algorithm First-In-First-Out (FIFO) Algorithm Reference string: 7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1 3 frames (3 pages can be in memory at a time per process) 15 page faults Can vary by reference string:

More information

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM

More information

Basic Memory Management

Basic Memory Management Basic Memory Management CS 256/456 Dept. of Computer Science, University of Rochester 10/15/14 CSC 2/456 1 Basic Memory Management Program must be brought into memory and placed within a process for it

More information

SmartSaver: Turning Flash Drive into a Disk Energy Saver for Mobile Computers

SmartSaver: Turning Flash Drive into a Disk Energy Saver for Mobile Computers SmartSaver: Turning Flash Drive into a Disk Energy Saver for Mobile Computers Feng Chen 1 Song Jiang 2 Xiaodong Zhang 1 The Ohio State University, USA Wayne State University, USA Disks Cost High Energy

More information

Chapter 14: Recovery System

Chapter 14: Recovery System Chapter 14: Recovery System Chapter 14: Recovery System Failure Classification Storage Structure Recovery and Atomicity Log-Based Recovery Remote Backup Systems Failure Classification Transaction failure

More information

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) ) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Transactions - Definition A transaction is a sequence of data operations with the following properties: * A Atomic All

More information

COSC3330 Computer Architecture Lecture 20. Virtual Memory

COSC3330 Computer Architecture Lecture 20. Virtual Memory COSC3330 Computer Architecture Lecture 20. Virtual Memory Instructor: Weidong Shi (Larry), PhD Computer Science Department University of Houston Virtual Memory Topics Reducing Cache Miss Penalty (#2) Use

More information

CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER

CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER 84 CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER 3.1 INTRODUCTION The introduction of several new asynchronous designs which provides high throughput and low latency is the significance of this chapter. The

More information

Non-Volatile Memory Through Customized Key-Value Stores

Non-Volatile Memory Through Customized Key-Value Stores Non-Volatile Memory Through Customized Key-Value Stores Leonardo Mármol 1 Jorge Guerra 2 Marcos K. Aguilera 2 1 Florida International University 2 VMware L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware)

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies TDT4255 Lecture 10: Memory hierarchies Donn Morrison Department of Computer Science 2 Outline Chapter 5 - Memory hierarchies (5.1-5.5) Temporal and spacial locality Hits and misses Direct-mapped, set associative,

More information

GFS: The Google File System

GFS: The Google File System GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one

More information

OPERATING SYSTEM. Chapter 9: Virtual Memory

OPERATING SYSTEM. Chapter 9: Virtual Memory OPERATING SYSTEM Chapter 9: Virtual Memory Chapter 9: Virtual Memory Background Demand Paging Copy-on-Write Page Replacement Allocation of Frames Thrashing Memory-Mapped Files Allocating Kernel Memory

More information

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed

Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed Recovery System These slides are a modified version of the slides of the book Database System Concepts (Chapter 17), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan. Original slides are available

More information

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang CHAPTER 6 Memory 6.1 Memory 341 6.2 Types of Memory 341 6.3 The Memory Hierarchy 343 6.3.1 Locality of Reference 346 6.4 Cache Memory 347 6.4.1 Cache Mapping Schemes 349 6.4.2 Replacement Policies 365

More information

Faster and Durable Hash Indexes

Faster and Durable Hash Indexes Faster and Durable Hash Indexes Amit Kapila 2017.05.25 2013 EDB All rights reserved. 1 Contents Hash indexes Locking improvements Improved usage of space Code re-factoring to enable write-ahead-logging

More information

Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation Kevin Hsieh

Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation Kevin Hsieh Accelerating Pointer Chasing in 3D-Stacked : Challenges, Mechanisms, Evaluation Kevin Hsieh Samira Khan, Nandita Vijaykumar, Kevin K. Chang, Amirali Boroumand, Saugata Ghose, Onur Mutlu Executive Summary

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University Large and Fast: Exploiting Memory Hierarchy The Basic of Caches Measuring & Improving Cache Performance Virtual Memory A Common

More information

Crash Recovery CMPSCI 645. Gerome Miklau. Slide content adapted from Ramakrishnan & Gehrke

Crash Recovery CMPSCI 645. Gerome Miklau. Slide content adapted from Ramakrishnan & Gehrke Crash Recovery CMPSCI 645 Gerome Miklau Slide content adapted from Ramakrishnan & Gehrke 1 Review: the ACID Properties Database systems ensure the ACID properties: Atomicity: all operations of transaction

More information

LECTURE 11. Memory Hierarchy

LECTURE 11. Memory Hierarchy LECTURE 11 Memory Hierarchy MEMORY HIERARCHY When it comes to memory, there are two universally desirable properties: Large Size: ideally, we want to never have to worry about running out of memory. Speed

More information

Cache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals

Cache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals Cache Memory COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline The Need for Cache Memory The Basics

More information

Phase Change Memory An Architecture and Systems Perspective

Phase Change Memory An Architecture and Systems Perspective Phase Change Memory An Architecture and Systems Perspective Benjamin C. Lee Stanford University bcclee@stanford.edu Fall 2010, Assistant Professor @ Duke University Benjamin C. Lee 1 Memory Scaling density,

More information

ACID Properties. Transaction Management: Crash Recovery (Chap. 18), part 1. Motivation. Recovery Manager. Handling the Buffer Pool.

ACID Properties. Transaction Management: Crash Recovery (Chap. 18), part 1. Motivation. Recovery Manager. Handling the Buffer Pool. ACID Properties Transaction Management: Crash Recovery (Chap. 18), part 1 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke CS634 Class 20, Apr 13, 2016 Transaction Management

More information

Log Manager. Introduction

Log Manager. Introduction Introduction Log may be considered as the temporal database the log knows everything. The log contains the complete history of all durable objects of the system (tables, queues, data items). It is possible

More information

Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories

Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories Adrian M. Caulfield Arup De, Joel Coburn, Todor I. Mollov, Rajesh K. Gupta, Steven Swanson Non-Volatile Systems

More information

3. Memory Persistency Goals. 2. Background

3. Memory Persistency Goals. 2. Background Memory Persistency Steven Pelley Peter M. Chen Thomas F. Wenisch University of Michigan {spelley,pmchen,twenisch}@umich.edu Abstract Emerging nonvolatile memory technologies (NVRAM) promise the performance

More information

Transaction Management: Crash Recovery (Chap. 18), part 1

Transaction Management: Crash Recovery (Chap. 18), part 1 Transaction Management: Crash Recovery (Chap. 18), part 1 CS634 Class 17 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke ACID Properties Transaction Management must fulfill

More information

some sequential execution crash! Recovery Manager replacement MAIN MEMORY policy DISK

some sequential execution crash! Recovery Manager replacement MAIN MEMORY policy DISK ACID Properties Transaction Management: Crash Recovery (Chap. 18), part 1 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke CS634 Class 17 Transaction Management must fulfill

More information

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 20: Main Memory II Prof. Onur Mutlu Carnegie Mellon University Today SRAM vs. DRAM Interleaving/Banking DRAM Microarchitecture Memory controller Memory buses

More information