Design-Induced Latency Variation in Modern DRAM Chips:

Size: px
Start display at page:

Download "Design-Induced Latency Variation in Modern DRAM Chips:"

Transcription

1 Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms Donghyuk Lee 1,2 Samira Khan 3 Lavanya Subramanian 2 Saugata Ghose 2 Rachata Ausavarungnirun 2 Gennady Pekhimenko 4 Vivek Seshadri 4 Onur Mutlu 5,2 1 NVIDIA 2 Carnegie Mellon University 3 University of Virginia 4 Microsoft Research 5 ETH Zürich June 7, 2017

2 What Is Design-Induced Variation? across column distance from wordline driver Inherently fast wordline drivers fast slow sense amplifiers slow fast inherently slow across row distance from sense amplifier Systematic variation in cell access times caused by the physical organization of DRAM 2

3 Executive Summary Design-Induced Variation Inherently slow regions due to DRAM cell array organization Goal: Use design-induced variation to reduce DRAM latency Analysis: Characterization of 96 real DRAM modules Three types of systematic variation due to design Great potential to reduce DRAM latency at low cost Our Approach: DIVA-DRAM DIVA Profiling: Reliably reduce DRAM latency Profile only cells in inherently slow regions Use error correction (ECC) for slow cells that are not profiled DIVA Shuffling: Exploit variation to improve ECC reliability 15.1%/14.2% higher performance for 2-/8-core workloads 3

4 Presentation Outline DRAM Background Experimental Study of Design-Induced Variation Goal: Understand, identify inherently slower regions Methodology: Profile 96 real DRAM modules by using FPGA-based DRAM test infrastructure Exploiting Design-Induced Variation DIVA Profiling: Using low-cost slow region profiling to reliably and dynamically reduce DRAM latency DIVA Shuffling: Using variation across data bursts to reduce uncorrectable errors 4

5 High-Level DRAM Organization memory channel: 64-bit data bus 8-bit data bus per chip DRAM chip Processor DIMM dual in-line memory module 5

6 High-Level DRAM Organization memory channel: 64-bit data bus 8-bit data bus per chip Processor data burst Read request DIMM dual in-line memory module 6

7 High-Level DRAM Organization memory channel: 64-bit data bus 8-bit data bus per chip Processor 8 chips X 8 bits X 8 bursts = 64 bytes Read request DIMM dual in-line memory module 7

8 Organization Inside a DRAM Chip DRAM mat DRAM cell wordline drivers sense amplifiers 8

9 Organization Inside a DRAM Chip global wordline row group row decoder row decoder 9

10 DRAM Operations Memory controller sends commands to DRAM row decoder 10

11 DRAM Operations Memory controller sends commands to DRAM row decoder Activation: Open one row in each mat Row decoder, wordline drivers select a row of cells 11

12 DRAM Operations Memory controller sends commands to DRAM row decoder Activation: Open one row in each mat Row decoder, wordline drivers select a row of cells Each cell in the row shares charge with a sense amplifier 12

13 DRAM Operations Memory controller sends commands to DRAM row decoder column column column column Activation: Open one row in each mat Row decoder, wordline drivers select a row of cells Each cell in the row shares charge with a sense amplifier Read: Send one column of data from sense amplifiers to CPU 13

14 DRAM Operations Memory controller sends commands to DRAM row decoder Activation: Open one row in each mat Row decoder, wordline drivers select a row of cells Each cell in the row shares charge with a sense amplifier Read: Send one column of data from sense amplifiers to CPU Precharge: Prepare mats for next row 14

15 DRAM Operations Memory controller sends commands to DRAM row decoder Activation: Open one row in each mat Row decoder, wordline drivers select a row of cells Each cell in the row shares charge with a sense amplifier Read: Send one column of data from sense amplifiers to CPU Precharge: Prepare mats for next row Timing parameters: how long to wait for each step to finish 15

16 DRAM Timing Parameters Standard timing parameters are dictated by the worst case Must ensure reliable operation for: The smallest cell with the smallest charge in all DRAM modules (process variation) The highest operating temperature allowed (charge leakage) Large timing margin for the common case Goal: Lower common-case latency 16

17 DRAM Timing Parameters Standard timing parameters are dictated by the worst case Must ensure reliable operation for: The smallest cell with the smallest charge in all DRAM modules (process variation) The highest operating temperature allowed (charge leakage) Large timing margin for the common case Can we use design-induced variation to find and Goal: use Lower common-case common-case latency latency at low cost? 17

18 Presentation Outline DRAM Background Experimental Study of Design-Induced Variation Goal: Understand, identify inherently slower regions Methodology: Profile 96 real DRAM modules by using FPGA-based DRAM test infrastructure Exploiting Design-Induced Variation DIVA Profiling: Using low-cost slow region profiling to reliably and dynamically reduce DRAM latency DIVA Shuffling: Using variation across data bursts to reduce uncorrectable errors 18

19 Expected DRAM Characteristics Variation Some regions are slower than others If we reduce DRAM latency, slower regions are more vulnerable to errors Repeatability Latency (error) characteristics repeat periodically, if the same component (e.g., mat) is duplicated Similarity Characteristics repeat across different organizations (e.g., chip/dimm) that share same design 19

20 Characterization Methodology We use error behavior when we reduce latency to infer DRAM organization and characteristics FPGA-based memory controller infrastructure 20

21 DRAM Testing Infrastructure Temperature Controller FPGAs Heater FPGAs PC Tested 96 DIMMs from three vendors 21

22 Characterization Methodology We use error behavior when we reduce latency to infer DRAM organization and characteristics FPGA-based memory controller infrastructure We reverse engineer row address mapping Lower trp (precharge timing parameter) to 7.5 ns We characterize three types of variation Variation across columns Variation across rows Variation across data bursts Data and circuit model will be available on GitHub: 22

23 row decoder 1. Variation Across Rows darker cells are faster global wordline 512 rows row group Latency characteristics vary across 512 rows 23

24 row decoder row decoder 1. Variation Across Rows darker cells are faster global wordline 512 rows 512 rows row group Latency characteristics vary across 512 rows Same organization repeats every 512 rows 24

25 1. Variation Across Rows darker cells are faster global wordline row group 512 rows 512 rows row decoder row decoder sweep across rows Latency characteristics vary across 512 rows Latency Same characteristics organization repeats repeat every every rows rows 25

26 1.2. Periodic Row Variation Behavior Sorting with Discovered Row Mapping Erroneous Request Count sorted row address Row error (latency) characteristics periodically repeat every 512 rows 26

27 2. Variation Across Columns row decoder darker cells are faster global wordline Column latency depends on distance from row decoder, wordline driver 27

28 2. Variation Across Columns row decoder darker cells are faster global wordline Column latency depends on distance from row decoder, wordline driver 28

29 2. Variation Across Columns Erroneous Request Count Column error (latency) characteristics have specific patterns that repeat across row groups 29

30 3. Variation Across Data Bursts Processor Read request 64-bit data bus in memory channel A B8-bit Cdata D bus Eper FchipG H A B C D E F G H A B C D E F G H A B C D E F G H A B C D E F G H A A A B B B C C C D D D E E E F F F G G G H H H 64-bit data from different locations in the same row in the same chip DIMM Chip 1 Chip 2 Chip 3 Chip 4 Chip 5 Chip 6 Chip 7 Chip 8 30

31 3. Variation Across Data Bursts Error Count chip 1 chip 2 chip 3 chip 4 chip 5 chip 6 chip 7 chip 8 A B C D E F G H data bits in 8 data bursts Specific bits in a request induce more errors 31

32 Summary: Design-Induced Variation Systematic variation across rows Slow cells further from sense amplifier Systematic variation across columns Slow cells further from row decoder Slow cells further from wordline driver Systematic variation across data bursts Slow cells at certain bits in a burst Clustering of errors Can we use design-induced variation to find and use common-case latency at low cost? 32

33 Presentation Outline DRAM Background Experimental Study of Design-Induced Variation Goal: Understand, identify inherently slower regions Methodology: Profile 96 real DRAM modules by using FPGA-based DRAM test infrastructure Exploiting Design-Induced Variation DIVA Profiling: Using low-cost slow region profiling to reliably and dynamically reduce DRAM latency DIVA Shuffling: Using variation across data bursts to reduce uncorrectable errors 33

34 Challenges of Lowering Latency Static DRAM latency (e.g., AL-DRAM [HPCA 2015]) DRAM vendors need to provide fixed timings, increasing testing costs Doesn t account for latency changes over time (e.g., aging and wear out) Conventional online profiling Takes long time (high cost) to profile all DRAM cells Our Goal: Use design-induced variation to minimize profiling 34

35 1. DIVA Profiling Design-Induced-Variation-Aware wordline driver inherently slow sense amplifier Profile only slow regions to determine latency 35

36 What About Process Variation? Design-Induced-Variation-Aware slow cells process variation random error wordline driver inherently slow architectural variation localized error sense amplifier 36

37 What About Process Variation? Design-Induced-Variation-Aware slow cells process variation random error error-correcting codes (ECC) wordline driver sense amplifier inherently slow architectural variation localized error online profiling Combine ECC & online profiling Reliably reduce DRAM latency at low cost 37

38 Correction with Conventional ECC Processor Read request 64-bit data bus in memory channel 8-bit data bus per chip DIMM Error-Correcting Code (ECC) 38

39 Challenge of Conventional ECC Processor error 8-bit data bus per chip DIMM uncorrectable by ECC uncorrectable by ECC 39

40 Challenge of Conventional ECC Processor error 8-bit data bus per chip DIMM uncorrectable by ECC uncorrectable by ECC Clusters of slow cells due to design-induced variation lead to more uncorrectable errors 40

41 2. DIVA Shuffling Design-Induced-Variation-Aware Processor error DIMM 8-bit data bus per chip Shuffle data bursts 41

42 2. DIVA Shuffling Design-Induced-Variation-Aware Processor error DIMM 8-bit data bus per chip Shuffle data bursts Reduce uncorrectable errors 42

43 2. DIVA Shuffling Design-Induced-Variation-Aware Processor error DIMM 8-bit data bus per chip How do DIVA Profiling and DIVA Shuffling perform? Shuffle data bursts Reduce uncorrectable errors 43

44 DIVA Shuffling Improves ECC Fraction of Errors Removed 100% 80% 60% 40% 20% 0% ECC w/o DIVA AVA Suffling ECC with AVA DIVA Suffling Uncorrectable 33 DIMMs average % 16.1% DIVA Shuffling uses architectural variation to improve error correction using the same codeword 44

45 DIVA-DRAM Reduces Latency 50% Read 50% Write Latency Reduction 40% 30% 20% 10% 0% 35.1%34.6% 36.6% 35.8% 31.2% 25.5% 55 C 85 C 55 C 85 C 55 C 85 C 40% 30% 20% 10% 0% 39.4%38.7% 41.3% 40.3% 36.6% 27.5% 55 C 85 C 55 C 85 C 55 C 85 C AL-DRAM DIVA AVA Profiling DIVA AVA Profiling + Shuffling AL-DRAM DIVA AVA Profiling DIVA AVA Profiling + Shuffling 45

46 Latency Reduction DIVA-DRAM Reduces Latency 50% 40% 30% 20% 10% 0% Read 35.1%34.6% 36.6% 35.8% 31.2% 25.5% 55 C 85 C 55 C 85 C 55 C 85 C AL-DRAM DIVA AVA Profiling DIVA AVA Profiling + Shuffling 50% 40% 30% 20% 10% 0% 36.6% 27.5% Write 39.4%38.7% 41.3% 40.3% DIVA-DRAM reduces latency more aggressively and uses ECC to correct random slow cells 55 C 85 C 55 C 85 C 55 C 85 C AL-DRAM DIVA AVA Profiling DIVA AVA Profiling + Shuffling How do DRAM latency reductions translate to system performance? 46

47 DIVA-DRAM Improves Performance 32 single-core benchmarks: SPEC, Stream, TPC, GUPS 96 multicore workloads constructed with benchmarks 20% AL-DRAM DIVA AVA Profiling DIVA AVA Profiling + Shuffling System Performance Improvement 15% 10% 5% 7.0% 9.2% 9.5% 11.7% 14.7% 15.1% 13.7% 14.2% 13.8% 14.1% 11.0% 11.5% 0% 1-core 2-core 4-core 8-core 47

48 DIVA-DRAM Improves Performance 32 single-core benchmarks: SPEC, Stream, TPC, GUPS 96 multicore workloads constructed with benchmarks System Performance Improvement 20% 15% 10% 5% AL-DRAM DIVA AVA Profiling DIVA AVA Profiling + Shuffling 7.0% 9.2% 9.5% 11.7% 14.7% 15.1% 13.7% 14.2% 13.8% 14.1% 11.0% 11.5% DIVA-DRAM 0% outperforms the best prior work 1-core 2-core 4-core 8-core and can adapt to dynamic latency changes 48

49 Conclusion Design-Induced Variation: Inherently slow regions due to DRAM cell array organization Analysis: Characterization of 96 real DRAM modules Three types of systematic variation due to design Great potential to reduce DRAM latency at low cost Our Approach: DIVA-DRAM DIVA Profiling: Reliably reduce DRAM latency Profile only cells in inherently slow regions Use error correction (ECC) for slow cells that are not profiled DIVA Shuffling: Exploit variation to improve ECC reliability 15.1%/14.2% higher performance for 2-/8-core workloads 49

50 Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms Donghyuk Lee 1,2 Samira Khan 3 Lavanya Subramanian 2 Saugata Ghose 2 Rachata Ausavarungnirun 2 Gennady Pekhimenko 4 Vivek Seshadri 4 Onur Mutlu 5,2 1 NVIDIA 2 Carnegie Mellon University 3 University of Virginia 4 Microsoft Research 5 ETH Zürich Data, Circuit Model Will Be Available at

51 Backup Slides 51

52 Hierarchical Organization of DRAM 52

53 Sending Data From a DRAM Chip row decoder global sense amplifiers 64 bits IO interface 8 bits X 8 bursts Data in a request transferred as multiple data bursts 53

54 DRAM Stores Data as Charge DRAM cell Three steps of charge movement Sense amplifier 54

55 DRAM Stores Data as Charge DRAM cell Three steps of charge movement 1. Sensing Sense amplifier 55

56 DRAM Stores Data as Charge DRAM cell Three steps of charge movement 1. Sensing 2. Restore Sense amplifier 56

57 DRAM Stores Data as Charge DRAM cell Three steps of charge movement 1. Sensing 2. Restore 3. Precharge Sense amplifier 57

58 DRAM Charge over Time cell Sense amplifier charge cell Sense amplifier Data 1 Data 0 time 58

59 DRAM Charge over Time cell Sense amplifier charge cell Sense amplifier Sensing Data 1 Data 0 time 59

60 DRAM Charge over Time cell cell Data 1 charge Sense amplifier Sense amplifier Timing Parameters Sensing Restore In theory In practice margin Data 0 time Why does DRAM need the extra timing margin? 60

61 Two Reasons for Timing Margin 1. Process Variation DRAM cells are not equal Leads to extra timing margin for cells that can store large small amount of charge; 2. Temperature Dependence DRAM leaks more charge at higher temperature Leads to extra timing margin when operating at low temperature 61

62 DRAM Cells are Not Equal Ideal Real Same size Same charge Same latency Different size Different charge Different latency 62

63 DRAM Cells are Not Equal Ideal Real Smallest cell Large variation in cell size Largest cell Large variation in charge Large variation in access latency 63

64 Two Reasons for Timing Margin 1. Process Variation DRAM cells are not equal Leads to extra timing margin for cells that can store large amount of charge 2. Temperature Dependence DRAM leaks more charge at higher temperature Leads to extra timing margin when operating at low temperature 64

65 Charge Leakage Room Temp. Temperature Hot Temp. (85 C) Small leakage Large leakage 65

66 Charge Leakage Room Temp. Temperature Hot Temp. (85 C) Cells store small charge at high temperature and large charge at low temperature Large variation in access latency 66

67 DRAM Timing Parameters DRAM timing parameters are dictated by the worst case The smallest cell with the smallest charge in all DRAM products Operating at the highest temperature Large timing margin for the common case Can lower latency for the common case 67

68 DRAM Timing Parameters Command ACTIVATE Data Duration 43 68

69 DRAM Timing Parameters 1 Activation latency: trcd (13ns / 50 cycles) Command Data Duration ACTIVATE READ Cache line (64B) 43 69

70 DRAM Timing Parameters 1 2 Activation latency: trcd (13ns / 50 cycles) Precharge latency: trp (13ns / 50 cycles) Command Data Duration ACTIVATE READ PRECHARGE Cache line (64B) Next ACT 43 70

71 Design-Induced Variation in Open Bitline DRAM 71

72 72 Challenge: External Internal DRAM chip Internal address Address mapping IO interface External address External address Internal address

73 DRAM-Internal vs. DRAM-External IntMSB IntMID IntLSB near distance from s/a far ExtMID ExtMSB ExtLSB low error count high Estimated Mapping External Internal ExtMSB IntMID 4/4 = 100% ExtMID IntMSB 4/4 = 100% ExtLSB IntLSB 3/4 = 75% Estimated Mapping (External Internal) Based on Error Counts for the External Address 73

74 Row Address Mapping Confidence Confidence 120% 100% 80% 60% 40% 20% 0% IntMSB IntLSB

75 1.1. Measuring Row Variation Lower trp (precharge timing parameter) to 7.5 ns Erroneous Request Count row address (mod. 512) Periodic Errors 75

76 1.1. Measuring Row Variation Lower trp (precharge timing parameter) to 7.5 ns Erroneous Request Count row address (mod. 512) Need to reverse engineer Periodic row address mapping details in Errors our paper 76

77 1.2. Periodic Row Variation Behavior Sorting with Discovered Row Mapping Erroneous Request Count sorted row address Row error (latency) characteristics periodically repeat every 512 rows 77

78 Erroneous Request Count Variation in Rows trp (precharge timing parameter) 10.0 ns 7.5 ns 5.0 ns row addr. (mod. 512) row addr. (mod. 512) row addr. (mod. 512) Random Errors Periodic Errors Mostly Errors 78

79 1.2. Periodic Row Variation Behavior Aggregated & Sorted Apply sorted order to each 512-row group Erroneous Request Count row address (mod. 512) row address Error (latency) characteristics periodically repeat every 512 rows 79

80 2. Variation Across Columns row decoder 80

81 2. Variation Across Columns row decoder column column column global sense amplifier 64 bits IO interface global wordline 8 bits X 8 bursts column Different columns data from different locations different characteristics 81

82 3. Variation in Data Bits row decoder global sense amplifier 64 bits IO interface 8 bits X 8 bursts Data in a request transferred as multiple data bursts 82

83 DIVA-DRAM Evaluation Methodology Modified version of Ramulator (cycle accurate DRAM simulator) 83

84 Erroneous Request Count ms 128ms 64ms Row Address (modulo 512)

85 Erroneous Request Count C 65 C 45 C Row Address (modulo 512)

86 K K Error Count Ratio K vendor A vendor B Company A Company B Company C Tested DIMMs vendor C Error Count Ratio K vendor A vendor B Company A Company B Company C Tested DIMMs vendor C

87 global wordline driver local wordline driver cell capacitor access transistor wordline bitline parasitic resistance & capacitance rows (cells per bitline) local sense amplifier local sense amplifier mat 512 columns (cells per wordline) mat

88

89

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity Donghyuk Lee Carnegie Mellon University Problem: High DRAM Latency processor stalls: waiting for data main memory high latency Major bottleneck

More information

arxiv: v2 [cs.ar] 15 May 2017

arxiv: v2 [cs.ar] 15 May 2017 Understanding and Exploiting Design-Induced Latency Variation in Modern DRAM Chips Donghyuk Lee Samira Khan ℵ Lavanya Subramanian Saugata Ghose Rachata Ausavarungnirun Gennady Pekhimenko Vivek Seshadri

More information

Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms

Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms DONGHYUK LEE, NVIDIA and Carnegie Mellon University SAMIRA KHAN, University of Virginia

More information

SoftMC A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies

SoftMC A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies SoftMC A Flexible and Practical Open-Source Infrastructure for Enabling Experimental DRAM Studies Hasan Hassan, Nandita Vijaykumar, Samira Khan, Saugata Ghose, Kevin Chang, Gennady Pekhimenko, Donghyuk

More information

Understanding Reduced-Voltage Operation in Modern DRAM Devices

Understanding Reduced-Voltage Operation in Modern DRAM Devices Understanding Reduced-Voltage Operation in Modern DRAM Devices Experimental Characterization, Analysis, and Mechanisms Kevin Chang A. Giray Yaglikci, Saugata Ghose,Aditya Agrawal *, Niladrish Chatterjee

More information

ChargeCache. Reducing DRAM Latency by Exploiting Row Access Locality

ChargeCache. Reducing DRAM Latency by Exploiting Row Access Locality ChargeCache Reducing DRAM Latency by Exploiting Row Access Locality Hasan Hassan, Gennady Pekhimenko, Nandita Vijaykumar, Vivek Seshadri, Donghyuk Lee, Oguz Ergin, Onur Mutlu Executive Summary Goal: Reduce

More information

Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture

Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture Donghyuk Lee, Yoongu Kim, Vivek Seshadri, Jamie Liu, Lavanya Subramanian, Onur Mutlu Carnegie Mellon University HPCA - 2013 Executive

More information

Computer Architecture: Main Memory (Part II) Prof. Onur Mutlu Carnegie Mellon University

Computer Architecture: Main Memory (Part II) Prof. Onur Mutlu Carnegie Mellon University Computer Architecture: Main Memory (Part II) Prof. Onur Mutlu Carnegie Mellon University Main Memory Lectures These slides are from the Scalable Memory Systems course taught at ACACES 2013 (July 15-19,

More information

What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study

What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study Saugata Ghose, A. Giray Yağlıkçı, Raghav Gupta, Donghyuk Lee, Kais Kudrolli, William X. Liu, Hasan Hassan, Kevin

More information

Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case

Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case Carnegie Mellon University Research Showcase @ CMU Department of Electrical and Computer Engineering Carnegie Institute of Technology 2-2015 Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case

More information

Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms

Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms Understanding Reduced-Voltage Operation in Modern DRAM Devices: Experimental Characterization, Analysis, and Mechanisms KEVIN K. CHANG, A. GİRAY YAĞLIKÇI, and SAUGATA GHOSE, Carnegie Mellon University

More information

DRAM Disturbance Errors

DRAM Disturbance Errors http://www.ddrdetective.com/ http://users.ece.cmu.edu/~yoonguk/ Flipping Bits in Memory Without Accessing Them An Experimental Study of DRAM Disturbance Errors Yoongu Kim Ross Daly, Jeremie Kim, Chris

More information

arxiv: v1 [cs.ar] 29 May 2017

arxiv: v1 [cs.ar] 29 May 2017 Understanding Reduced-Voltage Operation in Modern DRAM Chips: Characterization, Analysis, and Mechanisms Kevin K. Chang Abdullah Giray Yağlıkçı Saugata Ghose Aditya Agrawal Niladrish Chatterjee Abhijith

More information

arxiv: v1 [cs.ar] 13 Jul 2018

arxiv: v1 [cs.ar] 13 Jul 2018 What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study arxiv:1807.05102v1 [cs.ar] 13 Jul 2018 SAUGATA GHOSE, Carnegie Mellon University ABDULLAH GIRAY YAĞLIKÇI, ETH

More information

What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study

What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study What Your DRAM Power Models Are Not Telling You: Lessons from a Detailed Experimental Study Saugata Ghose Abdullah Giray Yağlıkçı Raghav Gupta Donghyuk Lee Kais Kudrolli William X. Liu Hasan Hassan Kevin

More information

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 19: Main Memory Prof. Onur Mutlu Carnegie Mellon University Last Time Multi-core issues in caching OS-based cache partitioning (using page coloring) Handling

More information

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, Erich F. Haratsch February 6, 2017

More information

Low-Cost Inter-Linked Subarrays (LISA) Enabling Fast Inter-Subarray Data Movement in DRAM

Low-Cost Inter-Linked Subarrays (LISA) Enabling Fast Inter-Subarray Data Movement in DRAM Low-Cost Inter-Linked ubarrays (LIA) Enabling Fast Inter-ubarray Data Movement in DRAM Kevin Chang rashant Nair, Donghyuk Lee, augata Ghose, Moinuddin Qureshi, and Onur Mutlu roblem: Inefficient Bulk Data

More information

Solar-DRAM: Reducing DRAM Access Latency by Exploiting the Variation in Local Bitlines

Solar-DRAM: Reducing DRAM Access Latency by Exploiting the Variation in Local Bitlines Solar-: Reducing Access Latency by Exploiting the Variation in Local Bitlines Jeremie S. Kim Minesh Patel Hasan Hassan Onur Mutlu Carnegie Mellon University ETH Zürich latency is a major bottleneck for

More information

Introduction to memory system :from device to system

Introduction to memory system :from device to system Introduction to memory system :from device to system Jianhui Yue Electrical and Computer Engineering University of Maine The Position of DRAM in the Computer 2 The Complexity of Memory 3 Question Assume

More information

HeatWatch Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu

HeatWatch Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu HeatWatch Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu Storage Technology Drivers

More information

DRAM Tutorial Lecture. Vivek Seshadri

DRAM Tutorial Lecture. Vivek Seshadri DRAM Tutorial 18-447 Lecture Vivek Seshadri DRAM Module and Chip 2 Goals Cost Latency Bandwidth Parallelism Power Energy 3 DRAM Chip Bank I/O 4 Sense Amplifier top enable Inverter bottom 5 Sense Amplifier

More information

EEM 486: Computer Architecture. Lecture 9. Memory

EEM 486: Computer Architecture. Lecture 9. Memory EEM 486: Computer Architecture Lecture 9 Memory The Big Picture Designing a Multiple Clock Cycle Datapath Processor Control Memory Input Datapath Output The following slides belong to Prof. Onur Mutlu

More information

SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS

SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS SOLVING THE DRAM SCALING CHALLENGE: RETHINKING THE INTERFACE BETWEEN CIRCUITS, ARCHITECTURE, AND SYSTEMS Samira Khan MEMORY IN TODAY S SYSTEM Processor DRAM Memory Storage DRAM is critical for performance

More information

The Memory Hierarchy 1

The Memory Hierarchy 1 The Memory Hierarchy 1 What is a cache? 2 What problem do caches solve? 3 Memory CPU Abstraction: Big array of bytes Memory memory 4 Performance vs 1980 Processor vs Memory Performance Memory is very slow

More information

Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices. Aya Fukami, Saugata Ghose, Yixin Luo, Yu Cai, Onur Mutlu

Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices. Aya Fukami, Saugata Ghose, Yixin Luo, Yu Cai, Onur Mutlu Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices Aya Fukami, Saugata Ghose, Yixin Luo, Yu Cai, Onur Mutlu 1 Example Target Devices for Chip-Off Analysis Fire damaged

More information

Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture

Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture Carnegie Mellon University Research Showcase @ CMU Department of Electrical and Computer Engineering Carnegie Institute of Technology 2-2013 Tiered-Latency DRAM: A Low Latency and Low Cost DRAM Architecture

More information

Staged Memory Scheduling

Staged Memory Scheduling Staged Memory Scheduling Rachata Ausavarungnirun, Kevin Chang, Lavanya Subramanian, Gabriel H. Loh*, Onur Mutlu Carnegie Mellon University, *AMD Research June 12 th 2012 Executive Summary Observation:

More information

Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery

Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery Yu Cai, Yixin Luo, Erich F. Haratsch*, Ken Mai, Onur Mutlu Carnegie Mellon University, *LSI Corporation 1 Many use

More information

! Memory Overview. ! ROM Memories. ! RAM Memory " SRAM " DRAM. ! This is done because we can build. " large, slow memories OR

! Memory Overview. ! ROM Memories. ! RAM Memory  SRAM  DRAM. ! This is done because we can build.  large, slow memories OR ESE 57: Digital Integrated Circuits and VLSI Fundamentals Lec 2: April 5, 26 Memory Overview, Memory Core Cells Lecture Outline! Memory Overview! ROM Memories! RAM Memory " SRAM " DRAM 2 Memory Overview

More information

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery Yu Cai, Yixin Luo, Saugata Ghose, Erich F. Haratsch*, Ken Mai, Onur Mutlu Carnegie Mellon University, *Seagate Technology

More information

Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content

Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content Detecting and Mitigating Data-Dependent DRAM Failures by Exploiting Current Memory Content Samira Khan Chris Wilkerson Zhe Wang Alaa R. Alameldeen Donghyuk Lee Onur Mutlu University of Virginia Intel Labs

More information

Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation Kevin Hsieh

Accelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation Kevin Hsieh Accelerating Pointer Chasing in 3D-Stacked : Challenges, Mechanisms, Evaluation Kevin Hsieh Samira Khan, Nandita Vijaykumar, Kevin K. Chang, Amirali Boroumand, Saugata Ghose, Onur Mutlu Executive Summary

More information

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved.

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved. Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com Internal Memory http://www.yildiz.edu.tr/~naydin 1 2 Outline Semiconductor main memory Random Access Memory

More information

Speeding Up Crossbar Resistive Memory by Exploiting In-memory Data Patterns

Speeding Up Crossbar Resistive Memory by Exploiting In-memory Data Patterns March 12, 2018 Speeding Up Crossbar Resistive Memory by Exploiting In-memory Data Patterns Wen Wen Lei Zhao, Youtao Zhang, Jun Yang Executive Summary Problems: performance and reliability of write operations

More information

Efficient Data Mapping and Buffering Techniques for Multi-Level Cell Phase-Change Memories

Efficient Data Mapping and Buffering Techniques for Multi-Level Cell Phase-Change Memories Efficient Data Mapping and Buffering Techniques for Multi-Level Cell Phase-Change Memories HanBin Yoon, Justin Meza, Naveen Muralimanohar*, Onur Mutlu, Norm Jouppi* Carnegie Mellon University * Hewlett-Packard

More information

Couture: Tailoring STT-MRAM for Persistent Main Memory. Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung

Couture: Tailoring STT-MRAM for Persistent Main Memory. Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung Couture: Tailoring STT-MRAM for Persistent Main Memory Mustafa M Shihab Jie Zhang Shuwen Gao Joseph Callenes-Sloan Myoungsoo Jung Executive Summary Motivation: DRAM plays an instrumental role in modern

More information

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory 5.1 Semiconductor Main Memory 5.2 Error Correction 5.3 Advanced DRAM Organization 5.1 Semiconductor Main Memory

More information

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823

More information

Computer Architecture

Computer Architecture Computer Architecture Lecture 7: Memory Hierarchy and Caches Dr. Ahmed Sallam Suez Canal University Spring 2015 Based on original slides by Prof. Onur Mutlu Memory (Programmer s View) 2 Abstraction: Virtual

More information

Addressing the Memory Wall

Addressing the Memory Wall Lecture 26: Addressing the Memory Wall Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Tunes Cage the Elephant Back Against the Wall (Cage the Elephant) This song is for the

More information

Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative

Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Emre Kültürsay *, Mahmut Kandemir *, Anand Sivasubramaniam *, and Onur Mutlu * Pennsylvania State University Carnegie Mellon University

More information

Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology

Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology Using Commodity DRAM Technology Vivek Seshadri,5 Donghyuk Lee,5 Thomas Mullins 3,5 Hasan Hassan 4 Amirali Boroumand 5 Jeremie Kim 4,5 Michael A. Kozuch 3 Onur Mutlu 4,5 Phillip B. Gibbons 5 Todd C. Mowry

More information

Z-RAM Ultra-Dense Memory for 90nm and Below. Hot Chips David E. Fisch, Anant Singh, Greg Popov Innovative Silicon Inc.

Z-RAM Ultra-Dense Memory for 90nm and Below. Hot Chips David E. Fisch, Anant Singh, Greg Popov Innovative Silicon Inc. Z-RAM Ultra-Dense Memory for 90nm and Below Hot Chips 2006 David E. Fisch, Anant Singh, Greg Popov Innovative Silicon Inc. Outline Device Overview Operation Architecture Features Challenges Z-RAM Performance

More information

Computer Architecture

Computer Architecture Computer Architecture Lecture 1: Introduction and Basics Dr. Ahmed Sallam Suez Canal University Spring 2016 Based on original slides by Prof. Onur Mutlu I Hope You Are Here for This Programming How does

More information

Lecture 5: Scheduling and Reliability. Topics: scheduling policies, handling DRAM errors

Lecture 5: Scheduling and Reliability. Topics: scheduling policies, handling DRAM errors Lecture 5: Scheduling and Reliability Topics: scheduling policies, handling DRAM errors 1 PAR-BS Mutlu and Moscibroda, ISCA 08 A batch of requests (per bank) is formed: each thread can only contribute

More information

Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance

Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance Rachata Ausavarungnirun Saugata Ghose, Onur Kayiran, Gabriel H. Loh Chita Das, Mahmut Kandemir, Onur Mutlu Overview of This Talk Problem:

More information

Unleashing the Power of Embedded DRAM

Unleashing the Power of Embedded DRAM Copyright 2005 Design And Reuse S.A. All rights reserved. Unleashing the Power of Embedded DRAM by Peter Gillingham, MOSAID Technologies Incorporated Ottawa, Canada Abstract Embedded DRAM technology offers

More information

ARCHITECTURAL TECHNIQUES TO ENHANCE DRAM SCALING. Thesis Defense Yoongu Kim

ARCHITECTURAL TECHNIQUES TO ENHANCE DRAM SCALING. Thesis Defense Yoongu Kim ARCHITECTURAL TECHNIQUES TO ENHANCE DRAM SCALING Thesis Defense Yoongu Kim CPU+CACHE MAIN MEMORY STORAGE 2 Complex Problems Large Datasets High Throughput 3 DRAM Module DRAM Chip 1 0 DRAM Cell (Capacitor)

More information

Spring 2018 :: CSE 502. Main Memory & DRAM. Nima Honarmand

Spring 2018 :: CSE 502. Main Memory & DRAM. Nima Honarmand Main Memory & DRAM Nima Honarmand Main Memory Big Picture 1) Last-level cache sends its memory requests to a Memory Controller Over a system bus of other types of interconnect 2) Memory controller translates

More information

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery Carnegie Mellon University Research Showcase @ CMU Department of Electrical and Computer Engineering Carnegie Institute of Technology 6-2015 Read Disturb Errors in MLC NAND Flash Memory: Characterization,

More information

HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness

HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu Carnegie Mellon University

More information

RowClone: Fast and Efficient In-DRAM Copy and Initialization of Bulk Data

RowClone: Fast and Efficient In-DRAM Copy and Initialization of Bulk Data RowClone: Fast and Efficient In-DRAM Copy and Initialization of Bulk Data Vivek Seshadri Yoongu Kim Chris Fallin Donghyuk Lee Rachata Ausavarungnirun Gennady Pekhimenko Yixin Luo Onur Mutlu Phillip B.

More information

Improving DRAM Performance by Parallelizing Refreshes with Accesses

Improving DRAM Performance by Parallelizing Refreshes with Accesses Improving DRAM Performance by Parallelizing Refreshes with Accesses Kevin Chang Donghyuk Lee, Zeshan Chishti, Alaa Alameldeen, Chris Wilkerson, Yoongu Kim, Onur Mutlu Executive Summary DRAM refresh interferes

More information

ABSTRACT. HIGH-PERFORMANCE DRAM SYSTEM DESIGN CONSTRAINTS AND CONSIDERATIONS Joseph G. Gross, Master of Science, 2010

ABSTRACT. HIGH-PERFORMANCE DRAM SYSTEM DESIGN CONSTRAINTS AND CONSIDERATIONS Joseph G. Gross, Master of Science, 2010 ABSTRACT Title of Document: HIGH-PERFORMANCE DRAM SYSTEM DESIGN CONSTRAINTS AND CONSIDERATIONS Joseph G. Gross, Master of Science, 2010 Thesis Directed By: Dr. Bruce L. Jacob, Assistant Professor, Department

More information

Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation

Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation Improving 3D NAND Flash Memory Lifetime by Tolerating Early Retention Loss and Process Variation YIXIN LUO, Carnegie Mellon University SAUGATA GHOSE, Carnegie Mellon University YU CAI, SK Hynix, Inc. ERICH

More information

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory Semiconductor Memory Types Semiconductor Memory RAM Misnamed as all semiconductor memory is random access

More information

CSE502: Computer Architecture CSE 502: Computer Architecture

CSE502: Computer Architecture CSE 502: Computer Architecture CSE 502: Computer Architecture Memory / DRAM SRAM = Static RAM SRAM vs. DRAM As long as power is present, data is retained DRAM = Dynamic RAM If you don t do anything, you lose the data SRAM: 6T per bit

More information

18-740: Computer Architecture Recitation 4: Rethinking Memory System Design. Prof. Onur Mutlu Carnegie Mellon University Fall 2015 September 22, 2015

18-740: Computer Architecture Recitation 4: Rethinking Memory System Design. Prof. Onur Mutlu Carnegie Mellon University Fall 2015 September 22, 2015 18-740: Computer Architecture Recitation 4: Rethinking Memory System Design Prof. Onur Mutlu Carnegie Mellon University Fall 2015 September 22, 2015 Agenda Review Assignments for Next Week Rethinking Memory

More information

Lecture-14 (Memory Hierarchy) CS422-Spring

Lecture-14 (Memory Hierarchy) CS422-Spring Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect

More information

Memories: Memory Technology

Memories: Memory Technology Memories: Memory Technology Z. Jerry Shi Assistant Professor of Computer Science and Engineering University of Connecticut * Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475 * Memory Hierarchy

More information

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 18-447: Computer Architecture Lecture 25: Main Memory Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 Reminder: Homework 5 (Today) Due April 3 (Wednesday!) Topics: Vector processing,

More information

Chapter 5 Internal Memory

Chapter 5 Internal Memory Chapter 5 Internal Memory Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM) Read-write memory Electrically, byte-level Electrically Volatile Read-only memory (ROM) Read-only

More information

CS152 Computer Architecture and Engineering Lecture 16: Memory System

CS152 Computer Architecture and Engineering Lecture 16: Memory System CS152 Computer Architecture and Engineering Lecture 16: System March 15, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http://http.cs.berkeley.edu/~patterson

More information

Adaptive-Latency DRAM (AL-DRAM)

Adaptive-Latency DRAM (AL-DRAM) This is a summary of the original paper, entitled Adaptive-Latency DRAM: Optimizing DRAM Timing for the Common-Case which appears in HPCA 2015 [64]. Adaptive-Latency DRAM (AL-DRAM) Donghyuk Lee Yoongu

More information

An introduction to SDRAM and memory controllers. 5kk73

An introduction to SDRAM and memory controllers. 5kk73 An introduction to SDRAM and memory controllers 5kk73 Presentation Outline (part 1) Introduction to SDRAM Basic SDRAM operation Memory efficiency SDRAM controller architecture Conclusions Followed by part

More information

CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda

CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda CS698Y: Modern Memory Systems Lecture-16 (DRAM Timing Constraints) Biswabandan Panda biswap@cse.iitk.ac.in https://www.cse.iitk.ac.in/users/biswap/cs698y.html Row decoder Accessing a Row Access Address

More information

ECE 571 Advanced Microprocessor-Based Design Lecture 17

ECE 571 Advanced Microprocessor-Based Design Lecture 17 ECE 571 Advanced Microprocessor-Based Design Lecture 17 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 3 April 2018 HW8 is readings Announcements 1 More DRAM 2 ECC Memory There

More information

Row Buffer Locality Aware Caching Policies for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu

Row Buffer Locality Aware Caching Policies for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Row Buffer Locality Aware Caching Policies for Hybrid Memories HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Executive Summary Different memory technologies have different

More information

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,

More information

15-740/ Computer Architecture Lecture 5: Project Example. Jus%n Meza Yoongu Kim Fall 2011, 9/21/2011

15-740/ Computer Architecture Lecture 5: Project Example. Jus%n Meza Yoongu Kim Fall 2011, 9/21/2011 15-740/18-740 Computer Architecture Lecture 5: Project Example Jus%n Meza Yoongu Kim Fall 2011, 9/21/2011 Reminder: Project Proposals Project proposals due NOON on Monday 9/26 Two to three pages consisang

More information

CSE502: Computer Architecture CSE 502: Computer Architecture

CSE502: Computer Architecture CSE 502: Computer Architecture CSE 502: Computer Architecture Memory / DRAM SRAM = Static RAM SRAM vs. DRAM As long as power is present, data is retained DRAM = Dynamic RAM If you don t do anything, you lose the data SRAM: 6T per bit

More information

CPE300: Digital System Architecture and Design

CPE300: Digital System Architecture and Design CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Cache 11232011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Review Memory Components/Boards Two-Level Memory Hierarchy

More information

EECS 366: Computer Architecure. Memory Technology. Lecture Notes # 15. University of Illinois at Chicago. Instructor: Shantanu Dutt Department of EECS

EECS 366: Computer Architecure. Memory Technology. Lecture Notes # 15. University of Illinois at Chicago. Instructor: Shantanu Dutt Department of EECS EECS 366: Computer Architecure Instructor: Shantanu Dutt Department of EECS University of Illinois at Chicago Lecture Notes # 15 Memory Technology c Shantanu Dutt MEMORY ORGANIZATION Physical Characteristics

More information

ECE 152 Introduction to Computer Architecture

ECE 152 Introduction to Computer Architecture Introduction to Computer Architecture Main Memory and Virtual Memory Copyright 2009 Daniel J. Sorin Duke University Slides are derived from work by Amir Roth (Penn) Spring 2009 1 Where We Are in This Course

More information

ECE 485/585 Microprocessor System Design

ECE 485/585 Microprocessor System Design Microprocessor System Design Lecture 7: Memory Modules Error Correcting Codes Memory Controllers Zeshan Chishti Electrical and Computer Engineering Dept. Maseeh College of Engineering and Computer Science

More information

Mitigating Bitline Crosstalk Noise in DRAM Memories

Mitigating Bitline Crosstalk Noise in DRAM Memories Mitigating Bitline Crosstalk Noise in DRAM Memories Seyed Mohammad Seyedzadeh, Donald Kline Jr, Alex K. Jones, Rami Melhem University of Pittsburgh seyedzadeh@cs.pitt.edu,{dek61,akjones}@pitt.edu,melhem@cs.pitt.edu

More information

Understanding and Improving the Latency of DRAM-Based Memory Systems

Understanding and Improving the Latency of DRAM-Based Memory Systems Carnegie Mellon University Research Showcase @ CMU Dissertations Theses and Dissertations Spring 5-2017 Understanding and Improving the Latency of DRAM-Based Memory Systems Kevin K. Chang Carnegie Mellon

More information

Silicon Memories. Why store things in silicon? It s fast!!! Compatible with logic devices (mostly)

Silicon Memories. Why store things in silicon? It s fast!!! Compatible with logic devices (mostly) Memories and SRAM 1 Silicon Memories Why store things in silicon? It s fast!!! Compatible with logic devices (mostly) The main goal is to be cheap Dense -- The smaller the bits, the less area you need,

More information

Computer Architecture Lecture 19: Memory Hierarchy and Caches. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 3/19/2014

Computer Architecture Lecture 19: Memory Hierarchy and Caches. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 3/19/2014 18-447 Computer Architecture Lecture 19: Memory Hierarchy and Caches Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 3/19/2014 Extra Credit Recognition for Lab 3 1. John Greth (13157 ns) 2. Kevin

More information

ECE 250 / CS250 Introduction to Computer Architecture

ECE 250 / CS250 Introduction to Computer Architecture ECE 250 / CS250 Introduction to Computer Architecture Main Memory Benjamin C. Lee Duke University Slides from Daniel Sorin (Duke) and are derived from work by Amir Roth (Penn) and Alvy Lebeck (Duke) 1

More information

Chapter 5. Internal Memory. Yonsei University

Chapter 5. Internal Memory. Yonsei University Chapter 5 Internal Memory Contents Main Memory Error Correction Advanced DRAM Organization 5-2 Memory Types Memory Type Category Erasure Write Mechanism Volatility Random-access memory(ram) Read-write

More information

A Row Buffer Locality-Aware Caching Policy for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu

A Row Buffer Locality-Aware Caching Policy for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu A Row Buffer Locality-Aware Caching Policy for Hybrid Memories HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Overview Emerging memories such as PCM offer higher density than

More information

The DRAM Cell. EEC 581 Computer Architecture. Memory Hierarchy Design (III) 1T1C DRAM cell

The DRAM Cell. EEC 581 Computer Architecture. Memory Hierarchy Design (III) 1T1C DRAM cell EEC 581 Computer Architecture Memory Hierarchy Design (III) Department of Electrical Engineering and Computer Science Cleveland State University The DRAM Cell Word Line (Control) Bit Line (Information)

More information

Semiconductor Memory Classification. Today. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. CPU Memory Hierarchy.

Semiconductor Memory Classification. Today. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. CPU Memory Hierarchy. ESE 57: Digital Integrated Circuits and VLSI Fundamentals Lec : April 4, 7 Memory Overview, Memory Core Cells Today! Memory " Classification " ROM Memories " RAM Memory " Architecture " Memory core " SRAM

More information

DRAM Main Memory. Dual Inline Memory Module (DIMM)

DRAM Main Memory. Dual Inline Memory Module (DIMM) DRAM Main Memory Dual Inline Memory Module (DIMM) Memory Technology Main memory serves as input and output to I/O interfaces and the processor. DRAMs for main memory, SRAM for caches Metrics: Latency,

More information

Understanding and Improving Latency of DRAM-Based Memory Systems

Understanding and Improving Latency of DRAM-Based Memory Systems Understanding and Improving Latency of DRAM-Based Memory ystems Thesis Oral Kevin Chang Committee: rof. Onur Mutlu (Chair) rof. James Hoe rof. Kayvon Fatahalian rof. tephen Keckler (NVIDIA, UT Austin)

More information

Phase Change Memory An Architecture and Systems Perspective

Phase Change Memory An Architecture and Systems Perspective Phase Change Memory An Architecture and Systems Perspective Benjamin Lee Electrical Engineering Stanford University Stanford EE382 2 December 2009 Benjamin Lee 1 :: PCM :: 2 Dec 09 Memory Scaling density,

More information

ENEE 759H, Spring 2005 Memory Systems: Architecture and

ENEE 759H, Spring 2005 Memory Systems: Architecture and SLIDE, Memory Systems: DRAM Device Circuits and Architecture Credit where credit is due: Slides contain original artwork ( Jacob, Wang 005) Overview Processor Processor System Controller Memory Controller

More information

Mark Redekopp, All rights reserved. EE 352 Unit 10. Memory System Overview SRAM vs. DRAM DMA & Endian-ness

Mark Redekopp, All rights reserved. EE 352 Unit 10. Memory System Overview SRAM vs. DRAM DMA & Endian-ness EE 352 Unit 10 Memory System Overview SRAM vs. DRAM DMA & Endian-ness The Memory Wall Problem: The Memory Wall Processor speeds have been increasing much faster than memory access speeds (Memory technology

More information

Computer Organization. 8th Edition. Chapter 5 Internal Memory

Computer Organization. 8th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 8th Edition Chapter 5 Internal Memory Semiconductor Memory Types Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM)

More information

Phase Change Memory An Architecture and Systems Perspective

Phase Change Memory An Architecture and Systems Perspective Phase Change Memory An Architecture and Systems Perspective Benjamin C. Lee Stanford University bcclee@stanford.edu Fall 2010, Assistant Professor @ Duke University Benjamin C. Lee 1 Memory Scaling density,

More information

Introduction to Semiconductor Memory Dr. Lynn Fuller Webpage:

Introduction to Semiconductor Memory Dr. Lynn Fuller Webpage: ROCHESTER INSTITUTE OF TECHNOLOGY MICROELECTRONIC ENGINEERING Introduction to Semiconductor Memory Webpage: http://people.rit.edu/lffeee 82 Lomb Memorial Drive Rochester, NY 14623-5604 Tel (585) 475-2035

More information

Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture. This Unit: Main Memory. Readings

Where We Are in This Course Right Now. ECE 152 Introduction to Computer Architecture. This Unit: Main Memory. Readings Introduction to Computer Architecture Main Memory and Virtual Memory Copyright 2012 Daniel J. Sorin Duke University Slides are derived from work by Amir Roth (Penn) Spring 2012 Where We Are in This Course

More information

Marching Memory マーチングメモリ. UCAS-6 6 > Stanford > Imperial > Verify 中村維男 Based on Patent Application by Tadao Nakamura and Michael J.

Marching Memory マーチングメモリ. UCAS-6 6 > Stanford > Imperial > Verify 中村維男 Based on Patent Application by Tadao Nakamura and Michael J. UCAS-6 6 > Stanford > Imperial > Verify 2011 Marching Memory マーチングメモリ Tadao Nakamura 中村維男 Based on Patent Application by Tadao Nakamura and Michael J. Flynn 1 Copyright 2010 Tadao Nakamura C-M-C Computer

More information

The Memory Hierarchy. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1

The Memory Hierarchy. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1 The Memory Hierarchy Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 3, 2018 L13-1 Memory Technologies Technologies have vastly different tradeoffs between capacity, latency,

More information

Lecture 15: DRAM Main Memory Systems. Today: DRAM basics and innovations (Section 2.3)

Lecture 15: DRAM Main Memory Systems. Today: DRAM basics and innovations (Section 2.3) Lecture 15: DRAM Main Memory Systems Today: DRAM basics and innovations (Section 2.3) 1 Memory Architecture Processor Memory Controller Address/Cmd Bank Row Buffer DIMM Data DIMM: a PCB with DRAM chips

More information

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 20: Main Memory II Prof. Onur Mutlu Carnegie Mellon University Today SRAM vs. DRAM Interleaving/Banking DRAM Microarchitecture Memory controller Memory buses

More information

The University of Adelaide, School of Computer Science 13 September 2018

The University of Adelaide, School of Computer Science 13 September 2018 Computer Architecture A Quantitative Approach, Sixth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per

More information

EE414 Embedded Systems Ch 5. Memory Part 2/2

EE414 Embedded Systems Ch 5. Memory Part 2/2 EE414 Embedded Systems Ch 5. Memory Part 2/2 Byung Kook Kim School of Electrical Engineering Korea Advanced Institute of Science and Technology Overview 6.1 introduction 6.2 Memory Write Ability and Storage

More information