Memory Controllers for Real-Time Embedded Systems. Benny Akesson Czech Technical University in Prague

Size: px
Start display at page:

Download "Memory Controllers for Real-Time Embedded Systems. Benny Akesson Czech Technical University in Prague"

Transcription

1 Memory Controllers for Real-Time Embedded Systems Benny Akesson Czech Technical University in Prague

2 Trends in Embedded Systems Embedded systems get increasingly complex Increasingly complex applications (more functionality) Growing number of applications integrated in a device More applications execute concurrently Requires increased system performance without increasing power The resulting complex contemporary platforms are heterogeneous multi-processor systems with distributed memory hierarchy to improve performance/power ratio Resources in the system are shared to reduce cost 2

3 Consumer Electronics Examples Source: International Technology Roadmap for Semiconductors (ITRS) Philips Set-Top Box Architecture (approximately one decade old) 3

4 Mixed Time-Criticality Applications have mixed time-criticality Firm real-time requirements (FRT) E.g. software-defined radio application Failure to satisfy requirement may violate correctness No deadline misses tolerable Soft real-time requirements (SRT) E.g. media decoder application Failure to satisfy requirement reduces quality of output Occassional deadline misses tolerable No real-time requirements (NRT) E.g. graphical user interface No timing requirements, but must be responsive 4

5 Simulation-Based Verification Simulation is typically used to verify SRT and NRT applications System simulated with a large set of inputs Resource sharing results in interference between applications Timing behaviors of applications in use-case inter-dependent All use-cases must be verified instead of all applications Verification must be repeated if applications are added or modified Verification by simulation is a slow process with poor coverage Verification is costly and effort is expected to increase in future! 5

6 Formal Verification Formal verification is alternative to simulation Provides analytical bounds on response time or throughput Considers all application inputs Covers all combinations of concurrently running applications Approach requires models of both applications and hardware Application models are not always available Behavior of dynamic applications is not captured accurately Most hardware is not designed with formal analysis in mind 6

7 Problem with SDRAM This presentation covers the CompSoC SDRAM controller SDRAM memories are particularly challenging resources The execution time of a request in an SDRAM is variable Worst-Case Execution Time (WCET) is pessimistic and guaranteed bandwidth is very low SDRAM bandwidth is scarce and must be efficiently utilized Additional interfaces cannot be added due to cost constraints 7

8 Problem Statement Complex systems have mixed time-criticality Firm, soft, and no real-time requirements in one system We refer to this as mixed real-time (MRT) requirements There are suitable SDRAM controllers for either FRT and SRT/NRT No good solutions for mixes between these types We develop a controller that reduce verification effort by enabling formal verification of FRT requirements enable independent verification by simulation for SRT/NRT while using the SDRAM in an efficient and power conscious manner 8

9 Presentation Outline Introduction Memory Overview Memory Controller Architectures Predator SDRAM Controller Conclusions 9

10 Distance from processor Access speed Memory Hierarchy Processor CPU Registers Registers L1 Cache L1 Cache L2 Cache L2 Cache Off-chip memory Off-chip memory Secondary memory Secondary memory (Hard disk) Size (capacity) 10

11 Memory Hierarchy Module Memory type Access Time Capacity Managed by Registers SRAM 1 cycle ~500B Software/compiler L1 Cache SRAM 1-3 cycles ~64KB Hardware L2 Cache SRAM 5-10 cycles 1-10MB Hardware Off-chip memory DRAM ~100 cycles ~10GB Software/OS Secondary memory Disk drive ~1000 cycles ~1TB Software/OS 11 Credits: J.Leverich, Stanford

12 Comparing SRAM and DRAM SRAM DRAM Fast (10x) Bitlines driven by transistors Large cell size (6-10x) 6 transistors Small cell size 1 capacitor and 1 transistor Bit stored as charge on capacitor Capacitor loses charge over time Period refresh required Hence the name dynamic RAM 12 Credits: J.Leverich, Stanford

13 Summary of Characteristics SRAM is preferable for register files and L1/L2 caches Simpler manufacturing (compatible with logic process) Fast random access (on-chip memory) No refreshes Lower density (6 transistors per cell) Higher cost DRAM is preferable for stand-alone memory chips Much higher capacity Higher density Lower cost Slower random access (off-chip memory) 13 Credits: J.Leverich, Stanford

14 SDRAM Architecture An SDRAM is organized in banks, rows and columns A row buffer in each bank stores a currently active (open) row Interface has a command bus, address bus, and a data bus Buses shared between banks to reduce the number of off-chip pins 14

15 Basic SDRAM Operation Memory map decodes address to bank, row, and column Row is activated and copied into the row buffer of the bank Read bursts and/or write bursts are issued to the active row Programmed burst length (BL) of 4 or 8 words Row is precharged and stored back into the memory array 15

16 Timing Constraints Timing constraints determine schedulability of commands More than 20 constraints on minimum time between commands E.g. activate-to-activate, activate-to-read/write, read/write-toprecharge, read-to-write, write-to-read, etc. Timing constraints for a DDR2-400 device 16

17 Memory Efficiency Execution times of requests are variable and traffic dependent Can vary by an order of magnitude Three reasons for overhead cycles: Activating and precharging (opening and closing) rows Switching direction of data bus from read to write Refreshing the memory Memory efficiency The fraction of clock cycles when requested data is transferred The exchange rate between peak bandwidth and net bandwidth High efficiency is required since bandwidth is a scarce resource 17

18 Bank-level Parallelism Enable DRAM access from different banks in parallel Reduces memory access latency and improves efficiency! 18 Credits: J.Leverich, Stanford

19 Presentation Outline Introduction Memory Overview Memory Controller Architectures Predator SDRAM Controller Conclusions 19

20 A General Memory Controller Architecture A general memory controller architecture consists of two parts The front-end buffers requests and responses per memory client schedules requests for memory access is independent of the memory type The back-end translates scheduled requests into SDRAM command sequence is dependent on the memory type 20

21 Front-End Front-end provides buffering and arbitration Arbiter can schedule requests in many different ways Round-Robin, Time-Division Multiplexing (TDM), priority scheduling Scheduled requests are sent to the back-end 21

22 Memory Map Back-end contains a memory map and a command generator Memory map decodes logical address to physical address Physical address is (bank, row, column) Decoding is done by slicing the bits in the logical address Many possibilities choice affects efficiency and power Logical addr. Memory map Physical addr. 0x10FF00 (2, 510, 128) 22

23 Command Generator Command generator generates and schedules commands Customized for a particular memory generation Programmable to handle different timing constraints 23

24 Controller Types Firm Real-Time Controllers Soft/No Real-Time Controllers Mixed Real-Time Controllers 24

25 Firm Real-Time Controllers FRT requirements must be satisfied even in worst-case scenario Typical goals of firm real-time controllers: Maximize the worst-case net bandwidth Minimize the worst-case response time A trade-off between the two, since they are contradictory 25

26 Locality in FRT Controllers SDRAM performance is highly dependent on locality Request served quickly if it targets an open row No overhead of opening and closing rows FRT controllers are typically unable to exploit locality Locality has to be guaranteed also in worst case Difficult for a single executing application Requires intimate knowledge of memory accesses More or less impossible for multiple concurrent applications Memory accesses mixed by memory arbiter 26

27 Close-Page Policy FRT controllers use close-page policies [Paolieri, Reineke] Precharge banks immediately after each request Assumes that every request targets closed rows Benefits of policy Reduces worst-case overhead of opening/closing rows Increases guaranteed net bandwidth Drawbacks of policy Sacrifices best and average-case performance and power Limits best-case efficiency of DDR3-800 with 64B requests to 80% 27

28 Statically Scheduled Controllers Controllers are classified as statically or dynamically scheduled Depends on SDRAM command scheduling mechanism Statically scheduled controllers [Bayliss] Pre-compute SDRAM schedule at design time Bandwidth and execution time bounded by inspecting schedule Suitable for FRT requirements Restricted to applications with well-specified memory behavior 28

29 Dynamically Scheduled Controllers Dynamically scheduled FRT controllers [Lee, Paolieri, Shao] Schedule commands at run-time based on incoming requests Challenge is to analyze command scheduler Required to bound net bandwidth and execution times Analysis often assumes large fixed-size requests [Paolieri, Reineke] Large enough to exploit maximum bank-level parallelism by interleaving Requires B requests depending on memory device 29

30 Predictable Arbitration These FRT controllers all have bounded execution time Bounding response times requires predictable arbitration Bounds number of interfering requests from other memory clients Different controllers uses different arbiters Statically scheduled controllers uses a static schedule [Paolieri] employs Round-Robin arbitration Targeting homogeneous chip multi-processors [Akesson] supports a variety of predictable arbiters E.g. (Weighted) Round-Robin, Credit-Controlled Static-Priority, and Frame-Based Static-Priority Targets heterogeneous MPSoCs 30

31 Controller Types Firm Real-Time Controllers Soft/No Real-Time Controllers Mixed Real-Time Controllers 31

32 Soft/No Real-Time Controllers Same controllers normally used for SRT/NRT requirements Dynamically scheduled high-performance controllers SRT applications are verified by simulation rather than formally Firm transaction-level guarantees are not necessary Sufficient to satisfy application-level deadlines with high probability May correspond to thousands of memory requests Typical goals of soft/no real-time controllers: Maximize the average net bandwidth Minimize the average response time A trade-off between the two, since they are contradictory 32

33 Locality in SRT Controllers SRT controllers do not have to guarantee locality Requires locality to offset miss penalties with high probability Open-page policies are common in SRT controllers Rows are speculatively kept open to exploit locality Average efficiency is hence typically higher than for FRT controllers Best-case memory efficiency is hence around 98% All requests are either reads or writes to the same row Efficiency losses only due to mandatory refresh activities 33

34 Flexibility SRT controllers are flexible and supports most memory traffic SRT controllers are dynamically scheduled Do not require formal analysis of supported memory traffic Enables supports of e.g. variable request sizes Fine-grained scheduling at level of single SDRAM bursts Reduces wasted data when serving small requests Reduces response times of sensitive clients Low worst-case memory efficiency Cannot guarantee locality or bank-level parallelism Worst-case efficiency about 16% for DDR3-800 with BL=8 words 34

35 Improving Memory Efficiency Memory efficiency is optimized using sophisticated mechanisms Preference for requests that target open rows Reduces overhead of opening and closing rows Increases response times for clients targeting closed rows Read/write grouping Reduces read/write switching overhead Increases response times for requests in wrong direction 35

36 Reducing Response Times Preference for reads over writes [Shao] Reads are often blocking while writes are posted Reduces stall cycles on processor Preemption of low-priority requests in service [Lee] Reduces response times of high-priority clients Increases response times of low-priority clients Reduces memory efficiency due to preemption overhead Interactions between mechanisms are complex Difficult to derive useful bounds on bandwidth and response times May even be difficult to guarantee the default 16% net bandwidth 36

37 Controller Types Firm Real-Time Controllers Soft/No Real-Time Controllers Mixed Real-Time Controllers 37

38 Mixed Real-Time Controllers MRT controllers must efficiently support FRT, SRT and NRT Current FRT controllers treat SRT/NRT clients like FRT clients Expensive both in terms of bandwidth and power Current SRT/NRT controllers treat FRT like SRT/NRT clients Guarantees are either not formally proven or very pessimistic Worst-case may be maximum observed case plus a safety margin Deadlines may be missed in corner cases MRT controllers are likely to evolve from current controllers Either from FRT controllers or SRT/NRT controllers 38

39 Presentation Outline Introduction Memory Overview Memory Controller Architectures Predator SDRAM Controller Conclusions 39

40 Positioning Predator is a mixed real-time hybrid controller Evolved from a firm real-time design Combines static and dynamic scheduling Statically computed memory patterns (sub-schedules) that are dynamically scheduled Two key ideas to predictability 1. Predictable WCET by using memory patterns 2. Predictable WCRT by using predictable arbitration 40

41 Predictable SDRAM Predictability through precomputed memory patterns Patterns are precomputed sequences of SDRAM commands There are six types of memory patterns Read, write, r/w switch, w/r switch, refresh, and power down Dynamic request to pattern mapping: Read request read pattern (potentially first w/r switch) Write request write pattern (potentially first r/w switch) Refresh pattern issued when required Power down when idle 41

42 Memory Patterns Patterns enable scheduling at higher level than commands Less state and fewer constraints, making them easier to analyze Bounding memory efficiency Worst sequence of patterns is known (mapping & pattern lengths) Data transferred by patterns is known (by definition) Sequence of patterns for a DDR2-400 memory 42

43 Predictable Front-End Arbitration Controller works with any predictable front-end arbiter Example: Round-Robin, TDM, or our own CCSP arbiter Bounding response times Number of interfering requests is known (arbiter analysis) Worst-case request to pattern mapping is known (mapping) Pattern to cycle mapping is known (pattern lengths) Design provides bounds on net bandwidth and response times For any combination of supported memory and arbiter 43

44 Controller Architecture Front-end components Atomizer splits large requests into smaller pieces Delay Block contains request and response buffers Resource Bus schedules client according to arbiter policy Configuration Bus makes components run-time programmable Back-end Different components for SRAM and DRAM Real-Time Memory Controller 44

45 Pattern Structure Structure of basic access patterns (read and write patterns) At least one activate and precharge command per bank Access patterns of same type must be independent Banks are precharged immediately after access (close-page policy) Example read pattern for a DDR2-400 memory 45

46 Access Pattern Parameters There are three key access pattern parameters Determined at design time Number of banks to interleave (BI) request over Determines bank parallelism Memory efficiency and power consumption increase with BI Number of bursts to each bank, called burst count (BC) Number of words in a burst, the burst length (BL) BC & BL determine amount of data transferred per bank Increasing BC & BL amortizes overhead and increases efficiency 46

47 Access Granularity Parameters determine access granularity (AG) of the memory Amount of data accessed by a pattern AG = BI x BC x BL x word size Execution time of a request is proportional to AG Large requests increase WCET and WCRT Parameters are trade-off between efficiency, WCRT and power Automatically configured based on application requirements Read pattern for 16-bit DDR2-400 BI=4, BC=1, BL=8 AG=64 bytes 47

48 FRT Bandwidth / Power Trade-Off Bandwidth/ power trade-off for FRT applications For different values of BI and BC (BL=8) Labels denote BI 48

49 Conservative Open-Page Policy Controller uses a conservative open-page policy Page kept open as long as possible without sacrificing WCET Enables limited locality to be exploited Reduces execution time and power consumption Policy introduces three new types of read and write patterns With / without activates and precharges, respectively Pattern is chosen when it is known if following request hits or misses Worst-case analysis is only based on miss patterns Request arrivals: Request 1 Request 2 Request 3 Request 4 Request 5 (requests of the same color use the same page) Close-page policy ACT Read 1 PRE ACT Read 2 PRE ACT Read 3 PRE ACT Read 4 PRE ACT Read 5 PRE Open-page policy ACT Read 1 Read 2 Read 3 PRE ACT Read 4 PRE ACT Read 5 Conservative open-page policy ACT Read 1 Read 2 PRE ACT Read 3 PRE 49 ACT Read 4 PRE ACT Read 5 PRE

50 SRT Execution Time / Power Trade-Off SRT application is verified by SystemC system simulation Traffic generator running SRT H.263 trace with 32 B cache lines 2 FRT applications modeled with synthetic traffic (64, 128 B) Results normalized to best configuration Conservative OP policy reduces power and execution time Legend: Labels: (BI, BC) Colored points = close-page policy Thin symbol = violates power budget 50

51 Presentation Outline Introduction Memory Overview Memory Controller Architectures Predator SDRAM Controller Conclusions 51

52 Conclusions Complex SoCs have mixed real-time (MRT) requirements Mix of firm (FRT), soft (SRT), and no real-time (NRT) requirements There are suitable controllers for FRT and SRT/NRT, but not MRT Firm real-time controllers Maximize bandwidth bound and minimize response time bound Static, dynamic, or hybrid SDRAM command scheduling Close-page policies to reduce miss penalty Predictable arbitration Soft/no real-time controllers Maximize average bandwidth and minimize average response time Dynamically scheduled with sophisticated mechanisms Open-page policies to exploit locality 52

53 Conclusions We propose a mixed real-time SDRAM controller Predictability enables formal verification of FRT requirements Achieved by memory patterns and predictable arbitration Configurable trade-off between efficiency, response time, and power Uses a conservative open-page policy to exploit locality for SRT/NRT 53

54 Acknowledgements This work would not have been possible without the memory team Kees Goossens (TU Eindhoven) Karthik Chandrasekar (TU Delft) Sven Goossens (TU Eindhoven) Manil Dev Gomony (TU Eindhoven) Yonghui Li (TU Eindhoven) Anna Minaeva (CTU Prague) Tim Kouters (TU Eindhoven) Williston Hayes jr. (TU Eindhoven) Jasper Kuijsten (TU Eindhoven) Getachew Teshome (TU Delft) Eelke Strooisma (TU Delft) Markus Ringhofer (TU Graz) Anand Khot (TU Delft) Sahar Foroutan (TIMA Laboratories) Gervin Thomas (TU Berlin) Winston Siauw (TU Delft) This research is supported by EU grants T-CREST, FlexTiles, Cobra, & NL grant NEST. CA104 COBRA 54

55 References [Akesson] B. Akesson and K. Goossens. Architectures and Modeling of Predictable Memory Controllers for Improved System Integration. In Proc. DATE, 2011 [Bayliss] S. Bayliss and George A. Constantinides. "Methodology for designing statically scheduled application-specific SDRAM controllers using constrained local search. In Proc. FPT, 2009 [Lee] K. Lee, T. Lin, and C. Jen. An efficient quality-aware memory controller for multimedia platform SoC. In IEEE Trans. on Circuits and Systems for Video Technology,15(5), 2005 [Paolieri] M. Paolieri, E. Quinones, F. Cazorla, and M. Valero. An Analyzable Memory Controller for Hard Real-Time CMPs. In: Embedded Systems Letters, IEEE, 1(4), [Reineke] J. Reineke, Isaac Liu, Hiren Patel, Sungjun Kim, and Edward Lee. PRET DRAM Controller: Bank Privatization for Predictability and Temporal Isolation. In: Proc. CODES+ISSS, 2011 [Shao] J. Shao and B. Davis. A burst scheduling access reordering mechanism. In Proc. HPCA,

56 Questions Thank you for your attention! Any questions? For the full story, please consult our book Memory Controllers for Real-Time Embedded Systems, available from Springer. 56

57

An introduction to SDRAM and memory controllers. 5kk73

An introduction to SDRAM and memory controllers. 5kk73 An introduction to SDRAM and memory controllers 5kk73 Presentation Outline (part 1) Introduction to SDRAM Basic SDRAM operation Memory efficiency SDRAM controller architecture Conclusions Followed by part

More information

Efficient real-time SDRAM performance

Efficient real-time SDRAM performance 1 Efficient real-time SDRAM performance Kees Goossens with Benny Akesson, Sven Goossens, Karthik Chandrasekar, Manil Dev Gomony, Tim Kouters, and others Kees Goossens

More information

Trends in Embedded System Design

Trends in Embedded System Design Trends in Embedded System Design MPSoC design gets increasingly complex Moore s law enables increased component integration Digital convergence creates a market for highly integrated devices The resulting

More information

Worst Case Analysis of DRAM Latency in Multi-Requestor Systems. Zheng Pei Wu Yogen Krish Rodolfo Pellizzoni

Worst Case Analysis of DRAM Latency in Multi-Requestor Systems. Zheng Pei Wu Yogen Krish Rodolfo Pellizzoni orst Case Analysis of DAM Latency in Multi-equestor Systems Zheng Pei u Yogen Krish odolfo Pellizzoni Multi-equestor Systems CPU CPU CPU Inter-connect DAM DMA I/O 1/26 Multi-equestor Systems CPU CPU CPU

More information

Embedded Systems. Series Editors

Embedded Systems. Series Editors Embedded Systems Series Editors Nikil D. Dutt, Department of Computer Science, Zot Code 3435, Donald Bren School of Information and Computer Sciences, University of California, Irvine, CA 92697-3435, USA

More information

trend: embedded systems Composable Timing and Energy in CompSOC trend: multiple applications on one device problem: design time 3 composability

trend: embedded systems Composable Timing and Energy in CompSOC trend: multiple applications on one device problem: design time 3 composability Eindhoven University of Technology This research is supported by EU grants T-CTEST, Cobra and NL grant NEST. Parts of the platform were developed in COMCAS, Scalopes, TSA, NEVA,

More information

A Reconfigurable Real-Time SDRAM Controller for Mixed Time-Criticality Systems

A Reconfigurable Real-Time SDRAM Controller for Mixed Time-Criticality Systems A Reconfigurable Real-Time SDRAM Controller for Mixed Time-Criticality Systems Sven Goossens, Jasper Kuijsten, Benny Akesson, Kees Goossens Eindhoven University of Technology {s.l.m.goossens,k.b.akesson,k.g.w.goossens}@tue.nl

More information

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 19: Main Memory Prof. Onur Mutlu Carnegie Mellon University Last Time Multi-core issues in caching OS-based cache partitioning (using page coloring) Handling

More information

Exploiting Expendable Process-Margins in DRAMs for Run-Time Performance Optimization

Exploiting Expendable Process-Margins in DRAMs for Run-Time Performance Optimization Exploiting Expendable Process-Margins in DRAMs for Run-Time Performance Optimization Karthik Chandrasekar, Sven Goossens 2, Christian Weis 3, Martijn Koedam 2, Benny Akesson 4, Norbert Wehn 3, and Kees

More information

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,

More information

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823

More information

The University of Adelaide, School of Computer Science 13 September 2018

The University of Adelaide, School of Computer Science 13 September 2018 Computer Architecture A Quantitative Approach, Sixth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per

More information

Introduction to memory system :from device to system

Introduction to memory system :from device to system Introduction to memory system :from device to system Jianhui Yue Electrical and Computer Engineering University of Maine The Position of DRAM in the Computer 2 The Complexity of Memory 3 Question Assume

More information

Lecture 14: Cache Innovations and DRAM. Today: cache access basics and innovations, DRAM (Sections )

Lecture 14: Cache Innovations and DRAM. Today: cache access basics and innovations, DRAM (Sections ) Lecture 14: Cache Innovations and DRAM Today: cache access basics and innovations, DRAM (Sections 5.1-5.3) 1 Reducing Miss Rate Large block size reduces compulsory misses, reduces miss penalty in case

More information

Worst Case Analysis of DRAM Latency in Hard Real Time Systems

Worst Case Analysis of DRAM Latency in Hard Real Time Systems Worst Case Analysis of DRAM Latency in Hard Real Time Systems by Zheng Pei Wu A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Applied

More information

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more

More information

EEM 486: Computer Architecture. Lecture 9. Memory

EEM 486: Computer Architecture. Lecture 9. Memory EEM 486: Computer Architecture Lecture 9 Memory The Big Picture Designing a Multiple Clock Cycle Datapath Processor Control Memory Input Datapath Output The following slides belong to Prof. Onur Mutlu

More information

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Organization Part II Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn,

More information

Lecture 15: DRAM Main Memory Systems. Today: DRAM basics and innovations (Section 2.3)

Lecture 15: DRAM Main Memory Systems. Today: DRAM basics and innovations (Section 2.3) Lecture 15: DRAM Main Memory Systems Today: DRAM basics and innovations (Section 2.3) 1 Memory Architecture Processor Memory Controller Address/Cmd Bank Row Buffer DIMM Data DIMM: a PCB with DRAM chips

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more

More information

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY

Chapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY Chapter Seven CACHE MEMORY AND VIRTUAL MEMORY 1 Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology

More information

Computer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

Architecture and Optimal Configuration of a Real-Time Multi-Channel Memory Controller

Architecture and Optimal Configuration of a Real-Time Multi-Channel Memory Controller Architecture and Optimal Configuration of a Real-Time Multi-Channel Controller Manil Dev Gomony, Benny Akesson, and Kees Goossens Eindhoven University of Technology, The Netherlands CISTER-ISEP Research

More information

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 20: Main Memory II Prof. Onur Mutlu Carnegie Mellon University Today SRAM vs. DRAM Interleaving/Banking DRAM Microarchitecture Memory controller Memory buses

More information

Variability Windows for Predictable DDR Controllers, A Technical Report

Variability Windows for Predictable DDR Controllers, A Technical Report Variability Windows for Predictable DDR Controllers, A Technical Report MOHAMED HASSAN 1 INTRODUCTION In this technical report, we detail the derivation of the variability window for the eight predictable

More information

Embedded Systems. Series Editors

Embedded Systems. Series Editors Embedded Systems Series Editors Nikil D. Dutt, Department of Computer Science, Zot Code 3435, Donald Bren School of Information and Computer Sciences, University of California, Irvine, CA 92697-3435, USA

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address space at any time Temporal locality Items accessed recently are likely to

More information

Advanced Memory Organizations

Advanced Memory Organizations CSE 3421: Introduction to Computer Architecture Advanced Memory Organizations Study: 5.1, 5.2, 5.3, 5.4 (only parts) Gojko Babić 03-29-2018 1 Growth in Performance of DRAM & CPU Huge mismatch between CPU

More information

COMPUTER ARCHITECTURES

COMPUTER ARCHITECTURES COMPUTER ARCHITECTURES Random Access Memory Technologies Gábor Horváth BUTE Department of Networked Systems and Services ghorvath@hit.bme.hu Budapest, 2019. 02. 24. Department of Networked Systems and

More information

A Comparative Study of Predictable DRAM Controllers

A Comparative Study of Predictable DRAM Controllers 0:1 0 A Comparative Study of Predictable DRAM Controllers Real-time embedded systems require hard guarantees on task Worst-Case Execution Time (WCET). For this reason, architectural components employed

More information

LECTURE 5: MEMORY HIERARCHY DESIGN

LECTURE 5: MEMORY HIERARCHY DESIGN LECTURE 5: MEMORY HIERARCHY DESIGN Abridged version of Hennessy & Patterson (2012):Ch.2 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive

More information

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013

18-447: Computer Architecture Lecture 25: Main Memory. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 18-447: Computer Architecture Lecture 25: Main Memory Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/3/2013 Reminder: Homework 5 (Today) Due April 3 (Wednesday!) Topics: Vector processing,

More information

Caches. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Caches. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Caches Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns

More information

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1)

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) Department of Electr rical Eng ineering, Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Depar rtment of Electr rical Engineering,

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design Edited by Mansour Al Zuair 1 Introduction Programmers want unlimited amounts of memory with low latency Fast

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!

More information

Adapted from David Patterson s slides on graduate computer architecture

Adapted from David Patterson s slides on graduate computer architecture Mei Yang Adapted from David Patterson s slides on graduate computer architecture Introduction Ten Advanced Optimizations of Cache Performance Memory Technology and Optimizations Virtual Memory and Virtual

More information

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved.

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved. Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com Internal Memory http://www.yildiz.edu.tr/~naydin 1 2 Outline Semiconductor main memory Random Access Memory

More information

A Comparative Study of Predictable DRAM Controllers

A Comparative Study of Predictable DRAM Controllers A A Comparative Study of Predictable DRAM Controllers Danlu Guo,Mohamed Hassan,Rodolfo Pellizzoni and Hiren Patel, {dlguo,mohamed.hassan,rpellizz,hiren.patel}@uwaterloo.ca, University of Waterloo Recently,

More information

,e-pg PATHSHALA- Computer Science Computer Architecture Module 25 Memory Hierarchy Design - Basics

,e-pg PATHSHALA- Computer Science Computer Architecture Module 25 Memory Hierarchy Design - Basics ,e-pg PATHSHALA- Computer Science Computer Architecture Module 25 Memory Hierarchy Design - Basics The objectives of this module are to discuss about the need for a hierarchical memory system and also

More information

CPS101 Computer Organization and Programming Lecture 13: The Memory System. Outline of Today s Lecture. The Big Picture: Where are We Now?

CPS101 Computer Organization and Programming Lecture 13: The Memory System. Outline of Today s Lecture. The Big Picture: Where are We Now? cps 14 memory.1 RW Fall 2 CPS11 Computer Organization and Programming Lecture 13 The System Robert Wagner Outline of Today s Lecture System the BIG Picture? Technology Technology DRAM A Real Life Example

More information

Computer Organization. 8th Edition. Chapter 5 Internal Memory

Computer Organization. 8th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 8th Edition Chapter 5 Internal Memory Semiconductor Memory Types Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM)

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Review: Major Components of a Computer Processor Devices Control Memory Input Datapath Output Secondary Memory (Disk) Main Memory Cache Performance

More information

Mainstream Computer System Components

Mainstream Computer System Components Mainstream Computer System Components Double Date Rate (DDR) SDRAM One channel = 8 bytes = 64 bits wide Current DDR3 SDRAM Example: PC3-12800 (DDR3-1600) 200 MHz (internal base chip clock) 8-way interleaved

More information

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)

EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building

More information

CENG3420 Lecture 08: Memory Organization

CENG3420 Lecture 08: Memory Organization CENG3420 Lecture 08: Memory Organization Bei Yu byu@cse.cuhk.edu.hk (Latest update: February 22, 2018) Spring 2018 1 / 48 Overview Introduction Random Access Memory (RAM) Interleaving Secondary Memory

More information

Main Memory Systems. Department of Electrical Engineering Stanford University Lecture 5-1

Main Memory Systems. Department of Electrical Engineering Stanford University   Lecture 5-1 Lecture 5 Main Memory Systems Department of Electrical Engineering Stanford University http://eeclass.stanford.edu/ee282 Lecture 5-1 Announcements If you don t have a group of 3, contact us ASAP HW-1 is

More information

Large and Fast: Exploiting Memory Hierarchy

Large and Fast: Exploiting Memory Hierarchy CSE 431: Introduction to Operating Systems Large and Fast: Exploiting Memory Hierarchy Gojko Babić 10/5/018 Memory Hierarchy A computer system contains a hierarchy of storage devices with different costs,

More information

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp. 13 1 CMPE110 Computer Architecture, Winter 2009 Andrea Di Blas 110 Winter 2009 CMPE Cache Direct-mapped cache Reads and writes Cache associativity Cache and performance Textbook Edition: 7.1 to 7.3 Third

More information

Chapter 2: Memory Hierarchy Design (Part 3) Introduction Caches Main Memory (Section 2.2) Virtual Memory (Section 2.4, Appendix B.4, B.

Chapter 2: Memory Hierarchy Design (Part 3) Introduction Caches Main Memory (Section 2.2) Virtual Memory (Section 2.4, Appendix B.4, B. Chapter 2: Memory Hierarchy Design (Part 3) Introduction Caches Main Memory (Section 2.2) Virtual Memory (Section 2.4, Appendix B.4, B.5) Memory Technologies Dynamic Random Access Memory (DRAM) Optimized

More information

EE414 Embedded Systems Ch 5. Memory Part 2/2

EE414 Embedded Systems Ch 5. Memory Part 2/2 EE414 Embedded Systems Ch 5. Memory Part 2/2 Byung Kook Kim School of Electrical Engineering Korea Advanced Institute of Science and Technology Overview 6.1 introduction 6.2 Memory Write Ability and Storage

More information

ECE 485/585 Microprocessor System Design

ECE 485/585 Microprocessor System Design Microprocessor System Design Lecture 5: Zeshan Chishti DRAM Basics DRAM Evolution SDRAM-based Memory Systems Electrical and Computer Engineering Dept. Maseeh College of Engineering and Computer Science

More information

Addressing the Memory Wall

Addressing the Memory Wall Lecture 26: Addressing the Memory Wall Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Tunes Cage the Elephant Back Against the Wall (Cage the Elephant) This song is for the

More information

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight

More information

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity Donghyuk Lee Carnegie Mellon University Problem: High DRAM Latency processor stalls: waiting for data main memory high latency Major bottleneck

More information

Mainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation

Mainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation Mainstream Computer System Components CPU Core 2 GHz - 3.0 GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation One core or multi-core (2-4) per chip Multiple FP, integer

More information

Topic 21: Memory Technology

Topic 21: Memory Technology Topic 21: Memory Technology COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Old Stuff Revisited Mercury Delay Line Memory Maurice Wilkes, in 1947,

More information

Topic 21: Memory Technology

Topic 21: Memory Technology Topic 21: Memory Technology COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Old Stuff Revisited Mercury Delay Line Memory Maurice Wilkes, in 1947,

More information

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory Semiconductor Memory Types Semiconductor Memory RAM Misnamed as all semiconductor memory is random access

More information

Memory latency: Affects cache miss penalty. Measured by:

Memory latency: Affects cache miss penalty. Measured by: Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row. Static RAM may be used for main memory

More information

Chapter 8 Memory Basics

Chapter 8 Memory Basics Logic and Computer Design Fundamentals Chapter 8 Memory Basics Charles Kime & Thomas Kaminski 2008 Pearson Education, Inc. (Hyperlinks are active in View Show mode) Overview Memory definitions Random Access

More information

Memory latency: Affects cache miss penalty. Measured by:

Memory latency: Affects cache miss penalty. Measured by: Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row. Static RAM may be used for main memory

More information

DRAM Main Memory. Dual Inline Memory Module (DIMM)

DRAM Main Memory. Dual Inline Memory Module (DIMM) DRAM Main Memory Dual Inline Memory Module (DIMM) Memory Technology Main memory serves as input and output to I/O interfaces and the processor. DRAMs for main memory, SRAM for caches Metrics: Latency,

More information

Main Memory. EECC551 - Shaaban. Memory latency: Affects cache miss penalty. Measured by:

Main Memory. EECC551 - Shaaban. Memory latency: Affects cache miss penalty. Measured by: Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row (~every 8 msec). Static RAM may be

More information

CENG4480 Lecture 09: Memory 1

CENG4480 Lecture 09: Memory 1 CENG4480 Lecture 09: Memory 1 Bei Yu byu@cse.cuhk.edu.hk (Latest update: November 8, 2017) Fall 2017 1 / 37 Overview Introduction Memory Principle Random Access Memory (RAM) Non-Volatile Memory Conclusion

More information

A Cache Hierarchy in a Computer System

A Cache Hierarchy in a Computer System A Cache Hierarchy in a Computer System Ideally one would desire an indefinitely large memory capacity such that any particular... word would be immediately available... We are... forced to recognize the

More information

EECS750: Advanced Operating Systems. 2/24/2014 Heechul Yun

EECS750: Advanced Operating Systems. 2/24/2014 Heechul Yun EECS750: Advanced Operating Systems 2/24/2014 Heechul Yun 1 Administrative Project Feedback of your proposal will be sent by Wednesday Midterm report due on Apr. 2 3 pages: include intro, related work,

More information

Basics DRAM ORGANIZATION. Storage element (capacitor) Data In/Out Buffers. Word Line. Bit Line. Switching element HIGH-SPEED MEMORY SYSTEMS

Basics DRAM ORGANIZATION. Storage element (capacitor) Data In/Out Buffers. Word Line. Bit Line. Switching element HIGH-SPEED MEMORY SYSTEMS Basics DRAM ORGANIZATION DRAM Word Line Bit Line Storage element (capacitor) In/Out Buffers Decoder Sense Amps... Bit Lines... Switching element Decoder... Word Lines... Memory Array Page 1 Basics BUS

More information

Memory Hierarchy and Caches

Memory Hierarchy and Caches Memory Hierarchy and Caches COE 301 / ICS 233 Computer Organization Dr. Muhamed Mudawar College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals Presentation Outline

More information

CS311 Lecture 21: SRAM/DRAM/FLASH

CS311 Lecture 21: SRAM/DRAM/FLASH S 14 L21-1 2014 CS311 Lecture 21: SRAM/DRAM/FLASH DARM part based on ISCA 2002 tutorial DRAM: Architectures, Interfaces, and Systems by Bruce Jacob and David Wang Jangwoo Kim (POSTECH) Thomas Wenisch (University

More information

A Comparative Study of Predictable DRAM Controllers

A Comparative Study of Predictable DRAM Controllers 1 A Comparative Study of Predictable DRAM Controllers DALU GUO, MOHAMED HASSA, RODOLFO PELLIZZOI, and HIRE PATEL, University of Waterloo, CADA Recently, the research community has introduced several predictable

More information

Lecture 11 Cache. Peng Liu.

Lecture 11 Cache. Peng Liu. Lecture 11 Cache Peng Liu liupeng@zju.edu.cn 1 Associative Cache Example 2 Associative Cache Example 3 Associativity Example Compare 4-block caches Direct mapped, 2-way set associative, fully associative

More information

Managing Memory for Timing Predictability. Rodolfo Pellizzoni

Managing Memory for Timing Predictability. Rodolfo Pellizzoni Managing Memory for Timing Predictability Rodolfo Pellizzoni Thanks This work would not have been possible without the following students and collaborators Zheng Pei Wu*, Yogen Krish Heechul Yun* Renato

More information

Lecture: DRAM Main Memory. Topics: virtual memory wrap-up, DRAM intro and basics (Section 2.3)

Lecture: DRAM Main Memory. Topics: virtual memory wrap-up, DRAM intro and basics (Section 2.3) Lecture: DRAM Main Memory Topics: virtual memory wrap-up, DRAM intro and basics (Section 2.3) 1 TLB and Cache 2 Virtually Indexed Caches 24-bit virtual address, 4KB page size 12 bits offset and 12 bits

More information

CPU issues address (and data for write) Memory returns data (or acknowledgment for write)

CPU issues address (and data for write) Memory returns data (or acknowledgment for write) The Main Memory Unit CPU and memory unit interface Address Data Control CPU Memory CPU issues address (and data for write) Memory returns data (or acknowledgment for write) Memories: Design Objectives

More information

Composable Resource Sharing Based on Latency-Rate Servers

Composable Resource Sharing Based on Latency-Rate Servers Composable Resource Sharing Based on Latency-Rate Servers Benny Akesson 1, Andreas Hansson 1, Kees Goossens 2,3 1 Eindhoven University of Technology 2 NXP Semiconductors Research 3 Delft University of

More information

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology

Memory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology Memory Hierarchies Instructor: Dmitri A. Gusev Fall 2007 CS 502: Computers and Communications Technology Lecture 10, October 8, 2007 Memories SRAM: value is stored on a pair of inverting gates very fast

More information

Lecture-14 (Memory Hierarchy) CS422-Spring

Lecture-14 (Memory Hierarchy) CS422-Spring Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect

More information

Unleashing the Power of Embedded DRAM

Unleashing the Power of Embedded DRAM Copyright 2005 Design And Reuse S.A. All rights reserved. Unleashing the Power of Embedded DRAM by Peter Gillingham, MOSAID Technologies Incorporated Ottawa, Canada Abstract Embedded DRAM technology offers

More information

CSE502: Computer Architecture CSE 502: Computer Architecture

CSE502: Computer Architecture CSE 502: Computer Architecture CSE 502: Computer Architecture Memory / DRAM SRAM = Static RAM SRAM vs. DRAM As long as power is present, data is retained DRAM = Dynamic RAM If you don t do anything, you lose the data SRAM: 6T per bit

More information

CS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory

CS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory CS65 Computer Architecture Lecture 9 Memory Hierarchy - Main Memory Andrew Sohn Computer Science Department New Jersey Institute of Technology Lecture 9: Main Memory 9-/ /6/ A. Sohn Memory Cycle Time 5

More information

COSC 6385 Computer Architecture - Memory Hierarchies (III)

COSC 6385 Computer Architecture - Memory Hierarchies (III) COSC 6385 Computer Architecture - Memory Hierarchies (III) Edgar Gabriel Spring 2014 Memory Technology Performance metrics Latency problems handled through caches Bandwidth main concern for main memory

More information

Memories: Memory Technology

Memories: Memory Technology Memories: Memory Technology Z. Jerry Shi Assistant Professor of Computer Science and Engineering University of Connecticut * Slides adapted from Blumrich&Gschwind/ELE475 03, Peh/ELE475 * Memory Hierarchy

More information

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM

More information

Bounding SDRAM Interference: Detailed Analysis vs. Latency-Rate Analysis

Bounding SDRAM Interference: Detailed Analysis vs. Latency-Rate Analysis Bounding SDRAM Interference: Detailed Analysis vs. Latency-Rate Analysis Hardik Shah 1, Alois Knoll 2, and Benny Akesson 3 1 fortiss GmbH, Germany, 2 Technical University Munich, Germany, 3 CISTER-ISEP

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address

More information

ECE 341. Lecture # 16

ECE 341. Lecture # 16 ECE 341 Lecture # 16 Instructor: Zeshan Chishti zeshan@ece.pdx.edu November 24, 2014 Portland State University Lecture Topics The Memory System Basic Concepts Semiconductor RAM Memories Organization of

More information

Lecture: DRAM Main Memory. Topics: virtual memory wrap-up, DRAM intro and basics (Section 2.3)

Lecture: DRAM Main Memory. Topics: virtual memory wrap-up, DRAM intro and basics (Section 2.3) Lecture: DRAM Main Memory Topics: virtual memory wrap-up, DRAM intro and basics (Section 2.3) 1 TLB and Cache Is the cache indexed with virtual or physical address? To index with a physical address, we

More information

Computer Systems Laboratory Sungkyunkwan University

Computer Systems Laboratory Sungkyunkwan University DRAMs Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Main Memory & Caches Use DRAMs for main memory Fixed width (e.g., 1 word) Connected by fixed-width

More information

Lecture: Large Caches, Virtual Memory. Topics: cache innovations (Sections 2.4, B.4, B.5)

Lecture: Large Caches, Virtual Memory. Topics: cache innovations (Sections 2.4, B.4, B.5) Lecture: Large Caches, Virtual Memory Topics: cache innovations (Sections 2.4, B.4, B.5) 1 More Cache Basics caches are split as instruction and data; L2 and L3 are unified The /L2 hierarchy can be inclusive,

More information

Chapter 5 Internal Memory

Chapter 5 Internal Memory Chapter 5 Internal Memory Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM) Read-write memory Electrically, byte-level Electrically Volatile Read-only memory (ROM) Read-only

More information

Computer Architecture Lecture 24: Memory Scheduling

Computer Architecture Lecture 24: Memory Scheduling 18-447 Computer Architecture Lecture 24: Memory Scheduling Prof. Onur Mutlu Presented by Justin Meza Carnegie Mellon University Spring 2014, 3/31/2014 Last Two Lectures Main Memory Organization and DRAM

More information

Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture

Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture Donghyuk Lee, Yoongu Kim, Vivek Seshadri, Jamie Liu, Lavanya Subramanian, Onur Mutlu Carnegie Mellon University HPCA - 2013 Executive

More information

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition William Stallings Computer Organization and Architecture 6th Edition Chapter 5 Internal Memory 5.1 Semiconductor Main Memory 5.2 Error Correction 5.3 Advanced DRAM Organization 5.1 Semiconductor Main Memory

More information

Random Access Memory (RAM)

Random Access Memory (RAM) Random Access Memory (RAM) EED2003 Digital Design Dr. Ahmet ÖZKURT Dr. Hakkı YALAZAN 1 Overview Memory is a collection of storage cells with associated input and output circuitry Possible to read and write

More information

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip Reducing Hit Times Critical Influence on cycle-time or CPI Keep L1 small and simple small is always faster and can be put on chip interesting compromise is to keep the tags on chip and the block data off

More information

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB Memory Technology Caches 1 Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per GB Ideal memory Average access time similar

More information