An introduction to SDRAM and memory controllers. 5kk73
- Rodney Arthur Wade
1 An introduction to SDRAM and memory controllers 5kk73
2 Presentation Outline (part 1)
- Introduction to SDRAM
- Basic SDRAM operation
- Memory efficiency
- SDRAM controller architecture
- Conclusions
Followed by part 2
3 Static RAM (SRAM)
- SRAM is typically on-chip memory
  - Found in higher levels of the memory hierarchy
  - Caches and scratchpads
- Either local to processor or centralized
  - Local memory has very short access time
  - Centralized shared memories have intermediate access time
- An SRAM cell consists of six transistors
  - Limits memory to a few megabytes, or even smaller
5 Dynamic RAM (DRAM)
- DRAM was patented in 1968 by Robert Dennard at IBM
- Significantly cheaper than SRAM (power / area)
  - 1 transistor and 1 capacitor vs. 6 transistors for SRAM
  - A bit is represented by a high or low charge on the capacitor
  - Charge dissipates due to leakage, hence the term dynamic RAM
  - Capacity of up to a gigabyte per chip
- DRAM is (shared) off-chip memory
  - Long access time compared to SRAM
  - Off-chip pins are expensive in terms of area and power
  - SDRAM bandwidth is scarce and must be efficiently utilized
- Found in lower levels of the memory hierarchy
  - Used as remote high-volume storage
6 The DRAM evolution
- Evolution of the DRAM design in the past 16 years
- A clock signal was added, making the design synchronous (SDRAM)
- The data bus transfers data on both the rising and the falling edge of the clock (hence Double Data Rate)
- The second and third generations of DDR memory (DDR2/DDR3) scale to higher clock frequencies (up to 1066 MHz)
- DDR4 is now standardized by JEDEC (up to 1200 MHz, 1600 MHz is planned)
- Special branches of DDR memories for graphics cards (GDDR) and for low-power systems (LPDDR, LPDDR2, LPDDR3)
7 Comparing the DRAM Evolution to the Dennard Evolution
8 SDRAM Architecture
- The SDRAM architecture is organized in banks, rows and columns
  - A row buffer stores a currently active (open) row
- The memory interface has a command bus, an address bus, and a data bus
  - Buses are shared between all banks to reduce the number of off-chip pins
  - A bank is essentially an independent memory, but with shared I/O
- Typical values DDR2/DDR3:
  - 4 or 8 banks
  - 8K to 65K rows / bank
  - 1K to 2K columns / row
  - 4, 8, or 16 bits / column
  - clock frequency (MHz)
  - 32 MB to 1 GB density
- Example memory: 16-bit DDR
  - 8 banks
  - 8K rows / bank
  - 1024 columns / row
  - 16 bits / column
  - 3200 MB/s peak bandwidth
[Figure: a bank with rows and columns; a row is activated (opened) into the row buffer, read or written, and then precharged (closed)]
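The numbers of the example memory can be cross-checked with a little arithmetic. The sketch below uses only the geometry and peak bandwidth stated on the slide; the derived capacity, data rate, and clock are computed results, not values taken from the slide, and the variable and function names are made up for illustration.

```python
# Geometry and peak bandwidth of the example memory, as stated on the slide.
BANKS = 8
ROWS_PER_BANK = 8 * 1024       # "8K rows / bank"
COLUMNS_PER_ROW = 1024
BITS_PER_COLUMN = 16           # also the data bus width of this device
PEAK_BANDWIDTH_MB_S = 3200


def capacity_bytes(banks, rows, columns, bits_per_column):
    """Total storage: every (bank, row, column) holds one memory word."""
    return banks * rows * columns * bits_per_column // 8


capacity = capacity_bytes(BANKS, ROWS_PER_BANK, COLUMNS_PER_ROW, BITS_PER_COLUMN)
row_bytes = COLUMNS_PER_ROW * BITS_PER_COLUMN // 8

# Peak bandwidth = transfers per second * bytes per transfer.
data_rate_mt_s = PEAK_BANDWIDTH_MB_S // (BITS_PER_COLUMN // 8)
# Double data rate: two transfers per clock cycle.
clock_mhz = data_rate_mt_s // 2

print(f"capacity:  {capacity // (1024 * 1024)} MiB")
print(f"row size:  {row_bytes} B (contents of one row buffer)")
print(f"data rate: {data_rate_mt_s} MT/s at a {clock_mhz} MHz clock")
```

The row size is worth noting: each activate copies a full 2 KiB row into the row buffer, even if only one burst of it is used.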
9 An Analogy
- An 8-bank SDRAM is like an 8-lane highway with a dead end and a single exit
- Shared by 2-way traffic!
10 Presentation Outline
- Introduction to SDRAM
- Basic SDRAM operation
- Memory efficiency
- SDRAM controller architecture
- Conclusions
11 Basic SDRAM Operation
- Requested row is activated and copied into the row buffer of the bank
- Read bursts and/or write bursts are issued to the active row
  - Programmed burst length (BL) of 4 or 8 words
- Row is precharged and stored back into the memory array

Command       Abbr  Description
Activate      ACT   Activate a row in a particular bank
Read          RD    Initiate a read burst to an active row
Write         WR    Initiate a write burst to an active row
Precharge     PRE   Close a row in a particular bank
Refresh       REF   Start a refresh operation
No operation  NOP   Ignores all inputs
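The activate / read-write / precharge cycle can be sketched as a tiny per-bank state tracker. This is an illustrative model only: the `Bank` class and its `access` method are hypothetical names, and timing is ignored entirely.

```python
class Bank:
    """Tracks which row, if any, is held in the bank's row buffer."""

    def __init__(self):
        self.open_row = None  # None means the bank is precharged (closed)

    def access(self, row, is_write=False):
        """Return the commands needed to access `row` and update the state."""
        commands = []
        if self.open_row is not None and self.open_row != row:
            commands.append("PRE")   # close the wrong row first
            self.open_row = None
        if self.open_row is None:
            commands.append("ACT")   # copy the requested row into the buffer
            self.open_row = row
        commands.append("WR" if is_write else "RD")
        return commands


bank = Bank()
first = bank.access(row=0)                # bank closed: activate, then read
hit = bank.access(row=0, is_write=True)   # row already open: burst only
miss = bank.access(row=1)                 # other row open: precharge first
print(first, hit, miss)
```

The three cases (closed bank, row hit, row miss) are what later slides refer to when they discuss bank efficiency.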
12 Timing Constraints
- Timing constraints determine which commands can be scheduled
  - More than 20 constraints, some are inter-dependent
- Limits the efficiency of memory accesses
  - Wait for precharge, activate and read/write commands before data on bus
- Timing constraints get increasingly severe for faster memories
  - The physical design of the memory core has not changed much
  - Constraints in nanoseconds stay constant, but the clock period gets shorter

Parameter                 Abbr.  Cycles
ACT to RD/WR              tRCD   3
ACT to ACT (diff. banks)  tRRD   2
ACT to ACT (same bank)    tRC    12
Read latency              RL     3
RD to RD                  -      BL/2

December 8, 2014
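The last point can be made concrete: a constraint that is fixed in nanoseconds costs more clock cycles as the clock gets faster. A minimal sketch, assuming a hypothetical 15 ns ACT-to-RD/WR constraint (an illustrative value, not taken from the slide or a datasheet) and three example clock frequencies:

```python
def ns_to_cycles(constraint_ps, clock_mhz):
    """Round a constraint given in picoseconds up to whole clock cycles.

    Integer arithmetic avoids floating-point rounding surprises:
    cycles = ceil(constraint_ps * clock_mhz / 1_000_000).
    """
    return -(-(constraint_ps * clock_mhz) // 1_000_000)


T_RCD_PS = 15_000  # assumed 15 ns constraint, for illustration only

cycles_by_clock = {mhz: ns_to_cycles(T_RCD_PS, mhz) for mhz in (200, 400, 800)}
for mhz, cycles in cycles_by_clock.items():
    print(f"{mhz} MHz clock: {cycles} cycles")
```

With these assumed numbers the same physical delay costs 3 cycles at 200 MHz but 12 cycles at 800 MHz, while a burst of 8 words still occupies only 4 cycles on the data bus.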
13 Bank Parallelism
- Multiple banks provide parallelism
  - SDRAM has separate data and command buses
  - Activate, precharge and transfer data in parallel (bank preparation)
  - Increases efficiency
- Figure shows parallel memory bursts with burst length 8
14 Presentation Outline
- Introduction to SDRAM
- Basic SDRAM operation
- Memory efficiency
- SDRAM controller architecture
- Conclusions
15 Memory Efficiency
- Memory efficiency is the fraction of clock cycles with data transfer
  - Defines the exchange rate between peak bandwidth and net bandwidth
  - Net bandwidth is the actual useful bandwidth after considering overhead
- Five categories of memory efficiency for SDRAM:
  - Refresh efficiency
  - Read/write efficiency
  - Bank efficiency
  - Command efficiency
  - Data efficiency
- Memory efficiency is the product of these five categories
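As a sketch of how the five categories combine, the snippet below multiplies them and scales a peak bandwidth. The five efficiency values are invented for illustration; only the product rule and the 3200 MB/s peak figure come from the slides.

```python
from math import prod


def memory_efficiency(refresh, read_write, bank, command, data):
    """Overall efficiency is the product of the five category efficiencies."""
    return prod((refresh, read_write, bank, command, data))


def net_bandwidth(peak_mb_s, efficiency):
    """Useful bandwidth left after all overhead."""
    return peak_mb_s * efficiency


# Hypothetical per-category values, chosen only to show the arithmetic.
efficiency = memory_efficiency(
    refresh=0.98, read_write=0.80, bank=0.70, command=0.97, data=0.90
)
print(f"memory efficiency: {efficiency:.3f}")
print(f"net bandwidth:     {net_bandwidth(3200, efficiency):.0f} MB/s")
```

Because the categories multiply, several individually mild losses compound: here no single category is below 70%, yet less than half of the peak bandwidth remains.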
16 Refresh Efficiency
- SDRAM needs to be refreshed regularly to retain data
  - A DRAM cell contains a leaking capacitor
  - A refresh command must be issued every 7.8 μs for DDR2/DDR3/DDR4
  - All banks must be precharged
  - Data cannot be transferred during refresh
- Refresh efficiency is largely independent of traffic
  - Generally 95-99%
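A first-order estimate of refresh efficiency is the fraction of each refresh interval that is not spent refreshing. In the sketch below the 7.8 μs interval is from the slide; the refresh durations are assumed example values, and the time needed to precharge all banks beforehand is ignored.

```python
REFRESH_INTERVAL_NS = 7800.0  # one refresh command every 7.8 us (from the slide)


def refresh_efficiency(refresh_duration_ns, interval_ns=REFRESH_INTERVAL_NS):
    """Fraction of time the memory is not busy with a refresh operation."""
    return 1.0 - refresh_duration_ns / interval_ns


# Assumed refresh durations; larger devices typically take longer to refresh.
short_refresh = refresh_efficiency(127.5)
long_refresh = refresh_efficiency(350.0)
print(f"{short_refresh:.3f} {long_refresh:.3f}")
```

Both assumed durations land inside the 95-99% range quoted on the slide, and neither depends on what traffic the memory sees.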
17 Read / Write Efficiency
- The data bus of an SDRAM is bi-directional
  - Cycles are lost when switching direction of the data bus
  - Extra NOPs must be inserted between read and write commands
  - "I should have inserted more NOPs!"
- Read/write efficiency depends on traffic
  - Determined by the frequency of read/write switches
  - Switching too often has a significant impact on memory efficiency
  - Switching after every burst of 8 words gives 57% r/w efficiency with DDR2-400
- How would you address this if you designed a memory controller?
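One common answer to the question above is to group several bursts in the same direction before switching. The sketch below models this with a single assumed number: an average of 3 lost cycles per direction switch. That value is not from a datasheet; it is chosen because it reproduces the slide's 57% figure (4 data cycles out of 7) for a switch after every burst of 8 words.

```python
DATA_CYCLES_PER_BURST = 4   # 8 words at two words per clock cycle
SWITCH_PENALTY_CYCLES = 3   # assumed average cost of one read/write switch


def read_write_efficiency(bursts_per_group):
    """Efficiency when the bus direction switches after every group of bursts."""
    data_cycles = bursts_per_group * DATA_CYCLES_PER_BURST
    return data_cycles / (data_cycles + SWITCH_PENALTY_CYCLES)


efficiency_by_group = {n: read_write_efficiency(n) for n in (1, 2, 4, 8)}
for n, eff in efficiency_by_group.items():
    print(f"switch after {n} burst(s): {eff:.0%}")
```

Under this model, grouping 8 bursts per direction already pushes read/write efficiency above 90%, at the price of making some requests wait for their turn.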
18 Bank Efficiency
- Bank conflict when a read or write targets an inactive row (row miss)
  - Significantly impacts memory efficiency
  - Requires precharge followed by activate
  - Less than 40% bank efficiency if always row miss in same bank
- Bank efficiency depends on traffic
  - Determined by address of request and memory map
- How would you address this if you designed a memory controller?
19 Command Efficiency
- The command bus uses single data rate
  - Congested if two commands are required simultaneously
  - One command has to be delayed, which may delay data on the bus
- Command efficiency depends on traffic
  - Small bursts reduce command efficiency
  - Potentially more activate and precharge commands issued
- Generally quite high (95-100%)
20 Be a Memory Command Scheduler! (Worst Halloween costume ever)
- Pen and paper may be useful
- For this SDRAM (loosely based on an LPDDR2-667 memory):

Parameter                  Abbr.  Cycles
ACT to RD/WR (same bank)   tRCD   6
ACT to ACT (diff. banks)   tRRD   5
ACT to ACT (same bank)     tRC    20
RD to RD, or WR to WR      -      4
RD to WR                   -      8
WR to RD                   -      8
RD to PRE (same bank)      -      5
WR to PRE (same bank)      -
PRE to ACT (same bank)     tRP    5

- Schedule these bursts:
  a) RD to bank 0, row 0, column 8
  b) RD to bank 0, row 1, column 16
  c) WR to bank 0, row 0, column 0
  d) WR to bank 3, row 0, column 8
  e) WR to bank 0, row 1, column 32
  f) RD to bank 0, row 0, column 0
- You can reorder all you like
- Optimize for the total schedule length
21 A possible solution
- Bursts and timing parameters as on the previous slide
[Figure: command schedule drawn as three CMD / Row/col timeline rows; legible commands: A0 A3 W0 W, R0 R0 P0 A, W0 R]
22 A possible solution
- Useful techniques:
  - Grouping reads and writes
  - Pipelining operations to different banks
  - Overlapping timing constraints (for example, waiting for both the WR-to-RD and the WR-to-PRE constraint at the same time)
- (Reordering is not always an option in reality)
- These concepts are revisited later in this lecture
- We normally build hardware to schedule these commands
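The effect of reordering can be checked with a small as-soon-as-possible scheduler. The sketch below covers only the three read bursts from the exercise (a, b and f, all in bank 0), so it needs only constraints whose values appear in the exercise table. It computes command issue cycles, not data bus timing, and it assumes the command order it is given is legal; the function names and the one-command-per-cycle fallback are assumptions of this sketch.

```python
# Minimum distances in cycles, taken from the exercise's timing table.
T_RCD, T_RRD, T_RC, T_RP = 6, 5, 20, 5
RD_TO_RD, RD_TO_PRE = 4, 5


def min_gap(prev, cur):
    """Minimum cycles between an earlier and a later (name, bank) command."""
    (p_name, p_bank), (c_name, c_bank) = prev, cur
    same_bank = p_bank == c_bank
    if p_name == "ACT" and c_name == "ACT":
        return T_RC if same_bank else T_RRD
    if p_name == "ACT" and c_name == "RD" and same_bank:
        return T_RCD
    if p_name == "RD" and c_name == "RD":
        return RD_TO_RD
    if p_name == "RD" and c_name == "PRE" and same_bank:
        return RD_TO_PRE
    if p_name == "PRE" and c_name == "ACT" and same_bank:
        return T_RP
    return 1  # otherwise only the shared command bus separates them


def schedule_asap(commands):
    """Issue each command at the earliest cycle allowed by all earlier ones."""
    times = []
    for i, cur in enumerate(commands):
        earliest = max(
            (times[j] + min_gap(commands[j], cur) for j in range(i)), default=0
        )
        times.append(earliest)
    return times


B0 = 0
# Reads in the given order a (row 0), b (row 1), f (row 0): two row misses.
in_order = [("ACT", B0), ("RD", B0), ("PRE", B0), ("ACT", B0), ("RD", B0),
            ("PRE", B0), ("ACT", B0), ("RD", B0)]
# Reordered to a, f, b: both row-0 reads share one activation.
reordered = [("ACT", B0), ("RD", B0), ("RD", B0), ("PRE", B0), ("ACT", B0),
             ("RD", B0)]

in_order_times = schedule_asap(in_order)
reordered_times = schedule_asap(reordered)
print("in order: ", in_order_times)
print("reordered:", reordered_times)
```

In this reduced example the last read command is issued at cycle 46 in the original order and at cycle 26 after grouping the two reads to row 0, mostly because one ACT-to-ACT (same bank) wait disappears.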
23 Data Efficiency
- A memory burst cannot access segments smaller than the minimum burst length
  - Minimum access granularity
  - A burst of 8 words is 16 B with a 16-bit memory and 64 B with a 64-bit memory
  - Excess data is thrown away!
- If data is poorly aligned, an extra segment has to be transferred
  - Cycles are lost when transferring unrequested data
- Data efficiency depends on the memory client (and the application)
  - Smaller requests reduce data efficiency
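Data efficiency for a single request can be sketched as requested bytes divided by transferred bytes, where transfers happen in whole, aligned bursts. The function name and the example requests are illustrative; the 64 B granularity corresponds to the slide's burst of 8 words on a 64-bit memory.

```python
def data_efficiency(address, size, burst_bytes):
    """Requested bytes divided by the bytes actually moved in whole bursts."""
    first_burst = address // burst_bytes
    last_burst = (address + size - 1) // burst_bytes
    transferred = (last_burst - first_burst + 1) * burst_bytes
    return size / transferred


BURST_BYTES = 64  # burst length 8 on a 64-bit memory

aligned = data_efficiency(address=0, size=64, burst_bytes=BURST_BYTES)
misaligned = data_efficiency(address=32, size=64, burst_bytes=BURST_BYTES)
small = data_efficiency(address=0, size=16, burst_bytes=BURST_BYTES)
print(aligned, misaligned, small)
```

The misaligned 64 B request straddles two bursts and the 16 B request uses a quarter of one burst, which is why both alignment and request size matter.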
24 Conclusions on Memory Efficiency
- Memory efficiency is highly dependent on traffic
- Worst-case efficiency is very low
  - Every burst targets different rows in the same bank
  - Read/write switch after every burst
- Results in less than 31% efficiency for all DDR2/DDR3/LPDDR/LPDDR2 memories
  - Efficiency drops as memories become faster (DDR4)
25 Worst-case memory efficiency
[Figure: worst-case memory efficiency for a range of DDR and LPDDR devices; legible labels include 256MB_LPDDR2-800-S4]
- DDRX-Y runs at Y/2 MHz command rate
  - Transports Y million memory words per second (two per clock cycle)
- Conclusion: worst-case efficiency must be avoided!
- (And what is wrong with this picture?)
26 Presentation Outline
- Introduction to SDRAM
- Basic SDRAM operation
- Memory efficiency
- SDRAM controller architecture
- Conclusions
27 A general memory controller architecture
- A general controller architecture consists of two parts
- The front-end
  - buffers requests and responses per requestor
  - schedules one (or more) requests for memory access
  - is independent of the memory type
- The back-end
  - translates scheduled request(s) into an SDRAM command sequence
  - is dependent on the memory type
28 Front-end arbitration
- Front-end provides buffering and arbitration
- Arbiter can schedule requests in many different ways
  - Priorities are common to give low-latency access to critical requestors
    - E.g. a stalling processor waiting for a cache line
    - However, it is important to prevent starvation of low-priority requestors
  - Common to schedule fairly in case of multiple processors (round-robin, TDM)
- Next request may be scheduled before the previous one is finished
  - Gives more options to the command generator in the back-end
- Scheduled requests are sent to the back-end for memory access
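The fair scheduling mentioned above can be sketched with a round-robin arbiter over per-requestor queues. This is a minimal illustration with hypothetical class and method names; a real front-end would also handle responses, back-pressure and priorities.

```python
from collections import deque


class RoundRobinArbiter:
    """Per-requestor request buffers with round-robin selection."""

    def __init__(self, num_requestors):
        self.queues = [deque() for _ in range(num_requestors)]
        self.next_index = 0  # requestor that gets the first chance next time

    def push(self, requestor, request):
        self.queues[requestor].append(request)

    def schedule(self):
        """Return (requestor, request) for the next turn, or None if idle."""
        count = len(self.queues)
        for offset in range(count):
            index = (self.next_index + offset) % count
            if self.queues[index]:
                self.next_index = (index + 1) % count
                return index, self.queues[index].popleft()
        return None


arbiter = RoundRobinArbiter(num_requestors=2)
for request in ("a0", "a1", "a2"):
    arbiter.push(0, request)
arbiter.push(1, "b0")

order = [arbiter.schedule() for _ in range(5)]
print(order)
```

Requestor 1 is served second even though requestor 0 queued three requests first, so a busy requestor cannot starve a quiet one; idle turns are skipped rather than wasted.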
29 Back-end
- Back-end contains a memory map and a command generator
- Memory map decodes a logical address to a physical address
  - Physical address is (bank, row, column)
  - Can be done in different ways; the choice affects efficiency
  - Example: logical address 0x10FF00 -> memory map -> physical address (2, 510, 128)
- Command generator schedules commands for the target memory
  - Customized for a particular memory generation
  - Programmable to handle different timing constraints
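To illustrate that the memory map can be done in different ways, the sketch below decodes a byte address with two bit layouts. Both layouts are assumptions made for this example, sized for the geometry of the earlier example memory (8 banks, 8K rows, 1024 columns, 16-bit columns). The layout behind the slide's 0x10FF00 -> (2, 510, 128) example is not given, so neither function is expected to reproduce it.

```python
BYTE_BITS = 1      # 16-bit columns: one bit selects the byte within a word
COLUMN_BITS = 10   # 1024 columns per row
BANK_BITS = 3      # 8 banks
ROW_BITS = 13      # 8K rows per bank


def _field(value, shift, bits):
    """Extract `bits` bits of `value`, starting at bit position `shift`."""
    return (value >> shift) & ((1 << bits) - 1)


def decode_row_bank_column(address):
    """Assumed layout, MSB to LSB: row | bank | column | byte offset."""
    word = address >> BYTE_BITS
    column = _field(word, 0, COLUMN_BITS)
    bank = _field(word, COLUMN_BITS, BANK_BITS)
    row = _field(word, COLUMN_BITS + BANK_BITS, ROW_BITS)
    return bank, row, column


def decode_bank_row_column(address):
    """Assumed layout, MSB to LSB: bank | row | column | byte offset."""
    word = address >> BYTE_BITS
    column = _field(word, 0, COLUMN_BITS)
    row = _field(word, COLUMN_BITS, ROW_BITS)
    bank = _field(word, COLUMN_BITS + ROW_BITS, BANK_BITS)
    return bank, row, column


ROW_SIZE_BYTES = 2048  # 1024 columns * 2 bytes
for address in (0, ROW_SIZE_BYTES, 0x10FF00):
    print(hex(address), decode_row_bank_column(address),
          decode_bank_row_column(address))
```

With the first layout, two addresses one row size apart land in different banks, so their activations can overlap. With the second layout they land in different rows of the same bank, which is exactly the row-miss pattern that hurts bank efficiency.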
30 Programmable Memory Timings
31 Programmable Memory Timings
- The SDRAM DIMM generally knows which timings it likes
32 Memory map
33 Command generator
- Generates and schedules commands for scheduled requests
  - May work with both requests and commands
- Many ways to determine which request to process
  - Increase bank efficiency
    - Prefer requests targeting open rows
  - Increase read/write efficiency
    - Prefer read after read and write after write
  - Reduce stall cycles of processor
    - Always prefer reads, since reads are blocking and writes are often posted
- What are the pros and cons of these methods?
- What happens to the worst-case?
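The first policy, preferring requests that target open rows, can be sketched in a few lines. The function name and data layout are hypothetical; the idea is similar in spirit to what is often called first-ready, first-come-first-served scheduling, but this is not claimed to be the lecture's own algorithm.

```python
def pick_request(pending, open_rows):
    """Choose the index of the next request to serve.

    pending:   list of (bank, row) tuples, oldest first
    open_rows: dict mapping bank -> currently open row (absent if closed)
    Prefers the oldest request that hits an open row, else the oldest overall.
    """
    for index, (bank, row) in enumerate(pending):
        if open_rows.get(bank) == row:
            return index
    return 0 if pending else None


pending = [(0, 5), (1, 2), (0, 7)]
open_rows = {0: 7, 1: 3}

chosen = pick_request(pending, open_rows)
print("serve request", chosen, "->", pending[chosen])
```

This also hints at an answer to the worst-case question: as long as newer requests keep hitting the open row, the oldest request (here to row 5 of bank 0) keeps waiting, so its latency is unbounded unless the policy adds some limit on how often it may be bypassed.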
34 Command generator
- Generate SDRAM commands without violating timing constraints
  - Often built hierarchically: distribute requests across banks, and issue commands once timing constraints are satisfied. Then choose which command for which bank is executed.
- Many possible policies to determine which command to schedule
  - Page policies
    - Close rows as soon as possible to activate a new one faster
    - Keep rows open as long as possible to benefit from locality
  - Command priorities
    - Read and write commands have high priority, as they put data on the bus
    - Precharge and activate commands have lower priorities
    - Algorithms often try to put data on the bus as soon as possible
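The trade-off between the two page policies can be sketched by counting, for an access trace, how many activates are issued and how many precharges end up in front of a waiting request. The function and the traces are invented for illustration, and timing is reduced to simple counts.

```python
def page_policy_cost(trace, keep_open):
    """Count activates and precharges that delay a waiting request.

    trace:     list of (bank, row) accesses in service order
    keep_open: True for an open-page policy, False for a close-page policy
    """
    open_rows = {}
    activates = 0
    blocking_precharges = 0
    for bank, row in trace:
        current = open_rows.get(bank)
        if current != row:
            if current is not None:
                blocking_precharges += 1  # wrong row open: PRE before ACT
            activates += 1
            open_rows[bank] = row
        if not keep_open:
            open_rows.pop(bank, None)  # precharge right after the access
    return activates, blocking_precharges


local_trace = [(0, 1), (0, 1), (0, 1), (0, 2)]     # mostly the same row
random_trace = [(0, 1), (0, 2), (0, 3), (0, 4)]    # a new row every time

for name, trace in (("local", local_trace), ("random", random_trace)):
    print(name, "open-page:", page_policy_cost(trace, keep_open=True),
          "close-page:", page_policy_cost(trace, keep_open=False))
```

Keeping rows open saves activates when the trace has locality, while closing rows early removes the precharge from the critical path when every access goes to a new row.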
35 Presentation Outline
- Introduction to SDRAM
- Basic SDRAM operation
- Memory efficiency
- SDRAM controller architecture
- Conclusions
36 Conclusions
- SDRAM is used as shared off-chip high-volume storage
  - Cheaper but slower than SRAM
- The worst-case efficiency of SDRAM depends on many factors
  [Figure: worst-case memory efficiency for a range of DDR and LPDDR devices; legible labels include 256MB_LPDDR2-800-S4]
  - Actual case is highly variable and depends on the application
- Controller tries to minimize latency and maximize efficiency
  - Low latency for critical requestors using priorities
  - Fairness among multiple processors
  - High efficiency by reordering requests to fit with memory state
  - Memory map impacts efficiency and power
37 Presentation Outline (part 2)
- Mixed time-criticality
38 Trends in embedded systems
- Embedded systems get increasingly complex
  - Increasingly complex applications (more functionality)
  - Growing number of applications integrated in a device
  - More applications execute concurrently
  - Requires increased system performance without increasing power
- The resulting complex contemporary platforms
  - are heterogeneous multi-processor systems with a distributed memory hierarchy to improve the performance/power ratio
  - Resources in the system are shared to reduce cost
39 Mixed time-criticality
- Applications have mixed time-criticality
- Firm real-time requirements (FRT)
  - E.g. software-defined radio application
  - Failure to satisfy requirement may violate correctness
  - No deadline misses tolerable
- Soft real-time requirements (SRT)
  - E.g. media decoder application
  - Failure to satisfy requirement reduces quality of output
  - Occasional deadline misses tolerable
- No real-time requirements (NRT)
  - E.g. graphical user interface
  - No timing requirements, but must be responsive
40 Formal verification
- Verifying MRT systems requires a combination of methods
  - Formal verification
  - Simulation-based verification
- Formal verification is often used to verify FRT requirements
  - Provides analytical bounds on response time or throughput
  - Considers all application inputs
  - Covers all combinations of concurrently running applications
- Approach requires models of both applications and hardware
  - Application models are not always available
  - Behavior of dynamic applications is not captured accurately
  - Most hardware is not designed with formal analysis in mind
41 Simulation-based verification
- Simulation is typically used to verify SRT and NRT applications
  - System simulated with a large set of inputs
- Resource sharing results in interference between applications
  - Timing behaviors of applications in a use-case are inter-dependent
  - All use-cases must be verified instead of all applications
  - Verification must be repeated if applications are added or modified
- Verification by simulation is a slow process with poor coverage
- Verification is costly and effort is expected to increase in future!
42 Performance guarantees for SDRAM
- SDRAM memories are particularly challenging resources
  - The execution time of a request in an SDRAM is variable
  - WCET is pessimistic and guaranteed bandwidth is very low
  - Less than 16% bandwidth can be guaranteed for all DDR3 devices
  [Figure: worst-case memory efficiency for a range of DDR and LPDDR devices; legible labels include 256MB_LPDDR2-800-S4]
- SDRAM bandwidth is scarce and must be efficiently utilized
  - Additional interfaces cannot be added due to cost constraints
43 Problem statement
- Complex systems have mixed time-criticality
  - Firm, soft, and no real-time requirements in one system
  - We refer to this as mixed real-time (MRT) requirements
- Sharing an SDRAM controller between FRT and SRT/NRT applications is challenging
- We would like to use the SDRAM in an efficient and power-conscious manner
  - Satisfying the FRT requirements, while providing sufficient performance to the SRT/NRT applications
More informationWhere Have We Been? Ch. 6 Memory Technology
Where Have We Been? Combinational and Sequential Logic Finite State Machines Computer Architecture Instruction Set Architecture Tracing Instructions at the Register Level Building a CPU Pipelining Where
More informationregisters data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.
13 1 CMPE110 Computer Architecture, Winter 2009 Andrea Di Blas 110 Winter 2009 CMPE Cache Direct-mapped cache Reads and writes Cache associativity Cache and performance Textbook Edition: 7.1 to 7.3 Third
More informationAdapted from David Patterson s slides on graduate computer architecture
Mei Yang Adapted from David Patterson s slides on graduate computer architecture Introduction Ten Advanced Optimizations of Cache Performance Memory Technology and Optimizations Virtual Memory and Virtual
More informationLecture-14 (Memory Hierarchy) CS422-Spring
Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect
More informationRecap: Machine Organization
ECE232: Hardware Organization and Design Part 14: Hierarchy Chapter 5 (4 th edition), 7 (3 rd edition) http://www.ecs.umass.edu/ece/ece232/ Adapted from Computer Organization and Design, Patterson & Hennessy,
More informationLarge and Fast: Exploiting Memory Hierarchy
CSE 431: Introduction to Operating Systems Large and Fast: Exploiting Memory Hierarchy Gojko Babić 10/5/018 Memory Hierarchy A computer system contains a hierarchy of storage devices with different costs,
More informationCycle Time for Non-pipelined & Pipelined processors
Cycle Time for Non-pipelined & Pipelined processors Fetch Decode Execute Memory Writeback 250ps 350ps 150ps 300ps 200ps For a non-pipelined processor, the clock cycle is the sum of the latencies of all
More informationNegotiating the Maze Getting the most out of memory systems today and tomorrow. Robert Kaye
Negotiating the Maze Getting the most out of memory systems today and tomorrow Robert Kaye 1 System on Chip Memory Systems Systems use external memory Large address space Low cost-per-bit Large interface
More informationThe Memory Component
The Computer Memory Chapter 6 forms the first of a two chapter sequence on computer memory. Topics for this chapter include. 1. A functional description of primary computer memory, sometimes called by
More informationMemory. Objectives. Introduction. 6.2 Types of Memory
Memory Objectives Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured. Master the concepts
More informationComputer Structure. The Uncore. Computer Structure 2013 Uncore
Computer Structure The Uncore 1 2 nd Generation Intel Next Generation Intel Turbo Boost Technology High Bandwidth Last Level Cache Integrates CPU, Graphics, MC, PCI Express* on single chip PCH DMI PCI
More informationLecture: Memory Technology Innovations
Lecture: Memory Technology Innovations Topics: memory schedulers, refresh, state-of-the-art and upcoming changes: buffer chips, 3D stacking, non-volatile cells, photonics Multiprocessor intro 1 Row Buffers
More informationOverview. Memory Classification Read-Only Memory (ROM) Random Access Memory (RAM) Functional Behavior of RAM. Implementing Static RAM
Memories Overview Memory Classification Read-Only Memory (ROM) Types of ROM PROM, EPROM, E 2 PROM Flash ROMs (Compact Flash, Secure Digital, Memory Stick) Random Access Memory (RAM) Types of RAM Static
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste
More informationAdapted from instructor s supplementary material from Computer. Patterson & Hennessy, 2008, MK]
Lecture 17 Adapted from instructor s supplementary material from Computer Organization and Design, 4th Edition, Patterson & Hennessy, 2008, MK] SRAM / / Flash / RRAM / HDD SRAM / / Flash / RRAM/ HDD SRAM
More informationCPS101 Computer Organization and Programming Lecture 13: The Memory System. Outline of Today s Lecture. The Big Picture: Where are We Now?
cps 14 memory.1 RW Fall 2 CPS11 Computer Organization and Programming Lecture 13 The System Robert Wagner Outline of Today s Lecture System the BIG Picture? Technology Technology DRAM A Real Life Example
More informationComputer Architecture Lecture 24: Memory Scheduling
18-447 Computer Architecture Lecture 24: Memory Scheduling Prof. Onur Mutlu Presented by Justin Meza Carnegie Mellon University Spring 2014, 3/31/2014 Last Two Lectures Main Memory Organization and DRAM
More informationViews of Memory. Real machines have limited amounts of memory. Programmer doesn t want to be bothered. 640KB? A few GB? (This laptop = 2GB)
CS6290 Memory Views of Memory Real machines have limited amounts of memory 640KB? A few GB? (This laptop = 2GB) Programmer doesn t want to be bothered Do you think, oh, this computer only has 128MB so
More informationComputer Organization. 8th Edition. Chapter 5 Internal Memory
William Stallings Computer Organization and Architecture 8th Edition Chapter 5 Internal Memory Semiconductor Memory Types Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM)
More informationMemory System Overview. DMA & Endian-ness. Technology. Architectural. Problem: The Memory Wall
The Memory Wall EE 357 Unit 13 Problem: The Memory Wall Processor speeds have been increasing much faster than memory access speeds (Memory technology targets density rather than speed) Large memories
More informationChapter 5 Internal Memory
Chapter 5 Internal Memory Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM) Read-write memory Electrically, byte-level Electrically Volatile Read-only memory (ROM) Read-only
More information2. Link and Memory Architectures and Technologies
2. Link and Memory Architectures and Technologies 2.1 Links, Thruput/Buffering, Multi-Access Ovrhds 2.2 Memories: On-chip / Off-chip SRAM, DRAM 2.A Appendix: Elastic Buffers for Cross-Clock Commun. Manolis
More informationA Comparative Study of Predictable DRAM Controllers
1 A Comparative Study of Predictable DRAM Controllers DALU GUO, MOHAMED HASSA, RODOLFO PELLIZZOI, and HIRE PATEL, University of Waterloo, CADA Recently, the research community has introduced several predictable
More informationChapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY
Chapter Seven CACHE MEMORY AND VIRTUAL MEMORY 1 Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored
More informationEmbedded Systems Design: A Unified Hardware/Software Introduction. Outline. Chapter 5 Memory. Introduction. Memory: basic concepts
Hardware/Software Introduction Chapter 5 Memory Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 1 2 Introduction Memory:
More informationEmbedded Systems Design: A Unified Hardware/Software Introduction. Chapter 5 Memory. Outline. Introduction
Hardware/Software Introduction Chapter 5 Memory 1 Outline Memory Write Ability and Storage Permanence Common Memory Types Composing Memory Memory Hierarchy and Cache Advanced RAM 2 Introduction Embedded
More informationSlide credit: Slides adapted from David Kirk/NVIDIA and Wen-mei W. Hwu, DRAM Bandwidth
Slide credit: Slides adapted from David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2016 DRAM Bandwidth MEMORY ACCESS PERFORMANCE Objective To learn that memory bandwidth is a first-order performance factor in
More informationOrganization Row Address Column Address Bank Address Auto Precharge 128Mx8 (1GB) based module A0-A13 A0-A9 BA0-BA2 A10
GENERAL DESCRIPTION The Gigaram is ECC Registered Dual-Die DIMM with 1.25inch (30.00mm) height based on DDR2 technology. DIMMs are available as ECC modules in 256Mx72 (2GByte) organization and density,
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address space at any time Temporal locality Items accessed recently are likely to
More informationEvaluating STT-RAM as an Energy-Efficient Main Memory Alternative
Evaluating STT-RAM as an Energy-Efficient Main Memory Alternative Emre Kültürsay *, Mahmut Kandemir *, Anand Sivasubramaniam *, and Onur Mutlu * Pennsylvania State University Carnegie Mellon University
More informationComputer Architecture
Computer Architecture Lecture 7: Memory Hierarchy and Caches Dr. Ahmed Sallam Suez Canal University Spring 2015 Based on original slides by Prof. Onur Mutlu Memory (Programmer s View) 2 Abstraction: Virtual
More informationComputer System Components
Computer System Components CPU Core 1 GHz - 3.2 GHz 4-way Superscaler RISC or RISC-core (x86): Deep Instruction Pipelines Dynamic scheduling Multiple FP, integer FUs Dynamic branch prediction Hardware
More informationSemiconductor Memory Types Microprocessor Design & Organisation HCA2102
Semiconductor Memory Types Microprocessor Design & Organisation HCA2102 Internal & External Memory Semiconductor Memory RAM Misnamed as all semiconductor memory is random access Read/Write Volatile Temporary
More informationVariability Windows for Predictable DDR Controllers, A Technical Report
Variability Windows for Predictable DDR Controllers, A Technical Report MOHAMED HASSAN 1 INTRODUCTION In this technical report, we detail the derivation of the variability window for the eight predictable
More informationAddressing the Memory Wall
Lecture 26: Addressing the Memory Wall Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Tunes Cage the Elephant Back Against the Wall (Cage the Elephant) This song is for the
More informationComputer Memory. Textbook: Chapter 1
Computer Memory Textbook: Chapter 1 ARM Cortex-M4 User Guide (Section 2.2 Memory Model) STM32F4xx Technical Reference Manual: Chapter 2 Memory and Bus Architecture Chapter 3 Flash Memory Chapter 36 Flexible
More information