Lecture 15: DRAM Main Memory Systems. Today: DRAM basics and innovations (Section 2.3)


Memory Architecture

[Figure: processor and memory controller connected by address/cmd and data buses to DIMMs; each bank has a row buffer]

- DIMM: a PCB with DRAM chips on the front and back
- Rank: a collection of DRAM chips that work together to respond to a request and keep the data bus full
- A 64-bit data bus will need 8 x8 DRAM chips, or 4 x16 DRAM chips, etc.
- Bank: a subset of a rank that is busy during one request
- Row buffer: the last row (say, 8 KB) read from a bank; acts like a cache

DRAM Array Access

- A 16Mb DRAM array = a 4096 x 4096 array of bits
- 12 row address bits arrive first: Row Access Strobe (RAS)
- All 4096 bits of that row are read out into the row buffer
- 12 column address bits arrive next: Column Access Strobe (CAS)
- The column decoder selects some bits from the row buffer and returns them to the CPU
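The 12+12 address split above can be sketched in a few lines; this is an illustrative example for the 16Mb (4096 x 4096) array on the slide, not code from the lecture.

```python
# Splitting a 24-bit cell address for a 16Mb (4096 x 4096) DRAM array
# into the row part (sent first, with RAS) and the column part (sent
# next, with CAS).
ROWS = COLS = 4096                   # 2^12 rows and 2^12 columns
ROW_BITS = ROWS.bit_length() - 1     # 12 row address bits
COL_BITS = COLS.bit_length() - 1     # 12 column address bits

def split_address(addr):
    """Upper 12 bits select the row; lower 12 bits select the column."""
    row = addr >> COL_BITS
    col = addr & (COLS - 1)
    return row, col
```

For example, `split_address(0xABCDEF)` yields row `0xABC` and column `0xDEF`.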

Organizing a Rank

- DIMM, rank, bank, and array form a hierarchy in the storage organization
- Because of electrical constraints, only a few DIMMs can be attached to a bus
- One DIMM can have 1-4 ranks
- For energy efficiency, use wide-output DRAM chips: it is better to activate only 4 x16 chips per request than 16 x4 chips
- For high capacity, use narrow-output DRAM chips: since the ranks on a channel are limited, capacity per rank is boosted by having 16 x4 2Gb chips rather than 4 x16 2Gb chips
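The energy-vs-capacity trade-off above is simple arithmetic; here is a hedged sketch for a 64-bit data bus and 2Gb chips (the chip density is the slide's example, the function name is mine).

```python
# For a 64-bit data bus: how many chips of a given output width make up
# one rank, and what capacity that rank has when each chip is 2Gb.
BUS_WIDTH = 64          # bits on the data bus
CHIP_DENSITY_GBIT = 2   # 2Gb per chip, as in the slide's example

def rank_config(chip_width):
    chips = BUS_WIDTH // chip_width                 # chips activated per request
    capacity_gbyte = chips * CHIP_DENSITY_GBIT / 8  # rank capacity in GB
    return chips, capacity_gbyte

print(rank_config(4))   # narrow x4 chips:  (16, 4.0) -> high capacity
print(rank_config(16))  # wide x16 chips:   (4, 1.0)  -> fewer chips activated
```

Narrow chips give 4x the capacity per rank, but every request activates 16 chips instead of 4, which costs energy.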

Organizing Banks and Arrays

- A rank is split into many banks (4-16) to boost parallelism within a rank
- Ranks and banks offer memory-level parallelism
- A bank is made up of multiple arrays (subarrays, tiles, mats)
- To maximize density, arrays within a bank are made large: rows are wide, so row buffers are wide (an 8KB read for a 64B request, called overfetch)
- Each array provides a single bit to the output pin in a cycle (for high density)

Row Buffers

- Each bank has a single row buffer
- Row buffers act as a cache within DRAM
- Row buffer hit: ~20 ns access time (must only move data from the row buffer to the pins)
- Empty row buffer access: ~40 ns (must first read the arrays, then move data from the row buffer to the pins)
- Row buffer conflict: ~60 ns (must first precharge the bitlines, then read the new row, then move data to the pins)
- In addition, a request must wait in the queue (tens of nanoseconds) and incur address/cmd/data transfer delays (~10 ns)
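The three outcomes above combine into an expected access latency; this is a small illustrative sketch where the ~20/40/60 ns figures come from the slide but the workload mix is an assumed example.

```python
# Expected DRAM access latency from the three row-buffer outcomes.
LATENCY_NS = {"hit": 20, "empty": 40, "conflict": 60}

def expected_latency(mix):
    """mix: dict mapping outcome -> fraction of accesses (fractions sum to 1)."""
    return sum(frac * LATENCY_NS[kind] for kind, frac in mix.items())

# A workload with decent locality: 60% hits, 10% empty, 30% conflicts.
avg = expected_latency({"hit": 0.6, "empty": 0.1, "conflict": 0.3})  # 34 ns
```

Queuing and transfer delays would add on top of this, as the slide notes.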

Reads and Writes

- A single bus is used for both reads and writes
- The bus direction must be reversed when switching between reads and writes; this takes time and leads to bus idling
- Hence, writes are performed in bursts: a write buffer stores pending writes until a high water mark is reached
- Writes are then drained until a low water mark is reached

Open/Closed Page Policies

- If an access stream has locality, the row buffer is kept open (open-page policy)
  - Row buffer hits are cheap
  - A row buffer miss is a bank conflict and is expensive because the precharge is on the critical path
- If an access stream has little locality, the bitlines are precharged immediately after an access (close-page policy)
  - Nearly every access is a row buffer miss
  - The precharge is usually not on the critical path
- Modern memory controller policies lie somewhere between these two extremes (and are usually proprietary)

Address Mapping Policies

- Consecutive cache lines can be placed in the same row to boost row buffer hit rates
- Consecutive cache lines can be placed in different ranks to boost parallelism
- Example address mapping policies:
  - row:rank:bank:channel:column:blkoffset
  - row:column:rank:bank:channel:blkoffset
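To make the first example policy concrete, here is a hedged sketch that decodes a physical address under row:rank:bank:channel:column:blkoffset; the field widths are assumptions chosen for illustration (64B blocks, 8KB rows, 4 channels, 8 banks, 2 ranks).

```python
# Decoding a physical address under row:rank:bank:channel:column:blkoffset.
FIELDS = [             # (name, bits), listed from least-significant upward
    ("blkoffset", 6),  # 64B cache blocks
    ("column", 7),     # 128 blocks per row (8KB row / 64B block)
    ("channel", 2),    # 4 channels
    ("bank", 3),       # 8 banks per rank
    ("rank", 1),       # 2 ranks
    ("row", 15),       # 32K rows
]

def decode(addr):
    out = {}
    for name, bits in FIELDS:
        out[name] = addr & ((1 << bits) - 1)
        addr >>= bits
    return out

# Consecutive cache lines differ only in the low column bits, so under
# this policy they land in the same row (favoring row-buffer hits).
a, b = decode(0x1000), decode(0x1040)
```

Putting rank/bank/channel bits in the low positions instead (as in the second example policy) would spread consecutive lines across channels and banks, trading hit rate for parallelism.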

Scheduling Policies

- FCFS: issue the first read or write in the queue that is ready for issue
- First Ready - FCFS (FR-FCFS): first issue row buffer hits if you can
- Stall Time Fair: first issue row buffer hits, unless other threads are being neglected
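A minimal FR-FCFS picker can be sketched as follows; this is an illustration of the policy named above, not a real controller (the request tuple format is my own).

```python
# FR-FCFS: among queued requests, prefer row-buffer hits; among those
# (or among all requests if there are no hits), pick the oldest.
def fr_fcfs(queue, open_rows):
    """queue: list of (arrival_time, bank, row) requests.
    open_rows: dict mapping bank -> currently open row."""
    hits = [r for r in queue if open_rows.get(r[1]) == r[2]]
    candidates = hits if hits else queue
    return min(candidates, key=lambda r: r[0]) if candidates else None

queue = [(0, 0, 5), (1, 0, 7), (2, 0, 7)]
# Bank 0 has row 7 open: the older hit (arrival 1) is picked ahead of
# the oldest request (arrival 0), which would be a row-buffer conflict.
pick = fr_fcfs(queue, {0: 7})
```

Note the fairness hazard this illustrates: the conflict request at arrival 0 can starve behind a stream of hits, which is exactly what Stall Time Fair scheduling guards against.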

Refresh

- Every DRAM cell must be refreshed within a 64 ms window
- A row read/write automatically refreshes that row
- Each refresh command refreshes a number of rows; the memory system is unavailable during that time
- A refresh command is issued by the memory controller once every 7.8 us on average
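The 7.8 us figure follows from simple division; the 8192-command count below is a common DDRx value assumed here to show where the number comes from.

```python
# Where the ~7.8 us average refresh interval (tREFI) comes from:
# all rows must be refreshed within the 64 ms retention window,
# spread across 8192 refresh commands (a typical DDRx count).
RETENTION_MS = 64
REFRESH_COMMANDS = 8192

tREFI_us = RETENTION_MS * 1000 / REFRESH_COMMANDS   # 7.8125 us
```

Each of those commands refreshes a batch of rows, during which the bank (or the whole rank) is unavailable.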

Error Correction

- For every 64-bit word, an 8-bit code can be added that detects two errors and corrects one error; this is referred to as SECDED (single error correct, double error detect)
- A rank is then made up of 9 x8 chips instead of 8 x8 chips
- Stronger forms of error protection exist: a system is chipkill correct if it can handle an entire DRAM chip failure
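To make the SECDED idea concrete, here is a toy (8,4) code, a Hamming(7,4) code plus an overall parity bit; real DRAM uses a (72,64) code with the same structure, and this small version is only an illustration of the correct-one/detect-two behavior.

```python
# Toy SECDED: Hamming(7,4) plus an overall parity bit.
def encode(d):                       # d: a 4-bit data value
    d = [(d >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]          # parity over positions 1,3,5,7
    p2 = d[0] ^ d[2] ^ d[3]          # parity over positions 2,3,6,7
    p4 = d[1] ^ d[2] ^ d[3]          # parity over positions 4,5,6,7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]      # positions 1..7
    overall = 0
    for b in bits:
        overall ^= b
    return bits + [overall]          # position 8: overall parity

def decode(bits):
    s = 0                            # syndrome = position of a single flip
    for pos in range(1, 8):
        if bits[pos - 1]:
            s ^= pos
    overall = 0
    for b in bits:
        overall ^= b
    if s and overall:                # single error: correct it
        bits[s - 1] ^= 1
    elif s and not overall:          # double error: detected, uncorrectable
        return None
    d = [bits[2], bits[4], bits[5], bits[6]]
    return d[0] | d[1] << 1 | d[2] << 2 | d[3] << 3
```

Scaling this up, the (72,64) code's 8 extra bits per 64-bit word are exactly one additional x8 chip per rank, which is why ECC DIMMs carry 9 chips instead of 8.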

Modern Memory System

[Figure: processor connected to 4 DDR3 channels]

- 4 DDR3 channels
- 64-bit data channels
- 800 MHz channels
- 1-2 DIMMs per channel
- 1-4 ranks per channel

Cutting-Edge Systems

[Figure: processor connected by narrow links to Scalable Memory Buffer (SMB) chips, each fanning out to DDR3 channels]

- The link into the processor is narrow and high frequency
- The Scalable Memory Buffer chip is a router that connects to multiple DDR3 channels (wide and slow)
- Boosts processor pin bandwidth and memory capacity
- More expensive, higher power
