ECE 485/585 Microprocessor System Design


Microprocessor System Design
Lecture 7: Memory Modules, Error Correcting Codes, Memory Controllers
Zeshan Chishti
Electrical and Computer Engineering Dept., Maseeh College of Engineering and Computer Science
Source: Lecture based on materials provided by Mark F.

Memory Modules
[Figure: 184-pin DDR SDRAM DIMM]
- All chips in a rank receive the same address and control signals
- Each chip is responsible for a subset of the data bits in its rank
- Module acts as a high-capacity DRAM with a wide data path
  - Example: 8 chips, each 8 bits wide = 64 bits
- Easy to add/replace memory in a system
  - No need to solder or remove individual chips
- Memory granularity issue
  - What's the smallest increment in memory size?
From Hsien-Hsin Sean Lee, Georgia Institute of Technology

DRAM Ranks

Organization of DRAM Modules

Memory Modules
- SIMM (Single Inline Memory Module)
  - 30-pin: some 286, most 386, some 486 systems
    - Page Mode, Fast Page Mode devices
  - 72-pin: some 386, most 486, nearly all Pentium (before DIMM)
    - Fast Page Mode, EDO devices
- DIMM (Dual Inline Memory Module)
  - Dominant today
- SODIMM (Small Outline DIMM)
  - Used in notebooks, Apple iMac
- RIMM (Rambus RDRAM Module)
[Figure labels: SIMM; 168-pin SDRAM DIMM; 184-pin DDR SDRAM DIMM; SODIMM; 240-pin DDR2, DDR3 SDRAM DIMM; 200-pin DDR2, DDR3 SDRAM DIMM; RIMM]

SPD (Serial Presence Detect)
- 8-pin serial EEPROM on memory module
- Key parameters for SDRAM controller
  - Number of row/column addresses
  - Number of ranks
  - Module width
  - Refresh rate/type
  - Error checking (none, parity, ECC)
  - Latency
  - Timing parameters

DRAM and DIMM Nomenclature
Device name  Clock     M transfers/s  MB/s per DIMM  DIMM name
DDR200       100 MHz     200           1,600         PC-1600
DDR266       133 MHz     266           2,133         PC-2100
DDR333       166 MHz     333           2,666         PC-2700
DDR400       200 MHz     400           3,200         PC-3200
DDR2-400     200 MHz     400           3,200         PC2-3200
DDR2-533     266 MHz     533           4,266         PC2-4200
DDR2-667     333 MHz     666           5,333         PC2-5300
DDR2-800     400 MHz     800           6,400         PC2-6400
DDR2-1066    533 MHz    1066           8,533         PC2-8500
DDR3-800     400 MHz     800           6,400         PC3-6400
DDR3-1066    533 MHz    1066           8,533         PC3-8500
DDR3-1333    666 MHz    1333          10,666         PC3-10600
DDR3-1600    800 MHz    1600          12,800         PC3-12800
DDR3-1866    933 MHz    1866          14,928         PC3-14900
- M transfers/second = 2 transfers (DDR) x clock rate
  - DRAM name incorporates M transfers per second
- MB/sec = 8 bytes x M transfers per second
  - DIMM name incorporates MB/sec (rounded)
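The two naming rules above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a 64-bit (8-byte) module data path and nominal integer clock rates; the function name `ddr_rates` is made up for illustration.

```python
# Peak-rate arithmetic behind DDR device and module names.
# Assumption: 8 bytes per transfer (64-bit DIMM data path), 2 transfers per clock.

def ddr_rates(clock_mhz, bytes_per_transfer=8, transfers_per_clock=2):
    """Return (mega-transfers per second, peak MB/s) for a DDR bus clock."""
    mtps = clock_mhz * transfers_per_clock
    mb_per_s = mtps * bytes_per_transfer
    return mtps, mb_per_s

for name, clock in [("DDR400", 200), ("DDR2-800", 400), ("DDR3-1600", 800)]:
    mtps, mbps = ddr_rates(clock)
    print(f"{name}: {mtps} MT/s, {mbps} MB/s")
```

Rows such as DDR266 use a clock of 133.33 MHz, so the table's 2,133 MB/s comes from the unrounded clock rather than from 133 x 2 x 8.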

DRAM/SDRAM Latency Specifications
- DRAM
  - Used 4 numbers (e.g. 4-1-1-1)
  - Indicates number of CPU cycles for 1st and successive accesses
- SDRAM
  - CAS Latency (CAS or CL)
    - Delay in clock cycles between the request and the time the first data is available
    - A PC133 module might be described as CAS-2, CAS=2, CL2, CL-2, or CL=2
  - SDR-DRAM: CAS Latency of 1, 2, or 3
  - DDR-DRAM: CAS Latency of 2 or 2.5
- When three numbers appear (e.g. 3-2-2)
  - CAS Latency (tCAC)
  - RAS-to-CAS delay (tRCD)
  - RAS pre-charge time (tRP)
- DDR3 seeing use of four numbers (e.g. 3-3-3-10 timing)
  - CAS Latency (tCAS, tCL, CL)
  - RAS-to-CAS delay (tRCD)
  - RAS pre-charge time (tRP)
  - RAS access time (tRAS)

Key SDRAM Timing Parameters
- Determines latency:
  - tRCD: Minimum time between an ACTIVE command and a READ command
  - CL (CAS Latency): Time between READ command and first data valid
- Determines bandwidth:
  - tRC: Time between successive row accesses to different rows (tRC = tRAS + tRP)
  - tRAS: Time between ACTIVE command and end of restoration of data in DRAM array
  - tRP: Time to pre-charge DRAM array in preparation for another row access

EX: Comparing Performance of DIMMs
Parameter            SDRAM spec  PC3-12800 DIMM        PC3-14900 DIMM
Clock period         tCK         1/800 MHz = 1.25 ns   1/933 MHz = 1.07 ns
CAS latency          CL          9                     9
RAS-to-CAS delay     tRCD        9                     9
RAS pre-charge time  tRP         9                     9
RAS access time      tRAS        27                    27
Cost/pair            $           176                   196
- Best bandwidth/$:
  - tRC = tRAS + tRP = 27 + 9 = 36 cycles (for both DIMMs)
  - 14900/12800 = 1.16, 196/176 = 1.11, so 16% bandwidth gain for an 11% increase in cost
  - I'd buy the PC3-14900 DIMMs
- Time from ACTIVE to end of cycle (PC3-12800):
  - Time to first byte (latency) = tRCD + CL = 9 + 9 = 18 cycles
  - Time to get 8 bytes of data (burst size = 8, DDR) = 4 cycles
  - Total time = (18 + 4) * 1.25 ns = 27.5 ns
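The comparison above can be reproduced for both modules with a short calculation. This is a sketch under the example's own assumptions (9-9-9 timings, a burst of 8 transfers at two transfers per clock); the helper name `access_time_ns` is hypothetical.

```python
# ACTIVE-to-end-of-burst time and bandwidth per dollar for the two DIMMs.

def access_time_ns(clock_mhz, t_rcd, cl, burst_length=8):
    """(tRCD + CL + burst cycles) * clock period, in nanoseconds."""
    tck_ns = 1000.0 / clock_mhz          # clock period in ns
    burst_cycles = burst_length // 2     # DDR: two transfers per clock
    return (t_rcd + cl + burst_cycles) * tck_ns

pc3_12800_ns = access_time_ns(800, t_rcd=9, cl=9)
pc3_14900_ns = access_time_ns(933, t_rcd=9, cl=9)

# Peak MB/s per dollar, using the cost per pair from the table.
mb_per_dollar_12800 = 12800 / 176
mb_per_dollar_14900 = 14900 / 196

print(f"PC3-12800: {pc3_12800_ns:.2f} ns, {mb_per_dollar_12800:.1f} MB/s per $")
print(f"PC3-14900: {pc3_14900_ns:.2f} ns, {mb_per_dollar_14900:.1f} MB/s per $")
```

With identical cycle counts, the faster clock wins on both latency in nanoseconds and bandwidth per dollar, which matches the slide's conclusion.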

DDR4
- JEDEC released standard September 2012
- Projected to be ~50% of market by 2015-2016
- Hynix announced 128 GB module using 8 Gb DDR4 in April 2014
- AMD (Hierofalcon), Intel (Haswell-E) supporting DDR4 in 2014
- No longer multi-drop: point-to-point with a single DIMM per channel
- 288-pin DIMM interface

Error Correcting Codes

Error Correction Motivation
- Failures/time proportional to number of bits
- As DRAM cell sizes and voltages shrink, cells become more vulnerable
- Why was/is this not an issue on your PC?
  - Failure rate was low
  - Few consumers would know what to do anyway
  - DRAM banks too large: so much memory that not likely to encounter an error
- Servers (always) correct memory system errors (e.g. usually use ECC)
- Sources
  - Alpha particles (impurities in IC manufacturing)
  - Cosmic rays (vary with altitude)
    - Bigger problem in Denver and on space-bound electronics
  - Noise
- Need to handle failures throughout memory subsystem
  - DRAM chips, module, bus
- DRAM chips don't incorporate ECC
  - Store the ECC bits in DRAM alongside the data bits
  - Chipset (or integrated controller) handles ECC

Error Detection: Parity [from Bruce Jacob]
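A single parity bit is the simplest form of error detection. The sketch below assumes even parity over a short list of bits; the function names are illustrative and not taken from the slide's figure.

```python
# Even parity: the parity bit is the XOR of all data bits, so data plus
# parity always contains an even number of 1s.

def even_parity_bit(bits):
    p = 0
    for b in bits:
        p ^= b
    return p

def parity_ok(bits, parity):
    """True if the stored parity still matches the data bits."""
    return even_parity_bit(bits) == parity

data = [1, 0, 1, 1]
p = even_parity_bit(data)            # 1 ^ 0 ^ 1 ^ 1 = 1

one_flip = [1, 1, 1, 1]              # bit 1 flipped
two_flips = [0, 1, 1, 1]             # bits 0 and 1 flipped

print(parity_ok(data, p), parity_ok(one_flip, p), parity_ok(two_flips, p))
```

Parity detects any odd number of flipped bits but cannot locate them, and an even number of flips goes unnoticed, which is the motivation for the correcting codes that follow.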

Error Correction Codes (ECC)
- Single-bit error correction requires n+1 check bits for 2^n data bits
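For 2^2 = 4 data bits the rule gives 2 + 1 = 3 check bits, which is the classic Hamming(7,4) code. The sketch below uses the textbook layout (check bits at positions 1, 2 and 4); it is an illustration of the idea, not necessarily the exact code shown in the lecture's figures.

```python
# Hamming(7,4): 4 data bits + 3 check bits, corrects any single-bit error.
# Bit positions are numbered 1..7; check bits sit at positions 1, 2, 4.

def hamming74_encode(d):
    """d: list of 4 data bits -> list of 7 code bits (positions 1..7)."""
    code = [0] * 8                      # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]

def hamming74_correct(word):
    """Return (data bits, syndrome); syndrome is the flipped position or 0."""
    code = [0] + list(word)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions holding a 1
    if syndrome:
        code[syndrome] ^= 1             # flip the faulty bit back
    return [code[3], code[5], code[6], code[7]], syndrome

sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1                        # corrupt position 5 (list index 4)
data, syndrome = hamming74_correct(received)
print(sent, received, data, syndrome)
```

The syndrome directly names the faulty position, which is why the number of check bits must grow with the logarithm of the word length.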

Error Correction Codes (ECC)
[Figure: check-bit computation example; surviving text: 1^0^0^0 = 1]

Error Correction Codes (ECC)
[Figure: an example of decoding and verifying, Sent -> Recv'd; surviving text: 1^0^0^0 = 1]

Error Correction Codes (ECC)
- Add another check bit
- SECDED (Single Error Correction, Double Error Detection) requires n+2 check bits for 2^n data bits

Error Correction Codes (ECC)
- 64-bit data path + 8 bits ECC stored to DRAM module
[from Bruce Jacob]
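The 8 ECC bits per 64 data bits follow from the n+2 rule with n = 6. A small sketch, assuming the standard Hamming condition 2^r >= data bits + r + 1 for single error correction; the function names are made up for illustration.

```python
# Number of check bits needed for SEC and SECDED over a given data width.

def sec_check_bits(data_bits):
    """Smallest r such that 2**r >= data_bits + r + 1."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r

def secded_check_bits(data_bits):
    """SECDED adds one overall parity bit on top of the SEC code."""
    return sec_check_bits(data_bits) + 1

for width in (8, 16, 32, 64, 128):
    print(width, sec_check_bits(width), secded_check_bits(width))
```

For a 64-bit data path this yields 7 check bits for SEC and 8 for SECDED, a 12.5% storage overhead, which is why ECC DIMMs are 72 bits wide.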

Memory Controllers

Memory Controllers
- Handle the actual interface to memory
  - Determine memory configuration/capability
  - Memory timing/signal interface
  - Address mapping: physical address to memory topology
  - Error correction
  - Scheduling
  - Refresh
- WAS in North Bridge of chipset
  - Intel prior to Nehalem: MCH (Memory Controller Hub)
  - Isolates the microprocessor from memory technology/device changes
- IS integrated with microprocessor
  - AMD, Intel Nehalem
  - Low latency for high performance
  - Opens possibility for processor-directed hints

Address Mapping
[Figure: dual channels and memory modules; physical address fields: Channel ID, Rank, Row, Bank, Column]

Address Mapping (cont'd)
[Figure: dual channels and memory modules; physical address fields: Channel ID, Rank, Row, Bank, Column]
- Channel
  - Physical path between CPU and memory
- Rank
  - Group of DRAM chips operating in lockstep
  - Same address, control, CS
  - Responsible for subset of same word
- Bank
  - Set of independent memory arrays in DRAM chip
- Row/Column
  - Address of bit cell in a bank
  - May be several planes to achieve n bits wide
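One way to picture the mapping is as slicing the physical address into bit fields. The sketch below is purely illustrative: the field order and widths are assumptions chosen to add up to a 32-bit address, not values from the lecture, and real controllers often interleave or hash these bits differently.

```python
# Hypothetical physical-address layout, least significant field first.
# Widths are assumptions for illustration only.
FIELDS = [
    ("offset", 3),    # byte within a 64-bit (8-byte) bus word
    ("column", 10),
    ("bank", 3),
    ("row", 14),
    ("rank", 1),
    ("channel", 1),
]

def decode(addr):
    """Split a physical address into its DRAM topology fields."""
    fields = {}
    for name, width in FIELDS:
        fields[name] = addr & ((1 << width) - 1)
        addr >>= width
    return fields

def encode(fields):
    """Inverse of decode: pack the fields back into one address."""
    addr = 0
    for name, width in reversed(FIELDS):
        addr = (addr << width) | (fields[name] & ((1 << width) - 1))
    return addr

example = {"offset": 0, "column": 7, "bank": 2, "row": 5, "rank": 0, "channel": 1}
addr = encode(example)
print(hex(addr), decode(addr))
```

Placing the column bits lowest means consecutive addresses fall in the same row, which favors open-page policies; other orderings trade that locality for more bank or channel parallelism.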

Memory Scheduling
- Memory transactions: read, write
- DRAM commands: refresh, activate, read, write, precharge
- Memory scheduling policy
  - Handle transaction requests
    - Possibly from different cores
    - Refresh
  - Prioritize low/high priority
    - CPU cache line fill request
    - Prefetch
  - Prioritize read over write
  - Re-order to take advantage of open page in bank
- Page policy
  - Open page
  - Close page

Memory Scheduling
[Figure: timelines for eight requests, labeled (bank, row, col), arriving in the order (0,0,0), (0,1,0), (0,0,1), (0,1,3), (1,0,0), (1,1,1), (1,0,1), (1,1,2)]
- Without access scheduling: 56 DRAM cycles (every request performs P, A, C)
- With access scheduling: 19 DRAM cycles (requests to the same row share one P and A; commands to different banks overlap)
- DRAM commands
  - P: bank precharge (3 cycles)
  - A: row activation (3 cycles)
  - C: column access (1 cycle)
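The 56-cycle figure, and part of the gain from reordering, can be reproduced with a deliberately simplified model. The sketch below assumes commands are fully serialized (no overlap between banks), so it only captures the row-hit savings; the function name `serialized_cycles` is hypothetical.

```python
# Simplified cost model: commands never overlap, one request at a time.
PRECHARGE, ACTIVATE, COLUMN = 3, 3, 1

# (bank, row, col) requests in arrival order, as in the figure.
REQUESTS = [(0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 3),
            (1, 0, 0), (1, 1, 1), (1, 0, 1), (1, 1, 2)]

def serialized_cycles(requests, keep_rows_open):
    """Total cycles when each request runs to completion before the next."""
    open_row = {}                       # bank -> currently open row
    cycles = 0
    for bank, row, _col in requests:
        if keep_rows_open and open_row.get(bank) == row:
            cycles += COLUMN            # row hit: column access only
        else:
            cycles += PRECHARGE + ACTIVATE + COLUMN
            open_row[bank] = row
    return cycles

in_order = serialized_cycles(REQUESTS, keep_rows_open=False)

# Group requests that target the same (bank, row); sorted() is stable.
grouped = sorted(REQUESTS, key=lambda r: (r[0], r[1]))
reordered = serialized_cycles(grouped, keep_rows_open=True)

print(in_order, reordered)
```

Grouping by row alone brings the count from 56 down to 32 in this model; the remaining improvement to the 19 cycles in the figure comes from overlapping precharge and activation in one bank with accesses to the other, which this serialized sketch does not model.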

Memory Access to Idle Bank

Memory Access to Active Page (Open Bank)

Memory Access to New Page (Open Bank)

Open page vs. close page policy
- Open page policy:
  - Row hit latency: tCL + tBURST
  - Row miss latency: tRP + tRCD + tCL + tBURST
- Close page policy:
  - Row is closed after every access => no row hits
  - Latency: tRCD + tCL + tBURST (slower than open page row hits but faster than open page row misses)
- Assume that a fraction n of the accesses are row hits with the open page policy; then the break-even point for leaving the page open (or closing it) is where the two average latencies are equal (tBURST appears on both sides and cancels):
  tRCD + tCL = (n * tCL) + ((1 - n) * (tRP + tRCD + tCL))
  n = tRP / (tRP + tRCD)
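The break-even formula can be checked numerically. This is a minimal sketch in which timings are given in clock cycles and the 9-9-9 values with a 4-cycle burst are borrowed from the earlier DIMM example as an assumption; the function names are illustrative.

```python
# Average access latency under open-page and close-page policies.

def break_even_hit_rate(t_rp, t_rcd):
    """Row-hit fraction at which both policies have equal average latency."""
    return t_rp / (t_rp + t_rcd)

def open_page_latency(hit_rate, t_rp, t_rcd, t_cl, t_burst):
    hit = t_cl + t_burst
    miss = t_rp + t_rcd + t_cl + t_burst
    return hit_rate * hit + (1 - hit_rate) * miss

def close_page_latency(t_rcd, t_cl, t_burst):
    return t_rcd + t_cl + t_burst

T_RP, T_RCD, T_CL, T_BURST = 9, 9, 9, 4      # assumed example timings (cycles)

n = break_even_hit_rate(T_RP, T_RCD)
print(n,
      open_page_latency(n, T_RP, T_RCD, T_CL, T_BURST),
      close_page_latency(T_RCD, T_CL, T_BURST))
```

With tRP equal to tRCD the break-even hit rate is 50%: above it the open page policy has the lower average latency, below it the close page policy does.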