ECE 485/585 Microprocessor System Design


Microprocessor System Design
Lecture 4: Memory Hierarchy, Memory Taxonomy, SRAM Basics, Memory Organization, DRAM Basics
Zeshan Chishti
Electrical and Computer Engineering Dept
Maseeh College of Engineering and Computer Science
Source: Lecture based on materials provided by Mark F.

Memory "640K ought to be enough for anybody." -- Bill Gates, 1981

Outline
Taxonomy of Memories
Memory Hierarchy
SRAM
  Basic Cell, Devices, Timing
Memory Organization
  Multiple banks, interleaving
DRAM
  Basic Cell, Timing
DRAM Evolution
DRAM modules
Error Correction
Memory Controllers

Memory Taxonomy
Read/Write Memory
  Volatile
    Random Access: SRAM, DRAM
    Non-Random Access: Shift Register, FIFO, CAM
  Non-Volatile: EPROM, E2PROM, Flash (NAND, NOR), NVRAM
Read Only: Mask ROM, PROM

Computer Memory Hierarchy
From Hennessy & Patterson, Computer Architecture: A Quantitative Approach (4th edition)
[Figure: memory hierarchy levels and what each one holds]
Levels: Processor (Datapath, Control, Registers), On-Chip Cache, Second Level Cache (SRAM), Cached DRAM, Third Level Cache (SRAM), Main Memory (DRAM), Secondary Storage (Disk), Tertiary Storage (Tape)
Contents labeled in the figure: Intermediate results, Instructions, Data, File System, Paging, [Cached Files], Archive, Backup

Register Files
General Purpose Registers
Usually have multiple ports
  Support the CPU architecture's datapaths
  Ability to read two operands, write one
Operate at CPU speed
For read operations, the register file is equivalent to a 2-D array of flip-flops with tri-state outputs
For write operations, we add some additional circuitry to the basic cell
[Figure: Register File block with select inputs sel a, sel b, sel c and data ports data a, data b, data c]

Address Decoding
Address decoder generates a one-hot code (1-of-n code) from the address (binary to unary)
The output is used for row selection
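The binary-to-unary conversion can be sketched in a few lines of Python. This is only a behavioral illustration, not a gate-level design; the function name `decode_one_hot` and its parameters are made up for this example.

```python
def decode_one_hot(address: int, n_address_bits: int) -> list[int]:
    """Return 2**n_address_bits row-select lines with exactly one line set to 1."""
    n_rows = 1 << n_address_bits
    if not 0 <= address < n_rows:
        raise ValueError("address does not fit in the given number of bits")
    # Line i is asserted only when i equals the binary address.
    return [1 if row == address else 0 for row in range(n_rows)]


if __name__ == "__main__":
    # A 2-bit address selects one of 4 rows.
    print(decode_one_hot(2, 2))  # [0, 0, 1, 0]
```

Note that the number of select lines doubles with every extra address bit, which is one reason large decoders become slow.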

Accessing Register Files
Read is asynchronous (address following)
  Change address
  Data from new address appears on output
Write is synchronous
  If WE, input data is written to selected word on the clock edge
[Figure: Register File block with inputs Clock, RegID, WE, Din and output Dout. In the timing diagram RegID changes from X to Y and Dout follows from R[X] to R[Y]; Din and WE are marked valid ("val") for the write]
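A minimal behavioral sketch of this read/write behavior, assuming a hypothetical `RegisterFile` class in which a method call stands in for the clock edge:

```python
class RegisterFile:
    """Behavioral model: asynchronous read, synchronous (clocked) write."""

    def __init__(self, n_regs: int = 8) -> None:
        self.regs = [0] * n_regs

    def read(self, reg_id: int) -> int:
        # Address following: the output reflects whatever register is addressed,
        # with no clock involved.
        return self.regs[reg_id]

    def clock_edge(self, reg_id: int, din: int, we: bool) -> None:
        # Storage changes only on a clock edge, and only if WE is asserted.
        if we:
            self.regs[reg_id] = din


if __name__ == "__main__":
    rf = RegisterFile(4)
    rf.clock_edge(1, 42, we=False)   # WE low: nothing is written
    print(rf.read(1))                # 0
    rf.clock_edge(1, 42, we=True)    # WE high: Din is captured on the edge
    print(rf.read(1))                # 42
```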

Multi-ported Register File
A memory unit with two output ports is said to be dual ported
Two ways to implement a dual-ported register file
  True ports: Single set of registers with duplicate data paths and access circuitry that enables two registers to be read at a time
  Two copies: Use 2 memory blocks, each containing one copy of the register file
    To read two registers, one register can be accessed from each file
    To write a register, data needs to be written to both copies of that register
[Figure: Input Data C and Address C feed both Register File A and Register File B; Address A reads Register File A and Address B reads Register File B, each producing Output Data]
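The "two copies" approach can be sketched as follows. The class and method names are illustrative assumptions; the point is only that a write goes to both copies while each read port uses its own copy.

```python
class TwoCopyRegisterFile:
    """Dual-read-port register file built from two single-port copies."""

    def __init__(self, n_regs: int = 8) -> None:
        self.copy_a = [0] * n_regs  # serves read port A
        self.copy_b = [0] * n_regs  # serves read port B

    def write(self, address_c: int, data_c: int) -> None:
        # Both copies must be updated so they always hold identical contents.
        self.copy_a[address_c] = data_c
        self.copy_b[address_c] = data_c

    def read(self, address_a: int, address_b: int) -> tuple[int, int]:
        # Two registers are read at once, one from each copy.
        return self.copy_a[address_a], self.copy_b[address_b]


if __name__ == "__main__":
    rf = TwoCopyRegisterFile()
    rf.write(3, 7)
    rf.write(5, 9)
    print(rf.read(3, 5))  # (7, 9)
```

The cost of this design is double the storage; the true-ports design instead pays in extra wiring and access circuitry per cell.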

Static RAMs (SRAM)

SRAM Technology
[Figure: 6-transistor SRAM cell connected to a word line and a pair of complementary bit lines (bit and bit-bar)]
Write
  Write bit and its complement (bit-bar) onto the bit lines
  Select desired word ("row")
  Turns on pass transistors
  Writes new value to cell [One inverter input will be low, turning its output high]
Read
  Select desired word ("row")
  One bit line will be pulled low
  Other will remain high
For density and low power, want tiny transistors, but they can't drive long bit lines
  Solution: Pre-charge bit lines (Vdd/2) before read
  Sense differential between bit and bit-bar

Dual-ported Memory Internals
Add decoder, another set of read/write logic, bit lines, word lines
Example cell: SRAM
Repeat everything but the cross-coupled inverters
This scheme extends up to a couple more ports, then need to add additional transistors
[Figure: cell array with two decoders (dec a, dec b), two word lines per row (WL1, WL2), two bit-line pairs per column (b1, b2), and two sets of r/w logic serving the address ports and data ports]

Basic SRAM
Size in bits (organization)
  1Mb (256K x 4): 256K words of 4 bits
  1Mb (128K x 8): 128K words of 8 bits
Most control signals are active Low
  Chip Select (/CS): effectively an enable
  Write Enable (/WE): controls read/write
To perform a write
  /WE is asserted (Low)
  /CS is asserted (Low)
To perform a read
  /WE is de-asserted (High)
  /CS is asserted (Low)
[Figure: 2^n x b RAM with address inputs A0 to An-1, data inputs DIN0 to DINb-1, data outputs DOUT0 to DOUTb-1, and control inputs /CS and /WE]
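The organization arithmetic and the active-low controls can be illustrated with a small behavioral model. `BasicSram` and its method names are assumptions made for this sketch; signals are passed as 0 (Low, asserted) or 1 (High, de-asserted), and timing is ignored.

```python
from typing import Optional


class BasicSram:
    """Behavioral 2**n x b SRAM with active-low /CS and /WE."""

    def __init__(self, n_address_bits: int, bits_per_word: int) -> None:
        self.depth = 1 << n_address_bits
        self.bits_per_word = bits_per_word
        self.cells = [0] * self.depth

    def size_in_bits(self) -> int:
        return self.depth * self.bits_per_word

    def access(self, address: int, cs_n: int, we_n: int, din: int = 0) -> Optional[int]:
        if cs_n == 1:
            return None                      # chip not selected: no operation
        if we_n == 0:
            mask = (1 << self.bits_per_word) - 1
            self.cells[address] = din & mask  # write: /CS low, /WE low
            return None
        return self.cells[address]           # read: /CS low, /WE high


if __name__ == "__main__":
    ram = BasicSram(n_address_bits=18, bits_per_word=4)  # 256K x 4
    print(ram.size_in_bits() == 1 << 20)                  # True: 1 Mb
    ram.access(5, cs_n=0, we_n=0, din=0b1010)             # write
    print(ram.access(5, cs_n=0, we_n=1))                  # 10
```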

SRAM Variations
Dedicated Din & Dout
  Trade pin count ($) for higher performance
  No bidirectional turnaround time required
Din & Dout often combined to save pins ($)
  A new control signal, Output Enable (/OE)
[Figure: two 2^n x b RAMs. One has separate DIN0 to DINb-1 and DOUT0 to DOUTb-1 pins with /CS and /WE; the other has shared data pins D0 to Db-1 with /CS, /OE and /WE]

Simplified SRAM timing diagram
Read: Valid address, then /CS (Chip Select) asserted
  Access Time: Address good to data valid
  Cycle Time: Minimum time between subsequent memory operations
Write: Valid address and data with /WE asserted, then /CS asserted
  Address must be stable for a setup time before /WE and /CS go low
  Address must also be held for a hold time after one of the signals goes high

Internal SRAM Organization (16x4)
[Figure: 16 words (Word 0 to Word 15) of four SRAM cells each. An address decoder driven by A0 to A3 selects one word. Din 3 to Din 0 pass through write drivers gated by WriteEnable; a sense amp on each column produces Dout 3 to Dout 0]

Example: Cypress SRAM
Note address following mode
Key SRAM timing parameters
  tAA, address access time: time between a valid address being applied and valid data being available on the data outputs
  tRC, read cycle time: minimum time that one address must be held on the address lines before a second address can be presented
tAA represents latency
tRC represents bandwidth (throughput)
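To see why tAA is a latency figure and tRC a throughput figure, consider back-to-back reads. The sketch below assumes a new address is presented every tRC and each data word becomes valid tAA after its address; the function names and the 10 ns values are hypothetical and are not taken from any Cypress datasheet.

```python
def peak_read_rate_words_per_s(t_rc_ns: float) -> float:
    """Upper bound on reads per second when a new address is applied every tRC."""
    return 1e9 / t_rc_ns


def time_for_back_to_back_reads_ns(n_reads: int, t_aa_ns: float, t_rc_ns: float) -> float:
    """Time from the first address until the last data word is valid."""
    # The last address is presented after (n_reads - 1) read cycles,
    # and its data appears one access time later.
    return (n_reads - 1) * t_rc_ns + t_aa_ns


if __name__ == "__main__":
    T_AA_NS = 10.0  # hypothetical value
    T_RC_NS = 10.0  # hypothetical value
    print(peak_read_rate_words_per_s(T_RC_NS))
    print(time_for_back_to_back_reads_ns(4, T_AA_NS, T_RC_NS))
```

For a single read only tAA matters; for a long burst the total time is dominated by the tRC term.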

What happens as the number of bits increases?
  Decoder gets larger and slower
  Bit lines increase in length
    Large distributed RC load
    Compensate with larger, slower transistors
Remember
  Treat output as differential signal
  Pre-charge both bit lines high
  Memory cell pulls only one low
  Sense bit value by comparing sense lines
Option: Make array shorter and wider!
[Figure: tall, thin array of n bits addressed by a log2(n)-bit address]

Inside a Tall Thin RAM is
[Figure: array of n = k x m bits. A log2(k)-bit row address selects a row, sense amps read its m columns, and a mux driven by a log2(m)-bit column address selects 1 data bit]
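The row/column split can be sketched as below. The slide does not say which address bits form the row and which the column; this example assumes the high-order bits select the row and the low-order bits drive the column mux. `split_address` is a hypothetical helper, and k and m are assumed to be powers of two.

```python
def split_address(address: int, k_rows: int, m_columns: int) -> tuple[int, int]:
    """Split a log2(k*m)-bit address into (row, column) for a k x m array."""
    for size in (k_rows, m_columns):
        if size <= 0 or size & (size - 1):
            raise ValueError("k and m must be powers of two")
    if not 0 <= address < k_rows * m_columns:
        raise ValueError("address out of range")
    column_bits = m_columns.bit_length() - 1   # log2(m)
    row = address >> column_bits                # assumed: upper log2(k) bits
    column = address & (m_columns - 1)          # assumed: lower log2(m) bits
    return row, column


if __name__ == "__main__":
    # 16 x 1 memory laid out as a 4 x 4 array: address 13 is binary 1101.
    print(split_address(13, 4, 4))  # (3, 1)
```

With this layout the row decoder needs only k outputs instead of n, at the price of a column mux after the sense amps.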

Replicate for Desired Width
[Figure: the same n = k x m bit array with a log2(k)-bit row address, sense amps, and a mux driven by a log2(m)-bit column address, replicated 4 times (1 data bit x 4) to give 4 data bits]

Physical SRAM Array Should Be Square
Example: 16 x 1 SRAM built as a 4 x 4 array
[Figure: 4 x 4 grid of cells, each with IN, SEL, WR and OUT pins. The address bits (A1, A0 and A3-A2) feed two 2-to-4 decoders; a 4-to-1 mux with select (S) and enable (E) inputs drives the output DO, with /OE as the output enable. DI is the data input; /WE and /CS are the control inputs]

Synchronous SRAM
So far we've been talking about SRAMs w/ asynchronous reads, but there are fully synchronous SRAMs
  Faster than asynchronous SRAMs but need to be clocked
Microprocessor manufacturers implement synchronous SRAMs for internal caches
FPGA manufacturers embed dedicated synchronous SRAM blocks in their FPGAs
  Provides Kb to Mb of RAM w/o using flip-flops in FPGA fabric
  Highly configurable (bit width, memory depth, parity/no parity, input/output latches, pipeline registers, etc.)
  Single cycle access up to speeds near max for FPGA, depending on FPGA family

Memory Subsystems

Memory Organization How do we build memory subsystems out of memory devices?

Making the Memory Deeper
256K x 8 Memory System: Use four 64K x 8 RAM chips
  256K requires 18 address lines
  16 shared address lines to array
  2 address lines decoded to provide /CS (one per chip)
  common R/W and tri-state data outputs
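The chip-select decoding can be sketched as follows, assuming the two decoded lines are the two high-order address bits (A17 and A16). The function name `decode_chip_select` is made up for this example.

```python
def decode_chip_select(address: int) -> tuple[list[int], int]:
    """Map an 18-bit system address to (active-low /CS per chip, 16-bit chip address)."""
    if not 0 <= address < (1 << 18):
        raise ValueError("address must fit in 18 bits")
    chip = (address >> 16) & 0b11           # assumed: top 2 bits pick the chip
    chip_address = address & 0xFFFF         # 16 lines shared by all four chips
    cs_n = [0 if i == chip else 1 for i in range(4)]  # exactly one /CS is Low
    return cs_n, chip_address


if __name__ == "__main__":
    cs_n, chip_address = decode_chip_select(0x2ABCD)
    print(cs_n, hex(chip_address))  # [1, 1, 0, 1] 0xabcd
```

Because only one chip is selected at a time, the tri-state data outputs of all four chips can share the same 8 data lines.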

Making the Memory Wider
64K x 16 Memory System: Use two 64K x 8 RAM chips
  16 shared address lines
  shared control signals
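A behavioral sketch of the wider memory: both chips see the same address and control signals, and each supplies half of the 16-bit word. Which chip holds the upper byte is an assumption of this example, as is the class name `Memory64Kx16`.

```python
class Memory64Kx16:
    """64K x 16 memory built from two 64K x 8 chips that share address and control."""

    DEPTH = 1 << 16

    def __init__(self) -> None:
        self.low_chip = [0] * self.DEPTH    # assumed to hold data bits 7..0
        self.high_chip = [0] * self.DEPTH   # assumed to hold data bits 15..8

    def write(self, address: int, word: int) -> None:
        # The same address goes to both chips; each stores its own byte.
        self.low_chip[address] = word & 0xFF
        self.high_chip[address] = (word >> 8) & 0xFF

    def read(self, address: int) -> int:
        # Both chips respond in parallel; their outputs form one 16-bit word.
        return (self.high_chip[address] << 8) | self.low_chip[address]


if __name__ == "__main__":
    mem = Memory64Kx16()
    mem.write(0x1234, 0xBEEF)
    print(hex(mem.read(0x1234)))  # 0xbeef
```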

Memory Interleaving
Access Pattern without Interleaving:
[Figure: CPU and a single Memory. Start Access for D1, D1 available, then Start Access for D2]
Access Pattern with 4-way Interleaving:
[Figure: CPU and Memory Banks 0 to 3. Access Bank 0, Access Bank 1, Access Bank 2, Access Bank 3, then "We can Access Bank 0 again"]

Memory Interleaving (cont'd)
for (i = 0; i < 16; i++) A[i] = A[i] * c + d;   (assume A[0] at address 0)
Reads issued: read 00000, read 00001, read 00002, read 00003, read 00004
Addresses held by each bank:
  Bank 0: 0, 4, 8, 12
  Bank 1: 1, 5, 9, 13
  Bank 2: 2, 6, 10, 14
  Bank 3: 3, 7, 11, 15
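The bank layout above corresponds to taking the address modulo the number of banks. A short sketch (the function name `bank_layout` is hypothetical) reproduces the table and shows that the loop's consecutive reads rotate through the banks:

```python
def bank_layout(n_words: int = 16, n_banks: int = 4) -> list[list[int]]:
    """Return, for each bank, the list of addresses it holds."""
    banks: list[list[int]] = [[] for _ in range(n_banks)]
    for address in range(n_words):
        banks[address % n_banks].append(address)  # consecutive addresses, consecutive banks
    return banks


if __name__ == "__main__":
    for bank, addresses in enumerate(bank_layout()):
        print(f"Bank {bank}: {addresses}")
    # Banks touched by the first five reads of the loop (addresses 0..4):
    print([address % 4 for address in range(5)])  # [0, 1, 2, 3, 0]
```

By the time the loop returns to Bank 0 for address 4, three other banks have been accessed in between, which gives Bank 0 time to finish its previous access.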

Memory Interleaving (cont'd)
Low Order Address Interleaving

Memory Interleaving (cont'd)
Low Order Address Interleaving w/ Byte Select
[Figure: address fields labeled Bank Select and Byte Select]
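One plausible field layout for low-order interleaving with a byte select is sketched below. The original diagram is not recoverable, so the field order (word within bank, then bank select, then byte select in the lowest bits) and the widths (4 banks, 4 bytes per word) are assumptions, as is the function name.

```python
def decode_low_order(address: int, bank_bits: int = 2, byte_bits: int = 2) -> tuple[int, int, int]:
    """Split a byte address into (word_in_bank, bank_select, byte_select)."""
    byte_select = address & ((1 << byte_bits) - 1)                 # lowest bits: byte in word
    bank_select = (address >> byte_bits) & ((1 << bank_bits) - 1)  # next bits: bank
    word_in_bank = address >> (byte_bits + bank_bits)              # remaining bits: row in bank
    return word_in_bank, bank_select, byte_select


if __name__ == "__main__":
    # 0x1D is binary 1_11_01: word 1, bank 3, byte 1.
    print(decode_low_order(0x1D))  # (1, 3, 1)
    # Consecutive words (byte addresses 0, 4, 8, 12) land in banks 0, 1, 2, 3.
    print([decode_low_order(a)[1] for a in (0, 4, 8, 12)])  # [0, 1, 2, 3]
```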

Memory Interleaving (cont'd)
High Order Address Interleaving

High Order Interleaving at Work
256K x 8 Memory System: Use four 64K x 8 RAM chips
  256K requires 18 address lines
  16 shared address lines to array
  2 address lines decoded to provide /CS (one per chip)
  common R/W and tri-state data outputs

Memory Interleaving (cont'd)
High Order Address Interleaving
[Figure: address fields labeled Bank Select and Byte Select]
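For comparison, high-order interleaving takes the bank select from the top of the address. The sketch below assumes a 20-bit byte address with 2 bank-select bits at the top and 2 byte-select bits at the bottom; these widths, the field order, and the function name are illustrative assumptions, since the original diagram is not recoverable.

```python
def decode_high_order(address: int, address_bits: int = 20,
                      bank_bits: int = 2, byte_bits: int = 2) -> tuple[int, int, int]:
    """Split a byte address into (bank_select, word_in_bank, byte_select)."""
    if not 0 <= address < (1 << address_bits):
        raise ValueError("address out of range")
    word_bits = address_bits - bank_bits - byte_bits
    byte_select = address & ((1 << byte_bits) - 1)                  # lowest bits
    word_in_bank = (address >> byte_bits) & ((1 << word_bits) - 1)  # middle bits
    bank_select = address >> (address_bits - bank_bits)             # highest bits
    return bank_select, word_in_bank, byte_select


if __name__ == "__main__":
    print(decode_high_order(0x80004))  # (2, 1, 0)
    # Consecutive words (byte addresses 0, 4, 8, 12) all stay in bank 0.
    print([decode_high_order(a)[0] for a in (0, 4, 8, 12)])  # [0, 0, 0, 0]
```

Because consecutive addresses stay in one bank, this arrangement extends capacity but does not overlap sequential accesses the way low-order interleaving does.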