EECS150 - Digital Design Lecture 16 Memory 1

Similar documents
EECS150 - Digital Design Lecture 16 - Memory

Review: Timing. EECS Components and Design Techniques for Digital Systems. Lec 13 Storage: Regs, SRAM, ROM. Outline.

EECS150 - Digital Design Lecture 13 - Project Description, Part 2: Memory Blocks. Project Overview

Computer Organization. 8th Edition. Chapter 5 Internal Memory

Chapter 5 Internal Memory

EECS 151/251A Spring 2019 Digital Design and Integrated Circuits. Instructor: John Wawrzynek. Lecture 18 EE141

William Stallings Computer Organization and Architecture 6th Edition. Chapter 5 Internal Memory

Memory and Programmable Logic

Organization. 5.1 Semiconductor Main Memory. William Stallings Computer Organization and Architecture 6th Edition

EE178 Lecture Module 2. Eric Crabill SJSU / Xilinx Fall 2007

ECE 485/585 Microprocessor System Design

EECS150 - Digital Design Lecture 17 Memory 2

Introduction to SRAM. Jasur Hanbaba

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved.

Basic Organization Memory Cell Operation. CSCI 4717 Computer Architecture. ROM Uses. Random Access Memory. Semiconductor Memory Types

Chapter 5. Internal Memory. Yonsei University

ELCT 912: Advanced Embedded Systems

ECEU530. Project Presentations. ECE U530 Digital Hardware Synthesis. Rest of Semester. Memory Structures

Microcontroller Systems. ELET 3232 Topic 11: General Memory Interfacing

CSEE 3827: Fundamentals of Computer Systems. Storage

EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs)

Outline. EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) FPGA Overview. Why FPGAs?

Embedded Systems Design: A Unified Hardware/Software Introduction. Outline. Chapter 5 Memory. Introduction. Memory: basic concepts

Embedded Systems Design: A Unified Hardware/Software Introduction. Chapter 5 Memory. Outline. Introduction

UNIT V (PROGRAMMABLE LOGIC DEVICES)

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 15 Memories

Basic FPGA Architecture Xilinx, Inc. All Rights Reserved

Memory and Programmable Logic

Memories. Design of Digital Circuits 2017 Srdjan Capkun Onur Mutlu.

ECE 485/585 Microprocessor System Design

ECSE-2610 Computer Components & Operations (COCO)

Chapter 4 Main Memory

Computer Organization and Assembly Language (CS-506)

FPGA RAM (C1) Young Won Lim 5/13/16

William Stallings Computer Organization and Architecture 8th Edition. Chapter 5 Internal Memory

Computer Memory. Textbook: Chapter 1

Semiconductor Memories: RAMs and ROMs

(Advanced) Computer Organization & Architechture. Prof. Dr. Hasan Hüseyin BALIK (5 th Week)

Memory. Outline. ECEN454 Digital Integrated Circuit Design. Memory Arrays. SRAM Architecture DRAM. Serial Access Memories ROM

Design and Implementation of an AHB SRAM Memory Controller

Lecture 18: DRAM Technologies

ECE 331 Digital System Design

Computer Systems Laboratory Sungkyunkwan University

Memory classification:- Topics covered:- types,organization and working

,e-pg PATHSHALA- Computer Science Computer Architecture Module 25 Memory Hierarchy Design - Basics

Introduction read-only memory random access memory

UMBC. Select. Read. Write. Output/Input-output connection. 1 (Feb. 25, 2002) Four commonly used memories: Address connection ... Dynamic RAM (DRAM)

Memory Overview. Overview - Memory Types 2/17/16. Curtis Nelson Walla Walla University

EE414 Embedded Systems Ch 5. Memory Part 2/2

Chapter 8 Memory Basics

Address connections Data connections Selection connections

ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems

+1 (479)

COMPUTER ARCHITECTURES

Section 6. Memory Components Chapter 5.7, 5.8 Physical Implementations Chapter 7 Programmable Processors Chapter 8

Overview. Memory Classification Read-Only Memory (ROM) Random Access Memory (RAM) Functional Behavior of RAM. Implementing Static RAM

Memories: Memory Technology

ALTERA M9K EMBEDDED MEMORY BLOCKS

Design with Microprocessors

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

Read and Write Cycles

Memory Systems for Embedded Applications. Chapter 4 (Sections )

Read Only Memory ROM

Memory Pearson Education, Inc., Hoboken, NJ. All rights reserved.

MEMORY AND PROGRAMMABLE LOGIC

Real Time Embedded Systems

Lecture 11 SRAM Zhuo Feng. Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 2010

COMP3221: Microprocessors and. and Embedded Systems. Overview. Lecture 23: Memory Systems (I)

Views of Memory. Real machines have limited amounts of memory. Programmer doesn t want to be bothered. 640KB? A few GB? (This laptop = 2GB)

Design with Microprocessors

ECE 341. Lecture # 16

ECE 2300 Digital Logic & Computer Organization

Unit 6 1.Random Access Memory (RAM) Chapter 3 Combinational Logic Design 2.Programmable Logic

RTL Design (2) Memory Components (RAMs & ROMs)

! Memory. " RAM Memory. " Serial Access Memories. ! Cell size accounts for most of memory array size. ! 6T SRAM Cell. " Used in most commercial chips

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

CS311 Lecture 21: SRAM/DRAM/FLASH

Lecture 13: SRAM. Slides courtesy of Deming Chen. Slides based on the initial set from David Harris. 4th Ed.

Summer 2003 Lecture 18 07/09/03

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 13

Large and Fast: Exploiting Memory Hierarchy

COSC 6385 Computer Architecture - Memory Hierarchies (III)

Topic 21: Memory Technology

Topic 21: Memory Technology

Chapter 2: Memory Hierarchy Design (Part 3) Introduction Caches Main Memory (Section 2.2) Virtual Memory (Section 2.4, Appendix B.4, B.

Semiconductor Memory Classification. Today. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. CPU Memory Hierarchy.

Lecture 13: Memory and Programmable Logic

CMPEN 411 VLSI Digital Circuits Spring Lecture 22: Memery, ROM

CS 320 February 2, 2018 Ch 5 Memory

Memory Challenges. Issues & challenges in memory design: Cost Performance Power Scalability

Evolution of Implementation Technologies. ECE 4211/5211 Rapid Prototyping with FPGAs. Gate Array Technology (IBM s) Programmable Logic

PROGRAMMABLE MODULES SPECIFICATION OF PROGRAMMABLE COMBINATIONAL AND SEQUENTIAL MODULES

EECS150 - Digital Design Lecture 10 Logic Synthesis

Memories: a practical primer

8051 INTERFACING TO EXTERNAL MEMORY

EEM 486: Computer Architecture. Lecture 9. Memory

Digital Integrated Circuits Lecture 13: SRAM

Logic Synthesis. EECS150 - Digital Design Lecture 6 - Synthesis

Field Programmable Gate Array (FPGA)

FPGA: What? Why? Marco D. Santambrogio

Transcription:

EECS150 - Digital Design Lecture 16 Memory 1 March 13, 2003 John Wawrzynek Spring 2003 EECS150 - Lec16-mem1 Page 1

Memory Basics Uses: Whenever a large collection of state elements is required. data & program storage general purpose registers buffering table lookups CL implementation Example RAM: Register file from microprocessor clk Types: RAM - random access memory ROM - read only memory EPROM, FLASH - electrically programmable read only memory regid = register identifier (address of word in memory) sizeof(regid) = log2(# of reg) WE = write enable Spring 2003 EECS150 - Lec16-mem1 Page 2

Definitions Memory Interfaces for Accessing Data Asynchronous (unclocked): A change in the address results in data appearing Synchronous (clocked): A change in address, followed by an edge on CLK results in data appearing or write operation occurring. A common arrangement is to have synchronous write operations and asynchronous read operations. Volatile: Looses its state when the power goes off. Nonvolatile: Retains it state when power goes off. Spring 2003 EECS150 - Lec16-mem1 Page 3

For read operations, functionally the regfile is equivalent to a 2-D array of flip-flops with tristate outputs on each: Register File Internals Cell with added write logic: These circuits are just functional abstractions of the actual circuits used. How do we go from "regid" to "SEL"? Spring 2003 EECS150 - Lec16-mem1 Page 4

Regid (address) Decoding The function of the address decoder is to generate a one-hot code word from the address. The output is use for row selection. Many different circuits exist for this function. A simple one is shown to the right. Spring 2003 EECS150 - Lec16-mem1 Page 5

Standard Internal Memory Organization 2-D arrary of bit cells. Each cell stores one bit of data. Special circuit tricks are used for the cell array to improve storage density. (We will look at these later) RAM/ROM naming convention: examples: 32 X 8, "32 by 8" => 32 8-bit words 1M X 1, "1 meg by 1" => 1M 1-bit words Spring 2003 EECS150 - Lec16-mem1 Page 6

Read Only Memory (ROM) Simply for of memory. No write operation needed. Functional Equivalence: Connections to Vdd used to store a logic 1, connections to GND for storing logic 0. address decoder bit-cell array Full tri-state buffers are not needed at each cell point. In practice, single transistors are used to implement zero cells. Logic one s are derived through precharging or bit-line pullup transistor. Spring 2003 EECS150 - Lec16-mem1 Page 7

Column MUX in ROMs and RAMs: Controls physical aspect ratio Important for physical layout and to control delay on wires. In DRAM, allows time-multiplexing of chip address pins Spring 2003 EECS150 - Lec16-mem1 Page 8

Cascading Memory Modules (or chips) Example: assemblage of 256 x 8 ROM using 256 x 4 modules: example: 1K x * ROM using 256 x 4 modules: each module has tri-state outputs: Spring 2003 EECS150 - Lec16-mem1 Page 9

Memory Components Types: Volatile: Random Access Memory (RAM): DRAM "dynamic" SRAM "static" Non-volatile: Read Only Memory (ROM): Mask ROM "mask programmable" EPROM "electrically programmable" EEPROM "erasable electrically programmable" FLASH memory - similar to EEPROM with programmer integrated on chip Spring 2003 EECS150 - Lec16-mem1 Page 10

Volatile Memory Comparison The primary difference between different memory types is the bit cell. SRAM Cell DRAM Cell word line word line bit line bit line bit line Larger cell lower density, higher cost/bit No refresh required Simple read faster access Standard IC process natural for integration with logic Smaller cell higher density, lower cost/bit Needs periodic refresh, and refresh after read Complex read longer access time Special IC process difficult to integrate with logic circuits Spring 2003 EECS150 - Lec16-mem1 Page 11

Multi-ported Memory Motivation: Consider CPU core register file: 1 read or write per cycle limits processor performance. Complicates pipelining. Difficult for different instructions to simultaneously read or write regfile. Common arrangement in pipelined CPUs is 2 read ports and 1 write port. sel a sel b sel c data a Regfile Motivation: I/O data buffering: disk/network data buffer CPU dual-porting allows both sides to simultaneously access memory at full bandwidth. addr a data a RW a addr b data b RW b Dual-port Memory data b data c Spring 2003 EECS150 - Lec16-mem1 Page 12

Dual-ported Memory Internals Add decoder, another set of read/write logic, bits lines, word lines: dec a dec b cell array Example cell: SRAM WL 2 WL 1 b 2 b 1 b 1 b 2 address ports data ports r/w logic r/w logic Repeat everything but crosscoupled inverters. This scheme extends up to a couple more ports, then need to add additional transistors. Spring 2003 EECS150 - Lec16-mem1 Page 13

Memory in Desktop Computer Systems: SRAM (lower density, higher speed) used in CPU register file, on- and off-chip caches. DRAM (higher density, lower speed) used in main memory Closing the GAP: 1. Caches are growing in size. 2. Innovation targeted towards higher bandwidth for memory systems: SDRAM - synchronous DRAM RDRAM - Rambus DRAM EDORAM - extended data out SRAM Three-dimensional RAM hyper-page mode DRAM video RAM multibank DRAM Spring 2003 EECS150 - Lec16-mem1 Page 14

Important DRAM Examples: EDO - extended data out (similar to fast-page mode) RAS cycle fetched rows of data from cell array blocks (long access time, around 100ns) Subsequent CAS cycles quickly access data from row buffers if within an address page (page is around 256 Bytes) SDRAM - synchronous DRAM clocked interface uses dual banks internally. Start access in one back then next, then receive data from first then second. DDR - Double data rate SDRAM Uses both rising (positive edge) and falling (negative) edge of clock for data transfer. (typical 100MHz clock with 200 MHz transfer). RDRAM - Rambus DRAM Entire data blocks are access and transferred out on a highspeed bus-like interface (500 MB/s, 1.6 GB/s) Tricky system level design. More expensive memory chips. Spring 2003 EECS150 - Lec16-mem1 Page 15

Non-volatile Memory Used to hold fixed code (ex. BIOS), tables of data (ex. FSM next state/output logic), slowly changing values (date/time on computer) Mask ROM Used with logic circuits for tables etc. Contents fixed at IC fab time (truly write once!) EPROM (erasable programmable) & FLASH requires special IC process (floating gate technology) writing is slower than RAM. EPROM uses special programming system to provide special voltages and timing. reading can be made fairly fast. rewriting is very slow. erasure is first required, EPROM - UV light exposure, EEPROM electrically erasable Spring 2003 EECS150 - Lec16-mem1 Page 16

FLASH Memory Electrically erasable In system programmability and erasability (no special system or voltages needed) On-chip circuitry (FSM) and voltage generators to control erasure and programming (writing) Erasure happens in variable sized "sectors" in a flash (16K - 64K Bytes) See: http://developer.intel.com/design/flash/ for product descriptions, etc. Compact flash cards are based on this type of memory. Spring 2003 EECS150 - Lec16-mem1 Page 17

Memory Specification in Verilog Memory modeled by an array of registers: reg[15:0] memword[0:1023]; // 1,024 registers of 16 bits each //Example Memory Block Specification // Uses enable to control both write and read //----------------------------- //Read and write operations of memory. //Memory size is 64 words of 4 bits each. module memory (Enable,ReadWrite,Address,DataIn,DataOut); input Enable,ReadWrite; input [3:0] DataIn; input [5:0] Address; output [3:0] DataOut; reg [3:0] DataOut; reg [3:0] Mem [0:63]; //64 x 4 memory always @ (Enable or ReadWrite) if (Enable) if (ReadWrite) DataOut = Mem[Address]; //Read else Mem[Address] = DataIn; //Write else DataOut = 4'bz; //High impedance state endmodule Spring 2003 EECS150 - Lec16-mem1 Page 18

Memory Blocks in FPGAs LUTs can double as small RAM blocks: 4-LUT is really a 16x1 memory. Normally we think of the contents being written from the configuration bit stream, but Virtex architecture (and others) allow bits of LUT to be written and read from the general interconnect structure. achieves 16x density advantage over using CLB flipflops. Furthermore, the two LUTs within a slice can be combined to create a 16 x 2-bit or 32 x 1-bit synchronous RAM, or a 16x1-bit dual-port synchronous RAM. The Virtex-E LUT can also provide a 16-bit shift register of adjustable length. Newer FPGA families include larger on-chip RAM blocks (usually dual ported): Called block selectrams in Xilinx Virtex series 4k bits each RAM16X1S Spring 2003 EECS150 - Lec16-mem1 Page 19 WE WE D WCLK A0 A1 A2 A3 D WCLK A0 A1 A2 A3 DPRA0 DPRA1 DPRA2 DPRA3 RAM16X1D X4950 SPO DPO X4942 one read port, one write port O Synchronous write, asynchronous read D CE CLK A0 A1 A2 A3 SRL16E_1 X8421 Q Length controlled by A0-A3

Verilog Specification for Virtex LUT RAM module ram16x1(q, a, d, we, clk); output q; input d; input [3:0] a; input clk, we; reg mem [15:0]; always @(posedge clk) begin if(we) mem[a] <= d; end assign q = mem[a]; endmodule WE D WCLK A0 A1 A2 A3 RAM16X1S X4942 Deeper and/or wider RAMs can be specified and the synthesis tool will do the job of wiring together multiple LUTs. How does the synthesis tool choose to implement your RAM as a collection of LUTs or as block RAMs? O Spring 2003 EECS150 - Lec16-mem1 Page 20

Each block SelectRAM (block RAM) is a fully synchronous (synchronous write and read) dual-ported (true dual port) 4096-bit RAM with independent control signals for each port. The data widths of the two ports can be configured independently, providing built-in bus-width conversion. CLKA and CLKB can be independent, providing an easy way to cross clock bounders. Around 160 of these on the 2000E. Multiples can be combined to implement, wider or deeper memories. See chapter 8 of Synplify reference manual on how to write Verilog for implied Block RAMs. Or instead, explicitly instantiate as primitive (project checkpoint will use this method). Virtex Block RAMs WEA ENA RSTA CLKA ADDRA[#:0] DIA[#:0] WEB ENB RSTB CLKB ADDRB[#:0] DIB[#:0] DOA[#:0] DOB[#:0] Table 5: Block SelectRAM Port AspectRatios Width Depth ADDR Bus Data Bus 1 4096 ADDR<11:0> DATA<0> 2 2048 ADDR<10:0> DATA<1:0> 4 1024 ADDR<9:0> DATA<3:0> 8 512 ADDR<8:0> DATA<7:0> 16 256 ADDR<7:0> DATA<15:0> Spring 2003 EECS150 - Lec16-mem1 Page 21

Relationship between Memory and CL Memory blocks can be (and often are) used to implement combinational logic functions: Examples: LUTs in FPGAs 1Mbit x 8 EPROM can implement 8 independent functions each of log 2 (1M)=20 inputs. The decoder part of a memory block can be considered a minterm generator. The cell array part of a memory block can be considered an OR function over a subset of rows. The combination gives us a way to implement logic functions directly in sum of products form. Several variations on this theme exist in a set of devices called Programmable logic devices (PLDs) Spring 2003 EECS150 - Lec16-mem1 Page 22

A ROM as AND/OR Logic Device Spring 2003 EECS150 - Lec16-mem1 Page 23

PLD Summary Spring 2003 EECS150 - Lec16-mem1 Page 24

PLA Example Spring 2003 EECS150 - Lec16-mem1 Page 25

PAL Example Spring 2003 EECS150 - Lec16-mem1 Page 26