ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Lec 26: November 9, 2018 Memory Overview
Dynamic OR4! Precharge time?! Driving input " With R 0 /2 inverter! Driving inverter and self cap?! Output self delay? 2
CMOS NOR4! Driving input " With R 0 /2 inverter! Driving self cap? 3
CMOS NAND4! Driving input " With R 0 /2! Driving self cap? 4
Today! Memory " Classification " Architecture " Periphery " Project 2 is on this 5
Semiconductor Memory Classification RWM NVRWM ROM Random Access Non-Random Access EPROM E 2 PROM Mask-Programmed Programmable (PROM) SRAM FIFO FLASH DRAM LIFO Shift Register CAM
Memory Architecture: Core M bits M bits N Words S 0 S 1 S 2 S N-2 S N_1 Word 0 Word 1 Word 2 Word N-2 Word N-1 Storage Cell A 0 A 1 A K-1 Decoder S 0 Word 0 Word 1 Word 2 Word N-2 Word N-1 Storage Cell Input-Output (M bits) Input-Output (M bits) N words => N select signals Too many select signals Decoder reduces # of select signals K = log 2 N
Memory Architecture: Decoders M bits M bits N Words S 0 S 1 S 2 S N-2 S N_1 Word 0 Word 1 Word 2 Word N-2 Word N-1 Storage Cell A 0 A 1 A K-1 Decoder S 0 Word 0 Word 1 Word 2 Word N-2 Word N-1 Storage Cell Input-Output (M bits) Input-Output (M bits) N words => N select signals Too many select signals Decoder reduces # of select signals K = log 2 N
Array-Structured Memory Architecture Problem: ASPECT RATIO or HEIGHT >> WIDTH 2 L-K Bit Line Storage Cell A K A K+1 A L-1 Row Decoder Word Line Sense Amplifiers / Drivers M.2 K Amplify swing to rail-to-rail amplitude A 0 A K-1 Column Decoder Selects appropriate word Input-Output (M bits)
Latches/Register Can Store a State! Build register from pair of latches! Control with non-overlapping clocks 10
ROM Memories 11
MOS NOR ROM V DD Pull-up devices WL[0] WL[1] GND WL[2] WL[3] GND BL[0] BL[1] BL[2] BL[3]
MOS NOR ROM V DD Pull-up devices WL[0] WL[1] GND WL[2] WL[3] GND BL[0] BL[1] BL[2] BL[3]
MOS NOR ROM V DD Pull-up devices WL[0] WL[1] 1 0 1 1 GND WL[2] GND WL[3] BL[0] BL[1] BL[2] BL[3]
MOS NOR ROM V DD Pull-up devices WL[0] WL[1] 1 0 0 1 1 1 1 0 GND WL[2] 1 0 1 0 GND WL[3] 1 1 1 1 BL[0] BL[1] BL[2] BL[3]
MOS NAND ROM BL[0] BL[1] BL[2] BL[3] V DD Pull-up devices WL[0] WL[1] WL[2] WL[3] All word lines high by default with exception of selected row
MOS NAND ROM V DD Pull-up devices WL[0] WL[1] 0 BL[0] BL[1] BL[2] BL[3] 1 0 0 WL[2] WL[3] All word lines high by default with exception of selected row
MOS NAND ROM V DD Pull-up devices WL[0] WL[1] WL[2] WL[3] 0 1 0 0 BL[0] BL[1] BL[2] BL[3] 1 0 0 0 0 1 1 0 1 0 0 0 All word lines high by default with exception of selected row
Array-Structured Memory Architecture Problem: ASPECT RATIO or HEIGHT >> WIDTH 2 L-K Bit Line Storage Cell A K A K+1 A L-1 Row Decoder Word Line Sense Amplifiers / Drivers M.2 K Amplify swing to rail-to-rail amplitude A 0 A K-1 Column Decoder Selects appropriate word Input-Output (M bits)
Memory Periphery Penn ESE 370 Fall 2018 Khanna
Periphery! Decoders! Sense Amplifiers! Input/Output Buffers! Control/Timing Circuitry
Array Architecture! 2 n words of 2 m bits each! Good regularity easy to design! Very high density if good cells are used Penn ESE 370 Fall 2018 Khanna 22
Array Architecture! 2 n words of 2 m bits each! Good regularity easy to design! Very high density if good cells are used Penn ESE 370 Fall 2018 Khanna 23
Array Architecture! 2 n words of 2 m bits each! Good regularity easy to design! Very high density if good cells are used 24
6T SRAM Cell! Cell size accounts for most of array size " Reduce cell size at expense of complexity! 6T SRAM Cell " Used in most commercial chips " Data stored in cross-coupled inverters! Read: BL " Precharge BL, BL " Raise WL WL! Write: " Drive data onto BL, BL " Raise WL BL 25
Decoders Penn ESE 370 Fall 2018 Khanna
Array Architecture! 2 n words of 2 m bits each! Good regularity easy to design! Very high density if good cells are used Penn ESE 370 Fall 2018 Khanna 27
Decoders! n:2 n decoder consists of 2 n n-input AND gates " One needed for each row of memory " Build AND from NAND or NOR gates Static CMOS A1 A0 word0 word1 A1 A0 1 1 1 1 8 4 word word2 word3 28
Large Decoders! For n > 4, NAND gates become slow " Break large gates into multiple smaller gates 29
Large Decoders! For n > 4, NAND gates become slow " Break large gates into multiple smaller gates 30
Predecoding! Many of these gates are redundant " Factor out common gates into predecoder " Saves area " Same path effort 31
Row Select: Precharge NAND Penn ESE 370 Fall 2018 Khanna 32
Row Select: Precharge NAND Penn ESE 370 Fall 2018 Khanna 33
Row Select: Precharge NOR Penn ESE 370 Fall 2018 Khanna 34
Column Circuitry & Bit-line Conditioning Penn ESE 370 Fall 2018 Khanna
Array Architecture! 2 n words of 2 m bits each! Good regularity easy to design! Very high density if good cells are used Penn ESE 370 Fall 2018 Khanna 36
Column Circuitry! Some circuitry is required for each column " Bitline conditioning " Precharging " Driving input data to bitline " Sense amplifiers " Column multiplexing (AKA Column Decoders) Penn ESE 370 Fall 2018 Khanna 37
Bitline Conditioning! Precharge bitlines high before reads BL bit φ BL bit_b Penn ESE 370 Fall 2018 Khanna 38
Bitline Conditioning! Precharge bitlines high before reads BL bit φ BL bit_b Penn ESE 370 Fall 2018 Khanna 39
Bitline Conditioning! Precharge bitlines high before reads BL bit φ BL bit_b! What if pre-charged to Vdd/2? " Pros: reduces read-upset " Challenge: generate Vdd/2 voltage on chip Penn ESE 370 Fall 2018 Khanna 40
Column Capacitance! What is capacitance of bit line (column)? " W access transistor width of column device " d rows " γ=c diff /C gate Penn ESE 370 Fall 2018 Khanna
Delay Driving Bit Line! In terms of W access, W buf, d! For W access =W buf =1, d=32/512, γ=0.5 Penn ESE 370 Fall 2018 Khanna 42
Column Capacitance Consequence! Want W access, W buf small to keep memory cell small! Increasing W access, also increases C bl " Don t really win by sizing up! Conclude: Driving bit line will be slow Penn ESE 370 Fall 2018 Khanna 43
Sense Amplifiers! Bitlines have many cells attached " Ex: 32-kbit SRAM has 128 rows x 256 cols " 128 cells on each bitline! t pd (C/I) ΔV " Even with shared diffusion contacts, 64C of diffusion capacitance (big C) " Discharged slowly through small transistors in each memory cell (small I)! Sense amplifiers are triggered on small voltage swing V (ΔV) BL V(1) V PRE ΔV Penn ESE 370 Fall 2018 Khanna Sense amp activated Word line activated V(0) t 44
Differential Pair Amp! Differential pair requires no clock! But always dissipates static power sense_b bit BL P1 N1 N2 P2 sense bit_b BL N3 45
Clocked Sense Amp! Clocked sense amp saves power! Requires sense_clk after enough bitline swing! Isolation transistors cut off large bitline capacitance bit bit_b sense_clk isolation transistors regenerative feedback sense sense_b 46
Word Line Capacitance! What is capacitance of word line (row)? " W access transistor width of column device " w columns " γ=c diff /C gate! Delay driving word line?
Column Drivers: Memory Bank Penn ESE 370 Fall 2018 Khanna 48
Tristate Buffer! Typically used for signal traveling, e.g. bus! Ideally all devices connected to a bus should be disconnected except for active device reading or writing to bus! Use high-impedance state to simulate disconnecting Input En Output Input En Ouptut 0 0 Z 1 0 Z Active-high buffer 0 1 0 1 1 1 Penn ESE 370 Fall 2018 Khanna 49
Tristate Buffer En Input Output Input Output Vdd En Input En Output En CMOS circuit Penn ESE 370 Fall 2018 Khanna 50
Tristate Inverters En En Input Output Input Output Penn ESE 370 Fall 2018 Khanna 51
8x4 Memory with column decoder A0 A1 2-to-4 Decoder Row Decoder CS Column Select (A2) 0 1 2 3 CS 8x4 Memory 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 0 1 1-to-2 Decoder Column Decoder Tristate Buffer (read) D0 D1 D2 D3 Penn ESE 370 Fall 2018 Khanna 52
Read/Write Memory 0 8x4 Memory 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit A0 2-to-4 Row Decoder 1 2 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit A1 3 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit CS 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit Rd/Wr Column Select (A2) CS 0 1 1-to-2 Column Decoder D0 D1 D2 D3 Penn ESE 370 Fall 2018 Khanna 53
Read/Write Memory 0 8x4 Memory 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit A0 2-to-4 Row Decoder 1 2 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit A1 3 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit CS 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit Rd/Wr = 0 Column Select (A2) = 0 CS 0 1 1-to-2 Column Decoder D0 D1 D2 D3 Penn ESE 370 Fall 2018 Khanna 54
Read/Write Memory 0 8x4 Memory 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit A0 2-to-4 Row Decoder 1 2 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit A1 3 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit CS 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit 1-bit Rd/Wr = 1 Column Select (A2) = 1 CS 0 1 1-to-2 Column Decoder D0 D1 D2 D3 Penn ESE 370 Fall 2018 Khanna 55
Serial Access Memories! Serial access memories do not use an address " Serial In Parallel Out (SIPO) " Parallel In Serial Out (PISO) " Shift Registers " Queues (FIFO, LIFO) 56
Serial In Parallel Out! 1-bit shift register reads in serial data " After N steps, presents N-bit parallel output clk Sin P0 P1 P2 P3 57
Parallel In Serial Out! Load all N bits in parallel when shift = 0 " Then shift one bit out per cycle shift/load clk P0 P1 P2 P3 Sout 58
Shift Register! Shift registers store and delay data! Simple design: cascade of registers clk Din 8 Dout 59
Denser Shift Registers! Flip-flops aren t very area-efficient! For large shift registers, keep data in SRAM instead! Move read/write pointers to RAM rather than data " Initialize read address to first entry, write to last " Increment address on each cycle clk Din 00...00 11...11 counter counter readaddr writeaddr dual-ported SRAM reset Dout 60
Queues! Queues allow data to be read and written at different rates.! Read and write each use their own clock, data! Queue indicates whether it is full or empty! Build with SRAM and read/write counters (pointers) storing read/write address WriteClk WriteData FULL Queue ReadClk ReadData EMPTY 61
FIFO, LIFO Queues! First In First Out (FIFO) " Initialize read and write pointers to first element " Queue is EMPTY " On write, increment write pointer " If write almost catches read, Queue is FULL " On read, increment read pointer " If read catches write, Queue is FULL! Last In First Out (LIFO) " Also called a stack " Use a single stack pointer for read and write 62
Idea! Memory for compact state storage " Minimize area per bit # maximize density! Share circuitry across many bits " Precharge, Amplifiers! Serial address memories " Use pointers to access memory " Eg. FIFO queue 63
Admin! Homework 7 due Monday midnight! Project handout will be released on Monday " More details in Monday s lecture 64