Data Cache Final Project Report ECE251: VLSI Systems Design UCI Spring, 2000


June 15, 2000

Jinfeng Liu, Yi Deng

Project Summary

In this project, we designed and implemented a direct-mapped, write-back data cache in TSMC 0.18µm technology. The design was carried out in the Magic layout editor using the SCN6M_SUBM.10 technology file. The data cache has a total capacity of 512 bytes, with four words per cache line. All the cells in the cache share one read port and one write port through a MUX. In addition, separate read and write address decoders are used to support simultaneous read and write operations. Logic function verification was performed with IRSIM, and a more comprehensive simulation of the data cache was done with HSPICE. Preliminary results show that our design operates correctly with a 3ns cycle time.

1. OBJECTIVE

The goal of this project is to design and implement a data cache system in 0.18µm technology. The specifications are:

o Direct-mapped
o Total 32 cache lines
o Four words (128 bits) per cache line
o Write-back
o One read port and one write port with separate address decoders
o 32-bit data input, 32-bit data output
o Initialize valid bit to 0, dirty bit to 1

The project structure is shown in Figure 1. Two separate read and write address decoders and ports are needed in order to support simultaneous read and write operations. In addition, comparators compare the tag in the address with the tag stored in the cache, and generate the HIT signal for read and write accordingly. The 4-1 MUX serves as a selector that picks one of the four words read from the cells according to the word offset bits, which are the third and fourth bits of the address (A2 and A3). Similarly, the 1-4 DEMUX steers the 32-bit data input to one of the four data words in each cache line according to the offset bits in the write address. The byte offset bits (bit 0 and bit 1 of the 32-bit address) are not used, since the basic unit for read and write operations is one 32-bit word.
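The address fields implied by these specifications can be sketched as follows. This is only an illustration, not part of the original design files: the name `split_address` is hypothetical, and placing the line index directly above the word offset (bits 8:4) is an assumption, since the report only states the positions of the byte offset and word offset bits. With 2 byte offset bits, 2 word offset bits and 5 index bits for 32 lines, 23 tag bits remain, which matches the twenty-three XOR gates used in the comparator.

```python
from typing import NamedTuple

ADDRESS_BITS = 32
BYTE_OFFSET_BITS = 2   # bits 1:0, unused because accesses are whole words
WORD_OFFSET_BITS = 2   # bits 3:2 (A3, A2), select one of four words
INDEX_BITS = 5         # 32 cache lines; assumed to sit at bits 8:4
TAG_BITS = ADDRESS_BITS - BYTE_OFFSET_BITS - WORD_OFFSET_BITS - INDEX_BITS

LINES = 1 << INDEX_BITS
WORDS_PER_LINE = 1 << WORD_OFFSET_BITS
BYTES_PER_WORD = 1 << BYTE_OFFSET_BITS
CAPACITY_BYTES = LINES * WORDS_PER_LINE * BYTES_PER_WORD


class CacheAddress(NamedTuple):
    tag: int
    index: int
    word: int


def split_address(addr: int) -> CacheAddress:
    """Split a 32-bit address into tag, line index and word offset."""
    addr &= (1 << ADDRESS_BITS) - 1
    word = (addr >> BYTE_OFFSET_BITS) & (WORDS_PER_LINE - 1)
    index = (addr >> (BYTE_OFFSET_BITS + WORD_OFFSET_BITS)) & (LINES - 1)
    tag = addr >> (ADDRESS_BITS - TAG_BITS)
    return CacheAddress(tag, index, word)
```

The derived constants reproduce the stated capacity: 32 lines of four 4-byte words give 512 bytes.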

Fig. 1 System overview

2. PROCEDURES

We started with the design of the individual function cells, such as the static RAM, MUX, comparator, and row decoder. After finishing the logic design, we implemented and simulated each function cell. We then put them together to form a single cache entry with four data words. Only when the simulation result of this one-line data cache was correct did we proceed to put thirty-two lines together to form the 512-byte, direct-mapped, write-back data cache.

2.1. STATIC RAM

Figure 2 shows the mask-level layout of our static RAM. We used six transistors to implement the static RAM cell. The cell stores data on the gate of the storage transistor. Separate read and write control lines are used. The bit line (BL) is used to write, while the inverted bit line (!BL) is used to read.

Fig. 2 Structure of static RAM cell

2.2. STATIC RAM WRITE AND READ

The objective of the RAM write operation is to apply voltages to the RAM cell such that it will flip state. There are two kinds of WRITE operations: memory WRITE and processor WRITE. For a memory WRITE, the write word line is asserted by the row decoder. For a processor WRITE, the write word line is asserted by the row decoder only if the comparator generates a write HIT signal. The bit line is then driven to either VDD or VSS depending on the value that is to be stored in the cell. Figure 3.a shows a plot of the waveforms during a WRITE operation. The state retained in the SRAM cell (q) follows the input data appearing on the write bit line during the write stage (WWL is high).

To read the RAM cell, the read word line is asserted by the read row decoder. All the bit lines are then pulled either up or down according to the values stored in the RAM cells in this row. Thirty-two MUXes are used. Each MUX selects one bit out of four bits, which we discuss in detail in section 2.4. The comparator generates the HIT or MISS signal by comparing the tag in the address with the tag stored in the cache. Figure 3.b shows a plot of the waveforms during a READ operation. The correct value of the SRAM cell is read during the read stage, while RWL is asserted.

Fig. 3.a SRAM write operation
Fig. 3.b SRAM read operation

We use this basic cell structure to implement all data bits. However, the data, tag, dirty and valid bits each raise different issues. We will address these issues later.

2.3. ADDRESS DECODER

The simplest row decoder is an AND gate. To avoid the rapidly increasing delay of a large fan-in AND gate, we implemented several different address decoders. These decoders are illustrated in Figure 4. Although the 5-NAND decoder is expected to deliver the worst performance, we discovered that once a load is attached to the output, the delay time does not differ much among these configurations. This implies that the delay time is dominated by the load rather than by the gate delay itself. We chose the NOR-NAND decoder for our implementation.

Fig. 4.a Cascaded 2-NAND and 3-NAND decoder
Fig. 4.b 3-NOR and 3-NAND decoder

Fig. 4.c 5-NAND decoder

2.4. MUX

To select one of the four words, we used thirty-two MUXes. Each MUX selects one bit among four bits according to the two word offset bits, which are the third and fourth bits of the address. Eight n-transistors are used to implement each MUX, as shown in Figure 5. Since data bits are read through !BL, the inverted data values are fed into the MUX. Based on the values of A3 and A2, which constitute the word offset field, one of the four data values is selected and then inverted to generate the data output.

Fig. 5.a 4-1 1-bit MUX schematic (inputs !Data0 to !Data3; select signals A3, !A3, A2, !A2; output Dataout)
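The selection behaviour described in section 2.4 can be sketched behaviourally as below. This is a functional model only, not the pass-transistor circuit; the function names are hypothetical, and treating A3 as the upper select bit follows the Data00 to Data11 naming used in the word line logic.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def mux4_1bit(n_d0: int, n_d1: int, n_d2: int, n_d3: int, a3: int, a2: int) -> int:
    """Pick one of four inverted data bits with (A3, A2), then invert it."""
    selected = (n_d0, n_d1, n_d2, n_d3)[(a3 << 1) | a2]
    return selected ^ 1  # output inverter restores the true data value


def select_word(line_words, a3: int, a2: int) -> int:
    """Apply 32 one-bit MUXes in parallel to the four words of a cache line."""
    out = 0
    for bit in range(WORD_BITS):
        # The cells are read through !BL, so the MUX sees inverted bits.
        inverted = [((w ^ WORD_MASK) >> bit) & 1 for w in line_words]
        out |= mux4_1bit(*inverted, a3, a2) << bit
    return out
```

For example, `select_word(words, 1, 1)` returns the fourth word of the line, corresponding to Data11.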

Fig. 5.b bit Mux layout

2.5. COMPARATOR

We used twenty-three XOR gates, six four-input NOR gates and one six-input NAND gate to implement the comparator. To reduce the delay time, the six-input NAND is split into two levels of NAND logic. Each XOR gate takes one tag bit from the address and the corresponding tag bit from the cell as inputs. All the outputs of the XOR gates, along with the VALID signal, are routed to the six four-input NOR gates as inputs. The six outputs of the NOR gates are in turn routed to the NAND, which generates the HIT or MISS signal as output. Figure 6 shows the three basic XOR gates used in the comparator.

Fig. 6 XOR gate as comparator
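A gate-level sketch of this comparator is given below. It rests on two assumptions that the report does not state: the VALID signal is fed to its NOR gate in inverted form (a NOR output is high only when all of its inputs are low), and the NAND output is treated as an active-low hit, so it is inverted to obtain HIT. The two-level NAND is modelled as a single six-input NAND because only its logic function matters here. All names are hypothetical.

```python
TAG_BITS = 23


def nor(*inputs: int) -> int:
    return int(not any(inputs))


def nand(*inputs: int) -> int:
    return int(not all(inputs))


def tag_hit(addr_tag: int, stored_tag: int, valid: int) -> int:
    """Return 1 when the stored tag is valid and equals the address tag."""
    # 23 XOR gates: each output is 0 when the two tag bits match.
    xors = [((addr_tag >> i) & 1) ^ ((stored_tag >> i) & 1)
            for i in range(TAG_BITS)]
    # 24th input: VALID, assumed inverted so that "valid" contributes a 0.
    nor_inputs = xors + [valid ^ 1]
    # Six four-input NOR gates, each high only if its four inputs are all 0.
    nor_outputs = [nor(*nor_inputs[i:i + 4]) for i in range(0, 24, 4)]
    # Six-input NAND: low only when every NOR output is high (a hit).
    miss = nand(*nor_outputs)
    return miss ^ 1
```

A mismatch in any single tag bit, or a cleared valid bit, forces one NOR output low and therefore signals a miss.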

2.6. TAG

As with the static RAM discussed in section 2.1, we used six transistors to implement the TAG cell. The distinction between a tag bit and a data bit is that during a CPU write stage, the tag bit is read instead of written. A MEM write, in contrast, unconditionally updates all SRAM cells. This difference also applies to the valid and dirty bits. The different operations are summarized in Table 1.

Table 1. Different operations on data, tag, valid and dirty bits

It can be seen from the table that the bit line of a TAG bit is shared between read (CPU write) and write (MEM write). An alternative approach would add another read port to the TAG bit dedicated to the CPU write operation. Our design shares the bit line. Therefore, the CPU write and MEM write operations must be isolated from each other on the bit line. Otherwise, during the CPU write stage, the wrong value may be put into the SRAM cell. We use a tri-state buffer as the input to the bit line for the MEM write operation, and a standard two-inverter output buffer is connected to the bit line for CPU write operations. The MEM write signal (mwr) is used to enable the tri-state buffer. This separates the CPU write stage from the MEM write stage and protects the correct value in the SRAM cell. The tri-state buffer structure is shown in Fig. 7. The TAG bit structure is shown in Fig. 8.

Fig. 7. Tri-state buffer (signals: In, En, Out)
Fig. 8. TAG cell structure (signals: tagin, tagout, mwr, BL, !BL, WWL, RWL)
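The sharing of the TAG bit line can be illustrated with a small behavioural model. It is a sketch under stated assumptions rather than the circuit itself: the write word line is assumed to be asserted, high impedance is represented by `None`, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def tri_state(data_in: int, enable: int) -> Optional[int]:
    """Tri-state buffer: passes data_in when enabled, else high impedance."""
    return data_in if enable else None


def tag_bit_access(tagin: int, mwr: int, stored: int) -> Tuple[int, int]:
    """Model one access to a TAG cell with WWL asserted.

    Returns (new stored value, tagout).
    """
    driven = tri_state(tagin, mwr)
    if driven is not None:
        # MEM write: the buffer drives the bit line and the cell is updated.
        bit_line = driven
        new_stored = driven
    else:
        # CPU write: the buffer is off, so the cell drives the bit line
        # and its content is left unchanged.
        bit_line = stored
        new_stored = stored
    # Two-inverter output buffer is non-inverting.
    tagout = bit_line
    return new_stored, tagout
```

With `mwr` low, `tagin` has no effect on the stored value, which is the protection the tri-state buffer is meant to provide.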

Fig. 9 shows simulation results for a pair of tag bits in one cache line. TAG0 and TAG1 are written twice during MEM write stages (signal mwr) with different values (wtag0in and wtag1in). The MEM write operations are followed by a read (signal rwl) and a CPU write (wwl is high while mwr is low). The output signals demonstrate correct behavior at the read port (rtag0out, rtag1out) and the write port (wtag0out, wtag1out).

Fig. 9. Simulation result of a TAG cell

2.7. WORD LINE AND DEMUX LOGIC

Because different cells operate in different modes, the logic that asserts the word line is also implemented differently for each type of cell. The DEMUX logic is combined with the word line logic using predecoded offset signals. One of the four data words in each cache line is selected by asserting its WWL based on the value of the offset field A3A2. Table 2 summarizes the word line logic combined with the predecoded DEMUX logic. mwr and pwr are the control signals for MEM write and CPU write. The address decoder generates the addr_decode signal to select one cache line.

Cell    | MEM write, WWL =           | CPU write, WWL =           | Read, RWL =
Valid   | mwr·addr_decode            | pwr·addr_decode            | addr_decode
Dirty   | mwr·addr_decode            | pwr·addr_decode            | addr_decode
Tag     | mwr·addr_decode            | pwr·addr_decode            | addr_decode
Data00  | mwr·addr_decode·!A3·!A2    | pwr·addr_decode·!A3·!A2    | addr_decode
Data01  | mwr·addr_decode·!A3·A2     | pwr·addr_decode·!A3·A2     | addr_decode
Data10  | mwr·addr_decode·A3·!A2     | pwr·addr_decode·A3·!A2     | addr_decode
Data11  | mwr·addr_decode·A3·A2      | pwr·addr_decode·A3·A2      | addr_decode

Table 2. Word line logic on different operations

3. RESULTS

Based on the issues discussed in the previous section, we drafted the layout of a single cache line. We first completed one cache line with one port that performs either a read or a write operation. After verifying the logic, we added the second port and separated the read and write operations. The following simulations were performed on a single cache line with one read port and one write port. A load is added to the output signals. All speed models (FF, TT and SS) are simulated. The system clock cycle time is 3ns; in other words, the clock frequency is about 333 MHz.

Simulation 1: perform read and write operations; test the hit, valid and dirty signals and the data output

In this simulation, we simulated the conditions under which read/write hits and misses can occur on the single cache line. The read address is kept the same as the write address to check how soon a read operation can respond to an immediate write operation in the same cycle. All three speed models, FF, TT and SS, are simulated. All results show that the operation can be completed within the 3ns cycle time.

Operation (write / read)                        | Result (write / read)                           | Rhit | Wvalidout | Rvalidout | Wdirtyout | Rdirtyout
reset                                           | X                                               | X    | X         | X         | X         | X
MEM write to ffff, Datain = 1 / Read from ffff  | Write to word3, Data = 1 / Read hit*            | 1*   | 1         | 1*        | 0         | 0*
MEM write to ffbf, Datain = 0 / Read from ffff  | Not written to this line / Read hit             | 1    | X         | 1         | X         | 1
CPU write to ffff, Datain = 0 / Read from ffff  | Write hit to word3, Data = 0 / Read hit*        | 1*   | 1         | 1*        | 1         | 1*
CPU write to 001f, Datain = 0 / Read from ffff  | Write miss by conflict / Read miss by conflict  | 0    | X         | X         | X         | X
CPU write to ffff, Datain = 1 / Read from ffff  | Write hit to word3, Data = 1 / Read hit*        | 1*   | 1         | 1*        | 1         | 1*
MEM write to ffff, Datain = 0 / Read from ffff  | Write to word3, Data = 0 / Read hit*            | 1*   | 1         | 1*        | 0         | 0*

wlinesel: X
rlinesel: X
Whit: X X X X
Dataout: X X 1 0

Table 3. Test vector of simulation 1 and the truth table of some signals

* Reads are performed on the same cell while it is being written. This should not be allowed in reality, but here it is used to observe how soon the output can keep up with the input in the same cycle.

a. Use TT configuration.

All related signals are listed below. Although we perform some illegal operations, reading from the same cell that is being written in the same cycle, the output data still matches the input within that cycle. The logic of all signals agrees with the truth table. A delay of around 1ns to 1.5ns can be seen on the read data output. The delay of the write operation also contributes to this delay, since the data is written in the same cycle. The average power consumption is 7mW, while the peak power is 44mW.

Fig. 10. Simulation 1 results of TT configuration

b. Use FF configuration.

The delay time is reduced (around 0.5ns). All logic is correct. The power consumption increases to 7.7mW on average and 60mW at peak.

Fig. 11. Simulation result of FF model

c. Use SS configuration.

Severe delay can be observed. In some cases the delay is around 2ns within the 3ns cycle time. The logic is still correct, but it might not be safe to implement our design in a slow process corner. The average power consumption is 6mW and the peak power is 43mW.

Fig. 12. Simulation result of SS model
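As a short worked calculation from the figures reported for simulation 1, the clock frequency and the average energy per cycle for each speed model can be derived as below. The energy values are simply average power multiplied by the 3ns cycle time; they are derived here for illustration and were not reported separately. The function names are hypothetical.

```python
CYCLE_NS = 3.0

# Average power in mW as reported for simulation 1 of the single cache line.
AVG_POWER_MW = {"TT": 7.0, "FF": 7.7, "SS": 6.0}


def clock_mhz(cycle_ns: float) -> float:
    """Clock frequency in MHz for a cycle time given in nanoseconds."""
    return 1000.0 / cycle_ns


def energy_per_cycle_pj(avg_power_mw: float, cycle_ns: float) -> float:
    """Average energy per cycle in picojoules (1 mW x 1 ns = 1 pJ)."""
    return avg_power_mw * cycle_ns


if __name__ == "__main__":
    print(f"clock: {clock_mhz(CYCLE_NS):.1f} MHz")
    for model, power in AVG_POWER_MW.items():
        print(f"{model}: {energy_per_cycle_pj(power, CYCLE_NS):.1f} pJ per cycle")
```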

Simulation 2: read and write at the same time from/to different locations

In this simulation, we only consider valid writes and reads. In four consecutive cycles, MEM write updates word0, word1, word2 and word3 in the same line. The read operation validates the written value in the next cycle. The scenario of this simulation is summarized in Table 4. We only use the TT model for this simulation. A two-inverter load is attached to the output.

Operation (write / read)                        | Result (write / read)
reset                                           | X
MEM write to fffc, Datain = 1 / Read from fffc  | Write to word0, Data = 1 / Read hit*
MEM write to fffd, Datain = 0 / Read from fffc  | Write to word1, Data = 0 / Read hit, Data = 1
MEM write to fffe, Datain = 1 / Read from fffd  | Write to word2, Data = 1 / Read hit, Data = 0
MEM write to ffff, Datain = 0 / Read from fffe  | Write to word3, Data = 0 / Read hit, Data = 0

wlinesel: X
rlinesel: X 1*
Whit: X X X X X
Rhit: X 1*
Wvalidout: X
Rvalidout: X 1*
Wdirtyout: X
Rdirtyout: X 1*
Dataout: X 1*

Table 4. Operations in simulation 2. * read/write in the same cycle

The simulation result shows the correct logical sequence. The delay time is 0.5ns to 1ns within the 3ns cycle time. The average power is 6.4mW. The peak power is 46mW.


Fig. 13. Simulation result of simulation 2, TT model

After validating the functionality of the single cache line, we completed the layout of the whole 32-line cache. However, we had trouble running the simulation. Each time, SPICE stopped after two or three hours because some nodes did not converge. Since the error only appears after such a long run, we were not able to continue the simulation in SPICE. The same problem also occurred in our simulation of a single line. We return to this issue in the discussion section.

4. DISCUSSION

During implementation, we found that the most critical factor affecting the correctness of the logic is the difference in delay between the word line and bit line paths. In most cases, the word line signal arrives at and leaves the SRAM cell after the corresponding bit line signal that carries the data, so wrong data is stored into the SRAM cell. Fig. 14 illustrates this behavior.

Fig. 14. Different delay times affect the logic (waveforms: control signal, data, word line, bit line, clocked control signal; wrong data is written to the SRAM cell)

We tried different approaches to solve this problem.

1. We added more delay circuitry to the data signals that generate the bit line signal, hoping that the data on the bit line would remain valid longer, until the word line signal drops. This scheme failed: because the word line signal is distributed to many SRAM cells, its delay is too long, and at the same time we could not effectively lengthen the delay on the data signals.

2. We used control signals to turn on and off a transmission gate that passes the data signal to the bit line, hoping that the word line signal would drop as the control signal closes the transmission gate, so that the data signal would remain longer on the bit line. This also malfunctioned due to unpredictable factors that affect the delay time.

3. We finally took a conservative approach: we clock the control signals to generate a short pulse on the word line that always turns on the SRAM cell while the data on the bit line is valid. This scheme works well in our implementation. It is also quite tolerant of different delay paths: an error should occur only if the clocked control signal lags the data signal on the bit line by more than half a clock cycle. Note that the data signal also takes some time to appear on the bit line, so this worst case should be rare.

Our implementation works well with a 3ns cycle time; that is, a short pulse of around 1.5ns on the word line makes the system function correctly. However, with this approach we may not be able to achieve a higher clock rate, because clock skew may turn the short pulse on the word line into a spike. As long as we target a 3ns cycle time, this scheme is an acceptable solution.

To avoid long delays, we put larger transistors on every signal that drives many gates over long wires. We also need such large transistors to limit clock skew, since skew would undermine the basis of our design. Judging from the simulation results, our effort to reduce the delay time was quite effective.

We had trouble running the simulation for the whole cache system. SPICE reports non-converging nodes after the program has run for two or three hours. This also happened when we simulated the single cache line. We originally had a smaller SRAM cell that was validated by simulation. When we ran SPICE on the whole cache line, the circuit could not converge, so we had to give up our original SRAM cell design. The same problem has now appeared again as we move from a validated single line to the whole cache system.

5. CONCLUSIONS

We tried different implementations of the proposed cache system. We found that the delay on different paths can have a critical impact on the performance of a VLSI system, as well as on the correctness of its logic functions.

In this project, we tried two ways to reduce the undesirable delay. First, we made some driving signals stronger by enlarging the transistor size at the output. We found this scheme quite effective when a driving signal has a moderate load, such as internal logic operations on address bits (A3A2, etc.), control signals (mwr, pwr, hit), and the clock. On the other hand, for signals that are broadcast over long wires, the delay appears to be bounded by a lower limit (caused by the large load and the long wire) that cannot be further reduced by enlarging the transistor size. Examples of such signals are the word line and bit line signals. In our first approach to countering the different delay paths, we put more driving power on the word line and delayed the bit line signal, but the improvement was negligible. Our decision was to cut the long delay short by clocking the word line signals. This implementation functions well with the targeted 3ns clock cycle time.

The Magic file for the whole cache system is located in ~/jinfengl/ece251/project

The tasks in this project were distributed as follows:

Task                            | Assigned to
System design                   | both
SRAM cell design                | both
Address decoder                 | Yi Deng
Word line and bit line logic    | Jinfeng Liu
Tag comparator                  | Yi Deng
Component simulation            | Yi Deng
Read port layout                | Yi Deng
Write port layout               | Jinfeng Liu
Full layout for whole cache     | both
Simulation for single line      | Jinfeng Liu
Report                          | both

PICo Embedded High Speed Cache Design Project

PICo Embedded High Speed Cache Design Project PICo Embedded High Speed Cache Design Project TEAM LosTohmalesCalientes Chuhong Duan ECE 4332 Fall 2012 University of Virginia cd8dz@virginia.edu Andrew Tyler ECE 4332 Fall 2012 University of Virginia

More information

6. Latches and Memories

6. Latches and Memories 6 Latches and Memories This chapter . RS Latch The RS Latch, also called Set-Reset Flip Flop (SR FF), transforms a pulse into a continuous state. The RS latch can be made up of two interconnected

More information

Picture of memory. Word FFFFFFFD FFFFFFFE FFFFFFFF

Picture of memory. Word FFFFFFFD FFFFFFFE FFFFFFFF Memory Sequential circuits all depend upon the presence of memory A flip-flop can store one bit of information A register can store a single word, typically 32-64 bits Memory allows us to store even larger

More information

Prototype of SRAM by Sergey Kononov, et al.

Prototype of SRAM by Sergey Kononov, et al. Prototype of SRAM by Sergey Kononov, et al. 1. Project Overview The goal of the project is to create a SRAM memory layout that provides maximum utilization of the space on the 1.5 by 1.5 mm chip. Significant

More information

CPE300: Digital System Architecture and Design

CPE300: Digital System Architecture and Design CPE300: Digital System Architecture and Design Fall 2011 MW 17:30-18:45 CBC C316 Cache 11232011 http://www.egr.unlv.edu/~b1morris/cpe300/ 2 Outline Review Memory Components/Boards Two-Level Memory Hierarchy

More information

Overview. Memory Classification Read-Only Memory (ROM) Random Access Memory (RAM) Functional Behavior of RAM. Implementing Static RAM

Overview. Memory Classification Read-Only Memory (ROM) Random Access Memory (RAM) Functional Behavior of RAM. Implementing Static RAM Memories Overview Memory Classification Read-Only Memory (ROM) Types of ROM PROM, EPROM, E 2 PROM Flash ROMs (Compact Flash, Secure Digital, Memory Stick) Random Access Memory (RAM) Types of RAM Static

More information

Memory and Programmable Logic

Memory and Programmable Logic Memory and Programmable Logic Memory units allow us to store and/or retrieve information Essentially look-up tables Good for storing data, not for function implementation Programmable logic device (PLD),

More information

SIDDHARTH INSTITUTE OF ENGINEERING AND TECHNOLOGY :: PUTTUR (AUTONOMOUS) Siddharth Nagar, Narayanavanam Road QUESTION BANK UNIT I

SIDDHARTH INSTITUTE OF ENGINEERING AND TECHNOLOGY :: PUTTUR (AUTONOMOUS) Siddharth Nagar, Narayanavanam Road QUESTION BANK UNIT I SIDDHARTH INSTITUTE OF ENGINEERING AND TECHNOLOGY :: PUTTUR (AUTONOMOUS) Siddharth Nagar, Narayanavanam Road 517583 QUESTION BANK Subject with Code : DICD (16EC5703) Year & Sem: I-M.Tech & I-Sem Course

More information

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Organization Part II Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn,

More information

BUILDING BLOCKS OF A BASIC MICROPROCESSOR. Part 1 PowerPoint Format of Lecture 3 of Book

BUILDING BLOCKS OF A BASIC MICROPROCESSOR. Part 1 PowerPoint Format of Lecture 3 of Book BUILDING BLOCKS OF A BASIC MICROPROCESSOR Part PowerPoint Format of Lecture 3 of Book Decoder Tri-state device Full adder, full subtractor Arithmetic Logic Unit (ALU) Memories Example showing how to write

More information

! Memory. " RAM Memory. " Serial Access Memories. ! Cell size accounts for most of memory array size. ! 6T SRAM Cell. " Used in most commercial chips

! Memory.  RAM Memory.  Serial Access Memories. ! Cell size accounts for most of memory array size. ! 6T SRAM Cell.  Used in most commercial chips ESE 57: Digital Integrated Circuits and VLSI Fundamentals Lec : April 5, 8 Memory: Periphery circuits Today! Memory " RAM Memory " Architecture " Memory core " SRAM " DRAM " Periphery " Serial Access Memories

More information

ENGIN 112 Intro to Electrical and Computer Engineering

ENGIN 112 Intro to Electrical and Computer Engineering ENGIN 112 Intro to Electrical and Computer Engineering Lecture 30 Random Access Memory (RAM) Overview Memory is a collection of storage cells with associated input and output circuitry Possible to read

More information

A Comparative Study of Power Efficient SRAM Designs

A Comparative Study of Power Efficient SRAM Designs A Comparative tudy of Power Efficient RAM Designs Jeyran Hezavei, N. Vijaykrishnan, M. J. Irwin Pond Laboratory, Department of Computer cience & Engineering, Pennsylvania tate University {hezavei, vijay,

More information

COE758 Digital Systems Engineering

COE758 Digital Systems Engineering COE758 Digital Systems Engineering Project #1 Memory Hierarchy: Cache Controller Objectives To learn the functionality of a cache controller and its interaction with blockmemory (SRAM based) and SDRAM-controllers.

More information

CSEE 3827: Fundamentals of Computer Systems. Storage

CSEE 3827: Fundamentals of Computer Systems. Storage CSEE 387: Fundamentals of Computer Systems Storage The big picture General purpose processor (e.g., Power PC, Pentium, MIPS) Internet router (intrusion detection, pacet routing, etc.) WIreless transceiver

More information

Spiral 2-9. Tri-State Gates Memories DMA

Spiral 2-9. Tri-State Gates Memories DMA 2-9.1 Spiral 2-9 Tri-State Gates Memories DMA 2-9.2 Learning Outcomes I understand how a tri-state works and the rules for using them to share a bus I understand how SRAM and DRAM cells perform reads and

More information

EECS 150 Homework 7 Solutions Fall (a) 4.3 The functions for the 7 segment display decoder given in Section 4.3 are:

EECS 150 Homework 7 Solutions Fall (a) 4.3 The functions for the 7 segment display decoder given in Section 4.3 are: Problem 1: CLD2 Problems. (a) 4.3 The functions for the 7 segment display decoder given in Section 4.3 are: C 0 = A + BD + C + BD C 1 = A + CD + CD + B C 2 = A + B + C + D C 3 = BD + CD + BCD + BC C 4

More information

PROGRAMMABLE MODULES SPECIFICATION OF PROGRAMMABLE COMBINATIONAL AND SEQUENTIAL MODULES

PROGRAMMABLE MODULES SPECIFICATION OF PROGRAMMABLE COMBINATIONAL AND SEQUENTIAL MODULES PROGRAMMABLE MODULES SPECIFICATION OF PROGRAMMABLE COMBINATIONAL AND SEQUENTIAL MODULES. psa. rom. fpga THE WAY THE MODULES ARE PROGRAMMED NETWORKS OF PROGRAMMABLE MODULES EXAMPLES OF USES Programmable

More information

CS429: Computer Organization and Architecture

CS429: Computer Organization and Architecture CS429: Computer Organization and Architecture Dr. Bill Young Department of Computer Sciences University of Texas at Austin Last updated: January 2, 2018 at 11:23 CS429 Slideset 5: 1 Topics of this Slideset

More information

CENG 3420 Computer Organization and Design. Lecture 08: Cache Review. Bei Yu

CENG 3420 Computer Organization and Design. Lecture 08: Cache Review. Bei Yu CENG 3420 Computer Organization and Design Lecture 08: Cache Review Bei Yu CEG3420 L08.1 Spring 2016 A Typical Memory Hierarchy q Take advantage of the principle of locality to present the user with as

More information

Lecture 13: SRAM. Slides courtesy of Deming Chen. Slides based on the initial set from David Harris. 4th Ed.

Lecture 13: SRAM. Slides courtesy of Deming Chen. Slides based on the initial set from David Harris. 4th Ed. Lecture 13: SRAM Slides courtesy of Deming Chen Slides based on the initial set from David Harris CMOS VLSI Design Outline Memory Arrays SRAM Architecture SRAM Cell Decoders Column Circuitry Multiple Ports

More information

CS250 VLSI Systems Design Lecture 9: Memory

CS250 VLSI Systems Design Lecture 9: Memory CS250 VLSI Systems esign Lecture 9: Memory John Wawrzynek, Jonathan Bachrach, with Krste Asanovic, John Lazzaro and Rimas Avizienis (TA) UC Berkeley Fall 2012 CMOS Bistable Flip State 1 0 0 1 Cross-coupled

More information

Digital Systems Design with PLDs and FPGAs Kuruvilla Varghese Department of Electronic Systems Engineering Indian Institute of Science Bangalore

Digital Systems Design with PLDs and FPGAs Kuruvilla Varghese Department of Electronic Systems Engineering Indian Institute of Science Bangalore Digital Systems Design with PLDs and FPGAs Kuruvilla Varghese Department of Electronic Systems Engineering Indian Institute of Science Bangalore Lecture-32 Simple PLDs So welcome to just lecture on programmable

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 22: Memery, ROM

CMPEN 411 VLSI Digital Circuits Spring Lecture 22: Memery, ROM CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 22: Memery, ROM [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp11 CMPEN 411 L22 S.1

More information

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 6T- SRAM for Low Power Consumption Mrs. J.N.Ingole 1, Ms.P.A.Mirge 2 Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 PG Student [Digital Electronics], Dept. of ExTC, PRMIT&R,

More information

Dec Hex Bin ORG ; ZERO. Introduction To Computing

Dec Hex Bin ORG ; ZERO. Introduction To Computing Dec Hex Bin 0 0 00000000 ORG ; ZERO Introduction To Computing OBJECTIVES this chapter enables the student to: Convert any number from base 2, base 10, or base 16 to any of the other two bases. Add and

More information

Low-Power SRAM and ROM Memories

Low-Power SRAM and ROM Memories Low-Power SRAM and ROM Memories Jean-Marc Masgonty 1, Stefan Cserveny 1, Christian Piguet 1,2 1 CSEM, Neuchâtel, Switzerland 2 LAP-EPFL Lausanne, Switzerland Abstract. Memories are a main concern in low-power

More information

Random Access Memory (RAM)

Random Access Memory (RAM) Random Access Memory (RAM) EED2003 Digital Design Dr. Ahmet ÖZKURT Dr. Hakkı YALAZAN 1 Overview Memory is a collection of storage cells with associated input and output circuitry Possible to read and write

More information

ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems

ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Lec 26: November 9, 2018 Memory Overview Dynamic OR4! Precharge time?! Driving input " With R 0 /2 inverter! Driving inverter

More information

FPGA Programming Technology

FPGA Programming Technology FPGA Programming Technology Static RAM: This Xilinx SRAM configuration cell is constructed from two cross-coupled inverters and uses a standard CMOS process. The configuration cell drives the gates of

More information

Introduction to SRAM. Jasur Hanbaba
