CSEE W4824 Computer Architecture Fall 2012


CSEE W4824 Computer Architecture Fall 2012 Lecture 8 Memory Hierarchy Design: Memory Technologies and the Basics of Caches Luca Carloni Department of Computer Science Columbia University in the City of New York Announcements: Class Pre-Taping Wednesday 10/3 Lecture #8 Regular Class Monday 10/8 Lecture #9 (Pre-taped) Pre-taped this Wed 10/3 at 4:15pm in Mudd 1127 Wednesday 10/10 Lecture #10 Guest lecturer Regular Class Reason: Instructor is traveling to attend Embedded Systems Week 2012 Pre-taped lectures will be shown as videos from the class PC during regular class time in Mudd 535 Instructor's office hours are canceled for the week of October 8 CSEE 4824 Fall 2012 Lecture 8 Page 2

Announcement Homework #1 Results: Average score: / 35 Std. Deviation: 2.71 CSEE 4824 Fall 2012 Lecture 8 Page 3 The Processor-Memory Performance Gap (log scale) CPU speed assumes 25% improvement per year until 1986, 52% until 2000, 20% until 2005, and no change (on a per-core basis) until 2010 Memory baseline: 64KB DRAM in 1980, with 7% per year latency improvement Architects must attempt to work around this gap to minimize the memory bottleneck CSEE 4824 Fall 2012 Lecture 8 Page 4

How Many Memory References? A modern high-end multi-core processor (e.g. Intel Core i7) can generate two data memory references per core each clock cycle; with 4 cores and a 3.2 GHz clock rate, this leads to a peak of 25.6 billion 64-bit data-memory references per second, in addition to a peak of about 12.8 billion 128-bit instruction references. How to support a total peak bandwidth of 409.6 GB/sec!? The Memory Hierarchy: by multiporting and pipelining the caches, using multiple levels of caches, using separate first- and sometimes second-level caches per core, and by using a Harvard architecture for the first-level cache. In contrast, the peak bandwidth to DRAM main memory is only 6% of this (25 GB/sec). CSEE 4824 Fall 2012 Lecture 8 Page 5 Typical PC Organization Source: B. Jacob et al. Memory Systems CSEE 4824 Fall 2012 Lecture 8 Page 7
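The arithmetic behind these figures can be checked directly; a quick sketch using only the core counts, reference rates, and word widths quoted above:

```python
# Peak bandwidth demand in the Core i7 example above:
# 4 cores at 3.2 GHz, two 64-bit data references per core per cycle,
# plus one 128-bit instruction reference per core per cycle.
cores, clock_hz = 4, 3.2e9

data_refs_per_sec = cores * 2 * clock_hz     # 25.6 billion refs/sec
instr_refs_per_sec = cores * 1 * clock_hz    # 12.8 billion refs/sec

data_bytes = data_refs_per_sec * 8           # 64 bits = 8 bytes each
instr_bytes = instr_refs_per_sec * 16        # 128 bits = 16 bytes each

total_gb_per_sec = (data_bytes + instr_bytes) / 1e9
dram_fraction = 25 / total_gb_per_sec        # DRAM peak is 25 GB/sec

print(total_gb_per_sec)            # 409.6
print(round(dram_fraction * 100))  # 6 (percent)
```

Both halves of the demand happen to be equal (204.8 GB/sec each), and the 25 GB/sec DRAM peak is indeed about 6% of the total.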

DSP-Style Memory System: Example based on the TI TMS320C3x DSP family Source: B. Jacob et al. Memory Systems dual tag-less on-chip SRAMs (visible to the programmer) off-chip programmable ROM (or PROM or FLASH) that holds the executable image off-chip DRAM used for computation CSEE 4824 Fall 2012 Lecture 8 Page 8 Memory Technology At the core of the success of computers Various types of memory; the most common types: Dynamic Random-Access Memory (DRAM) Static Random-Access Memory (SRAM) Read-Only Memory (ROM) Flash Memory Memory Latency Metrics Access time: time between when a read is requested and when the desired word arrives Cycle time (≥ Access time): minimum time between two requests to memory; the memory needs the address lines to be stable between accesses CSEE 4824 Fall 2012 Lecture 8 Page 9

A 64M-bit DRAM: Logical Organization Highest memory cell density: only 1 transistor is used to store 1 bit; to prevent data loss, each bit must be refreshed periodically DRAM periodically accesses all bits in every row (refresh); about 5% of the time a DRAM is not available due to refreshing To limit package costs, address lines are multiplexed: e.g., first send a 14-bit row address (Row Access Strobe), then a 14-bit column address (Column Access Strobe) CSEE 4824 Fall 2012 Lecture 8 Page 11 Logical Organization of Wide Data-Out DRAMs In order to output more than one bit at a time, the DRAM is organized internally with multiple arrays, each providing one bit towards the aggregate output Wider-output DRAMs have appeared in the last two decades; DRAM parts with x16 and x32 data widths are now common, used primarily in high-performance applications CSEE 4824 Fall 2012 Lecture 8 Page 12 Source: B. Jacob et al. Memory Systems

DIMMs, Ranks, Banks, and Arrays A memory system may have many DIMMs, each of which may contain one or more ranks Each rank is a set of engaged DRAM devices, each of which may have many banks Each bank may have many constituent arrays, depending on the part's data width Source: B. Jacob et al. Memory Systems CSEE 4824 Fall 2012 Lecture 8 Page 13

DRAM Generations
Year of Introd. | Chip Size (bit) | $ per GB | Total Access Time to a new row/column | Total Access Time to existing row
1980 | 64K  | $1,500,000 | 250ns | 150ns
1983 | 256K | $500,000   | 185ns | 100ns
1985 | 1M   | $200,000   | 135ns | 40ns
1989 | 4M   | $50,000    | 110ns | 40ns
1992 | 16M  | $15,000    | 90ns  | 30ns
1996 | 64M  | $10,000    | 60ns  | 12ns
1998 | 128M | $4,000     | 60ns  | 10ns
2000 | 256M | $1,000     | 55ns  | 7ns
2004 | 512M | $250       | 50ns  | 5ns
2007 | 1G   | $50        | 40ns  | 1.25ns
CSEE 4824 Fall 2012 Lecture 8 Page 14

SRAMs An SRAM memory cell is bigger than a DRAM cell: typically 6 transistors per bit Better for low-power applications thanks to stand-by mode: only minimal power is necessary to retain the data in stand-by mode Access Time = Cycle Time Address lines are not multiplexed (for speed) In comparable technologies: SRAM has only 1/4-1/8 of DRAM capacity, SRAM cycle time is 8-16 times faster than DRAM, and SRAM cost-per-bit is 8-16 times more expensive than DRAM CSEE 4824 Fall 2012 Lecture 8 Page 15 ROM and Flash Memory ROM: programmed once and for all at manufacture time; cannot be rewritten by the microprocessor; 1 transistor per bit; good for storing code and data constants in embedded applications; replaces magnetic disks in providing nonvolatile storage and adds a level of protection for embedded software Flash Memories: floating-gate technology; read access time comparable to DRAMs, 50-100us depending on size (16M-128M); write is 10-100x slower than DRAMs (plus an erasing time of 1-2ms); price is cheaper than DRAM but more expensive than magnetic disks (Flash: $2/GB, DRAM: $40/GB, disk: $0.09/GB) Initially mostly used for low-power/embedded applications, but now also used as solid-state replacements for disks or as efficient intermediate storage between DRAM and disks CSEE 4824 Fall 2012 Lecture 8 Page 16

Flash Storage: Increasingly an Alternative to Magnetic Disks Nonvolatile like disks, but with smaller (100-1000x) latency; smaller, more power efficient, and more shock resistant; critical for mobile electronics High volumes lead to technology improvements; cost per GB is falling 50% per year: $2-4 per GB (in 2011), 2-40x higher than disk, 5-10x lower than DRAM Unlike DRAM, flash memory bits wear out: an on-chip controller is necessary to spread the writes by remapping blocks that have been written multiple times (wear leveling) Write limits are delaying the adoption in desktops/servers, but flash is now commonly used in laptops instead of hard disks to offer faster boot times, smaller size, and longer battery life CSEE 4824 Fall 2012 Lecture 8 Page 17 FLASH Storage Memories: Price Decrease and Relative Performance/Power Source: A. Leventhal, Flash Storage Memories CSEE 4824 Fall 2012 Lecture 8 Page 18

DRAM vs SDRAM vs DDR SDRAM Conventional DRAM: asynchronous interface to the memory controller; every transfer involves additional synchronization overhead Synchronous DRAM: added a clock signal so that repeated transfers would not bear that overhead; SDRAMs typically have a programmable register to hold the number of bytes requested, to send many bytes over several cycles per request Double Data Rate (DDR) SDRAM: doubles peak bandwidth by transferring data on both clock edges; to supply data at these high rates, DDR SDRAMs activate multiple banks internally CSEE 4824 Fall 2012 Lecture 8 Page 19

Clock Rate, Bandwidth, and Names of DDR DRAMs and DIMMs in 2010
Standard | Clock Rate (MHz) | Transfers (M/sec) | DRAM name | MB/sec/DIMM | DIMM name
DDR  | 133       | 266       | DDR266    | 2128        | PC2100
DDR  | 150       | 300       | DDR300    | 2400        | PC2400
DDR  | 200       | 400       | DDR400    | 3200        | PC3200
DDR2 | 266       | 533       | DDR2-533  | 4264        | PC4300
DDR2 | 333       | 667       | DDR2-667  | 5336        | PC5300
DDR2 | 400       | 800       | DDR2-800  | 6400        | PC6400
DDR3 | 533       | 1066      | DDR3-1066 | 8528        | PC8500
DDR3 | 666       | 1333      | DDR3-1333 | 10664       | PC10700
DDR3 | 800       | 1600      | DDR3-1600 | 12800       | PC12800
DDR4 | 1066-1600 | 2133-3200 | DDR4-3200 | 17056-25600 | PC25600
CSEE 4824 Fall 2012 Lecture 8 Page 20
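The MB/sec/DIMM column follows directly from the clock rate: DDR transfers data on both clock edges, and a DIMM is 64 bits (8 bytes) wide. A small sketch of that naming arithmetic:

```python
# DDR naming arithmetic: 2 transfers per clock (both edges), 8 bytes per
# transfer on a 64-bit DIMM, so MB/sec/DIMM = clock_MHz * 2 * 8.
def dimm_bandwidth_mb_s(clock_mhz):
    return clock_mhz * 2 * 8

print(dimm_bandwidth_mb_s(133))   # 2128 -> DDR266, sold as "PC2100"
print(dimm_bandwidth_mb_s(200))   # 3200 -> DDR400, "PC3200"
print(dimm_bandwidth_mb_s(800))   # 12800 -> DDR3-1600, "PC12800"
```

The DRAM name carries the transfers-per-second figure, while the DIMM name carries the (sometimes rounded) MB/sec figure.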

Giving the Illusion of Unlimited, Fast Memory: Exploiting the Memory Hierarchy
Principle of Locality; smaller HW is typically faster; all data in one level are usually found also in the level below
Technology: SRAM, DRAM, Magnetic Disk
2008 Cost ($/GB): $2000-$5000 (SRAM), $20-$75 (DRAM), $0.2-$2 (disk)
Typical size: GBs (DRAM), 4-16 TB (disk)
Access time: 0.5-2.5ns (SRAM), 50-70ns (DRAM), 5-10ms (disk)
Bandwidth (down the hierarchy): 20-100 GB/sec, 5-10 GB/sec, 1-5 GB/sec, MB/sec
Energy per Access: 1nJ (SRAM), 1-100nJ per device (DRAM), 100-1000mJ (disk)
Managed by: compiler (registers), hardware (caches), operating system (main memory), operating system/operator (disk)
Backed by: cache, main memory, disk, CD or tape, respectively
CSEE 4824 Fall 2012 Lecture 8 Page 21
Typical Memory Hierarchies: Servers vs. Personal Mobile Devices CSEE 4824 Fall 2012 Lecture 8 Page 22

Review: Principle of Locality Temporal Locality: a resource that is referenced at one point in time will be referenced again sometime in the near future Spatial Locality: the likelihood of referencing a resource is higher if a resource near it was just referenced 90/10 Locality Rule of Thumb: a program spends 90% of its execution time in only 10% of its code; this is a consequence of how we program and how we store data in memory; hence, it is possible to predict with reasonable accuracy what instructions and data a program will use in the near future based on its accesses in the recent past CSEE 4824 Fall 2012 Lecture 8 Page 23 Cache Concepts The term Cache: the first (from the CPU) level of the memory hierarchy; often used to refer to any buffering technique exploiting the principle of locality A cache directly exploits temporal locality, providing faster access to a smaller subset of the main memory which contains a copy of recently used data But not all data in the cache are necessarily data that are spatially close in the main memory; still, when a cache miss occurs, a fixed-size block of contiguous memory cells is retrieved from the main memory, based on the principle of spatial locality CSEE 4824 Fall 2012 Lecture 8 Page 24

Cache Concepts cont. Cache Hit: the CPU finds the requested data item in the cache Cache Miss: the CPU doesn't find the requested data item in the cache Miss Penalty: the time to replace a block in the cache (plus the time to deliver the data item to the CPU); this time depends on both latency & bandwidth: latency determines the time to retrieve the first word, bandwidth determines the time to retrieve the rest of the block; handled by hardware that stalls the memory unit (and, therefore, the whole instruction processing in the case of a simple single-issue processor) CSEE 4824 Fall 2012 Lecture 8 Page 25 Cache : Main Memory = Main Memory : Disk Virtual Memory makes it possible to increase the amount of memory that a program can use by temporarily storing some objects on disk; the program address space is divided in pages (fixed-size blocks) which reside either in cache/main memory or on disk; a better way to organize the address space across programs; a protection scheme is necessary to control page access When the CPU references an item within a page that is not present in cache/main memory, a page fault occurs and the entire page is moved from the disk to main memory; page faults have a long penalty time and are handled in SW without stalling the CPU, which switches to other tasks CSEE 4824 Fall 2012 Lecture 8 Page 26

Caching the Address Space Programs today are written to run on no particular HW configuration Processes execute in imaginary address spaces that are mapped onto the memory system (including DRAM and disk) by the OS Every HW memory structure between the CPU and the permanent store is a cache for instructions & data in the process's address space Source: B. Jacob et al. Memory Systems CSEE 4824 Fall 2012 Lecture 8 Page 27 Cache Schemes: Placing a Memory Block into a Cache Block Frame Block: unit of memory transferred across hierarchy levels Set: a group of blocks The range of caches is really a continuum of levels of set associativity: 1-way set associative (direct mapped), 2-way set associative, ..., 8-way set associative, up to fully associative Modern processors: direct-mapped, 2-way, and 4-way set associative caches Modern memories: millions of blocks Modern caches: thousands of block frames Set Index = (Block Address) MOD (Number of Sets in Cache) CSEE 4824 Fall 2012 Lecture 8 Page 28
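The set-index formula above covers the whole continuum: a direct-mapped cache is the special case where each set holds one block, and a fully associative cache is the case of a single set. A minimal sketch (the 8-frame cache and block address 26 below are illustrative choices, not from the slide):

```python
# Set Index = (Block Address) MOD (Number of Sets in Cache),
# where n_sets = n_frames / associativity.
def set_index(block_addr, n_frames, assoc):
    n_sets = n_frames // assoc   # assoc = 1: direct mapped; assoc = n_frames: fully assoc.
    return block_addr % n_sets

print(set_index(26, 8, 1))   # 2 -> direct-mapped: 26 mod 8
print(set_index(26, 8, 2))   # 2 -> 2-way: 26 mod 4
print(set_index(26, 8, 8))   # 0 -> fully associative: only one set
```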

Example: Direct Mapped Cache with 8 Block Frames Each memory block is mapped to one cache entry: cache index = (block address) mod (# of cache blocks); e.g., with 8 blocks, the 3 low-order address bits are sufficient (log2(8) = 3) Is a block present in the cache? We must check the cache block tag: the upper bits of the block address Block offset addresses bytes within a block (here block == word, so the offset is the 2-bit byte offset) How do we know if the data in a block is valid? Add a valid bit to each entry The tag-index boundary moves to the right as we increase associativity (there is no index field in fully associative caches) CSEE 4824 Fall 2012 Lecture 8 Page 29 Example: Direct Mapped Cache with 1024 Block Frames and Block Size of 1 Word for MIPS-32 Block Offset is just a byte offset because each block of this cache contains 1 word Byte Offset: the 2 least significant bits, because in MIPS-32 memory words are aligned to multiples of 4 bytes Block Index: the next 10 low-order address bits, because this cache has 1024 block frames Block Tag: the remaining 20 address bits, used to check that the address of the requested word matches the cache entry The Index is for addressing; the Tag is for checking/searching CSEE 4824 Fall 2012 Lecture 8 Page 30
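The field split described above (2-bit byte offset, 10-bit index, 20-bit tag) can be expressed directly with shifts and masks; the sample address below is made up for illustration:

```python
# Split a 32-bit address into (tag, index, byte offset) for a direct-mapped
# cache with 1024 one-word block frames: 20-bit tag, 10-bit index, 2-bit offset.
def split_address(addr, index_bits=10, byte_offset_bits=2):
    byte_offset = addr & ((1 << byte_offset_bits) - 1)
    index = (addr >> byte_offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (byte_offset_bits + index_bits)
    return tag, index, byte_offset

# A made-up word-aligned address with tag 0xABCDE, index 5, byte offset 0:
addr = (0xABCDE << 12) | (5 << 2)
print(split_address(addr))   # (703710, 5, 0), i.e. tag 0xABCDE
```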

Example: 16KB Direct Mapped Cache with 256 Block Frames (of 16 Words Each) 18-bit tag; a single tag comparator is needed CSEE 4824 Fall 2012 Lecture 8 Page 31
Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 N
001 N
010 N
011 N
100 N
101 N
110 N
111 N
Assumptions: 8 block frames, block size = 1 word, main memory of 32 words; in this toy example we consider ten subsequent accesses to memory CSEE 4824 Fall 2012 Lecture 8 Page 32

Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 N
001 N
010 N
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 1: memory address 10110 (22 in decimal), cache event: miss
CSEE 4824 Fall 2012 Lecture 8 Page 33
Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 N
001 N
010 Y 11 Mem[11010]
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 2: memory address 11010 (26 in decimal), cache event: miss
CSEE 4824 Fall 2012 Lecture 8 Page 34

Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 N
001 N
010 Y 11 Mem[11010]
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 3: memory address 10110 (22 in decimal), cache event: hit
CSEE 4824 Fall 2012 Lecture 8 Page 35
Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 N
001 N
010 Y 11 Mem[11010]
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 4: memory address 11010 (26 in decimal), cache event: hit
CSEE 4824 Fall 2012 Lecture 8 Page 36

Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 Y 10 Mem[10000]
001 N
010 Y 11 Mem[11010]
011 N
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 5: memory address 10000 (16 in decimal), cache event: miss
CSEE 4824 Fall 2012 Lecture 8 Page 37
Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 Y 10 Mem[10000]
001 N
010 Y 11 Mem[11010]
011 Y 00 Mem[00011]
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 6: memory address 00011 (3 in decimal), cache event: miss
CSEE 4824 Fall 2012 Lecture 8 Page 38

Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 Y 10 Mem[10000]
001 N
010 Y 11 Mem[11010]
011 Y 00 Mem[00011]
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 7: memory address 10000 (16 in decimal), cache event: hit
CSEE 4824 Fall 2012 Lecture 8 Page 39
Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 Y 10 Mem[10000]
001 N
010 Y 10 Mem[10010]
011 Y 00 Mem[00011]
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 8: memory address 10010 (18 in decimal), cache event: miss (replaces Mem[11010] at index 010)
CSEE 4824 Fall 2012 Lecture 8 Page 40

Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 Y 10 Mem[10000]
001 N
010 Y 11 Mem[11010]
011 Y 00 Mem[00011]
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 9: memory address 11010 (26 in decimal), cache event: miss (replaces Mem[10010] at index 010)
CSEE 4824 Fall 2012 Lecture 8 Page 41
Example: Accessing a Direct Mapped Cache with 8 Blocks and Block Size of 1 Word
Index V Tag Data
000 Y 10 Mem[10000]
001 N
010 Y 11 Mem[11010]
011 Y 00 Mem[00011]
100 N
101 N
110 Y 10 Mem[10110]
111 N
Cycle 10: memory address 11010 (26 in decimal), cache event: hit
CSEE 4824 Fall 2012 Lecture 8 Page 42
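The whole ten-access trace can be replayed with a minimal direct-mapped cache model. A sketch, using the assumptions of the example (8 one-word block frames, 5-bit word addresses):

```python
# Minimal direct-mapped cache: index = addr mod 8, tag = addr div 8.
def simulate(addresses, n_frames=8):
    frames = {}          # index -> tag of the block currently in that frame
    events = []
    for addr in addresses:
        index, tag = addr % n_frames, addr // n_frames
        if frames.get(index) == tag:
            events.append("hit")
        else:
            frames[index] = tag      # miss: fill (or replace) the frame
            events.append("miss")
    return events

trace = [22, 26, 22, 26, 16, 3, 16, 18, 26, 26]
print(simulate(trace))
# ['miss', 'miss', 'hit', 'hit', 'miss', 'miss', 'hit', 'miss', 'miss', 'hit']
```

Note how 18 and 26 both map to index 010 and evict each other (cycles 8 and 9): a conflict that a 2-way set associative cache would avoid.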

Example: Measuring Cache Size How many total bits are required for a direct-mapped cache with 16KB of data and 4-word block frames, assuming a 32-bit address? 16KB of data = 4K words = 2^12 words Block size of 4 (= 2^2) words, hence 2^10 blocks Address breakdown: TAG 18 bits, INDEX 10 bits, OFFSET 2 + 2 bits # Bits in a Tag = 32 - (10 + 2 + 2) = 18 # Bits in a block = # Tag Bits + # Data Bits + Valid bit = 18 + (4 * 32) + 1 = 147 Cache Size = # Blocks x # Bits in a block = 2^10 x 147 = 147 Kbits Cache Overhead = 147 Kbits / 16KB = 147 / 128 = 1.15 CSEE 4824 Fall 2012 Lecture 8 Page 43 Performance Metrics for Caches Miss Rate (misses per memory reference): the fraction of cache accesses that result in a miss Misses Per Instruction: often reported as misses per 1000 instructions; for speculative processors we only count the instructions that commit Misses Per Instruction = Miss Rate x (Memory Accesses / Instruction Count) Miss Penalty: the additional clock cycles necessary to retrieve the block with the missing word from the main memory CSEE 4824 Fall 2012 Lecture 8 Page 44
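The slide's size calculation, redone step by step as a sketch:

```python
# 16 KB of data, 4-word blocks, 32-bit addresses, direct-mapped.
data_words = 16 * 1024 // 4            # 4096 words of data (2**12)
block_words = 4                        # 4-word block frames
n_frames = data_words // block_words   # 1024 frames (2**10)

index_bits = 10                        # log2(1024)
offset_bits = 2 + 2                    # 2-bit block offset + 2-bit byte offset
tag_bits = 32 - index_bits - offset_bits           # 18

bits_per_frame = tag_bits + block_words * 32 + 1   # tag + data + valid = 147
total_kbits = n_frames * bits_per_frame // 1024    # 147 Kbits
print(tag_bits, bits_per_frame, total_kbits)       # 18 147 147
```

The overhead factor follows: 147 Kbits of storage for 128 Kbits (16 KB) of data, i.e. about 1.15x.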

Performance Metrics for Caches - continued Average Memory Access Time (AMAT): AMAT = Hit time + Miss rate x Miss penalty AMAT is a better estimate of cache performance, but still not a substitute for execution time Impact on CPU Time, including hit clock cycles in the CPU execution clock cycles: CPU Time = (CPU execution cycles + memory stall cycles) x CCT CSEE 4824 Fall 2012 Lecture 8 Page 45 Performance Metrics for Caches - continued Impact on CPU Time, including hit clock cycles in the CPU execution clock cycles and breaking down the memory stall cycles: CPU Time = IC x (CPI_exec + miss rate x memory accesses per instruction x miss penalty) x CCT The lower the CPI, the higher the relative impact of a fixed number of cache miss clock cycles; the faster the CPU (i.e. the lower the CCT), the higher the number of clock cycles per miss CSEE 4824 Fall 2012 Lecture 8 Page 46
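A small sketch of the AMAT formula; the hit time, miss rate, and miss penalty below are illustrative values (stated in clock cycles), not prescribed by the formula itself:

```python
# AMAT = Hit time + Miss rate x Miss penalty
def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

# 1-cycle hit, 2% miss rate, 200-cycle miss penalty:
print(amat(1, 0.02, 200))   # 5.0 cycles on average
```

Even a 2% miss rate quintuples the average access time here, which is why miss rate and miss penalty dominate cache design.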

Example: The Impact of Cache on Performance Assumptions: CPI_exec = 1 clock cycle (ignoring memory stalls) Miss rate = 2% Miss penalty = 200 clock cycles Average memory references per instruction = 1.5 (CPI)no_cache = 1 + 1.5 x 200 = 301 (CPI)with_cache = 1 + (1.5 x 0.02 x 200) = 7 The impact of the cache on CPU Time is greater: the lower the CPI of the other instructions, for a fixed number of cache miss clock cycles; and the lower the clock cycle time of the CPU, because the CPU then spends a larger number of clock cycles per miss (i.e. a higher memory portion of the CPI) CSEE 4824 Fall 2012 Lecture 8 Page 47 Assigned Readings Computer Architecture: A Quantitative Approach by John Hennessy (Stanford University) and Dave Patterson (UC Berkeley), Fifth Edition, Morgan Kaufmann (Elsevier): Sections 2.1 and 2.3, Appendix B.1 For review purposes: see Chapter 7 of Hennessy & Patterson's Computer Organization & Design book Assigned paper: A. Leventhal, Flash Storage Memories CSEE 4824 Fall 2012 Lecture 8 Page 48
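Both CPI figures can be recomputed from the stated assumptions; a quick sketch:

```python
# Slide assumptions: CPI_exec = 1, 2% miss rate, 200-cycle miss penalty,
# 1.5 memory references per instruction.
cpi_exec, miss_rate, penalty, refs_per_instr = 1, 0.02, 200, 1.5

# Without a cache, every memory reference pays the full DRAM penalty:
cpi_no_cache = cpi_exec + refs_per_instr * penalty            # 301

# With a cache, only the misses pay it:
cpi_with_cache = cpi_exec + refs_per_instr * miss_rate * penalty   # 7

print(cpi_no_cache, cpi_with_cache)
```

So the cache buys a ~43x speedup here, yet memory stalls still account for 6 of the 7 cycles per instruction.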


More information

Computer Architecture. Memory Hierarchy. Lynn Choi Korea University

Computer Architecture. Memory Hierarchy. Lynn Choi Korea University Computer Architecture Memory Hierarchy Lynn Choi Korea University Memory Hierarchy Motivated by Principles of Locality Speed vs. Size vs. Cost tradeoff Locality principle Temporal Locality: reference to

More information

Course Administration

Course Administration Spring 207 EE 363: Computer Organization Chapter 5: Large and Fast: Exploiting Memory Hierarchy - Avinash Kodi Department of Electrical Engineering & Computer Science Ohio University, Athens, Ohio 4570

More information

Chapter Seven. Large & Fast: Exploring Memory Hierarchy

Chapter Seven. Large & Fast: Exploring Memory Hierarchy Chapter Seven Large & Fast: Exploring Memory Hierarchy 1 Memories: Review SRAM (Static Random Access Memory): value is stored on a pair of inverting gates very fast but takes up more space than DRAM DRAM

More information

Mainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation

Mainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation Mainstream Computer System Components CPU Core 2 GHz - 3.0 GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation One core or multi-core (2-4) per chip Multiple FP, integer

More information

Computer System Components

Computer System Components Computer System Components CPU Core 1 GHz - 3.2 GHz 4-way Superscaler RISC or RISC-core (x86): Deep Instruction Pipelines Dynamic scheduling Multiple FP, integer FUs Dynamic branch prediction Hardware

More information

Chapter 7 Large and Fast: Exploiting Memory Hierarchy. Memory Hierarchy. Locality. Memories: Review

Chapter 7 Large and Fast: Exploiting Memory Hierarchy. Memory Hierarchy. Locality. Memories: Review Memories: Review Chapter 7 Large and Fast: Exploiting Hierarchy DRAM (Dynamic Random Access ): value is stored as a charge on capacitor that must be periodically refreshed, which is why it is called dynamic

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COEN-4710 Computer Hardware Lecture 7 Large and Fast: Exploiting Memory Hierarchy (Chapter 5) Cristinel Ababei Marquette University Department

More information

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip Reducing Hit Times Critical Influence on cycle-time or CPI Keep L1 small and simple small is always faster and can be put on chip interesting compromise is to keep the tags on chip and the block data off

More information

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches CS 61C: Great Ideas in Computer Architecture The Memory Hierarchy, Fully Associative Caches Instructor: Alan Christopher 7/09/2014 Summer 2014 -- Lecture #10 1 Review of Last Lecture Floating point (single

More information

Chapter 7-1. Large and Fast: Exploiting Memory Hierarchy (part I: cache) 臺大電機系吳安宇教授. V1 11/24/2004 V2 12/01/2004 V3 12/08/2004 (minor)

Chapter 7-1. Large and Fast: Exploiting Memory Hierarchy (part I: cache) 臺大電機系吳安宇教授. V1 11/24/2004 V2 12/01/2004 V3 12/08/2004 (minor) Chapter 7-1 Large and Fast: Exploiting Memory Hierarchy (part I: cache) 臺大電機系吳安宇教授 V1 11/24/2004 V2 12/01/2004 V3 12/08/2004 (minor) 臺大電機吳安宇教授 - 計算機結構 1 Outline 7.1 Introduction 7.2 The Basics of Caches

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Review: Major Components of a Computer Processor Devices Control Memory Input Datapath Output Secondary Memory (Disk) Main Memory Cache Performance

More information

Memory technology and optimizations ( 2.3) Main Memory

Memory technology and optimizations ( 2.3) Main Memory Memory technology and optimizations ( 2.3) 47 Main Memory Performance of Main Memory: Latency: affects Cache Miss Penalty» Access Time: time between request and word arrival» Cycle Time: minimum time between

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more

More information

Mainstream Computer System Components

Mainstream Computer System Components Mainstream Computer System Components Double Date Rate (DDR) SDRAM One channel = 8 bytes = 64 bits wide Current DDR3 SDRAM Example: PC3-12800 (DDR3-1600) 200 MHz (internal base chip clock) 8-way interleaved

More information

The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350):

The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350): The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350): Motivation for The Memory Hierarchy: { CPU/Memory Performance Gap The Principle Of Locality Cache $$$$$ Cache Basics:

More information

Memory systems. Memory technology. Memory technology Memory hierarchy Virtual memory

Memory systems. Memory technology. Memory technology Memory hierarchy Virtual memory Memory systems Memory technology Memory hierarchy Virtual memory Memory technology DRAM Dynamic Random Access Memory bits are represented by an electric charge in a small capacitor charge leaks away, need

More information

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1)

Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) Department of Electr rical Eng ineering, Chapter 5 Large and Fast: Exploiting Memory Hierarchy (Part 1) 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Depar rtment of Electr rical Engineering,

More information

Memory Hierarchy and Caches

Memory Hierarchy and Caches Memory Hierarchy and Caches COE 301 / ICS 233 Computer Organization Dr. Muhamed Mudawar College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals Presentation Outline

More information

CENG4480 Lecture 09: Memory 1

CENG4480 Lecture 09: Memory 1 CENG4480 Lecture 09: Memory 1 Bei Yu byu@cse.cuhk.edu.hk (Latest update: November 8, 2017) Fall 2017 1 / 37 Overview Introduction Memory Principle Random Access Memory (RAM) Non-Volatile Memory Conclusion

More information

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB Memory Technology Caches 1 Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per GB Ideal memory Average access time similar

More information

14:332:331. Week 13 Basics of Cache

14:332:331. Week 13 Basics of Cache 14:332:331 Computer Architecture and Assembly Language Fall 2003 Week 13 Basics of Cache [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 Lec20.1 Fall 2003 Head

More information

The levels of a memory hierarchy. Main. Memory. 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms

The levels of a memory hierarchy. Main. Memory. 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms The levels of a memory hierarchy CPU registers C A C H E Memory bus Main Memory I/O bus External memory 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms 1 1 Some useful definitions When the CPU finds a requested

More information

Advanced Memory Organizations

Advanced Memory Organizations CSE 3421: Introduction to Computer Architecture Advanced Memory Organizations Study: 5.1, 5.2, 5.3, 5.4 (only parts) Gojko Babić 03-29-2018 1 Growth in Performance of DRAM & CPU Huge mismatch between CPU

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

Lecture-14 (Memory Hierarchy) CS422-Spring

Lecture-14 (Memory Hierarchy) CS422-Spring Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect

More information

LECTURE 11. Memory Hierarchy

LECTURE 11. Memory Hierarchy LECTURE 11 Memory Hierarchy MEMORY HIERARCHY When it comes to memory, there are two universally desirable properties: Large Size: ideally, we want to never have to worry about running out of memory. Speed

More information

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy

Chapter 5B. Large and Fast: Exploiting Memory Hierarchy Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 16

More information

Textbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, Textbook web site:

Textbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, Textbook web site: Textbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, 2003 Textbook web site: www.vrtechnology.org 1 Textbook web site: www.vrtechnology.org Laboratory Hardware 2 Topics 14:332:331

More information

Contents Slide Set 9. Final Notes on Textbook Chapter 7. Outline of Slide Set 9. More about skipped sections in Chapter 7. Outline of Slide Set 9

Contents Slide Set 9. Final Notes on Textbook Chapter 7. Outline of Slide Set 9. More about skipped sections in Chapter 7. Outline of Slide Set 9 slide 2/41 Contents Slide Set 9 for ENCM 369 Winter 2014 Lecture Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 2014

More information

CS252 S05. Main memory management. Memory hardware. The scale of things. Memory hardware (cont.) Bottleneck

CS252 S05. Main memory management. Memory hardware. The scale of things. Memory hardware (cont.) Bottleneck Main memory management CMSC 411 Computer Systems Architecture Lecture 16 Memory Hierarchy 3 (Main Memory & Memory) Questions: How big should main memory be? How to handle reads and writes? How to find

More information

Caches. Han Wang CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)

Caches. Han Wang CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes) Caches Han Wang CS 3410, Spring 2012 Computer Science Cornell University See P&H 5.1, 5.2 (except writes) This week: Announcements PA2 Work-in-progress submission Next six weeks: Two labs and two projects

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance

More information

Locality. Cache. Direct Mapped Cache. Direct Mapped Cache

Locality. Cache. Direct Mapped Cache. Direct Mapped Cache Locality A principle that makes having a memory hierarchy a good idea If an item is referenced, temporal locality: it will tend to be referenced again soon spatial locality: nearby items will tend to be

More information

Lecture 18: Memory Systems. Spring 2018 Jason Tang

Lecture 18: Memory Systems. Spring 2018 Jason Tang Lecture 18: Memory Systems Spring 2018 Jason Tang 1 Topics Memory hierarchy Memory operations Cache basics 2 Computer Organization Computer Processor Memory Devices Control Datapath Input Output So far,

More information

CPU issues address (and data for write) Memory returns data (or acknowledgment for write)

CPU issues address (and data for write) Memory returns data (or acknowledgment for write) The Main Memory Unit CPU and memory unit interface Address Data Control CPU Memory CPU issues address (and data for write) Memory returns data (or acknowledgment for write) Memories: Design Objectives

More information

Chapter Seven Morgan Kaufmann Publishers

Chapter Seven Morgan Kaufmann Publishers Chapter Seven Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored as a charge on capacitor (must be

More information

Contents. Memory System Overview Cache Memory. Internal Memory. Virtual Memory. Memory Hierarchy. Registers In CPU Internal or Main memory

Contents. Memory System Overview Cache Memory. Internal Memory. Virtual Memory. Memory Hierarchy. Registers In CPU Internal or Main memory Memory Hierarchy Contents Memory System Overview Cache Memory Internal Memory External Memory Virtual Memory Memory Hierarchy Registers In CPU Internal or Main memory Cache RAM External memory Backing

More information

Memory Hierarchy. Slides contents from:

Memory Hierarchy. Slides contents from: Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory

More information

CS 152 Computer Architecture and Engineering. Lecture 6 - Memory

CS 152 Computer Architecture and Engineering. Lecture 6 - Memory CS 152 Computer Architecture and Engineering Lecture 6 - Memory Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste! http://inst.eecs.berkeley.edu/~cs152!

More information

The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs.

The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. The Hierarchical Memory System The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. SRAM The Motivation for The Memory Hierarchy:

More information

EE 4683/5683: COMPUTER ARCHITECTURE

EE 4683/5683: COMPUTER ARCHITECTURE EE 4683/5683: COMPUTER ARCHITECTURE Lecture 6A: Cache Design Avinash Kodi, kodi@ohioedu Agenda 2 Review: Memory Hierarchy Review: Cache Organization Direct-mapped Set- Associative Fully-Associative 1 Major

More information

Memory Hierarchy. ENG3380 Computer Organization and Architecture Cache Memory Part II. Topics. References. Memory Hierarchy

Memory Hierarchy. ENG3380 Computer Organization and Architecture Cache Memory Part II. Topics. References. Memory Hierarchy ENG338 Computer Organization and Architecture Part II Winter 217 S. Areibi School of Engineering University of Guelph Hierarchy Topics Hierarchy Locality Motivation Principles Elements of Design: Addresses

More information

Question?! Processor comparison!

Question?! Processor comparison! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 5.1-5.2!! (Over the next 2 lectures)! Lecture 18" Introduction to Memory Hierarchies! 3! Processor components! Multicore processors and programming! Question?!

More information

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies

Let!s go back to a course goal... Let!s go back to a course goal... Question? Lecture 22 Introduction to Memory Hierarchies 1 Lecture 22 Introduction to Memory Hierarchies Let!s go back to a course goal... At the end of the semester, you should be able to......describe the fundamental components required in a single core of

More information

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved.

Internal Memory. Computer Architecture. Outline. Memory Hierarchy. Semiconductor Memory Types. Copyright 2000 N. AYDIN. All rights reserved. Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com Internal Memory http://www.yildiz.edu.tr/~naydin 1 2 Outline Semiconductor main memory Random Access Memory

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/ Components of a Computer Processor

More information

LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY

LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY Abridged version of Patterson & Hennessy (2013):Ch.5 Principle of Locality Programs access a small proportion of their address space at any time Temporal

More information

EN1640: Design of Computing Systems Topic 06: Memory System

EN1640: Design of Computing Systems Topic 06: Memory System EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Slide Set 8. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

Slide Set 8. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng Slide Set 8 for ENCM 369 Winter 2018 Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary March 2018 ENCM 369 Winter 2018 Section 01

More information

CISC 662 Graduate Computer Architecture Lecture 16 - Cache and virtual memory review

CISC 662 Graduate Computer Architecture Lecture 16 - Cache and virtual memory review CISC 662 Graduate Computer Architecture Lecture 6 - Cache and virtual memory review Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David

More information

Chapter 5 Internal Memory

Chapter 5 Internal Memory Chapter 5 Internal Memory Memory Type Category Erasure Write Mechanism Volatility Random-access memory (RAM) Read-write memory Electrically, byte-level Electrically Volatile Read-only memory (ROM) Read-only

More information

Caches and Memory Hierarchy: Review. UCSB CS240A, Winter 2016

Caches and Memory Hierarchy: Review. UCSB CS240A, Winter 2016 Caches and Memory Hierarchy: Review UCSB CS240A, Winter 2016 1 Motivation Most applications in a single processor runs at only 10-20% of the processor peak Most of the single processor performance loss

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University Large and Fast: Exploiting Memory Hierarchy The Basic of Caches Measuring & Improving Cache Performance Virtual Memory A Common

More information

CENG 3420 Computer Organization and Design. Lecture 08: Memory - I. Bei Yu

CENG 3420 Computer Organization and Design. Lecture 08: Memory - I. Bei Yu CENG 3420 Computer Organization and Design Lecture 08: Memory - I Bei Yu CEG3420 L08.1 Spring 2016 Outline q Why Memory Hierarchy q How Memory Hierarchy? SRAM (Cache) & DRAM (main memory) Memory System

More information

Announcement. Computer Architecture (CSC-3501) Lecture 20 (08 April 2008) Chapter 6 Objectives. 6.1 Introduction. 6.

Announcement. Computer Architecture (CSC-3501) Lecture 20 (08 April 2008) Chapter 6 Objectives. 6.1 Introduction. 6. Announcement Computer Architecture (CSC-350) Lecture 0 (08 April 008) Seung-Jong Park (Jay) http://www.csc.lsu.edu/~sjpark Chapter 6 Objectives 6. Introduction Master the concepts of hierarchical memory

More information

MEMORY. Objectives. L10 Memory

MEMORY. Objectives. L10 Memory MEMORY Reading: Chapter 6, except cache implementation details (6.4.1-6.4.6) and segmentation (6.5.5) https://en.wikipedia.org/wiki/probability 2 Objectives Understand the concepts and terminology of hierarchical

More information