Fundamentals of Computer Systems
|
|
- Tyler McCormick
- 5 years ago
- Views:
Transcription
1 Fundamentals of Computer Systems Caches Martha A. Kim Columbia University Fall 215 Illustrations Copyright 27 Elsevier 1 / 23
2 Computer Systems Performance depends on which is slowest: the processor or the memory system CLK Processor MemWrite Address Write WE Memory Read 2 / 23
3 Memory Speeds Haven t Kept Up 1, 1, Performance 1 1 CPU Year Memory Our single-cycle memory assumption has been wrong since 198. Hennessy and Patterson. Computer Architecture: A Quantitative Approach. 3rd ed., Morgan Kaufmann, / 23
4 Your Choice of Memories Fast Cheap Large On-Chip SRAM Commodity DRAM Supercomputer 4 / 23
5 Memory Hierarchy An essential trick that makes big memory appear fast Technology Cost Access Time Density ($/Gb) (ns) (Gb/cm2) SRAM DRAM Flash Hard Disk Read speed; writing much, much slower 5 / 23
6 A Modern Memory Hierarchy A desktop machine: AMD Phenom 96 Quad-core 2.3 GHz V 95 W 65 nm Level Size Tech. L1 Instruction 64 K SRAM L1 64 K SRAM L2 512 K SRAM L3 2 MB SRAM Memory 4 GB DRAM Disk 5 GB Magnetic per core 6 / 23
7 A Simple Memory Hierarchy???? First level: small, fast storage (typically SRAM) H i p h i p h o o r a y! \ h i p \ Last level: large, slow storage (typically DRAM) Can fit a subset of lower level in upper level, but which subset? 7 / 23
8 Locality Example: Substring Matching H i p h i p h o o r a y! \ h i p \ Addresses accessed over time aka reference stream h H h i h p h h h i i p p h h temporal locality: if needed X recently, likely to need X again soon spatial locality: if need X, likely also need something near X 8 / 23
9 Cache Highest levels of memory hierarchy Fast: level 1 typically 1 cycle access time With luck, supplies most data Cache design questions: What data does it hold? How is data found? What data is replaced? Recently accessed Simple address hash Often the oldest 9 / 23
10 What is Held in the Cache? Ideal cache: always correctly guesses what you want before you want it. Real cache: never that smart Caches Exploit Temporal Locality Copy newly accessed data into cache, replacing oldest if necessary Spatial Locality Copy nearby data into the cache at the same time Specifically, always read and write a block, also called a line, at a time (e.g., 64 bytes), never a single byte. 1 / 23
11 Memory Performance Hit: is found in the level of memory hierarchy Miss: not found; will look in next level Hit Rate = Miss Rate = Number of hits Number of accesses Number of misses Number of accesses Hit Rate + Miss Rate = 1 The expected access time E L for a memory level L with latency t L and miss rate M L : E L = t L + M L E L+1 11 / 23
12 Memory Performance Example Two-level hierarchy: Cache and main memory Program executes 1 loads & stores 75 of these are found in the cache What s the cache hit and miss rate? 12 / 23
13 Memory Performance Example Two-level hierarchy: Cache and main memory Program executes 1 loads & stores 75 of these are found in the cache What s the cache hit and miss rate? Hit Rate = 75 1 = 75% Miss Rate = 1.75 = 25% If the cache takes 1 cycle and the main memory 1, What s the expected access time? 12 / 23
14 Memory Performance Example Two-level hierarchy: Cache and main memory Program executes 1 loads & stores 75 of these are found in the cache What s the cache hit and miss rate? Hit Rate = 75 1 = 75% Miss Rate = 1.75 = 25% If the cache takes 1 cycle and the main memory 1, What s the expected access time? Expected access time of main memory: E 1 = 1 cycles Access time for the cache: t = 1 cycle Cache miss rate: M =.25 E = t + M E 1 = = 26 cycles 12 / 23
15 A Direct-Mapped Cache Address mem[xfffffffc] mem[xfffffff8] mem[xfffffff4] mem[xfffffff] mem[xffffffec] mem[xffffffe8] mem[xffffffe4] mem[xffffffe]...11 mem[x24]...1 mem[x2] mem[x1c]...11 mem[x18]...11 mem[x14]...1 mem[x1]...11 mem[xc]...1 mem[x8]...1 mem[x4]... mem[x] 2 3 -Word Main Memory This simple cache has 8 sets 1 block per set 4 bytes per block To simplify answering is this data in the cache?, each byte is mapped to exactly one set Word Cache Set 7 (111) Set 6 (11) Set 5 (11) Set 4 (1) Set 3 (11) Set 2 (1) Set 1 (1) Set () 13 / 23
16 Direct-Mapped Cache Hardware Memory Address Tag Set 27 3 = Byte Offset V Tag Address bits: 1: byte within block 2 4: set number 5 31: block tag Set 7 Set 6 Set 5 Set 4 Set 3 Set 2 Set 1 Set Cache hit if 8-entry x ( )-bit SRAM in the set of the address, Hit block is valid (V=1) tag (address bits 5 31) matches 14 / 23
17 Direct-Mapped Cache Behavior A dumb loop: repeat 5 times load from x4; load from xc; load from x8. li $t, 5 l1: beq $t, $, done lw $t1, x4($) lw $t2, xc($) lw $t3, x8($) addiu $t, $t, -1 j l1 done: Memory Address Tag... Byte Set Offset 1 3 V Tag 1... mem[x...c] 1... mem[x...8] 1... mem[x...4] Set 7 (111) Set 6 (11) Set 5 (11) Set 4 (1) Set 3 (11) Set 2 (1) Set 1 (1) Set () Cache when reading x4 last time Assuming the cache starts empty, what s the miss rate? 15 / 23
18 Direct-Mapped Cache Behavior A dumb loop: repeat 5 times load from x4; load from xc; load from x8. li $t, 5 l1: beq $t, $, done lw $t1, x4($) lw $t2, xc($) lw $t3, x8($) addiu $t, $t, -1 j l1 done: Memory Address Tag... Byte Set Offset 1 3 V Tag 1... mem[x...c] 1... mem[x...8] 1... mem[x...4] Set 7 (111) Set 6 (11) Set 5 (11) Set 4 (1) Set 3 (11) Set 2 (1) Set 1 (1) Set () Cache when reading x4 last time Assuming the cache starts empty, what s the miss rate? 4 C 8 4 C 8 4 C 8 4 C 8 4 C 8 M M M H H H H H H H H H H H H 3/15 =.2 = 2% 15 / 23
19 Direct-Mapped Cache: Conflict A dumber loop: repeat 5 times load from x4; load from x24 Memory Address li $t, 5 l1: beq $t, $, done lw $t1, x4($) lw $t2, x24($) addiu $t, $t, -1 j l1 done: Tag Set Byte Offset V Tag Set 7 (111) Set 6 (11) Set 5 (11) Set 4 (1) Set 3 (11) Set 2 (1) Set 1 (1) Set () Cache State Assuming the cache starts empty, what s the miss rate? These are conflict misses 16 / 23
20 Direct-Mapped Cache: Conflict A dumber loop: repeat 5 times load from x4; load from x24 Memory Address li $t, 5 l1: beq $t, $, done lw $t1, x4($) lw $t2, x24($) addiu $t, $t, -1 j l1 done: Tag Set Byte Offset V 1 Tag... mem[x...4] mem[x...24] Set 7 (111) Set 6 (11) Set 5 (11) Set 4 (1) Set 3 (11) Set 2 (1) Set 1 (1) Set () Cache State Assuming the cache starts empty, what s the miss rate? M M M M M M M M M M 1/1 = 1 = 1% Oops These are conflict misses 16 / 23
21 No Way! Yes Way! 2-Way Set Associative Cache Memory Address Tag Set 28 2 Byte Offset V Tag Way 1 Way V Tag Set 3 Set 2 Set 1 Set = = Hit 1 Hit 1 Hit 1 32 Hit 17 / 23
22 2-Way Set Associative Behavior li $t, 5 l1: beq $t, $, done lw $t1, x4($) lw $t2, x24($) addiu $t, $t, -1 j l1 done: Assuming the cache starts empty, what s the miss rate? M M H H H H H H H H 2/1 =.2 = 2% Associativity reduces conflict misses Way 1 Way V Tag V Tag 1... mem[x...24] mem[x...4] Set 3 Set 2 Set 1 Set 18 / 23
23 An Eight-way Fully Associative Cache Way 7 Way 6 Way 5 Way 4 Way 3 Way 2 Way 1 Way V Tag V Tag V Tag V Tag V Tag V Tag V Tag V Tag No conflict misses: only compulsory or capacity misses Either very expensive or slow because of all the associativity 19 / 23
24 Exploiting Spatial Locality: Larger Blocks Block Byte Tag Set Offset Offset Memory x8 9C: Address 8 9 C Memory Address Block Byte Tag Set Offset Offset 27 2 V Tag Set 1 Set = 32 Hit 2 sets 1 block per set (Direct Mapped) 4 words per block 2 / 23
25 Direct-Mapped Cache Behavior w/ 4-word block The dumb loop: repeat 5 times load from x4; load from xc; load from x8. Block Byte Tag Set Offset Offset Memory Address V Tag 1... mem[x...c] mem[x...8] mem[x...4] mem[x...] Set 1 Set Cache when reading xc li $t, 5 l1: beq $t, $, done lw $t1, x4($) lw $t2, xc($) lw $t3, x8($) addiu $t, $t, -1 j l1 done: Assuming the cache starts empty, what s the miss rate? 21 / 23
26 Direct-Mapped Cache Behavior w/ 4-word block The dumb loop: repeat 5 times load from x4; load from xc; load from x8. Block Byte Tag Set Offset Offset Memory Address V Tag 1... mem[x...c] mem[x...8] mem[x...4] mem[x...] Set 1 Set Cache when reading xc li $t, 5 l1: beq $t, $, done lw $t1, x4($) lw $t2, xc($) lw $t3, x8($) addiu $t, $t, -1 j l1 done: Assuming the cache starts empty, what s the miss rate? 4 C 8 4 C 8 4 C 8 4 C 8 4 C 8 M H H H H H H H H H H H H H H 1/15 =.666 = 6.7% Larger blocks reduce compulsory misses by exploting spatial locality 21 / 23
27 Stephen s Desktop Machine Revisited On-chip caches: AMD Phenom 96 Quad-core 2.3 GHz V 95 W 65 nm Cache Size Sets Ways Block L1I 64 K way 64-byte L1D 64 K way 64-byte L2 512 K way 64-byte L3 2 MB way 64-byte per core 22 / 23
28 Intel On-Chip Caches Chip Year Freq. (MHz) off-chip none K unified off-chip Pentium K 8K off-chip Pentium Pro K 8K 256K 1M (MCM) Pentium II K 16K 256K 512K (Cartridge) Pentium III K 16K 256K 512K Pentium Pentium M K 32K 1M 2M K 32K 2M 6M23 / 23 Core 2 Duo K L1 Instr L2 12k op 256K 2M trace cache
Fundamentals of Computer Systems
Fundamentals of Computer Systems Caches Stephen A. Edwards Columbia University Summer 217 Illustrations Copyright 27 Elsevier Computer Systems Performance depends on which is slowest: the processor or
More informationCSEE 3827: Fundamentals of Computer Systems, Spring Caches
CSEE 3827: Fundamentals of Computer Systems, Spring 2011 11. Caches Prof. Martha Kim (martha@cs.columbia.edu) Web: http://www.cs.columbia.edu/~martha/courses/3827/sp11/ Outline (H&H 8.2-8.3) Memory System
More informationCache Architectures Design of Digital Circuits 217 Srdjan Capkun Onur Mutlu http://www.syssec.ethz.ch/education/digitaltechnik_17 Adapted from Digital Design and Computer Architecture, David Money Harris
More informationSarah L. Harris and David Money Harris. Digital Design and Computer Architecture: ARM Edition Chapter 8 <1>
Chapter 8 Digital Design and Computer Architecture: ARM Edition Sarah L. Harris and David Money Harris Digital Design and Computer Architecture: ARM Edition 215 Chapter 8 Chapter 8 :: Topics Introduction
More informationDirect Mapped Cache Hardware. Direct Mapped Cache. Direct Mapped Cache Performance. Direct Mapped Cache Performance. Miss Rate = 3/15 = 20%
Direct Mapped Cache Direct Mapped Cache Hardware........................ mem[xff...fc] mem[xff...f8] mem[xff...f4] mem[xff...f] mem[xff...ec] mem[xff...e8] mem[xff...e4] mem[xff...e] 27 8-entry x (+27+)-bit
More informationDigital Logic & Computer Design CS Professor Dan Moldovan Spring Copyright 2007 Elsevier 8-<1>
Digital Logic & Computer Design CS 4341 Professor Dan Moldovan Spring 21 Copyright 27 Elsevier 8- Chapter 8 :: Memory Systems Digital Design and Computer Architecture David Money Harris and Sarah L.
More informationChapter 8 Memory Hierarchy and Cache Memory
Chapter 8 Memory Hierarchy and Cache Memory Digital Design and Computer Architecture: ARM Edi*on Sarah L. Harris and David Money Harris Digital Design and Computer Architecture: ARM Edi>on 215 Chapter
More informationChapter 8 :: Topics. Chapter 8 :: Memory Systems. Introduction Memory System Performance Analysis Caches Virtual Memory Memory-Mapped I/O Summary
Chapter 8 :: Systems Chapter 8 :: Topics Digital Design and Computer Architecture David Money Harris and Sarah L. Harris Introduction System Performance Analysis Caches Virtual -Mapped I/O Summary Copyright
More informationEN1640: Design of Computing Systems Topic 06: Memory System
EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring
More informationEN1640: Design of Computing Systems Topic 06: Memory System
EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring
More informationCMPSC 311- Introduction to Systems Programming Module: Caching
CMPSC 311- Introduction to Systems Programming Module: Caching Professor Patrick McDaniel Fall 2016 Reminder: Memory Hierarchy L0: Registers CPU registers hold words retrieved from L1 cache Smaller, faster,
More informationCaches. Han Wang CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)
Caches Han Wang CS 3410, Spring 2012 Computer Science Cornell University See P&H 5.1, 5.2 (except writes) This week: Announcements PA2 Work-in-progress submission Next six weeks: Two labs and two projects
More informationChapter 8. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 8 <1>
Chapter 8 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 8 Chapter 8 :: Topics Introduction Memory System Performance Analysis Caches Virtual
More informationChapter 3. SEQUENTIAL Logic Design. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris.
SEQUENTIAL Logic Design Chapter 3 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 3 Chapter 3 :: Topics Introduction Latches and Flip-Flops Synchronous
More informationAdministrivia. Expect new HW out today (functions in assembly!)
Caching 1/25/19 Administrivia Expect new HW out today (functions in assembly!) Memory so far Big array indexed by the memory address accessed by load and store of various types Implicitly assumed: One
More informationCS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches
CS 61C: Great Ideas in Computer Architecture Direct Mapped Caches Instructor: Justin Hsia 7/05/2012 Summer 2012 Lecture #11 1 Review of Last Lecture Floating point (single and double precision) approximates
More informationMemory Hierarchies &
Memory Hierarchies & Cache Memory CSE 410, Spring 2009 Computer Systems http://www.cs.washington.edu/410 4/26/2009 cse410-13-cache 2006-09 Perkins, DW Johnson and University of Washington 1 Reading and
More informationChapter 5. Memory Technology
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationLecture 12: Memory hierarchy & caches
Lecture 12: Memory hierarchy & caches A modern memory subsystem combines fast small memory, slower larger memories This lecture looks at why and how Focus today mostly on electronic memories. Next lecture
More informationCaches and Memory Hierarchy: Review. UCSB CS240A, Fall 2017
Caches and Memory Hierarchy: Review UCSB CS24A, Fall 27 Motivation Most applications in a single processor runs at only - 2% of the processor peak Most of the single processor performance loss is in the
More informationCMPSC 311- Introduction to Systems Programming Module: Caching
CMPSC 311- Introduction to Systems Programming Module: Caching Professor Patrick McDaniel Fall 2014 Lecture notes Get caching information form other lecture http://hssl.cs.jhu.edu/~randal/419/lectures/l8.5.caching.pdf
More informationThe University of Adelaide, School of Computer Science 13 September 2018
Computer Architecture A Quantitative Approach, Sixth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per
More informationChapter 8. Chapter 8:: Topics. Introduction. Processor Memory Gap
Chapter 8 Chapter 8:: Topics Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Introduction System Performance Analysis Caches Virtual Mapped I/O Summary Chapter
More informationChapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY
Chapter Seven CACHE MEMORY AND VIRTUAL MEMORY 1 Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored
More informationECE232: Hardware Organization and Design
ECE232: Hardware Organization and Design Lecture 21: Memory Hierarchy Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Ideally, computer memory would be large and fast
More informationBasic Memory Hierarchy Principles. Appendix C (Not all will be covered by the lecture; studying the textbook is recommended!)
Basic Memory Hierarchy Principles Appendix C (Not all will be covered by the lecture; studying the textbook is recommended!) Cache memory idea Use a small faster memory, a cache memory, to store recently
More informationLECTURE 10: Improving Memory Access: Direct and Spatial caches
EECS 318 CAD Computer Aided Design LECTURE 10: Improving Memory Access: Direct and Spatial caches Instructor: Francis G. Wolff wolff@eecs.cwru.edu Case Western Reserve University This presentation uses
More informationMemory Hierarchy. Maurizio Palesi. Maurizio Palesi 1
Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio
More informationCaches and Memory Hierarchy: Review. UCSB CS240A, Winter 2016
Caches and Memory Hierarchy: Review UCSB CS240A, Winter 2016 1 Motivation Most applications in a single processor runs at only 10-20% of the processor peak Most of the single processor performance loss
More informationLECTURE 11. Memory Hierarchy
LECTURE 11 Memory Hierarchy MEMORY HIERARCHY When it comes to memory, there are two universally desirable properties: Large Size: ideally, we want to never have to worry about running out of memory. Speed
More informationAgenda. EE 260: Introduction to Digital Design Memory. Naive Register File. Agenda. Memory Arrays: SRAM. Memory Arrays: Register File
EE 260: Introduction to Digital Design Technology Yao Zheng Department of Electrical Engineering University of Hawaiʻi at Mānoa 2 Technology Naive Register File Write Read clk Decoder Read Write 3 4 Arrays:
More informationLec 13: Linking and Memory. Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University. Announcements
Lec 13: Linking and Memory Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University PA 2 is out Due on Oct 22 nd Announcements Prelim Oct 23 rd, 7:30-9:30/10:00 All content up to Lecture on Oct
More informationLecture-14 (Memory Hierarchy) CS422-Spring
Lecture-14 (Memory Hierarchy) CS422-Spring 2018 Biswa@CSE-IITK The Ideal World Instruction Supply Pipeline (Instruction execution) Data Supply - Zero-cycle latency - Infinite capacity - Zero cost - Perfect
More informationCSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1
CSE 431 Computer Architecture Fall 2008 Chapter 5A: Exploiting the Memory Hierarchy, Part 1 Mary Jane Irwin ( www.cse.psu.edu/~mji ) [Adapted from Computer Organization and Design, 4 th Edition, Patterson
More informationLocality. CS429: Computer Organization and Architecture. Locality Example 2. Locality Example
Locality CS429: Computer Organization and Architecture Dr Bill Young Department of Computer Sciences University of Texas at Austin Principle of Locality: Programs tend to reuse data and instructions near
More informationReview: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.
Performance 980 98 982 983 984 985 986 987 988 989 990 99 992 993 994 995 996 997 998 999 2000 7/4/20 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Instructor: Michael Greenbaum
More informationChapter 5A. Large and Fast: Exploiting Memory Hierarchy
Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM
More informationEECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141
EECS151/251A Spring 2018 Digital Design and Integrated Circuits Instructors: John Wawrzynek and Nick Weaver Lecture 19: Caches Cache Introduction 40% of this ARM CPU is devoted to SRAM cache. But the role
More informationRecap: Machine Organization
ECE232: Hardware Organization and Design Part 14: Hierarchy Chapter 5 (4 th edition), 7 (3 rd edition) http://www.ecs.umass.edu/ece/ece232/ Adapted from Computer Organization and Design, Patterson & Hennessy,
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationEE 4683/5683: COMPUTER ARCHITECTURE
EE 4683/5683: COMPUTER ARCHITECTURE Lecture 6A: Cache Design Avinash Kodi, kodi@ohioedu Agenda 2 Review: Memory Hierarchy Review: Cache Organization Direct-mapped Set- Associative Fully-Associative 1 Major
More informationThe Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2)
The Memory Hierarchy Cache, Main Memory, and Virtual Memory (Part 2) Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Cache Line Replacement The cache
More informationECE468 Computer Organization and Architecture. Memory Hierarchy
ECE468 Computer Organization and Architecture Hierarchy ECE468 memory.1 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Input Datapath Output Today s Topic:
More informationMemory Hierarchies. Instructor: Dmitri A. Gusev. Fall Lecture 10, October 8, CS 502: Computers and Communications Technology
Memory Hierarchies Instructor: Dmitri A. Gusev Fall 2007 CS 502: Computers and Communications Technology Lecture 10, October 8, 2007 Memories SRAM: value is stored on a pair of inverting gates very fast
More informationDonn Morrison Department of Computer Science. TDT4255 Memory hierarchies
TDT4255 Lecture 10: Memory hierarchies Donn Morrison Department of Computer Science 2 Outline Chapter 5 - Memory hierarchies (5.1-5.5) Temporal and spacial locality Hits and misses Direct-mapped, set associative,
More informationMemory Hierarchy Y. K. Malaiya
Memory Hierarchy Y. K. Malaiya Acknowledgements Computer Architecture, Quantitative Approach - Hennessy, Patterson Vishwani D. Agrawal Review: Major Components of a Computer Processor Control Datapath
More informationMemory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB
Memory Technology Caches 1 Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per GB Ideal memory Average access time similar
More informationWhy memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho
Why memory hierarchy? L1 cache design Sangyeun Cho Computer Science Department Memory hierarchy Memory hierarchy goals Smaller Faster More expensive per byte CPU Regs L1 cache L2 cache SRAM SRAM To provide
More information14:332:331. Week 13 Basics of Cache
14:332:331 Computer Architecture and Assembly Language Spring 2006 Week 13 Basics of Cache [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 Week131 Spring 2006
More informationCaches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)
Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University See P&H 5.1, 5.2 (except writes) Announcements! HW3 available due next Tuesday Work with alone partner Be responsible
More informationCS161 Design and Architecture of Computer Systems. Cache $$$$$
CS161 Design and Architecture of Computer Systems Cache $$$$$ Memory Systems! How can we supply the CPU with enough data to keep it busy?! We will focus on memory issues,! which are frequently bottlenecks
More information3Introduction. Memory Hierarchy. Chapter 2. Memory Hierarchy Design. Computer Architecture A Quantitative Approach, Fifth Edition
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationCSC Memory System. A. A Hierarchy and Driving Forces
CSC1016 1. System A. A Hierarchy and Driving Forces 1A_1 The Big Picture: The Five Classic Components of a Computer Processor Input Control Datapath Output Topics: Motivation for Hierarchy View of Hierarchy
More informationAlgorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory I
Memory Performance of Algorithms CSE 32 Data Structures Lecture Algorithm Performance Factors Algorithm choices (asymptotic running time) O(n 2 ) or O(n log n) Data structure choices List or Arrays Language
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University Large and Fast: Exploiting Memory Hierarchy The Basic of Caches Measuring & Improving Cache Performance Virtual Memory A Common
More informationMemory hierarchies: caches and their impact on the running time
Memory hierarchies: caches and their impact on the running time Irene Finocchi Dept. of Computer and Science Sapienza University of Rome A happy coincidence A fundamental property of hardware Different
More informationV. Primary & Secondary Memory!
V. Primary & Secondary Memory! Computer Architecture and Operating Systems & Operating Systems: 725G84 Ahmed Rezine 1 Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM)
More informationCS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches
CS 61C: Great Ideas in Computer Architecture The Memory Hierarchy, Fully Associative Caches Instructor: Alan Christopher 7/09/2014 Summer 2014 -- Lecture #10 1 Review of Last Lecture Floating point (single
More informationAlgorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory II
Memory Performance of Algorithms CSE 32 Data Structures Lecture Algorithm Performance Factors Algorithm choices (asymptotic running time) O(n 2 ) or O(n log n) Data structure choices List or Arrays Language
More informationCS3350B Computer Architecture
CS335B Computer Architecture Winter 25 Lecture 32: Exploiting Memory Hierarchy: How? Marc Moreno Maza wwwcsduwoca/courses/cs335b [Adapted from lectures on Computer Organization and Design, Patterson &
More informationMemory Hierarchy. ENG3380 Computer Organization and Architecture Cache Memory Part II. Topics. References. Memory Hierarchy
ENG338 Computer Organization and Architecture Part II Winter 217 S. Areibi School of Engineering University of Guelph Hierarchy Topics Hierarchy Locality Motivation Principles Elements of Design: Addresses
More information+ Random-Access Memory (RAM)
+ Memory Subsystem + Random-Access Memory (RAM) Key features RAM is traditionally packaged as a chip. Basic storage unit is normally a cell (one bit per cell). Multiple RAM chips form a memory. RAM comes
More informationCS3350B Computer Architecture
CS3350B Computer Architecture Winter 2015 Lecture 3.1: Memory Hierarchy: What and Why? Marc Moreno Maza www.csd.uwo.ca/courses/cs3350b [Adapted from lectures on Computer Organization and Design, Patterson
More informationCPU issues address (and data for write) Memory returns data (or acknowledgment for write)
The Main Memory Unit CPU and memory unit interface Address Data Control CPU Memory CPU issues address (and data for write) Memory returns data (or acknowledgment for write) Memories: Design Objectives
More informationCPSC 330 Computer Organization
CPSC 33 Computer Organization Lecture 7c Memory Adapted from CS52, CS 6C and notes by Kevin Peterson and Morgan Kaufmann Publishers, Copyright 24. Improving cache performance Two ways of improving performance:
More informationWelcome to Part 3: Memory Systems and I/O
Welcome to Part 3: Memory Systems and I/O We ve already seen how to make a fast processor. How can we supply the CPU with enough data to keep it busy? We will now focus on memory issues, which are frequently
More informationCaches. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Caches Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns
More informationCSF Improving Cache Performance. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]
CSF Improving Cache Performance [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user
More informationMemory Hierarchy. Slides contents from:
Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory
More information14:332:331. Week 13 Basics of Cache
14:332:331 Computer Architecture and Assembly Language Fall 2003 Week 13 Basics of Cache [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 Lec20.1 Fall 2003 Head
More informationECE 485/585 Microprocessor System Design
Microprocessor System Design Lecture 8: Principle of Locality Cache Architecture Cache Replacement Policies Zeshan Chishti Electrical and Computer Engineering Dept Maseeh College of Engineering and Computer
More informationMemory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky
Memory Hierarchy, Fully Associative Caches Instructor: Nick Riasanovsky Review Hazards reduce effectiveness of pipelining Cause stalls/bubbles Structural Hazards Conflict in use of datapath component Data
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/ Components of a Computer Processor
More informationMemory Hierarchy. Maurizio Palesi. Maurizio Palesi 1
Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio
More informationLECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY
LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY Abridged version of Patterson & Hennessy (2013):Ch.5 Principle of Locality Programs access a small proportion of their address space at any time Temporal
More informationwww-inst.eecs.berkeley.edu/~cs61c/
CS61C Machine Structures Lecture 34 - Caches II 11/16/2007 John Wawrzynek (www.cs.berkeley.edu/~johnw) www-inst.eecs.berkeley.edu/~cs61c/ 1 What to do on a write hit? Two Options: Write-through update
More informationCMPT 300 Introduction to Operating Systems
CMPT 300 Introduction to Operating Systems Cache 0 Acknowledgement: some slides are taken from CS61C course material at UC Berkeley Agenda Memory Hierarchy Direct Mapped Caches Cache Performance Set Associative
More informationMemory Hierarchy. Slides contents from:
Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory
More informationComputer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg
Computer Architecture and System Software Lecture 09: Memory Hierarchy Instructor: Rob Bergen Applied Computer Science University of Winnipeg Announcements Midterm returned + solutions in class today SSD
More informationECE 2300 Digital Logic & Computer Organization. Caches
ECE 23 Digital Logic & Computer Organization Spring 217 s Lecture 2: 1 Announcements HW7 will be posted tonight Lab sessions resume next week Lecture 2: 2 Course Content Binary numbers and logic gates
More informationChapter 6 Caches. Computer System. Alpha Chip Photo. Topics. Memory Hierarchy Locality of Reference SRAM Caches Direct Mapped Associative
Chapter 6 s Topics Memory Hierarchy Locality of Reference SRAM s Direct Mapped Associative Computer System Processor interrupt On-chip cache s s Memory-I/O bus bus Net cache Row cache Disk cache Memory
More informationCache Memory - II. Some of the slides are adopted from David Patterson (UCB)
Cache Memory - II Some of the slides are adopted from David Patterson (UCB) Outline Direct-Mapped Cache Types of Cache Misses A (long) detailed example Peer - to - peer education example Block Size Tradeoff
More informationCaches II CSE 351 Spring #include <yoda.h> int is_try(int do_flag) { return!(do_flag!do_flag); }
Caches II CSE 351 Spring 2018 #include int is_try(int do_flag) { return!(do_flag!do_flag); } Memory Hierarchies Some fundamental and enduring properties of hardware and software systems: Faster
More informationThe Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350):
The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350): Motivation for The Memory Hierarchy: { CPU/Memory Performance Gap The Principle Of Locality Cache $$$$$ Cache Basics:
More informationLECTURE 5: MEMORY HIERARCHY DESIGN
LECTURE 5: MEMORY HIERARCHY DESIGN Abridged version of Hennessy & Patterson (2012):Ch.2 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive
More informationMemory Hierarchy: Caches, Virtual Memory
Memory Hierarchy: Caches, Virtual Memory Readings: 5.1-5.4, 5.8 Big memories are slow Computer Fast memories are small Processor Memory Devices Control Input Datapath Output Need to get fast, big memories
More informationCSE Computer Architecture I Fall 2009 Homework 08 Pipelined Processors and Multi-core Programming Assigned: Due: Problem 1: (10 points)
CSE 30321 Computer Architecture I Fall 2009 Homework 08 Pipelined Processors and Multi-core Programming Assigned: November 17, 2009 Due: December 1, 2009 This assignment can be done in groups of 1, 2,
More informationPlot SIZE. How will execution time grow with SIZE? Actual Data. int array[size]; int A = 0;
How will execution time grow with SIZE? int array[size]; int A = ; for (int i = ; i < ; i++) { for (int j = ; j < SIZE ; j++) { A += array[j]; } TIME } Plot SIZE Actual Data 45 4 5 5 Series 5 5 4 6 8 Memory
More informationand data combined) is equal to 7% of the number of instructions. Miss Rate with Second- Level Cache, Direct- Mapped Speed
5.3 By convention, a cache is named according to the amount of data it contains (i.e., a 4 KiB cache can hold 4 KiB of data); however, caches also require SRAM to store metadata such as tags and valid
More informationCSEE W4824 Computer Architecture Fall 2012
CSEE W4824 Computer Architecture Fall 2012 Lecture 8 Memory Hierarchy Design: Memory Technologies and the Basics of Caches Luca Carloni Department of Computer Science Columbia University in the City of
More informationMemory Hierarchy. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Memory Hierarchy Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Time (ns) The CPU-Memory Gap The gap widens between DRAM, disk, and CPU speeds
More informationCourse Administration
Spring 207 EE 363: Computer Organization Chapter 5: Large and Fast: Exploiting Memory Hierarchy - Avinash Kodi Department of Electrical Engineering & Computer Science Ohio University, Athens, Ohio 4570
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste
More informationMemory. Lecture 22 CS301
Memory Lecture 22 CS301 Administrative Daily Review of today s lecture w Due tomorrow (11/13) at 8am HW #8 due today at 5pm Program #2 due Friday, 11/16 at 11:59pm Test #2 Wednesday Pipelined Machine Fetch
More informationCSE502: Computer Architecture CSE 502: Computer Architecture
CSE 502: Computer Architecture Memory Hierarchy & Caches Motivation 10000 Performance 1000 100 10 Processor Memory 1 1985 1990 1995 2000 2005 2010 Want memory to appear: As fast as CPU As large as required
More informationThe Memory Hierarchy. Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T.
The Memory Hierarchy Silvina Hanono Wachman Computer Science & Artificial Intelligence Lab M.I.T. L13-1 Memory Technologies Technologies have vastly different tradeoffs between capacity, latency, bandwidth,
More informationEEC 170 Computer Architecture Fall Improving Cache Performance. Administrative. Review: The Memory Hierarchy. Review: Principle of Locality
Administrative EEC 7 Computer Architecture Fall 5 Improving Cache Performance Problem #6 is posted Last set of homework You should be able to answer each of them in -5 min Quiz on Wednesday (/7) Chapter
More informationCACHE MEMORIES ADVANCED COMPUTER ARCHITECTURES. Slides by: Pedro Tomás
CACHE MEMORIES Slides by: Pedro Tomás Additional reading: Computer Architecture: A Quantitative Approach, 5th edition, Chapter 2 and Appendix B, John L. Hennessy and David A. Patterson, Morgan Kaufmann,
More information