ECE 2300 Digital Logic & Computer Organization. More Caches


1 ECE 2300 Digital Logic & Computer Organization, Spring 2017: More Caches

2 Announcements. Prelim 2 stats: High: 9 (out of 9), Mean: 7.2, Median: 73. Prelab 5(C) due tomorrow.

3 Example: Direct Mapped (DM) Cache. 32-bit memory address breakdown: 20 tag bits, 10 index bits, 2 byte offset bits. Block size: $2^2 = 4$ bytes. Number of cache blocks: $2^{10} = 1024$. Total cache capacity: 1024 x 4 B = 4 KB.
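The 20/10/2 split above can be sketched in a few lines of Python. This is only an illustration: the function name `split_address` and the sample address `0x12345678` are assumptions, not part of the lecture.

```python
def split_address(addr, index_bits=10, offset_bits=2):
    """Split a byte address into (tag, index, byte_offset) fields.

    Defaults match the slide's direct mapped cache: 10 index bits and
    2 byte offset bits, leaving 20 tag bits of a 32-bit address.
    """
    offset = addr & ((1 << offset_bits) - 1)                 # lowest bits
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)  # middle bits
    tag = addr >> (offset_bits + index_bits)                 # remaining high bits
    return tag, index, offset


# Hypothetical sample address, chosen only for illustration.
tag, index, offset = split_address(0x12345678)
print(hex(tag), index, offset)
```

The index selects one of the 1024 cache blocks, and the stored tag is compared with the address tag to decide hit or miss.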

4 Example: DM Cache Address Breakdown Assuming 16-bit memory addresses, how many bits are associated with the tag, index, and offset of the following configurations for a direct mapped cache? (a) 32 blocks, 8 bytes per block Byte offset: 3 bits; Index: 5 bits; Tag: 8 bits (b) 16 blocks, 16 bytes per block Byte offset: 4 bits; Index: 4 bits; Tag: 8 bits
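The two answers above follow from taking base-2 logarithms of the block size and the block count. A minimal sketch, assuming both are powers of two; the helper name `field_widths` is hypothetical.

```python
def field_widths(address_bits, num_blocks, block_bytes):
    """Return (tag_bits, index_bits, offset_bits) for a direct mapped cache.

    Assumes num_blocks and block_bytes are powers of two, so that
    (x - 1).bit_length() equals log2(x).
    """
    offset_bits = (block_bytes - 1).bit_length()
    index_bits = (num_blocks - 1).bit_length()
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


print(field_widths(16, 32, 8))    # configuration (a)
print(field_widths(16, 16, 16))   # configuration (b)
```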

5 Block Placement in DM Cache. Direct mapped cache: each memory block maps to exactly one cache block. Mapping conflicts may increase the miss rate. [Figure: memory with 8 blocks (Block 0 to Block 7) mapped onto a direct mapped cache with 4 blocks]

6 More Flexible Block Placement. K-way set associative cache: each memory block maps to one set, which contains K blocks. A block can be stored anywhere in the set. [Figure: memory with 8 blocks (Block 0 to Block 7) mapped onto a 2-way set associative cache with 4 blocks (Set 0 and Set 1, each with Way 0 and Way 1)]

7 4-way Set Associative Cache. 256 sets (4 ways per set, 1024 blocks).

8 Associative Caches. K-way set associative: index bits determine which set to address; each set contains K entries (ways); all ways in the selected set are searched in parallel, which takes K comparators (more expensive than direct mapped). An extreme case is fully associative: a block can go in any cache location; no index bits are needed; all entries are searched in parallel, with one comparator per entry (most expensive).

9 Address Translation for Associative Caches. Breakdown of an n-bit memory address for cache use: $n - i - b$ tag bits, $i$ index bits, $b$ byte offset bits. Parameters for a K-way set associative cache: the size of each cache block is $2^b$ bytes; the number of sets is $2^i$; the number of blocks is $K \cdot 2^i$; the total cache size is $K \cdot 2^{b+i}$ bytes.
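As a worked instance of these formulas, take the 2-way set associative example from slide 12 (K = 2, 4-byte blocks, 2 sets, memory of 16 blocks). The 6-bit address width is inferred here from 16 blocks of 4 bytes each; it is not stated explicitly on the slide.

```latex
\begin{align*}
b &= \log_2 4 = 2, \qquad i = \log_2 2 = 1, \qquad n = \log_2(16 \cdot 4) = 6 \\
\text{tag bits} &= n - i - b = 6 - 1 - 2 = 3 \\
\text{blocks} &= K \cdot 2^{i} = 2 \cdot 2 = 4 \\
\text{cache size} &= K \cdot 2^{b+i} = 2 \cdot 2^{3} = 16 \text{ bytes}
\end{align*}
```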

10 Spectrum of Associativity. A K-way set associative cache with N blocks has S = N / K cache sets and $\log_2 S$ index bits. When K = N, the cache is fully associative: one cache set, so zero index bits. When K = 1 (one-way), the cache is direct mapped: N cache sets. Increasing the associativity typically improves the hit rate (fewer conflicts) but increases the hit time (the search takes longer).
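The relationship S = N / K can be tabulated for every power-of-two associativity. The sketch below is illustrative only; the function name `associativity_spectrum` is an assumption, and N is assumed to be a power of two.

```python
def associativity_spectrum(num_blocks):
    """List (ways, sets, index_bits) for K = 1, 2, 4, ..., num_blocks.

    Assumes num_blocks is a power of two, so sets.bit_length() - 1
    equals log2(sets).
    """
    rows = []
    ways = 1
    while ways <= num_blocks:
        sets = num_blocks // ways
        rows.append((ways, sets, sets.bit_length() - 1))
        ways *= 2
    return rows


for ways, sets, index_bits in associativity_spectrum(8):
    print(f"{ways}-way: {sets} set(s), {index_bits} index bit(s)")
```

For 8 blocks, the first row is the direct mapped case (8 sets) and the last row is the fully associative case (1 set, zero index bits).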

11 Spectrum of Associativity. [Figure: associativity options for a cache with 8 blocks]

12 2-way Set Associative Example. Size of each block is 4 bytes. Cache holds 4 blocks, 2-way set associative (2 sets, 2 ways). Memory holds 16 blocks. Memory address: 3 tag bits, 1 index bit, 2 byte offset bits.

13-23 2-way Set Associative Example. [Figure sequence: Processor / Cache / Memory walkthrough of six loads into registers R1, R2, R3, R2, R1, R1; slides 13 to 20 are labeled miss, slides 21 to 23 are labeled hit]

24 Miss Classification. Compulsory (cold) misses: caused by the first access to a memory block. Capacity misses: occur because the cache is not big enough to hold the active set of memory blocks needed during program execution. Conflict misses: occur in a direct mapped or set associative cache when multiple memory blocks compete for the same set due to the inflexibility of block placement; they would not occur in a fully associative cache.
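One common way to apply this classification to a trace is to run the cache under study next to a fully associative cache of the same capacity: a miss that the fully associative cache would have hit is a conflict miss. The sketch below assumes LRU replacement in both caches and uses block addresses directly; `lru_access` and `classify` are hypothetical names.

```python
from collections import OrderedDict


def lru_access(cache_set, block, ways):
    """Access `block` in one LRU-ordered set; return True on a hit."""
    if block in cache_set:
        cache_set.move_to_end(block)      # mark as most recently used
        return True
    if len(cache_set) >= ways:
        cache_set.popitem(last=False)     # evict the least recently used block
    cache_set[block] = True
    return False


def classify(trace, num_blocks, ways):
    """Label each access as 'hit', 'cold', 'capacity', or 'conflict'."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    fully_assoc = OrderedDict()           # reference cache, same capacity
    seen = set()
    labels = []
    for block in trace:
        hit = lru_access(sets[block % num_sets], block, ways)
        fa_hit = lru_access(fully_assoc, block, num_blocks)
        if hit:
            labels.append("hit")
        elif block not in seen:
            labels.append("cold")         # first access to this block
        elif fa_hit:
            labels.append("conflict")     # only the placement restriction missed
        else:
            labels.append("capacity")     # even full associativity would miss
        seen.add(block)
    return labels


print(classify([0, 8, 0, 6, 8], num_blocks=4, ways=1))
```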

25-26 Misses vs. Associativity Example. Compare different caches with a capacity of 4 blocks (1 byte / block): direct mapped, 2-way set associative, fully associative. Block address sequence: 0, 8, 0, 6, 8 (in decimal).

Direct mapped (cache contents after access, by block):

Block address | Cache index | Hit/miss        | Block 0 | Block 1 | Block 2 | Block 3
0             | 0           | miss (cold)     | Mem[0]  |         |         |
8             | 0           | miss (cold)     | Mem[8]  |         |         |
0             | 0           | miss (conflict) | Mem[0]  |         |         |
6             | 2           | miss (cold)     | Mem[0]  |         | Mem[6]  |
8             | 0           | miss (conflict) | Mem[8]  |         | Mem[6]  |

27-28 Misses vs. Associativity Example (continued).

2-way set associative (cache contents after access, by set; every address in the sequence is even, so Set 1 stays empty):

Block address | Cache index | Hit/miss        | Set 0          | Set 1
0             | 0           | miss (cold)     | Mem[0]         |
8             | 0           | miss (cold)     | Mem[0], Mem[8] |
0             | 0           | hit             | Mem[0], Mem[8] |
6             | 0           | miss (cold)     | Mem[0], Mem[6] |
8             | 0           | miss (conflict) | Mem[8], Mem[6] |

Fully associative:

Block address | Hit/miss    | Cache contents after access
0             | miss (cold) | Mem[0]
8             | miss (cold) | Mem[0], Mem[8]
0             | hit         | Mem[0], Mem[8]
6             | miss (cold) | Mem[0], Mem[8], Mem[6]
8             | hit         | Mem[0], Mem[8], Mem[6]
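The three organizations can be compared with one small simulator, since direct mapped and fully associative are the K = 1 and K = N cases of a K-way cache. This is a sketch assuming LRU replacement and block addresses as input; `count_misses` is a hypothetical helper.

```python
def count_misses(trace, num_blocks, ways):
    """Count misses for a `ways`-way set associative cache with LRU replacement."""
    num_sets = num_blocks // ways
    sets = [[] for _ in range(num_sets)]   # each list is ordered LRU first
    misses = 0
    for block in trace:
        cache_set = sets[block % num_sets]
        if block in cache_set:
            cache_set.remove(block)        # hit: re-appended below as most recent
        else:
            misses += 1
            if len(cache_set) == ways:
                cache_set.pop(0)           # evict the least recently used block
        cache_set.append(block)
    return misses


trace = [0, 8, 0, 6, 8]
for ways in (1, 2, 4):
    print(f"{ways}-way: {count_misses(trace, num_blocks=4, ways=ways)} misses")
```

On the slide's sequence this gives 5, 4, and 3 misses for the direct mapped, 2-way, and fully associative caches, matching the tables.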

29 Block Replacement Policy. Direct mapped: no choice. Set associative and fully associative: pick a non-valid entry if there is one; otherwise, choose among the entries in the set. Least recently used (LRU): choose the entry unused for the longest time; requires extra bits to order the blocks; high overhead beyond 4-way set associative. Random: performance similar to LRU at high associativity.
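The selection order described above (non-valid entry first, then LRU or random) can be sketched as a single function. The name `choose_victim`, its parameters, and the timestamp-based LRU bookkeeping are assumptions made for illustration; hardware would typically keep a few ordering bits per set instead.

```python
import random


def choose_victim(valid, last_used, policy="lru", rng=random):
    """Return the way index to replace within one set.

    valid[w] is True when way w holds a valid block; last_used[w] is a
    timestamp of the most recent access to way w (smaller means older).
    """
    for way, is_valid in enumerate(valid):
        if not is_valid:
            return way                          # prefer a non-valid entry
    if policy == "lru":
        return min(range(len(valid)), key=lambda w: last_used[w])
    return rng.randrange(len(valid))            # random replacement


print(choose_victim([True, False, True, True], [5, 0, 7, 2]))  # non-valid way
print(choose_victim([True, True, True, True], [5, 3, 7, 4]))   # oldest way
```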

30 LRU Replacement Example. Fully associative, 4 blocks; (X) = LRU age, where 0 is the most recently used and 3 is the least recently used.

Block address | Hit/miss        | Cache contents after access
0             | miss (cold)     | Mem[0] (0)
4             | miss (cold)     | Mem[0] (1), Mem[4] (0)
2             | miss (cold)     | Mem[0] (2), Mem[4] (1), Mem[2] (0)
6             | miss (cold)     | Mem[0] (3), Mem[4] (2), Mem[2] (1), Mem[6] (0)
8             | miss (cold)     | Mem[8] (0), Mem[4] (3), Mem[2] (2), Mem[6] (1)
0             | miss (capacity) | Mem[8] (1), Mem[0] (0), Mem[2] (3), Mem[6] (2)
4             | miss (capacity) | Mem[8] (2), Mem[0] (1), Mem[4] (0), Mem[6] (3)
2             | miss (capacity) | Mem[8] (3), Mem[0] (2), Mem[4] (1), Mem[2] (0)
6             | miss (capacity) | Mem[6] (0), Mem[0] (3), Mem[4] (2), Mem[2] (1)
8             | miss (capacity) | Mem[6] (1), Mem[8] (0), Mem[4] (3), Mem[2] (2)
2             | hit             | Mem[6] (2), Mem[8] (1), Mem[4] (3), Mem[2] (0)
6             | hit             | Mem[6] (0), Mem[8] (2), Mem[4] (3), Mem[2] (1)
2             | hit             | Mem[6] (1), Mem[8] (2), Mem[4] (3), Mem[2] (0)
0             | miss (capacity) | Mem[6] (2), Mem[8] (3), Mem[0] (0), Mem[2] (1)
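The age values in this example follow a simple rule: the accessed block gets age 0, and only blocks that were younger than it grow one step older. A minimal sketch of that bookkeeping, with hypothetical names `access` and `run`:

```python
def access(blocks, ages, block, ways=4):
    """Access `block` in a fully associative cache; return True on a hit.

    blocks[i] is the block held by entry i and ages[i] is its LRU age
    (0 = most recently used, ways - 1 = least recently used).
    """
    hit = block in blocks
    if hit:
        slot = blocks.index(block)
        old_age = ages[slot]
    else:
        if len(blocks) < ways:              # fill an empty entry first
            blocks.append(block)
            ages.append(0)
            slot = len(blocks) - 1
        else:                               # replace the oldest entry
            slot = ages.index(max(ages))
            blocks[slot] = block
        old_age = ways                      # treated as older than every entry
    for i in range(len(ages)):
        if i != slot and ages[i] < old_age:
            ages[i] += 1                    # entries that were younger age by one
    ages[slot] = 0
    return hit


def run(trace, ways=4):
    """Replay a block address trace; return (hit flags, blocks, ages)."""
    blocks, ages = [], []
    hits = [access(blocks, ages, block, ways) for block in trace]
    return hits, blocks, ages


hits, blocks, ages = run([0, 4, 2, 6, 8, 0, 4, 2, 6, 8, 2, 6, 2, 0])
print(hits.count(True), "hits;", list(zip(blocks, ages)))
```

Replaying the slide's sequence ends with the same contents and ages as the last row of the table.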

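31 LRU Replacement Example. 2-way set associative, 4 blocks; (*) = LRU block. Every address in the sequence is even, so all accesses map to Set 0 and Set 1 stays empty.

Block address | Cache index | Hit/miss        | Set 0 contents after access
0             | 0           | miss (cold)     | Mem[0]
4             | 0           | miss (cold)     | Mem[0] (*), Mem[4]
2             | 0           | miss (cold)     | Mem[2], Mem[4] (*)
6             | 0           | miss (cold)     | Mem[2] (*), Mem[6]
8             | 0           | miss (cold)     | Mem[8], Mem[6] (*)
0             | 0           | miss (capacity) | Mem[8] (*), Mem[0]
4             | 0           | miss (capacity) | Mem[4], Mem[0] (*)
2             | 0           | miss (capacity) | Mem[4] (*), Mem[2]
6             | 0           | miss (capacity) | Mem[6], Mem[2] (*)
8             | 0           | miss (capacity) | Mem[6] (*), Mem[8]
2             | 0           | miss (conflict) | Mem[2], Mem[8] (*)
6             | 0           | miss (conflict) | Mem[2] (*), Mem[6]
2             | 0           | hit             | Mem[2], Mem[6] (*)
0             | 0           | miss (capacity) | Mem[2] (*), Mem[0]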
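With only two ways, LRU needs a single marker per set: after every access, the way that was not touched becomes the LRU way. The sketch below replays the slide's sequence under that assumption; `simulate_two_way` is a hypothetical name.

```python
def simulate_two_way(trace, num_sets=2):
    """Simulate a 2-way set associative cache with one LRU marker per set."""
    ways = [[None, None] for _ in range(num_sets)]   # None marks an empty way
    lru = [0] * num_sets                             # index of the LRU way per set
    results = []
    for block in trace:
        s = block % num_sets
        if block in ways[s]:
            way = ways[s].index(block)
            results.append("hit")
        else:
            # Fill an empty way if there is one; otherwise replace the LRU way.
            way = ways[s].index(None) if None in ways[s] else lru[s]
            ways[s][way] = block
            results.append("miss")
        lru[s] = 1 - way                             # the other way is now LRU
    return results, ways, lru


results, ways, lru = simulate_two_way([0, 4, 2, 6, 8, 0, 4, 2, 6, 8, 2, 6, 2, 0])
print(results.count("miss"), "misses; set 0 =", ways[0], "LRU way =", lru[0])
```

The run reports 13 misses and one hit, and ends with Mem[2] marked LRU next to Mem[0], as in the last row of the table.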

32 Before Next Class: H&H 7.5.5, 8.2. Next Time: More Caches, Measuring Performance.


More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #24 Cache II 27-8-6 Scott Beamer, Instructor New Flow Based Routers CS61C L24 Cache II (1) www.anagran.com Caching Terminology When we try

More information

CS356: Discussion #9 Memory Hierarchy and Caches. Marco Paolieri Illustrations from CS:APP3e textbook

CS356: Discussion #9 Memory Hierarchy and Caches. Marco Paolieri Illustrations from CS:APP3e textbook CS356: Discussion #9 Memory Hierarchy and Caches Marco Paolieri (paolieri@usc.edu) Illustrations from CS:APP3e textbook The Memory Hierarchy So far... We modeled the memory system as an abstract array

More information

Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.2 (writes), 5.3, 5.5

Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.2 (writes), 5.3, 5.5 Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University See P&H 5.2 (writes), 5.3, 5.5 Announcements! HW3 available due next Tuesday HW3 has been updated. Use updated version.

More information

Advanced Memory Organizations

Advanced Memory Organizations CSE 3421: Introduction to Computer Architecture Advanced Memory Organizations Study: 5.1, 5.2, 5.3, 5.4 (only parts) Gojko Babić 03-29-2018 1 Growth in Performance of DRAM & CPU Huge mismatch between CPU

More information

CS 61C: Great Ideas in Computer Architecture Lecture 15: Caches, Part 2

CS 61C: Great Ideas in Computer Architecture Lecture 15: Caches, Part 2 CS 61C: Great Ideas in Computer Architecture Lecture 15: Caches, Part 2 Instructor: Sagar Karandikar sagark@eecsberkeleyedu hbp://insteecsberkeleyedu/~cs61c 1 So/ware Parallel Requests Assigned to computer

More information

UCB CS61C : Machine Structures

UCB CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 14 Caches III Asst. Proflecturer SOE Miki Garcia WHEN FIBER OPTICS IS TOO SLOW 07/16/2014: Wall Street Buys NATO Microwave Towers in

More information

CACHE ARCHITECTURE. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah

CACHE ARCHITECTURE. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah CACHE ARCHITECTURE Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Mar. 14 th : Homework 4 release (due on Mar. 27

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Caches Part 1. Instructor: Sören Schwertfeger. School of Information Science and Technology SIST

Caches Part 1. Instructor: Sören Schwertfeger.   School of Information Science and Technology SIST CS 110 Computer Architecture Caches Part 1 Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's

More information

Block Size Tradeoff (1/3) Benefits of Larger Block Size. Lecture #22 Caches II Block Size Tradeoff (3/3) Block Size Tradeoff (2/3)

Block Size Tradeoff (1/3) Benefits of Larger Block Size. Lecture #22 Caches II Block Size Tradeoff (3/3) Block Size Tradeoff (2/3) CS61C L22 Caches II (1) inst.eecs.berkeley.edu/~cs61c CS61C Machine Structures CPS today! Lecture #22 Caches II 25-11-16 There is one handout today at the front and back of the room! Lecturer PSOE, new

More information

Caches (Writing) Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.2 3, 5.5

Caches (Writing) Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.2 3, 5.5 s (Writing) Hakim Weatherspoon CS, Spring Computer Science Cornell University P & H Chapter.,. Administrivia Lab due next onday, April th HW due next onday, April th Goals for Today Parameter Tradeoffs

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 3//5 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Part Instructors: Krste Asanovic & Vladimir Stojanovic hfp://insteecsberkeleyedu/~cs6c/ Parallel Requests Assigned to computer

More information

CS 31: Intro to Systems Caching. Martin Gagne Swarthmore College March 23, 2017

CS 31: Intro to Systems Caching. Martin Gagne Swarthmore College March 23, 2017 CS 1: Intro to Systems Caching Martin Gagne Swarthmore College March 2, 2017 Recall A cache is a smaller, faster memory, that holds a subset of a larger (slower) memory We take advantage of locality to

More information

UC Berkeley CS61C : Machine Structures

UC Berkeley CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures Lecture 33 Caches III 2007-04-11 Future of movies is 3D? Dreamworks says they may exclusively release movies in this format. It s based

More information

CENG 3420 Computer Organization and Design. Lecture 08: Cache Review. Bei Yu

CENG 3420 Computer Organization and Design. Lecture 08: Cache Review. Bei Yu CENG 3420 Computer Organization and Design Lecture 08: Cache Review Bei Yu CEG3420 L08.1 Spring 2016 A Typical Memory Hierarchy q Take advantage of the principle of locality to present the user with as

More information

Caches II CSE 351 Spring #include <yoda.h> int is_try(int do_flag) { return!(do_flag!do_flag); }

Caches II CSE 351 Spring #include <yoda.h> int is_try(int do_flag) { return!(do_flag!do_flag); } Caches II CSE 351 Spring 2018 #include int is_try(int do_flag) { return!(do_flag!do_flag); } Memory Hierarchies Some fundamental and enduring properties of hardware and software systems: Faster

More information

V. Primary & Secondary Memory!

V. Primary & Secondary Memory! V. Primary & Secondary Memory! Computer Architecture and Operating Systems & Operating Systems: 725G84 Ahmed Rezine 1 Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM)

More information

Announcements. ! Previous lecture. Caches. Inf3 Computer Architecture

Announcements. ! Previous lecture. Caches. Inf3 Computer Architecture Announcements! Previous lecture Caches Inf3 Computer Architecture - 2016-2017 1 Recap: Memory Hierarchy Issues! Block size: smallest unit that is managed at each level E.g., 64B for cache lines, 4KB for

More information

Page 1. Multilevel Memories (Improving performance using a little cash )

Page 1. Multilevel Memories (Improving performance using a little cash ) Page 1 Multilevel Memories (Improving performance using a little cash ) 1 Page 2 CPU-Memory Bottleneck CPU Memory Performance of high-speed computers is usually limited by memory bandwidth & latency Latency

More information

ECE 30, Lab #8 Spring 2014

ECE 30, Lab #8 Spring 2014 ECE 30, Lab #8 Spring 20 Shown above is a multi-cycle CPU. There are six special registers in this datapath: PC, IR, MDR, A, B, and ALUOut. Of these, PC and IR are enabled to change when PCWr and IRWr

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #21: Caches 3 2005-07-27 CS61C L22 Caches III (1) Andy Carle Review: Why We Use Caches 1000 Performance 100 10 1 1980 1981 1982 1983

More information

Lecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University

Lecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University Lecture 12 Memory Design & Caches, part 2 Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b 1 Announcements HW3 is due today PA2 is available on-line today Part 1 is due on 2/27

More information

EEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work?

EEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work? EEC 17 Computer Architecture Fall 25 Introduction Review Review: The Hierarchy Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology

More information

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches, Set Associative Caches, Cache Performance

CS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches, Set Associative Caches, Cache Performance CS 6C: Great Ideas in Computer Architecture Direct Mapped Caches, Set Associative Caches, Cache Performance Instructor: Justin Hsia 7//23 Summer 23 Lecture # Great Idea #3: Principle of Locality/ Memory

More information

Figure 1: Organisation for 128KB Direct Mapped Cache with 16-word Block Size and Word Addressable

Figure 1: Organisation for 128KB Direct Mapped Cache with 16-word Block Size and Word Addressable Tutorial 12: Cache Problem 1: Direct Mapped Cache Consider a 128KB of data in a direct-mapped cache with 16 word blocks. Determine the size of the tag, index and offset fields if a 32-bit architecture

More information

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Recall

CSCI 402: Computer Architectures. Fengguang Song Department of Computer & Information Science IUPUI. Recall CSCI 402: Computer Architectures Memory Hierarchy (2) Fengguang Song Department of Computer & Information Science IUPUI Recall What is memory hierarchy? Where each level is located? Each level s speed,

More information

ADDRESS TRANSLATION AND TLB

ADDRESS TRANSLATION AND TLB ADDRESS TRANSLATION AND TLB Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 3 submission deadline: Nov.

More information