CS 61C: Great Ideas in Computer Architecture Lecture 16: Caches, Part 3
|
|
- Arron Watson
- 5 years ago
- Views:
Transcription
1 CS 6C: Great Ideas in Computer Architecture Lecture 6: s, Part 3 Instructor: Sagar Karandikar sagark@eecsberkeleyedu hep://insteecsberkeleyedu/~cs6c Parallel Requests Assigned to computer eg, Search Katz Parallel Threads Assigned to core eg, Lookup, Ads So/ware Parallel InstrucRons > one Rme eg, 5 pipelined instrucrons Parallel > data one Rme eg, Add of 4 pairs of words Hardware descriprons All one Rme Programming Languages You Are Here! Harness Parallelism & Achieve High Performance Hardware Warehouse Scale Computer Core Memory Input/Output InstrucRon Unit(s) Main Memory Computer () Core Core FuncRonal Unit(s) A +B A +B A +B A 3 +B 3 Smart Phone Today s Lecture Logic Gates s Review Processor issues addresses, cache breaks them into Tag/Index/Offset Direct- Mapped vs Set- AssociaRve vs Fully AssociaRve AMAT = Time + Miss Rate * Miss Penalty 3 Cs of cache misses: Compulsory, Capacity, Conflict Effect of cache parameters on performance Review: Processor Address Fields used by Controller Block Offset: Byte address within block #bits = log_(block size in bytes) Index: Selects which index (set) we want # bits = log_(number of sets) Tag: Remaining porron of processor address processor address size - offset bits - index bits Processor Address (3- bits total) Tag Set Index Block offset 3 4 Review: Direct Mapped One word blocks, cache size = Ki words (or 4KiB) Valid bit ensures something useful in cache for this index Compare Tag with upper part of Address to see if a Tag Index Index Valid Tag Comparator Block offset 3 Read data from cache instead of memory if a 5 Review: MulRword- Block Direct- Mapped Four words/block, cache size = Ki words Byte offset Tag Index Valid Tag Index 8 Block offset 3 6
2 Review: Four- Way Set- AssociaRve 8 = 56 sets each with four ways (each with one block) Byte offset Tag Index Index V Tag V Tag V Tag V Tag Way Way Way Way Review: Measurement Terms rate: fracron of accesses that hit in the cache Miss rate: rate Miss penalty: Rme to replace a block from lower level in memory hierarchy to cache Rme: Rme to access cache memory (including tag comparison) 4x select 3 7 These are controlled by the myriad cache parameters we can set 8 Primary Parameters Block size (aka line size) how many bytes of data in each cache entry? AssociaRvity how many ways in each set? Direct- mapped => AssociaRvity = Set- associarve => < AssociaRvity < #Entries Fully associarve => AssociaRvity = #Entries Capacity (bytes) = Total #Entries * Block size #Entries = #Sets * AssociaRvity Other Parameters Write Policy Replacement policy 9 Write Policy Choices hit: write through: writes both cache & memory on every access Generally higher memory traffic but simpler pipeline & cache design write back: writes cache only, memory `wrieen only when dirty entry evicted A dirty bit per line reduces write- back traffic Must handle,, or accesses to memory for each load/store miss: no write allocate: only write to main memory write allocate (aka fetch on write): fetch into cache Common combinarons: write through and no write allocate write back with write allocate Replacement Policy In an associarve cache, which line from a set should be evicted when the set becomes full? Random Least- Recently Used (LRU) LRU cache state must be updated on every access True implementaron only feasible for small sets (- way) Pseudo- LRU binary tree oxen used for 4-8 way First- In, First- Out (FIFO) aka Round- Robin Used in highly associarve caches Not- Most- Recently Used (NMRU) FIFO with excepron for most- recently used line or lines This is a second- order effect Why? Replacement only happens on misses
3 Sources of Misses (3 C s) Compulsory (cold start, first reference): st access to a block in memory, cold, fact of life, not a lot you can do about it If running billions of instrucrons, compulsory misses are insignificant Capacity: cannot contain all blocks accessed by the program Misses that would not occur with infinite cache Conflict (collision): MulRple memory locarons mapped to same cache set Misses that would not occur with ideal fully associarve cache 3 Rules for Determining Miss Type for a Given Access PaEern in 6C ) Compulsory: A miss is compulsory if and only if it results from accessing the data for the first Rme If you have ever brought the data you are accessing into the cache before, it's not a compulsory miss, otherwise it is ) Conflict: A conflict miss is a miss that's not compulsory and that would have been avoided if the cache was fully associarve Imagine you had a cache with the same parameters but fully- associarve (with LRU) If that would've avoided the miss, it's a conflict miss 3) Capacity: This is a miss that would not have happened with an infinitely large cache If your miss is not a compulsory or conflict miss, it's a capacity miss 4 Impact of Parameters on Performance AMAT = Time + Miss Rate * Miss Penalty Note, we assume always first search cache, so must incorporate hit Rme for both hits and misses! For misses, characterize by 3Cs Administrivia Project 3- Out Last week, we built a CPU together, this week, you start building your own! HW4 Out - s Guerrilla SecRon on Pipelining, s on Thursday, 5-7pm, Woz 5 6 Administrivia Break Midterm is one week from tomorrow In this room, at this Rme Two double- sided 85 x handwrieen cheatsheets We ll provide a MIPS green sheet No electronics Covers up to and including tomorrow s lecture (7/)* Review session is Friday, 7/4 from - 4pm in HP Aud * This is one less lecture than inirally indicated 7 8 3
4 CPU- InteracRon (5- stage pipeline) Increasing Block Size? x4 Add bubble PC addr inst hit? PCen Primary InstrucRon Stall at fetch stage on inst cache miss (but not the whole CPU) IR D To Memory Control Decode, Register Fetch A B MD ALU Y MD Refill from Lower Levels of Memory Hierarchy E M we addr Primary rdata hit? wdata R Stall enrre CPU on data cache miss 9 Rme as block size increases? Rme unchanged, but might be slight hit- Rme reducron as number of tags is reduced, so faster to access memory holding tags Miss rate as block size increases? Goes down at first due to sparal locality, then increases due to increased conflict misses due to fewer blocks in cache Miss penalty as block size increases? Rises with longer block size, but with fixed constant iniral latency that is amorrzed over whole block Increasing AssociaRvity? Rme as associarvity increases? Increases, with large step from direct- mapped to >= ways, as now need to mux correct way to processor Smaller increases in hit Rme for further increases in associarvity Miss rate as associarvity increases? Goes down due to reduced conflict misses, but most gain is from - >- >4- way with limited benefit from higher associarvires Miss penalty as associarvity increases? Unchanged, replacement policy runs in parallel with fetching missing line from memory Increasing #Entries? (Capacity) Rme as #entries increases? Increases, since reading tags and data from larger memory structures Miss rate as #entries increases? Goes down due to reduced capacity and conflict misses Architects rule of thumb: miss rate drops ~x for every ~4x increase in capacity (only a gross approximaron) Miss penalty as #entries increases? Unchanged How to Reduce Miss Penalty? Could there be locality on misses from a cache? Use mulrple cache levels! With Moore s Law, more room on die for bigger L caches and for second- level (L) cache And in some cases even an L3 cache! IBM mainframes have ~GB L4 cache off- chip 3 Inner Levels in memory hierarchy Outer Review: Memory Hierarchy Processor Level Level Level 3 Level n Increasing distance from processor, decreasing speed Size of memory at each level As we move to outer levels the latency goes up and price per bit goes down 4 4
5 Local vs Global Miss Rates Local miss rate the fracron of references to one level of a cache that miss Local Miss rate L$ = # $L Misses / # L$ Misses Global miss rate the fracron of references that miss in all levels of a mulrlevel cache L$ local miss rate >> than the global miss rate L$ Global Miss rate = L$ Misses / Total Accesses L : 3KB I$, 3KB D$ L : 56 KB L3 : 4 MB 5 7//5 Fall Lecture #3 6 Local vs Global Miss Rates Local miss rate the fracron of references to one level of a cache that miss Local Miss rate L$ = $L Misses / L$ Misses Global miss rate the fracron of references that miss in all levels of a mulrlevel cache L$ Global Miss rate = L$ Misses / Total Accesses = L$ Misses / L$ Misses x L$ Misses / Total Accesses = Local Miss rate L$ x Local Miss rate L$ AMAT = Time for a hit + Miss rate x Miss penalty AMAT = Time for a L$ hit + (local) Miss rate L$ x (Time for a L$ hit + (local) Miss rate L$ x L$ Miss penalty) 7 8 On a ReRna MBP ~» sudo sysctl -a grep cache hwcachelinesize = 64 hwlicachesize = 3768 hwldcachesize = 3768 hwlcachesize = 644 hwl3cachesize = kernflush_cache_on_write: kernnamecache_disabled: vmvm_page_filecache_min: 975 On a hive machine, run: cat /sys/devices/system/ cpu/cpu/cache/index/* A cool thing about Unix: you can poke around by reading files vfsgenericnfsclientaccess_cache_timeout: 6 vfsgenericnfsserverreqcache_size: 64 netinetiprtmaxcache: 8 netinet6ip6rtmaxcache: 8 hwcacheconfig: 8 8 hwcachesize: hwcachelinesize: 64 9 hwlicachesize: 3768 CPI/Miss Rates/DRAM Access SpecInt6 Only Only InstrucRons and Fall 3 3 7//5 - - Lecture # 5
6 Clickers/Peer InstrucRon Overall, what are L and L3 local miss rates? A: L > 5%, L3 > 5% B: L < 5%, L3 < 5% C: L ~ 5%, L3 ~ 5% D: L > 5%, L3 < 5% E: L > 5%, L3 ~5% CS6C In the News Privacy- respecrng Laptops on the frontpage of Hacker News yesterday: The Gluglug Libreboot X Purism Librem 3/5 Why does it maeer? You can run all the free soxware you want, but: ) Who knows what the hardware is really doing? ) 99999% of your machines srll have proprietary blobs of driver/firmware code (some are even encrypted) 3 heps://newsycombinatorcom/item?id=9934 heps://newsycombinatorcom/item?id= Break 33 Let s say we re interested in analyzing program performance with respect to the cache: i++) { sum += a[i]; 34 ) Assume we only care about D$ ) Assume array is block aligned ie First byte of a[] is the first byte in a block i++) { sum += a[i]; 3) Assume local vars live in registers So what do we care about? The sequence of accesses to array a 35 i++) { sum += a[i]; Important first quesrons: ) What does our access paeern look like within a block? ) How many cache blocks does our array take up? Does it fit enrrely in the cache at once? 36 6
7 i++) { sum += a[i]; ) What does our access paeern look like within a block? We re iterarng over 48 ints, each of which are 4 bytes A block in this cache holds 4 ints: (6 B/block)/(4 B/int) = 4 ints/block 37 ) What does our access paeern look like within a block? We re iterarng over 48 ints, each of which are 4 bytes A block in this cache holds 4 ints: (6 B/block)/ (4 B/int) = 4 ints/block Block as bytes: Block as ints: This repeats over and over! 5% miss rate a[] a[] a[] a[3] IteraRon : Compulsory Miss IteraRon : IteraRon : IteraRon 3: 38 # ) What does our access paeern look like within a block? We re iterarng over an array of 48 ints (but only accessing 4 of them), each of which are 4 bytes A block in this cache holds 4 ints: (6 B/block)/(4 B/int) = 4 ints/block 39 ) What does our access paeern look like within a block? (let s srck with the first loop) We re iterarng over 48 ints 4 ints, each of which are 4 bytes A block in this cache holds 4 ints: (6 B/block)/(4 B/int) = 4 ints/block Block as bytes: Block as ints: This repeats over and over for the first loop! 5% miss rate a[] a[] a[] a[3] IteraRon : Compulsory Miss IteraRon : 4 But now, we have to worry about quesron : ) How many cache blocks does our array take up? The array contains 48 ints That s 48*4 = 89 B blocks are 6 B each, so 89/6 = 5 blocks Does it fit enurely in the cache at once? No is only 496 B, contains only 56 blocks 4 # So, to figure out what loop s performance is going to be like, we need to know what s in the cache at the end of loop 4 7
8 # // what s the state of the cache here? Say for the sake of argument a[] that our array starts at address Then, halfway through the first loop (between i = and i = 4), the first half of the array is in the cache 43 a[3] # // what s the state of the cache here? Now, when we execute the a[] a[4] second half of the accesses in loop one, we replace everything already in the cache: a[3] a[47] 44 # // what s the state of the cache here? So, at the end of the first loop, the second half of the array is in the cache a[4] a[47] 45 # Now, let s take a look at loop NoRce that it s going to start by accessing the first half of the array, none of which is in the cache 46 ) What does our access paeern look like within a block? (now, for the second loop) We re iterarng over 48 ints 4 ints, each of which are 4 bytes A block in this cache holds 4 ints: (6 B/block)/(4 B/int) = 4 ints/block Block as bytes: Block as ints: This repeats over and over for the second loop! 5% miss rate a[] a[] a[] a[3] IteraRon : Capacity Miss a[4] was in this spot in the cache IteraRon : 47 # So, we got a 5% miss rate for loop # and a 5% miss rate for loop # Now, we compute a weighted average: Total miss rate = weighted average of miss rates Half the accesses were in loop, the other half in loop : Total miss rate = 5 * loop miss rate + 5 * loop miss rate 48 Total miss rate = 5 * 5% + 5 * 5% = 5% 8
9 Clicker QuesRon p KiB Direct- Mapped What s the miss rate for the first loop? A % B 5 % C 5 % D 75 % E % Clicker QuesRon p KiB Direct- Mapped What s the miss rate for the second loop? A % B 5 % C 5 % D 75 % E % 49 5 Clicker QuesRon p3 KiB Direct- Mapped What types of misses do we get in the first/second loops? A Capacity/Capacity B Compulsory/Capacity C Compulsory/Conflict D Conflict/Capacity E Compulsory/Compulsory 5 In Conclusion, Huge Design Space Several interacrng dimensions size Block size AssociaRvity Replacement policy Write- through vs write- back Write- allocauon OpRmal choice is a compromise Depends on access characterisrcs Workload Use (I- cache, D- cache) So we learned how to analyze code performance on a cache Simplicity oxen wins Bad Good Size Factor A Less AssociaUvity Block Size Factor B More 5 9
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Krste Asanovic & Vladimir Stojanovic hfp://inst.eecs.berkeley.edu/~cs61c/ Parallel Requests Assigned to computer
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Krste Asanovic & Vladimir Stojanovic hcp://inst.eecs.berkeley.edu/~cs61c/ So$ware Parallel Requests Assigned
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Bernhard Boser & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/24/16 Fall 2016 - Lecture #16 1 Software
More information10/19/17. You Are Here! Review: Direct-Mapped Cache. Typical Memory Hierarchy
CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Krste Asanović & Randy H Katz http://insteecsberkeleyedu/~cs6c/ Parallel Requests Assigned to computer eg, Search
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/19/17 Fall 2017 - Lecture #16 1 Parallel
More informationShow Me the $... Performance And Caches
Show Me the $... Performance And Caches 1 CPU-Cache Interaction (5-stage pipeline) PCen 0x4 Add bubble PC addr inst hit? Primary Instruction Cache IR D To Memory Control Decode, Register Fetch E A B MD1
More informationMo Money, No Problems: Caches #2...
Mo Money, No Problems: Caches #2... 1 Reminder: Cache Terms... Cache: A small and fast memory used to increase the performance of accessing a big and slow memory Uses temporal locality: The tendency to
More informationCS 61C: Great Ideas in Computer Architecture Lecture 15: Caches, Part 2
CS 61C: Great Ideas in Computer Architecture Lecture 15: Caches, Part 2 Instructor: Sagar Karandikar sagark@eecsberkeleyedu hbp://insteecsberkeleyedu/~cs61c 1 So/ware Parallel Requests Assigned to computer
More informationCS 61C: Great Ideas in Computer Architecture Lecture 15: Caches, Part 2
7/6/5 CS 6C: Great Ideas in Computer Architecture Lecture 5: Caches, Part Instructor: Sagar Karandikar sagark@eecsberkeleyedu hdp://insteecsberkeleyedu/~cs6c Parallel Requests Assigned to computer eg,
More informationCS 61C: Great Ideas in Computer Architecture Caches Part 2
CS 61C: Great Ideas in Computer Architecture Caches Part 2 Instructors: Nicholas Weaver & Vladimir Stojanovic http://insteecsberkeleyedu/~cs61c/fa15 Software Parallel Requests Assigned to computer eg,
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Mul$- Level Caches
Costs of Set AssociaRve s CS 61C: Great Ideas in Computer Architecture (Machine Structures) Mul$- Level s Instructors: Randy H. Katz David A. PaGerson hgp://inst.eecs.berkeley.edu/~cs61c/fa10 1 When miss
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!
More informationMemory Hierarchy. 2/18/2016 CS 152 Sec6on 5 Colin Schmidt
Memory Hierarchy 2/18/2016 CS 152 Sec6on 5 Colin Schmidt Agenda Review Memory Hierarchy Lab 2 Ques6ons Return Quiz 1 Latencies Comparison Numbers L1 Cache 0.5 ns L2 Cache 7 ns 14x L1 cache Main Memory
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste
More informationEECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141
EECS151/251A Spring 2018 Digital Design and Integrated Circuits Instructors: John Wawrzynek and Nick Weaver Lecture 19: Caches Cache Introduction 40% of this ARM CPU is devoted to SRAM cache. But the role
More informationLecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II John Wawrzynek Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~johnw
More informationAgenda. Cache-Memory Consistency? (1/2) 7/14/2011. New-School Machine Structures (It s a bit more complicated!)
7/4/ CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches II Instructor: Michael Greenbaum New-School Machine Structures (It s a bit more complicated!) Parallel Requests Assigned to
More informationCS 110 Computer Architecture
CS 110 Computer Architecture Performance and Floating Point Arithmetic Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: John Wawrzynek & Vladimir Stojanovic http://insteecsberkeleyedu/~cs61c/ Typical Memory Hierarchy Datapath On-Chip
More informationCaches Part 1. Instructor: Sören Schwertfeger. School of Information Science and Technology SIST
CS 110 Computer Architecture Caches Part 1 Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's
More informationInstructors: Randy H. Katz David A. PaGerson hgp://inst.eecs.berkeley.edu/~cs61c/fa10. 10/4/10 Fall Lecture #16. Agenda
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Instructors: Randy H. Katz David A. PaGerson hgp://inst.eecs.berkeley.edu/~cs61c/fa10 1 Agenda Cache Sizing/Hits and Misses Administrivia
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2
3//5 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Part Instructors: Krste Asanovic & Vladimir Stojanovic hfp://insteecsberkeleyedu/~cs6c/ Parallel Requests Assigned to computer
More informationCS 61C: Great Ideas in Computer Architecture Direct- Mapped Caches. Increasing distance from processor, decreasing speed.
CS 6C: Great Ideas in Computer Architecture Direct- Mapped s 9/27/2 Instructors: Krste Asanovic, Randy H Katz hdp://insteecsberkeleyedu/~cs6c/fa2 Fall 2 - - Lecture #4 New- School Machine Structures (It
More informationLecture 11 Cache. Peng Liu.
Lecture 11 Cache Peng Liu liupeng@zju.edu.cn 1 Associative Cache Example 2 Associative Cache Example 3 Associativity Example Compare 4-block caches Direct mapped, 2-way set associative, fully associative
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures)
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Instructors: Randy H. Katz David A. PaGerson hgp://inst.eecs.berkeley.edu/~cs61c/fa10 1 2 Cache Field Sizes Number of bits in a cache includes
More informationECSE 425 Lecture 21: More Cache Basics; Cache Performance
ECSE 425 Lecture 21: More Cache Basics; Cache Performance H&P Appendix C 2011 Gross, Hayward, Arbel, Vu, Meyer Textbook figures Last Time Two QuesRons Q1: Block placement Q2: Block idenrficaron ECSE 425,
More informationCaches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University
Caches (Writing) P & H Chapter 5.2 3, 5.5 Hakim Weatherspoon CS 34, Spring 23 Computer Science Cornell University Welcome back from Spring Break! Welcome back from Spring Break! Big Picture: Memory Code
More informationLecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University
Lecture 12 Memory Design & Caches, part 2 Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b 1 Announcements HW3 is due today PA2 is available on-line today Part 1 is due on 2/27
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/ Components of a Computer Processor
More informationCaches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University
Caches (Writing) P & H Chapter 5.2 3, 5.5 Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Big Picture: Memory Code Stored in Memory (also, data and stack) memory PC +4 new pc
More informationCS 61C: Great Ideas in Computer Architecture. Multilevel Caches, Cache Questions
CS 61C: Great Ideas in Computer Architecture Multilevel Caches, Cache Questions Instructor: Alan Christopher 7/14/2014 Summer 2014 -- Lecture #12 1 Great Idea #3: Principle of Locality/ Memory Hierarchy
More informationCS 61C: Great Ideas in Computer Architecture. Direct Mapped Caches, Set Associative Caches, Cache Performance
CS 6C: Great Ideas in Computer Architecture Direct Mapped Caches, Set Associative Caches, Cache Performance Instructor: Justin Hsia 7//23 Summer 23 Lecture # Great Idea #3: Principle of Locality/ Memory
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/16/17 Fall 2017 - Lecture #15 1 Outline
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Bernhard Boser & Randy H. Katz hdp://inst.eecs.berkeley.edu/~cs61c/ 10/13/16 Fall 2016 - Lecture #14 1 New-School
More information10/16/17. Outline. Outline. Typical Memory Hierarchy. Adding Cache to Computer. Key Cache Concepts
// CS C: Great Ideas in Computer Architecture (Machine Structures) s Part Instructors: Krste Asanović & Randy H Katz http://insteecsberkeleyedu/~csc/ Organization and Principles Write Back vs Write Through
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: Bernhard Boser & Randy H Katz http://insteecsberkeleyedu/~cs61c/ 10/18/16 Fall 2016 - Lecture #15 1 Outline
More informationAgenda. EE 260: Introduction to Digital Design Memory. Naive Register File. Agenda. Memory Arrays: SRAM. Memory Arrays: Register File
EE 260: Introduction to Digital Design Technology Yao Zheng Department of Electrical Engineering University of Hawaiʻi at Mānoa 2 Technology Naive Register File Write Read clk Decoder Read Write 3 4 Arrays:
More informationCS3350B Computer Architecture
CS335B Computer Architecture Winter 25 Lecture 32: Exploiting Memory Hierarchy: How? Marc Moreno Maza wwwcsduwoca/courses/cs335b [Adapted from lectures on Computer Organization and Design, Patterson &
More informationEE 4683/5683: COMPUTER ARCHITECTURE
EE 4683/5683: COMPUTER ARCHITECTURE Lecture 6A: Cache Design Avinash Kodi, kodi@ohioedu Agenda 2 Review: Memory Hierarchy Review: Cache Organization Direct-mapped Set- Associative Fully-Associative 1 Major
More information10/11/17. New-School Machine Structures. Review: Single Cycle Instruction Timing. Review: Single-Cycle RISC-V RV32I Datapath. Components of a Computer
// CS C: Great Ideas in Computer Architecture (Machine Structures) s Part Instructors: Krste Asanović & Randy H Katz http://insteecsberkeleyedu/~csc/ // Fall - Lecture # Parallel Requests Assigned to computer
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #21: Caches 3 2005-07-27 CS61C L22 Caches III (1) Andy Carle Review: Why We Use Caches 1000 Performance 100 10 1 1980 1981 1982 1983
More informationCS 61C: Great Ideas in Computer Architecture. Cache Performance, Set Associative Caches
CS 61C: Great Ideas in Computer Architecture Cache Performance, Set Associative Caches Instructor: Justin Hsia 7/09/2012 Summer 2012 Lecture #12 1 Great Idea #3: Principle of Locality/ Memory Hierarchy
More informationEEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work?
EEC 17 Computer Architecture Fall 25 Introduction Review Review: The Hierarchy Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/11/17 Fall 2017 - Lecture #14 1 Parallel
More informationCourse Administration
Spring 207 EE 363: Computer Organization Chapter 5: Large and Fast: Exploiting Memory Hierarchy - Avinash Kodi Department of Electrical Engineering & Computer Science Ohio University, Athens, Ohio 4570
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures)
CS 6C: Great Ideas in Computer Architecture (Machine Structures) Instructors: Randy H Katz David A PaHerson hhp://insteecsberkeleyedu/~cs6c/fa Direct Mapped (contnued) - Interface CharacterisTcs of the
More informationUCB CS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 13 Caches II 2013-02-22 Lecturer SOE Dan Garcia HP has begun testing research prototypes of a novel non-volatile memory element, the
More informationImproving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Highly-Associative Caches
Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging 6.823, L8--1 Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Highly-Associative
More informationMemory hier ar hier ch ar y ch rev re i v e i w e ECE 154B Dmitri Struko Struk v o
Memory hierarchy review ECE 154B Dmitri Strukov Outline Cache motivation Cache basics Opteron example Cache performance Six basic optimizations Virtual memory Processor DRAM gap (latency) Four issue superscalar
More informationPage 1. Multilevel Memories (Improving performance using a little cash )
Page 1 Multilevel Memories (Improving performance using a little cash ) 1 Page 2 CPU-Memory Bottleneck CPU Memory Performance of high-speed computers is usually limited by memory bandwidth & latency Latency
More informationReview: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.
Performance 980 98 982 983 984 985 986 987 988 989 990 99 992 993 994 995 996 997 998 999 2000 7/4/20 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Instructor: Michael Greenbaum
More informationDirect-Mapped and Set Associative Caches. Instructor: Steven Ho
Direct-Mapped and Set Associative Caches Instructor: Steven Ho Great Idea #3: Principle of Locality/ Memory Hierarchy 7/6/28 CS6C Su8 - Lecture 5 2 Extended Review of Last Lecture Why have caches? Intermediate
More informationCSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]
CSF Cache Introduction [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user with as much
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationThe Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2)
The Memory Hierarchy Cache, Main Memory, and Virtual Memory (Part 2) Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Cache Line Replacement The cache
More informationCaches. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)
s akim Weatherspoon CS, Spring Computer Science Cornell University See P&.,. (except writes) Big Picture: : big & slow vs s: small & fast compute jump/branch targets memory PC + new pc Instruction Fetch
More informationMemory hierarchy review. ECE 154B Dmitri Strukov
Memory hierarchy review ECE 154B Dmitri Strukov Outline Cache motivation Cache basics Six basic optimizations Virtual memory Cache performance Opteron example Processor-DRAM gap in latency Q1. How to deal
More informationTopics. Digital Systems Architecture EECE EECE Need More Cache?
Digital Systems Architecture EECE 33-0 EECE 9-0 Need More Cache? Dr. William H. Robinson March, 00 http://eecs.vanderbilt.edu/courses/eece33/ Topics Cache: a safe place for hiding or storing things. Webster
More informationReducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip
Reducing Hit Times Critical Influence on cycle-time or CPI Keep L1 small and simple small is always faster and can be put on chip interesting compromise is to keep the tags on chip and the block data off
More informationCOEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory
1 COEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory Cristinel Ababei Dept. of Electrical and Computer Engineering Marquette University Credits: Slides adapted from presentations
More informationCS 136: Advanced Architecture. Review of Caches
1 / 30 CS 136: Advanced Architecture Review of Caches 2 / 30 Why Caches? Introduction Basic goal: Size of cheapest memory... At speed of most expensive Locality makes it work Temporal locality: If you
More informationCaches III. CSE 351 Autumn Instructor: Justin Hsia
Caches III CSE 351 Autumn 2018 Instructor: Justin Hsia Teaching Assistants: Akshat Aggarwal An Wang Andrew Hu Brian Dai Britt Henderson James Shin Kevin Bi Kory Watson Riley Germundson Sophie Tian Teagan
More informationAdvanced Computer Architecture
ECE 563 Advanced Computer Architecture Fall 2009 Lecture 3: Memory Hierarchy Review: Caches 563 L03.1 Fall 2010 Since 1980, CPU has outpaced DRAM... Four-issue 2GHz superscalar accessing 100ns DRAM could
More informationCS61C Midterm 2 Review Session. Problems Only
CS61C Midterm 2 Review Session Problems Only Floating Point Big Ideas in Number Rep Represent a large range of values by trading off precision. Store a number of fixed size (significand) and float its
More informationCaches. See P&H 5.1, 5.2 (except writes) Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University
s See P&.,. (except writes) akim Weatherspoon CS, Spring Computer Science Cornell University What will you do over Spring Break? A) Relax B) ead home C) ead to a warm destination D) Stay in (frigid) Ithaca
More informationCache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals
Cache Memory COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline The Need for Cache Memory The Basics
More informationQ3: Block Replacement. Replacement Algorithms. ECE473 Computer Architecture and Organization. Memory Hierarchy: Set Associative Cache
Fundamental Questions Computer Architecture and Organization Hierarchy: Set Associative Q: Where can a block be placed in the upper level? (Block placement) Q: How is a block found if it is in the upper
More informationCSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1
CSE 431 Computer Architecture Fall 2008 Chapter 5A: Exploiting the Memory Hierarchy, Part 1 Mary Jane Irwin ( www.cse.psu.edu/~mji ) [Adapted from Computer Organization and Design, 4 th Edition, Patterson
More informationCS 61C: Great Ideas in Computer Architecture Lecture 17: Performance and Floa.ng Point Arithme.c
CS 61C: Great Ideas in Computer Architecture Lecture 17: Performance and Floa.ng Point Arithme.c Instructor: Sagar Karandikar sagark@eecs.berkeley.edu hbp://inst.eecs.berkeley.edu/~cs61c 1 Review: Adding
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance
More informationReview : Pipelining. Memory Hierarchy
CS61C L11 Caches (1) CS61CL : Machine Structures Review : Pipelining The Big Picture Lecture #11 Caches 2009-07-29 Jeremy Huddleston!! Pipeline challenge is hazards "! Forwarding helps w/many data hazards
More informationMemory Hierarchy. Slides contents from:
Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory
More informationCaches. See P&H 5.1, 5.2 (except writes) Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University
s See P&.,. (except writes) akim Weatherspoon CS, Spring Computer Science Cornell University What will you do over Spring Break? A) Relax B) ead home C) ead to a warm destination D) Stay in (frigid) Ithaca
More informationMemory Hierarchy. Slides contents from:
Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory
More information3Introduction. Memory Hierarchy. Chapter 2. Memory Hierarchy Design. Computer Architecture A Quantitative Approach, Fifth Edition
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationCS161 Design and Architecture of Computer Systems. Cache $$$$$
CS161 Design and Architecture of Computer Systems Cache $$$$$ Memory Systems! How can we supply the CPU with enough data to keep it busy?! We will focus on memory issues,! which are frequently bottlenecks
More informationReview: New-School Machine Structures. Review: Direct-Mapped Cache. TIO Dan s great cache mnemonic. Memory Access without Cache
In st r uct io n Un it ( s) A+B A1+B1 A+B A3+B3 Guest Lecturer Alan Christopher inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 13 Caches II 14--1 MEMRISTOR MEMORY ON ITS WAY (HOPEFULLY)
More informationPage 1. Memory Hierarchies (Part 2)
Memory Hierarchies (Part ) Outline of Lectures on Memory Systems Memory Hierarchies Cache Memory 3 Virtual Memory 4 The future Increasing distance from the processor in access time Review: The Memory Hierarchy
More informationLevels in memory hierarchy
CS1C Cache Memory Lecture 1 March 1, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson) www-inst.eecs.berkeley.edu/~cs1c/schedule.html Review 1/: Memory Hierarchy Pyramid Upper Levels in memory hierarchy
More informationModern Computer Architecture
Modern Computer Architecture Lecture3 Review of Memory Hierarchy Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Performance 1000 Recap: Who Cares About the Memory Hierarchy? Processor-DRAM Memory Gap
More informationCS61C Midterm 2 Review Session. Problems + Solutions
CS61C Midterm 2 Review Session Problems + Solutions Floating Point Big Ideas in Number Rep Represent a large range of values by trading off precision. Store a number of fixed size (significand) and float
More informationCaches (Writing) Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.2 3, 5.5
s (Writing) Hakim Weatherspoon CS, Spring Computer Science Cornell University P & H Chapter.,. Administrivia Lab due next onday, April th HW due next onday, April th Goals for Today Parameter Tradeoffs
More informationLecture 17 Introduction to Memory Hierarchies" Why it s important " Fundamental lesson(s)" Suggested reading:" (HP Chapter
Processor components" Multicore processors and programming" Processor comparison" vs." Lecture 17 Introduction to Memory Hierarchies" CSE 30321" Suggested reading:" (HP Chapter 5.1-5.2)" Writing more "
More informationECE 552 / CPS 550 Advanced Computer Architecture I. Lecture 13 Memory Part 2
ECE 552 / CPS 550 Advanced Computer Architecture I Lecture 13 Memory Part 2 Benjamin Lee Electrical and Computer Engineering Duke University www.duke.edu/~bcl15 www.duke.edu/~bcl15/class/class_ece252fall12.html
More informationCaches and Memory. Anne Bracy CS 3410 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.10, 5.13, 5.15, 5.17
Caches and emory Anne Bracy CS 34 Computer Science Cornell University Slides by Anne Bracy with 34 slides by Professors Weatherspoon, Bala, ckee, and Sirer. See P&H Chapter: 5.-5.4, 5.8, 5., 5.3, 5.5,
More informationCSF Improving Cache Performance. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]
CSF Improving Cache Performance [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance
More informationCaches. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)
Caches akim Weatherspoon CS 341, Spring 212 Computer Science Cornell University See P& 5.1, 5.2 (except writes) ctrl ctrl ctrl inst imm B A B D D Big Picture: emory emory: big & slow vs Caches: small &
More informationECE 252 / CPS 220 Advanced Computer Architecture I. Lecture 13 Memory Part 2
ECE 252 / CPS 220 Advanced Computer Architecture I Lecture 13 Memory Part 2 Benjamin Lee Electrical and Computer Engineering Duke University www.duke.edu/~bcl15 www.duke.edu/~bcl15/class/class_ece252fall11.html
More informationCENG 3420 Computer Organization and Design. Lecture 08: Memory - I. Bei Yu
CENG 3420 Computer Organization and Design Lecture 08: Memory - I Bei Yu CEG3420 L08.1 Spring 2016 Outline q Why Memory Hierarchy q How Memory Hierarchy? SRAM (Cache) & DRAM (main memory) Memory System
More informationSpring 2016 :: CSE 502 Computer Architecture. Caches. Nima Honarmand
Caches Nima Honarmand Motivation 10000 Performance 1000 100 10 Processor Memory 1 1985 1990 1995 2000 2005 2010 Want memory to appear: As fast as CPU As large as required by all of the running applications
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #23 Cache I 2007-8-2 Scott Beamer, Instructor CS61C L23 Caches I (1) The Big Picture Computer Processor (active) Control ( brain ) Datapath
More informationSpring 2018 :: CSE 502. Cache Design Basics. Nima Honarmand
Cache Design Basics Nima Honarmand Storage Hierarchy Make common case fast: Common: temporal & spatial locality Fast: smaller, more expensive memory Bigger Transfers Registers More Bandwidth Controlled
More informationJohn Wawrzynek & Nick Weaver
CS 61C: Great Ideas in Computer Architecture Lecture 23: Virtual Memory John Wawrzynek & Nick Weaver http://inst.eecs.berkeley.edu/~cs61c From Previous Lecture: Operating Systems Input / output (I/O) Memory
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #24 Cache II 27-8-6 Scott Beamer, Instructor New Flow Based Routers CS61C L24 Cache II (1) www.anagran.com Caching Terminology When we try
More informationNew-School Machine Structures. Overarching Theme for Today. Agenda. Review: Memory Management. The Problem 8/1/2011
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Virtual Instructor: Michael Greenbaum 1 New-School Machine Structures Software Parallel Requests Assigned to computer e.g., Search Katz
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 23: Associative Caches Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Last time: Write-Back Alternative: On data-write hit, just
More informationCaches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.2 (writes), 5.3, 5.5
Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University See P&H 5.2 (writes), 5.3, 5.5 Announcements! HW3 available due next Tuesday HW3 has been updated. Use updated version.
More informationMemory Hierarchy. ENG3380 Computer Organization and Architecture Cache Memory Part II. Topics. References. Memory Hierarchy
ENG338 Computer Organization and Architecture Part II Winter 217 S. Areibi School of Engineering University of Guelph Hierarchy Topics Hierarchy Locality Motivation Principles Elements of Design: Addresses
More information