Lecture 15: Cache Design (in Isolation) James C. Hoe Department of ECE Carnegie Mellon University
1 Lecture 15: Cache Design (in Isolation) James C. Hoe Department of ECE Carnegie Mellon University S18 L15 S1, James C. Hoe, CMU/ECE/CALCM, 2018
2 Housekeeping
Your goal today:
- recover from Spring Break
- understand the ABC of caches
- understand the 3 C's of caches
Notices:
- Lab 3, due next week
- HW4 out on Wed
- Midterm 1 regrade in by Friday
Readings: P&H Ch
3 The Basic Problem
Potentially M = 2^m bytes of memory: how do we keep copies of the most frequently used locations in C bytes of fast storage, where C << M?
Basic issues (intertwined):
(1) when to cache a copy of a memory location
(2) where in fast storage to keep the copy
(3) how to find the copy later on (LW and SW only give indices into M)
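4 Direct Mapped Cache (v1)
[Figure: the lg2(M)-bit address is split into tag (t bits), idx (lg2(C/G) bits) and g. idx selects one line in the Tag Bank (C/G lines by t bits, plus a valid bit) and in the Data Bank (C/G lines by G bytes). The stored tag is compared (=) with the address tag to produce "hit?"; the Data Bank supplies G bytes of data.]
let t = lg2(M) - lg2(C)
What about writes?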
5 Storage Overhead and Block Size
For each cache block of G bytes, we also store t+1 bits (t tag bits plus a valid bit), where t = lg2(M) - lg2(C)
- if M = 2^32, G = 4, C = 16K = 2^14: t = 18 bits for each 4-byte block
- 60% overhead; a 16KB cache is actually 25.5KB of SRAM
Solution: amortize the tag over a larger B-byte block
- manage B/G consecutive words as an indivisible unit
- if M = 2^32, B = 16, G = 4, C = 16K: t = 18 bits for each 16-byte block
- 15% overhead; a 16KB cache is actually 18.4KB of SRAM
- spatial locality also says this is a good idea
Larger caches want even bigger blocks
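The overhead figures on this slide can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name `tag_overhead` and its parameters are hypothetical, and it assumes a direct-mapped cache with exactly one valid bit per block and no other metadata.

```python
def tag_overhead(m_bits, capacity, block):
    """Return (tag_bits, overhead_fraction, total_kbytes) for a direct-mapped cache.

    Assumes power-of-2 sizes and t + 1 metadata bits (tag + valid) per block.
    """
    t = m_bits - (capacity.bit_length() - 1)   # t = lg2(M) - lg2(C)
    meta_bits = t + 1                          # tag bits plus one valid bit
    overhead = meta_bits / (8 * block)         # metadata bits per data bit
    total_kb = capacity * (1 + overhead) / 1024
    return t, overhead, total_kb

for block in (4, 16):
    t, ovh, kb = tag_overhead(32, 16 * 1024, block)
    print(f"block={block:2d}B  t={t}  overhead={ovh:.1%}  SRAM={kb:.1f}KB")
```

With 4-byte blocks this gives 19 metadata bits per 32 data bits (about 60%), and with 16-byte blocks 19 bits per 128 data bits (about 15%), in line with the numbers quoted on the slide.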
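6 Direct Mapped Cache (final)
[Figure: the lg2(M)-bit address is split into tag (t bits), idx (lg2(C/B) bits), bo (lg2(B/G) bits) and g. idx selects one entry in the Tag Bank (C/B by t bits, plus valid) and one block in the Data Bank (C/B by B bytes). The stored tag is compared (=) with the address tag to produce "hit?"; bo selects G bytes of data out of the B-byte block.]
let t = lg2(M) - lg2(C)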
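The lookup path in this organization can be sketched in software. The class below is a hypothetical model, not the hardware on the slide: it tracks only tags and valid bits (no data), merges the bo and g fields into a single byte offset within the block, and always allocates on a miss.

```python
class DirectMappedCache:
    def __init__(self, capacity, block):
        self.block = block
        self.num_blocks = capacity // block          # C/B block frames
        self.valid = [False] * self.num_blocks
        self.tags = [0] * self.num_blocks

    def split(self, addr):
        """Split a byte address into (tag, idx, byte offset within the block)."""
        offset = addr % self.block
        block_addr = addr // self.block
        idx = block_addr % self.num_blocks
        tag = block_addr // self.num_blocks
        return tag, idx, offset

    def access(self, addr):
        """Return True on a hit; on a miss, install the block and return False."""
        tag, idx, _ = self.split(addr)
        if self.valid[idx] and self.tags[idx] == tag:
            return True
        self.valid[idx] = True
        self.tags[idx] = tag
        return False

cache = DirectMappedCache(capacity=16 * 1024, block=16)
print(cache.split(0x12345678))
print([cache.access(a) for a in (0x1000, 0x1004, 0x5000, 0x1000)])
```

In the access sequence, 0x1004 hits because it falls in the block just loaded for 0x1000, while 0x5000 has the same idx but a different tag, so it displaces that block and the final access to 0x1000 misses again.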
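7 Is this okay?
[Figure: the same organization as the final direct-mapped cache, but with the address fields reordered as bo, idx, tag, g. idx (lg2(C/B) bits) indexes the Tag Bank (C/B by t bits, plus valid) and the Data Bank (C/B by B bytes); bo (lg2(B/G) bits) selects G bytes of data out of the B-byte block; the tag compare (=) produces "hit?".]
let t = lg2(M) - lg2(C)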
8 Is this okay?
[Figure: address fields tag, idx, bo, g. The Tag Bank is still C/B by t bits (plus valid), but the Data Bank is 2^k * C/B lines by B/2^k bytes, indexed by lg2(C/B)+k bits; the remaining lg2(B/G)-k offset bits select G bytes of data out of the B/2^k bytes read; the tag compare (=) produces "hit?".]
let t = lg2(M) - lg2(C)
9 Direct Mapped Cache
C bytes of storage managed as C/B cache blocks
- a given block address directly maps to exactly one choice of cache block (by the block index field)
Block addresses with the same block index field map to the same cache block
- of 2^t such addresses, hold only one at a time
- even if C > working set size, conflict is possible (the working set is not one contiguous region)
- the probability that 2 random addresses conflict is 1/(C/B); the likelihood of conflict decreases with an increasing number of blocks
[Plot: hit rate vs. C, with the working set size (W) marked]
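The 1/(C/B) figure can be checked by brute force. This sketch assumes the two block addresses are drawn independently and uniformly from a range whose size is a multiple of the number of block frames; the function name and its parameters are made up for illustration.

```python
from fractions import Fraction
from itertools import product

def conflict_probability(num_frames, num_block_addrs):
    """Exact probability that two independently chosen block addresses share a frame."""
    pairs = list(product(range(num_block_addrs), repeat=2))
    same = sum(1 for x, y in pairs if x % num_frames == y % num_frames)
    return Fraction(same, len(pairs))

# C/B = 8 block frames, 64 possible block addresses
print(conflict_probability(num_frames=8, num_block_addrs=64))   # 1/8
```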
10 C, B and m_i
Increasing B has a prefetching benefit
- pay the miss penalty only once per cache block
- works especially well in instruction caches
Effective up to the limit of spatial locality
Increasing B too much
- wastes capacity
- increases the probability of conflict
[Plot: hit rate vs. B]
11 B and T_(i+1)
Loading a large cache block increases T_(i+1)
Solution 1: critical-word-first reload
- L_(i+1) returns the requested word first, then rotates around the complete block
- supply the requested word to the pipeline ASAP
Solution 2: sub-blocking (at very large C and B)
- valid bit per sub-block, but still a common tag
- reload only the requested sub-block on demand
- reduces T_(i+1); reduces BW at L_(i+1)
- but loses prefetching; wastes capacity
[Figure: one tag shared by a row of sub-blocks, each with its own valid bit: tag | v s-block 0 | v s-block 1 | v s-block ...]
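The "rotate around the complete block" order in Solution 1 amounts to a wrap-around sequence starting at the requested word. A minimal sketch, with a hypothetical helper name and word indices counted within the block:

```python
def critical_word_first(words_per_block, requested_word):
    """Order in which the words of a block are returned, starting at the requested one."""
    return [(requested_word + i) % words_per_block for i in range(words_per_block)]

# A 4-word block where the pipeline is waiting on word 2
print(critical_word_first(4, 2))   # [2, 3, 0, 1]
```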
12 Now for the general case
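13 Set Associative Cache
[Figure: address fields tag, idx, bo, g. The cache consists of a banks, each a C/a-byte direct-mapped cache with a Tag array (C/a/B by t bits, plus valid) and a Data array (C/a/B by B bytes). idx indexes all banks in parallel; each bank compares (=) its stored tag with the address tag; the per-bank results produce "hit?" and steer some kind of mux that selects the data.]
t = lg2(M) - lg2(C/a)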
14 a-way Set Associative Cache
C bytes of storage divided into a direct-mapped banks (aka "ways")
- each way has (C/a)/B cache blocks
- a given block address maps to exactly one choice per way; the a choices constitute the "set"
- direct mapped is the special case a = 1
- overhead: a comparators and an a-to-1 multiplexer
Block addresses with the same index map to the same set
- 2^t such addresses; hold a different ones at a time
- if C > working set size: higher degree of associativity, fewer conflicts
What if C < working set size?
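The effect of associativity on conflicts can be seen in a small simulation. The class below is an illustrative sketch rather than the slide's hardware: it keeps only tags, and it assumes exact LRU replacement within each set (one of several policies a real design might use).

```python
from collections import OrderedDict

class SetAssociativeCache:
    def __init__(self, capacity, block, ways):
        self.block = block
        self.ways = ways
        self.num_sets = capacity // ways // block     # (C/a)/B sets
        # One OrderedDict per set: keys are tags, ordered from LRU to MRU.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, addr):
        """Return True on a hit; on a miss, install the block (evicting LRU if full)."""
        block_addr = addr // self.block
        idx = block_addr % self.num_sets
        tag = block_addr // self.num_sets
        lines = self.sets[idx]
        if tag in lines:
            lines.move_to_end(tag)         # mark as most recently used
            return True
        if len(lines) == self.ways:
            lines.popitem(last=False)      # evict the least recently used tag
        lines[tag] = None
        return False

def hits(cache, trace):
    return sum(cache.access(a) for a in trace)

trace = [0, 2, 0, 2, 0, 2]   # block addresses, with B = 1
print(hits(SetAssociativeCache(capacity=2, block=1, ways=1), trace))   # direct mapped
print(hits(SetAssociativeCache(capacity=2, block=1, ways=2), trace))   # 2-way
```

Both caches hold two blocks, but in the direct-mapped case addresses 0 and 2 share an index and keep evicting each other (0 hits), whereas the 2-way cache holds both at once (4 hits after the two initial misses).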
15 Replacement Policy
A new block displaces an existing block from the set
- pick the one that is least recently used (LRU); exact LRU is expensive for a > 2
- pick any one except the most recently used
- pick the most recently used one
- pick one based on some part of the address bits
- pick the one used again furthest in the future
- pick a (pseudo) random one
No real best choice; second-order impact only
- if actively using fewer than a blocks in a set, any sensible replacement policy will quickly converge
- if actively using more than a blocks in a set, no replacement policy can help you
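The two regimes in the last bullets can be illustrated on a single set. This is a toy sketch with made-up traces and a hypothetical `count_hits` helper; it compares only two of the listed policies (evict least recently used vs. evict most recently used).

```python
def count_hits(trace, ways, policy):
    """Simulate one set with `ways` blocks; policy is 'lru' or 'mru'."""
    resident = []          # ordered from least to most recently used
    hits = 0
    for block in trace:
        if block in resident:
            hits += 1
            resident.remove(block)
        elif len(resident) == ways:
            # Evict from the front (LRU) or the back (MRU) of the recency list.
            resident.pop(0 if policy == "lru" else -1)
        resident.append(block)
    return hits

cyclic = [0, 1, 2] * 4        # 3 active blocks competing for a = 2 ways
small = [0, 1] * 6            # only 2 active blocks
for policy in ("lru", "mru"):
    print(policy, count_hits(cyclic, 2, policy), count_hits(small, 2, policy))
```

When the active blocks fit in the set, both policies settle into the same behavior (10 hits out of 12). When three blocks cycle through two ways, LRU misses on every reference and MRU still misses on more than half of them, so neither policy makes the set behave as if it were large enough.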
16 Pseudo Associative Cache
Associativity is a placement policy
- it says a block address could be placed in one of a different blocks
- it doesn't say the ways are parallel look-up banks
Pseudo a-way associativity:
- given a direct-mapped array with C/B blocks
- logically partition it into C/B/a sets
- given an address A, index into a set and sequentially search its ways
Optimization: record the most recently used way (MRU) to check first
- e.g., used by MIPS R10K L2
[Figure: one array whose entries are labeled by set and way: set0 way0, set0 way1, ...; set1 way0, set1 way1, ...]
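The sequential search with an MRU-first optimization might look like the following for one set. This is a sketch under assumptions: the class name is hypothetical, and on a miss it simply refills the last way probed, which is an arbitrary choice made for brevity and not something the slide specifies.

```python
class PseudoAssociativeSet:
    """One set of a pseudo a-way cache: ways are probed one at a time, MRU way first."""

    def __init__(self, ways):
        self.tags = [None] * ways
        self.mru = 0

    def lookup(self, tag):
        """Return (hit, number of sequential probes)."""
        order = [self.mru] + [w for w in range(len(self.tags)) if w != self.mru]
        for probes, way in enumerate(order, start=1):
            if self.tags[way] == tag:
                self.mru = way
                return True, probes
        # Miss: fill the last way probed (an arbitrary choice for this sketch).
        victim = order[-1]
        self.tags[victim] = tag
        self.mru = victim
        return False, len(order)

s = PseudoAssociativeSet(ways=2)
print([s.lookup(t) for t in ("A", "B", "B", "A", "A")])
```

Repeated references to the same tag hit on the first probe, while switching to the other resident tag costs a second probe once and then becomes a first-probe hit.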
17 Skewed Associative Cache
[Figure: the same structure as the set associative cache (a banks, each a C/a-byte direct-mapped cache with a Tag array of C/a/B by t bits plus valid and a Data array of C/a/B by B bytes, with a per-bank tag compare (=), "hit?" and data outputs), except that each way is indexed through a different hash of the address: hash 0 ... hash a-1.]
t = lg2(M) - lg2(C/a)
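18 Fully Associative Cache: a = C/B
[Figure: address fields tag (t bits), bo, g; there is no index field. Each of the C/B blocks has its own entry (1 by t bits of tag, a valid bit v, and 1 by B bytes of data) and its own comparator (=). All comparators check the address tag in parallel to produce "hit?" and select the data.]
let t = lg2(M) - lg2(B)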
19 Fully Associative Cache: a = C/B
A "content addressable" memory
- no index bits are used in lookup
- present a tag to find a block with a matching tag, or else miss
Any block address can go into any of the C/B cache blocks
- if C > working set size, no conflicts
Requires 1 comparator per cache block, a huge multiplexer, and many long wires
- expensive/difficult for more than a few tens of blocks at L1 speed
- few reasons for very large fully associative caches
[Plot: hit rate vs. a; annotations "?" and "~5"]
20 3 C's of Cache Misses
21 Compulsory Miss
The first reference to a block address always misses
Dominates when locality is poor
- for example, in a "streaming" data access pattern where many addresses are visited, but each is used only once
Main design factor: B and prefetching
[Plot: hit rate vs. B]
22 Capacity Miss
The cache is too small to hold everything needed
Defined as the misses that would occur in a fully associative cache of the same capacity using optimum (Belady) replacement
Dominates when C < W
- for example, the L1 cache is usually not big enough due to the cycle-time tradeoff
Main design factor: C
[Plot: hit rate (up to 100%) vs. C, with the working set size (W) marked]
23 Conflict Miss
A miss to a previously visited block address that was displaced due to conflict under direct-mapped or set-associative allocation
Defined as a miss that is neither compulsory nor capacity
Dominates when C ≈ W or when C/B is small
Main design factor: a
[Plot: hit rate vs. a; annotations "?" and "~5"]
24 3 C worksheet: a=1, B=1, C=2
addr  set#  which C?    set[2]            F.A. + Belady
0     0     compulsory  [ , ] -> [0, ]    {} -> {0}
25 3 C worksheet: a=1, B=1, C=2
addr  set#  which C?    set[2]            F.A. + Belady
0     0     compulsory  [ , ] -> [0, ]    {} -> {0}
2     0     compulsory  [0, ] -> [2, ]    {0} -> {0,2}
0     0     conflict    [2, ] -> [0, ]    {0,2} hit
2     0     conflict    [0, ] -> [2, ]    {0,2} hit
1     1     compulsory  [2, ] -> [2,1]    {0,2} -> {0,1}
0     0     conflict    [2,1] -> [0,1]    {0,1} hit
2     0     capacity    [0,1] -> [2,1]    {0,1} -> {0,2}
0     0     conflict    [2,1] -> [0,1]    {0,2} hit
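The worksheet's classification can be mechanized. The sketch below is one possible reading of the definitions on the preceding slides: a miss is compulsory on the first reference to a block, capacity if a fully associative cache of the same size with Belady replacement would also miss, and conflict otherwise. The trace 0, 2, 0, 2, 1, 0, 2, 0 is the one implied by the state transitions in the worksheet; the function name is hypothetical, and block addresses are used directly (B = 1).

```python
def classify_misses(trace, num_blocks):
    """Label each reference to a direct-mapped cache (B = 1) as hit/compulsory/capacity/conflict."""
    direct = [None] * num_blocks      # direct-mapped cache under study
    ideal = set()                     # fully associative cache with Belady replacement
    seen = set()                      # every block address referenced so far
    labels = []
    for i, addr in enumerate(trace):
        # Reference model: fully associative + Belady, same capacity.
        ideal_hit = addr in ideal
        if not ideal_hit:
            if len(ideal) == num_blocks:
                future = trace[i + 1:]
                # Evict the block whose next use is furthest away (or never).
                victim = max(ideal, key=lambda b: future.index(b) if b in future else len(future))
                ideal.remove(victim)
            ideal.add(addr)
        # Cache under study.
        idx = addr % num_blocks
        if direct[idx] == addr:
            labels.append("hit")
        elif addr not in seen:
            labels.append("compulsory")
        elif not ideal_hit:
            labels.append("capacity")
        else:
            labels.append("conflict")
        direct[idx] = addr
        seen.add(addr)
    return labels

print(classify_misses([0, 2, 0, 2, 1, 0, 2, 0], num_blocks=2))
```

For this trace the labels come out as compulsory, compulsory, conflict, conflict, compulsory, conflict, capacity, conflict, matching the "which C?" column of the worksheet.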
26 Recap: Basic Cache Parameters
ISA
- M = 2^m: size of address space in bytes; sample values: 2^32, 2^64
- G = 2^g: cache access granularity in bytes; sample values: 4, 8
Implementation
- C: capacity of cache in bytes; sample values: 16 KByte (L1), 1 MByte (L2)
- B = 2^b: block size in bytes; sample values: 16 (L1), >64 (L2)
- a: associativity of the cache; sample values: 1, 2, 4, 5(?), ..., C/B
C/a should be a power of 2
27 Recap: Address Fields
[Figure: the lg2(M)-bit address divided into tag, index and B.O. (block offset) fields]
28 Recap: M=2^32, a=2, C=1K, B=4, G=2
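29 M=2^32, a=2, C=1K, B=4, G=2: basic solution
Address fields: tag = PA[31:9] (23 bits), idx = PA[8:2] (7 bits), b.o. = PA[1]; PA[0] is the remaining low-order bit.
[Figure: idx indexes the SRAMs of both ways in parallel: tag0 and tag1 (128 lines by 23 bits each), valid bits v0 and v1 (1 bit per line), and data 0 and data 1 (128 lines by 4 bytes each). Each stored tag is compared (=) with the 23-bit address tag to produce hit0 and hit1, from which HIT is derived. In each way, b.o. drives a 2-to-1 mux that picks 2 bytes out of the 4-byte block; a final 2-to-1 mux controlled by hit0/hit1 selects the 16-bit DATA output.]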
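The field boundaries in this example follow directly from the cache parameters. The helper below is a minimal sketch (its name and return format are invented for illustration) that assumes every size is a power of 2 and that the fields are packed, from the low-order end, as g, bo, idx, tag.

```python
def address_fields(m_bits, capacity, block, granularity, ways):
    """Return {field: (msb, lsb)} bit ranges of the address; None for zero-width fields."""
    def lg(n):
        return n.bit_length() - 1         # exact for powers of 2

    g = lg(granularity)                   # byte within one G-byte access
    bo = lg(block // granularity)         # which G-byte word within the block
    idx = lg(capacity // ways // block)   # which set
    tag = m_bits - g - bo - idx
    fields, lsb = {}, 0
    for name, width in (("g", g), ("bo", bo), ("idx", idx), ("tag", tag)):
        fields[name] = (lsb + width - 1, lsb) if width else None
        lsb += width
    return fields

# M = 2^32, C = 1K, B = 4, G = 2, a = 2
print(address_fields(32, 1024, 4, 2, 2))
```

For these parameters the result is tag = [31:9], idx = [8:2], bo = [1:1] and g = [0:0], which agrees with the PA bit ranges on the slide.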
30 Same cache parameters but tune for narrower data SRAMs
Address fields: tag = PA[31:9], idx = PA[8:2] (7 bits), b.o. = PA[1], PA[0].
[Figure: the tag side is unchanged: tag0 and tag1 (128 lines by 23 bits), v0 and v1 (1 bit per line), indexed by the 7-bit idx, with 23-bit tag compares (=) producing hit0, hit1 and HIT. The two data SRAMs now have 2-byte lines and are indexed by the 8-bit {idx,bo}; a mux controlled by hit0/hit1 selects the 16-bit DATA output.]
Can you play the same trick on the tag SRAMs?
31 Same cache parameters but tune for fatter data SRAMs
Address fields: tag = PA[31:9], idx = PA[8:2] (7 bits), b.o. = PA[1], PA[0].
[Figure: the tag side is unchanged: tag0 and tag1 (128 lines by 23 bits), v0 and v1 (1 bit per line), indexed by the 7-bit idx, with 23-bit tag compares (=) producing hit0, hit1 and HIT. The data SRAMs data 0 and data 1 are now 64 lines by 8 bytes each, indexed by the 6-bit PA[8:3]; in each way, {PA[2],b.o.} drives a 4-to-1 mux that picks 2 bytes out of the 8 bytes read; a final 2-to-1 mux controlled by hit0/hit1 selects the 16-bit DATA output.]
Can you play the same trick on the tag SRAMs?
32 Same cache parameters but each block frame is interleaved over 2 SRAM banks
Address fields: tag = PA[31:9], idx = PA[8:2] (7 bits), b.o. = PA[1], PA[0].
[Figure: the tag side is unchanged: tag0 and tag1 (128 lines by 23 bits), v0 and v1 (1 bit per line), indexed by the 7-bit idx, with 23-bit tag compares (=) producing h0, h1 and HIT. Two data SRAM banks with 4-byte lines are indexed by the 7-bit idx; their outputs pass through 2-to-1 muxes steered by terms that combine the way hits (h0, h1) with bo, producing the 16-bit DATA output.]
More informationCS 61C: Great Ideas in Computer Architecture Direct- Mapped Caches. Increasing distance from processor, decreasing speed.
CS 6C: Great Ideas in Computer Architecture Direct- Mapped s 9/27/2 Instructors: Krste Asanovic, Randy H Katz hdp://insteecsberkeleyedu/~cs6c/fa2 Fall 2 - - Lecture #4 New- School Machine Structures (It
More information18-447: Computer Architecture Lecture 17: Memory Hierarchy and Caches. Prof. Onur Mutlu Carnegie Mellon University Spring 2012, 3/26/2012
18-447: Computer Architecture Lecture 17: Memory Hierarchy and Caches Prof. Onur Mutlu Carnegie Mellon University Spring 2012, 3/26/2012 Reminder: Homeworks Homework 5 Due April 2 Topics: Out-of-order
More informationAgenda. Recap: Components of a Computer. Agenda. Recap: Cache Performance and Average Memory Access Time (AMAT) Recap: Typical Memory Hierarchy
// CS 6C: Great Ideas in Computer Architecture (Machine Structures) Set- Associa+ve Caches Instructors: Randy H Katz David A PaFerson hfp://insteecsberkeleyedu/~cs6c/fa Cache Recap Recap: Components of
More informationComputer Architecture Lecture 19: Memory Hierarchy and Caches. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 3/19/2014
18-447 Computer Architecture Lecture 19: Memory Hierarchy and Caches Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 3/19/2014 Extra Credit Recognition for Lab 3 1. John Greth (13157 ns) 2. Kevin
More informationMemories. CPE480/CS480/EE480, Spring Hank Dietz.
Memories CPE480/CS480/EE480, Spring 2018 Hank Dietz http://aggregate.org/ee480 What we want, what we have What we want: Unlimited memory space Fast, constant, access time (UMA: Uniform Memory Access) What
More informationLecture 33 Caches III What to do on a write hit? Block Size Tradeoff (1/3) Benefits of Larger Block Size
CS61C L33 Caches III (1) inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C Machine Structures Lecture 33 Caches III 27-4-11 Lecturer SOE Dan Garcia www.cs.berkeley.edu/~ddgarcia Future of movies is 3D? Dreamworks
More informationCACHE OPTIMIZATION. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah
CACHE OPTIMIZATION Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 3 will be released on Oct. 31 st This
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!
More informationMemory Hierarchy. 2/18/2016 CS 152 Sec6on 5 Colin Schmidt
Memory Hierarchy 2/18/2016 CS 152 Sec6on 5 Colin Schmidt Agenda Review Memory Hierarchy Lab 2 Ques6ons Return Quiz 1 Latencies Comparison Numbers L1 Cache 0.5 ns L2 Cache 7 ns 14x L1 cache Main Memory
More informationECE468 Computer Organization and Architecture. Virtual Memory
ECE468 Computer Organization and Architecture Virtual Memory ECE468 vm.1 Review: The Principle of Locality Probability of reference 0 Address Space 2 The Principle of Locality: Program access a relatively
More informationMemory Technology. Chapter 5. Principle of Locality. Chapter 5 Large and Fast: Exploiting Memory Hierarchy 1
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface Chapter 5 Large and Fast: Exploiting Memory Hierarchy 5 th Edition Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic
More informationCaches and Memory. Anne Bracy CS 3410 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.10, 5.13, 5.15, 5.17
Caches and emory Anne Bracy CS 34 Computer Science Cornell University Slides by Anne Bracy with 34 slides by Professors Weatherspoon, Bala, ckee, and Sirer. See P&H Chapter: 5.-5.4, 5.8, 5., 5.3, 5.5,
More informationLecture 13: Cache Hierarchies. Today: cache access basics and innovations (Sections )
Lecture 13: Cache Hierarchies Today: cache access basics and innovations (Sections 5.1-5.2) 1 The Cache Hierarchy Core L1 L2 L3 Off-chip memory 2 Accessing the Cache Byte address 101000 Offset 8-byte words
More informationECE4680 Computer Organization and Architecture. Virtual Memory
ECE468 Computer Organization and Architecture Virtual Memory If I can see it and I can touch it, it s real. If I can t see it but I can touch it, it s invisible. If I can see it but I can t touch it, it
More informationdata block 0, word 0 block 0, word 1 block 1, word 0 block 1, word 1 block 2, word 0 block 2, word 1 block 3, word 0 block 3, word 1 Word index cache
Taking advantage of spatial locality Use block size larger than one word Example: two words Block index tag () () Alternate representations Word index tag block, word block, word block, word block, word
More informationCS152 Computer Architecture and Engineering Lecture 17: Cache System
CS152 Computer Architecture and Engineering Lecture 17 System March 17, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http//http.cs.berkeley.edu/~patterson
More informationTopic 18: Virtual Memory
Topic 18: Virtual Memory COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Virtual Memory Any time you see virtual, think using a level of indirection
More informationCHAPTER 4 MEMORY HIERARCHIES TYPICAL MEMORY HIERARCHY TYPICAL MEMORY HIERARCHY: THE PYRAMID CACHE PERFORMANCE MEMORY HIERARCHIES CACHE DESIGN
CHAPTER 4 TYPICAL MEMORY HIERARCHY MEMORY HIERARCHIES MEMORY HIERARCHIES CACHE DESIGN TECHNIQUES TO IMPROVE CACHE PERFORMANCE VIRTUAL MEMORY SUPPORT PRINCIPLE OF LOCALITY: A PROGRAM ACCESSES A RELATIVELY
More informationCaches Concepts Review
Caches Concepts Review What is a block address? Why not bring just what is needed by the processor? What is a set associative cache? Write-through? Write-back? Then we ll see: Block allocation policy on
More information