Caches (Writing) Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.2 3, 5.5

Size: px
Start display at page:

Download "Caches (Writing) Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.2 3, 5.5"

Transcription

1 s (Writing) Hakim Weatherspoon CS, Spring Computer Science Cornell University P & H Chapter.,.

2 Administrivia Lab due next onday, April th HW due next onday, April th

3 Goals for Today Parameter Tradeoffs Conscious Programming Writing to the Write through vs Write back

4 Design Tradeoffs

5 Design Need to determine parameters: size Block size (aka line size) Number of ways of set associativity (, N, ) Eviction policy Number of levels of caching, parameters for each Separate I cache from D cache, or Unified cache Prefetching policies / instructions Write policy

6 A Real Example > dmidecode t cache Information Configuration: Enabled, Not Socketed, Level Operational ode: Write Back Installed Size: KB Error Correction Type: None Information Configuration: Enabled, Not Socketed, Level Operational ode: Varies With emory Address Installed Size: KB Error Correction Type: Single bit ECC > cd /sys/devices/system/cpu/cpu; grep cache/*/* cache/index/level: cache/index/type:data cache/index/ways_of_associativity: cache/index/number_of_sets: cache/index/coherency_line_size: cache/index/size:k cache/index/level: cache/index/type:instruction cache/index/ways_of_associativity: cache/index/number_of_sets: cache/index/coherency_line_size: cache/index/size:k cache/index/level: cache/index/type:unified cache/index/shared_cpu_list: cache/index/ways_of_associativity: cache/index/number_of_sets: cache/index/coherency_line_size: cache/index/size:k Dual core.ghz Intel (purchased in )

7 A Real Example Dual K L Instruction caches way set associative sets byte line size Dual K L Data caches Same as above Single L Unified cache way set associative (!!!) sets byte line size GB ain memory TB Disk Dual core.ghz Intel (purchased in )

8 Basic Organization Q: How to decide block size? A: Try it and see But: depends on cache size, workload, associativity, Experimental approach!

9 Experimental Results

10 Tradeoffs For a given total cache size, larger block sizes mean. fewer lines so fewer tags (and smaller tags for associative caches) so less overhead and fewer cold misses (within block prefetching ) But also fewer blocks available (for scattered accesses!) so more conflicts and larger miss penalty (time to fetch block)

11 Conscious Programming

12 Conscious Programming // H =, W = int A[H][W]; for(x=; x < W; x++) for(y=; y < H; y++) sum += A[y][x]; Every access is a cache miss! (unless entire matrix can fit in cache)

13 Conscious Programming // H =, W = int A[H][W]; for(y=; y < H; y++) for(x=; x < W; x++) sum += A[y][x]; Block size = % hit rate Block size =.% hit rate Block size =.% hit rate And you can easily prefetch to warm the cache.

14 Writing with s

15 Eviction Which cache line should be evicted from the cache to make room for a new line? Direct mapped no choice, must evict line selected by index Associative caches random: select one of the lines at random round robin: similar to random FIFO: replace oldest line LRU: replace line that has not been used in the longest time

16 d Write Policies Q: How to write data? CPU addr data SRA emory DRA If data is already in the cache No Write writes invalidate the cache and go directly to memory Write Through writes go to main memory and cache Write Back CPU writes only to cache cache writes to main memory later (when block is evicted)

17 What about Stores? Where should you write the result of a store? If that memory location is in the cache? Send it to the cache Should we also send it to memory right away? (write through policy) Wait until we kick the block out (write back policy) If it is not in the cache? Allocate the line (put it in the cache)? (write allocate policy) Write it directly to memory without allocation? (no write allocate policy)

18 Write Allocation Policies Q: How to write data? CPU addr data SRA emory DRA If data is not in the cache Write Allocate allocate a cache line for new data (and maybe write through) No Write Allocate ignore cache, just go to main memory

19 Handling Stores (Write-Through) Using byte addresses in this example! Addr Bus = bits Assume write allocate policy LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ Fully Associative cache lines word block bit tag field bit block offset field Vtag data isses: Hits: emory

20 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ V tag data isses: Hits:

21 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

22 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

23 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

24 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

25 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

26 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

27 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

28 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

29 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

30 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

31 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

32 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] $ $ $ $ lru V tag data isses: Hits:

33 Write-Through (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] H $ $ $ $ lru V tag data isses: Hits:

34 How any emory References? Write through performance Each miss (read or write) reads a block from mem misses memreads Each store writes an item to mem mem writes Evictions don t need to write to mem no need for dirty bit

35 Write-Through (REF,) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ H H H lru V tag data isses: Hits:

36 Write-Through (REF,) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ H H H H H lru V tag data isses: Hits:

37 Write-Through vs. Write-Back Can we also design the cache NOT to write all stores immediately to memory? Keep the most current copy in cache, and update memory when that data is evicted (write back policy) Do we need to write back all evicted lines? No, only blocks that have been stored into (written)

38 Write-Back eta-data V D Tag Byte Byte Byte N V = means the line has valid data D = means the bytes are newer than main memory When allocating line: Set V =, D =, fill in Tag and Data When writing line: Set D = When evicting line: If D = : just set V = If D = : write back Data, then set D =, V =

39 Handling Stores (Write-Back) Using byte addresses in this example! Addr Bus = bits Assume write allocate policy LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ Fully Associative cache lines word block bit tag field bit block offset field Vd tag data isses: Hits: emory

40 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ V d tag data isses: Hits:

41 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

42 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

43 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

44 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

45 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

46 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

47 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

48 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

49 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

50 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

51 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

52 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

53 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] LB $ [ ] SB $ [ ] H SB $ [ ] $ $ $ $ lru V d tag data isses: Hits:

54 Write-Back (REF ) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ H H H lru V d tag data isses: Hits:

55 How any emory References? Write back performance Each miss (read or write) reads a block from mem misses memreads Some evictions write a block to mem dirty eviction mem writes (+ dirty evictions later + mem writes)

56 How many memory references? Each miss reads a block Two words in this cache Each evicted dirty cache line writes a block Total reads: six words Total writes: / words (after final eviction)

57 Write-Back (REF,) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ H H H lru V d tag data isses: Hits:

58 Write-Back (REF,) emory LB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] LB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] SB $ [ ] $ $ $ $ H H H H H lru V d tag data isses: Hits:

59 How any emory References? Write back performance Each miss (read or write) reads a block from mem misses memreads Some evictions write a block to mem dirty eviction mem writes (+ dirty evictions later + mem writes) By comparison write through was Reads: eight words Writes: // etc words Write through or Write back?

60 Write-through vs. Write-back Write through is slower But cleaner (memory always consistent) Write back is faster But complicated when multi cores sharing memory

61 Performance: An Example Performance: Write back versus Write through Assume: large associative cache, byte lines for (i=; i<n; i++) A[] += A[i]; for (i=; i<n; i++) B[i] = A[i]

62 Performance Tradeoffs Q: Hit time: write through vs. write back? A: Write through slower on writes. Q: iss penalty: write through vs. write back? A: Write back slower on evictions.

63 Write Buffering Q: Writes to main memory are slow! A: Use a write back buffer A small queue holding dirty lines Add to end upon eviction Remove from front upon completion Q: What does it help? A: short bursts of writes (but not sustained writes) A: fast eviction reduces miss penalty

64 Write-through vs. Write-back Write through is slower But simpler (memory always consistent) Write back is almost always faster write back buffer hides large eviction cost But what about multiple cores with separate caches but sharing memory? Write back requires a cache coherency protocol Inconsistent views of memory Need to snoop in each other s caches Extremely complex protocols, very hard to get right

65 -coherency Q: ultiple readers and writers? A: Potentially inconsistent views of memory CPU CPU CPU CPU L L L L L L L L L L net em disk coherency protocol ay need to snoop on other CPU s cache activity Invalidate cache line when other CPU writes Flush write back caches before other CPU reads Or the reverse: Before writing/reading Extremely complex protocols, very hard to get right

66 Caching assumptions Summary small working set: / rule can predict future: spatial & temporal locality Benefits (big & fast) built from (big & slow) + (small & fast) Tradeoffs: associativity, line size, hit cost, miss penalty, hit rate

67 Summary emory performance matters! often more than CPU performance because it is the bottleneck, and not improving much because most programs move a LOT of data Design space is huge Gambling against program behavior Cuts across all layers: users programs os hardware ulti core / ulti is complicated Inconsistent views of memory Extremely complex protocols, very hard to get right

Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.10, 5.15; Also, 5.13 & 5.

Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.10, 5.15; Also, 5.13 & 5. Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University See P&H Chapter: 5.1-5.4, 5.8, 5.10, 5.15; Also, 5.13 & 5.17 Writing to caches: policies, performance Cache tradeoffs and

More information

Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.2 (writes), 5.3, 5.5

Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.2 (writes), 5.3, 5.5 Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University See P&H 5.2 (writes), 5.3, 5.5 Announcements! HW3 available due next Tuesday HW3 has been updated. Use updated version.

More information

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Caches (Writing) P & H Chapter 5.2 3, 5.5 Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Big Picture: Memory Code Stored in Memory (also, data and stack) memory PC +4 new pc

More information

Caches & Memory. CS 3410 Computer System Organization & Programming

Caches & Memory. CS 3410 Computer System Organization & Programming Caches & Memory CS 34 Computer System Organization & Programming These slides are the product of many rounds of teaching CS 34 by Professors Weatherspoon, Bala, Bracy, and Sirer. Programs C Code int main

More information

Caches and Memory. Anne Bracy CS 3410 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.10, 5.13, 5.15, 5.17

Caches and Memory. Anne Bracy CS 3410 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.10, 5.13, 5.15, 5.17 Caches and emory Anne Bracy CS 34 Computer Science Cornell University Slides by Anne Bracy with 34 slides by Professors Weatherspoon, Bala, ckee, and Sirer. See P&H Chapter: 5.-5.4, 5.8, 5., 5.3, 5.5,

More information

Virtual Memory 1! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. P & H Chapter 5.4 (up to TLBs)

Virtual Memory 1! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. P & H Chapter 5.4 (up to TLBs) Virtual Memory 1! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University P & H Chapter 5.4 (up to TLBs) Announcements! HW3 available due today Tuesday HW3 has been updated. Use updated

More information

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Caches (Writing) P & H Chapter 5.2 3, 5.5. Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Caches (Writing) P & H Chapter 5.2 3, 5.5 Hakim Weatherspoon CS 34, Spring 23 Computer Science Cornell University Welcome back from Spring Break! Welcome back from Spring Break! Big Picture: Memory Code

More information

Caches. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)

Caches. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes) Caches akim Weatherspoon CS 341, Spring 212 Computer Science Cornell University See P& 5.1, 5.2 (except writes) ctrl ctrl ctrl inst imm B A B D D Big Picture: emory emory: big & slow vs Caches: small &

More information

Caches. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)

Caches. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes) s akim Weatherspoon CS, Spring Computer Science Cornell University See P&.,. (except writes) Big Picture: : big & slow vs s: small & fast compute jump/branch targets memory PC + new pc Instruction Fetch

More information

Caches. See P&H 5.1, 5.2 (except writes) Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Caches. See P&H 5.1, 5.2 (except writes) Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University s See P&.,. (except writes) akim Weatherspoon CS, Spring Computer Science Cornell University What will you do over Spring Break? A) Relax B) ead home C) ead to a warm destination D) Stay in (frigid) Ithaca

More information

Caches. See P&H 5.1, 5.2 (except writes) Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University

Caches. See P&H 5.1, 5.2 (except writes) Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University s See P&.,. (except writes) akim Weatherspoon CS, Spring Computer Science Cornell University What will you do over Spring Break? A) Relax B) ead home C) ead to a warm destination D) Stay in (frigid) Ithaca

More information

Caches and Memory Deniz Altinbuken CS 3410, Spring 2015

Caches and Memory Deniz Altinbuken CS 3410, Spring 2015 s and emory Deniz Altinbuken CS, Spring Computer Science Cornell University See P& Chapter:.-. (except writes) Big Picture: emory Code Stored in emory (also, data and stack) compute jump/branch targets

More information

Caches. Han Wang CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)

Caches. Han Wang CS 3410, Spring 2012 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes) Caches Han Wang CS 3410, Spring 2012 Computer Science Cornell University See P&H 5.1, 5.2 (except writes) This week: Announcements PA2 Work-in-progress submission Next six weeks: Two labs and two projects

More information

Page 1. Multilevel Memories (Improving performance using a little cash )

Page 1. Multilevel Memories (Improving performance using a little cash ) Page 1 Multilevel Memories (Improving performance using a little cash ) 1 Page 2 CPU-Memory Bottleneck CPU Memory Performance of high-speed computers is usually limited by memory bandwidth & latency Latency

More information

Chapter Seven. SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors)

Chapter Seven. SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) Chapter Seven emories: Review SRA: value is stored on a pair of inverting gates very fast but takes up more space than DRA (4 to transistors) DRA: value is stored as a charge on capacitor (must be refreshed)

More information

Virtual Memory 2. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.4

Virtual Memory 2. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.4 Virtual Memory 2 Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University P & H Chapter 5.4 Administrivia Project3 available now Design Doc due next week, Monday, April 16 th Schedule

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

CS 136: Advanced Architecture. Review of Caches

CS 136: Advanced Architecture. Review of Caches 1 / 30 CS 136: Advanced Architecture Review of Caches 2 / 30 Why Caches? Introduction Basic goal: Size of cheapest memory... At speed of most expensive Locality makes it work Temporal locality: If you

More information

Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes)

Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. See P&H 5.1, 5.2 (except writes) Caches! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University See P&H 5.1, 5.2 (except writes) Announcements! HW3 available due next Tuesday Work with alone partner Be responsible

More information

CS161 Design and Architecture of Computer Systems. Cache $$$$$

CS161 Design and Architecture of Computer Systems. Cache $$$$$ CS161 Design and Architecture of Computer Systems Cache $$$$$ Memory Systems! How can we supply the CPU with enough data to keep it busy?! We will focus on memory issues,! which are frequently bottlenecks

More information

12 Cache-Organization 1

12 Cache-Organization 1 12 Cache-Organization 1 Caches Memory, 64M, 500 cycles L1 cache 64K, 1 cycles 1-5% misses L2 cache 4M, 10 cycles 10-20% misses L3 cache 16M, 20 cycles Memory, 256MB, 500 cycles 2 Improving Miss Penalty

More information

Why memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho

Why memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho Why memory hierarchy? L1 cache design Sangyeun Cho Computer Science Department Memory hierarchy Memory hierarchy goals Smaller Faster More expensive per byte CPU Regs L1 cache L2 cache SRAM SRAM To provide

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!

More information

Caching Basics. Memory Hierarchies

Caching Basics. Memory Hierarchies Caching Basics CS448 1 Memory Hierarchies Takes advantage of locality of reference principle Most programs do not access all code and data uniformly, but repeat for certain data choices spatial nearby

More information

Locality and Data Accesses video is wrong one notes when video is correct

Locality and Data Accesses video is wrong one notes when video is correct Cache Review This lesson is a review of caches. Beginning with the structure of the cache itself, including set associative and direct mapped caches. Then the lesson discusses replacement policies, specifically

More information

Chapter 6 Caches. Computer System. Alpha Chip Photo. Topics. Memory Hierarchy Locality of Reference SRAM Caches Direct Mapped Associative

Chapter 6 Caches. Computer System. Alpha Chip Photo. Topics. Memory Hierarchy Locality of Reference SRAM Caches Direct Mapped Associative Chapter 6 s Topics Memory Hierarchy Locality of Reference SRAM s Direct Mapped Associative Computer System Processor interrupt On-chip cache s s Memory-I/O bus bus Net cache Row cache Disk cache Memory

More information

Mo Money, No Problems: Caches #2...

Mo Money, No Problems: Caches #2... Mo Money, No Problems: Caches #2... 1 Reminder: Cache Terms... Cache: A small and fast memory used to increase the performance of accessing a big and slow memory Uses temporal locality: The tendency to

More information

1/19/2009. Data Locality. Exploiting Locality: Caches

1/19/2009. Data Locality. Exploiting Locality: Caches Spring 2009 Prof. Hyesoon Kim Thanks to Prof. Loh & Prof. Prvulovic Data Locality Temporal: if data item needed now, it is likely to be needed again in near future Spatial: if data item needed now, nearby

More information

Key Point. What are Cache lines

Key Point. What are Cache lines Caching 1 Key Point What are Cache lines Tags Index offset How do we find data in the cache? How do we tell if it s the right data? What decisions do we need to make in designing a cache? What are possible

More information

Lecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University

Lecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University Lecture 12 Memory Design & Caches, part 2 Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b 1 Announcements HW3 is due today PA2 is available on-line today Part 1 is due on 2/27

More information

CS356: Discussion #9 Memory Hierarchy and Caches. Marco Paolieri Illustrations from CS:APP3e textbook

CS356: Discussion #9 Memory Hierarchy and Caches. Marco Paolieri Illustrations from CS:APP3e textbook CS356: Discussion #9 Memory Hierarchy and Caches Marco Paolieri (paolieri@usc.edu) Illustrations from CS:APP3e textbook The Memory Hierarchy So far... We modeled the memory system as an abstract array

More information

Systems Programming and Computer Architecture ( ) Timothy Roscoe

Systems Programming and Computer Architecture ( ) Timothy Roscoe Systems Group Department of Computer Science ETH Zürich Systems Programming and Computer Architecture (252-0061-00) Timothy Roscoe Herbstsemester 2016 AS 2016 Caches 1 16: Caches Computer Architecture

More information

EE 4683/5683: COMPUTER ARCHITECTURE

EE 4683/5683: COMPUTER ARCHITECTURE EE 4683/5683: COMPUTER ARCHITECTURE Lecture 6A: Cache Design Avinash Kodi, kodi@ohioedu Agenda 2 Review: Memory Hierarchy Review: Cache Organization Direct-mapped Set- Associative Fully-Associative 1 Major

More information

Course Administration

Course Administration Spring 207 EE 363: Computer Organization Chapter 5: Large and Fast: Exploiting Memory Hierarchy - Avinash Kodi Department of Electrical Engineering & Computer Science Ohio University, Athens, Ohio 4570

More information

CS 61C: Great Ideas in Computer Architecture Caches Part 2

CS 61C: Great Ideas in Computer Architecture Caches Part 2 CS 61C: Great Ideas in Computer Architecture Caches Part 2 Instructors: Nicholas Weaver & Vladimir Stojanovic http://insteecsberkeleyedu/~cs61c/fa15 Software Parallel Requests Assigned to computer eg,

More information

ECE ECE4680

ECE ECE4680 ECE468. -4-7 The otivation for s System ECE468 Computer Organization and Architecture DRA Hierarchy System otivation Large memories (DRA) are slow Small memories (SRA) are fast ake the average access time

More information

Virtual Memory 3. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.4

Virtual Memory 3. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. P & H Chapter 5.4 Virtual Memory 3 Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University P & H Chapter 5.4 Project3 available now Administrivia Design Doc due next week, Monday, April 16 th Schedule

More information

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies

Donn Morrison Department of Computer Science. TDT4255 Memory hierarchies TDT4255 Lecture 10: Memory hierarchies Donn Morrison Department of Computer Science 2 Outline Chapter 5 - Memory hierarchies (5.1-5.5) Temporal and spacial locality Hits and misses Direct-mapped, set associative,

More information

Lecture 7 - Memory Hierarchy-II

Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II John Wawrzynek Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~johnw

More information

Spring 2016 :: CSE 502 Computer Architecture. Caches. Nima Honarmand

Spring 2016 :: CSE 502 Computer Architecture. Caches. Nima Honarmand Caches Nima Honarmand Motivation 10000 Performance 1000 100 10 Processor Memory 1 1985 1990 1995 2000 2005 2010 Want memory to appear: As fast as CPU As large as required by all of the running applications

More information

Slide Set 9. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng

Slide Set 9. for ENCM 369 Winter 2018 Section 01. Steve Norman, PhD, PEng Slide Set 9 for ENCM 369 Winter 2018 Section 01 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary March 2018 ENCM 369 Winter 2018 Section 01

More information

Memory Hierarchy. Slides contents from:

Memory Hierarchy. Slides contents from: Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory

More information

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy

Chapter 5A. Large and Fast: Exploiting Memory Hierarchy Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM

More information

CSE502: Computer Architecture CSE 502: Computer Architecture

CSE502: Computer Architecture CSE 502: Computer Architecture CSE 502: Computer Architecture Memory Hierarchy & Caches Motivation 10000 Performance 1000 100 10 Processor Memory 1 1985 1990 1995 2000 2005 2010 Want memory to appear: As fast as CPU As large as required

More information

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp. Cache associativity Cache and performance 12 1 CMPE110 Spring 2005 A. Di Blas 110 Spring 2005 CMPE Cache Direct-mapped cache Reads and writes Textbook Edition: 7.1 to 7.3 Second Third Edition: 7.1 to 7.3

More information

Advanced Computer Architecture

Advanced Computer Architecture ECE 563 Advanced Computer Architecture Fall 2009 Lecture 3: Memory Hierarchy Review: Caches 563 L03.1 Fall 2010 Since 1980, CPU has outpaced DRAM... Four-issue 2GHz superscalar accessing 100ns DRAM could

More information

Page 1. Memory Hierarchies (Part 2)

Page 1. Memory Hierarchies (Part 2) Memory Hierarchies (Part ) Outline of Lectures on Memory Systems Memory Hierarchies Cache Memory 3 Virtual Memory 4 The future Increasing distance from the processor in access time Review: The Memory Hierarchy

More information

Cache memories are small, fast SRAM-based memories managed automatically in hardware. Hold frequently accessed blocks of main memory

Cache memories are small, fast SRAM-based memories managed automatically in hardware. Hold frequently accessed blocks of main memory Cache Memories Cache memories are small, fast SRAM-based memories managed automatically in hardware. Hold frequently accessed blocks of main memory CPU looks first for data in caches (e.g., L1, L2, and

More information

EEC 483 Computer Organization

EEC 483 Computer Organization EEC 48 Computer Organization 5. The Basics of Cache Chansu Yu Caches: The Basic Idea A smaller set of storage locations storing a subset of information from a larger set (memory) Unlike registers or memory,

More information

CS3350B Computer Architecture

CS3350B Computer Architecture CS335B Computer Architecture Winter 25 Lecture 32: Exploiting Memory Hierarchy: How? Marc Moreno Maza wwwcsduwoca/courses/cs335b [Adapted from lectures on Computer Organization and Design, Patterson &

More information

ECE 571 Advanced Microprocessor-Based Design Lecture 10

ECE 571 Advanced Microprocessor-Based Design Lecture 10 ECE 571 Advanced Microprocessor-Based Design Lecture 10 Vince Weaver http://www.eece.maine.edu/ vweaver vincent.weaver@maine.edu 2 October 2014 Performance Concerns Caches Almost all programming can be

More information

10/16/2017. Miss Rate: ABC. Classifying Misses: 3C Model (Hill) Reducing Conflict Misses: Victim Buffer. Overlapping Misses: Lockup Free Cache

10/16/2017. Miss Rate: ABC. Classifying Misses: 3C Model (Hill) Reducing Conflict Misses: Victim Buffer. Overlapping Misses: Lockup Free Cache Classifying Misses: 3C Model (Hill) Divide cache misses into three categories Compulsory (cold): never seen this address before Would miss even in infinite cache Capacity: miss caused because cache is

More information

Memory Hierarchy. Slides contents from:

Memory Hierarchy. Slides contents from: Memory Hierarchy Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing, NPTEL Memory Performance Gap Memory

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 24: Cache Performance Analysis Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Overview Last time: Associative caches How do we

More information

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823

More information

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB Memory Technology Caches 1 Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per GB Ideal memory Average access time similar

More information

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic

More information

Agenda. EE 260: Introduction to Digital Design Memory. Naive Register File. Agenda. Memory Arrays: SRAM. Memory Arrays: Register File

Agenda. EE 260: Introduction to Digital Design Memory. Naive Register File. Agenda. Memory Arrays: SRAM. Memory Arrays: Register File EE 260: Introduction to Digital Design Technology Yao Zheng Department of Electrical Engineering University of Hawaiʻi at Mānoa 2 Technology Naive Register File Write Read clk Decoder Read Write 3 4 Arrays:

More information

Caches Concepts Review

Caches Concepts Review Caches Concepts Review What is a block address? Why not bring just what is needed by the processor? What is a set associative cache? Write-through? Write-back? Then we ll see: Block allocation policy on

More information

Lecture 11 Cache. Peng Liu.

Lecture 11 Cache. Peng Liu. Lecture 11 Cache Peng Liu liupeng@zju.edu.cn 1 Associative Cache Example 2 Associative Cache Example 3 Associativity Example Compare 4-block caches Direct mapped, 2-way set associative, fully associative

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Static RAM (SRAM) Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 0.5ns 2.5ns, $2000 $5000 per GB 5.1 Introduction Memory Technology 5ms

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!

More information

Computer Systems C S Cynthia Lee Today s materials adapted from Kevin Webb at Swarthmore College

Computer Systems C S Cynthia Lee Today s materials adapted from Kevin Webb at Swarthmore College Computer Systems C S 0 7 Cynthia Lee Today s materials adapted from Kevin Webb at Swarthmore College 2 Today s Topics TODAY S LECTURE: Caching ANNOUNCEMENTS: Assign6 & Assign7 due Friday! 6 & 7 NO late

More information

Lecture 20: Multi-Cache Designs. Spring 2018 Jason Tang

Lecture 20: Multi-Cache Designs. Spring 2018 Jason Tang Lecture 20: Multi-Cache Designs Spring 2018 Jason Tang 1 Topics Split caches Multi-level caches Multiprocessor caches 2 3 Cs of Memory Behaviors Classify all cache misses as: Compulsory Miss (also cold-start

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Lecture 15: Caches and Optimization Computer Architecture and Systems Programming ( )

Lecture 15: Caches and Optimization Computer Architecture and Systems Programming ( ) Systems Group Department of Computer Science ETH Zürich Lecture 15: Caches and Optimization Computer Architecture and Systems Programming (252-0061-00) Timothy Roscoe Herbstsemester 2012 Last time Program

More information

Introduction to OpenMP. Lecture 10: Caches

Introduction to OpenMP. Lecture 10: Caches Introduction to OpenMP Lecture 10: Caches Overview Why caches are needed How caches work Cache design and performance. The memory speed gap Moore s Law: processors speed doubles every 18 months. True for

More information

Memory Hierarchy. 2/18/2016 CS 152 Sec6on 5 Colin Schmidt

Memory Hierarchy. 2/18/2016 CS 152 Sec6on 5 Colin Schmidt Memory Hierarchy 2/18/2016 CS 152 Sec6on 5 Colin Schmidt Agenda Review Memory Hierarchy Lab 2 Ques6ons Return Quiz 1 Latencies Comparison Numbers L1 Cache 0.5 ns L2 Cache 7 ns 14x L1 cache Main Memory

More information

CS61C : Machine Structures

CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #24 Cache II 27-8-6 Scott Beamer, Instructor New Flow Based Routers CS61C L24 Cache II (1) www.anagran.com Caching Terminology When we try

More information

EEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work?

EEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work? EEC 17 Computer Architecture Fall 25 Introduction Review Review: The Hierarchy Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology

More information

Cache Memories. Topics. Next time. Generic cache memory organization Direct mapped caches Set associative caches Impact of caches on performance

Cache Memories. Topics. Next time. Generic cache memory organization Direct mapped caches Set associative caches Impact of caches on performance Cache Memories Topics Generic cache memory organization Direct mapped caches Set associative caches Impact of caches on performance Next time Dynamic memory allocation and memory bugs Fabián E. Bustamante,

More information

CS 31: Intro to Systems Caching. Kevin Webb Swarthmore College March 24, 2015

CS 31: Intro to Systems Caching. Kevin Webb Swarthmore College March 24, 2015 CS 3: Intro to Systems Caching Kevin Webb Swarthmore College March 24, 205 Reading Quiz Abstraction Goal Reality: There is no one type of memory to rule them all! Abstraction: hide the complex/undesirable

More information

ECE 2300 Digital Logic & Computer Organization. More Caches

ECE 2300 Digital Logic & Computer Organization. More Caches ECE 23 Digital Logic & Computer Organization Spring 217 More Caches 1 Prelim 2 stats High: 9 (out of 9) Mean: 7.2, Median: 73 Announcements Prelab 5(C) due tomorrow 2 Example: Direct Mapped (DM) Cache

More information

Advanced Memory Organizations

Advanced Memory Organizations CSE 3421: Introduction to Computer Architecture Advanced Memory Organizations Study: 5.1, 5.2, 5.3, 5.4 (only parts) Gojko Babić 03-29-2018 1 Growth in Performance of DRAM & CPU Huge mismatch between CPU

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Processor-Memory Performance Gap 10000 µproc 55%/year (2X/1.5yr) Performance 1000 100 10 1 1980 1983 1986 1989 Moore s Law Processor-Memory Performance

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University Large and Fast: Exploiting Memory Hierarchy The Basic of Caches Measuring & Improving Cache Performance Virtual Memory A Common

More information

CSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1

CSE 431 Computer Architecture Fall Chapter 5A: Exploiting the Memory Hierarchy, Part 1 CSE 431 Computer Architecture Fall 2008 Chapter 5A: Exploiting the Memory Hierarchy, Part 1 Mary Jane Irwin ( www.cse.psu.edu/~mji ) [Adapted from Computer Organization and Design, 4 th Edition, Patterson

More information

Performance metrics for caches

Performance metrics for caches Performance metrics for caches Basic performance metric: hit ratio h h = Number of memory references that hit in the cache / total number of memory references Typically h = 0.90 to 0.97 Equivalent metric:

More information

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]

CSF Cache Introduction. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] CSF Cache Introduction [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user with as much

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: John Wawrzynek & Vladimir Stojanovic http://insteecsberkeleyedu/~cs61c/ Typical Memory Hierarchy Datapath On-Chip

More information

Chapter 5 Memory Hierarchy Design. In-Cheol Park Dept. of EE, KAIST

Chapter 5 Memory Hierarchy Design. In-Cheol Park Dept. of EE, KAIST Chapter 5 Memory Hierarchy Design In-Cheol Park Dept. of EE, KAIST Why cache? Microprocessor performance increment: 55% per year Memory performance increment: 7% per year Principles of locality Spatial

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 5. Large and Fast: Exploiting Memory Hierarchy COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address

More information

See also cache study guide. Contents Memory and Caches. Supplement to material in section 5.2. Includes notation presented in class.

See also cache study guide. Contents Memory and Caches. Supplement to material in section 5.2. Includes notation presented in class. 13 1 Memory and Caches 13 1 See also cache study guide. Contents Supplement to material in section 5.2. Includes notation presented in class. 13 1 EE 4720 Lecture Transparency. Formatted 9:11, 22 April

More information

The levels of a memory hierarchy. Main. Memory. 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms

The levels of a memory hierarchy. Main. Memory. 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms The levels of a memory hierarchy CPU registers C A C H E Memory bus Main Memory I/O bus External memory 500 By 1MB 4GB 500GB 0.25 ns 1ns 20ns 5ms 1 1 Some useful definitions When the CPU finds a requested

More information

LECTURE 12. Virtual Memory

LECTURE 12. Virtual Memory LECTURE 12 Virtual Memory VIRTUAL MEMORY Just as a cache can provide fast, easy access to recently-used code and data, main memory acts as a cache for magnetic disk. The mechanism by which this is accomplished

More information

Introduction to cache memories

Introduction to cache memories Course on: Advanced Computer Architectures Introduction to cache memories Prof. Cristina Silvano Politecnico di Milano email: cristina.silvano@polimi.it 1 Summary Summary Main goal Spatial and temporal

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/16/17 Fall 2017 - Lecture #15 1 Outline

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: Bernhard Boser & Randy H Katz http://insteecsberkeleyedu/~cs61c/ 10/18/16 Fall 2016 - Lecture #15 1 Outline

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/ Components of a Computer Processor

More information

Cache Memories. From Bryant and O Hallaron, Computer Systems. A Programmer s Perspective. Chapter 6.

Cache Memories. From Bryant and O Hallaron, Computer Systems. A Programmer s Perspective. Chapter 6. Cache Memories From Bryant and O Hallaron, Computer Systems. A Programmer s Perspective. Chapter 6. Today Cache memory organization and operation Performance impact of caches The memory mountain Rearranging

More information

CS 261 Fall Caching. Mike Lam, Professor. (get it??)

CS 261 Fall Caching. Mike Lam, Professor. (get it??) CS 261 Fall 2017 Mike Lam, Professor Caching (get it??) Topics Caching Cache policies and implementations Performance impact General strategies Caching A cache is a small, fast memory that acts as a buffer

More information

10/16/17. Outline. Outline. Typical Memory Hierarchy. Adding Cache to Computer. Key Cache Concepts

10/16/17. Outline. Outline. Typical Memory Hierarchy. Adding Cache to Computer. Key Cache Concepts // CS C: Great Ideas in Computer Architecture (Machine Structures) s Part Instructors: Krste Asanović & Randy H Katz http://insteecsberkeleyedu/~csc/ Organization and Principles Write Back vs Write Through

More information

Agenda. Cache-Memory Consistency? (1/2) 7/14/2011. New-School Machine Structures (It s a bit more complicated!)

Agenda. Cache-Memory Consistency? (1/2) 7/14/2011. New-School Machine Structures (It s a bit more complicated!) 7/4/ CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches II Instructor: Michael Greenbaum New-School Machine Structures (It s a bit more complicated!) Parallel Requests Assigned to

More information

CS 3410, Spring 2014 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.15

CS 3410, Spring 2014 Computer Science Cornell University. See P&H Chapter: , 5.8, 5.15 CS 34, Spring 4 Computer Science Cornell University See P& Chapter: 5.- 5.4, 5.8, 5.5 Code Stored in emory (also, data and stack) memory PC +4 new pc inst control extend imm B A compute jump/branch targets

More information

Locality. CS429: Computer Organization and Architecture. Locality Example 2. Locality Example

Locality. CS429: Computer Organization and Architecture. Locality Example 2. Locality Example Locality CS429: Computer Organization and Architecture Dr Bill Young Department of Computer Sciences University of Texas at Austin Principle of Locality: Programs tend to reuse data and instructions near

More information

CS 31: Intro to Systems Virtual Memory. Kevin Webb Swarthmore College November 15, 2018

CS 31: Intro to Systems Virtual Memory. Kevin Webb Swarthmore College November 15, 2018 CS 31: Intro to Systems Virtual Memory Kevin Webb Swarthmore College November 15, 2018 Reading Quiz Memory Abstraction goal: make every process think it has the same memory layout. MUCH simpler for compiler

More information

Virtual Memory. CS 3410 Computer System Organization & Programming

Virtual Memory. CS 3410 Computer System Organization & Programming Virtual Memory CS 3410 Computer System Organization & Programming These slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, and Sirer. Where are we now and

More information

Cray XE6 Performance Workshop

Cray XE6 Performance Workshop Cray XE6 Performance Workshop Mark Bull David Henty EPCC, University of Edinburgh Overview Why caches are needed How caches work Cache design and performance. 2 1 The memory speed gap Moore s Law: processors

More information