Caches. See P&H 5.1, 5.2 (except writes) Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University


1 Caches. See P&H 5.1, 5.2 (except writes). Hakim Weatherspoon, CS 3410, Spring 2013, Computer Science, Cornell University

2 What will you do over Spring Break? A) Relax B) Head home C) Head to a warm destination D) Stay in (frigid) Ithaca E) Study

3 Big Picture: Memory. Code, data, and stack are stored in memory. [Diagram: the five-stage pipeline datapath (Instruction Fetch, Instruction Decode, Execute, Memory, Write-Back, with IF/ID, ID/EX, EX/MEM, MEM/WB registers), showing the instruction memory in the fetch stage and the data memory in the memory stage.]

4 Big Picture: Memory. Memory: big & slow vs Caches: small & fast. Code, data, and stack are stored in memory. [Same pipeline diagram as the previous slide, emphasizing the instruction and data memories.]

5 Big Picture: Memory. Memory: big & slow vs Caches: small & fast. [Same pipeline diagram, now with a cache ($$) inserted between the processor and the instruction memory and between the processor and the data memory.]

6 Goals for Today: caches. Caches vs memory vs tertiary storage. Tradeoffs: big & slow vs small & fast. Working set: 90/10 rule. How to predict the future: temporal & spatial locality. Examples of caches: Direct Mapped, Fully Associative, N-way Set Associative. Caching questions: How does a cache work? How effective is the cache (hit rate / miss rate)? How large is the cache? How fast is the cache (AMAT = average memory access time)?
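
For reference (not spelled out on the slide itself), the usual definition of average memory access time is:

    AMAT = hit time + miss rate x miss penalty

For example, with a hypothetical 1-cycle hit time, a 10% miss rate, and a 100-cycle miss penalty, AMAT = 1 + 0.10 x 100 = 11 cycles.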

7 Big Picture. How do we make the processor fast, given that memory is VEEERRRYYYY SLLOOOWWW!!

8 Performance. CPU clock cycles are on the order of a nanosecond or less (GHz range). Roughly, from slowest/cheapest to fastest/most expensive (the specific capacities and prices per GB were lost in transcription): Tape: TBs, latency of seconds. Disk: TBs, millions of cycles (ms). SSD (Flash): GBs, thousands of cycles (us). DRAM: GBs, tens of ns. SRAM off-chip: MBs, a few ns. SRAM on-chip: KBs, a few cycles (ns). Others: eDRAM aka 1T SRAM, FeRAM, CD, DVD, ... Q: Can we create the illusion of cheap + large + fast?

9 Memory Pyramid. L3 becoming more common (eDRAM?). RegFile: 100s of bytes, less than 1 cycle to access. L1 cache: several KB, a few cycles to access. L2 cache: up to a few MB, slower. Memory: up to a few GB, slower still. Disk: many GB to a few TB, vastly slower. These are rough numbers: mileage may vary for latest/greatest. Caches are usually made of SRAM (or eDRAM).

10 Memory Hierarchy. Memory closer to the processor: small & fast, stores active data. Memory farther from the processor: big & slow, stores inactive data. L1: SRAM on-chip. L2/L3: SRAM. Memory: DRAM.

11 Memory Hierarchy. Memory closer to the processor: small & fast, stores active data. Memory farther from the processor: big & slow, stores inactive data. Most data is inactive (not accessed); a small fraction of the data is active; an even smaller fraction is accessed the most. L1: SRAM on-chip. L2/L3: SRAM. Memory: DRAM. [Diagram: a register is loaded from memory, e.g. LW $R, Mem.]

12 Memory Hierarchy. Insight for caches: If Mem[x] was accessed recently... then Mem[x] is likely to be accessed soon. Exploit temporal locality: put recently accessed Mem[x] higher in the memory hierarchy since it will likely be accessed again soon. Also, Mem[x ± ε] is likely to be accessed soon. Exploit spatial locality: put the entire block containing Mem[x] and surrounding addresses higher in the memory hierarchy since nearby addresses will likely be accessed.

13 Memory Hierarchy. Memory closer to the processor is fast but small: it usually stores a subset of the memory farther away (strictly inclusive). Transfer whole blocks (cache lines): on the order of KB between disk and RAM, and smaller blocks of bytes between RAM and L2 and between L2 and L1.

14 Memory trace and Memory Hierarchy. [Memory trace: a column of nearby addresses showing repeated and sequential accesses; the exact values were lost in transcription.] The program below is the source of the trace; the recursive fib calls exhibit temporal locality, and the sequential walk over k[] exhibits spatial locality.

int n = 4;                      /* constants restored illustratively; the originals were elided */
int k[] = { 3, 14, 0, 10 };
int fib(int i) {
  if (i <= 2) return i;         /* temporal locality: fib and its data are reused repeatedly */
  else return fib(i-1) + fib(i-2);
}
int main(int ac, char **av) {
  for (int i = 0; i < n; i++) { /* spatial locality: k[] is walked sequentially */
    printi(fib(k[i]));
    prints("\n");
  }
}

15 Cache Lookups (Read). The processor tries to access Mem[x]. Check: is the block containing Mem[x] in the cache? Yes: cache hit. Return the requested data from the cache line. No: cache miss. Read the block from memory (or a lower-level cache), evict an existing cache line to make room if needed, place the new block in the cache, return the requested data, and stall the pipeline while all of this happens.
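
The flow above can be written out in a few lines of C. This is only a minimal sketch of a direct-mapped lookup; the structure names, sizes, and the fetch_block_from_memory helper are of my choosing, not from the slides.

#include <stdint.h>
#include <stdbool.h>

#define NUM_LINES   64          /* 2^n lines (assumed size)  */
#define BLOCK_BYTES 16          /* 2^m bytes per block       */

struct line { bool valid; uint32_t tag; uint8_t data[BLOCK_BYTES]; };
struct line cache[NUM_LINES];

extern void fetch_block_from_memory(uint32_t addr, uint8_t *dst);  /* hypothetical helper */

uint8_t cache_read_byte(uint32_t addr) {
    uint32_t offset = addr % BLOCK_BYTES;
    uint32_t index  = (addr / BLOCK_BYTES) % NUM_LINES;
    uint32_t tag    = addr / (BLOCK_BYTES * NUM_LINES);
    struct line *l  = &cache[index];
    if (!l->valid || l->tag != tag) {            /* miss: evict whatever is in this line */
        fetch_block_from_memory(addr - offset, l->data);
        l->valid = true;
        l->tag   = tag;
    }
    return l->data[offset];                      /* hit (or just-filled) path */
}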

16 Three common designs. A given data block can be placed... in exactly one cache line: Direct Mapped. In any cache line: Fully Associative. In a small set of cache lines: Set Associative.

17 Direct Mapped Cache. Each block number is mapped to a single cache line index. Simplest hardware. [Diagram: a direct-mapped cache of a few multi-word lines, and a memory whose consecutive blocks map onto lines 0, 1, ..., wrapping around.]

18 Direct Mapped Cache. Each block number is mapped to a single cache line index. Simplest hardware. [Diagram: a memory address split into tag, index, and offset fields (the field widths were lost in transcription); the index selects the cache line.]

19 Direct Mapped Cache. Each block number is mapped to a single cache line index. Simplest hardware. [Same diagram, now highlighting which memory blocks map to the first cache lines.]

20 Direct Mapped Cache. Each block number is mapped to a single cache line index. Simplest hardware. [Same diagram, with the mapping from every memory block to its cache line shown.]
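
The mapping sketched on these slides is just modular arithmetic. Assuming a block size of 2^m bytes and 2^n cache lines (my notation, matching the slides that follow):

    block number = address / 2^m
    line index   = block number mod 2^n
    tag          = block number / 2^n

so every block whose block number differs by a multiple of 2^n lands in the same line.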

21 Direct Mapped Cache

22 Direct Mapped Cache (Reading). [Diagram: the address is split into Tag, Index, and Offset. The index selects one cache line (V, Tag, Block); the stored tag is compared with the address tag to produce hit?; the offset (word selector plus byte offset within the word) selects the requested data bits from the block.]
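
In C, extracting those address fields is a shift and a mask. A small sketch, assuming n index bits and m offset bits (the function names are mine, for illustration only):

#include <stdint.h>

static inline uint32_t offset_bits(uint32_t addr, int m)        { return addr & ((1u << m) - 1); }
static inline uint32_t index_bits (uint32_t addr, int n, int m) { return (addr >> m) & ((1u << n) - 1); }
static inline uint32_t tag_bits   (uint32_t addr, int n, int m) { return addr >> (n + m); }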

23 Direct Mapped Cache (Reading). Tag, Index, Offset. V, Tag, Block. n-bit index, m-bit offset. Q: How big is the cache (data only)?

24 Direct Mapped Cache (Reading). Tag, Index, Offset. 2^m bytes per block, 2^n blocks. n-bit index, m-bit offset. Q: How big is the cache (data only)? Cache of size 2^n blocks, block size of 2^m bytes. Size: 2^m bytes per block x 2^n blocks = 2^(n+m) bytes.

25 Direct Mapped Cache (Reading). Tag, Index, Offset. V, Tag, Block. n-bit index, m-bit offset. Q: How much SRAM is needed (data + overhead)?

26 Direct Mapped Cache (Reading). Tag, Index, Offset. 2^m bytes per block, 2^n blocks. n-bit index, m-bit offset. Q: How much SRAM is needed (data + overhead)? Cache of size 2^n blocks, block size of 2^m bytes. Tag field: 32 - (n + m) bits (for 32-bit addresses), valid bit: 1 bit. SRAM size: 2^n x (block size + tag size + valid bit size) = 2^n x (2^m bytes x 8 bits-per-byte + (32 - (n + m)) + 1).
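
As a worked example (numbers of my choosing, not the slide's): with 32-bit addresses, n = 10 (1024 lines) and m = 4 (16-byte blocks), the data capacity is 2^14 = 16 KB, while the SRAM needed is

    2^10 x (16 x 8 + (32 - 14) + 1) = 1024 x 147 bits = 150,528 bits, about 18.4 KB,

so the tag and valid bits add roughly 15% overhead on top of the 16 KB of data.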

27 Example: A Simple Direct Mapped Cache. Using byte addresses in this example! [Setup diagram: a short address bus, a small direct-mapped cache with a few lines of word-sized blocks, a sequence of LB (load byte) instructions into registers, and the backing memory contents. The specific bus width, addresses, and values were lost in transcription.]

28 Example: A Simple Direct Mapped Cache. Using byte addresses in this example! [Same setup, with the address now split into its tag, index, and block-offset fields.]

29-44 1st through 8th Access. [A step-by-step walkthrough of the load sequence against the direct-mapped cache: each pair of slides shows the address being accessed, whether it hits or misses, the updated cache contents (V, tag, data), and the running Misses/Hits counts. The specific addresses, contents, and counts were lost in transcription.]

45 Misses. Three types of misses: Cold (aka Compulsory): the line is being referenced for the first time. Capacity: the line was evicted because the cache was not large enough. Conflict: the line was evicted because of another access whose index conflicted.
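
To make the conflict case concrete, here is a small, illustrative access pattern (not from the slides) that thrashes a direct-mapped cache: two addresses exactly one cache-size apart share the same index, so each load evicts the line the other one needs.

#include <stdint.h>

#define CACHE_BYTES (16 * 1024)          /* assumed cache size */

int conflict_sum(volatile uint8_t *buf, int iters) {
    int sum = 0;
    for (int i = 0; i < iters; i++) {
        sum += buf[0];                   /* maps to some line L  */
        sum += buf[CACHE_BYTES];         /* same index: evicts L */
    }
    return sum;                          /* every access misses in a direct-mapped cache */
}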

46 Misses. Q: How to avoid them? Cold Misses: unavoidable? The data was never in the cache. Prefetching! Capacity Misses: buy more SRAM. Conflict Misses: use a more flexible cache design.
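
Software prefetching is one way to hide cold misses. A minimal sketch using GCC/Clang's __builtin_prefetch; the lookahead distance of 8 elements is an arbitrary illustrative choice.

long sum_with_prefetch(const long *a, int n) {
    long sum = 0;
    for (int i = 0; i < n; i++) {
        if (i + 8 < n)
            __builtin_prefetch(&a[i + 8]);   /* start fetching a later element early */
        sum += a[i];
    }
    return sum;
}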

47-54 Direct Mapped Example, continued (a pathological case). Using byte addresses in this example! The accesses in this part of the sequence keep mapping onto the same cache line, so lines are repeatedly evicted and the miss count climbs while the hit count stays low. [Step-by-step cache-state diagrams with the running Misses/Hits counts; the specific addresses and counts were lost in transcription.]

55 Cache Organization. How to avoid conflict misses: three common designs. Fully associative: a block can be anywhere in the cache. Direct mapped: a block can only be in one line in the cache. Set-associative: a block can be in a small number of places in the cache.

56 Fully Associative Cache (Reading). [Diagram: the address is split into Tag and Offset only (no index). The address tag is compared against the tags of every line in parallel; the matching valid line is selected (hit?), and the offset selects the word and bytes from its block.]

57 Fully Associative Cache (Reading). Tag, Offset. V, Tag, Block. m-bit offset, 2^n blocks (cache lines). Q: How big is the cache (data only)?

58 Fully Associative Cache (Reading). Tag, Offset. m-bit offset, 2^n blocks (cache lines). Q: How big is the cache (data only)? Cache of size 2^n blocks, block size of 2^m bytes. Size: number-of-blocks x block size = 2^n x 2^m bytes = 2^(n+m) bytes.

59 Fully Associative Cache (Reading). Tag, Offset. V, Tag, Block. m-bit offset, 2^n blocks (cache lines). Q: How much SRAM is needed (data + overhead)?

60 Fully Associative Cache (Reading). Tag, Offset. m-bit offset, 2^n blocks (cache lines). Q: How much SRAM is needed (data + overhead)? Cache of size 2^n blocks, block size of 2^m bytes. Tag field: 32 - m bits (for 32-bit addresses), valid bit: 1 bit. SRAM size: 2^n x (block size + tag size + valid bit size) = 2^n x (2^m bytes x 8 bits-per-byte + (32 - m) + 1).
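
Continuing the earlier worked example (again, numbers of my choosing): the same 16 KB of data held fully associatively, with 1024 lines of 16-byte blocks and 32-bit addresses, needs

    2^10 x (16 x 8 + (32 - 4) + 1) = 1024 x 157 bits = 160,768 bits, about 19.6 KB,

so the tags alone cost noticeably more than in the direct-mapped case (28-bit tags instead of 18-bit), before even counting the comparator needed for every line.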

61 Example: A Simple Fully Associative Cache. Using byte addresses in this example! [Setup diagram: the same short address bus and load sequence as before, but now a fully associative cache of a few word-sized lines; the address splits into only a tag field and a block-offset field, and every line has its own valid bit. Specific widths, addresses, and values were lost in transcription.]

62-79 Fully Associative Example: 1st Access onward. [A step-by-step walkthrough of the same load sequence against the fully associative cache: each pair of slides shows the address, whether it hits or misses, the updated cache contents (tag, data), the LRU status used to pick a victim line, and the running Misses/Hits counts. Because any block can go in any line, the conflict behavior seen in the direct-mapped walkthrough does not recur. The specific addresses, contents, and counts were lost in transcription.]

80 Eviction. Which cache line should be evicted from the cache to make room for a new line? Direct-mapped: no choice, must evict the line selected by the index. Associative caches: random: select one of the lines at random. Round-robin: similar to random. FIFO: replace the oldest line. LRU: replace the line that has not been used in the longest time.
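
A simple way to express the LRU policy in software (a sketch of the policy only, not the course's hardware design; names and sizes are mine) is to stamp each line with the time of its last use and evict the oldest stamp:

#include <stdint.h>
#include <stdbool.h>

#define NUM_LINES 8

struct line { bool valid; uint32_t tag; uint64_t last_used; };
struct line lines[NUM_LINES];
uint64_t now;                                   /* incremented on every access */

int pick_victim_lru(void) {
    int victim = 0;
    for (int i = 0; i < NUM_LINES; i++) {
        if (!lines[i].valid) return i;          /* an empty line is always the best victim */
        if (lines[i].last_used < lines[victim].last_used)
            victim = i;                         /* otherwise take the least recently used */
    }
    return victim;
}

/* on every hit or fill: lines[i].last_used = ++now; */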

81 Tradeoffs: Direct Mapped vs Fully Associative.
Tag size: smaller (+) vs larger.
SRAM overhead: less (+) vs more.
Controller logic: less (+) vs more.
Speed: faster (+) vs slower.
Price: less (+) vs more.
Scalability: very (+) vs not very.
Number of conflict misses: lots vs zero (+).
Hit rate: low vs high (+).
Pathological cases: common vs ?

82 Set-associative cache. A compromise. Like a direct-mapped cache: index into a location; fast. Like a fully-associative cache: can store multiple entries, which decreases thrashing in the cache; search within each set's elements.

83 N-Way Set Associative Cache (Reading). [Diagram: the address is split into Tag, Index, and Offset. The index selects a set; the address tag is compared against the tags of each way in the set in parallel (hit?); the matching way's line is selected, and the offset picks the word and bytes from its block.]
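
Putting the pieces together, a set-associative lookup is the direct-mapped lookup with a short search inside the chosen set. A minimal sketch; as in the earlier sketches, the structure, sizes, and helper names are mine, not from the slides.

#include <stdint.h>
#include <stdbool.h>

#define NUM_SETS    64
#define NUM_WAYS     2
#define BLOCK_BYTES 16

struct way { bool valid; uint32_t tag; uint8_t data[BLOCK_BYTES]; };
struct way sets[NUM_SETS][NUM_WAYS];

extern void fetch_block_from_memory(uint32_t addr, uint8_t *dst);  /* hypothetical helper */
extern int  pick_victim(int set);                                  /* e.g. LRU, as sketched above */

uint8_t setassoc_read_byte(uint32_t addr) {
    uint32_t offset = addr % BLOCK_BYTES;
    uint32_t set    = (addr / BLOCK_BYTES) % NUM_SETS;
    uint32_t tag    = addr / (BLOCK_BYTES * NUM_SETS);
    for (int w = 0; w < NUM_WAYS; w++)                 /* compare all ways in the set */
        if (sets[set][w].valid && sets[set][w].tag == tag)
            return sets[set][w].data[offset];          /* hit */
    int v = pick_victim(set);                          /* miss: choose a way to evict */
    fetch_block_from_memory(addr - offset, sets[set][v].data);
    sets[set][v].valid = true;
    sets[set][v].tag   = tag;
    return sets[set][v].data[offset];
}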

84-89 Comparison: Direct Mapped vs Fully Associative vs Set Associative. Using byte addresses in this example! [The same load sequence is replayed against a direct-mapped cache, a fully associative cache, and a set-associative cache, tracking the Misses/Hits counts for each organization so they can be compared. The specific addresses and counts were lost in transcription.]

90 Remaining Issues To Do: Evicting cache lines Picking cache parameters Writing using the cache

91 Summary. Caching assumptions: small working set (90/10 rule); we can predict the future (spatial & temporal locality). Benefits: big & fast memory built from (big & slow) + (small & fast). Tradeoffs: associativity, line size, hit cost, miss penalty, hit rate. Fully associative: higher hit cost, higher hit rate. Larger block size: lower hit cost, higher miss penalty. Next up: other designs; writing to caches.

92 Administrivia. Upcoming agenda: HW was due yesterday (Wednesday, March). PA Work-in-Progress circuit due before spring break. Spring break: Saturday to Sunday, March. Prelim: Thursday, right after spring break. PA due Thursday, April.

93 Have a great Spring Break!!!
