NOW Handout Page 1. Review: Who Cares About the Memory Hierarchy? EECS 252 Graduate Computer Architecture. Lec 12 - Caches
|
|
- Randell Nash
- 6 years ago
- Views:
Transcription
1 NOW Handout Page EECS 5 Graduate Computer Architecture Lec - Caches David Culler Electrical Engineering and Computer Sciences University of California, Berkeley http// http//www-inst.eecs.berkeley.edu/~cs5 Review Who Cares About the Memory Hierarchy? Processor Only Thus Far in Course CPU cost/performance, ISA, Pipelined Execution CPU-DRAM Gap Moore s Law Performance Less Law? µproc 6%/yr. DRAM 7%/yr. /8/ CS5-S5 L Caches CPU Processor-Memory Performance Gap (grows 5% / year) 98 no cache in µproc; 995 -level cache on chip (989 first Intel µproc with a cache on chip) DRAM Review What is a cache? Small, fast storage used to improve average access time to slow memory. Exploits spacial and temporal locality In computer architecture, almost everything is a cache! Registers a cache on variables First-level cache a cache on second-level cache Second-level cache a cache on memory Memory a cache on disk (virtual memory) TLB a cache on page table Branch-prediction a cache on prediction information? Bigger Proc/Regs L-Cache L-Cache Memory Faster Disk, Tape, etc. /8/ CS5-S5 L Caches 3 Review Terminology Hit data appears in some block in the upper level (example Block X) Hit Rate the fraction of memory access found in the upper level Hit Time Time to access the upper level which consists of RAM access time + Time to determine hit/miss Miss data needs to be retrieve from a block in the lower level (Block Y) Miss Rate = - (Hit Rate) Miss Penalty Time to replace a block in the upper level + Time to deliver the block the processor Hit Time << Miss Penalty (5 instructions on 6!) To Processor Upper Level Memory Lower Level Memory Blk X From Processor Blk Y /8/ CS5-S5 L Caches Why it works Exploit the statistical properties of programs Locality of reference Temporal Spatial P(access,t) Block Placement Q Where can a block be placed in the upper level? Fully Associative, Set Associative, Direct Mapped Average Memory Access Time AMAT = HitTime + MissRate MissPenalty = ( HitTimeInst + MissRateInst MissPenaltyInst ) + ( HitTimeData + MissRateData MissPenaltyData ) Simple hardware structure that observes program behavior and reacts to improve future performance Is the cache visible in the ISA? address /8/ CS5-S5 L Caches 5 /8/ CS5-S5 L Caches 6
2 NOW Handout Page KB Direct Mapped Cache, 3B blocks For a ** N byte cache The uppermost (3 - N) bits are always the The lowest M bits are the Byte Select (Block Size = ** M) 3 Valid Bit x5 Example x5 Stored as part of the cache state Cache Index Ex x Byte 3 Byte 63 Byte 3 Byte Byte Byte 33 Byte 3 /8/ CS5-S5 L Caches 7 9 Byte Select Ex x 3 Byte 99 3 Review Set Associative Cache N-way set associative N entries for each Cache Index N direct mapped caches operates in parallel How big is the tag? Example Two-way set associative cache Cache Index selects a set from the cache The two tags in the set are compared to the input in parallel Data is selected based on the tag result Valid Cache Block Adr Tag Compare Sel Cache Index Mux Cache Block Sel OR /8/ CS5-S5 L Cache Caches Block Hit 8 Compare Valid Q How is a block found if it is in the upper level? Index identifies set of possibilities Tag on each block No need to check index or block offset Increasing associativity shrinks index, expands tag Block Address Block Tag Index Offset Cache size = Associativity * index_size * offest_size Q3 Which block should be replaced on a miss? Easy for Direct Mapped Set Associative or Fully Associative Random LRU (Least Recently Used) Assoc -way -way 8-way Size LRU Ran LRU Ran LRU Ran 6 KB 5.% 5.7%.7% 5.3%.% 5.% 6 KB.9%.%.5%.7%.%.5% 56 KB.5%.7%.3%.3%.%.% /8/ CS5-S5 L Caches 9 /8/ CS5-S5 L Caches Q What happens on a write? Write through The information is written to both the block in the cache and to the block in the lowerlevel memory. Write back The information is written only to the block in the cache. The modified cache block is written to main memory only when it is replaced. is block clean or dirty? Pros and Cons of each? WT read misses cannot result in writes WB no repeated writes to same location WT always combined with write buffers so that don t wait for lower level memory What about on a miss? Write_no_allocate vs write_allocate /8/ CS5-S5 L Caches Write Buffer for Write Through Processor Cache Write Buffer DRAM A Write Buffer is needed between the Cache and Memory Processor writes data into the cache and the write buffer Memory controller write contents of the buffer to memory Write buffer is just a FIFO Typical number of entries Works fine if Store frequency (w.r.t. time) << / DRAM write cycle /8/ CS5-S5 L Caches
3 NOW Handout Page 3 Review Cache performance Miss-oriented Approach to Memory Access MemAccess CPUtime = IC CPI Execution + MissRate MissPenalty CycleTime Inst Separating out Memory component entirely AMAT = Average Memory Access Time MemAccess CPUtime = IC CPI AluOps + AMAT CycleTime Inst Effective CPI = CPI ideal_mem + P mem * AMAT /8/ CS5-S5 L Caches 3 Impact on Performance Suppose a processor executes at Clock Rate = MHz (5 ns per cycle), Ideal (no misses) CPI =. 5% arith/logic, 3% ld/st, % control Suppose that % of memory operations get 5 cycle miss penalty Suppose that % of instructions get same miss penalty CPI = ideal CPI + average stalls per instruction.(cycles/ins) + [.3 (DataMops/ins) x. (miss/datamop) x 5 (cycle/miss)] + [ (InstMop/ins) x. (miss/instmop) x 5 (cycle/miss)] = ( ) cycle/ins = 3. 58% of the time the proc is stalled waiting for memory! AMAT=(/.3)x[+.x5]+(.3/.3)x[+.x5]=.5 /8/ CS5-S5 L Caches Example Harvard Architecture Unified vs Separate I&D (Harvard) Proc Unified Cache- Unified Cache- I-Cache- Proc D-Cache- Unified Cache- Statistics (given in H&P) 6KB I&D Inst miss rate=.6%, Data miss rate=6.7% 3KB unified Aggregate miss rate=.99% Which is better (ignore L cache)? Assume 33% data ops 75% accesses from instructions (./.33) hit time=, miss time=5 Note that data hit has stall for unified cache (only one port) AMAT Harvard =75%x(+.6%x5)+5%x(+6.7%x5) =.5 AMAT Unified =75%x(+.99%x5)+5%x(++.99%x5)=. /8/ CS5-S5 L Caches 5 The Cache Design Space Several interacting dimensions cache size block size associativity replacement policy write-through vs write-back The optimal choice is a compromise depends on access characteristics» workload Bad» use (I-cache, D-cache, TLB) depends on technology / cost Good Simplicity often wins Cache Size Factor A Less Associativity Block Size Factor B More /8/ CS5-S5 L Caches 6 Review Improving Cache Performance Memory accesses CPUtime = IC CPI Execution + Miss rate Miss penalty Clock cycle time Instruction. Reduce the miss rate,. Reduce the miss penalty, or 3. Reduce the time to hit in the cache. /8/ CS5-S5 L Caches 7 Reducing Misses Classifying Misses 3 Cs Compulsory The first access to a block is not in the cache, so the block must be brought into the cache. Also called cold start misses or first reference misses. (Misses in even an Infinite Cache) Capacity If the cache cannot contain all the blocks needed during execution of a program, capacity misses will occur due to blocks being discarded and later retrieved. (Misses in Fully Associative Size X Cache) Conflict If block-placement strategy is set associative or direct mapped, conflict misses (in addition to compulsory & capacity misses) will occur because a block can be discarded and later retrieved if too many blocks map to its set. Also called collision misses or interference misses. (Misses in N-way Associative, Size X Cache) More recent, th C Coherence - Misses caused by cache coherence. /8/ CS5-S5 L Caches 8
4 NOW Handout Page 3Cs Absolute Miss Rate (SPEC9) way -way -way 8-way Conflict Capacity Cache Rule miss rate -way associative cache size X ~= miss rate -way associative cache size X/ way -way -way 8-way Conflict Capacity Compulsory vanishingly small Cache Size (KB) Compulsory /8/ CS5-S5 L Caches Cache Size (KB) Compulsory /8/ CS5-S5 L Caches 3Cs Relative Miss Rate How Can Reduce Misses? % 8% 6% % % -way -way -way 8-way Capacity Conflict 3 Cs Compulsory, Capacity, Conflict In all cases, assume total cache size not changed What happens if ) Change Block Size Which of 3Cs is obviously affected? ) Change Associativity Which of 3Cs is obviously affected? % Caveat fixed block size Compulsory Cache Size (KB) /8/ CS5-S5 L Caches 3) Change Algorithm / Compiler Which of 3Cs is obviously affected? /8/ CS5-S5 L Caches. Reduce Misses via Larger Block Size Miss Rate 5% % 5% % 5% K K 6K 6K 56K. Reduce Misses via Higher Associativity Cache Rule Miss Rate DM cache size N ~ Miss Rate -way cache size N/ Beware Execution time is only final measure! Will Clock Cycle time increase? Hill [988] suggested hit time for -way vs. -way external cache +%, internal + % % Block Size (bytes) 56 /8/ CS5-S5 L Caches 3 /8/ CS5-S5 L Caches
5 NOW Handout Page 5 Example Avg. Memory Access Time vs. Miss Rate 3. Reducing Misses via a Victim Cache assume CCT =. for -way,. for -way,. for 8-way vs. CCT direct mapped Cache Size Associativity (KB) -way -way -way 8-way (Red means A.M.A.T. not improved by more associativity) /8/ CS5-S5 L Caches 5 How to combine fast hit time of direct mapped yet still avoid conflict misses? Add buffer to place data discarded from cache Jouppi [99] -entry victim cache removed % to 95% of conflicts for a KB direct mapped data cache Used in Alpha, HP machines TAGS DATA Tag and Comparator One Cache line of Data Tag and Comparator One Cache line of Data Tag and Comparator One Cache line of Data Tag and Comparator One Cache line of Data To Next Lower Level In Hierarchy /8/ CS5-S5 L Caches 6. Reducing Misses via Pseudo-Associativity How to combine fast hit time of Direct Mapped and have the lower conflict misses of -way SA cache? Divide cache on a miss, check other half of cache to see if there, if so have a pseudo-hit (slow hit) Hit Time Pseudo Hit Time Miss Penalty Time Drawback CPU pipeline is hard if hit takes or cycles Better for caches not tied directly to processor (L) Used in MIPS R L cache, similar in UltraSPARC 5. Reducing Misses by Hardware Prefetching of Instructions & Data E.g., Instruction Prefetching Alpha 6 fetches blocks on a miss Extra block placed in stream buffer On miss check stream buffer Works with data blocks too Jouppi [99] data stream buffer got 5% misses from KB cache; streams got 3% Palacharla & Kessler [99] for scientific programs for 8 streams got 5% to 7% of misses from 6KB, -way set associative caches Prefetching relies on having extra memory bandwidth that can be used without penalty /8/ CS5-S5 L Caches 7 /8/ CS5-S5 L Caches 8 6. Reducing Misses by Software Prefetching Data Data Prefetch Load data into register (HP PA-RISC loads) Cache Prefetch load into cache (MIPS IV, PowerPC, SPARC v. 9) Special prefetching instructions cannot cause faults; a form of speculative execution Issuing Prefetch Instructions takes time Is cost of prefetch issues < savings in reduced misses? Higher superscalar reduces difficulty of issue bandwidth 7. Reducing Misses by Compiler Optimizations McFarling [989] reduced caches misses by 75% on 8KB direct mapped cache, byte blocks in software Instructions Reorder procedures in memory so as to reduce conflict misses Profiling to look at conflicts(using tools they developed) Data Merging Arrays improve spatial locality by single array of compound elements vs. arrays Loop Interchange change nesting of loops to access data in order stored in memory Loop Fusion Combine independent loops that have same looping and some variables overlap Blocking Improve temporal locality by accessing blocks of data repeatedly vs. going down whole columns or rows /8/ CS5-S5 L Caches 9 /8/ CS5-S5 L Caches 3
6 NOW Handout Page 6 Merging Arrays Example Loop Interchange Example /* Before sequential arrays */ int val[size]; int key[size]; /* After array of stuctures */ struct merge { int val; int key; }; struct merge merged_array[size]; /* Before */ for (k = ; k < ; k = k+) for (j = ; j < ; j = j+) for (i = ; i < 5; i = i+) x[i][j] = * x[i][j]; /* After */ for (k = ; k < ; k = k+) for (i = ; i < 5; i = i+) for (j = ; j < ; j = j+) x[i][j] = * x[i][j]; Reducing conflicts between val & key; improve spatial locality /8/ CS5-S5 L Caches 3 Sequential accesses instead of striding through memory every words; improved spatial locality /8/ CS5-S5 L Caches 3 Loop Fusion Example /* Before */ for (i = ; i < N; i = i+) for (j = ; j < N; j = j+) a[i][j] = /b[i][j] * c[i][j]; for (i = ; i < N; i = i+) for (j = ; j < N; j = j+) d[i][j] = a[i][j] + c[i][j]; /* After */ for (i = ; i < N; i = i+) for (j = ; j < N; j = j+) { a[i][j] = /b[i][j] * c[i][j]; d[i][j] = a[i][j] + c[i][j];} misses per access to a & c vs. one miss per access; improve spatial locality /8/ CS5-S5 L Caches 33 Blocking Example /* Before */ for (i = ; i < N; i = i+) for (j = ; j < N; j = j+) {r = ; for (k = ; k < N; k = k+){ r = r + y[i][k]*z[k][j];}; x[i][j] = r; }; Two Inner Loops Read all NxN elements of z[] Read N elements of row of y[] repeatedly Write N elements of row of x[] Capacity Misses a function of N & Cache Size N 3 + N => (assuming no conflict; otherwise ) Idea compute on BxB submatrix that fits /8/ CS5-S5 L Caches 3 Blocking Example /* After */ for (jj = ; jj < N; jj = jj+b) for (kk = ; kk < N; kk = kk+b) for (i = ; i < N; i = i+) for (j = jj; j < min(jj+b-,n); j = j+) {r = ; for (k = kk; k < min(kk+b-,n); k = k+) { r = r + y[i][k]*z[k][j];}; x[i][j] = x[i][j] + r; }; B called Blocking Factor Capacity Misses from N 3 + N to N 3 /B +N Conflict Misses Too? /8/ CS5-S5 L Caches 35 Reducing Conflict Misses by Blocking..5 Blocking Factor Direct Mapped Cache Fully Associative Cache 5 5 Conflict misses in caches not FA vs. Blocking size Lam et al [99] a blocking factor of had a fifth the misses vs. 8 despite both fit in cache /8/ CS5-S5 L Caches 36
7 NOW Handout Page 7 Summary of Compiler Optimizations to Reduce Cache Misses (by hand) vpenta (nasa7) gmty (nasa7) tomcatv btrix (nasa7) mxm (nasa7) spice cholesky (nasa7) compress Performance Improvement Impact of Memory Hierarchy on Algorithms Today CPU time is a function of (ops, cache misses) vs. just f(ops) What does this mean to Compilers, Data structures, Algorithms? The Influence of Caches on the Performance of Sorting by A. LaMarca and R.E. Ladner. Proceedings of the Eighth Annual ACM- SIAM Symposium on Discrete Algorithms, January, 997, Quicksort fastest comparison based sorting algorithm when all keys fit in memory Radix sort also called linear time sort because for keys of fixed length and fixed radix a constant number of passes over the data is sufficient independent of the number of keys For Alphastation 5, 3 byte blocks, direct mapped L MB cache, 8 byte keys, from to merged loop loop fusion blocking arrays interchange /8/ CS5-S5 L Caches 37 /8/ CS5-S5 L Caches 38 Quicksort vs. Radix as vary number keys Instructions Quicksort vs. Radix as vary number keys Instrs & Time Radix sort Quick (Instr/key) Radix (Instr/key) Radix sort Time Quick (Instr/key) Radix (Instr/key) Quick (Clocks/key) Radix (clocks/key) 3 Quick sort E+7 Instructions/key Set size in keys /8/ CS5-S5 L Caches 39 3 Quick sort Instructions E+7 Set size in keys /8/ CS5-S5 L Caches Quicksort vs. Radix as vary number keys Cache misses 5 3 Quick sort Radix sort Cache misses Set size in keys Quick(miss/key) Radix(miss/key) What is proper approach to fast algorithms? /8/ CS5-S5 L Caches Review What happens on Cache miss? For in-order pipeline, options Freeze pipeline in Mem stage (popular early on Sparc, R) IF ID EX Mem stall stall stall stall Mem Wr IF ID EX stall stall stall stall Ex Mem Wr» Stall, Load cache line, Restart mem stage» This is why cost on CM = Penalty + Hit Time Use Full/Empty bits in registers + MSHR queue» MSHR = Miss Status/Handler Registers (Kroft) Each entry in this queue keeps track of status of outstanding memory requests to one complete memory line. Per cache-line keep info about memory address. For each word register (if any) that is waiting for result. Used to merge multiple requests to one memory line» New load creates MSHR entry and sets destination register to Empty. Load is released from pipeline.» Attempt to use register before result returns causes instruction to block in decode stage.» Limited out-of-order execution with respect to loads. Popular with in-order superscalar architectures. Out-of-order pipelines already have this functionality built in /8/ (load queues, etc). CS5-S5 L Caches
8 NOW Handout Page 8 Disadvantage of Set Associative Cache N-way Set Associative Cache v. Direct Mapped Cache N comparators vs. Extra MUX delay for the data Data comes AFTER Hit/Miss In a direct mapped cache, Cache Block is available BEFORE Hit/Miss Possible to assume a hit and continue. Recover later if miss. Valid Adr Tag Compare Cache Index Cache Block Cache Block Sel Mux Sel Compare Valid Review Four Questions for Memory Hierarchy Designers Q Where can a block be placed in the upper level? (Block placement) Fully Associative, Set Associative, Direct Mapped Q How is a block found if it is in the upper level? (Block identification) Tag/Block Q3 Which block should be replaced on a miss? (Block replacement) Random, LRU Q What happens on a write? (Write strategy) Write Back or Write Through (with Write Buffer) OR /8/ CS5-S5 L Cache Caches Block Hit 3 /8/ CS5-S5 L Caches Summary Memory accesses CPUtime = IC CPI Execution + Miss rate Miss penalty Clock cycle time Instruction 3 Cs Compulsory, Capacity, Conflict. Reduce Misses via Larger Block Size. Reduce Misses via Higher Associativity 3. Reducing Misses via Victim Cache. Reducing Misses via Pseudo-Associativity 5. Reducing Misses by HW Prefetching Instr, Data 6. Reducing Misses by SW Prefetching Data 7. Reducing Misses by Compiler Optimizations Remember danger of concentrating on just one parameter when evaluating performance /8/ CS5-S5 L Caches 5
Lecture 16: Memory Hierarchy Misses, 3 Cs and 7 Ways to Reduce Misses. Professor Randy H. Katz Computer Science 252 Fall 1995
Lecture 16: Memory Hierarchy Misses, 3 Cs and 7 Ways to Reduce Misses Professor Randy H. Katz Computer Science 252 Fall 1995 Review: Who Cares About the Memory Hierarchy? Processor Only Thus Far in Course:
More informationLecture 16: Memory Hierarchy Misses, 3 Cs and 7 Ways to Reduce Misses Professor Randy H. Katz Computer Science 252 Spring 1996
Lecture 16: Memory Hierarchy Misses, 3 Cs and 7 Ways to Reduce Misses Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Review: Who Cares About the Memory Hierarchy? Processor Only Thus
More informationImproving Cache Performance. Reducing Misses. How To Reduce Misses? 3Cs Absolute Miss Rate. 1. Reduce the miss rate, Classifying Misses: 3 Cs
Improving Cache Performance 1. Reduce the miss rate, 2. Reduce the miss penalty, or 3. Reduce the time to hit in the. Reducing Misses Classifying Misses: 3 Cs! Compulsory The first access to a block is
More informationMemory Hierarchy 3 Cs and 6 Ways to Reduce Misses
Memory Hierarchy 3 Cs and 6 Ways to Reduce Misses Soner Onder Michigan Technological University Randy Katz & David A. Patterson University of California, Berkeley Four Questions for Memory Hierarchy Designers
More informationLecture 9: Improving Cache Performance: Reduce miss rate Reduce miss penalty Reduce hit time
Lecture 9: Improving Cache Performance: Reduce miss rate Reduce miss penalty Reduce hit time Review ABC of Cache: Associativity Block size Capacity Cache organization Direct-mapped cache : A =, S = C/B
More informationMemory Cache. Memory Locality. Cache Organization -- Overview L1 Data Cache
Memory Cache Memory Locality cpu cache memory Memory hierarchies take advantage of memory locality. Memory locality is the principle that future memory accesses are near past accesses. Memory hierarchies
More informationDECstation 5000 Miss Rates. Cache Performance Measures. Example. Cache Performance Improvements. Types of Cache Misses. Cache Performance Equations
DECstation 5 Miss Rates Cache Performance Measures % 3 5 5 5 KB KB KB 8 KB 6 KB 3 KB KB 8 KB Cache size Direct-mapped cache with 3-byte blocks Percentage of instruction references is 75% Instr. Cache Data
More informationLecture 11 Reducing Cache Misses. Computer Architectures S
Lecture 11 Reducing Cache Misses Computer Architectures 521480S Reducing Misses Classifying Misses: 3 Cs Compulsory The first access to a block is not in the cache, so the block must be brought into the
More informationTopics. Digital Systems Architecture EECE EECE Need More Cache?
Digital Systems Architecture EECE 33-0 EECE 9-0 Need More Cache? Dr. William H. Robinson March, 00 http://eecs.vanderbilt.edu/courses/eece33/ Topics Cache: a safe place for hiding or storing things. Webster
More informationImproving Cache Performance. Dr. Yitzhak Birk Electrical Engineering Department, Technion
Improving Cache Performance Dr. Yitzhak Birk Electrical Engineering Department, Technion 1 Cache Performance CPU time = (CPU execution clock cycles + Memory stall clock cycles) x clock cycle time Memory
More informationCPE 631 Lecture 06: Cache Design
Lecture 06: Cache Design Aleksandar Milenkovic, milenka@ece.uah.edu Electrical and Computer Engineering University of Alabama in Huntsville Outline Cache Performance How to Improve Cache Performance 0/0/004
More informationAleksandar Milenkovich 1
Review: Caches Lecture 06: Cache Design Aleksandar Milenkovic, milenka@ece.uah.edu Electrical and Computer Engineering University of Alabama in Huntsville The Principle of Locality: Program access a relatively
More informationTypes of Cache Misses: The Three C s
Types of Cache Misses: The Three C s 1 Compulsory: On the first access to a block; the block must be brought into the cache; also called cold start misses, or first reference misses. 2 Capacity: Occur
More informationAleksandar Milenkovich 1
Lecture 05 CPU Caches Outline Memory Hierarchy Four Questions for Memory Hierarchy Cache Performance Aleksandar Milenkovic, milenka@ece.uah.edu Electrical and Computer Engineering University of Alabama
More informationCache Memory: Instruction Cache, HW/SW Interaction. Admin
Cache Memory Instruction Cache, HW/SW Interaction Computer Science 104 Admin Project Due Dec 7 Homework #5 Due November 19, in class What s Ahead Finish Caches Virtual Memory Input/Output (1 homework)
More informationClassification Steady-State Cache Misses: Techniques To Improve Cache Performance:
#1 Lec # 9 Winter 2003 1-21-2004 Classification Steady-State Cache Misses: The Three C s of cache Misses: Compulsory Misses Capacity Misses Conflict Misses Techniques To Improve Cache Performance: Reduce
More informationMemory Hierarchy. Maurizio Palesi. Maurizio Palesi 1
Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio
More informationLec 12 How to improve cache performance (cont.)
Lec 12 How to improve cache performance (cont.) Homework assignment Review: June.15, 9:30am, 2000word. Memory home: June 8, 9:30am June 22: Q&A ComputerArchitecture_CachePerf. 2/34 1.2 How to Improve Cache
More informationSome material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier
Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science CPUtime = IC CPI Execution + Memory accesses Instruction
More informationComputer Architecture Spring 2016
Computer Architecture Spring 2016 Lecture 08: Caches III Shuai Wang Department of Computer Science and Technology Nanjing University Improve Cache Performance Average memory access time (AMAT): AMAT =
More informationLecture 7: Memory Hierarchy 3 Cs and 7 Ways to Reduce Misses Professor David A. Patterson Computer Science 252 Fall 1996
Lecture 7: Memory Hierarchy 3 Cs and 7 Ways to Reduce Misses Professor David A. Patterson Computer Science 252 Fall 1996 DAP.F96 1 Vector Summary Like superscalar and VLIW, exploits ILP Alternate model
More informationCache performance Outline
Cache performance 1 Outline Metrics Performance characterization Cache optimization techniques 2 Page 1 Cache Performance metrics (1) Miss rate: Neglects cycle time implications Average memory access time
More informationCOSC 6385 Computer Architecture. - Memory Hierarchies (I)
COSC 6385 Computer Architecture - Hierarchies (I) Fall 2007 Slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05 Recap
More informationEEC 170 Computer Architecture Fall Improving Cache Performance. Administrative. Review: The Memory Hierarchy. Review: Principle of Locality
Administrative EEC 7 Computer Architecture Fall 5 Improving Cache Performance Problem #6 is posted Last set of homework You should be able to answer each of them in -5 min Quiz on Wednesday (/7) Chapter
More informationCMSC 611: Advanced Computer Architecture. Cache and Memory
CMSC 611: Advanced Computer Architecture Cache and Memory Classification of Cache Misses Compulsory The first access to a block is never in the cache. Also called cold start misses or first reference misses.
More informationMemory Hierarchy Design. Chapter 5
Memory Hierarchy Design Chapter 5 1 Outline Review of the ABCs of Caches (5.2) Cache Performance Reducing Cache Miss Penalty 2 Problem CPU vs Memory performance imbalance Solution Driven by temporal and
More informationLevels in a typical memory hierarchy
Levels in a typical memory hierarchy CS 211: Computer Architecture Cache Memory Design 2 Memory Hierarchies Memory and CPU performance gap since 1980 Key Principles Locality most programs do not access
More informationCISC 662 Graduate Computer Architecture Lecture 18 - Cache Performance. Why More on Memory Hierarchy?
CISC 662 Graduate Computer Architecture Lecture 18 - Cache Performance Michela Taufer Powerpoint Lecture Notes from John Hennessy and David Patterson s: Computer Architecture, 4th edition ---- Additional
More informationRecap: The Big Picture: Where are We Now? The Five Classic Components of a Computer. CS152 Computer Architecture and Engineering Lecture 20.
Recap The Big Picture Where are We Now? CS5 Computer Architecture and Engineering Lecture s April 4, 3 John Kubiatowicz (www.cs.berkeley.edu/~kubitron) lecture slides http//inst.eecs.berkeley.edu/~cs5/
More informationCPE 631 Lecture 04: CPU Caches
Lecture 04 CPU Caches Electrical and Computer Engineering University of Alabama in Huntsville Outline Memory Hierarchy Four Questions for Memory Hierarchy Cache Performance 26/01/2004 UAH- 2 1 Processor-DR
More informationNOW Handout Page # Why More on Memory Hierarchy? CISC 662 Graduate Computer Architecture Lecture 18 - Cache Performance
CISC 66 Graduate Computer Architecture Lecture 8 - Cache Performance Michela Taufer Performance Why More on Memory Hierarchy?,,, Processor Memory Processor-Memory Performance Gap Growing Powerpoint Lecture
More informationChapter 5. Topics in Memory Hierachy. Computer Architectures. Tien-Fu Chen. National Chung Cheng Univ.
Computer Architectures Chapter 5 Tien-Fu Chen National Chung Cheng Univ. Chap5-0 Topics in Memory Hierachy! Memory Hierachy Features: temporal & spatial locality Common: Faster -> more expensive -> smaller!
More informationEITF20: Computer Architecture Part4.1.1: Cache - 2
EITF20: Computer Architecture Part4.1.1: Cache - 2 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Cache performance optimization Bandwidth increase Reduce hit time Reduce miss penalty Reduce miss
More informationModern Computer Architecture
Modern Computer Architecture Lecture3 Review of Memory Hierarchy Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Performance 1000 Recap: Who Cares About the Memory Hierarchy? Processor-DRAM Memory Gap
More informationMemory Hierarchy. Maurizio Palesi. Maurizio Palesi 1
Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio
More informationChapter-5 Memory Hierarchy Design
Chapter-5 Memory Hierarchy Design Unlimited amount of fast memory - Economical solution is memory hierarchy - Locality - Cost performance Principle of locality - most programs do not access all code or
More informationLecture 11. Virtual Memory Review: Memory Hierarchy
Lecture 11 Virtual Memory Review: Memory Hierarchy 1 Administration Homework 4 -Due 12/21 HW 4 Use your favorite language to write a cache simulator. Input: address trace, cache size, block size, associativity
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationEE 4683/5683: COMPUTER ARCHITECTURE
EE 4683/5683: COMPUTER ARCHITECTURE Lecture 6A: Cache Design Avinash Kodi, kodi@ohioedu Agenda 2 Review: Memory Hierarchy Review: Cache Organization Direct-mapped Set- Associative Fully-Associative 1 Major
More informationEITF20: Computer Architecture Part 5.1.1: Virtual Memory
EITF20: Computer Architecture Part 5.1.1: Virtual Memory Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Cache optimization Virtual memory Case study AMD Opteron Summary 2 Memory hierarchy 3 Cache
More informationCS/ECE 250 Computer Architecture
Computer Architecture Caches and Memory Hierarchies Benjamin Lee Duke University Some slides derived from work by Amir Roth (Penn), Alvin Lebeck (Duke), Dan Sorin (Duke) 2013 Alvin R. Lebeck from Roth
More informationEITF20: Computer Architecture Part 5.1.1: Virtual Memory
EITF20: Computer Architecture Part 5.1.1: Virtual Memory Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Virtual memory Case study AMD Opteron Summary 2 Memory hierarchy 3 Cache performance 4 Cache
More informationCOSC 6385 Computer Architecture - Memory Hierarchies (I)
COSC 6385 Computer Architecture - Memory Hierarchies (I) Edgar Gabriel Spring 2018 Some slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05
More informationQ3: Block Replacement. Replacement Algorithms. ECE473 Computer Architecture and Organization. Memory Hierarchy: Set Associative Cache
Fundamental Questions Computer Architecture and Organization Hierarchy: Set Associative Q: Where can a block be placed in the upper level? (Block placement) Q: How is a block found if it is in the upper
More informationCS61C Review of Cache/VM/TLB. Lecture 26. April 30, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson)
CS6C Review of Cache/VM/TLB Lecture 26 April 3, 999 Dave Patterson (http.cs.berkeley.edu/~patterson) www-inst.eecs.berkeley.edu/~cs6c/schedule.html Outline Review Pipelining Review Cache/VM/TLB Review
More informationEECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141
EECS151/251A Spring 2018 Digital Design and Integrated Circuits Instructors: John Wawrzynek and Nick Weaver Lecture 19: Caches Cache Introduction 40% of this ARM CPU is devoted to SRAM cache. But the role
More informationCSE 502 Graduate Computer Architecture
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 CSE 502 Graduate Computer Architecture Lec 11-14 Advanced Memory Memory Hierarchy Design Larry Wittie Computer Science, StonyBrook
More informationMemory Hierarchy Motivation, Definitions, Four Questions about Memory Hierarchy
Memory Hierarchy Motivation, Definitions, Four Questions about Memory Hierarchy Soner Onder Michigan Technological University Randy Katz & David A. Patterson University of California, Berkeley Levels in
More informationGraduate Computer Architecture. Handout 4B Cache optimizations and inside DRAM
Graduate Computer Architecture Handout 4B Cache optimizations and inside DRAM Outline 11 Advanced Cache Optimizations What inside DRAM? Summary 2018/1/15 2 Why More on Memory Hierarchy? 100,000 10,000
More informationAdvanced optimizations of cache performance ( 2.2)
Advanced optimizations of cache performance ( 2.2) 30 1. Small and Simple Caches to reduce hit time Critical timing path: address tag memory, then compare tags, then select set Lower associativity Direct-mapped
More informationCOEN6741. Computer Architecture and Design
Computer Architecture and Design Chapter 5 Memory Hierarchy Design (Dr. Sofiène Tahar) Chap 5.1 Outline Introduction Memory Hierarchy Cache Memory Cache Performance Main Memory Virtual Memory Translation
More informationAnnouncements. ! Previous lecture. Caches. Inf3 Computer Architecture
Announcements! Previous lecture Caches Inf3 Computer Architecture - 2016-2017 1 Recap: Memory Hierarchy Issues! Block size: smallest unit that is managed at each level E.g., 64B for cache lines, 4KB for
More informationCSF Improving Cache Performance. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]
CSF Improving Cache Performance [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user
More informationLecture 12. Memory Design & Caches, part 2. Christos Kozyrakis Stanford University
Lecture 12 Memory Design & Caches, part 2 Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b 1 Announcements HW3 is due today PA2 is available on-line today Part 1 is due on 2/27
More informationComputer Architecture Spring 2016
Computer Architecture Spring 2016 Lecture 02: Introduction II Shuai Wang Department of Computer Science and Technology Nanjing University Pipeline Hazards Major hurdle to pipelining: hazards prevent the
More informationEITF20: Computer Architecture Part4.1.1: Cache - 2
EITF20: Computer Architecture Part4.1.1: Cache - 2 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Cache performance optimization Bandwidth increase Reduce hit time Reduce miss penalty Reduce miss
More informationThe Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350):
The Memory Hierarchy & Cache Review of Memory Hierarchy & Cache Basics (from 350): Motivation for The Memory Hierarchy: { CPU/Memory Performance Gap The Principle Of Locality Cache $$$$$ Cache Basics:
More informationארכיטקטורת יחידת עיבוד מרכזי ת
ארכיטקטורת יחידת עיבוד מרכזי ת (36113741) תשס"ג סמסטר א' July 2, 2008 Hugo Guterman (hugo@ee.bgu.ac.il) Arch. CPU L8 Cache Intr. 1/77 Memory Hierarchy Arch. CPU L8 Cache Intr. 2/77 Why hierarchy works
More informationCache Performance (H&P 5.3; 5.5; 5.6)
Cache Performance (H&P 5.3; 5.5; 5.6) Memory system and processor performance: CPU time = IC x CPI x Clock time CPU performance eqn. CPI = CPI ld/st x IC ld/st IC + CPI others x IC others IC CPI ld/st
More informationTime 11/03/99 UCB Fall 1999
Recap Who Cares About the Hierarchy? CS52 Computer Architecture and Engineering Lecture 9 s and TLBs November 3, 999 John Kubiatowicz (http.cs.berkeley.edu/~kubitron) lecture slides http//www-inst.eecs.berkeley.edu/~cs52/
More informationCache Performance! ! Memory system and processor performance:! ! Improving memory hierarchy performance:! CPU time = IC x CPI x Clock time
Cache Performance!! Memory system and processor performance:! CPU time = IC x CPI x Clock time CPU performance eqn. CPI = CPI ld/st x IC ld/st IC + CPI others x IC others IC CPI ld/st = Pipeline time +
More informationCache Performance! ! Memory system and processor performance:! ! Improving memory hierarchy performance:! CPU time = IC x CPI x Clock time
Cache Performance!! Memory system and processor performance:! CPU time = IC x CPI x Clock time CPU performance eqn. CPI = CPI ld/st x IC ld/st IC + CPI others x IC others IC CPI ld/st = Pipeline time +
More informationPage 1. Memory Hierarchies (Part 2)
Memory Hierarchies (Part ) Outline of Lectures on Memory Systems Memory Hierarchies Cache Memory 3 Virtual Memory 4 The future Increasing distance from the processor in access time Review: The Memory Hierarchy
More informationGraduate Computer Architecture. Handout 4B Cache optimizations and inside DRAM
Graduate Computer Architecture Handout 4B Cache optimizations and inside DRAM Outline 11 Advanced Cache Optimizations What inside DRAM? Summary 2012/11/21 2 Why More on Memory Hierarchy? 100,000 10,000
More informationCaching Basics. Memory Hierarchies
Caching Basics CS448 1 Memory Hierarchies Takes advantage of locality of reference principle Most programs do not access all code and data uniformly, but repeat for certain data choices spatial nearby
More informationCISC 662 Graduate Computer Architecture Lecture 16 - Cache and virtual memory review
CISC 662 Graduate Computer Architecture Lecture 6 - Cache and virtual memory review Michela Taufer http://www.cis.udel.edu/~taufer/teaching/cis662f07 Powerpoint Lecture Notes from John Hennessy and David
More informationCS422 Computer Architecture
CS422 Computer Architecture Spring 2004 Lecture 19, 04 Mar 2004 Bhaskaran Raman Department of CSE IIT Kanpur http://web.cse.iitk.ac.in/~cs422/index.html Topics for Today Cache Performance Cache Misses:
More informationCourse Administration
Spring 207 EE 363: Computer Organization Chapter 5: Large and Fast: Exploiting Memory Hierarchy - Avinash Kodi Department of Electrical Engineering & Computer Science Ohio University, Athens, Ohio 4570
More information3Introduction. Memory Hierarchy. Chapter 2. Memory Hierarchy Design. Computer Architecture A Quantitative Approach, Fifth Edition
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationCSE 502 Graduate Computer Architecture. Lec Advanced Memory Hierarchy
CSE 502 Graduate Computer Architecture Lec 19-20 Advanced Memory Hierarchy Larry Wittie Computer Science, StonyBrook University http://www.cs.sunysb.edu/~cse502 and ~lw Slides adapted from David Patterson,
More informationPage 1. Multilevel Memories (Improving performance using a little cash )
Page 1 Multilevel Memories (Improving performance using a little cash ) 1 Page 2 CPU-Memory Bottleneck CPU Memory Performance of high-speed computers is usually limited by memory bandwidth & latency Latency
More information5008: Computer Architecture
5008: Computer Architecture Chapter 5 Memory Hierarchy Design CA Lecture10 - memory hierarchy design (cwliu@twins.ee.nctu.edu.tw) 10-1 Outline 11 Advanced Cache Optimizations Memory Technology and DRAM
More informationL2 cache provides additional on-chip caching space. L2 cache captures misses from L1 cache. Summary
HY425 Lecture 13: Improving Cache Performance Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS November 25, 2011 Dimitrios S. Nikolopoulos HY425 Lecture 13: Improving Cache Performance 1 / 40
More informationCS152: Computer Architecture and Engineering Caches and Virtual Memory. October 31, 1997 Dave Patterson (http.cs.berkeley.
CS152 Computer Architecture and Engineering Caches and Virtual Memory October 31, 1997 Dave Patterson (http.cs.berkeley.edu/~patterson) lecture slides http//www-inst.eecs.berkeley.edu/~cs152/ cs 152 L1
More informationTime. Recap: Who Cares About the Memory Hierarchy? Performance. Processor-DRAM Memory Gap (latency)
Recap Who Cares About the Hierarchy? -DRAM Gap (latency) CS52 Computer Architecture and Engineering s and Virtual October 3, 997 Dave Patterson (http.cs.berkeley.edu/~patterson) lecture slides http//www-inst.eecs.berkeley.edu/~cs52/
More informationCSE 502 Graduate Computer Architecture. Lec Advanced Memory Hierarchy and Application Tuning
CSE 502 Graduate Computer Architecture Lec 25-26 Advanced Memory Hierarchy and Application Tuning Larry Wittie Computer Science, StonyBrook University http://www.cs.sunysb.edu/~cse502 and ~lw Slides adapted
More informationLecture 13: Memory Systems -- Cache Op7miza7ons CSE 564 Computer Architecture Summer 2017
Lecture 13: Memory Systems -- Cache Op7miza7ons CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan yan@oakland.edu www.secs.oakland.edu/~yan 1 Topics
More informationCOSC4201. Chapter 5. Memory Hierarchy Design. Prof. Mokhtar Aboelaze York University
COSC4201 Chapter 5 Memory Hierarchy Design Prof. Mokhtar Aboelaze York University 1 Memory Hierarchy The gap between CPU performance and main memory has been widening with higher performance CPUs creating
More informationUNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568/668
UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Computer Architecture ECE 568/668 Part 11 Memory Hierarchy - I Israel Koren ECE568/Koren Part.11.1 ECE568/Koren Part.11.2 Ideal Memory
More informationMemory Hierarchies 2009 DAT105
Memory Hierarchies Cache performance issues (5.1) Virtual memory (C.4) Cache performance improvement techniques (5.2) Hit-time improvement techniques Miss-rate improvement techniques Miss-penalty improvement
More informationCS3350B Computer Architecture
CS335B Computer Architecture Winter 25 Lecture 32: Exploiting Memory Hierarchy: How? Marc Moreno Maza wwwcsduwoca/courses/cs335b [Adapted from lectures on Computer Organization and Design, Patterson &
More informationECE7995 (6) Improving Cache Performance. [Adapted from Mary Jane Irwin s slides (PSU)]
ECE7995 (6) Improving Cache Performance [Adapted from Mary Jane Irwin s slides (PSU)] Measuring Cache Performance Assuming cache hit costs are included as part of the normal CPU execution cycle, then CPU
More informationLec 11 How to improve cache performance
Lec 11 How to improve cache performance How to Improve Cache Performance? AMAT = HitTime + MissRate MissPenalty 1. Reduce the time to hit in the cache.--4 small and simple caches, avoiding address translation,
More informationTDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading
Review on ILP TDT 4260 Chap 5 TLP & Hierarchy What is ILP? Let the compiler find the ILP Advantages? Disadvantages? Let the HW find the ILP Advantages? Disadvantages? Contents Multi-threading Chap 3.5
More informationCOEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory
1 COEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory Cristinel Ababei Dept. of Electrical and Computer Engineering Marquette University Credits: Slides adapted from presentations
More informationEECS150 - Digital Design Lecture 11 SRAM (II), Caches. Announcements
EECS15 - Digital Design Lecture 11 SRAM (II), Caches September 29, 211 Elad Alon Electrical Engineering and Computer Sciences University of California, Berkeley http//www-inst.eecs.berkeley.edu/~cs15 Fall
More informationOutline. 1 Reiteration. 2 Cache performance optimization. 3 Bandwidth increase. 4 Reduce hit time. 5 Reduce miss penalty. 6 Reduce miss rate
Outline Lecture 7: EITF20 Computer Architecture Anders Ardö EIT Electrical and Information Technology, Lund University November 21, 2012 A. Ardö, EIT Lecture 7: EITF20 Computer Architecture November 21,
More informationLecture 19: Memory Hierarchy Five Ways to Reduce Miss Penalty (Second Level Cache) Admin
Lecture 19: Memory Hierarchy Five Ways to Reduce Miss Penalty (Second Level Cache) Professor Alvin R. Lebeck Computer Science 220 Fall 1999 Exam Average 76 90-100 4 80-89 3 70-79 3 60-69 5 < 60 1 Admin
More informationMemory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)
Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2012/13 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2012/13 1 2
More informationLRU. Pseudo LRU A B C D E F G H A B C D E F G H H H C. Copyright 2012, Elsevier Inc. All rights reserved.
LRU A list to keep track of the order of access to every block in the set. The least recently used block is replaced (if needed). How many bits we need for that? 27 Pseudo LRU A B C D E F G H A B C D E
More informationCOSC4201. Chapter 4 Cache. Prof. Mokhtar Aboelaze York University Based on Notes By Prof. L. Bhuyan UCR And Prof. M. Shaaban RIT
COSC4201 Chapter 4 Cache Prof. Mokhtar Aboelaze York University Based on Notes By Prof. L. Bhuyan UCR And Prof. M. Shaaban RIT 1 Memory Hierarchy The gap between CPU performance and main memory has been
More informationMemory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)
Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2011/12 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2011/12 1 2
More informationAdvanced Caching Techniques (2) Department of Electrical Engineering Stanford University
Lecture 4: Advanced Caching Techniques (2) Department of Electrical Engineering Stanford University http://eeclass.stanford.edu/ee282 Lecture 4-1 Announcements HW1 is out (handout and online) Due on 10/15
More information! 11 Advanced Cache Optimizations! Memory Technology and DRAM optimizations! Virtual Machines
Outline EEL 5764 Graduate Computer Architecture 11 Advanced Cache Optimizations Memory Technology and DRAM optimizations Virtual Machines Chapter 5 Advanced Memory Hierarchy Ann Gordon-Ross Electrical
More informationQuestion?! Processor comparison!
1! 2! Suggested Readings!! Readings!! H&P: Chapter 5.1-5.2!! (Over the next 2 lectures)! Lecture 18" Introduction to Memory Hierarchies! 3! Processor components! Multicore processors and programming! Question?!
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #21: Caches 3 2005-07-27 CS61C L22 Caches III (1) Andy Carle Review: Why We Use Caches 1000 Performance 100 10 1 1980 1981 1982 1983
More informationMemory Hierarchy Technology. The Big Picture: Where are We Now? The Five Classic Components of a Computer
The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Datapath Today s Topics: technologies Technology trends Impact on performance Hierarchy The principle of locality
More informationTextbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, Textbook web site:
Textbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, 2003 Textbook web site: www.vrtechnology.org 1 Textbook web site: www.vrtechnology.org Laboratory Hardware 2 Topics 14:332:331
More informationThe Memory Hierarchy & Cache
Removing The Ideal Memory Assumption: The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. SRAM The Motivation for The Memory
More informationMemory Hierarchy: The motivation
Memory Hierarchy: The motivation The gap between CPU performance and main memory has been widening with higher performance CPUs creating performance bottlenecks for memory access instructions. The memory
More information