Memory. Lecture 22 CS301
- Sabrina Hunt
1 Memory Lecture 22 CS301
2 Administrative
- Daily Review of today's lecture due tomorrow (11/13) at 8am
- HW #8 due today at 5pm
- Program #2 due Friday, 11/16 at 11:59pm
- Test #2 Wednesday
3 Pipelined Machine
[datapath diagram: Fetch, Decode, Execute, Memory stages, with PC, instruction memory, register file (src1/src2 reads, destreg/destdata write), sign extension of the 16-bit immediate to 32 bits, data memory, and pipeline registers through Writeback]
4 The Challenge Be able to randomly access gigabytes (or more) of data at processor speeds
5 How Do We Access Data?
6 Program Characteristics
- Temporal Locality: if you use one item, you are likely to use it again soon
- Spatial Locality: if you use one item, you are likely to use its neighbors soon
9 Examples of Each Type of Locality?
- Temporal locality: Good? Bad?
- Spatial locality: Good? Bad?
10 Locality Programs tend to exhibit spatial & temporal locality. Just a fact of life. How can we use this knowledge of program behavior to solve our problem?
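Both kinds of locality show up directly in ordinary code. A minimal Python sketch (the array and its size are made up for illustration): the loop index and running total are reused every iteration (temporal locality), and a row-major traversal touches consecutive elements while a column-major traversal strides across rows (good vs. poor spatial locality).

```python
# Illustrative sketch: two traversals of the same 2-D array.
N = 512
grid = [[1] * N for _ in range(N)]

def sum_row_major(a):
    # Inner loop walks consecutive elements of one row: each fetched
    # cache block also brings in the next several elements we need.
    total = 0
    for i in range(len(a)):
        for j in range(len(a[0])):
            total += a[i][j]
    return total

def sum_col_major(a):
    # Inner loop jumps a whole row ahead each step: the neighbors in a
    # fetched block go unused and the block may be evicted before the
    # traversal comes back to them.
    total = 0
    for j in range(len(a[0])):
        for i in range(len(a)):
            total += a[i][j]
    return total

assert sum_row_major(grid) == sum_col_major(grid) == N * N
```

Both functions compute the same sum; only the access pattern, and hence the cache behavior, differs.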
11 Predicting Data Accesses
Can we predict what data we will use?
- Instead of predicting branch direction, predict the next memory address requested
- Like branch prediction, use previous behavior
Keep a prediction for every load?
- The fetch stage is *TOO LATE* for a load
Keep a prediction per memory address?
- Given an address, guess the next likely address
- Too many choices: the prediction table becomes too large
17 Memory Hierarchy
Tech / Speed / Size / Cost per bit:
- CPU / L1: SRAM (logic). Fastest, smallest, highest cost/bit.
- L2 Cache: SRAM (logic)
- DRAM: DRAM (capacitors). Slowest, largest, lowest cost/bit.
18 Using Caches To Improve Performance
Caches make the large gap between processor speed and memory speed appear much smaller. They give the appearance of having lots and lots of quickly accessible memory.
- Achieved by exploiting spatial and temporal locality
19 SRAM: Static Random Access Memory
- Volatile memory array, 4-6 transistors per bit
- Fast accesses, on the order of nanoseconds
- Dimensions:
  - Height: number of addressable locations
  - Width: number of bits per addressable unit (usually 1 or 4)
- Example: a 2M x 16 SRAM has height 2M and width 16
20 SRAM: Selection Logic
Need to choose which addressable unit drives the output lines. A 2M-to-1 multiplexor is infeasible, so instead all sources share a single output line (the bit line), with a tri-state buffer at each source so that only one source drives the line at a time.
Tri-state buffer behavior (inputs In and Enable, output Out):
- Enable = 0: Out = Z (high impedance, not driving the line)
- Enable = 1: Out = In
[diagram: four data sources (Data 0 - Data 3), each gated by its own Select/Enable tri-state buffer onto one shared output]
21 SRAM: Using Bit Lines
[array diagram: input lines, address (word) lines, and bit (output) lines]
22 SRAM: For Large Arrays
Large arrays (e.g., a 4M x 8 SRAM) would require huge decoders and long word lines. Instead, use 2-stage decoding:
- A row decoder selects addresses across eight 4K x 1024 subarrays
- Multiplexors then select 1 bit from each 1024-bit-wide subarray
23 DRAM: Dynamic RAM
- Value stored as charge in a capacitor (1 transistor per bit)
- Must be refreshed: refresh by reading cells and writing them back
  - Refresh uses only 1-2% of active DRAM cycles
- 2-level decoder:
  - Row access: row access strobe (RAS)
  - Column access: column access strobe (CAS)
- Access times: tens of ns (typical)
- Example: a 4M x 1 DRAM is organized as a 2048 x 2048 array
24 Memory Hierarchy
Tech / Speed / Size / Cost per bit:
- CPU / L1: SRAM (logic). Fastest, smallest, highest cost/bit.
- L2 Cache: SRAM (logic)
- DRAM: DRAM (capacitors). Slowest, largest, lowest cost/bit.
25 What Do We Need to Think About?
1. Design a cache that takes advantage of spatial & temporal locality
2. When you program, place data together that is used together to increase locality
- Java: difficult to do
- C: more control over data placement
Note: Caches exploit locality. Programs have varying degrees of locality. Caches do not have locality!
29 Cache Design
- Temporal Locality: when we obtain the data, store it in the cache.
- Spatial Locality: transfer a large block of contiguous data to get the item's neighbors.
- Block (Line): the amount of data transferred for a single miss (the data plus its neighbors)
32 Where do we put data?
Searching the whole cache takes time & power.
Direct-mapped:
- Limit each piece of data to one possible position
- Search is quick and simple
33 Direct-Mapped
[diagram: memory blocks mapping by index into a direct-mapped cache; one block (line) highlighted]
35 Direct-Mapped Cache
Block (line) size = 2 words (8 B). Given a byte address:
- Where is it within the block? (block offset and byte offset bits)
- Where do we look in the cache? BlockAddress mod #slots, which equals BlockAddress & (#slots - 1) when #slots is a power of 2
- How do we know if it is there? We need a tag & a valid bit per slot
[cache diagram: Valid | Tag | Data columns; the address splits into Tag | Index | offsets]
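The mapping rules above can be sketched in a few lines of Python. The word size, block size, and slot count are the slide's example values; the function name is mine.

```python
# A minimal sketch of splitting a byte address for a direct-mapped
# cache with 4-byte words, 2-word (8-byte) blocks, and 4 slots.
WORD_BYTES = 4
BLOCK_WORDS = 2
NUM_SLOTS = 4
BLOCK_BYTES = WORD_BYTES * BLOCK_WORDS  # 8 bytes per block

def split_address(addr):
    byte_offset = addr % WORD_BYTES                    # which byte within the word
    block_offset = (addr // WORD_BYTES) % BLOCK_WORDS  # which word within the block
    block_addr = addr // BLOCK_BYTES
    index = block_addr % NUM_SLOTS                     # BlockAddress mod #slots
    tag = block_addr // NUM_SLOTS                      # the remaining high bits
    return tag, index, block_offset, byte_offset

# Byte address 72 -> block address 9 -> index 1, tag 2 (0b10):
assert split_address(72) == (2, 1, 0, 0)
```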
38 Splitting the Address (Direct-Mapped Cache)
Address fields, from high bits to low: Tag | Index | Block Offset | Byte Offset
[cache diagram: Valid | Tag | Data]
39 Definitions
- Byte Offset: which byte within the word
- Block Offset: which word within the block
- Set: group of blocks checked each access
- Index: which set within the cache
- Tag: is this the right one?
44 Definitions
- Block (Line): unit of data transfer, in bytes/words
- Hit: data found in this cache
- Miss: data not found in this cache; send the request to the lower level
- Hit time / Access time: time to access this cache
- Miss Penalty: time to receive the block from the lower level (not always constant)
50 Example 1: Direct-Mapped, Block Size = 2 Words
Address fields: Tag | Index | Block Offset | Byte Offset. The cache has 4 slots, each holding one 8-byte block (Valid, Tag, Data).
Reference stream (byte addresses) and outcomes:
- 72 (0b1001000): Miss; loads M[72-79] into slot 1, tag 0b10
- 20 (0b0010100): Miss; loads M[16-23] into slot 2, tag 0b00
- 56 (0b0111000): Miss; loads M[56-63] into slot 3, tag 0b01
- 16 (0b0010000): Hit (slot 2)
- 20 (0b0010100): Hit (slot 2)
- 36 (0b0100100): Miss; loads M[32-39] into slot 0, tag 0b01
Final cache contents:
- Slot 0: tag 01, M[36-39] M[32-35]
- Slot 1: tag 10, M[76-79] M[72-75]
- Slot 2: tag 00, M[20-23] M[16-19]
- Slot 3: tag 01, M[60-63] M[56-59]
Miss Rate: 4 / 6 = 67%. Hit Rate: 2 / 6 = 33%.
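The worked example above can be re-run mechanically. A sketch, assuming the 4-slot, 8-byte-block cache of the example (the simulator itself is illustrative, not course code):

```python
# Re-running Example 1: direct-mapped, 4 slots, 8-byte blocks,
# byte-address reference stream 72, 20, 56, 16, 20, 36.
BLOCK_BYTES = 8
NUM_SLOTS = 4

def simulate(addresses):
    cache = [None] * NUM_SLOTS          # each slot holds a tag (or None)
    outcomes = []
    for addr in addresses:
        block = addr // BLOCK_BYTES
        index, tag = block % NUM_SLOTS, block // NUM_SLOTS
        if cache[index] == tag:
            outcomes.append('H')
        else:
            outcomes.append('M')
            cache[index] = tag          # evict whatever was there
    return outcomes

results = simulate([72, 20, 56, 16, 20, 36])  # -> ['M', 'M', 'M', 'H', 'H', 'M']
```

The outcome sequence matches the slides: 4 misses and 2 hits, for a 67% miss rate.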
66 Implementation
[diagram: the byte address splits into Tag | Index | Block Offset | Byte Offset; the index selects a cache entry, a comparator checks the stored tag against the address tag and ANDs with the valid bit to produce Hit?, and a MUX uses the block offset to select the requested word from the block's data]
67 Example 2
You are implementing a 64-KByte cache with a 32-bit address. The block (line) size is 16 bytes; each word is 4 bytes.
- How many bits is the block offset? 16 / 4 = 4 words -> 2 bits
- How many bits is the index? 64*1024 / 16 = 4096 blocks -> 12 bits
- How many bits is the tag? 32 - (12 + 2 + 2) = 16 bits
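The same arithmetic, as a sketch (variable names are mine; the figures are the slide's):

```python
import math

# Example 2: 64-KByte direct-mapped cache, 16-byte blocks,
# 4-byte words, 32-bit byte addresses.
ADDR_BITS = 32
CACHE_BYTES = 64 * 1024
BLOCK_BYTES = 16
WORD_BYTES = 4

byte_offset_bits = int(math.log2(WORD_BYTES))                   # 2 bits
block_offset_bits = int(math.log2(BLOCK_BYTES // WORD_BYTES))   # 4 words -> 2 bits
index_bits = int(math.log2(CACHE_BYTES // BLOCK_BYTES))         # 4096 blocks -> 12 bits
tag_bits = ADDR_BITS - (index_bits + block_offset_bits + byte_offset_bits)  # 16 bits
```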
71 Example
Direct-mapped cache:
- Block size = 2 words
- Total size = 16 words
Word addresses: 0, 16, 1, 17, 32, 16, 36, 45
What is the hit rate?
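One way to check your answer is to simulate the cache. A sketch, assuming the configuration above and word addressing (the simulator is illustrative, not course code):

```python
# Direct-mapped, block = 2 words, total = 16 words -> 8 slots.
BLOCK_WORDS = 2
NUM_SLOTS = 16 // BLOCK_WORDS   # 8

def hit_count(word_addrs):
    tags = [None] * NUM_SLOTS
    hits = 0
    for w in word_addrs:
        block = w // BLOCK_WORDS
        idx, tag = block % NUM_SLOTS, block // NUM_SLOTS
        if tags[idx] == tag:
            hits += 1
        else:
            tags[idx] = tag          # miss: replace the resident block
    return hits

stream = [0, 16, 1, 17, 32, 16, 36, 45]
# Words 0, 16, 1, 17, 32, 16 all map to slot 0 and keep evicting each
# other, so every reference misses: hit rate 0/8.
```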
73 Reducing Cache Conflicts
Problem:
- Lines that map to the same cache index conflict
- Lines conflict even if other cache lines are unused
Solution:
- Have multiple cache lines for each mapping
74 Cache Set Associativity
Set: group of cache lines an address can map to
- Direct-mapped: 1 location for a block
- n-way set associative: n locations for a block
- Fully-associative: a block maps to any location
[diagram: the same cache drawn direct-mapped, 2-way set associative, and fully-associative, with sets labeled]
75 Cache Set Associativity
Decreases conflicts => increases hit rate.
On a cache request, must check every cache line in the set:
- Increases hit time
The number of sets is smaller than in a direct-mapped cache, so there are fewer index bits:
- lg(number of sets), where #sets < #cache lines
Tag bits increase.
76 Example
2-way set associative cache:
- Block size = 2 words
- Total size = 16 words
Word addresses: 0, 16, 1, 17, 32, 16, 36, 45
What is the hit rate?
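The 2-way version of the same exercise can also be simulated. A sketch, assuming LRU replacement within each set (the slide does not name a replacement policy, so that is an assumption):

```python
# 2-way set associative, block = 2 words, 16 words total
# -> 8 blocks in 4 sets of 2 ways each.
BLOCK_WORDS = 2
NUM_SETS = 4

def hit_count_2way(word_addrs):
    sets = [[] for _ in range(NUM_SETS)]   # each list holds tags, LRU first
    hits = 0
    for w in word_addrs:
        block = w // BLOCK_WORDS
        idx, tag = block % NUM_SETS, block // NUM_SETS
        if tag in sets[idx]:
            hits += 1
            sets[idx].remove(tag)          # refresh: move to most-recently-used
        elif len(sets[idx]) == 2:
            sets[idx].pop(0)               # evict the least-recently-used way
        sets[idx].append(tag)
    return hits

stream = [0, 16, 1, 17, 32, 16, 36, 45]
# Under LRU this stream gets 3 hits out of 8, versus 0 of 8 for the
# direct-mapped cache of the previous example: associativity absorbed
# the conflicts between words 0, 16, and 17.
```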
78 Implementation
[diagram: 2-way set associative lookup; the index selects a set, both ways' tags are compared with the address tag in parallel, valid bits gate the comparisons, the per-way hit signals combine into Hit?, and MUXes select the data from the hitting way and the requested word within the block]
79 Example
You are implementing a 1-MByte 4-way set associative cache with a 32-bit address. The block (line) size is 256 bytes.
- How many bits is the block offset?
- How many bits is the index?
- How many bits is the tag?
80 What Happens on a Cache Miss?
- Detect that the desired block is not there: the valid bit is 0, OR the tag is not the one we're looking for
- If the valid bit is set but the tag does not match, evict the current block
- Request the line from the lower level
- Upon receipt of the data from the lower level, set the tag and valid bits and store the data
- Pass the data up to the requestor
81 How Caches Work
Classic abstraction: each level of the hierarchy has no knowledge of the configuration of the levels below it.
- From the L1 cache's perspective, everything below (L2 cache, DRAM) is just "memory"
- From the L2 cache's perspective, everything below (DRAM) is just "memory"
82 Memory Operation at Any Level
1. Cache receives a request (address)
2. Look for the item in the cache
3. Hit: return the data
4. Miss: request the block from memory (the next lower level), receive the data, update the cache, and return the data
88 Performance
- Hit: latency = access time
- Miss: latency = access time + miss penalty
Goal: minimize misses!!!
91 Performance
How does the memory system affect CPI?
Penalty on cache hit:
- hit time; frequently only 1 cycle is needed to access the cache on a hit
Penalty on cache miss:
- miss time: the time to get the data from a lower level of memory
CPI = 1 + memory stalls/instruction = 1 + (% miss) * (cache miss penalty)
92 L1 Cache's Perspective
L1's miss penalty contains the access of L2, and possibly the access of DRAM!!!
93 Multi-level Caches
- Base CPI 1.0, 500 MHz clock
- Main memory: 100 cycles; L2: 10 cycles
- L1 miss rate per instruction: 5%
- With L2, 2% of instructions go to DRAM
What is the speedup with the L2 cache?
94 Multi-level Caches
CPI = 1 + memory stalls / instruction
CPI_old = 1 + 5% miss/instr * 100 cycles/miss = 1 + 5 = 6 cycles/instr
CPI_new = 1 + L2% * L2 penalty + Mem% * Mem penalty
        = 1 + 5% * 10 + 2% * 100 = 3.5 cycles/instr
Speedup = 6 / 3.5 = 1.7
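The calculation above can be checked with a short sketch (variable names are mine; the figures are the slide's):

```python
# Multi-level cache arithmetic from the slides.
base_cpi = 1.0
l1_miss_per_instr = 0.05     # 5% of instructions miss in L1
dram_per_instr = 0.02        # with L2, 2% of instructions reach DRAM
l2_penalty, dram_penalty = 10, 100

cpi_old = base_cpi + l1_miss_per_instr * dram_penalty        # no L2: 6.0
cpi_new = (base_cpi + l1_miss_per_instr * l2_penalty
           + dram_per_instr * dram_penalty)                  # with L2: 3.5
speedup = cpi_old / cpi_new                                  # about 1.7
```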
98 Average Memory Access Time
AMAT = L1 access time + (L1 miss rate) * (L1 miss penalty)
L1 miss penalty = L2 access time + (L2 miss rate) * (L2 miss penalty)
L2 miss penalty = memory access time + (memory miss rate) * (memory miss penalty)
99 Calculate AMAT
Organization:
- L1 cache: access time 1 cycle, hit rate 90%
- L2 cache: access time 10 cycles, hit rate 95%
- Memory: access time 100 cycles, hit rate 100%
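The AMAT recurrence applied to these numbers can be sketched as (the recursive function is my own framing of the slide's formula):

```python
def amat(levels):
    # levels: list of (access_time, miss_rate) pairs, outermost first;
    # the last level never misses.
    time, miss_rate = levels[0]
    if len(levels) == 1:
        return time
    return time + miss_rate * amat(levels[1:])

# L1: 1 cycle, 10% miss; L2: 10 cycles, 5% miss; memory: 100 cycles.
result = amat([(1, 0.10), (10, 0.05), (100, 0.0)])
# 1 + 0.10 * (10 + 0.05 * 100) = 2.5 cycles
```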
100 Ways To Improve Cache Performance
Make the cache bigger:
- Pro: more can fit in the cache, so data doesn't have to get thrown out as often
- Con: a larger memory takes longer to access
Reduce the number of conflicts by increasing associativity:
- Pro: multiple memory lines that map to the same cache set can reside in the cache simultaneously
- Con: more time is needed to determine whether there is a hit, because multiple cache blocks must be checked
101 Ways To Improve Cache Performance
Use multiple levels of cache:
- The access time of a non-primary cache is less important; it is more important for it to have a low miss rate
- Pro: reduces the (average) miss penalty when there is a hit in a lower level of cache
- Con: takes up space and increases (worst-case) latency when an access misses in this level of cache
Make the block size larger to exploit spatial locality:
- Pro: fewer misses for sequential accesses
- Pro: decreases the bits dedicated to tags
- Con: fewer blocks in the cache for a given cache size
- Con: the miss penalty may be larger, because larger blocks must be retrieved from the lower level of the hierarchy
102 Example
2-way set associative cache:
- Block size = 4 words
- Total size = 32 words
Word addresses: 2, 35, 63, 110, 210, 77, 3, 97, 170
What is the hit rate?
103 Cache Writes
There are multiple copies of the data lying around:
- L1 cache, L2 cache, DRAM
Do we write to all of them? Do we wait for the write to complete before the processor can proceed?
104 Do we write to all of them?
- Write-through: write to all levels of the hierarchy
- Write-back: write to the lower level only when the cache line gets evicted from the cache
  - Creates inconsistent data: different values for the same item in the cache and DRAM
  - Inconsistent data in the highest level of cache is referred to as dirty
105 Write-Through
[diagram: sw $3, 0($5) updates L1, L2, and DRAM]
106 Write-Back
[diagram: sw $3, 0($5) updates only L1; lower levels are updated on eviction]
107 Write-through vs Write-back
- Which performs the write faster? Write-back: it only writes the L1 cache.
- Which has faster evictions from a cache? Write-through: no write is involved, just overwrite the tag.
- Which causes more bus traffic? Write-through: DRAM is written on every store, while write-back only writes on eviction.
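The traffic difference can be made concrete with a toy counter. A sketch, assuming a single-slot "cache" and a made-up store stream (purely illustrative, not the course's model):

```python
# Count writes that reach the lower level under each policy.
def lower_level_writes(stores, policy):
    held, dirty, writes = None, False, 0
    for block in stores:
        if block != held:                    # miss: evict the resident block
            if policy == 'write-back' and dirty:
                writes += 1                  # dirty block written back on eviction
            held, dirty = block, False
        if policy == 'write-through':
            writes += 1                      # every store goes to the lower level
        else:
            dirty = True                     # write-back: just mark the block dirty
    return writes

stores = ['A', 'A', 'A', 'B', 'B']           # blocks being stored to
# Write-through: 5 lower-level writes (one per store).
# Write-back: 1 (block A, written when B evicts it; B is still dirty
# in the cache and would be written whenever it is eventually evicted).
```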
108 Beyond The Cache: Memory
109 Memory System Design Challenges
- DRAM is designed for density, not speed
- DRAM is slower than the bus
- We are allowed to change the width, the number of DRAMs, and the bus protocol, but the access latency stays slow
- Widening anything increases the cost by quite a bit
110 Narrow Configuration [CPU - Cache - Bus - DRAM, one word wide]
Given:
- 1 clock cycle to send the request
- 15 cycles/word DRAM latency
- 1 cycle/word bus latency
If a cache block is 8 words, what is the miss penalty of an L2 cache miss?
1 cycle + 15 cycles/word * 8 words + 1 cycle/word * 8 words = 129 cycles
112 Wide Configuration [CPU - Cache - Bus - DRAM, two words wide]
Given:
- 1 clock cycle to send the request
- 15 cycles/2 words DRAM latency
- 1 cycle/2 words bus latency
If a cache block is 8 words, what is the miss penalty of an L2 cache miss?
1 cycle + 15 cycles/2 words * 8 words + 1 cycle/2 words * 8 words = 65 cycles
114 Interleaved Configuration [CPU - Cache - Bus - two DRAMs]
Byte 0 in DRAM 0, byte 1 in DRAM 1, byte 2 in DRAM 0, ...
Given:
- 1 clock cycle to send the request
- 15 cycles/word DRAM latency
- 1 cycle/word bus latency
If a cache block is 8 words, what is the miss penalty of an L2 cache miss?
The two DRAMs overlap their accesses (2 words per 15 cycles), but the bus still transfers one word at a time:
1 cycle + 15 cycles/2 words * 8 words + 1 cycle/word * 8 words = 69 cycles
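The three miss-penalty calculations can be checked together with a short sketch (variable names are mine; the figures are the slides'):

```python
# Miss penalty for an 8-word block: 1-cycle request,
# 15-cycle DRAM access, 1-cycle/word bus transfer.
BLOCK_WORDS, REQUEST, DRAM, BUS = 8, 1, 15, 1

# Narrow: one word at a time through DRAM and the bus.
narrow = REQUEST + DRAM * BLOCK_WORDS + BUS * BLOCK_WORDS              # 129
# Wide: two words at a time through both DRAM and the bus.
wide = REQUEST + DRAM * (BLOCK_WORDS // 2) + BUS * (BLOCK_WORDS // 2)  # 65
# Interleaved: two DRAMs overlap (2 words per 15 cycles),
# but the bus is still one word wide.
interleaved = REQUEST + DRAM * (BLOCK_WORDS // 2) + BUS * BLOCK_WORDS  # 69
```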
116 DRAM Optimizations
- Fast page mode: allow repeated accesses to the row buffer without paying another row access time
- Synchronous DRAM (SDRAM): add a clock signal to the DRAM interface to make it synchronous; a programmable register holds the number of bytes to transfer over many cycles
- Double Data Rate (DDR): transfer data on both the rising and falling clock edges instead of just one
117 DRAM Optimizations
Make the DRAM chip act like a memory system: each chip has interleaved memory and a high-speed interface.
- RDRAM: switch the RAS/CAS lines to a bus that allows multiple accesses to be in flight simultaneously; you don't have to wait for one DRAM request to finish before sending another
- Direct RDRAM: don't multiplex over one bus; have 3 buses: data, row, and column
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 1 Instructors: Nicholas Weaver & Vladimir Stojanovic http://inst.eecs.berkeley.edu/~cs61c/ Components of a Computer Processor
More informationCSE 2021: Computer Organization
CSE 2021: Computer Organization Lecture-12a Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB
More informationCS650 Computer Architecture. Lecture 9 Memory Hierarchy - Main Memory
CS65 Computer Architecture Lecture 9 Memory Hierarchy - Main Memory Andrew Sohn Computer Science Department New Jersey Institute of Technology Lecture 9: Main Memory 9-/ /6/ A. Sohn Memory Cycle Time 5
More informationCpE 442. Memory System
CpE 442 Memory System CPE 442 memory.1 Outline of Today s Lecture Recap and Introduction (5 minutes) Memory System: the BIG Picture? (15 minutes) Memory Technology: SRAM and Register File (25 minutes)
More informationPage 1. Multilevel Memories (Improving performance using a little cash )
Page 1 Multilevel Memories (Improving performance using a little cash ) 1 Page 2 CPU-Memory Bottleneck CPU Memory Performance of high-speed computers is usually limited by memory bandwidth & latency Latency
More informationChapter 5A. Large and Fast: Exploiting Memory Hierarchy
Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM
More informationMainstream Computer System Components
Mainstream Computer System Components Double Date Rate (DDR) SDRAM One channel = 8 bytes = 64 bits wide Current DDR3 SDRAM Example: PC3-12800 (DDR3-1600) 200 MHz (internal base chip clock) 8-way interleaved
More informationEN1640: Design of Computing Systems Topic 06: Memory System
EN164: Design of Computing Systems Topic 6: Memory System Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University Spring
More informationCSE 2021: Computer Organization
CSE 2021: Computer Organization Lecture-12 Caches-1 The basics of caches Shakil M. Khan Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 3 Instructors: Krste Asanović & Randy H. Katz http://inst.eecs.berkeley.edu/~cs61c/ 10/19/17 Fall 2017 - Lecture #16 1 Parallel
More informationregisters data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.
13 1 CMPE110 Computer Architecture, Winter 2009 Andrea Di Blas 110 Winter 2009 CMPE Cache Direct-mapped cache Reads and writes Cache associativity Cache and performance Textbook Edition: 7.1 to 7.3 Third
More informationThe Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs.
The Hierarchical Memory System The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. SRAM The Motivation for The Memory Hierarchy:
More informationCaches Part 1. Instructor: Sören Schwertfeger. School of Information Science and Technology SIST
CS 110 Computer Architecture Caches Part 1 Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University Slides based on UC Berkley's
More informationPage 1. Memory Hierarchies (Part 2)
Memory Hierarchies (Part ) Outline of Lectures on Memory Systems Memory Hierarchies Cache Memory 3 Virtual Memory 4 The future Increasing distance from the processor in access time Review: The Memory Hierarchy
More informationMemory Hierarchies &
Memory Hierarchies & Cache Memory CSE 410, Spring 2009 Computer Systems http://www.cs.washington.edu/410 4/26/2009 cse410-13-cache 2006-09 Perkins, DW Johnson and University of Washington 1 Reading and
More informationReducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip
Reducing Hit Times Critical Influence on cycle-time or CPI Keep L1 small and simple small is always faster and can be put on chip interesting compromise is to keep the tags on chip and the block data off
More informationregisters data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.
Cache associativity Cache and performance 12 1 CMPE110 Spring 2005 A. Di Blas 110 Spring 2005 CMPE Cache Direct-mapped cache Reads and writes Textbook Edition: 7.1 to 7.3 Second Third Edition: 7.1 to 7.3
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per
More informationCS152 Computer Architecture and Engineering Lecture 16: Memory System
CS152 Computer Architecture and Engineering Lecture 16: System March 15, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http://http.cs.berkeley.edu/~patterson
More informationECE468 Computer Organization and Architecture. Memory Hierarchy
ECE468 Computer Organization and Architecture Hierarchy ECE468 memory.1 The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Input Datapath Output Today s Topic:
More informationLocality. Cache. Direct Mapped Cache. Direct Mapped Cache
Locality A principle that makes having a memory hierarchy a good idea If an item is referenced, temporal locality: it will tend to be referenced again soon spatial locality: nearby items will tend to be
More informationChapter 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 5 Large and Fast: Exploiting Memory Hierarchy Review: Major Components of a Computer Processor Devices Control Memory Input Datapath Output Secondary Memory (Disk) Main Memory Cache Performance
More informationMemory. Objectives. Introduction. 6.2 Types of Memory
Memory Objectives Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured. Master the concepts
More informationRegisters. Instruction Memory A L U. Data Memory C O N T R O L M U X A D D A D D. Sh L 2 M U X. Sign Ext M U X ALU CTL INSTRUCTION FETCH
PC Instruction Memory 4 M U X Registers Sign Ext M U X Sh L 2 Data Memory M U X C O T R O L ALU CTL ISTRUCTIO FETCH ISTR DECODE REG FETCH EXECUTE/ ADDRESS CALC MEMOR ACCESS WRITE BACK A D D A D D A L U
More informationModern Computer Architecture
Modern Computer Architecture Lecture3 Review of Memory Hierarchy Hongbin Sun 国家集成电路人才培养基地 Xi an Jiaotong University Performance 1000 Recap: Who Cares About the Memory Hierarchy? Processor-DRAM Memory Gap
More informationECE7995 (4) Basics of Memory Hierarchy. [Adapted from Mary Jane Irwin s slides (PSU)]
ECE7995 (4) Basics of Memory Hierarchy [Adapted from Mary Jane Irwin s slides (PSU)] Major Components of a Computer Processor Devices Control Memory Input Datapath Output Performance Processor-Memory Performance
More informationMemory systems. Memory technology. Memory technology Memory hierarchy Virtual memory
Memory systems Memory technology Memory hierarchy Virtual memory Memory technology DRAM Dynamic Random Access Memory bits are represented by an electric charge in a small capacitor charge leaks away, need
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Caches Part 2 Instructors: Bernhard Boser & Randy H Katz http://insteecsberkeleyedu/~cs61c/ 10/18/16 Fall 2016 - Lecture #15 1 Outline
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University Large and Fast: Exploiting Memory Hierarchy The Basic of Caches Measuring & Improving Cache Performance Virtual Memory A Common
More informationCS3350B Computer Architecture
CS335B Computer Architecture Winter 25 Lecture 32: Exploiting Memory Hierarchy: How? Marc Moreno Maza wwwcsduwoca/courses/cs335b [Adapted from lectures on Computer Organization and Design, Patterson &
More informationLECTURE 10: Improving Memory Access: Direct and Spatial caches
EECS 318 CAD Computer Aided Design LECTURE 10: Improving Memory Access: Direct and Spatial caches Instructor: Francis G. Wolff wolff@eecs.cwru.edu Case Western Reserve University This presentation uses
More informationPipelining, Instruction Level Parallelism and Memory in Processors. Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010
Pipelining, Instruction Level Parallelism and Memory in Processors Advanced Topics ICOM 4215 Computer Architecture and Organization Fall 2010 NOTE: The material for this lecture was taken from several
More informationMultilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology
1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823
More informationCPE 631 Lecture 04: CPU Caches
Lecture 04 CPU Caches Electrical and Computer Engineering University of Alabama in Huntsville Outline Memory Hierarchy Four Questions for Memory Hierarchy Cache Performance 26/01/2004 UAH- 2 1 Processor-DR
More informationCycle Time for Non-pipelined & Pipelined processors
Cycle Time for Non-pipelined & Pipelined processors Fetch Decode Execute Memory Writeback 250ps 350ps 150ps 300ps 200ps For a non-pipelined processor, the clock cycle is the sum of the latencies of all
More informationCaching Basics. Memory Hierarchies
Caching Basics CS448 1 Memory Hierarchies Takes advantage of locality of reference principle Most programs do not access all code and data uniformly, but repeat for certain data choices spatial nearby
More information14:332:331. Week 13 Basics of Cache
14:332:331 Computer Architecture and Assembly Language Spring 2006 Week 13 Basics of Cache [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 Week131 Spring 2006
More informationCS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches
CS 61C: Great Ideas in Computer Architecture The Memory Hierarchy, Fully Associative Caches Instructor: Alan Christopher 7/09/2014 Summer 2014 -- Lecture #10 1 Review of Last Lecture Floating point (single
More informationMemory Hierarchy, Fully Associative Caches. Instructor: Nick Riasanovsky
Memory Hierarchy, Fully Associative Caches Instructor: Nick Riasanovsky Review Hazards reduce effectiveness of pipelining Cause stalls/bubbles Structural Hazards Conflict in use of datapath component Data
More informationCSF Improving Cache Performance. [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005]
CSF Improving Cache Performance [Adapted from Computer Organization and Design, Patterson & Hennessy, 2005] Review: The Memory Hierarchy Take advantage of the principle of locality to present the user
More informationTextbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, Textbook web site:
Textbook: Burdea and Coiffet, Virtual Reality Technology, 2 nd Edition, Wiley, 2003 Textbook web site: www.vrtechnology.org 1 Textbook web site: www.vrtechnology.org Laboratory Hardware 2 Topics 14:332:331
More information14:332:331. Week 13 Basics of Cache
14:332:331 Computer Architecture and Assembly Language Fall 2003 Week 13 Basics of Cache [Adapted from Dave Patterson s UCB CS152 slides and Mary Jane Irwin s PSU CSE331 slides] 331 Lec20.1 Fall 2003 Head
More informationLecture 11 Cache. Peng Liu.
Lecture 11 Cache Peng Liu liupeng@zju.edu.cn 1 Associative Cache Example 2 Associative Cache Example 3 Associativity Example Compare 4-block caches Direct mapped, 2-way set associative, fully associative
More informationThe Memory Hierarchy. Cache, Main Memory, and Virtual Memory (Part 2)
The Memory Hierarchy Cache, Main Memory, and Virtual Memory (Part 2) Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Cache Line Replacement The cache
More informationLecture 20: Memory Hierarchy Main Memory and Enhancing its Performance. Grinch-Like Stuff
Lecture 20: ory Hierarchy Main ory and Enhancing its Performance Professor Alvin R. Lebeck Computer Science 220 Fall 1999 HW #4 Due November 12 Projects Finish reading Chapter 5 Grinch-Like Stuff CPS 220
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste
More informationLecture 18: DRAM Technologies
Lecture 18: DRAM Technologies Last Time: Cache and Virtual Memory Review Today DRAM organization or, why is DRAM so slow??? Lecture 18 1 Main Memory = DRAM Lecture 18 2 Basic DRAM Architecture Lecture
More informationReview : Pipelining. Memory Hierarchy
CS61C L11 Caches (1) CS61CL : Machine Structures Review : Pipelining The Big Picture Lecture #11 Caches 2009-07-29 Jeremy Huddleston!! Pipeline challenge is hazards "! Forwarding helps w/many data hazards
More informationMemory Hierarchy: Caches, Virtual Memory
Memory Hierarchy: Caches, Virtual Memory Readings: 5.1-5.4, 5.8 Big memories are slow Computer Fast memories are small Processor Memory Devices Control Input Datapath Output Need to get fast, big memories
More informationMemory Hierarchy Technology. The Big Picture: Where are We Now? The Five Classic Components of a Computer
The Big Picture: Where are We Now? The Five Classic Components of a Computer Processor Control Datapath Today s Topics: technologies Technology trends Impact on performance Hierarchy The principle of locality
More informationChapter 6 Objectives
Chapter 6 Memory Chapter 6 Objectives Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.
More informationAdapted from David Patterson s slides on graduate computer architecture
Mei Yang Adapted from David Patterson s slides on graduate computer architecture Introduction Ten Advanced Optimizations of Cache Performance Memory Technology and Optimizations Virtual Memory and Virtual
More informationComputer Architecture. Memory Hierarchy. Lynn Choi Korea University
Computer Architecture Memory Hierarchy Lynn Choi Korea University Memory Hierarchy Motivated by Principles of Locality Speed vs. Size vs. Cost tradeoff Locality principle Temporal Locality: reference to
More informationCaches Concepts Review
Caches Concepts Review What is a block address? Why not bring just what is needed by the processor? What is a set associative cache? Write-through? Write-back? Then we ll see: Block allocation policy on
More informationCaches and Memory Hierarchy: Review. UCSB CS240A, Winter 2016
Caches and Memory Hierarchy: Review UCSB CS240A, Winter 2016 1 Motivation Most applications in a single processor runs at only 10-20% of the processor peak Most of the single processor performance loss
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #21: Caches 3 2005-07-27 CS61C L22 Caches III (1) Andy Carle Review: Why We Use Caches 1000 Performance 100 10 1 1980 1981 1982 1983
More informationMemory Hierarchy. Maurizio Palesi. Maurizio Palesi 1
Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio
More informationCENG4480 Lecture 09: Memory 1
CENG4480 Lecture 09: Memory 1 Bei Yu byu@cse.cuhk.edu.hk (Latest update: November 8, 2017) Fall 2017 1 / 37 Overview Introduction Memory Principle Random Access Memory (RAM) Non-Volatile Memory Conclusion
More informationMemory Hierarchy. ENG3380 Computer Organization and Architecture Cache Memory Part II. Topics. References. Memory Hierarchy
ENG338 Computer Organization and Architecture Part II Winter 217 S. Areibi School of Engineering University of Guelph Hierarchy Topics Hierarchy Locality Motivation Principles Elements of Design: Addresses
More informationCaches and Memory Hierarchy: Review. UCSB CS240A, Fall 2017
Caches and Memory Hierarchy: Review UCSB CS24A, Fall 27 Motivation Most applications in a single processor runs at only - 2% of the processor peak Most of the single processor performance loss is in the
More informationEEC 170 Computer Architecture Fall Improving Cache Performance. Administrative. Review: The Memory Hierarchy. Review: Principle of Locality
Administrative EEC 7 Computer Architecture Fall 5 Improving Cache Performance Problem #6 is posted Last set of homework You should be able to answer each of them in -5 min Quiz on Wednesday (/7) Chapter
More information5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3. Emil Sekerinski, McMaster University, Fall Term 2015/16
5. Memory Hierarchy Computer Architecture COMP SCI 2GA3 / SFWR ENG 2GA3 Emil Sekerinski, McMaster University, Fall Term 2015/16 Movie Rental Store You have a huge warehouse with every movie ever made.
More information