Background. Memory Hierarchies. Register File. Background. Forecast Memory (B5) Motivation for memory hierarchy Cache ECC Virtual memory.


EE365 Lecture Notes: Chapter 7

Memory Hierarchies

Forecast
  - Memory (B5)
  - Motivation for memory hierarchy
  - Cache
  - ECC
  - Virtual memory

Background

  Mem Element   Size     Speed      Price
  Register      small    1-5 ns     high??
  SRAM          medium   5-25 ns    $
  DRAM          large    ns         $5-$10
  Disk          large    10-20 ms   $

Background
  - Need a basic element to store a bit: latch, flip-flop, capacitor
  - Memory is logically a 2D array of #locations x data-width
  - e.g., 16 registers of 32 bits each form a 16 x 32 memory (4 address bits; 32 bits of data)
  - today's main memory chips are 8M x 8 (23 address bits; 8 bits of data)

Register File
  - 32 FFs in parallel => one register
  - 16 registers
  - one 16-way mux per read port
  - decode write enable
  - can use tri-state and bus for each port
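The address-bit counts quoted for the 16 x 32 register file and the 8M x 8 chip follow from the number of locations: N locations need ceil(log2 N) address bits. A minimal sketch of that arithmetic (the function name `address_bits` is only illustrative):

```python
def address_bits(locations: int) -> int:
    """Number of address bits needed to select one of `locations` entries."""
    if locations < 1:
        raise ValueError("locations must be positive")
    # For N >= 1, (N - 1).bit_length() equals ceil(log2(N)).
    return (locations - 1).bit_length()


# 16 registers of 32 bits each: 4 address bits.
register_file_bits = address_bits(16)

# 8M x 8 chip: 8 * 2**20 locations, 23 address bits.
dram_chip_bits = address_bits(8 * 2**20)
```

The data width (32 or 8 bits) does not affect the address width; it only sets how many bits come back per access.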

SRAM
  - Static RAM: does not lose data like DRAM
  - 6T CMOS cell
  - pass transistors as switch
  - bit lines, word lines
  - SRAM interface
  - Today: 2M x 8 in 5-15 ns
  - Typical large implementations: (512 x 64) x 8

DRAM
  - Dense memory
  - 1T cell
  - forgets data on read and after a while
  - e.g., 16M x 1 in a 4k x 4k array
  - 24 address bits: 12 for row and 12 for column
  - Implementation: write back the row to restore the destroyed value
  - Refresh: in background, march through reading all rows
  - Interface reflects internal organization: addr/2, RAS, CAS, data

Optimizations
  - Give faster access to some bits of a row
      static column: change column address
      page mode: change column address & CAS hit (EDO)
      nibble mode: fast access to 4 bits
  - Bigger changes in future
      bandwidth inside >> external bandwidth
      8 kb/50 ns/chip >> 8 b/50 ns/chip
      164 Gb/s >> 160 Mb/s (20 MB/s)
      RAMBUS, IRAM, etc.

Motivation for Hierarchy
  - CPU wants memory references/insn * bytes/reference * IPC / cycle time
  - 1.2 * 4 * 1 / 2 ns = 2.4 GB/s
  - CPU can go only as fast as memory can supply
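The two bandwidth figures above are simple rate calculations. The sketch below reproduces them with illustrative helper names (not from the notes); it assumes "8 kb" means 8 * 1024 bits, which is what yields the quoted 164 Gb/s.

```python
def cpu_demand_bytes_per_sec(refs_per_insn, bytes_per_ref, ipc, cycle_time_s):
    """Bytes per second the CPU asks of memory."""
    return refs_per_insn * bytes_per_ref * ipc / cycle_time_s


def bits_per_sec(bits_per_access, access_time_s):
    """Raw bit rate of a memory that delivers bits_per_access every access."""
    return bits_per_access / access_time_s


# 1.2 references/insn * 4 bytes * 1 insn/cycle at a 2 ns cycle: about 2.4 GB/s.
demand = cpu_demand_bytes_per_sec(1.2, 4, 1, 2e-9)

# A whole 8 kb row every 50 ns inside the chip: about 164 Gb/s.
internal = bits_per_sec(8 * 1024, 50e-9)

# Only 8 bits every 50 ns at the pins: 160 Mb/s, i.e. 20 MB/s.
external = bits_per_sec(8, 50e-9)
```

Under these numbers the internal row bandwidth is roughly a thousand times the pin bandwidth, which is the gap the "bigger changes in future" bullet points at.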

Motivation for Hierarchy
  - Want memory with
      fast access (e.g., one 2 ns CPU cycle)
      large capacity (1 GB)
      low cost ($1/MB)
  - Incompatible requirements
  - Fortunately, memory references are not random!

  - Locality in time (temporal locality): if a datum was recently referenced, it is likely to be referenced again soon
  - Locality in space (spatial locality): if a datum was recently referenced, neighbouring data is likely to be referenced soon

  - E.g., researching a term paper: you don't look at all books at random
      if you look at a chapter in one book
        temporal: may re-read the chapter again
        spatial: may read neighbouring chapters
  - Solution: leave the book on the desk for a while
      hit: book on desk
      miss: book not on desk
      miss ratio: fraction not on desk

  - Memory access time = access-desk + miss-ratio * access-shelf << 100
  - Extend this to several levels of hierarchy
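The desk/shelf formula extends to several levels by applying it from the slowest level inward. The following is a sketch only; the numeric values in the usage lines are made up for illustration and are not taken from the notes.

```python
def average_access_time(levels, backing_time):
    """Average access time of a hierarchy.

    levels: list of (hit_time, miss_ratio) pairs, fastest level first.
    backing_time: access time of the last level, which always hits.
    """
    time = backing_time
    # Fold from the slowest level toward the CPU:
    # time_i = hit_time_i + miss_ratio_i * time_(i+1)
    for hit_time, miss_ratio in reversed(levels):
        time = hit_time + miss_ratio * time
    return time


# One level (desk + shelf), hypothetical numbers: 1 + 0.05 * 100
desk_and_shelf = average_access_time([(1, 0.05)], backing_time=100)

# Two levels in front of the same backing store, hypothetical numbers.
two_levels = average_access_time([(1, 0.05), (10, 0.2)], backing_time=100)
```

With these illustrative values the single-level case comes to about 6 time units, far below the 100 of the backing store, and adding a second level brings it down further.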

Memory Hierarchy
  - Small, fast, expensive memory
  - larger, slower, cheaper memory
  - ...
  - largest, slowest, cheapest memory

  [Figure: CPU, L1, L2, L3, ..., Ln, with labels "larger" and "20 x faster"]

Memory Hierarchy

  Type          Size
  Register      < 1 KB
  Cache         < 512 KB
  Main memory   < 512 MB
  Disk          > 1 GB

  [Other columns: Speed (ns), Bandwidth, Price ($/MB); the only surviving entry is ">> 100" in the Register row]

Memory Hierarchy
  - Registers <-> Main memory: managed by compiler/programmer
      holds expression temporaries
      holds variables: more aggressive register allocation
      spill when needed
      hard!

  - Main memory <-> Disk: managed by
      program: explicit I/O
      operating system: virtual memory
        illusion of larger memory
        protection
        transparent to user

Cache
  - cache managed by hardware
  - keep recently accessed block: temporal locality
  - break memory into blocks (several bytes): spatial locality
  - transfer data to/from cache in blocks

  [Figure: CPU <-> $ <-> Main Memory]

Cache
  - put block in block frame
      state (e.g., valid)
      address tag
      data

Cache
  on memory access
    if incoming tag == stored tag then
      HIT
    else
      MISS
      << replace old block >>
      get block from memory
      put block in cache
    return appropriate word within block

Cache Example
  Memory words:

    address   value
    0x11c     0xe0e0e0e0
    0x120     0xffffffff
    0x124     0x
    0x128     0x
    0x12c     0x
    0x130     0xabababab

Cache Example
  A 16-byte cache block frame:

    state     tag       data
    invalid   0x?????

  lw $4, 0x128
    - Is tag 0x120 in cache? (block address = 0x128 - (0x128 mod 16) = 0x128 & 0xfffffff0 = 0x120)
    - No, get block

    state     tag       data
    valid     0x120     0xffffffff, 0x1, 0x7, 0x3

    - Return 0x7 to CPU to put in $4

  lw $5, 0x124
    - Is tag 0x120 in cache?
    - Yes, return 0x1 to CPU

Cache Example
  - Often
      cache: 1 cycle
      main memory: 20 cycles
  - Performance for data accesses with miss ratio 0.1
      mean access = cache access + miss ratio * main memory access
                  = ... * 20 = 1.2
  - Typically caches 64K, main memory 64M
      20 times faster
      1/1000 capacity
      but contains 98% of references

Cache
  - 4 questions
      where is a block placed?
      how is a block found?
      which block is replaced?
      what happens on a write?
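The two loads in the example can be replayed with a one-frame cache. This is a sketch, not the notes' own code: the class and variable names are illustrative, the memory contents are the four words shown in the valid block frame, and, as in the example, the full block address (0x120) is used as the tag.

```python
BLOCK_SIZE = 16  # bytes per block, as in the example
WORD_SIZE = 4    # bytes per word

# Words of the block at 0x120, taken from the valid block frame in the example.
memory = {0x120: 0xFFFFFFFF, 0x124: 0x1, 0x128: 0x7, 0x12C: 0x3}


class OneFrameCache:
    """A cache with a single block frame: state (valid), tag, data."""

    def __init__(self):
        self.valid = False
        self.tag = None
        self.data = []

    def load_word(self, addr):
        # Block address: clear the low 4 bits, same as addr & 0xfffffff0.
        block_addr = addr & ~(BLOCK_SIZE - 1)
        hit = self.valid and self.tag == block_addr
        if not hit:
            # MISS: replace the old block with the one fetched from memory.
            self.data = [memory[block_addr + i]
                         for i in range(0, BLOCK_SIZE, WORD_SIZE)]
            self.tag = block_addr
            self.valid = True
        # Return the appropriate word within the block.
        word = self.data[(addr % BLOCK_SIZE) // WORD_SIZE]
        return hit, word


cache = OneFrameCache()
first = cache.load_word(0x128)   # lw $4, 0x128: miss, then 0x7
second = cache.load_word(0x124)  # lw $5, 0x124: hit, 0x1
```

The second load hits only because it falls in the same 16-byte block as the first, which is spatial locality at work.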

Cache Design
  - Simple cache first
      block size = 1 word
      direct-mapped
      16K words (64 KB)
      index: 14 bits
      tag: 16 bits
  - Hit
  - Miss: place? replace?

Cache Design
  - Build 64K with 16-byte blocks
  - What if blocks conflict?
  - Fully associative cache
      CAM cells hold D and ~D; incoming bits B and ~B
      match = AND over i of (B_i * D_i + ~B_i * ~D_i)
  - compromise: set associative cache

Cache Design
  - 3C model
      Conflict
      Capacity
      Compulsory
  - Q3. Which block is replaced?
      LRU
      random

Cache Design
  - Q4. What happens on a write?
      write hit must be slower
      propagate to memory?
        immediately: write-through
        on replacement: write-back
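The index and tag widths for the simple direct-mapped cache can be derived from the cache and block sizes. The sketch below assumes 32-bit byte addresses and power-of-two sizes; the function name is illustrative.

```python
def direct_mapped_fields(cache_bytes, block_bytes, address_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a direct-mapped cache.

    Assumes cache_bytes and block_bytes are powers of two.
    """
    num_blocks = cache_bytes // block_bytes
    offset_bits = block_bytes.bit_length() - 1   # log2(block size in bytes)
    index_bits = num_blocks.bit_length() - 1     # log2(number of block frames)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


# 16K one-word (4-byte) blocks = 64 KB: 16-bit tag, 14-bit index, 2-bit offset.
simple = direct_mapped_fields(64 * 1024, 4)

# The same 64 KB built from 16-byte blocks: 16-bit tag, 12-bit index, 4-bit offset.
wider_blocks = direct_mapped_fields(64 * 1024, 16)
```

Going to 16-byte blocks leaves the tag width unchanged here; two index bits simply move into the block offset.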

Cache Design
  - Exploit spatial locality
      bigger block size
      may increase miss penalty
  - Reduce conflicts
      more associativity
      may increase cache hit time

Cache Design
  - Unified vs. split instruction and data cache
  - Example
      consider building 16K I and D caches or a 32K unified cache
      let t_cache be 1 cycle and t_memory be 10 cycles

Cache Design
  - I and D split cache
      (a) I miss is 5% and D miss is 6%
      75% of references are instruction fetches
      t_avg = (1 + 0.05*10) * 0.75 + (1 + 0.06*10) * 0.25 ≈ 1.5
  - Unified cache
      t_avg = 1 + (unified miss ratio) * 10 = 1.4
      WRONG!
      t_avg = 1.4 + cycles-lost-to-interference
      will cycles-lost-to-interference be < 0.1?
      NOT for modern pipelined processors!

Cache Design
  - Multi-level caches
  - Many systems today have a cache hierarchy
  - E.g.,
      16K I-cache
      16K D-cache
      1M L2-cache
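The split-versus-unified comparison can be checked numerically. In this sketch the I and D miss ratios, the 75% instruction-fetch share, and the 1- and 10-cycle times come from the example; the unified miss ratio of 0.04 is an assumption, inferred from the stated result of 1.4 cycles.

```python
def t_avg(t_cache, miss_ratio, t_memory):
    """Average access time of one cache in front of memory."""
    return t_cache + miss_ratio * t_memory


T_CACHE, T_MEMORY = 1, 10

# Split caches: weight each cache by its share of the references.
split = (0.75 * t_avg(T_CACHE, 0.05, T_MEMORY)
         + 0.25 * t_avg(T_CACHE, 0.06, T_MEMORY))
# 0.75 * 1.5 + 0.25 * 1.6 = 1.525, quoted as 1.5 in the example.

UNIFIED_MISS = 0.04  # assumption: inferred from t_avg = 1.4, not given in the notes
unified = t_avg(T_CACHE, UNIFIED_MISS, T_MEMORY)

# The unified cache wins only if interference costs less than this many cycles.
interference_budget = split - unified
```

The budget comes to roughly 0.1 cycles per reference, which is why the example asks whether instruction and data accesses colliding in one cache can stay under that.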

Cache Design
  - Why?
      Processors are getting faster w.r.t. main memory
      want larger caches to reduce the frequency of costly misses
      but larger caches are slower!
  - Solution: reduce the cost of misses with a second-level cache
      exploits today's technology
        can't put a large cache on the microprocessor
        board designer can vary cost/performance

CPU and Cache Performance
  - Cache only
      miss ratio
      average access time
  - Integrate: assume cache hits are part of the pipeline
      Time/prog = insn/prog * cycles/insn * sec/cycle
      CPI = (execution cycles + stall cycles)/insn
      CPI = execution cycles/insn + stall cycles/insn

CPU and Cache Performance
  - Stall cycles/insn = read stall cycles/insn + write stall cycles/insn
  - read stall cycles/insn = reads/insn * miss ratio * read miss penalty
  - write stall cycles/insn = more complex: write through, write back, write buffer?

CPU and Cache Performance
  - Example
      CPI with ideal memory is 1.5
      Assume IF and writes never stall
      How is CPI degraded if
        loads are 25% of all insns
        loads miss 10% and the miss cost is 20 cycles
      CPI = 1.5 + 0.25 * 0.10 * 20 = 2
      2/1.5 = 1.33, i.e. 33% slower
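The CPI example is a direct application of the read-stall formula. A minimal sketch with an illustrative function name, assuming (as the example does) that instruction fetches and writes never stall:

```python
def cpi_with_read_stalls(ideal_cpi, loads_per_insn, miss_ratio, miss_penalty):
    """CPI = execution cycles/insn + read stall cycles/insn."""
    read_stalls = loads_per_insn * miss_ratio * miss_penalty
    return ideal_cpi + read_stalls


IDEAL_CPI = 1.5
cpi = cpi_with_read_stalls(IDEAL_CPI, loads_per_insn=0.25,
                           miss_ratio=0.10, miss_penalty=20)
slowdown = cpi / IDEAL_CPI  # ratio of real to ideal execution time
```

Here the loads add 0.5 stall cycles per instruction, so the program runs about a third slower than with ideal memory.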

Main Memory
  - Each memory access
      1 cycle address
      5 cycles DRAM (really 10+)
      1 cycle data
  - 4-word cache block
  - One word wide:
      a dddddb dddddb dddddb dddddb
      1 + 4*(5+1) = 25
  - Four words wide:
      a dddddb
      1 + 5 + 1 = 7
  - Interleaved (pipelined):
      a ddddd b
         ddddd b
          ddddd b
           ddddd b
      1 + 5 + 4*1 = 10

Error Correcting Codes (ECC)
  - Read ECC stuff in Appendix B
  - Assume a small number of random errors: bit(s) get flipped
  - So in 1 word, likelihood of
      no errors > single error > two errors > more than 2 errors
  - Detection: signal a problem
  - Correction: restore data to correct value
  - Most common
      Parity: single error detection
      SECDED: single error correction; double bit detection

ECC

  Power     correct codes   #bits   comments
  nothing   0, 1            1
  SED       00, 11          2       01, 10 detect errors
  SEC       000, 111        3       001, 010, 100 => 000
                                    110, 101, 011 => 111
  SECDED    0000, 1111      4       one 1 => 0000
                                    two 1's => error
                                    three 1's => 1111
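The three miss-penalty totals for a 4-word block can be reproduced from the per-step costs (1 cycle address, 5 cycles DRAM, 1 cycle data). The function names below are illustrative, and the interleaved case assumes one bank per word of the block.

```python
ADDR, DRAM, XFER = 1, 5, 1  # cycles, as given for each memory access


def one_word_wide(words):
    # Send the address once, then access and transfer each word in turn.
    return ADDR + words * (DRAM + XFER)


def block_wide(words):
    # Memory and bus as wide as the block: one access, one transfer.
    return ADDR + DRAM + XFER


def interleaved(words):
    # Banks overlap their DRAM accesses; words then return one per cycle.
    return ADDR + DRAM + words * XFER


penalties = (one_word_wide(4), block_wide(4), interleaved(4))
```

Interleaving gets most of the benefit of the wide organization while keeping a one-word bus.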

ECC
  - For SECDED (4-bit code)

      # 1's    0   1   2       3   4
      result   0   0   error   1   1

  - Hamming distance: number of changes needed to convert one code to another
  - All legal SECDED codes must be at Hamming distance 4

ECC
  - Reduce overhead by doing codes on a word, not a bit

      # bits   SED overhead   SECDED overhead
      1        1 (100%)       3 (300%)
      32       1 (3%)         7 (22%)
      64       1 (1.6%)       8 (13%)
      n        1 (1/n)        1 + log2 n + a little

ECC
  - 64 bits data, 8 bits check
      dddd...d cccccccc
  - use eight by-9 SIMMs = 72 bits
  - Intuition
      one check bit is parity
      other check bits point to
        error in data
        error in check bits
        no error

ECC
  - To store
      use data_0 to compute check_0
      store data_0 and check_0
  - To load
      read data_1 and check_1
      use data_1 to compute check_2
      syndrome = check_1 xor check_2
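The SECDED column of the overhead table matches the standard Hamming-code bound, which is not spelled out in the notes: r check bits can correct a single error in n data bits when 2^r >= n + r + 1, and one extra overall parity bit adds double-error detection. A sketch under that assumption, with illustrative function names:

```python
def sec_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single error correction)."""
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """SEC check bits plus one overall parity bit for double error detection."""
    return sec_check_bits(data_bits) + 1


# (data bits, SECDED check bits, overhead in percent) for the table's rows.
rows = [(n, secded_check_bits(n), 100 * secded_check_bits(n) / n)
        for n in (1, 32, 64)]
```

For 64 data bits this gives 8 check bits, which is where the 72-bit (64 + 8) memory word comes from.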

Virtual Memory
  - Basic idea
      move data between disk and main memory, as caches do to/from main memory
  - But the miss penalty for the first byte is 1M cycles, far more than a cache miss
      therefore engineered differently
  - later, we will return to the 4 questions

Virtual Memory
  - Blocks are called pages
      typically 4K-16K
      fixed size per system
  - [Picture]

Virtual Memory
  - Architecture presents programs with a simple view
      memory addressed with 32-bit addresses
      lw $1, 0x... => 0x... is the virtual address (VA)
  - system maps VA to physical address (PA)
      0x... -> 0xF028 (page 15, offset 0x28 for a 4K page)
  - someone else and I run unrelated programs
      each executes lw $1, 0x...
      the same VA must map to a different PA

Virtual Memory
  - Thus, virtual memory allows a program to
      use more memory than the system physically has
      think it is the only program running in memory
      think it always starts at address 0x0
      be protected from rogue programs
      start running when most of the program is still on disk

Virtual Memory
  - A VA miss is called a page fault
      an exception that saves the PC
      OS gains control and initiates disk access
      OS usually runs someone else in the meantime
      interrupt when disk access is complete
      original instruction restarts
  - Unlike cache misses, why is the OS used to handle a page fault?

Address Translation
  - VA -> PA
  - E.g., 4K pages
  - Use page tables of 4B PTEs
      index with the virtual page number
      address of PTE = PTBR + page number * 4

Address Translation
  - PTE contains
      page frame number
      valid bit
      protection bits
  - Each program has its own PT; switch by changing PTBR
  - VM causes 100% overhead: 2 memory accesses (PTE + data)!
  - What to do? Use temporal and spatial locality

Translation Buffer
  - Translation (Lookaside) Buffer
      a cache of translations

      valid   tag     data
      valid   page#   page frame#, rest of PTE

  - could make it fully associative, set associative, or direct mapped
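Translation through a flat page table can be sketched as follows. This is an illustration under stated assumptions, not the notes' implementation: the page table is modeled as a Python dict, the names `pte_address`, `translate`, and `PageFault` are hypothetical, and the virtual address 0x2028 is made up so that it lands on the 0xF028 result quoted earlier (frame 15, offset 0x28).

```python
PAGE_SIZE = 4096  # 4K pages
PTE_SIZE = 4      # 4-byte page table entries


class PageFault(Exception):
    """Raised when the page is not resident; the OS would then fetch it from disk."""


def pte_address(ptbr, vaddr):
    """Address of the PTE: page table base + virtual page number * PTE size."""
    page_number = vaddr // PAGE_SIZE
    return ptbr + page_number * PTE_SIZE


def translate(page_table, vaddr):
    """Map a virtual address to a physical address through the page table."""
    page_number, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(page_number)
    if entry is None or not entry["valid"]:
        raise PageFault(hex(vaddr))
    return entry["frame"] * PAGE_SIZE + offset


# Hypothetical mapping: virtual page 2 lives in physical page frame 15.
page_table = {2: {"valid": True, "frame": 15}}
physical = translate(page_table, 0x2028)
entry_addr = pte_address(0x1000, 0x2028)  # PTBR value 0x1000 is illustrative
```

The page offset passes through unchanged; only the page number is replaced by the frame number, and the table is indexed by page number, not by offset.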

Example
  - 64 entries, FA, maps 64 * 4K = 256 KB
  - [Figure]
  - Virtual address caches are also possible
      faster, but synonym problem

Virtual Memory
  - 4 Questions
  - Where is a page placed?
      fully associative: any page in any frame
  - How is a page found?
      not by associative search but by indirection through the PT
  - On context switch
      change PTBR
      either flush the TLB or add PIDs to TLB tags

Virtual Memory
  - Which page is replaced?
      approx LRU
      clock
      use page reference bit
  - What happens on a write?
      write-backs
      use page dirty bit

Protection
  - User VAs map to different PAs: no overlap
  - But may want sharing
      user-user
      user-kernel (mode bit, syscall interface)
  - In PTE and TLB entry
      invalid (had before)
      read-only
      read-write (had before)

Page Table Size
  - How big is the PT?
      2^32 / 4K * 4 = 4M per program
  - To make it smaller
      define a limit register
      use limit registers for a few regions: stack, heap
      page a part of the PT (terminate recursion)
      segmented VA (noncontiguous alloc, segment table -> PT)
      use a hash table to map PA-VA: called an inverted PT

More Optimizations
  - Non-blocking caches
      handle hits under misses
  - Interleaved/banked caches
      multiple requests simultaneously (poor man's multiporting)
  - Write buffers
      miss penalty of dirty blocks
  - Out-of-order CPU
      tolerate cache hit and miss latencies

More Optimizations
  - Compiler optimizations
      get rid of memory accesses (register allocation, reuse)
      improve locality (blocking, tiling)
      insert prefetch
      code scheduling

Real Stuff
  - DEC Alpha (550 MHz)
      4-way out-of-order CPU pipeline
  - L1 cache
      2 loads/stores per cycle (phase pipelined)
      3 cycles hit latency, 8+ GB/s bandwidth
  - L2 cache
      12 cycle hit latency, 4+ GB/s bandwidth
  - System interface
      64 bit bus, 80 cycle latency, 2+ GB/s bandwidth
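The 4M figure for a flat page table is one PTE per virtual page. A small sketch of that arithmetic (the function name is illustrative):

```python
def flat_page_table_bytes(va_bits, page_bytes, pte_bytes):
    """Size of a single-level page table with one PTE per virtual page."""
    num_pages = 2 ** va_bits // page_bytes
    return num_pages * pte_bytes


# 32-bit VA, 4K pages, 4-byte PTEs: 2**20 entries * 4 bytes = 4 MB per program.
size = flat_page_table_bytes(32, 4 * 1024, 4)
size_mb = size // 2 ** 20
```

Because every program needs its own table, this cost is paid per process, which motivates the size-reduction techniques listed above.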

Real Stuff

  Characteristic   Pentium Pro        PowerPC
  VA               32 bits            52 bits
  PA               32 bits            32 bits
  Page size        4 KB, 4 MB         4 KB, selectable, 256 MB
  TLB              split I and D      split I and D
                   4-way assoc        2-way assoc
                   pseudo random      LRU
                   I - 32, D - 64     I - 128, D - 128
                   TLB miss H/W       TLB miss H/W

Real Stuff

  Characteristic   Pentium Pro        PowerPC
  cache            split I and D      split I and D
  size             8 KB each          16 KB each
  assoc            4-way              4-way
  replace          approx LRU         LRU
  block            32 bytes           32 bytes
  write            write-back         write-back or write-through

Summary
  - Temporal and spatial locality, memory hierarchy
  - Cache design: block size, associativity, write back/through
  - Multilevel cache hierarchies
  - Virtual memory, translation (VA -> PA), page table (PT)
  - VM design: page size, FA through PT, reference bit, dirty bit
  - Fast translations: TLB
  - Protection, page faults (exceptions)

Summary
  - 4 Questions: cache, VM, TLB
  - Where can a block be placed?
      one (DM), a few (SA), any (FA)
  - How is a block found?
      indexing (DM), search (SA/FA), table lookup (PT)
  - What is replaced on a miss?
      LRU or random
  - How are writes handled?
      write through or write back; write back for VM

[Figure: Main Memory (Fig. 7.13). Panels show CPU, cache, bus, and memory; b. wide memory organization (with a multiplexor between CPU and cache); c. interleaved memory organization (memory banks 0 to 3).]


More information

Locality. Cache. Direct Mapped Cache. Direct Mapped Cache

Locality. Cache. Direct Mapped Cache. Direct Mapped Cache Locality A principle that makes having a memory hierarchy a good idea If an item is referenced, temporal locality: it will tend to be referenced again soon spatial locality: nearby items will tend to be

More information

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II

ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Organization Part II Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn,

More information

1. Memory technology & Hierarchy

1. Memory technology & Hierarchy 1 Memory technology & Hierarchy Caching and Virtual Memory Parallel System Architectures Andy D Pimentel Caches and their design cf Henessy & Patterson, Chap 5 Caching - summary Caches are small fast memories

More information

Chapter Seven Morgan Kaufmann Publishers

Chapter Seven Morgan Kaufmann Publishers Chapter Seven Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored as a charge on capacitor (must be

More information

HY225 Lecture 12: DRAM and Virtual Memory

HY225 Lecture 12: DRAM and Virtual Memory HY225 Lecture 12: DRAM and irtual Memory Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS May 16, 2011 Dimitrios S. Nikolopoulos Lecture 12: DRAM and irtual Memory 1 / 36 DRAM Fundamentals Random-access

More information

Topic 18 (updated): Virtual Memory

Topic 18 (updated): Virtual Memory Topic 18 (updated): Virtual Memory COS / ELE 375 Computer Architecture and Organization Princeton University Fall 2015 Prof. David August 1 Virtual Memory Any time you see virtual, think using a level

More information

COEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory

COEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory 1 COEN-4730 Computer Architecture Lecture 3 Review of Caches and Virtual Memory Cristinel Ababei Dept. of Electrical and Computer Engineering Marquette University Credits: Slides adapted from presentations

More information

COSC 6385 Computer Architecture - Memory Hierarchies (I)

COSC 6385 Computer Architecture - Memory Hierarchies (I) COSC 6385 Computer Architecture - Memory Hierarchies (I) Edgar Gabriel Spring 2018 Some slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05

More information

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 5. Memory Hierachy BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 5 Memory Hierachy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic

More information

COSC3330 Computer Architecture Lecture 20. Virtual Memory

COSC3330 Computer Architecture Lecture 20. Virtual Memory COSC3330 Computer Architecture Lecture 20. Virtual Memory Instructor: Weidong Shi (Larry), PhD Computer Science Department University of Houston Virtual Memory Topics Reducing Cache Miss Penalty (#2) Use

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

Memory Hierarchies 2009 DAT105

Memory Hierarchies 2009 DAT105 Memory Hierarchies Cache performance issues (5.1) Virtual memory (C.4) Cache performance improvement techniques (5.2) Hit-time improvement techniques Miss-rate improvement techniques Miss-penalty improvement

More information

Chapter 5. Topics in Memory Hierachy. Computer Architectures. Tien-Fu Chen. National Chung Cheng Univ.

Chapter 5. Topics in Memory Hierachy. Computer Architectures. Tien-Fu Chen. National Chung Cheng Univ. Computer Architectures Chapter 5 Tien-Fu Chen National Chung Cheng Univ. Chap5-0 Topics in Memory Hierachy! Memory Hierachy Features: temporal & spatial locality Common: Faster -> more expensive -> smaller!

More information

COMPUTER ORGANIZATION AND DESIGN

COMPUTER ORGANIZATION AND DESIGN COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 5 Large and Fast: Exploiting Memory Hierarchy Principle of Locality Programs access a small proportion of their address

More information

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading Review on ILP TDT 4260 Chap 5 TLP & Hierarchy What is ILP? Let the compiler find the ILP Advantages? Disadvantages? Let the HW find the ILP Advantages? Disadvantages? Contents Multi-threading Chap 3.5

More information

COSC 6385 Computer Architecture. - Memory Hierarchies (I)

COSC 6385 Computer Architecture. - Memory Hierarchies (I) COSC 6385 Computer Architecture - Hierarchies (I) Fall 2007 Slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05 Recap

More information

Memory latency: Affects cache miss penalty. Measured by:

Memory latency: Affects cache miss penalty. Measured by: Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row. Static RAM may be used for main memory

More information

Memory latency: Affects cache miss penalty. Measured by:

Memory latency: Affects cache miss penalty. Measured by: Main Memory Main memory generally utilizes Dynamic RAM (DRAM), which use a single transistor to store a bit, but require a periodic data refresh by reading every row. Static RAM may be used for main memory

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface COEN-4710 Computer Hardware Lecture 7 Large and Fast: Exploiting Memory Hierarchy (Chapter 5) Cristinel Ababei Marquette University Department

More information

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1 Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio

More information

Spring 2018 :: CSE 502. Cache Design Basics. Nima Honarmand

Spring 2018 :: CSE 502. Cache Design Basics. Nima Honarmand Cache Design Basics Nima Honarmand Storage Hierarchy Make common case fast: Common: temporal & spatial locality Fast: smaller, more expensive memory Bigger Transfers Registers More Bandwidth Controlled

More information

CPS101 Computer Organization and Programming Lecture 13: The Memory System. Outline of Today s Lecture. The Big Picture: Where are We Now?

CPS101 Computer Organization and Programming Lecture 13: The Memory System. Outline of Today s Lecture. The Big Picture: Where are We Now? cps 14 memory.1 RW Fall 2 CPS11 Computer Organization and Programming Lecture 13 The System Robert Wagner Outline of Today s Lecture System the BIG Picture? Technology Technology DRAM A Real Life Example

More information

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds.

Review: Performance Latency vs. Throughput. Time (seconds/program) is performance measure Instructions Clock cycles Seconds. Performance 980 98 982 983 984 985 986 987 988 989 990 99 992 993 994 995 996 997 998 999 2000 7/4/20 CS 6C: Great Ideas in Computer Architecture (Machine Structures) Caches Instructor: Michael Greenbaum

More information

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp.

registers data 1 registers MEMORY ADDRESS on-chip cache off-chip cache main memory: real address space part of virtual addr. sp. Cache associativity Cache and performance 12 1 CMPE110 Spring 2005 A. Di Blas 110 Spring 2005 CMPE Cache Direct-mapped cache Reads and writes Textbook Edition: 7.1 to 7.3 Second Third Edition: 7.1 to 7.3

More information

CHAPTER 4 MEMORY HIERARCHIES TYPICAL MEMORY HIERARCHY TYPICAL MEMORY HIERARCHY: THE PYRAMID CACHE PERFORMANCE MEMORY HIERARCHIES CACHE DESIGN

CHAPTER 4 MEMORY HIERARCHIES TYPICAL MEMORY HIERARCHY TYPICAL MEMORY HIERARCHY: THE PYRAMID CACHE PERFORMANCE MEMORY HIERARCHIES CACHE DESIGN CHAPTER 4 TYPICAL MEMORY HIERARCHY MEMORY HIERARCHIES MEMORY HIERARCHIES CACHE DESIGN TECHNIQUES TO IMPROVE CACHE PERFORMANCE VIRTUAL MEMORY SUPPORT PRINCIPLE OF LOCALITY: A PROGRAM ACCESSES A RELATIVELY

More information

Course Administration

Course Administration Spring 207 EE 363: Computer Organization Chapter 5: Large and Fast: Exploiting Memory Hierarchy - Avinash Kodi Department of Electrical Engineering & Computer Science Ohio University, Athens, Ohio 4570

More information

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics

Random-Access Memory (RAM) Systemprogrammering 2007 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics Systemprogrammering 27 Föreläsning 4 Topics The memory hierarchy Motivations for VM Address translation Accelerating translation with TLBs Random-Access (RAM) Key features RAM is packaged as a chip. Basic

More information

This Unit: Main Memory. Virtual Memory. Virtual Memory. Other Uses of Virtual Memory

This Unit: Main Memory. Virtual Memory. Virtual Memory. Other Uses of Virtual Memory This Unit: Virtual Application OS Compiler Firmware I/O Digital Circuits Gates & Transistors hierarchy review DRAM technology A few more transistors Organization: two level addressing Building a memory

More information

Memory Hierarchy and Caches

Memory Hierarchy and Caches Memory Hierarchy and Caches COE 301 / ICS 233 Computer Organization Dr. Muhamed Mudawar College of Computer Sciences and Engineering King Fahd University of Petroleum and Minerals Presentation Outline

More information

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy

Random-Access Memory (RAM) Systemprogrammering 2009 Föreläsning 4 Virtual Memory. Locality. The CPU-Memory Gap. Topics! The memory hierarchy Systemprogrammering 29 Föreläsning 4 Topics! The memory hierarchy! Motivations for VM! Address translation! Accelerating translation with TLBs Random-Access (RAM) Key features! RAM is packaged as a chip.!

More information

Computer Architecture Spring 2016

Computer Architecture Spring 2016 Computer Architecture Spring 2016 Lecture 08: Caches III Shuai Wang Department of Computer Science and Technology Nanjing University Improve Cache Performance Average memory access time (AMAT): AMAT =

More information

Cache Performance (H&P 5.3; 5.5; 5.6)

Cache Performance (H&P 5.3; 5.5; 5.6) Cache Performance (H&P 5.3; 5.5; 5.6) Memory system and processor performance: CPU time = IC x CPI x Clock time CPU performance eqn. CPI = CPI ld/st x IC ld/st IC + CPI others x IC others IC CPI ld/st

More information

The Memory Hierarchy & Cache

The Memory Hierarchy & Cache Removing The Ideal Memory Assumption: The Memory Hierarchy & Cache The impact of real memory on CPU Performance. Main memory basic properties: Memory Types: DRAM vs. SRAM The Motivation for The Memory

More information

Adapted from David Patterson s slides on graduate computer architecture

Adapted from David Patterson s slides on graduate computer architecture Mei Yang Adapted from David Patterson s slides on graduate computer architecture Introduction Ten Advanced Optimizations of Cache Performance Memory Technology and Optimizations Virtual Memory and Virtual

More information

Virtual Memory Virtual memory first used to relive programmers from the burden of managing overlays.

Virtual Memory Virtual memory first used to relive programmers from the burden of managing overlays. CSE420 Virtual Memory Prof. Mokhtar Aboelaze York University Based on Slides by Prof. L. Bhuyan (UCR) Prof. M. Shaaban (RIT) Virtual Memory Virtual memory first used to relive programmers from the burden

More information

Topics: Memory Management (SGG, Chapter 08) 8.1, 8.2, 8.3, 8.5, 8.6 CS 3733 Operating Systems

Topics: Memory Management (SGG, Chapter 08) 8.1, 8.2, 8.3, 8.5, 8.6 CS 3733 Operating Systems Topics: Memory Management (SGG, Chapter 08) 8.1, 8.2, 8.3, 8.5, 8.6 CS 3733 Operating Systems Instructor: Dr. Turgay Korkmaz Department Computer Science The University of Texas at San Antonio Office: NPB

More information

Virtual Memory. Stefanos Kaxiras. Credits: Some material and/or diagrams adapted from Hennessy & Patterson, Hill, online sources.

Virtual Memory. Stefanos Kaxiras. Credits: Some material and/or diagrams adapted from Hennessy & Patterson, Hill, online sources. Virtual Memory Stefanos Kaxiras Credits: Some material and/or diagrams adapted from Hennessy & Patterson, Hill, online sources. Caches Review & Intro Intended to make the slow main memory look fast by

More information

SE-292 High Performance Computing. Memory Hierarchy. R. Govindarajan

SE-292 High Performance Computing. Memory Hierarchy. R. Govindarajan SE-292 High Performance Computing Memory Hierarchy R. Govindarajan govind@serc Reality Check Question 1: Are real caches built to work on virtual addresses or physical addresses? Question 2: What about

More information

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB

Memory Technology. Caches 1. Static RAM (SRAM) Dynamic RAM (DRAM) Magnetic disk. Ideal memory. 0.5ns 2.5ns, $2000 $5000 per GB Memory Technology Caches 1 Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per GB Ideal memory Average access time similar

More information

Page 1. Review: Address Segmentation " Review: Address Segmentation " Review: Address Segmentation "

Page 1. Review: Address Segmentation  Review: Address Segmentation  Review: Address Segmentation Review Address Segmentation " CS162 Operating Systems and Systems Programming Lecture 10 Caches and TLBs" February 23, 2011! Ion Stoica! http//inst.eecs.berkeley.edu/~cs162! 1111 0000" 1110 000" Seg #"

More information

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Multilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823

More information

UCB CS61C : Machine Structures

UCB CS61C : Machine Structures inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 14 Caches III Lecturer SOE Dan Garcia Google Glass may be one vision of the future of post-pc interfaces augmented reality with video

More information

and data combined) is equal to 7% of the number of instructions. Miss Rate with Second- Level Cache, Direct- Mapped Speed

and data combined) is equal to 7% of the number of instructions. Miss Rate with Second- Level Cache, Direct- Mapped Speed 5.3 By convention, a cache is named according to the amount of data it contains (i.e., a 4 KiB cache can hold 4 KiB of data); however, caches also require SRAM to store metadata such as tags and valid

More information

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches

CS 61C: Great Ideas in Computer Architecture. The Memory Hierarchy, Fully Associative Caches CS 61C: Great Ideas in Computer Architecture The Memory Hierarchy, Fully Associative Caches Instructor: Alan Christopher 7/09/2014 Summer 2014 -- Lecture #10 1 Review of Last Lecture Floating point (single

More information

12 Cache-Organization 1

12 Cache-Organization 1 12 Cache-Organization 1 Caches Memory, 64M, 500 cycles L1 cache 64K, 1 cycles 1-5% misses L2 cache 4M, 10 cycles 10-20% misses L3 cache 16M, 20 cycles Memory, 256MB, 500 cycles 2 Improving Miss Penalty

More information

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1 Virtual Memory Patterson & Hennessey Chapter 5 ELEC 5200/6200 1 Virtual Memory Use main memory as a cache for secondary (disk) storage Managed jointly by CPU hardware and the operating system (OS) Programs

More information

Advanced Memory Organizations

Advanced Memory Organizations CSE 3421: Introduction to Computer Architecture Advanced Memory Organizations Study: 5.1, 5.2, 5.3, 5.4 (only parts) Gojko Babić 03-29-2018 1 Growth in Performance of DRAM & CPU Huge mismatch between CPU

More information