The Last Mile: An Empirical Study of Timing Channels on seL4

1 The Last Mile: An Empirical Study of Timing Channels on seL4. David Cock, Qian Ge, Toby Murray, Gernot Heiser. 4 November 2014.

2 Outline

4 seL4 is a verified, high-performance microkernel. We have: proof of functional correctness; proof of authority confinement; proof of explicit information-flow control; WCET analysis. We don't have: a comprehensive hardware model.

5 Covert and Side Channels. [Diagram: covert and side channels involving RSA, AES, the cache and the ALU.] Unexpected channels invalidate information-flow control. We can't prove their absence: they depend heavily on undocumented chip internals, and they are probabilistic.

7 Channels of Interest. Channels we consider: those that can subvert our proof (timing channels); that could be fixed with OS techniques (e.g. cache contention); and that can be exploited in software. Channels we don't consider: physical attacks, e.g. DPA; channels already excluded by the proof, e.g. storage channels.

13 An Experimental Corpus. We exploit 2 principal hardware channels: the L2 cache channel, and the bus contention channel (not covered today). The data also suggests 3 more (2 as-yet-unrecognised). We evaluate 2 cache-channel countermeasures: instruction-based scheduling, and cache colouring. We also consider countermeasures against remote, algorithmic channels: Lucky-13 against OpenSSL is our example victim; scheduled delivery is our countermeasure.

14 Experimental Platforms: imx.31 (ARM1136JF-S, ARMv6); E6550 (Conroe, x86-64); DM3730 (Cortex A8, ARMv7); AM3358 (Cortex A8, ARMv7); imx.6 (Cortex A9, ARMv7); Exynos4412 (Cortex A9, ARMv7). 7 years and 3 (ARM) core generations. 32-fold range of L2 cache sizes.

15 Methodology. Integrated with the nightly regression test. Runs each channel with each countermeasure (54 combinations). 2,000 hours of data over 12 months, 4.3 GiB. Data and analysis tools are open source: timingchannels.pml

16 Outline

18 Cache Contention. [Diagram: two domains conflict in the shared cache across a domain switch.] If you share a core, you share a cache. Hit vs. miss makes a big time difference. Many published attacks steal keys through the cache. We can control cache allocation (more shortly).
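
To make the contention mechanism concrete, here is a minimal prime-and-probe style receiver sketched in C. The 1 MiB/32-byte-line geometry, the POSIX timer and the 100 ns threshold are illustrative assumptions, not the measurement code behind these slides.

```c
/* Sketch of the receiver side of a cache contention channel: prime a buffer
 * covering the L2, let the other domain run, then probe and count slow
 * (evicted) lines.  Sizes, threshold and timer are assumptions. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define LINE_SIZE 32            /* assumed L2 line size */
#define NUM_LINES (1 << 15)     /* assumed number of L2 lines */

static uint8_t buffer[NUM_LINES * LINE_SIZE];

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* Prime: bring every line of the buffer into the cache. */
static void prime(void) {
    for (size_t i = 0; i < sizeof buffer; i += LINE_SIZE)
        buffer[i]++;
}

/* Probe: re-touch every line and count those that became slow, i.e. were
 * evicted by whatever ran between prime() and probe(). */
static size_t probe(uint64_t threshold_ns) {
    size_t evicted = 0;
    for (size_t i = 0; i < sizeof buffer; i += LINE_SIZE) {
        uint64_t t0 = now_ns();
        buffer[i]++;
        if (now_ns() - t0 > threshold_ns)
            evicted++;
    }
    return evicted;
}

int main(void) {
    prime();
    /* ... the sending domain runs here, touching some number of lines ... */
    printf("evicted lines: %zu\n", probe(100));
    return 0;
}
```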

19 Exynos4412 Cache Channel. [Plot: lines evicted vs. lines touched, both /10^3.] 32,768 cache lines, 1000 Hz sample rate (preemption). Bandwidth: 2400 b/s. Baseline for comparison.
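
The bandwidth figures on this and the following slides are derived from the measured channel matrix; assuming the usual Shannon-capacity estimate over that matrix (the exact estimator is not shown on the slide), the calculation is:

```latex
% Capacity of the sampled channel matrix p(y|x), with input X = lines touched
% by the sender and output Y = the receiver's observation (lines evicted).
% Bandwidth = capacity per sample times the sample rate.
C = \max_{p(x)} I(X;Y)
  = \max_{p(x)} \sum_{x,y} p(x)\, p(y \mid x)\,
      \log_2 \frac{p(y \mid x)}{\sum_{x'} p(x')\, p(y \mid x')}
```

At the 1000 Hz sample rate quoted above, 2.4 bits per sample gives the 2400 b/s figure.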

23 What is Instruction-Based Scheduling (IBS)? Our example attack used preemption as a clock. If it's tied to progress, the channel should vanish. This is a form of deterministic execution. Advantages: applies to any channel; simple to implement (18 lines in seL4). Disadvantages: restrictive, we need to remove all clocks; performance-counter accuracy is critical. Works great on older chips (imx.31), but...
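
A minimal sketch of the IBS idea in C: replace the timer tick with a performance-counter overflow, so preemption depends only on the preempted domain's own progress. The pmu_* functions are stubs standing in for real PMU register accesses (the actual 18-line seL4 change is not reproduced here), and event 0x08 assumes the ARMv7 "instructions retired" event.

```c
/* Instruction-based scheduling sketch: arm an instruction counter to overflow
 * after a fixed instruction budget and treat the overflow interrupt as the
 * preemption point. */
#include <stdint.h>
#include <stdio.h>

#define PREEMPT_INSNS 100000u   /* preempt after 10^5 retired instructions */

/* Stubs standing in for PMU register writes (assumptions, not real APIs). */
static void pmu_select_event(unsigned ctr, unsigned event) { (void)ctr; (void)event; }
static void pmu_write_counter(unsigned ctr, uint32_t val)  { printf("counter %u armed at 0x%08x\n", ctr, val); }
static void pmu_enable_overflow_irq(unsigned ctr)          { (void)ctr; }
static void pmu_enable_counter(unsigned ctr)               { (void)ctr; }
static void schedule_next_domain(void)                     { printf("domain switch\n"); }

/* Arm counter 0 so it overflows after PREEMPT_INSNS retired instructions. */
static void ibs_arm(void) {
    pmu_select_event(0, 0x08);                 /* "instructions retired" (assumed ARMv7 event) */
    pmu_write_counter(0, 0u - PREEMPT_INSNS);  /* wraps to zero after N increments */
    pmu_enable_overflow_irq(0);
    pmu_enable_counter(0);
}

/* The overflow interrupt takes the place of the preemption timer. */
static void pmu_overflow_handler(void) {
    schedule_next_domain();
    ibs_arm();
}

int main(void) {
    ibs_arm();
    pmu_overflow_handler();   /* simulate one overflow */
    return 0;
}
```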

25 Exynos4412 Cache Channel with IBS. [Plot: lines evicted vs. lines touched (/10^3).] Preempt after 10^5 instructions. Bandwidth: 400 b/s, a 6× reduction; a very poor result. Event delivery is imprecise thanks to speculation, and the attacker can modulate it!

30 What is Colouring? [Diagram: a physical address split into frame number and frame offset, and into set selector and line index; the overlapping bits form the colour.] Caches are divided into sets by physical address. The set-selector bits of the address choose the set. These bits (usually) overlap the frame number. If these colour bits differ, the frames cannot collide. The frame number (the physical address) is under OS control.
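
A small worked example of the address split described above, assuming a 1 MiB, 16-way L2 with 32-byte lines and 4 KiB frames (illustrative geometry, not a specific platform): the set-selector bits that lie above the frame offset are the colour bits.

```c
/* Compute the number of page colours and the colour of a frame from the
 * cache geometry.  All parameters are assumptions for illustration. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS  12u           /* 4 KiB frames */
#define LINE_BITS  5u            /* 32 B cache lines */
#define CACHE_SIZE (1u << 20)    /* 1 MiB L2 */
#define WAYS       16u           /* 16-way set-associative */

static unsigned log2u(unsigned x) { unsigned n = 0; while (x > 1) { x >>= 1; n++; } return n; }

int main(void) {
    unsigned sets     = CACHE_SIZE / WAYS / (1u << LINE_BITS);
    unsigned set_bits = log2u(sets);                  /* width of the set selector      */
    unsigned top_set  = LINE_BITS + set_bits;         /* first bit above the selector   */
    unsigned colours  = 1u << (top_set - PAGE_BITS);  /* selector bits above the offset */

    printf("%u sets, %u colours\n", sets, colours);

    /* Two frames with different colour bits can never map to the same set. */
    uint32_t paddr  = 0x41234567;
    uint32_t colour = (paddr >> PAGE_BITS) & (colours - 1);
    printf("frame 0x%05x has colour %u\n", paddr >> PAGE_BITS, colour);
    return 0;
}
```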

33 Colouring in seL4. In seL4, resource allocation is securely delegated. The initial task splits RAM into coloured pools. Advantages: very few kernel changes (in seL4); low overhead. Disadvantages: only applies to the cache channel; relies on internal details of cache operation; doesn't work with large pages; hashed caches break everything.
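
A sketch of what the initial task's split might look like, assuming 16 page colours and working on raw physical addresses for clarity; a real seL4 root task would partition untyped capabilities rather than addresses, before handing one pool to each security domain.

```c
/* Split a physically contiguous region of RAM into per-colour frame pools.
 * Pool sizes, the colour count and the address range are illustrative. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS   12u
#define NUM_COLOURS 16u          /* assumed number of page colours */
#define MAX_FRAMES  1024u

typedef struct { uint64_t frames[MAX_FRAMES]; unsigned count; } pool_t;

static unsigned colour_of(uint64_t paddr) {
    return (unsigned)(paddr >> PAGE_BITS) & (NUM_COLOURS - 1);
}

static void split_by_colour(uint64_t base, uint64_t size, pool_t pools[NUM_COLOURS]) {
    for (uint64_t pa = base; pa + (1u << PAGE_BITS) <= base + size; pa += 1u << PAGE_BITS) {
        pool_t *p = &pools[colour_of(pa)];
        if (p->count < MAX_FRAMES)
            p->frames[p->count++] = pa;
    }
}

int main(void) {
    static pool_t pools[NUM_COLOURS];
    split_by_colour(0x80000000ull, 16u << 20, pools);   /* 16 MiB of RAM */
    for (unsigned c = 0; c < NUM_COLOURS; c++)
        printf("colour %2u: %u frames\n", c, pools[c].count);
    return 0;
}
```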

35 Exynos4412 Cache Channel, Partitioned. [Plot: lines evicted vs. lines touched (/10^3).] Bandwidth: 15 b/s, a 160× reduction. Much better, but not perfect. Where does that 15 b/s come from?

36 Exynos4412 TLB Channel. [Plots: lines touched and stalls per line, with and without a TLB flush on switch.] The average rate correlates with TLB misses. Flushing on switch removes the signal.

37 Misprediction and the Cycle Counter. [Plot: HPT ticks and branch mispredicts vs. lines evicted.] The cycle counter is affected by invisible mispredicts. A new (and unexpected) channel. Event delivery is precise; the cycle counter is wrong.

38 Outline

39 The Lucky-13 Attack on TLS (AlFardan & Paterson, SSP 2013). Exploits a non-constant-time MAC calculation. This is an algorithmic side channel, remotely exploitable. We reproduce the distinguishing attack.
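
The shape of the vulnerability, sketched in C: in CBC-mode TLS the number of bytes fed to the MAC depends on the (attacker-influenced) padding length, so the time to reject a tampered record varies. dummy_mac() is a stand-in whose running time grows with input length; this illustrates the timing pattern only, not OpenSSL's actual code.

```c
/* Schematic of the pre-1.0.1e, non-constant-time record check. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in MAC whose running time grows with input length, like a real HMAC. */
static uint32_t dummy_mac(const uint8_t *msg, size_t len) {
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31 + msg[i];
    return h;
}

static int check_record(const uint8_t *plain, size_t len) {
    size_t pad_len = (size_t)plain[len - 1] + 1;   /* last byte encodes the padding  */
    if (pad_len + 20 > len)
        return 0;                                  /* early exit: one timing class   */
    size_t mac_input = len - pad_len - 20;         /* bytes actually MACed           */
    return dummy_mac(plain, mac_input) == 0;       /* work done varies with pad_len  */
}

int main(void) {
    uint8_t rec[64];
    memset(rec, 0xaa, sizeof rec);
    rec[63] = 0;                                   /* claim 1 byte of padding        */
    printf("%d\n", check_record(rec, sizeof rec));
    rec[63] = 31;                                  /* claim 32 bytes of padding      */
    printf("%d\n", check_record(rec, sizeof rec)); /* less data MACed => faster      */
    return 0;
}
```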

40 OpenSSL 1.0.1c Response Times. [Plot: probability vs. response time (µs); curves M0 VL and M1 VL.] Nearby attacker (crossover cable). Modified vs. unmodified packet. Distinguishable with 100% probability.

42 The Upstream Fix: Constant-Time Code. Fixed in OpenSSL version 1.0.1e: a constant-time padding/MAC check. Now secure (on x86). We tested on ARM: still a small channel. We present an OS-level solution, with better performance, lower latency, lower CPU overhead, and no modification of OpenSSL.
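
For contrast, the constant-time idiom the upstream fix is built around, as a minimal sketch; OpenSSL 1.0.1e's real fix makes the entire padding and MAC check data-independent, which is considerably more involved.

```c
/* Constant-time comparison: running time does not depend on where (or
 * whether) a mismatch occurs. */
#include <stdint.h>
#include <stdio.h>

static int ct_equal(const uint8_t *a, const uint8_t *b, size_t len) {
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= a[i] ^ b[i];      /* no early exit on mismatch */
    return diff == 0;
}

int main(void) {
    uint8_t x[20] = {0}, y[20] = {0};
    y[19] = 1;
    printf("%d %d\n", ct_equal(x, x, sizeof x), ct_equal(x, y, sizeof x));
    return 0;
}
```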

44 OpenSSL 1.0.1e Response Times. [Plot: probability vs. response time (µs); curves M0 VL, M1 VL, M0 CT, M1 CT.] Constant-time implementation. Better, but still distinguishable: 62%. 60 µs (8.1%) latency penalty.

45 What is? in SSL_read handler out SSL_write server Separate OpenSSL and application. Announce packets over IPC. The Last Mile Copyright NICTA 2014 David Cock, Qian Ge, Toby Murray, Gernot Heiser 25/34

46 What is? in 1 SSL_read handler out SSL_write server Separate OpenSSL and application. Announce packets over IPC. Record arrival time. The Last Mile Copyright NICTA 2014 David Cock, Qian Ge, Toby Murray, Gernot Heiser 25/34

47 What is? in 1 SSL_read handler out SSL_write C 2 server Separate OpenSSL and application. Announce packets over IPC. Record arrival time. The Last Mile Copyright NICTA 2014 David Cock, Qian Ge, Toby Murray, Gernot Heiser 25/34

48 What is? in out 1 SSL_read SSL_write 3 handler C 2 server Separate OpenSSL and application. Announce packets over IPC. Record arrival time. The Last Mile Copyright NICTA 2014 David Cock, Qian Ge, Toby Murray, Gernot Heiser 25/34

49 What is? in out 1 SSL_read SSL_write 3 handler 4 2 server Separate OpenSSL and application. C Announce packets over IPC. Record arrival time. Block response on timer. R The Last Mile Copyright NICTA 2014 David Cock, Qian Ge, Toby Murray, Gernot Heiser 25/34

50 What is? in out 1 SSL_read SSL_write 3 handler 4 server Separate OpenSSL and application. C Announce packets over IPC. Record arrival time. Block response on timer. 2 R The Last Mile Copyright NICTA 2014 David Cock, Qian Ge, Toby Murray, Gernot Heiser 25/34

51 What is? in out 1 SSL_read SSL_write 3 handler 4 C 2 server Separate OpenSSL and application. Announce packets over IPC. Record arrival time. Block response on timer. 5 R The Last Mile Copyright NICTA 2014 David Cock, Qian Ge, Toby Murray, Gernot Heiser 25/34

54 Scheduled Delivery on seL4. We use real-time scheduling to precisely delay messages. This provides an efficient mechanism to enforce a delay policy: see Askarov et al., CCS 2010. Advantages: uses existing IPC controls, no modifications; fast and effective (see next slide). Disadvantages: specific to remote/network attacks; need to wrap (but not modify) the vulnerable component.
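
A sketch of the wrapper's delay enforcement, assuming a fixed 1 ms release offset and POSIX timing primitives for illustration; the real handler records the arrival time of step 1 and blocks the step-5 response on an seL4 real-time timeout rather than clock_nanosleep. More elaborate policies (e.g. the adaptive ones of Askarov et al.) can be substituted for the fixed offset.

```c
/* Scheduled delivery: release the response at a fixed offset from the
 * request's arrival, so its timing no longer reflects how long the
 * (possibly secret-dependent) processing took. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define RESPONSE_DELAY_NS 1000000ull    /* assumed fixed delay: 1 ms */

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

static void sleep_until(uint64_t t_ns) {
    struct timespec ts = { .tv_sec  = (time_t)(t_ns / 1000000000ull),
                           .tv_nsec = (long)(t_ns % 1000000000ull) };
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
}

/* Called by the handler for each request. */
static void handle_request(void (*process)(void), void (*send_response)(void)) {
    uint64_t arrival = now_ns();                  /* 1: record arrival time           */
    process();                                    /* 2-3: hand to OpenSSL / server    */
    sleep_until(arrival + RESPONSE_DELAY_NS);     /* 4: block on the timer            */
    send_response();                              /* 5: release at the scheduled time */
}

static void dummy_process(void)  { /* secret-dependent work would happen here */ }
static void dummy_respond(void)  { printf("response sent\n"); }

int main(void) {
    handle_request(dummy_process, dummy_respond);
    return 0;
}
```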

56 Security of Scheduled Delivery. [Plot: probability vs. response time (µs); curves M0 VL, M1 VL, M0 CT, M1 CT, M0 SD, M1 SD.] More secure: 57% distinguishable. Faster: 10 µs (1.4%) latency penalty.

57 Performance of Scheduled Delivery. [Plot: CPU load vs. ingress rate (packets/s) for 1.0.1c, 1.0.1e and 1.0.1c-sd.] Lower overhead (2% vs. 11%), but earlier saturation. This is a worst-case benchmark: no server work.

58 Outline

61 These channels are real: we managed to exploit every channel we tried; the bandwidth is high, and growing. They're getting worse: countermeasures get less effective on newer chips; new hardware channels appear, e.g. the branch predictor. But there is hope: resource partitioning (e.g. colouring) is effective; hardware QoS features can be repurposed; OS-level techniques can help.

62 Ongoing project at NICTA (Qian Ge): developing a fully cache-coloured seL4. We use these tools to evaluate the result.

63 Questions? Data and Tools: TS/timingchannels.pml. seL4 is Open Source!

64 Capacity vs. Injected Noise. [Plot: channel capacity (b) vs. injected noise (b), for uncorrelated and anticorrelated noise.] Noise injection doesn't scale. Increasing determinism is the way to go.
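
One way to see why injected noise scales poorly (a standard averaging argument, not taken from the slides): to distinguish two response-time distributions whose means differ by Δ under zero-mean noise of standard deviation σ, an attacker needs roughly

```latex
% Samples needed to separate two means differing by \Delta under noise of
% standard deviation \sigma, at a confidence quantile z: averaging defeats noise.
n \;\approx\; \left( \frac{z\,\sigma}{\Delta} \right)^{2}
```

so attacker effort grows only quadratically with the injected noise, whereas making execution deterministic removes Δ itself.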
