Precise and Efficient FIFO-Replacement Analysis Based on Static Phase Detection

Size: px
Start display at page:

Download "Precise and Efficient FIFO-Replacement Analysis Based on Static Phase Detection"

Transcription

1 Precise and Efficient FIFO-Replacement Analysis Based on Static Phase Detection Daniel Grund 1 Jan Reineke 2 1 Saarland University, Saarbrücken, Germany 2 University of California, Berkeley, USA Euromicro Conference on Real-Time Systems 2010

2 Outline 1 Introduction and Problem Timing Analysis Cache Analysis Challenge FIFO Replacement 2 Predicting Hits for FIFO Idea and Theorem Must Analysis Efficient Implementation 3 Paper Contents 4 Evaluation Related Work Analysis Precision 5 Summary Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

3 Timing Analysis for Real-Time Systems Frequency Analysis-guaranteed timing bounds Possible execution times Overest. LB BCET WCET UB Execution time Need to bound execution time of programs Execution time influenced by architectural features pipelines, caches, branch prediction,... Need to analyze behavior of architectural components Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

4 Caches and Replacement Policies hit (ab) CPU Cache Main Memory Capacity: Latency: 32 KB 3 cycles 2 MB 100 cycles Caches transparently buffer memory blocks Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

5 Caches and Replacement Policies a? hit (ab) CPU Cache Main Memory Capacity: Latency: 32 KB 3 cycles 2 MB 100 cycles Caches transparently buffer memory blocks Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

6 Caches and Replacement Policies a! hit (ab) CPU Cache Main Memory Capacity: Latency: 32 KB 3 cycles 2 MB 100 cycles Caches transparently buffer memory blocks Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

7 Caches and Replacement Policies c? miss (ab) CPU Cache Main Memory Capacity: Latency: 32 KB 3 cycles 2 MB 100 cycles Caches transparently buffer memory blocks Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

8 Caches and Replacement Policies miss (ab) c? CPU Cache Main Memory Capacity: Latency: 32 KB 3 cycles 2 MB 100 cycles Caches transparently buffer memory blocks Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

9 Caches and Replacement Policies miss (ac) CPU Cache Main Memory c! Capacity: Latency: 32 KB 3 cycles 2 MB 100 cycles Caches transparently buffer memory blocks Replacement policy dynamically decides which element to replace LRU least recently used PLRU pseudo LRU FIFO first-in first-out Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

10 Caches and Replacement Policies c! miss (ac) CPU Cache Main Memory Capacity: Latency: 32 KB 3 cycles 2 MB 100 cycles Caches transparently buffer memory blocks Replacement policy dynamically decides which element to replace LRU least recently used PLRU pseudo LRU FIFO first-in first-out Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

11 Static Cache Analysis Goals & Notions Derive approximations to cache contents at each program point in order to classify memory accesses as cache hits or cache misses Must-information Underapproximation of cache contents Used to soundly classify cache hits May-information Overapproximation of cache contents Used to soundly classify cache misses Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

12 Static Cache Analysis Challenges read z read y read x write z Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

13 Static Cache Analysis Challenges read z Initial cache contents unknown read y read x write z Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

14 Static Cache Analysis Challenges read z Initial cache contents unknown read y read x Need to combine analysis information write z Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

15 Static Cache Analysis Challenges read z Initial cache contents unknown read y read x Need to combine analysis information Need to determine addresses of x, y, z write z Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

16 Static Cache Analysis Challenges read z Initial cache contents unknown read y read x Need to combine analysis information Need to determine addresses of x, y, z write z 1 Approximate accessed addresses by value analysis (not this talk) 2 Approximate cached contents by replacement analysis Cache analysis = value analysis replacement analysis Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

17 FIFO Replacement FIFO cache of size k: last-in first-in [b 1,..., b k ] Q k := B k Example updates: [d, c, b, a] c hit [d, c, b, a] [d, c, b, a] e miss [e, d, c, b] Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

18 FIFO Replacement Analysis Why Predicting Hits is Difficult Take a set of blocks B that does fit into a cache q For example, B = {a, b, e} and k = 4. B k. Access all blocks in B: q a,b,e q Must all accessed blocks be cached? q : B q? Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

19 FIFO Replacement Analysis Why Predicting Hits is Difficult Take a set of blocks B that does fit into a cache q For example, B = {a, b, e} and k = 4. B k. Access all blocks in B: q a,b,e q Must all accessed blocks be cached? q : B q? No. [d, c, b, a] a hit [d, c, b, a] Observation b hit [d, c, b, a] After accessing a set of fitting blocks, not all of them must be cached. e miss [e, d, c, b] a Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

20 FIFO Replacement Analysis Why Predicting Misses is Difficult Take a set of blocks B that does not fit into a cache q For example, B = {a, b, c, d, e, f} and k = 4. B k. Access all blocks in B: q a,b,c,d,e,f q Must all non-accessed blocks be evicted? q : q B? Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

21 FIFO Replacement Analysis Why Predicting Misses is Difficult Take a set of blocks B that does not fit into a cache q For example, B = {a, b, c, d, e, f} and k = 4. B k. Access all blocks in B: q a,b,c,d,e,f q Must all non-accessed blocks be evicted? q : q B? No. [x, c, b, a] Observation a,b,c d,e,f hits [x, c, b, a] [f, e, d, x] x misses After accessing a set of non-fitting blocks, other blocks may still be cached. Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

22 Outline 1 Introduction and Problem Timing Analysis Cache Analysis Challenge FIFO Replacement 2 Predicting Hits for FIFO Idea and Theorem Must Analysis Efficient Implementation 3 Paper Contents 4 Evaluation Related Work Analysis Precision 5 Summary Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

23 To the point: Anticipation & Idea Considering repeated accesses to fitting blocks B helps: B = {a, b, c} s = a, b, b, c, b, b, a, c, c, a, b,... Eventually, all blocks in B must be cached. Need to detect repetitions Partition access sequence s into phases Definition (Phase) A B-phase is an access sequence s such that the set of accessed blocks A(s) = B. a, b, b, c, b, b, a, c,... }{{}}{{} {a,b,c}-phase {a,b,c}-phase Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

24 Predicting Hits by Detecting Phases Lemma (Single Phase) Let s be a B-phase and B k. q Q k, q s q : Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

25 Predicting Hits by Detecting Phases Lemma (Single Phase) Let s be a B-phase and B k. q Q k, q s q : B q 1 Either all blocks already cached: B q only hits in s B q Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

26 Predicting Hits by Detecting Phases Lemma (Single Phase) Let s be a B-phase and B k. q Q k, q s q : B q C 1 (q ) B 1 Either all blocks already cached: B q only hits in s B q 2 Or not: B q at least one miss s C 1 (q ) B a,b,e [d, c, b, a] [ }{{} e, d, c, b] C 1(q )={e} B Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

27 Predicting Hits by Detecting Phases Theorem (Multiple Phases) Let s i be B-phases and B k and s = s 1... s j q Q k, q s q : B q C j (q ) B 1 For each individual phase the lemma applies 2 Misses, if any, accumulate in last-in positions C j (q ) b,a,e a,b,e a,b,e [d, c, b, a] [ }{{} e, d, c, b] [ a, e, d, c] [b, a, e, d] }{{}}{{} C 1 B C 2 B C 3 B Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

28 Predicting Hits by Detecting Phases Theorem (Multiple Phases) Let s i be B-phases and B k and s = s 1... s j q Q k, q s q : B q C j (q ) B 1 For each individual phase the lemma applies 2 Misses, if any, accumulate in last-in positions C j (q ) b,a,e a,b,e a,b,e [d, c, b, a] [ }{{} e, d, c, b] [ a, e, d, c] [b, a, e, d] }{{}}{{} C 1 B C 2 B C 3 B Corollary: After B B-phases, all blocks in B must be cached Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

29 The Must Analysis How to Count Phases For phase blocks B, the analysis maintains: P phase progress, blocks already accessed in current phase pc phase counter, number of detected B-phases Predicts hits for blocks in B if pc = B Example for B = {a, b} s a b b b a b P {a} {a, b} {b} {b} {a, b} {b} pc Hit Hit Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

30 The Must Analysis Dependency on Future Accesses Need B B-phases to predict hits for blocks in B How to choose B? After observing a, b, c it makes sense trying to detect 2 further {a, b, c}-phases 1 further {b, c}-phase 0 further {c}-phases Optimal B depends on future accesses a, b, c, a, b, c, a, b, c, a, b, c a, b, c, b, c, b, c, b, c, b, c Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

31 The Must Analysis Resolving the Dependency Perform multiple analyses for different B sets For which? B already determines sensible contents of B For B = 2, after a, b, c already detected 1 {b, c}-phase no advantage in trying to detect 2 {x, y}-phases Perform k analyses for different B sets for each phase size n = 1... k B n consists of the n most-recently-used blocks Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

32 The Must Analysis Subanalyses for n = a b c c b c a a c a b a n = 1 : n = 2 : n = 3 : Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

33 The Must Analysis Subanalyses for n = n = 1 : a b c c b c a a c a b a H {c} n = 2 : n = 3 : Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

34 The Must Analysis Subanalyses for n = n = 1 : a b c c b c a a c a b a H H {c} n = 2 : {b, c} {b, c} n = 3 : Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

35 The Must Analysis Subanalyses for n = n = 1 : a b c c b c a a c a b a H H H {c} {a} n = 2 : {b, c} {b, c} n = 3 : Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

36 The Must Analysis Subanalyses for n = n = 1 : a b c c b c a a c a b a H H H H {c} {a} n = 2 : {b, c} {b, c} {a, c} {a, c} n = 3 : Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

37 The Must Analysis Subanalyses for n = n = 1 : a b c c b c a a c a b a H H H H H {c} {a} n = 2 : {b, c} {b, c} {a, c} {a, c} n = 3 : {a, b, c} {a, b, c} {a, b, c} Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

38 Efficient Implementation Observation For n = 1... k, analysis needs to maintain: phase blocks B n 2 B phase progress P n 2 B phase counter pc n N Phase blocks B n are the n most-recently-used blocks For i < j : B i B j Encode all B n in a single LRU-stack For all i : P i B i Encode all P n as pointers into the stack Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

39 Efficient Implementation Encoding For phase blocks B n : pc n complete B n -phases were detected current phase progress is B ppn B 1 pc 1, pp 1 B 2 \ B 1 pc 2, pp 2 B 3 \ B 2 pc 3, pp 3 B 4 \ B 3 pc 4, pp 4 B 1 {b} P 1 pc 1 1 B 2 {b, c} P 2 {b} pc 2 2 B 3 {a, b, c} P 3 {b, c} pc 3 1 = {b} 1, 0 {c} 2, 1 {a} 1, 2 0, 3 Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

40 Outline 1 Introduction and Problem Timing Analysis Cache Analysis Challenge FIFO Replacement 2 Predicting Hits for FIFO Idea and Theorem Must Analysis Efficient Implementation 3 Paper Contents 4 Evaluation Related Work Analysis Precision 5 Summary Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

41 Contents of the Paper So far, we have seen parts of the must-analysis The paper contains, for must- and may-analysis, basic theorem generalization to arbitrary control-flow formalization as abstract interpretation Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

42 Outline 1 Introduction and Problem Timing Analysis Cache Analysis Challenge FIFO Replacement 2 Predicting Hits for FIFO Idea and Theorem Must Analysis Efficient Implementation 3 Paper Contents 4 Evaluation Related Work Analysis Precision 5 Summary Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

43 Brief History of Replacement Analysis Before 97 LRU analyses LCTRTS 97 Precise and efficient must- and may-analysis for LRU [1] LCTES 08 Generic analyses for FIFO and PLRU [2] SAS 09 Cache analysis framework and FIFO analysis [3] WCET 10 Toward precise analysis for PLRU [4] ECRTS 10 Precise and efficient must- and may-analysis for FIFO Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

44 Evaluation Setup Analyses: HAM Must-analysis of SAS 09 RC Generic analyses of LCTES 08 PD Phase detecting analyses Collecting semantics: CS Limit for any static analysis Spectrum of synthetic benchmarks: Random access sequences and program fragments With varying locality Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

45 Evaluation Results k=8, random sequences % 100 CS hits n.c. misses n n is number of distinct elements that get accessed Average guaranteed hit- and miss-rates Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

46 Evaluation Results k=8, random sequences % CS RC n n is number of distinct elements that get accessed Average guaranteed hit- and miss-rates Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

47 Evaluation Results k=8, random sequences % CS RC+HAM n n is number of distinct elements that get accessed Average guaranteed hit- and miss-rates Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

48 Evaluation Results k=8, random sequences % CS RC+HAM PD+HAM n n is number of distinct elements that get accessed Average guaranteed hit- and miss-rates Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

49 Outline 1 Introduction and Problem Timing Analysis Cache Analysis Challenge FIFO Replacement 2 Predicting Hits for FIFO Idea and Theorem Must Analysis Efficient Implementation 3 Paper Contents 4 Evaluation Related Work Analysis Precision 5 Summary Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

50 Summary Precise and Efficient FIFO-Replacement Analysis based on Static Phase Detection Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

51 Summary % B1 {b} 100 P1 80 pc1 1 B2 {b, c} 60 P2 {b} 40 pc2 2 B3 {a, b, c} 20 P3 {b, c} 0 pc n = {b} 1, 0 {c} 2, 1 {a} 1, 2 0, 3 Precise and Efficient FIFO-Replacement Analysis based on Static Phase Detection { {a,b,c}-phase }} {{ {a,b,c}-phase }} { a, b, c }{{}, c, b }{{}, b, a,... {b,c}-phase {b,c}-phase Two theorems on FIFO-contents bound on number of phases must be cached / evicted Must- and may-analysis static phase detection multiple sub-analyses Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

52 Further Reading C. Ferdinand Cache Behaviour Prediction for Real-Time Systems PhD Thesis, Saarland University, 1997 J. Reineke and D. Grund Relative competitive analysis of cache replacement policies LCTES 2008 D. Grund and J. Reineke Abstract Interpretation of FIFO Replacement SAS 2009 D. Grund and J. Reineke Toward Precise PLRU Cache Analysis WCET 2010 Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

53 Related Work: LRU Analyses Analyses directed at worst-case execution-time analysis Mueller By static cache simulation Li By integer linear programming Ferdinand By abstract interpretation Other than that Ghosh Cache Miss Equations, loop nests Chatterjee Exact model of cache behavior for loop nests All for LRU caches only Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

54 Static Timing-Analysis Framework Binary Executable Legend: Data CFG Reconstruction Action Micro-architectural analysis Control-flow Graph models pipeline, caches, buses, etc. derives bounds on BB exec. times Value Analysis Loop Bound Analysis Control-flow Analysis is an abstract interpretation with a huge domain Annotated CFG is the computationally most expensive module Microarchitectural Analysis Basic Block Timing Info Global Bound Analysis Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

55 Applicability Any buffer with transparent FIFO replacement: Individual cache sets of instruction of data caches (I$, D$) Branch target buffers (BTB, BTIC) Translation lookaside buffers (TLB) Instances: I$ D$ ARM 1136, 1156, 1176, 920T, 922T, 926EJ-S (k {2, 4, 64}) I$ D$ Marvell (Intel) XScale(s) (k = 32) BTB Freescale (Motorola) MPC 56x, 7450-Family (k {4, 8})... Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

56 Must Analysis Full Example for k = 3 For 1 n k maintain B n, P n, pc n Example s a b c c b c a B 1 {a} {b} {c} {c} {b} {c} {a} P 1 pc B 2 {a} {a, b} {b, c} {b, c} {b, c} {b, c} {a, c} P 2 {a} {c} {c} pc B 3 {a} {a, b} {a, b, c} {a, b, c} {a, b, c} {a, b, c} {a, b, c} P 3 {a} {a, b} {c} {b, c} {b, c} pc Hit Hit Hit Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

57 Must Analysis Abstraction and Join Analysis domain is Lru k PC k PP k (2 B ) k N k N k Lru k PC k PP k LRU must-analysis, under-approximates accessed blocks lower bounds on number of phases lower bounds on phase progress Reuse abstract transformer and join of Lru k Define appropriately for PC k and PP k Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

58 May-Analysis Similar to must-analysis Difference: Phases may be of different lengths and contents Theorem (Multiple Phases) s = s 1... s j, i : A(s i ) = n i k: q Q k, q s q : C j i=1 (n i k+1) (q ) A(s) = i A(s i ) More simultaneous sub-analyses Similar implementation employing LRU may-analysis Daniel Grund and Jan Reineke FIFO-Replacement Analysis ECRTS / 34

CACHE-RELATED PREEMPTION DELAY COMPUTATION FOR SET-ASSOCIATIVE CACHES

CACHE-RELATED PREEMPTION DELAY COMPUTATION FOR SET-ASSOCIATIVE CACHES CACHE-RELATED PREEMPTION DELAY COMPUTATION FOR SET-ASSOCIATIVE CACHES PITFALLS AND SOLUTIONS 1 Claire Burguière, Jan Reineke, Sebastian Altmeyer 2 Abstract In preemptive real-time systems, scheduling analyses

More information

Design and Analysis of Real-Time Systems Microarchitectural Analysis

Design and Analysis of Real-Time Systems Microarchitectural Analysis Design and Analysis of Real-Time Systems Microarchitectural Analysis Jan Reineke Advanced Lecture, Summer 2013 Structure of WCET Analyzers Reconstructs a control-flow graph from the binary. Determines

More information

HW/SW Codesign. WCET Analysis

HW/SW Codesign. WCET Analysis HW/SW Codesign WCET Analysis 29 November 2017 Andres Gomez gomeza@tik.ee.ethz.ch 1 Outline Today s exercise is one long question with several parts: Basic blocks of a program Static value analysis WCET

More information

FIFO Cache Analysis for WCET Estimation: A Quantitative Approach

FIFO Cache Analysis for WCET Estimation: A Quantitative Approach FIFO Cache Analysis for WCET Estimation: A Quantitative Approach Abstract Although most previous work in cache analysis for WCET estimation assumes the LRU replacement policy, in practise more processors

More information

ait: WORST-CASE EXECUTION TIME PREDICTION BY STATIC PROGRAM ANALYSIS

ait: WORST-CASE EXECUTION TIME PREDICTION BY STATIC PROGRAM ANALYSIS ait: WORST-CASE EXECUTION TIME PREDICTION BY STATIC PROGRAM ANALYSIS Christian Ferdinand and Reinhold Heckmann AbsInt Angewandte Informatik GmbH, Stuhlsatzenhausweg 69, D-66123 Saarbrucken, Germany info@absint.com

More information

Timing analysis and timing predictability

Timing analysis and timing predictability Timing analysis and timing predictability Architectural Dependences Reinhard Wilhelm Saarland University, Saarbrücken, Germany ArtistDesign Summer School in China 2010 What does the execution time depends

More information

Hardware-Software Codesign. 9. Worst Case Execution Time Analysis

Hardware-Software Codesign. 9. Worst Case Execution Time Analysis Hardware-Software Codesign 9. Worst Case Execution Time Analysis Lothar Thiele 9-1 System Design Specification System Synthesis Estimation SW-Compilation Intellectual Prop. Code Instruction Set HW-Synthesis

More information

Design and Analysis of Real-Time Systems Predictability and Predictable Microarchitectures

Design and Analysis of Real-Time Systems Predictability and Predictable Microarchitectures Design and Analysis of Real-Time Systems Predictability and Predictable Microarcectures Jan Reineke Advanced Lecture, Summer 2013 Notion of Predictability Oxford Dictionary: predictable = adjective, able

More information

Chapter 2 MRU Cache Analysis for WCET Estimation

Chapter 2 MRU Cache Analysis for WCET Estimation Chapter 2 MRU Cache Analysis for WCET Estimation Most previous work on cache analysis for WCET estimation assumes a particular replacement policy LRU. In contrast, much less work has been done for non-lru

More information

Scope-based Method Cache Analysis

Scope-based Method Cache Analysis Scope-based Method Cache Analysis Benedikt Huber 1, Stefan Hepp 1, Martin Schoeberl 2 1 Vienna University of Technology 2 Technical University of Denmark 14th International Workshop on Worst-Case Execution

More information

SIC: Provably Timing-Predictable Strictly In-Order Pipelined Processor Core

SIC: Provably Timing-Predictable Strictly In-Order Pipelined Processor Core SIC: Provably Timing-Predictable Strictly In-Order Pipelined Processor Core Sebastian Hahn and Jan Reineke RTSS, Nashville December, 2018 saarland university computer science SIC: Provably Timing-Predictable

More information

Timing Analysis - timing guarantees for hard real-time systems- Reinhard Wilhelm Saarland University Saarbrücken

Timing Analysis - timing guarantees for hard real-time systems- Reinhard Wilhelm Saarland University Saarbrücken Timing Analysis - timing guarantees for hard real-time systems- Reinhard Wilhelm Saarland University Saarbrücken Structure of the Lecture 1. Introduction 2. Static timing analysis 1. the problem 2. our

More information

Caches in Real-Time Systems. Instruction Cache vs. Data Cache

Caches in Real-Time Systems. Instruction Cache vs. Data Cache Caches in Real-Time Systems [Xavier Vera, Bjorn Lisper, Jingling Xue, Data Caches in Multitasking Hard Real- Time Systems, RTSS 2003.] Schedulability Analysis WCET Simple Platforms WCMP (memory performance)

More information

TOWARD PRECISE PLRU CACHE ANALYSIS 1 Daniel Grund 2 and Jan Reineke 3

TOWARD PRECISE PLRU CACHE ANALYSIS 1 Daniel Grund 2 and Jan Reineke 3 TOWARD PRECISE PLRU CACHE ANALYSIS 1 Daniel Grund 2 and Jan Reinee 3 Abstract Schedulability analysis for hard real-time systems requires bounds on the execution times of its tass. To obtain useful bounds

More information

Data-Flow Based Detection of Loop Bounds

Data-Flow Based Detection of Loop Bounds Data-Flow Based Detection of Loop Bounds Christoph Cullmann and Florian Martin AbsInt Angewandte Informatik GmbH Science Park 1, D-66123 Saarbrücken, Germany cullmann,florian@absint.com, http://www.absint.com

More information

Timing Predictability of Processors

Timing Predictability of Processors Timing Predictability of Processors Introduction Airbag opens ca. 100ms after the crash Controller has to trigger the inflation in 70ms Airbag useless if opened too early or too late Controller has to

More information

Timing Anomalies Reloaded

Timing Anomalies Reloaded Gernot Gebhard AbsInt Angewandte Informatik GmbH 1 of 20 WCET 2010 Brussels Belgium Timing Anomalies Reloaded Gernot Gebhard AbsInt Angewandte Informatik GmbH Brussels, 6 th July, 2010 Gernot Gebhard AbsInt

More information

Caches in Real-Time Systems. Instruction Cache vs. Data Cache

Caches in Real-Time Systems. Instruction Cache vs. Data Cache Caches in Real-Time Systems [Xavier Vera, Bjorn Lisper, Jingling Xue, Data Caches in Multitasking Hard Real- Time Systems, RTSS 2003.] Schedulability Analysis WCET Simple Platforms WCMP (memory performance)

More information

A Refining Cache Behavior Prediction using Cache Miss Paths

A Refining Cache Behavior Prediction using Cache Miss Paths A Refining Cache Behavior Prediction using Cache Miss Paths KARTIK NAGAR and Y N SRIKANT, Indian Institute of Science Worst Case Execution Time (WCET) is an important metric for programs running on real-time

More information

Predictability Considerations in the Design of Multi-Core Embedded Systems

Predictability Considerations in the Design of Multi-Core Embedded Systems Predictability Considerations in the Design of Multi-Core Embedded Systems Christoph Cullmann 1, Christian Ferdinand 1, Gernot Gebhard 1, Daniel Grund 2, Claire Maiza (Burguière) 2, Jan Reineke 3, Benoît

More information

Operating Systems. 09. Memory Management Part 1. Paul Krzyzanowski. Rutgers University. Spring 2015

Operating Systems. 09. Memory Management Part 1. Paul Krzyzanowski. Rutgers University. Spring 2015 Operating Systems 09. Memory Management Part 1 Paul Krzyzanowski Rutgers University Spring 2015 March 9, 2015 2014-2015 Paul Krzyzanowski 1 CPU Access to Memory The CPU reads instructions and reads/write

More information

Worst-Case Execution Time (WCET)

Worst-Case Execution Time (WCET) Introduction to Cache Analysis for Real-Time Systems [C. Ferdinand and R. Wilhelm, Efficient and Precise Cache Behavior Prediction for Real-Time Systems, Real-Time Systems, 17, 131-181, (1999)] Schedulability

More information

Timing Analysis of Embedded Software for Families of Microarchitectures

Timing Analysis of Embedded Software for Families of Microarchitectures Analysis of Embedded Software for Families of Microarchitectures Jan Reineke, UC Berkeley Edward A. Lee, UC Berkeley Representing Distributed Sense and Control Systems (DSCS) theme of MuSyC With thanks

More information

Static Memory and Timing Analysis of Embedded Systems Code

Static Memory and Timing Analysis of Embedded Systems Code Static Memory and Timing Analysis of Embedded Systems Code Christian Ferdinand Reinhold Heckmann Bärbel Franzen AbsInt Angewandte Informatik GmbH Science Park 1, D-66123 Saarbrücken, Germany Phone: +49-681-38360-0

More information

Worst-Case Execution Time (WCET)

Worst-Case Execution Time (WCET) Introduction to Cache Analysis for Real-Time Systems [C. Ferdinand and R. Wilhelm, Efficient and Precise Cache Behavior Prediction for Real-Time Systems, Real-Time Systems, 17, 131-181, (1999)] Schedulability

More information

Relational Domains for the Quantification of Cache Side Channels

Relational Domains for the Quantification of Cache Side Channels Saarland University Faculty of Natural Sciences and Technology I Department of Computer Science Master s Program in Computer Science Master s Thesis Relational Domains for the Quantification of Cache Side

More information

Course Outline. Processes CPU Scheduling Synchronization & Deadlock Memory Management File Systems & I/O Distributed Systems

Course Outline. Processes CPU Scheduling Synchronization & Deadlock Memory Management File Systems & I/O Distributed Systems Course Outline Processes CPU Scheduling Synchronization & Deadlock Memory Management File Systems & I/O Distributed Systems 1 Today: Memory Management Terminology Uniprogramming Multiprogramming Contiguous

More information

Design and Analysis of Time-Critical Systems Introduction

Design and Analysis of Time-Critical Systems Introduction Design and Analysis of Time-Critical Systems Introduction Jan Reineke @ saarland university ACACES Summer School 2017 Fiuggi, Italy computer science Structure of this Course 2. How are they implemented?

More information

Final Exam Preparation Questions

Final Exam Preparation Questions EECS 678 Spring 2013 Final Exam Preparation Questions 1 Chapter 6 1. What is a critical section? What are the three conditions to be ensured by any solution to the critical section problem? 2. The following

More information

Design and Analysis of Time-Critical Systems Timing Predictability and Analyzability + Case Studies: PTARM and Kalray MPPA-256

Design and Analysis of Time-Critical Systems Timing Predictability and Analyzability + Case Studies: PTARM and Kalray MPPA-256 Design and Analysis of Time-Critical Systems Timing Predictability and Analyzability + Case Studies: PTARM and Kalray MPPA-256 Jan Reineke @ saarland university computer science ACACES Summer School 2017

More information

Static WCET Analysis: Methods and Tools

Static WCET Analysis: Methods and Tools Static WCET Analysis: Methods and Tools Timo Lilja April 28, 2011 Timo Lilja () Static WCET Analysis: Methods and Tools April 28, 2011 1 / 23 1 Methods 2 Tools 3 Summary 4 References Timo Lilja () Static

More information

On the Use of Context Information for Precise Measurement-Based Execution Time Estimation

On the Use of Context Information for Precise Measurement-Based Execution Time Estimation On the Use of Context Information for Precise Measurement-Based Execution Time Estimation Stefan Stattelmann Florian Martin FZI Forschungszentrum Informatik Karlsruhe AbsInt Angewandte Informatik GmbH

More information

A Template for Predictability Definitions with Supporting Evidence

A Template for Predictability Definitions with Supporting Evidence A Template for Predictability Definitions with Supporting Evidence Daniel Grund 1, Jan Reineke 2, and Reinhard Wilhelm 1 1 Saarland University, Saarbrücken, Germany. grund@cs.uni-saarland.de 2 University

More information

Evaluating Static Worst-Case Execution-Time Analysis for a Commercial Real-Time Operating System

Evaluating Static Worst-Case Execution-Time Analysis for a Commercial Real-Time Operating System Evaluating Static Worst-Case Execution-Time Analysis for a Commercial Real-Time Operating System Daniel Sandell Master s thesis D-level, 20 credits Dept. of Computer Science Mälardalen University Supervisor:

More information

6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU

6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU 1-6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU Product Overview Introduction 1. ARCHITECTURE OVERVIEW The Cyrix 6x86 CPU is a leader in the sixth generation of high

More information

Lecture 17: Virtual Memory, Large Caches. Today: virtual memory, shared/pvt caches, NUCA caches

Lecture 17: Virtual Memory, Large Caches. Today: virtual memory, shared/pvt caches, NUCA caches Lecture 17: Virtual Memory, Large Caches Today: virtual memory, shared/pvt caches, NUCA caches 1 Virtual Memory Processes deal with virtual memory they have the illusion that a very large address space

More information

Classification of Code Annotations and Discussion of Compiler-Support for Worst-Case Execution Time Analysis

Classification of Code Annotations and Discussion of Compiler-Support for Worst-Case Execution Time Analysis Proceedings of the 5th Intl Workshop on Worst-Case Execution Time (WCET) Analysis Page 41 of 49 Classification of Code Annotations and Discussion of Compiler-Support for Worst-Case Execution Time Analysis

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

Automatic Qualification of Abstract Interpretation-based Static Analysis Tools. Christian Ferdinand, Daniel Kästner AbsInt GmbH 2013

Automatic Qualification of Abstract Interpretation-based Static Analysis Tools. Christian Ferdinand, Daniel Kästner AbsInt GmbH 2013 Automatic Qualification of Abstract Interpretation-based Static Analysis Tools Christian Ferdinand, Daniel Kästner AbsInt GmbH 2013 2 Functional Safety Demonstration of functional correctness Well-defined

More information

EE382V: System-on-a-Chip (SoC) Design

EE382V: System-on-a-Chip (SoC) Design EE382V: System-on-a-Chip (SoC) Design Lecture 5 Performance Analysis Sources: Prof. Jacob Abraham, UT Austin Prof. Lothar Thiele, ETH Zurich Prof. Reinhard Wilhelm, Saarland Univ. Andreas Gerstlauer Electrical

More information

Efficient Microarchitecture Modeling and Path Analysis for Real-Time Software. Yau-Tsun Steven Li Sharad Malik Andrew Wolfe

Efficient Microarchitecture Modeling and Path Analysis for Real-Time Software. Yau-Tsun Steven Li Sharad Malik Andrew Wolfe Efficient Microarchitecture Modeling and Path Analysis for Real-Time Software Yau-Tsun Steven Li Sharad Malik Andrew Wolfe 1 Introduction Paper examines the problem of determining the bound on the worst

More information

Lecture: Large Caches, Virtual Memory. Topics: cache innovations (Sections 2.4, B.4, B.5)

Lecture: Large Caches, Virtual Memory. Topics: cache innovations (Sections 2.4, B.4, B.5) Lecture: Large Caches, Virtual Memory Topics: cache innovations (Sections 2.4, B.4, B.5) 1 Intel Montecito Cache Two cores, each with a private 12 MB L3 cache and 1 MB L2 Naffziger et al., Journal of Solid-State

More information

Architectural Time-predictability Factor (ATF) to Measure Architectural Time Predictability

Architectural Time-predictability Factor (ATF) to Measure Architectural Time Predictability Architectural Time-predictability Factor (ATF) to Measure Architectural Time Predictability Yiqiang Ding, Wei Zhang Department of Electrical and Computer Engineering Virginia Commonwealth University Outline

More information

Memory hier ar hier ch ar y ch rev re i v e i w e ECE 154B Dmitri Struko Struk v o

Memory hier ar hier ch ar y ch rev re i v e i w e ECE 154B Dmitri Struko Struk v o Memory hierarchy review ECE 154B Dmitri Strukov Outline Cache motivation Cache basics Opteron example Cache performance Six basic optimizations Virtual memory Processor DRAM gap (latency) Four issue superscalar

More information

Evaluating compositional timing analyses

Evaluating compositional timing analyses Saarland University Faculty of Natural Sciences and Technology I Department of Computer Science Master thesis Evaluating compositional timing analyses submitted by Claus Michael Faymonville submitted September

More information

Chapter 10: Virtual Memory. Lesson 05: Translation Lookaside Buffers

Chapter 10: Virtual Memory. Lesson 05: Translation Lookaside Buffers Chapter 10: Virtual Memory Lesson 05: Translation Lookaside Buffers Objective Learn that a page table entry access increases the latency for a memory reference Understand that how use of translationlookaside-buffers

More information

Cache Performance Analysis with Callgrind and KCachegrind

Cache Performance Analysis with Callgrind and KCachegrind Cache Performance Analysis with Callgrind and KCachegrind VI-HPS Tuning Workshop 8 September 2011, Aachen Josef Weidendorfer Computer Architecture I-10, Department of Informatics Technische Universität

More information

ECE 571 Advanced Microprocessor-Based Design Lecture 13

ECE 571 Advanced Microprocessor-Based Design Lecture 13 ECE 571 Advanced Microprocessor-Based Design Lecture 13 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 21 March 2017 Announcements More on HW#6 When ask for reasons why cache

More information

Lecture 8: Virtual Memory. Today: DRAM innovations, virtual memory (Sections )

Lecture 8: Virtual Memory. Today: DRAM innovations, virtual memory (Sections ) Lecture 8: Virtual Memory Today: DRAM innovations, virtual memory (Sections 5.3-5.4) 1 DRAM Technology Trends Improvements in technology (smaller devices) DRAM capacities double every two years, but latency

More information

CS 153 Design of Operating Systems Winter 2016

CS 153 Design of Operating Systems Winter 2016 CS 153 Design of Operating Systems Winter 2016 Lecture 16: Memory Management and Paging Announcement Homework 2 is out To be posted on ilearn today Due in a week (the end of Feb 19 th ). 2 Recap: Fixed

More information

CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng.

CS 265. Computer Architecture. Wei Lu, Ph.D., P.Eng. CS 265 Computer Architecture Wei Lu, Ph.D., P.Eng. Part 4: Memory Organization Our goal: understand the basic types of memory in computer understand memory hierarchy and the general process to access memory

More information

MAKING DYNAMIC MEMORY ALLOCATION STATIC TO SUPPORT WCET ANALYSES 1 Jörg Herter 2 and Jan Reineke 2

MAKING DYNAMIC MEMORY ALLOCATION STATIC TO SUPPORT WCET ANALYSES 1 Jörg Herter 2 and Jan Reineke 2 MAKING DYNAMIC MEMORY ALLOCATION STATIC TO SUPPORT WCET ANALYSES 1 Jörg Herter 2 and Jan Reineke 2 Abstract Current worst-case execution time (WCET) analyses do not support programs using dynamic memory

More information

Timing Analysis of Parallel Software Using Abstract Execution

Timing Analysis of Parallel Software Using Abstract Execution Timing Analysis of Parallel Software Using Abstract Execution Björn Lisper School of Innovation, Design, and Engineering Mälardalen University bjorn.lisper@mdh.se 2014-09-10 EACO Workshop 2014 Motivation

More information

PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE

PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE PERFORMANCE ANALYSIS OF REAL-TIME EMBEDDED SOFTWARE Yau-Tsun Steven Li Monterey Design Systems, Inc. Sharad Malik Princeton University ~. " SPRINGER

More information

Virtual Memory: From Address Translation to Demand Paging

Virtual Memory: From Address Translation to Demand Paging Constructive Computer Architecture Virtual Memory: From Address Translation to Demand Paging Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology November 12, 2014

More information

The Heptane Static Worst-Case Execution Time Estimation Tool

The Heptane Static Worst-Case Execution Time Estimation Tool The Heptane Static Worst-Case Execution Time Estimation Tool Damien Hardy Benjamin Rouxel Isabelle Puaut damien.hardy@irisa.fr benjamin.rouxel@irisa.fr isabelle.puaut@irisa.fr Institut de Recherche en

More information

Survey results. CS 6354: Memory Hierarchy I. Variety in memory technologies. Processor/Memory Gap. SRAM approx. 4 6 transitors/bit optimized for speed

Survey results. CS 6354: Memory Hierarchy I. Variety in memory technologies. Processor/Memory Gap. SRAM approx. 4 6 transitors/bit optimized for speed Survey results CS 6354: Memory Hierarchy I 29 August 2016 1 2 Processor/Memory Gap Variety in memory technologies SRAM approx. 4 6 transitors/bit optimized for speed DRAM approx. 1 transitor + capacitor/bit

More information

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1 Virtual Memory Patterson & Hennessey Chapter 5 ELEC 5200/6200 1 Virtual Memory Use main memory as a cache for secondary (disk) storage Managed jointly by CPU hardware and the operating system (OS) Programs

More information

CS 152 Computer Architecture and Engineering. Lecture 11 - Virtual Memory and Caches

CS 152 Computer Architecture and Engineering. Lecture 11 - Virtual Memory and Caches CS 152 Computer Architecture and Engineering Lecture 11 - Virtual Memory and Caches Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

The Impact of Write Back on Cache Performance

The Impact of Write Back on Cache Performance The Impact of Write Back on Cache Performance Daniel Kroening and Silvia M. Mueller Computer Science Department Universitaet des Saarlandes, 66123 Saarbruecken, Germany email: kroening@handshake.de, smueller@cs.uni-sb.de,

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste!

More information

Quick Reference Card. Timing and Stack Verifiers Supported Platforms. SCADE Suite 6.3

Quick Reference Card. Timing and Stack Verifiers Supported Platforms. SCADE Suite 6.3 Timing and Stack Verifiers Supported Platforms SCADE Suite 6.3 About Timing and Stack Verifiers supported platforms Timing and Stack Verifiers Supported Platforms SCADE Suite Timing Verifier and SCADE

More information

Chapter 9 Memory Management

Chapter 9 Memory Management Contents 1. Introduction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threads 6. CPU Scheduling 7. Process Synchronization 8. Deadlocks 9. Memory Management 10. Virtual

More information

Timing Analysis - timing guarantees for hard real-time systems- Reinhard Wilhelm Saarland University Saarbrücken

Timing Analysis - timing guarantees for hard real-time systems- Reinhard Wilhelm Saarland University Saarbrücken Timing Analysis - timing guarantees for hard real-time systems- Reinhard Wilhelm Saarland University Saarbrücken Structure of the Lecture 1. 2. 1. 2. 3. 4. 5. 6. Introduction Static timing analysis the

More information

CS 6354: Memory Hierarchy I. 29 August 2016

CS 6354: Memory Hierarchy I. 29 August 2016 1 CS 6354: Memory Hierarchy I 29 August 2016 Survey results 2 Processor/Memory Gap Figure 2.2 Starting with 1980 performance as a baseline, the gap in performance, measured as the difference in the time

More information

Question 1 (5 points) Consider a cache with the following specifications Address space is 1024 words. The memory is word addressable The size of the

Question 1 (5 points) Consider a cache with the following specifications Address space is 1024 words. The memory is word addressable The size of the Question 1 (5 points) Consider a cache with the following specifications Address space is 1024 words. he memory is word addressable he size of the cache is 8 blocks; each block is 4 words (32 words cache).

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2016 Lecture 33 Virtual Memory Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 FAQ How does the virtual

More information

Cache Controller with Enhanced Features using Verilog HDL

Cache Controller with Enhanced Features using Verilog HDL Cache Controller with Enhanced Features using Verilog HDL Prof. V. B. Baru 1, Sweety Pinjani 2 Assistant Professor, Dept. of ECE, Sinhgad College of Engineering, Vadgaon (BK), Pune, India 1 PG Student

More information

CS/ECE 552: Introduction to Computer Architecture

CS/ECE 552: Introduction to Computer Architecture CS/ECE 552: Introduction to Computer Architecture Prof. David A. Wood Final Exam May 9, 2010 10:05am-12:05pm, 2241 Chamberlin Approximate Weight: 25% CLOSED BOOK TWO SHEETS OF NOTES NAME: DO NOT OPEN THE

More information

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University Address Translation Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics How to reduce the size of page tables? How to reduce the time for

More information

Operating Systems Design Exam 2 Review: Spring 2011

Operating Systems Design Exam 2 Review: Spring 2011 Operating Systems Design Exam 2 Review: Spring 2011 Paul Krzyzanowski pxk@cs.rutgers.edu 1 Question 1 CPU utilization tends to be lower when: a. There are more processes in memory. b. There are fewer processes

More information

Lecture 21. Monday, February 28 CS 470 Operating Systems - Lecture 21 1

Lecture 21. Monday, February 28 CS 470 Operating Systems - Lecture 21 1 Lecture 21 Case study guidelines posted Case study assignments Third project (virtual memory simulation) will go out after spring break. Extra credit fourth project (shell program) will go out after that.

More information

Virtual Memory, Address Translation

Virtual Memory, Address Translation Memory Hierarchy Virtual Memory, Address Translation Slides contents from: Hennessy & Patterson, 5ed Appendix B and Chapter 2 David Wentzlaff, ELE 475 Computer Architecture MJT, High Performance Computing,

More information

Parametric Timing Analysis for Complex Architectures

Parametric Timing Analysis for Complex Architectures Parametric Timing Analysis for Complex Architectures Sebastian Altmeyer Department of Computer Science Saarland University altmeyer@cs.uni-sb.de Björn Lisper Department of Computer Science and Electronics

More information

CS 416: Opera-ng Systems Design March 23, 2012

CS 416: Opera-ng Systems Design March 23, 2012 Question 1 Operating Systems Design Exam 2 Review: Spring 2011 Paul Krzyzanowski pxk@cs.rutgers.edu CPU utilization tends to be lower when: a. There are more processes in memory. b. There are fewer processes

More information

Memory Hierarchy. Goal: Fast, unlimited storage at a reasonable cost per bit.

Memory Hierarchy. Goal: Fast, unlimited storage at a reasonable cost per bit. Memory Hierarchy Goal: Fast, unlimited storage at a reasonable cost per bit. Recall the von Neumann bottleneck - single, relatively slow path between the CPU and main memory. Fast: When you need something

More information

CHAPTER 4 MEMORY HIERARCHIES TYPICAL MEMORY HIERARCHY TYPICAL MEMORY HIERARCHY: THE PYRAMID CACHE PERFORMANCE MEMORY HIERARCHIES CACHE DESIGN

CHAPTER 4 MEMORY HIERARCHIES TYPICAL MEMORY HIERARCHY TYPICAL MEMORY HIERARCHY: THE PYRAMID CACHE PERFORMANCE MEMORY HIERARCHIES CACHE DESIGN CHAPTER 4 TYPICAL MEMORY HIERARCHY MEMORY HIERARCHIES MEMORY HIERARCHIES CACHE DESIGN TECHNIQUES TO IMPROVE CACHE PERFORMANCE VIRTUAL MEMORY SUPPORT PRINCIPLE OF LOCALITY: A PROGRAM ACCESSES A RELATIVELY

More information

CSE Computer Architecture I Fall 2011 Homework 07 Memory Hierarchies Assigned: November 8, 2011, Due: November 22, 2011, Total Points: 100

CSE Computer Architecture I Fall 2011 Homework 07 Memory Hierarchies Assigned: November 8, 2011, Due: November 22, 2011, Total Points: 100 CSE 30321 Computer Architecture I Fall 2011 Homework 07 Memory Hierarchies Assigned: November 8, 2011, Due: November 22, 2011, Total Points: 100 Problem 1: (30 points) Background: One possible organization

More information

Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Highly-Associative Caches

Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Highly-Associative Caches Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging 6.823, L8--1 Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Highly-Associative

More information

CISC 7310X. C08: Virtual Memory. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 3/22/2018 CUNY Brooklyn College

CISC 7310X. C08: Virtual Memory. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 3/22/2018 CUNY Brooklyn College CISC 7310X C08: Virtual Memory Hui Chen Department of Computer & Information Science CUNY Brooklyn College 3/22/2018 CUNY Brooklyn College 1 Outline Concepts of virtual address space, paging, virtual page,

More information

CS152 Computer Architecture and Engineering

CS152 Computer Architecture and Engineering CS152 Computer Architecture and Engineering Caches and the Memory Hierarchy Assigned 9/17/2016 Problem Set #2 Due Tue, Oct 4 http://inst.eecs.berkeley.edu/~cs152/fa16 The problem sets are intended to help

More information

Cache-Oblivious Algorithms A Unified Approach to Hierarchical Memory Algorithms

Cache-Oblivious Algorithms A Unified Approach to Hierarchical Memory Algorithms Cache-Oblivious Algorithms A Unified Approach to Hierarchical Memory Algorithms Aarhus University Cache-Oblivious Current Trends Algorithms in Algorithms, - A Unified Complexity Approach to Theory, Hierarchical

More information

Reduced Instruction Set Computer

Reduced Instruction Set Computer Reduced Instruction Set Computer RISC - Reduced Instruction Set Computer By reducing the number of instructions that a processor supports and thereby reducing the complexity of the chip, it is possible

More information

KeyStone II. CorePac Overview

KeyStone II. CorePac Overview KeyStone II ARM Cortex A15 CorePac Overview ARM A15 CorePac in KeyStone II Standard ARM Cortex A15 MPCore processor Cortex A15 MPCore version r2p2 Quad core, dual core, and single core variants 4096kB

More information

CSE 120. Translation Lookaside Buffer (TLB) Implemented in Hardware. July 18, Day 5 Memory. Instructor: Neil Rhodes. Software TLB Management

CSE 120. Translation Lookaside Buffer (TLB) Implemented in Hardware. July 18, Day 5 Memory. Instructor: Neil Rhodes. Software TLB Management CSE 120 July 18, 2006 Day 5 Memory Instructor: Neil Rhodes Translation Lookaside Buffer (TLB) Implemented in Hardware Cache to map virtual page numbers to page frame Associative memory: HW looks up in

More information

CHETTINAD COLLEGE OF ENGINEERING AND TECHNOLOGY COMPUTER ARCHITECURE- III YEAR EEE-6 TH SEMESTER 16 MARKS QUESTION BANK UNIT-1

CHETTINAD COLLEGE OF ENGINEERING AND TECHNOLOGY COMPUTER ARCHITECURE- III YEAR EEE-6 TH SEMESTER 16 MARKS QUESTION BANK UNIT-1 CHETTINAD COLLEGE OF ENGINEERING AND TECHNOLOGY COMPUTER ARCHITECURE- III YEAR EEE-6 TH SEMESTER 16 MARKS QUESTION BANK UNIT-1 Data representation: (CHAPTER-3) 1. Discuss in brief about Data types, (8marks)

More information

CacheAudit: A Tool for the Static Analysis of Cache Side Channels

CacheAudit: A Tool for the Static Analysis of Cache Side Channels CacheAudit: A Tool for the Static Analysis of Cache Side Channels Goran Doychev, IMDEA Software Institute; Dominik Feld, Saarland University; Boris Köpf and Laurent Mauborgne, IMDEA Software Institute;

More information

CS307 Operating Systems Main Memory

CS307 Operating Systems Main Memory CS307 Main Memory Fan Wu Department of Computer Science and Engineering Shanghai Jiao Tong University Spring 2018 Background Program must be brought (from disk) into memory and placed within a process

More information

Itanium 2 Processor Microarchitecture Overview

Itanium 2 Processor Microarchitecture Overview Itanium 2 Processor Microarchitecture Overview Don Soltis, Mark Gibson Cameron McNairy, August 2002 Block Diagram F 16KB L1 I-cache Instr 2 Instr 1 Instr 0 M/A M/A M/A M/A I/A Template I/A B B 2 FMACs

More information

Design and Functional Verification of Four Way Set Associative Cache Controller

Design and Functional Verification of Four Way Set Associative Cache Controller International Journal of Research in Computer and Communication Technology, Vol 4, Issue 3, March -2015 ISSN (Online) 2278-5841 ISSN (Print) 2320-5156 Design and Functional Verification of Four Way Set

More information

CS152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Spring Caches and the Memory Hierarchy

CS152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Spring Caches and the Memory Hierarchy CS152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Spring 2019 Caches and the Memory Hierarchy Assigned February 13 Problem Set #2 Due Wed, February 27 http://inst.eecs.berkeley.edu/~cs152/sp19

More information

Modeling and Simulating Discrete Event Systems in Metropolis

Modeling and Simulating Discrete Event Systems in Metropolis Modeling and Simulating Discrete Event Systems in Metropolis Guang Yang EECS 290N Report December 15, 2004 University of California at Berkeley Berkeley, CA, 94720, USA guyang@eecs.berkeley.edu Abstract

More information

Static Analysis. Systems and Internet Infrastructure Security

Static Analysis. Systems and Internet Infrastructure Security Systems and Internet Infrastructure Security Network and Security Research Center Department of Computer Science and Engineering Pennsylvania State University, University Park PA Static Analysis Trent

More information

Processing Unit CS206T

Processing Unit CS206T Processing Unit CS206T Microprocessors The density of elements on processor chips continued to rise More and more elements were placed on each chip so that fewer and fewer chips were needed to construct

More information

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance 6.823, L11--1 Cache Performance and Memory Management: From Absolute Addresses to Demand Paging Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Cache Performance 6.823,

More information

ECE 411 Exam 1 Practice Problems

ECE 411 Exam 1 Practice Problems ECE 411 Exam 1 Practice Problems Topics Single-Cycle vs Multi-Cycle ISA Tradeoffs Performance Memory Hierarchy Caches (including interactions with VM) 1.) Suppose a single cycle design uses a clock period

More information

Tutorial for Chapter 3, 4

Tutorial for Chapter 3, 4 Eastern Mediterranean University School of Computing and Technology ITEC2 Computer Organization & Architecture Tutorial for Chapter 3, 4 Number Systems Binary Number Systems o Base = 2 o A single bit can

More information

Chapter 8 Memory Management

Chapter 8 Memory Management Chapter 8 Memory Management Da-Wei Chang CSIE.NCKU Source: Abraham Silberschatz, Peter B. Galvin, and Greg Gagne, "Operating System Concepts", 9th Edition, Wiley. 1 Outline Background Swapping Contiguous

More information

Lecture 12: Cleaning up CFGs and Chomsky Normal

Lecture 12: Cleaning up CFGs and Chomsky Normal CS 373: Theory of Computation Sariel Har-Peled and Madhusudan Parthasarathy Lecture 12: Cleaning up CFGs and Chomsky Normal m 3 March 2009 In this lecture, we are interested in transming a given grammar

More information

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 13 Virtual memory and memory management unit In the last class, we had discussed

More information