HEPTANE (Hades Embedded Processor Timing ANalyzEr): A Modular Tool for Static WCET Analysis

1 HEPTANE (Hades Embedded Processor Timing ANalyzEr): A Modular Tool for Static WCET Analysis. Damien Hardy, ALF research group, IRISA/INRIA Rennes. TuTor 16, Vienna, April 2016.

2 About HEPTANE. Developed at IRISA/University of Rennes. Stable research prototype, licensed under the GPL, written in C++. Supported architectures: MIPS; partially: ARM v7, Patmos (under construction); no longer supported: PowerPC. The first version was designed around 2000, with many updates since then. Available online: https://team.inria.fr/alf/software/heptane/

3 Developers over the ages: Isabelle Puaut, Antoine Colin, Jean-François Deverge, Thomas Piquet, Damien Hardy, Benjamin Lesage, Erven Rohou, François Joulaud, Matthieu Avila, Nicolas Kiss, Loïc Besnard.

4 Research based on Heptane: cache analysis (cache hierarchies, multi-cores); branch prediction (analysis, compiler-directed); cache locking and partitioning; scratchpad management; CRPD estimation; static probabilistic WCET; traceability of flow information.

5 Outline. Quick overview: framework, installation. How to use it: HeptaneExtract; HeptaneAnalysis (low-level analysis, high-level analysis, extra analysis). Inside Heptane, build your analysis: source, cfglib, useful functions.

6 Heptane framework. CFG library: extensible program representation (program, CFG, loop, basic block, instruction). HeptaneExtract: CFG building from the binary, loop extraction. HeptaneAnalysis: multiple analysis steps, interface to define new analyses. [Diagram: Analysis 1, Analysis 2, and Analysis 3 (based on 1 and 2) built on top of the CFG library.]

7 Heptane toolchain. [Diagram: .c file with loop bounds -> compiler (gcc/llvm) -> binary -> CFG extraction -> XML file -> HeptaneAnalysis (low-level analysis, high-level analysis) -> WCET, WCEP, statistics, HTML. Example CFG: function Multiply, with loops annotated maxiter=20.]

8 Installation (1/2). Requirements: Linux or Mac OS X; libxml2; an ILP solver (lp_solve, CPLEX). Optional: Doxygen, dot, epstopdf.

9 Installation (2/2). For the tutorial: virtual machine at https://tutor2016.inria.fr/heptane. Otherwise: script install.sh (system dependent, part to install).

10 Outline. Quick overview: framework, installation. How to use it: HeptaneExtract; HeptaneAnalysis (low-level analysis, high-level analysis, extra analysis). Inside Heptane, build your analysis: source, cfglib, useful functions.

11 How to use it. Source code in C, with an annotation for the loop bounds: ANNOT_MAXITER(cst) (include needed). ANNOT_MAXITER is required for each loop. Some restrictions: no pointers, no malloc, natural loops only, no recursion.
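
As an illustration of these rules, here is a minimal sketch of an annotated source file. The macro definition below is a no-op stand-in so that the example compiles on its own; Heptane supplies the real ANNOT_MAXITER through its own header. The placement of the annotation at the top of each loop body, the function name multiply, and the array names are assumptions made for this example.

```cpp
// Stand-in definition so the sketch builds without Heptane's annotation header.
// Assumption: the annotation is written as the first statement of the loop body.
#ifndef ANNOT_MAXITER
#define ANNOT_MAXITER(n)
#endif

const int N = 20;

// Global, statically sized arrays: no pointers and no malloc.
int A[N][N], B[N][N], C[N][N];

// Matrix product C = A * B with three natural loops, each carrying its bound.
// No recursion is used.
void multiply() {
  for (int i = 0; i < N; i++) {
    ANNOT_MAXITER(20);
    for (int j = 0; j < N; j++) {
      ANNOT_MAXITER(20);
      C[i][j] = 0;
      for (int k = 0; k < N; k++) {
        ANNOT_MAXITER(20);
        C[i][j] += A[i][k] * B[k][j];
      }
    }
  }
}
```

Each bound is the maximum number of iterations of that loop per entry into it, which is the information the IPET step later turns into loop constraints.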

12 How to use it: HeptaneExtract. [Diagram: .c file with loop bounds -> compiler (gcc/llvm) -> binary -> HeptaneExtract (CFG extraction) -> CFG (.xml), plus optional files (binary, objdump). HeptaneExtract reads a configuration file (.xml).]

13 HeptaneExtract: configuration file. [Screenshot callouts: target, compiler configuration, binutils configuration.]

14 HeptaneExtract: configuration file. [Screenshot callouts: in/tmp/out directories, output files and information.]

15 HeptaneExtract: configuration file. [Screenshot callout: program to extract.]

16 HeptaneExtract: good news! Automatic configuration (for simple cases). In the directory benchmarks: configextract_template.xml and extract.sh. Usage: ./extract.sh <benchmark_name>

17 CFG.xml

18 How to use it: HeptaneAnalysis. [Diagram: CFG (.xml) and a configuration file (.xml) -> HeptaneAnalysis (low-level analysis, high-level analysis) -> WCET, WCEP, statistics, HTML.]

19 HeptaneAnalysis: configuration file. [Screenshot callouts: directory of the program; architecture information; LRU/PLRU/FIFO/MRU/RANDOM.]

20 HeptaneAnalysis: configuration file

21 HeptaneAnalysis: good news! Automatic configuration. In the directory config_files: configwcet_template.xml and analysis.sh. Usage: ./analysis.sh <benchmark_name>, or: ./bin/HeptaneAnalysis <configwcet.xml>

22 HeptaneAnalysis: analyses. Each analysis can be seen as a pass, and passes are performed in the order of their declaration in the configuration file. Each analysis extends the XML file (CFG): information is added as attributes on Program, CFG, Node, Loop, Edge, and Instruction. Three steps per analysis: CheckInputAttributes, PerformAnalysis, RemovePrivateAttributes.
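
The pass discipline can be sketched as follows. Only the three step names come from the slide; the Program stand-in (a bare set of attribute names), the ToyPass class, the "tmp." prefix for private attributes, and the runAll driver are hypothetical and do not reproduce Heptane's actual classes.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the attributed CFG: just the attribute names present.
struct Program {
  std::set<std::string> attributes;
};

// The three steps named on the slide, as an abstract interface.
struct Analysis {
  virtual ~Analysis() = default;
  virtual bool CheckInputAttributes(const Program& p) = 0;
  virtual void PerformAnalysis(Program& p) = 0;
  virtual void RemovePrivateAttributes(Program& p) = 0;
};

// Toy pass: needs some input attributes, publishes one output attribute,
// and uses one private scratch attribute while it runs.
struct ToyPass : Analysis {
  std::vector<std::string> inputs;
  std::string output;
  ToyPass(std::vector<std::string> in, std::string out)
      : inputs(std::move(in)), output(std::move(out)) {}

  bool CheckInputAttributes(const Program& p) override {
    for (const auto& name : inputs)
      if (p.attributes.count(name) == 0) return false;
    return true;
  }
  void PerformAnalysis(Program& p) override {
    p.attributes.insert("tmp." + output);  // private working data
    p.attributes.insert(output);           // result visible to later passes
  }
  void RemovePrivateAttributes(Program& p) override {
    p.attributes.erase("tmp." + output);
  }
};

// Runs the passes in declaration order; stops at the first pass whose
// required input attributes are missing.
bool runAll(Program& p, const std::vector<std::unique_ptr<Analysis>>& passes) {
  for (const auto& a : passes) {
    if (!a->CheckInputAttributes(p)) return false;
    a->PerformAnalysis(p);
    a->RemovePrivateAttributes(p);
  }
  return true;
}
```

Declaring a cache pass that consumes "address" before an IPET pass that consumes "chmc" succeeds; swapping the two makes the first check fail, which is why declaration order matters.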

23 HeptaneAnalysis: common parameters. [Screenshot callouts: keep results in memory, input file, output file.]

24 Analyses. Low-level analysis (timing information per basic block): address analysis, cache analysis, pipeline analysis. High-level analysis (WCET estimation): Implicit Path Enumeration Technique (IPET); no tree-based method anymore. Extra analysis: statistics and some information.

25 Address analysis (1/2). Instruction address analysis: performed during the CFG extraction (OUTPUTCODEADDRESS). Data address analysis: contextual analysis (calling context), performed at basic-block granularity; pointers not supported (currently). [Screenshot callout: initial stack pointer address.]

26 Address analysis (2/2). For each function: stack information per context. For each access.

27 Cache analysis. Instruction and data caches. Contextual analysis (calling context). Based on abstract interpretation (must, may, persistence). Multi-level non-inclusive cache hierarchies. Write-through, no-write-allocate data caches. Input attributes: instruction and data addresses.

28 Cache analysis: inside. Computation of Abstract Cache States (ACS): an ACS contains all possible cache contents considering all possible execution paths. Three analyses (fixpoint computation): must, may, and persistence. Modification of the ACS: update at every reference, join at every path merging point. Access categorization from the ACS, the Cache Hit/Miss Classification (CHMC): always hit, always miss, first miss, not classified.
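
How the three analyses feed the CHMC can be sketched as a small decision function. The mapping below is the one usually given in the abstract-interpretation literature on cache analysis; the order in which the tests are applied, and every name in the code, are assumptions of this sketch rather than Heptane's implementation.

```cpp
enum class CHMC { AlwaysHit, AlwaysMiss, FirstMiss, NotClassified };

// inMust:        the referenced block is in the must ACS just before the access
//                (guaranteed to be cached).
// inMay:         the block is in the may ACS (possibly cached).
// inPersistence: the persistence analysis shows the block cannot be evicted
//                once loaded, so only its first access can miss.
CHMC classify(bool inMust, bool inMay, bool inPersistence) {
  if (inMust) return CHMC::AlwaysHit;
  if (!inMay) return CHMC::AlwaysMiss;   // cannot be in the cache on any path
  if (inPersistence) return CHMC::FirstMiss;
  return CHMC::NotClassified;            // nothing can be guaranteed
}
```

A not-classified access has to be treated conservatively by the later timing steps, which is why tighter must and persistence results translate directly into tighter WCET estimates.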

29 Must analysis. The ACS contains all program lines guaranteed to be in the cache at a given point. Update: apply the replacement policy (e.g. LRU); accessing e in the ACS [a, b, c, d] (listed by increasing age) gives [e, a, b, c]. Join: intersection plus maximal age; joining [a, b, c, d] with [b, e, d, a] gives [{}, {b}, {}, {d, a}].
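
The update and join functions of the must analysis can be written compactly for one cache set under LRU. This is a sketch under stated assumptions: the ACS is represented as a map from block name to an upper bound on its age, and the type and function names are hypothetical.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Must ACS of one cache set: block -> upper bound on its LRU age.
// Age 0 is the most recently used position; a block whose age would reach
// `ways` is no longer guaranteed to be cached and is dropped.
using MustACS = std::map<std::string, int>;

// Update (LRU): the accessed block moves to age 0, and every block that was
// younger than it ages by one position.
MustACS update(const MustACS& in, const std::string& block, int ways) {
  auto it = in.find(block);
  const int oldAge = (it == in.end()) ? ways : it->second;  // absent: older than all
  MustACS out;
  for (const auto& kv : in) {
    if (kv.first == block) continue;
    const int newAge = (kv.second < oldAge) ? kv.second + 1 : kv.second;
    if (newAge < ways) out[kv.first] = newAge;
  }
  out[block] = 0;
  return out;
}

// Join: keep only blocks present on both paths, each with its maximal age.
MustACS join(const MustACS& x, const MustACS& y) {
  MustACS out;
  for (const auto& kv : x) {
    auto it = y.find(kv.first);
    if (it != y.end()) out[kv.first] = std::max(kv.second, it->second);
  }
  return out;
}
```

With a 4-way set holding a, b, c, d at ages 0 to 3, accessing e yields e, a, b, c at ages 0 to 3, and joining that initial state with the state b, e, d, a keeps only b at age 1 and a, d at age 3.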

30 Cache analysis: multi-level. Filtering information between cache levels, the Cache Access Classification (CAC): Always, Never, Uncertain. Modification of the update function: Always: the ACS is updated as before; Never: no modification of the ACS; Uncertain: the output ACS is the join of the updated ACS (access performed) and the unchanged input ACS (access not performed). [Diagram: memory references -> cache analysis at level L -> cache hit/miss classification -> filtering -> cache access classification -> cache analysis at level L+1.]

31 Cache analysis: output

32 Pipeline analysis. Computes the timing of basic blocks. Contextual analysis (calling context). Input attributes: CHMC of the instruction cache (I$) and data cache (D$).

33 Pipeline analysis: inside. Five stages: Fetch (IF), Decode (ID), Execute (EX), Memory (ME), Write Back (WB). Principle: parallelism between instructions, both intra-basic-block and inter-basic-block. [Figure: pipeline occupancy diagrams over time.]
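
To make the overlap between instructions concrete, here is a minimal timing sketch for a scalar in-order five-stage pipeline. It is not Heptane's pipeline model: it assumes each stage holds one instruction at a time, ignores data hazards and back-pressure between stages, and takes the per-stage latencies as given (for example, a larger IF latency for a fetch classified as a miss). All names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <vector>

constexpr int kStages = 5;                   // IF, ID, EX, ME, WB
using Latencies = std::array<int, kStages>;  // cycles one instruction spends per stage

// Returns the cycle at which the last instruction of the block leaves WB.
int blockTime(const std::vector<Latencies>& instrs) {
  std::array<int, kStages> stageFree{};  // cycle at which each stage becomes free
  for (const auto& lat : instrs) {
    int ready = 0;  // cycle at which this instruction finished the previous stage
    for (int s = 0; s < kStages; ++s) {
      const int start = std::max(ready, stageFree[s]);
      ready = start + lat[s];
      stageFree[s] = ready;
    }
  }
  return stageFree[kStages - 1];
}
```

Three single-cycle instructions take 7 cycles rather than 15, which is the intra-basic-block parallelism; carrying the stageFree array from one block into its successor would model the inter-basic-block overlap in the same way.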

34 Analyses. Low-level analysis (timing information per basic block): address analysis, cache analysis, pipeline analysis. High-level analysis (WCET estimation): Implicit Path Enumeration Technique (IPET); no tree-based method anymore. Extra analysis: statistics and some information.

35 IPET analysis. Final WCET computation. Contextual analysis (calling context). Input attributes: CHMC of I$ and D$, or timing of basic blocks. Methods: METHOD_CACHE_NOPIPELINE, METHOD_CACHE_PIPELINE, METHOD_NOCACHE_NOPIPELINE. Solver: lp_solve or CPLEX.

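36 IPET: inside. Integer Linear Programming (ILP) on a CFG whose basic blocks have timings $T_1, \dots, T_7$. Constants: $T_i$ (timing of basic block $bb_i$). Variables: $f_i$ (execution count of $bb_i$) and $a_j$ (execution count of edge $j$).
Objective function: $\max \; f_1 T_1 + f_2 T_2 + \dots + f_n T_n$
Structural constraints: $f_1 = 1$ and, for each $bb_i$, $f_i = \sum_{a_j \in In(bb_i)} a_j = \sum_{a_k \in Out(bb_i)} a_k$
Extra flow information: $f_i \le k$ for $bb_i$ in a loop (loop bound); $f_i + f_j \le 1$ (mutually exclusive paths not in a loop); $f_i \le 2 f_j$ (relations between execution frequencies).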
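
The formulation can be made concrete by emitting the ILP as text for a small CFG. The sketch below writes the problem in the style of lp_solve's LP file format; the Cfg structure, the toLp function, and the simplified loop bound of the form f_i <= k (which assumes the loop is entered once) are assumptions of this example, not Heptane's code.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Edge { int from, to; };
struct Bound { int block, k; };  // f_block <= k

struct Cfg {
  std::vector<int> time;      // T_i: timing of basic block i, in cycles
  std::vector<Edge> edges;    // edge j has execution count a_j
  int entry = 0, exit = 0;    // entry has no predecessor, exit no successor
  std::vector<Bound> bounds;  // loop bounds
};

// Sum of edge variables, e.g. "a0 + a2", or "0" if there is none.
static std::string sum(const std::vector<int>& ids) {
  if (ids.empty()) return "0";
  std::string s;
  for (size_t n = 0; n < ids.size(); ++n) {
    if (n) s += " + ";
    s += "a" + std::to_string(ids[n]);
  }
  return s;
}

std::string toLp(const Cfg& g) {
  std::ostringstream lp;
  lp << "max:";  // objective: sum of f_i * T_i
  for (size_t i = 0; i < g.time.size(); ++i)
    lp << (i ? " + " : " ") << g.time[i] << " f" << i;
  lp << ";\n";
  lp << "f" << g.entry << " = 1;\n";  // the entry block runs exactly once
  for (size_t i = 0; i < g.time.size(); ++i) {
    std::vector<int> in, out;
    for (size_t j = 0; j < g.edges.size(); ++j) {
      if (g.edges[j].to == static_cast<int>(i)) in.push_back(static_cast<int>(j));
      if (g.edges[j].from == static_cast<int>(i)) out.push_back(static_cast<int>(j));
    }
    // Flow conservation: f_i equals both its incoming and its outgoing flow.
    if (static_cast<int>(i) != g.entry) lp << "f" << i << " = " << sum(in) << ";\n";
    if (static_cast<int>(i) != g.exit) lp << "f" << i << " = " << sum(out) << ";\n";
  }
  for (const auto& b : g.bounds) lp << "f" << b.block << " <= " << b.k << ";\n";
  lp << "int";  // all variables are integers
  for (size_t i = 0; i < g.time.size(); ++i) lp << (i ? "," : " ") << "f" << i;
  for (size_t j = 0; j < g.edges.size(); ++j) lp << ",a" << j;
  lp << ";\n";
  return lp.str();
}
```

For a four-block CFG (entry, loop header, loop body, exit) with timings 5, 2, 10, 1, edges 0->1, 1->2, 2->1, 1->3 and the body bounded by 20, the constraints force the header to run 21 times, so the maximum is 5 + 21*2 + 20*10 + 1 = 248 cycles.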

37 Analyses. Low-level analysis (timing information per basic block): address analysis, cache analysis, pipeline analysis. High-level analysis (WCET estimation): Implicit Path Enumeration Technique (IPET); no tree-based method anymore. Extra analysis: statistics and some information.

38 Extra analysis: statistics (1/2). Cache statistics. Contextual analysis (calling context). Input attributes: CHMC of caches and WCET computation.

39 Extra analysis: statistics (2/2). Codeline and htmlprint. Input attributes: WCET computation.

40 Extra analysis: some information (1/2). Dotprint. [Figure: CFG of function Multiply, with loops annotated maxiter=20.]

41 Extra analysis: some information (2/2). Simpleprint.

42 For the afternoon session. Heptane utilization: HeptaneExtract, HeptaneAnalysis. Build your own analysis, the percentage of load/store instructions in a program: (1) number of instructions, (2) number of load/store instructions, (3) contextual analysis (calling context).

43 Build your analysis. Inheritance from Analysis.h: CheckInputAttributes, PerformAnalysis, RemovePrivateAttributes. Generic/config.cc: parsing of the parameters, instantiation of the analysis. Good news: DummyAnalysis, in Heptane/src/HeptaneAnalysis/src/Specific/DummyAnalysis/, enabled with <DUMMYANALYSIS/> in the configuration file.
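
The core computation of the load/store exercise (steps 1 and 2, without calling contexts) might look like the sketch below. Instruction, Node, and Program here are small hypothetical stand-ins, not the cfglib classes, and the list of MIPS mnemonics is deliberately not exhaustive.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the program representation.
struct Instruction { std::string mnemonic; };
struct Node { std::vector<Instruction> instructions; };  // one basic block
struct Program { std::vector<Node> nodes; };

// A few common MIPS load/store mnemonics (assumption: not a complete list).
bool isLoadStore(const std::string& m) {
  static const char* const kOps[] = {"lw", "lh", "lhu", "lb", "lbu",
                                     "sw", "sh", "sb"};
  for (const char* op : kOps)
    if (m == op) return true;
  return false;
}

struct LoadStoreStats {
  int instructions = 0;
  int loadStores = 0;
  double percentage() const {
    return instructions ? 100.0 * loadStores / instructions : 0.0;
  }
};

// Steps 1 and 2: count all instructions and the load/store ones.
LoadStoreStats countLoadStores(const Program& p) {
  LoadStoreStats s;
  for (const auto& node : p.nodes)
    for (const auto& ins : node.instructions) {
      ++s.instructions;
      if (isLoadStore(ins.mnemonic)) ++s.loadStores;
    }
  return s;
}
```

In an actual analysis this counting would sit in PerformAnalysis, and step 3 would repeat it once per calling context of each function instead of once per function.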

44 Inside Heptane: source. Common: cfglib, ArchitectureDependent, GlobalAttributes, Utl. HeptaneExtract. HeptaneAnalysis: Generic (context, AnalysisHelper, Analysis.h); SharedAttributes (names of serialisable attributes, SharedAttributes.h); Specific (a directory for each analysis).

45 CFG library (1/3). Hierarchical structure: Program, CFG, Node, Edge, Loop, Instruction. Functions to manipulate the program representation: GetProgram, GetAllNodes/GetAllLoops, GetPredecessors/GetSuccessors. Generated documentation: /home/user/desktop/heptane/src/cfglib_install/doc/generated-doc/html/index.html

46 CFG library (2/3). Information is added with attributes on Program, CFG, Node, Loop, Edge, and Instruction. Attributes are serialisable or non-serialisable: [Non]Serialisable[Type]Attribute, with Type one of Integer, String, Float, UnsignedLong. New attributes: inherit from NonSerialisableAttribute or SerialisableAttribute. A SerialisableAttribute for Heptane needs the functions WriteXml and ReadXml, and a declaration of the attribute in main.cc (function main of HeptaneAnalysis).

47 CFG library (3/3). Generic functions (Attributed.h): bool HasAttribute(std::string AttributeName); void SetAttribute(std::string AttributeName, Attribute &attribute); void RemoveAttribute(std::string AttributeName); Attribute &GetAttribute(std::string AttributeName);
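
To show how such an attribute interface is typically used, here is a self-contained mock with the same four function names. It is not the cfglib implementation: the storage (a map of owned copies), the clone function, the const-reference parameters, and the IntegerAttribute class are assumptions made so the example can run on its own.

```cpp
#include <map>
#include <memory>
#include <string>

// Mock attribute hierarchy (hypothetical, for illustration only).
struct Attribute {
  virtual ~Attribute() = default;
  virtual std::unique_ptr<Attribute> clone() const = 0;
};

struct IntegerAttribute : Attribute {
  long value;
  explicit IntegerAttribute(long v) : value(v) {}
  std::unique_ptr<Attribute> clone() const override {
    return std::unique_ptr<Attribute>(new IntegerAttribute(*this));
  }
};

// Mock of an attributed object such as a node or an instruction.
class Attributed {
 public:
  bool HasAttribute(const std::string& name) const {
    return attrs_.count(name) != 0;
  }
  // Stores a copy, replacing any attribute already registered under `name`.
  void SetAttribute(const std::string& name, const Attribute& attribute) {
    attrs_[name] = attribute.clone();
  }
  void RemoveAttribute(const std::string& name) { attrs_.erase(name); }
  // Throws std::out_of_range if the attribute is absent.
  Attribute& GetAttribute(const std::string& name) { return *attrs_.at(name); }

 private:
  std::map<std::string, std::unique_ptr<Attribute>> attrs_;
};
```

An analysis would typically test HasAttribute in its input check, read typed values through a cast of GetAttribute's result, and call RemoveAttribute on its private names in the cleanup step.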

48 HeptaneAnalysis: useful functions (1/2). Generic functions: applytoallnodesrecursive(Program *p, t_node_function *f, void *param);

49 HeptaneAnalysis: useful functions (2/2). Context (attached to CFGs): AnalysisHelper::computeContext(Program *p); Context manipulation: GetContextualPredecessors/GetContextualSuccessors, returning Vector<ContextualNode>.

50 Questions? https://team.inria.fr/alf/software/heptane/


Chapter 3: Instruc0on Level Parallelism and Its Exploita0on Chapter 3: Instruc0on Level Parallelism and Its Exploita0on - Abdullah Muzahid Hardware- Based Specula0on (Sec0on 3.6) In mul0ple issue processors, stalls due to branches would be frequent: You may need

More information
