Dynamic Binary Translation for Generation of Cycle Accurate Architecture Simulators, October 28, 2008
1 Dynamic Binary Translation for Generation of Cycle Accurate Architecture Simulators
Andreas Fellnhofer, Andreas Krall, David Riegler
Institut für Computersprachen, Technische Universität Wien, Austria
Part of this work was supported by OnDemand Microelectronics and the Christian Doppler Forschungsgesellschaft.
2 Overview
3 Previous Projects
- incremental compilers
- static assembly language instrumentation
- Molecule (ATOM clone)
- STonX
- CACAO
- bintrans
- reverse compilation (DSP VLIW to C)
- compiled cycle accurate instruction set simulation
- iboy
4 Previous Projects: STonX
- Atari ST on X Windows
- generated 68k instruction set emulator
- work on binary translation unfinished
5 Previous Projects: CACAO
- JIT-only Java Virtual Machine
- ultra-fast basic compiler
- recompilation with optimizations
- on-stack replacement
- deoptimization when assumptions become invalid
6 Previous Projects: bintrans
- generator for user mode binary translators
- LISP-like source and target architecture specification
- direct translation of blocks/traces without intermediate representation
- hybrid fixed register mapping and local register allocation
- local register liveness analysis with global propagation between different runs
- 1.8 to 2.5 overhead compared to native code
7 Previous Projects: iboy
- Game Boy emulator for the iPod
- full system level cycle accurate emulation
- uses only dynamic binary translation
- self-modifying code leads to recompilation of the basic block
- template-based generated compiler
- local flag constant propagation and liveness analysis
- ROM/RAM code caches
8 Simulator Specification: Overview
[Figure: tool flow. The processor architecture description (processor.xml) is read by the adlgen tool (common xadl processing), which drives a VHDL generator (VHDL hardware), a compiler generator (compiler) and a simulator generator (decoder.c, simulator.c). Inside the simulator generator, the preprocessed description (instruction set, list of built-in calls) is turned into an internal description, from which the C simulator source is emitted: each pipeline stage of each instruction becomes a C function (for example "Instruction 2, Stage 3"), each sequence element such as "read input port a" or "add a, b" becomes a C statement, and each entry of the memory-element pool becomes a C variable definition.]
9 Simulator Specification: Architecture Description Language
- mostly structural
- XML syntax
- graphical user interface available
- no redundant information
- MIPS R2000 specification is about 1000 lines
10 Simulator Specification: Architecture Description Example
<Operation name="addu" syntax="op3 s" >
  <Syntax syntax="op3 s" token="op" value="addu" />
  <Body>
    <add a="rs i" b="rt i" d="rd o" o="overflow" c="carry"/>
  </Body>
</Operation>
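A minimal sketch of what the `add` element in the description above might compute, assuming 32-bit operands and the usual definitions of carry (unsigned wrap-around) and overflow (signed wrap-around). The function name `add32` and the bit width are assumptions for illustration; the slides do not state how the generated simulator implements the element.

```python
MASK32 = 0xFFFFFFFF  # assumed register width of 32 bits


def add32(a, b):
    """Return (d, overflow, carry) for a 32-bit addition of a and b.

    d        : low 32 bits of the sum (the `d` output of the element)
    overflow : 1 if the signed result does not fit in 32 bits
    carry    : 1 if the unsigned result does not fit in 32 bits
    """
    a &= MASK32
    b &= MASK32
    full = a + b
    d = full & MASK32
    carry = full >> 32
    # Signed overflow: both operands have the same sign bit,
    # and the sign bit of the result differs from it.
    overflow = ((~(a ^ b) & (a ^ d)) >> 31) & 1
    return d, overflow, carry


if __name__ == "__main__":
    print(add32(0x7FFFFFFF, 1))
```

An operation such as `addu` would use only the `d` output and ignore both flags; the flags are listed here because the example wires them to `overflow` and `carry`.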
11 Simulator Implementation: Simulator Basics
- generated mixed interpreting/translating simulator
- translation of blocks and traces
- common IR for interpreter and translator
- backend is the LLVM just-in-time compiler
- own and LLVM optimizations used
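The mixed interpreting/translating scheme can be sketched as follows: a block is interpreted until its execution count reaches a threshold, then it is translated once and the translated version is used from then on. Everything here is a hypothetical stand-in: the tiny accumulator IR, the names `MixedSimulator`, `interpret` and `translate`, and the use of a Python lambda in place of the LLVM JIT are assumptions made for illustration only.

```python
import operator

# Toy IR shared by interpreter and translator: a block is a list of (op, immediate).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
SYMBOLS = {"add": "+", "sub": "-", "mul": "*"}


def interpret(block, acc):
    """Execute the block one IR element at a time."""
    for op, imm in block:
        acc = OPS[op](acc, imm)
    return acc


def translate(block):
    """Build one host function for the whole block (stand-in for a JIT)."""
    expr = "acc"
    for op, imm in block:
        expr = f"({expr} {SYMBOLS[op]} {imm})"
    return eval(f"lambda acc: {expr}")


class MixedSimulator:
    def __init__(self, blocks, threshold):
        self.blocks = blocks          # start address -> IR block
        self.threshold = threshold    # executions before translation
        self.counts = {}
        self.translated = {}
        self.stats = {"interpreted": 0, "translated": 0}

    def run_block(self, addr, acc):
        fn = self.translated.get(addr)
        if fn is not None:
            self.stats["translated"] += 1
            return fn(acc)
        self.counts[addr] = self.counts.get(addr, 0) + 1
        if self.counts[addr] >= self.threshold:
            # Hot block: translate now, use the translation from the next run on.
            self.translated[addr] = translate(self.blocks[addr])
        self.stats["interpreted"] += 1
        return interpret(self.blocks[addr], acc)


if __name__ == "__main__":
    sim = MixedSimulator({0x100: [("add", 2), ("mul", 3)]}, threshold=3)
    print([sim.run_block(0x100, 1) for _ in range(5)], sim.stats)
```

A low threshold spends more time in the translator for code that may run only a few times; a high threshold keeps hot code in the slower interpreter for longer.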
12 Simulator Implementation: Differences to Standard Binary Translation
- cycle accurate simulator
- full system simulation
- simulation of in-order pipelined architectures
- instructions cross basic block borders
13 Simulator Implementation: Overlapping Instructions
          cycle 1  cycle 2  cycle 3  cycle 4  cycle 5  cycle 6  cycle 7  cycle 8
stage 1   A4       inst 1   jump     inst 3   B1       B2       B3       B4
stage 2   A3       A4       inst 1   jump     inst 3   B1       B2       B3
stage 3   A2       A3       A4       inst 1   jump     inst 3   B1       B2
stage 4   A1       A2       A3       A4       inst 1   jump     inst 3   B1
Legend: instructions executed in block A; split instruction 1; split jump instruction 2 (modifies PC in stage 2); split instruction 3; instructions executed in block B.
Instructions which depend on the predecessor basic block.
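The pipeline diagram on this slide can be reproduced with a few lines of code. This is a sketch under stated assumptions: a four-stage in-order pipeline without stalls, one instruction entering stage 1 per cycle, and cycle 5 (the cycle in which B1 enters stage 1) taken as the block boundary. The function names are hypothetical.

```python
# Instruction stream in fetch order, as named on the slide.
STREAM = ["A1", "A2", "A3", "A4", "inst 1", "jump", "inst 3",
          "B1", "B2", "B3", "B4"]


def occupancy(stream, stages, cycle):
    """Return the instruction held by each stage (stage 1 first) in `cycle`.

    Assumes no stalls, and that in cycle 1 the oldest instruction of
    `stream` sits in the last stage.
    """
    return [stream[cycle + stages - s - 1] for s in range(1, stages + 1)]


def split_instructions(stream, stages, boundary_cycle):
    """Instructions still in flight when the new block's first
    instruction enters stage 1, youngest first."""
    return occupancy(stream, stages, boundary_cycle)[1:]


if __name__ == "__main__":
    for cycle in range(1, 9):
        print(cycle, occupancy(STREAM, 4, cycle))
    print(split_instructions(STREAM, 4, 5))
```

With these assumptions the three instructions reported at the boundary are exactly the ones the slide marks as split: they start in block A but finish while block B is already executing.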
14 Simulator Implementation: Similar Predecessor Blocks
program memory:
  0x100: add
  0x101: add
  0x102: store
  0x103: sub
  0x104: cbranch 0x101
  0x105: store
  0x106: add
[Figure: basic block 1 (0x100: add, 0x101: add, 0x102: store, 0x103: sub, 0x104: cbranch 0x101) with its successors basic block 2 (0x102: store, 0x103: sub, 0x104: cbranch 0x101) and basic block 3 (0x105: store, 0x106: add). Both successors are annotated with the predecessor instructions 0x102, 0x103, 0x104.]
15 Simulator Implementation: Basic Block Duplication
[Figure: same program memory and basic block 1 as on the previous slide. Basic blocks 2 and 3 are each duplicated: versions 2a (0x102: store, 0x103: sub, 0x104: cbranch 0x101) and 3a (0x105: store, 0x106: add) are annotated with the predecessor instructions 0x101, 0x102, 0x103, 0x104; versions 2b and 3b contain the same instructions but are annotated with 0x104, 0x102, 0x103, 0x104.]
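One way to picture basic block duplication is a translation cache whose key contains not only the entry address of a block but also the predecessor context it was entered with. The sketch below is hypothetical: the class name `TranslationCache`, the dictionary layout and the use of a tuple of addresses as context are assumptions; the context tuples are copied from the annotations in the figure, and the slides do not say how the simulator actually encodes them.

```python
class TranslationCache:
    """Cache of translated blocks, one version per (entry address, context)."""

    def __init__(self):
        self._versions = {}

    def lookup(self, entry_pc, context):
        """Return the version for this entry and context, creating it on a miss."""
        key = (entry_pc, tuple(context))
        if key not in self._versions:
            # A real simulator would translate the block here; the record
            # below only stands in for the generated code.
            self._versions[key] = {
                "entry_pc": entry_pc,
                "context": tuple(context),
                "id": len(self._versions),
            }
        return self._versions[key]

    def versions_of(self, entry_pc):
        """All versions that exist for one entry address."""
        return [v for (pc, _), v in self._versions.items() if pc == entry_pc]


if __name__ == "__main__":
    cache = TranslationCache()
    cache.lookup(0x105, (0x101, 0x102, 0x103, 0x104))  # like version 3a
    cache.lookup(0x105, (0x104, 0x102, 0x103, 0x104))  # like version 3b
    print(len(cache.versions_of(0x105)))
```

The cost of this scheme is code size: a block reached with k distinct contexts is translated k times.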
16 Simulator Implementation: Trace Formation
[Figure: Trace 1 with its entry point at basic block 1. Each basic block record (bb1 to bb4) holds pipeline restore info and a successor array: bb1 has successors 0: bb2 and 1: bb3; bb2 has successors 0: bb4 and 1: bb2; bb3 has successor 0: bb4. Basic block 1 has one exit point; basic blocks 2 to 4 have their own exit points.]
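The per-block records in the figure (pipeline restore info plus a successor array indexed by branch outcome) could be modelled roughly as below. This is a sketch only: the class and function names are hypothetical, the successor arrays are filled in as they appear to read in the flattened diagram, and the restore info is a placeholder string rather than real pipeline state.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    name: str
    pipeline_restore_info: str               # placeholder for saved pipeline state
    successors: list = field(default_factory=list)  # indexed by branch outcome


def follow(blocks, entry, outcomes):
    """Walk the successor arrays inside one trace.

    Returns the visited block names and the restore info of the block at
    which the trace is left (no further outcome, unknown outcome, or a
    successor that is not part of the trace).
    """
    path = [entry]
    current = blocks[entry]
    for outcome in outcomes:
        if outcome >= len(current.successors):
            break
        nxt = current.successors[outcome]
        if nxt not in blocks:
            break
        path.append(nxt)
        current = blocks[nxt]
    return path, current.pipeline_restore_info


# Successor arrays as read from the slide's diagram (an interpretation).
TRACE_1 = {
    "bb1": BasicBlock("bb1", "restore-bb1", ["bb2", "bb3"]),
    "bb2": BasicBlock("bb2", "restore-bb2", ["bb4", "bb2"]),
    "bb3": BasicBlock("bb3", "restore-bb3", ["bb4"]),
    "bb4": BasicBlock("bb4", "restore-bb4", []),
}

if __name__ == "__main__":
    print(follow(TRACE_1, "bb1", [0, 1, 1, 0]))
```

Keeping restore info per block means that every block can serve as an exit point: whichever block the trace is left at, the simulator has what it needs to continue outside the trace.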
17 Simulator Implementation: Optimizations
- LLVM optimizations (e.g. constant propagation, dead store elimination)
- local copy of global values
- linking of basic blocks
- constant forward optimization
- LLVM JIT very slow (mostly instruction selection)
18 Empirical Evaluation: Simulation Speed MIPS
[Figure: bar chart of simulation speed in MHz, translator vs. interpreter, for the benchmarks prime, jpeg, crc32, sha, dijkstra, bitcount, blowfish, stringsearch, adpcm, gsm, rijndael and their average (AVG).]
19 Empirical Evaluation: Simulation Speed CHILI
[Figure: bar chart of simulation speed in MHz, translator vs. interpreter, for the benchmarks prime, jpeg, crc32, sha, dijkstra, bitcount, blowfish, stringsearch, adpcm, gsm, rijndael and their average (AVG).]
20 Empirical Evaluation: Breakdown of Simulation Cycles MIPS
[Figure: stacked bar chart (0% to 100%) of simulation cycles spent in Trace, BasicBlock and Interpreter for the benchmarks prime, jpeg, crc32, sha, bitcount, stringsearch, gsm, dijkstra, blowfish, adpcm, rijndael.]
21 Empirical Evaluation: Breakdown of Simulation Cycles CHILI
[Figure: stacked bar chart (0% to 100%) of simulation cycles spent in Trace, BasicBlock and Interpreter for the benchmarks prime, jpeg, crc32, sha, bitcount, stringsearch, gsm, dijkstra, blowfish, adpcm, rijndael.]
22 Empirical Evaluation: Compile and Overall Run Time MIPS
[Figure: bar chart (0 s to 16 s) of compiler time and run time for the benchmarks prime, jpeg, crc32, sha, bitcount, stringsearch, gsm, dijkstra, blowfish, adpcm, rijndael.]
23 Empirical Evaluation: Compile and Overall Run Time CHILI
[Figure: bar chart (0 s to 16 s) of compiler time and run time for the benchmarks prime, jpeg, crc32, sha, bitcount, stringsearch, gsm, dijkstra, blowfish, adpcm, rijndael.]
24 Empirical Evaluation: Performance with Increasing Simulation Time CHILI
[Figure: simulation speed (1 MHz to 1000 MHz) over simulated time (10k to 10000M cycles) for the benchmarks jpeg, crc32 and blowfish.]
25 Empirical Evaluation: Simulation Speed with Different Compilation Thresholds
26 Empirical Evaluation: Optimized Block Linkage MIPS
[Figure: bar chart (0% to 250%) with the series Trace and Block Link for the benchmarks prime, jpeg, crc32, sha, dijkstra, bitcount, blowfish, stringsearch, adpcm, gsm, rijndael.]
27 Empirical Evaluation: Optimized Block Linkage CHILI
[Figure: bar chart (0% to 300%) with the series Trace and Block Link for the benchmarks prime, jpeg, crc32, sha, dijkstra, bitcount, blowfish, stringsearch, adpcm, gsm, rijndael.]
28 Empirical Evaluation: Local Copy of Global Values MIPS
[Figure: bar chart (0% to 450%) with the series Local Scope and Opt. Local Scope for the benchmarks prime, jpeg, crc32, sha, dijkstra, bitcount, blowfish, stringsearch, adpcm, gsm, rijndael.]
29 Empirical Evaluation: Local Copy of Global Values CHILI
[Figure: bar chart (0% to 600%) with the series Local Scope and Opt. Local Scope for the benchmarks prime, jpeg, crc32, sha, dijkstra, bitcount, blowfish, stringsearch, adpcm, gsm, rijndael.]
30 Empirical Evaluation: Forwarding Optimization MIPS
[Figure: bar chart (0% to 700%) with the series Constant Forwards for the benchmarks prime, jpeg, crc32, sha, dijkstra, bitcount, blowfish, stringsearch, adpcm, gsm, rijndael and their average (AVG).]
31 Conclusion and Further Work: Conclusion
- cycle accurate simulation raises additional problems
- binary translation is efficient
- LLVM JIT is slow
More informationCSE 401/M501 Compilers
CSE 401/M501 Compilers Intermediate Representations Hal Perkins Autumn 2018 UW CSE 401/M501 Autumn 2018 G-1 Agenda Survey of Intermediate Representations Graphical Concrete/Abstract Syntax Trees (ASTs)
More informationLESSON 13: LANGUAGE TRANSLATION
LESSON 13: LANGUAGE TRANSLATION Objective Interpreters and Compilers. Language Translation Phases. Interpreters and Compilers A COMPILER is a program that translates a complete source program into machine
More informationECE331: Hardware Organization and Design
ECE331: Hardware Organization and Design Lecture 19: Verilog and Processor Performance Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Verilog Basics Hardware description language
More informationKasper Lund, Software engineer at Google. Crankshaft. Turbocharging the next generation of web applications
Kasper Lund, Software engineer at Google Crankshaft Turbocharging the next generation of web applications Overview Why did we introduce Crankshaft? Deciding when and what to optimize Type feedback and
More informationApple LLVM GPU Compiler: Embedded Dragons. Charu Chandrasekaran, Apple Marcello Maggioni, Apple
Apple LLVM GPU Compiler: Embedded Dragons Charu Chandrasekaran, Apple Marcello Maggioni, Apple 1 Agenda How Apple uses LLVM to build a GPU Compiler Factors that affect GPU performance The Apple GPU compiler
More informationApplication-Specific Design of Low Power Instruction Cache Hierarchy for Embedded Processors
Agenda Application-Specific Design of Low Power Instruction ache ierarchy for mbedded Processors Ji Gu Onodera Laboratory Department of ommunications & omputer ngineering Graduate School of Informatics
More informationPrecept 2: Non-preemptive Scheduler. COS 318: Fall 2018
Precept 2: Non-preemptive Scheduler COS 318: Fall 2018 Project 2 Schedule Precept: Monday 10/01, 7:30pm (You are here) Design Review: Monday 10/08, 3-7pm Due: Sunday 10/14, 11:55pm Project 2 Overview Goal:
More informationIncremental Dynamic Code Generation with Trace Trees
Incremental Dynamic Code Generation with Trace Trees Andreas Gal University of California, Irvine gal@uci.edu Michael Franz University of California, Irvine franz@uci.edu Abstract The unit of compilation
More informationEmulation. Michael Jantz
Emulation Michael Jantz Acknowledgements Slides adapted from Chapter 2 in Virtual Machines: Versatile Platforms for Systems and Processes by James E. Smith and Ravi Nair Credit to Prasad A. Kulkarni some
More informationA Feasibility Study for Methods of Effective Memoization Optimization
A Feasibility Study for Methods of Effective Memoization Optimization Daniel Mock October 2018 Abstract Traditionally, memoization is a compiler optimization that is applied to regions of code with few
More informationA Method-Based Ahead-of-Time Compiler For Android Applications
A Method-Based Ahead-of-Time Compiler For Android Applications Fatma Deli Computer Science & Software Engineering University of Washington Bothell November, 2012 2 Introduction This paper proposes a method-based
More informationBuilding a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2
Building a Compiler with JoeQ AKA, how to solve Homework 2 Outline of this lecture Building a compiler: what pieces we need? An effective IR for Java joeq Homework hints How to Build a Compiler 1. Choose
More informationAccelerating Stateflow With LLVM
Accelerating Stateflow With LLVM By Dale Martin Dale.Martin@mathworks.com 2015 The MathWorks, Inc. 1 What is Stateflow? A block in Simulink, which is a graphical language for modeling algorithms 2 What
More informationMechanisms and constructs for System Virtualization
Mechanisms and constructs for System Virtualization Content Outline Design goals for virtualization General Constructs for virtualization Virtualization for: System VMs Process VMs Prevalent trends: Pros
More informationIntroduction Contech s Task Graph Representation Parallel Program Instrumentation (Break) Analysis and Usage of a Contech Task Graph Hands-on
Introduction Contech s Task Graph Representation Parallel Program Instrumentation (Break) Analysis and Usage of a Contech Task Graph Hands-on Exercises 2 Contech is An LLVM compiler pass to instrument
More informationRetDec: An Open-Source Machine-Code Decompiler. Jakub Křoustek Peter Matula
RetDec: An Open-Source Machine-Code Decompiler Jakub Křoustek Peter Matula Who Are We? 2 Jakub Křoustek Founder of RetDec Threat Labs lead @Avast (previously @AVG) Reverse engineer, malware hunter, security
More informationLLVM Tutorial. John Criswell University of Rochester
LLVM Tutorial John Criswell University of Rochester 1 Overview 2 History of LLVM Developed by Chris Lattner and Vikram Adve at the University of Illinois at Urbana-Champaign Released open-source in October
More informationJava Performance: The Definitive Guide
Java Performance: The Definitive Guide Scott Oaks Beijing Cambridge Farnham Kbln Sebastopol Tokyo O'REILLY Table of Contents Preface ix 1. Introduction 1 A Brief Outline 2 Platforms and Conventions 2 JVM
More informationHardware Emulation and Virtual Machines
Hardware Emulation and Virtual Machines Overview Review of How Programs Run: Registers Execution Cycle Processor Emulation Types: Pure Translation Static Recompilation Dynamic Recompilation Direct Bytecode
More informationThe Software Stack: From Assembly Language to Machine Code
COMP 506 Rice University Spring 2018 The Software Stack: From Assembly Language to Machine Code source code IR Front End Optimizer Back End IR target code Somewhere Out Here Copyright 2018, Keith D. Cooper
More informationLecture 2 Overview of the LLVM Compiler
Lecture 2 Overview of the LLVM Compiler Abhilasha Jain Thanks to: VikramAdve, Jonathan Burket, DebyKatz, David Koes, Chris Lattner, Gennady Pekhimenko, and Olatunji Ruwase, for their slides The LLVM Compiler
More informationSwift: A Register-based JIT Compiler for Embedded JVMs
Swift: A Register-based JIT Compiler for Embedded JVMs Yuan Zhang, Min Yang, Bo Zhou, Zhemin Yang, Weihua Zhang, Binyu Zang Fudan University Eighth Conference on Virtual Execution Environment (VEE 2012)
More information