Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 1 / 1

Size: px
Start display at page:

Download "Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 1 / 1"

Transcription

1 Dynamic Binary Translation for Generation of Cycle Accurate Architecture Simulators Institut für Computersprachen Technische Universtät Wien Austria Andreas Fellnhofer Andreas Krall David Riegler Part of this work was supported by OnDemand Microelectronics and the Christian Doppler Forschungsgesellschaft Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 1 / 1

2 Overview Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 2 / 1

3 Previous Projects Previous Projects incremental compilers static assembly language instrumentation Molecule (ATOM clone) STonX CACAO bintrans reverse compilation (DSP VLIW to C) compiled cycle accurate instruction set simulation iboy Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 3 / 1

4 Previous Projects STonX AtariST on X windows generated 68k instruction set emulator work on binary translation unfinished Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 4 / 1

5 Previous Projects CACAO JIT-only Java Virtual Machine ultra-fast basic compiler recompilation with optimizations on-stack replacement deoptimization when assumptions become invalid Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 5 / 1

6 Previous Projects bintrans generator for user mode binary translators LISP like source and target architecture specification direct translation of blocks/traces without intermediate representation hybrid fixed register mapping and local register allocation local register liveness analysis with global propagation between different runs 1.8 to 2.5 overhead compared to native code Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 6 / 1

7 Previous Projects iboy gameboy emulator for ipod full system level cycle accurate emulation uses only dynamic binary translation self modifying code leads to recompilation of basic block template based generated compiler local flag constant propagation and liveness analysis ROM/RAM code caches Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 7 / 1

8 Overview Simulator Specification Processor Architecture Description processor.xml adlgen tool VHDL Generator Common xadl Compiler Generator Processing Simulator Generator VHDL Hardware Compiler decoder.c simulator.c Preprocessed Description Simulator Generator Instruction 1 Pipeline Stage 1... Instruction 2 Pipeline Stage 1 Pipeline Stage 2 Pipeline Stage 3 Pipeline Stage 4 Instruction N Pipeline Stage read input port a read input port b read input port c add a, b... Memory-Element Pool a b Sequence Element Sequence Element Sequence Element... C Variable Definitions Definition of a Definition of b... C Function Instruction 2, Stage 3 C Statement Variable Usage C Statement Variable Usage C Statement Variable Usage... Instruction Set List of Built-in Calls Internal Description C Simulator Source Build Description Emit Code Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 8 / 1

9 Simulator Specification Architecture Description Language mostly structural xml syntax graphical user interface available no redundant information MIPS R2000 specification is about 1000 lines Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 9 / 1

10 Simulator Specification Architecture Description Example <Operation name="addu" syntax="op3 s" > <Syntax syntax="op3 s" token="op" value="addu" /> <Body> <add a="rs i" b="rt i" d="rd o" o="overflow" c="carry"/> </Body> </Operation> Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 10 / 1

11 Simulator Implementation Simulator Basics generated mixed interpreting/translating simulator translation of blocks and traces common IR for interpreter and translator backend is LLVM just-in-time compiler own and LLVM optimizations used Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 11 / 1

12 Simulator Implementation Differences to Standard Binary Translation cycle accurate simulator full system simulation simulation of in-order pipelined architectures instructions cross basic block borders Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 12 / 1

13 Simulator Implementation Overlapping Instructions stage number block A block B cycle 1 cycle 2 cycle 3 cycle 4 cycle 5 cycle 6 cycle 7 cycle 8 A4 inst 1 jump inst 3 B1 B2 B3 B4 A3 A4 inst 1 jump inst 3 B1 B2 B3 A2 A3 A4 inst 1 jump inst 3 B1 B2 A1 A2 A3 A4 inst 1 jump inst 3 B1 instructions executed in block A splitted instruction 1 splitted jump instruction 2 (modifiy PC in stage 2) splitted instruction 3 instructions executed in block B Instructions which depend on the predecessor basic block Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 13 / 1

14 Simulator Implementation Similar Predecessor Blocks program memory 0x100: add 0x101: add 0x102: store 0x103: sub 0x104: cbranch 0x101 0x105: store 0x106: add basic block 1 0x100: add 0x101: add 0x102: store 0x103: sub 0x104: cbranch 0x101 basic block 3 0x102 0x103 0x104 0x105: store 0x106: add basic block 2 0x102 0x103 0x104 0x102: store 0x103: sub 0x104: cbranch 0x101 Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 14 / 1

15 Simulator Implementation Basic Block Duplication program memory 0x100: add 0x101: add 0x102: store 0x103: sub 0x104: cbranch 0x101 0x105: store 0x106: add basic block 1 0x100: add 0x101: add 0x102: store 0x103: sub 0x104: cbranch 0x101 basic block 3a 0x101 0x102 0x103 0x104 0x105: store basic block 2a 0x101 0x102 0x103 0x104 0x102: store 0x103: sub 0x104: cbranch 0x101 basic block 3b 0x104 0x102 0x103 0x104 0x105: store basic block 2b 0x104 0x102 0x103 0x104 0x102: store 0x103: sub 0x104: cbranch 0x101 0x106: add 0x106: add Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 15 / 1

16 Trace Formation Simulator Implementation entry point of trace Trace 1 basic block 1 bb1 pipeline restore info successor array 0 1 bb2 bb3 basic block 2 bb2 pipeline restore info basic block 3 successor array 0 1 bb4 bb2 basic block 4 bb4 pipeline restore info successor array bb3 pipeline restore info successor array 0 bb4 exit point for basic block 1 exit points for basic blocks 2 to 4 Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 16 / 1

17 Simulator Implementation Optimizations LLVM optimizations (e.g. constant propagation, dead store elimination) local copy of global values linking of basic blocks constant forward optimization LLVM JIT very slow (mostly instruction selection) Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 17 / 1

18 Empirical Evaluation Simulation Speed MIPS MHz 100.0MHz 21.9MHz 10.9MHz 10.0MHz 195.1MHz 111.2MHz 66.0MHz 10.4MHz 4.9MHz 43.8MHz 10.2MHz Translator Interpreter 4.2MHz 43.6MHz 1.0MHz 1.0MHz prime jpeg crc32 sha dijkstra bitcount blowfishstringsearch adpcm gsm rijndael AVG Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 18 / 1

19 Empirical Evaluation Simulation Speed CHILI Mhz 100.0Mhz 77.5Mhz 482.4Mhz 99.5Mhz 111.8Mhz 73.4Mhz Translator Interpreter 79.6Mhz 10.0Mhz 3.9Mhz 8.2Mhz 3.4Mhz 10.5Mhz 4.1Mhz 1.0Mhz 0.4Mhz prime jpeg crc32 sha dijkstra bitcount blowfishstringsearch adpcm gsm rijndael AVG Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 19 / 1

20 Empirical Evaluation Breakdown of Simulation Cycles MIPS 100% 80% 60% 40% 20% 0% Trace BasicBlock Interpreter prime jpeg crc32 sha bitcount stringsearch gsm dijkstra blowfish adpcm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 20 / 1

21 Empirical Evaluation Breakdown of Simulation Cycles CHILI 100% 80% 60% 40% 20% 0% Trace BasicBlock Interpreter prime jpeg crc32 sha bitcount stringsearch gsm dijkstra blowfish adpcm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 21 / 1

22 Empirical Evaluation Compile and overall Run Time MIPS 16s 14s 12s 10s 8s 6s 4s 2s 0s prime Compiler Runtime jpeg crc32 sha bitcount stringsearch gsm dijkstra blowfish adpcm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 22 / 1

23 Empirical Evaluation Compile and overall Run Time CHILI 16s 14s 12s 10s 8s 6s 4s 2s 0s prime Compiler Runtime jpeg crc32 sha bitcount stringsearch gsm dijkstra blowfish adpcm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 23 / 1

24 Empirical Evaluation Performance with increasing Simulation Time CHILI 1000MHz 100MHz 10MHz jpeg crc32 blowfish 1MHz 10k cycles 100k cycles 1M cylces 10M cycles 100M cycles 1000M cycles 10000M cycles Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 24 / 1

25 Empirical Evaluation Simulation Speed with different Compilation Thresholds Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 25 / 1

26 Empirical Evaluation Optimized Block Linkage MIPS 250% 200% 150% 100% Trace Block Link 50% 0% prime jpeg crc32 sha dijkstra bitcount blowfish stringsearch adpcm gsm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 26 / 1

27 Empirical Evaluation Optimized Block Linkage CHILI 300% 250% 200% 150% Trace Block Link 100% 50% 0% prime jpeg crc32 sha dijkstra bitcount blowfish stringsearch adpcm gsm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 27 / 1

28 Empirical Evaluation Local Copy of Global Values MIPS 450% 400% 393% 350% 354% 300% 250% 200% 150% 100% 173% 229% 123% 125% 152% 174% 241% 140% 125% 109% 147% 170% 181% 209% 130% 133% 192% 119% 83% 142% Local Scope Opt. Local Scope 50% 0% prime jpeg crc32 sha dijkstra bitcount blowfish stringsearch adpcm gsm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 28 / 1

29 Empirical Evaluation Local Copy of Global Values CHILI 600% 500% 521% 400% 300% 312% 351% 250% 253% Local Scope Opt. Local Scope 200% 100% 145% 133% 125% 176% 159% 142% 99% 100% 123% 139% 153% 123% 152% 163% 123% 94% 170% 0% prime jpeg crc32 sha dijkstra bitcount blowfish stringsearch adpcm gsm rijndael Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 29 / 1

30 Empirical Evaluation Forwarding Optimization MIPS 700% 600% 611% 500% 400% 401% Constant Forwards 300% 284% 228% 200% 100% 138% 182% 160% 166% 161% 111% 146% 146% 0% prime jpeg crc32 sha dijkstra bitcount blowfish stringsearch adpcm gsm rijndael AVG Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 30 / 1

31 Conclusion and Further Work Conclusion cycle accurate simulation rises additional problems binary translation is efficient LLVM JIT is slow Dynamic Binary Translation for Generation of Cycle Accurate October Architecture 28, 2008 Simulators 31 / 1

Fast and Accurate Simulation using the LLVM Compiler Framework

Fast and Accurate Simulation using the LLVM Compiler Framework Fast and Accurate Simulation using the LLVM Compiler Framework Florian Brandner, Andreas Fellnhofer, Andreas Krall, and David Riegler Christian Doppler Laboratory: Compilation Techniques for Embedded Processors

More information

Just-In-Time Compilation

Just-In-Time Compilation Just-In-Time Compilation Thiemo Bucciarelli Institute for Software Engineering and Programming Languages 18. Januar 2016 T. Bucciarelli 18. Januar 2016 1/25 Agenda Definitions Just-In-Time Compilation

More information

Ultra Fast Cycle-Accurate Compiled Emulation of Inorder Pipelined Architectures

Ultra Fast Cycle-Accurate Compiled Emulation of Inorder Pipelined Architectures Ultra Fast Cycle-Accurate Compiled Emulation of Inorder Pipelined Architectures Stefan Farfeleder 1, Andreas Krall 1, and Nigel Horspool 2 1 Institut für Computersprachen, TU Wien, Austria {stefanf, andi}@complang.tuwien.ac.at

More information

Enhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension

Enhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension Enhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension Hamid Noori, Farhad Mehdipour, Koji Inoue, and Kazuaki Murakami Institute of Systems, Information

More information

Ultra Fast Cycle-Accurate Compiled Emulation of Inorder Pipelined Architectures

Ultra Fast Cycle-Accurate Compiled Emulation of Inorder Pipelined Architectures Ultra Fast Cycle-Accurate Compiled Emulation of Inorder Pipelined Architectures Stefan Farfeleder Andreas Krall Institut für Computersprachen, TU Wien Nigel Horspool Department of Computer Science, University

More information

Leveraging Predicated Execution for Multimedia Processing

Leveraging Predicated Execution for Multimedia Processing Leveraging Predicated Execution for Multimedia Processing Dietmar Ebner Florian Brandner Andreas Krall Institut für Computersprachen Technische Universität Wien Argentinierstr. 8, A-1040 Wien, Austria

More information

Energy Consumption Evaluation of an Adaptive Extensible Processor

Energy Consumption Evaluation of an Adaptive Extensible Processor Energy Consumption Evaluation of an Adaptive Extensible Processor Hamid Noori, Farhad Mehdipour, Maziar Goudarzi, Seiichiro Yamaguchi, Koji Inoue, and Kazuaki Murakami December 2007 Outline Introduction

More information

Just-In-Time Compilers & Runtime Optimizers

Just-In-Time Compilers & Runtime Optimizers COMP 412 FALL 2017 Just-In-Time Compilers & Runtime Optimizers Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved.

More information

SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine p. 1

SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine p. 1 SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine David Bélanger dbelan2@cs.mcgill.ca Sable Research Group McGill University Montreal, QC January 28, 2004 SABLEJIT: A Retargetable

More information

Exploiting Hardware Resources: Register Assignment across Method Boundaries

Exploiting Hardware Resources: Register Assignment across Method Boundaries Exploiting Hardware Resources: Register Assignment across Method Boundaries Ian Rogers, Alasdair Rawsthorne, Jason Souloglou The University of Manchester, England {Ian.Rogers,Alasdair.Rawsthorne,Jason.Souloglou}@cs.man.ac.uk

More information

Trace Compilation. Christian Wimmer September 2009

Trace Compilation. Christian Wimmer  September 2009 Trace Compilation Christian Wimmer cwimmer@uci.edu www.christianwimmer.at September 2009 Department of Computer Science University of California, Irvine Background Institute for System Software Johannes

More information

Undergraduate Compilers in a Day

Undergraduate Compilers in a Day Question of the Day Backpatching o.foo(); In Java, the address of foo() is often not known until runtime (due to dynamic class loading), so the method call requires a table lookup. After the first execution

More information

BEAMJIT, a Maze of Twisty Little Traces

BEAMJIT, a Maze of Twisty Little Traces BEAMJIT, a Maze of Twisty Little Traces A walk-through of the prototype just-in-time (JIT) compiler for Erlang. Frej Drejhammar 130613 Who am I? Senior researcher at the Swedish Institute

More information

Evaluating Heuristic Optimization Phase Order Search Algorithms

Evaluating Heuristic Optimization Phase Order Search Algorithms Evaluating Heuristic Optimization Phase Order Search Algorithms by Prasad A. Kulkarni David B. Whalley Gary S. Tyson Jack W. Davidson Computer Science Department, Florida State University, Tallahassee,

More information

History of Compilers The term

History of Compilers The term History of Compilers The term compiler was coined in the early 1950s by Grace Murray Hopper. Translation was viewed as the compilation of a sequence of machine-language subprograms selected from a library.

More information

Topic 9: Control Flow

Topic 9: Control Flow Topic 9: Control Flow COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 The Front End The Back End (Intel-HP codename for Itanium ; uses compiler to identify parallelism)

More information

What do Compilers Produce?

What do Compilers Produce? What do Compilers Produce? Pure Machine Code Compilers may generate code for a particular machine, not assuming any operating system or library routines. This is pure code because it includes nothing beyond

More information

CS 351 Final Exam Solutions

CS 351 Final Exam Solutions CS 351 Final Exam Solutions Notes: You must explain your answers to receive partial credit. You will lose points for incorrect extraneous information, even if the answer is otherwise correct. Question

More information

Intermediate Code & Local Optimizations

Intermediate Code & Local Optimizations Lecture Outline Intermediate Code & Local Optimizations Intermediate code Local optimizations Compiler Design I (2011) 2 Code Generation Summary We have so far discussed Runtime organization Simple stack

More information

Reducing Power Consumption for High-Associativity Data Caches in Embedded Processors

Reducing Power Consumption for High-Associativity Data Caches in Embedded Processors Reducing Power Consumption for High-Associativity Data Caches in Embedded Processors Dan Nicolaescu Alex Veidenbaum Alex Nicolau Dept. of Information and Computer Science University of California at Irvine

More information

Is dynamic compilation possible for embedded system?

Is dynamic compilation possible for embedded system? Is dynamic compilation possible for embedded system? Scopes 2015, St Goar Victor Lomüller, Henri-Pierre Charles CEA DACLE / Grenoble www.cea.fr June 2 2015 Introduction : Wake Up Questions Session FAQ

More information

Intermediate Representations

Intermediate Representations Intermediate Representations Intermediate Representations (EaC Chaper 5) Source Code Front End IR Middle End IR Back End Target Code Front end - produces an intermediate representation (IR) Middle end

More information

BEAMJIT: An LLVM based just-in-time compiler for Erlang. Frej Drejhammar

BEAMJIT: An LLVM based just-in-time compiler for Erlang. Frej Drejhammar BEAMJIT: An LLVM based just-in-time compiler for Erlang Frej Drejhammar 140407 Who am I? Senior researcher at the Swedish Institute of Computer Science (SICS) working on programming languages,

More information

Sista: Improving Cog s JIT performance. Clément Béra

Sista: Improving Cog s JIT performance. Clément Béra Sista: Improving Cog s JIT performance Clément Béra Main people involved in Sista Eliot Miranda Over 30 years experience in Smalltalk VM Clément Béra 2 years engineer in the Pharo team Phd student starting

More information

An Investigation into Value Profiling and its Applications

An Investigation into Value Profiling and its Applications Manchester Metropolitan University BSc. (Hons) Computer Science An Investigation into Value Profiling and its Applications Author: Graham Markall Supervisor: Dr. Andy Nisbet No part of this project has

More information

Lecture Compiler Backend

Lecture Compiler Backend Lecture 19-23 Compiler Backend Jianwen Zhu Electrical and Computer Engineering University of Toronto Jianwen Zhu 2009 - P. 1 Backend Tasks Instruction selection Map virtual instructions To machine instructions

More information

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler Hiroshi Inoue, Hiroshige Hayashizaki, Peng Wu and Toshio Nakatani IBM Research Tokyo IBM Research T.J. Watson Research Center October

More information

VM instruction formats. Bytecode translator

VM instruction formats. Bytecode translator Implementing an Ecient Java Interpreter David Gregg 1, M. Anton Ertl 2 and Andreas Krall 2 1 Department of Computer Science, Trinity College, Dublin 2, Ireland. David.Gregg@cs.tcd.ie 2 Institut fur Computersprachen,

More information

Compilers and Interpreters

Compilers and Interpreters Overview Roadmap Language Translators: Interpreters & Compilers Context of a compiler Phases of a compiler Compiler Construction tools Terminology How related to other CS Goals of a good compiler 1 Compilers

More information

CS 406/534 Compiler Construction Putting It All Together

CS 406/534 Compiler Construction Putting It All Together CS 406/534 Compiler Construction Putting It All Together Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy

More information

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control.

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control. C Flow Control David Chisnall February 1, 2011 Outline What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope Disclaimer! These slides contain a lot of

More information

Trace-based JIT Compilation

Trace-based JIT Compilation Trace-based JIT Compilation Hiroshi Inoue, IBM Research - Tokyo 1 Trace JIT vs. Method JIT https://twitter.com/yukihiro_matz/status/533775624486133762 2 Background: Trace-based Compilation Using a Trace,

More information

Interaction of JVM with x86, Sparc and MIPS

Interaction of JVM with x86, Sparc and MIPS Interaction of JVM with x86, Sparc and MIPS Sasikanth Avancha, Dipanjan Chakraborty, Dhiral Gada, Tapan Kamdar {savanc1, dchakr1, dgada1, kamdar}@cs.umbc.edu Department of Computer Science and Electrical

More information

A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler

A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler Hiroshi Inoue, Hiroshige Hayashizaki, Peng Wu and Toshio Nakatani IBM Research Tokyo IBM Research T.J. Watson Research Center April

More information

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation Language Implementation Methods The Design and Implementation of Programming Languages Compilation Interpretation Hybrid In Text: Chapter 1 2 Compilation Interpretation Translate high-level programs to

More information

A Dynamic Binary Translation System in a Client/Server Environment

A Dynamic Binary Translation System in a Client/Server Environment A Dynamic Binary Translation System in a Client/Server Environment Chun-Chen Hsu a, Ding-Yong Hong b,, Wei-Chung Hsu a, Pangfeng Liu a, Jan-Jan Wu b a Department of Computer Science and Information Engineering,

More information

point in worrying about performance. The goal of our work is to show that this is not true. This paper is organised as follows. In section 2 we introd

point in worrying about performance. The goal of our work is to show that this is not true. This paper is organised as follows. In section 2 we introd A Fast Java Interpreter David Gregg 1, M. Anton Ertl 2 and Andreas Krall 2 1 Department of Computer Science, Trinity College, Dublin 2, Ireland. David.Gregg@cs.tcd.ie 2 Institut fur Computersprachen, TU

More information

Virtual Machines and Dynamic Translation: Implementing ISAs in Software

Virtual Machines and Dynamic Translation: Implementing ISAs in Software Virtual Machines and Dynamic Translation: Implementing ISAs in Software Krste Asanovic Laboratory for Computer Science Massachusetts Institute of Technology Software Applications How is a software application

More information

Improving Data Access Efficiency by Using Context-Aware Loads and Stores

Improving Data Access Efficiency by Using Context-Aware Loads and Stores Improving Data Access Efficiency by Using Context-Aware Loads and Stores Alen Bardizbanyan Chalmers University of Technology Gothenburg, Sweden alenb@chalmers.se Magnus Själander Uppsala University Uppsala,

More information

CS553 Lecture Profile-Guided Optimizations 3

CS553 Lecture Profile-Guided Optimizations 3 Profile-Guided Optimizations Last time Instruction scheduling Register renaming alanced Load Scheduling Loop unrolling Software pipelining Today More instruction scheduling Profiling Trace scheduling CS553

More information

Native Simulation of Complex VLIW Instruction Sets Using Static Binary Translation and Hardware-Assisted Virtualization

Native Simulation of Complex VLIW Instruction Sets Using Static Binary Translation and Hardware-Assisted Virtualization Native Simulation of Complex VLIW Instruction Sets Using Static Binary Translation and Hardware-Assisted Virtualization Mian-Muhammad Hamayun, Frédéric Pétrot and Nicolas Fournel System Level Synthesis

More information

In Search of Near-Optimal Optimization Phase Orderings

In Search of Near-Optimal Optimization Phase Orderings In Search of Near-Optimal Optimization Phase Orderings Prasad A. Kulkarni David B. Whalley Gary S. Tyson Jack W. Davidson Languages, Compilers, and Tools for Embedded Systems Optimization Phase Ordering

More information

Automatic Instrumentation of Embedded Software for High Level Hardware/Software Co-Simulation

Automatic Instrumentation of Embedded Software for High Level Hardware/Software Co-Simulation Automatic Instrumentation of Embedded Software for High Level Hardware/Software Co-Simulation Aimen Bouchhima, Patrice Gerin and Frédéric Pétrot System-Level Synthesis Group TIMA Laboratory 46, Av Félix

More information

Model-based Software Development

Model-based Software Development Model-based Software Development 1 SCADE Suite Application Model in SCADE (data flow + SSM) System Model (tasks, interrupts, buses, ) SymTA/S Generator System-level Schedulability Analysis Astrée ait StackAnalyzer

More information

Running class Timing on Java HotSpot VM, 1

Running class Timing on Java HotSpot VM, 1 Compiler construction 2009 Lecture 3. A first look at optimization: Peephole optimization. A simple example A Java class public class A { public static int f (int x) { int r = 3; int s = r + 5; return

More information

Automatic Instrumentation Technique of Embedded Software for High Level Hardware/Software Co-Simulation

Automatic Instrumentation Technique of Embedded Software for High Level Hardware/Software Co-Simulation Automatic Instrumentation Technique of Embedded Software for High Level Hardware/Software Co-Simulation Aimen Bouchhima, Patrice Gerin and Frédéric Pétrot System Level Synthesis Group, TIMA Laboratory

More information

Compiler Design Spring 2018

Compiler Design Spring 2018 Compiler Design Spring 2018 Thomas R. Gross Computer Science Department ETH Zurich, Switzerland 1 Logistics Lecture Tuesdays: 10:15 11:55 Thursdays: 10:15 -- 11:55 In ETF E1 Recitation Announced later

More information

FLORIDA STATE UNIVERSITY COLLEGE OF ARTS AND SCIENCES DEEP: DEPENDENCY ELIMINATION USING EARLY PREDICTIONS LUIS G. PENAGOS

FLORIDA STATE UNIVERSITY COLLEGE OF ARTS AND SCIENCES DEEP: DEPENDENCY ELIMINATION USING EARLY PREDICTIONS LUIS G. PENAGOS FLORIDA STATE UNIVERSITY COLLEGE OF ARTS AND SCIENCES DEEP: DEPENDENCY ELIMINATION USING EARLY PREDICTIONS By LUIS G. PENAGOS A Thesis submitted to the Department of Computer Science in partial fulfillment

More information

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573 What is a compiler? Xiaokang Qiu Purdue University ECE 573 August 21, 2017 What is a compiler? What is a compiler? Traditionally: Program that analyzes and translates from a high level language (e.g.,

More information

The basic operations defined on a symbol table include: free to remove all entries and free the storage of a symbol table

The basic operations defined on a symbol table include: free to remove all entries and free the storage of a symbol table SYMBOL TABLE: A symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier in a program's source code is associated with information relating

More information

When do We Run a Compiler?

When do We Run a Compiler? When do We Run a Compiler? Prior to execution This is standard. We compile a program once, then use it repeatedly. At the start of each execution We can incorporate values known at the start of the run

More information

Register Allocation. CS 502 Lecture 14 11/25/08

Register Allocation. CS 502 Lecture 14 11/25/08 Register Allocation CS 502 Lecture 14 11/25/08 Where we are... Reasonably low-level intermediate representation: sequence of simple instructions followed by a transfer of control. a representation of static

More information

Truffle A language implementation framework

Truffle A language implementation framework Truffle A language implementation framework Boris Spasojević Senior Researcher VM Research Group, Oracle Labs Slides based on previous talks given by Christian Wimmer, Christian Humer and Matthias Grimmer.

More information

Compiling Techniques

Compiling Techniques Lecture 2: The view from 35000 feet 19 September 2017 Table of contents 1 2 Passes Representations 3 Instruction Selection Register Allocation Instruction Scheduling 4 of a compiler Source Compiler Machine

More information

For our next chapter, we will discuss the emulation process which is an integral part of virtual machines.

For our next chapter, we will discuss the emulation process which is an integral part of virtual machines. For our next chapter, we will discuss the emulation process which is an integral part of virtual machines. 1 2 For today s lecture, we ll start by defining what we mean by emulation. Specifically, in this

More information

Outline. Lecture 17: Putting it all together. Example (input program) How to make the computer understand? Example (Output assembly code) Fall 2002

Outline. Lecture 17: Putting it all together. Example (input program) How to make the computer understand? Example (Output assembly code) Fall 2002 Outline 5 Fall 2002 Lecture 17: Putting it all together From parsing to code generation Saman Amarasinghe 2 6.035 MIT Fall 1998 How to make the computer understand? Write a program using a programming

More information

Design & Implementation Overview

Design & Implementation Overview P Fall 2017 Outline P 1 2 3 4 5 6 7 P P Ontological commitments P Imperative l Architecture: Memory cells variables Data movement (memory memory, CPU memory) assignment Sequential machine instruction execution

More information

Group B Assignment 9. Code generation using DAG. Title of Assignment: Problem Definition: Code generation using DAG / labeled tree.

Group B Assignment 9. Code generation using DAG. Title of Assignment: Problem Definition: Code generation using DAG / labeled tree. Group B Assignment 9 Att (2) Perm(3) Oral(5) Total(10) Sign Title of Assignment: Code generation using DAG. 9.1.1 Problem Definition: Code generation using DAG / labeled tree. 9.1.2 Perquisite: Lex, Yacc,

More information

Compilers and Code Optimization EDOARDO FUSELLA

Compilers and Code Optimization EDOARDO FUSELLA Compilers and Code Optimization EDOARDO FUSELLA The course covers Compiler architecture Pre-requisite Front-end Strong programming background in C, C++ Back-end LLVM Code optimization A case study: nu+

More information

Compiler construction 2009

Compiler construction 2009 Compiler construction 2009 Lecture 3 JVM and optimization. A first look at optimization: Peephole optimization. A simple example A Java class public class A { public static int f (int x) { int r = 3; int

More information

JOP: A Java Optimized Processor for Embedded Real-Time Systems. Martin Schoeberl

JOP: A Java Optimized Processor for Embedded Real-Time Systems. Martin Schoeberl JOP: A Java Optimized Processor for Embedded Real-Time Systems Martin Schoeberl JOP Research Targets Java processor Time-predictable architecture Small design Working solution (FPGA) JOP Overview 2 Overview

More information

CSE 501: Compiler Construction. Course outline. Goals for language implementation. Why study compilers? Models of compilation

CSE 501: Compiler Construction. Course outline. Goals for language implementation. Why study compilers? Models of compilation CSE 501: Compiler Construction Course outline Main focus: program analysis and transformation how to represent programs? how to analyze programs? what to analyze? how to transform programs? what transformations

More information

Integrated Modulo Scheduling and Cluster Assignment for TMS320C64x+ Architecture 1

Integrated Modulo Scheduling and Cluster Assignment for TMS320C64x+ Architecture 1 Integrated Modulo Scheduling and Cluster Assignment for TMS320C64x+ Architecture 1 Nikolai Kim, Andreas Krall {kim,andi}@complang.tuwien.ac.at Institute of Computer Languages University of Technology Vienna

More information

Where We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization.

Where We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization. Where We Are Source Code Lexical Analysis Syntax Analysis Semantic Analysis IR Generation IR Optimization Code Generation Optimization Machine Code Where We Are Source Code Lexical Analysis Syntax Analysis

More information

LECTURE 2. Compilers and Interpreters

LECTURE 2. Compilers and Interpreters LECTURE 2 Compilers and Interpreters COMPILATION AND INTERPRETATION Programs written in high-level languages can be run in two ways. Compiled into an executable program written in machine language for

More information

Advanced Computer Architecture

Advanced Computer Architecture ECE 563 Advanced Computer Architecture Fall 2007 Lecture 14: Virtual Machines 563 L14.1 Fall 2009 Outline Types of Virtual Machine User-level (or Process VMs) System-level Techniques for implementing all

More information

Functions and Procedures

Functions and Procedures Functions and Procedures Function or Procedure A separate piece of code Possibly separately compiled Located at some address in the memory used for code, away from main and other functions (main is itself

More information

An introduction to DSP s. Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures

An introduction to DSP s. Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures An introduction to DSP s Examples of DSP applications Why a DSP? Characteristics of a DSP Architectures DSP example: mobile phone DSP example: mobile phone with video camera DSP: applications Why a DSP?

More information

INTRODUCTION TO LLVM Bo Wang SA 2016 Fall

INTRODUCTION TO LLVM Bo Wang SA 2016 Fall INTRODUCTION TO LLVM Bo Wang SA 2016 Fall LLVM Basic LLVM IR LLVM Pass OUTLINE What is LLVM? LLVM is a compiler infrastructure designed as a set of reusable libraries with well-defined interfaces. Implemented

More information

Incremental Flow Analysis. Andreas Krall and Thomas Berger. Institut fur Computersprachen. Technische Universitat Wien. Argentinierstrae 8

Incremental Flow Analysis. Andreas Krall and Thomas Berger. Institut fur Computersprachen. Technische Universitat Wien. Argentinierstrae 8 Incremental Flow Analysis Andreas Krall and Thomas Berger Institut fur Computersprachen Technische Universitat Wien Argentinierstrae 8 A-1040 Wien fandi,tbg@mips.complang.tuwien.ac.at Abstract Abstract

More information

Instructors: Randy H. Katz David A. PaKerson hkp://inst.eecs.berkeley.edu/~cs61c/fa10. Fall Lecture #11. Agenda

Instructors: Randy H. Katz David A. PaKerson hkp://inst.eecs.berkeley.edu/~cs61c/fa10. Fall Lecture #11. Agenda CS 61C: Great Ideas in Computer Architecture (Machine Structures) Compilers, Interpreters, Instructors: Randy H. Katz David A. PaKerson hkp://inst.eecs.berkeley.edu/~cs61c/fa10 9/20/10 Fall 2010 - - Lecture

More information

Introduction. CS 2210 Compiler Design Wonsun Ahn

Introduction. CS 2210 Compiler Design Wonsun Ahn Introduction CS 2210 Compiler Design Wonsun Ahn What is a Compiler? Compiler: A program that translates source code written in one language to a target code written in another language Source code: Input

More information

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation Weiping Liao, Saengrawee (Anne) Pratoomtong, and Chuan Zhang Abstract Binary translation is an important component for translating

More information

Memory Access Optimizations in Instruction-Set Simulators

Memory Access Optimizations in Instruction-Set Simulators Memory Access Optimizations in Instruction-Set Simulators Mehrdad Reshadi Center for Embedded Computer Systems (CECS) University of California Irvine Irvine, CA 92697, USA reshadi@cecs.uci.edu ABSTRACT

More information

Lecture 3 Overview of the LLVM Compiler

Lecture 3 Overview of the LLVM Compiler Lecture 3 Overview of the LLVM Compiler Jonathan Burket Special thanks to Deby Katz, Gennady Pekhimenko, Olatunji Ruwase, Chris Lattner, Vikram Adve, and David Koes for their slides The LLVM Compiler Infrastructure

More information

Crusoe Reference. What is Binary Translation. What is so hard about it? Thinking Outside the Box The Transmeta Crusoe Processor

Crusoe Reference. What is Binary Translation. What is so hard about it? Thinking Outside the Box The Transmeta Crusoe Processor Crusoe Reference Thinking Outside the Box The Transmeta Crusoe Processor 55:132/22C:160 High Performance Computer Architecture The Technology Behind Crusoe Processors--Low-power -Compatible Processors

More information

Compilers. History of Compilers. A compiler allows programmers to ignore the machine-dependent details of programming.

Compilers. History of Compilers. A compiler allows programmers to ignore the machine-dependent details of programming. Compilers Compilers are fundamental to modern computing. They act as translators, transforming human-oriented programming languages into computer-oriented machine languages. To most users, a compiler can

More information

ECE 5775 (Fall 17) High-Level Digital Design Automation. Static Single Assignment

ECE 5775 (Fall 17) High-Level Digital Design Automation. Static Single Assignment ECE 5775 (Fall 17) High-Level Digital Design Automation Static Single Assignment Announcements HW 1 released (due Friday) Student-led discussions on Tuesday 9/26 Sign up on Piazza: 3 students / group Meet

More information

CSE 401/M501 Compilers

CSE 401/M501 Compilers CSE 401/M501 Compilers Intermediate Representations Hal Perkins Autumn 2018 UW CSE 401/M501 Autumn 2018 G-1 Agenda Survey of Intermediate Representations Graphical Concrete/Abstract Syntax Trees (ASTs)

More information

LESSON 13: LANGUAGE TRANSLATION

LESSON 13: LANGUAGE TRANSLATION LESSON 13: LANGUAGE TRANSLATION Objective Interpreters and Compilers. Language Translation Phases. Interpreters and Compilers A COMPILER is a program that translates a complete source program into machine

More information

ECE331: Hardware Organization and Design

ECE331: Hardware Organization and Design ECE331: Hardware Organization and Design Lecture 19: Verilog and Processor Performance Adapted from Computer Organization and Design, Patterson & Hennessy, UCB Verilog Basics Hardware description language

More information

Kasper Lund, Software engineer at Google. Crankshaft. Turbocharging the next generation of web applications

Kasper Lund, Software engineer at Google. Crankshaft. Turbocharging the next generation of web applications Kasper Lund, Software engineer at Google Crankshaft Turbocharging the next generation of web applications Overview Why did we introduce Crankshaft? Deciding when and what to optimize Type feedback and

More information

Apple LLVM GPU Compiler: Embedded Dragons. Charu Chandrasekaran, Apple Marcello Maggioni, Apple

Apple LLVM GPU Compiler: Embedded Dragons. Charu Chandrasekaran, Apple Marcello Maggioni, Apple Apple LLVM GPU Compiler: Embedded Dragons Charu Chandrasekaran, Apple Marcello Maggioni, Apple 1 Agenda How Apple uses LLVM to build a GPU Compiler Factors that affect GPU performance The Apple GPU compiler

More information

Application-Specific Design of Low Power Instruction Cache Hierarchy for Embedded Processors

Application-Specific Design of Low Power Instruction Cache Hierarchy for Embedded Processors Agenda Application-Specific Design of Low Power Instruction ache ierarchy for mbedded Processors Ji Gu Onodera Laboratory Department of ommunications & omputer ngineering Graduate School of Informatics

More information

Precept 2: Non-preemptive Scheduler. COS 318: Fall 2018

Precept 2: Non-preemptive Scheduler. COS 318: Fall 2018 Precept 2: Non-preemptive Scheduler COS 318: Fall 2018 Project 2 Schedule Precept: Monday 10/01, 7:30pm (You are here) Design Review: Monday 10/08, 3-7pm Due: Sunday 10/14, 11:55pm Project 2 Overview Goal:

More information

Incremental Dynamic Code Generation with Trace Trees

Incremental Dynamic Code Generation with Trace Trees Incremental Dynamic Code Generation with Trace Trees Andreas Gal University of California, Irvine gal@uci.edu Michael Franz University of California, Irvine franz@uci.edu Abstract The unit of compilation

More information

Emulation. Michael Jantz

Emulation. Michael Jantz Emulation Michael Jantz Acknowledgements Slides adapted from Chapter 2 in Virtual Machines: Versatile Platforms for Systems and Processes by James E. Smith and Ravi Nair Credit to Prasad A. Kulkarni some

More information

A Feasibility Study for Methods of Effective Memoization Optimization

A Feasibility Study for Methods of Effective Memoization Optimization A Feasibility Study for Methods of Effective Memoization Optimization Daniel Mock October 2018 Abstract Traditionally, memoization is a compiler optimization that is applied to regions of code with few

More information

A Method-Based Ahead-of-Time Compiler For Android Applications

A Method-Based Ahead-of-Time Compiler For Android Applications A Method-Based Ahead-of-Time Compiler For Android Applications Fatma Deli Computer Science & Software Engineering University of Washington Bothell November, 2012 2 Introduction This paper proposes a method-based

More information

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2 Building a Compiler with JoeQ AKA, how to solve Homework 2 Outline of this lecture Building a compiler: what pieces we need? An effective IR for Java joeq Homework hints How to Build a Compiler 1. Choose

More information

Accelerating Stateflow With LLVM

Accelerating Stateflow With LLVM Accelerating Stateflow With LLVM By Dale Martin Dale.Martin@mathworks.com 2015 The MathWorks, Inc. 1 What is Stateflow? A block in Simulink, which is a graphical language for modeling algorithms 2 What

More information

Mechanisms and constructs for System Virtualization

Mechanisms and constructs for System Virtualization Mechanisms and constructs for System Virtualization Content Outline Design goals for virtualization General Constructs for virtualization Virtualization for: System VMs Process VMs Prevalent trends: Pros

More information

Introduction Contech s Task Graph Representation Parallel Program Instrumentation (Break) Analysis and Usage of a Contech Task Graph Hands-on

Introduction Contech s Task Graph Representation Parallel Program Instrumentation (Break) Analysis and Usage of a Contech Task Graph Hands-on Introduction Contech s Task Graph Representation Parallel Program Instrumentation (Break) Analysis and Usage of a Contech Task Graph Hands-on Exercises 2 Contech is An LLVM compiler pass to instrument

More information

RetDec: An Open-Source Machine-Code Decompiler. Jakub Křoustek Peter Matula

RetDec: An Open-Source Machine-Code Decompiler. Jakub Křoustek Peter Matula RetDec: An Open-Source Machine-Code Decompiler Jakub Křoustek Peter Matula Who Are We? 2 Jakub Křoustek Founder of RetDec Threat Labs lead @Avast (previously @AVG) Reverse engineer, malware hunter, security

More information

LLVM Tutorial. John Criswell University of Rochester

LLVM Tutorial. John Criswell University of Rochester LLVM Tutorial John Criswell University of Rochester 1 Overview 2 History of LLVM Developed by Chris Lattner and Vikram Adve at the University of Illinois at Urbana-Champaign Released open-source in October

More information

Java Performance: The Definitive Guide

Java Performance: The Definitive Guide Java Performance: The Definitive Guide Scott Oaks Beijing Cambridge Farnham Kbln Sebastopol Tokyo O'REILLY Table of Contents Preface ix 1. Introduction 1 A Brief Outline 2 Platforms and Conventions 2 JVM

More information

Hardware Emulation and Virtual Machines

Hardware Emulation and Virtual Machines Hardware Emulation and Virtual Machines Overview Review of How Programs Run: Registers Execution Cycle Processor Emulation Types: Pure Translation Static Recompilation Dynamic Recompilation Direct Bytecode

More information

The Software Stack: From Assembly Language to Machine Code

The Software Stack: From Assembly Language to Machine Code COMP 506 Rice University Spring 2018 The Software Stack: From Assembly Language to Machine Code source code IR Front End Optimizer Back End IR target code Somewhere Out Here Copyright 2018, Keith D. Cooper

More information

Lecture 2 Overview of the LLVM Compiler

Lecture 2 Overview of the LLVM Compiler Lecture 2 Overview of the LLVM Compiler Abhilasha Jain Thanks to: VikramAdve, Jonathan Burket, DebyKatz, David Koes, Chris Lattner, Gennady Pekhimenko, and Olatunji Ruwase, for their slides The LLVM Compiler

More information

Swift: A Register-based JIT Compiler for Embedded JVMs

Swift: A Register-based JIT Compiler for Embedded JVMs Swift: A Register-based JIT Compiler for Embedded JVMs Yuan Zhang, Min Yang, Bo Zhou, Zhemin Yang, Weihua Zhang, Binyu Zang Fudan University Eighth Conference on Virtual Execution Environment (VEE 2012)

More information