Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler

1 Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler
Hiroshi Inoue, Hiroshige Hayashizaki, Peng Wu and Toshio Nakatani
IBM Research - Tokyo; IBM Research - T.J. Watson Research Center
October 23, 2012, SPLASH/OOPSLA 2012 at Tucson, Arizona
© 2012 IBM Corporation

2 Goal
Balancing steady-state performance and startup performance has been a challenging task for JIT compilers.
Our goal: to improve the steady-state performance without hurting startup performance in a trace-based Java JIT compiler.
Our contributions:
1. We show that adaptive multi-level compilation is a practical way to balance the startup time and steady-state performance in a trace-JIT.
2. We develop a new technique to efficiently identify hot paths to recompile at a higher optimization level.

3 Background: Trace-based Compilation
Using a trace, a hot path identified at runtime, as a basic unit of compilation.
[Figure: control-flow graph of method f: method entry; if (x != 0) with a rarely executed branch and a frequently executed branch; do something; while (!end); return.]

4 Background: Trace-based Compilation
Using a trace, a hot path identified at runtime, as a basic unit of compilation.
Trace selection: how to form a good compilation scope.
[Figure: the control-flow graph of method f (method entry; if (x != 0) with rarely and frequently executed branches; do something; while (!end); return) next to the selected traces A and B, with labels for trace entry, trace exit (side exit), exit stubs, and a cyclic trace.]
A trace includes only a hot path.

5 Background: Trace Selection and Performance
Steps of trace selection:
1. Identify a hot trace head: a target of a backward branch, or an exit point of an existing trace.
2. Record the next execution path starting from the trace head.
3. Stop recording when the trace being recorded forms a cycle or reaches a pre-defined maximum length.
Generating longer traces (compilation scopes) yields better steady-state performance: more optimization opportunities for compilers and smaller trace transitioning overhead.
But it hurts startup performance: longer compilation time and more duplicated code among traces.
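
The three selection steps on slide 5 can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the authors' implementation: the class `TraceSelectionSketch`, the `hotThreshold` parameter, and the use of integer basic-block ids are all hypothetical, and a "cycle" is taken to mean that execution returns to the trace head.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of counter-based trace selection; not the authors' implementation. */
final class TraceSelectionSketch {
    private final int hotThreshold; // assumed: executions before a head is considered hot
    private final Map<Integer, Integer> headCounters = new HashMap<>();

    TraceSelectionSketch(int hotThreshold) {
        this.hotThreshold = hotThreshold;
    }

    /** Step 1: count one execution of a potential trace head; true once it is hot. */
    boolean countHead(int blockId) {
        return headCounters.merge(blockId, 1, Integer::sum) >= hotThreshold;
    }

    /**
     * Steps 2 and 3: record blocks along the executed path, starting at the head
     * (the first element), until the path returns to the head or maxLength is reached.
     */
    static List<Integer> record(List<Integer> executedBlocks, int maxLength) {
        List<Integer> trace = new ArrayList<>();
        if (executedBlocks.isEmpty()) {
            return trace;
        }
        int head = executedBlocks.get(0);
        for (int block : executedBlocks) {
            boolean closesCycle = !trace.isEmpty() && block == head;
            if (closesCycle || trace.size() >= maxLength) {
                break;
            }
            trace.add(block);
        }
        return trace;
    }
}
```

A smaller `maxLength` corresponds to the shorter traces that favor startup, while a larger one gives the compiler a bigger scope at the cost of compilation time.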

6 Our Approach: Adaptive multi-level compilation for trace-JIT
1. Interpreted execution
2. Initial compilation for faster startup: smaller compilation scope and lower optimization level; hot paths are identified by counting the executions of each potential trace head (e.g. a loop backedge).
Our focus: trace selection for upgrade recompilation, which identifies really hot paths by using a timer-based sampling profiler.
3. Upgrade recompilation for higher steady-state performance: larger compilation scope and higher optimization level.

7 Steps of Our Multi-level Compilation
[Figure: a source program with basic blocks BB1 to BB7, its hot and cold paths, the trace head, and the traces (including cyclic traces) formed at each step.]
The program is first executed by the interpreter with monitoring to identify hot paths.
1. Initial compilation: we limit the maximum trace length for faster startup (here 2 BBs). The traces are executed as compiled code with the timer-based sampling profiler.
2. Upgrade recompilation: hot traces are recompiled.
Our focus: trace selection for upgrade recompilation.

8 An Example of Inefficient Trace Selection for Recompilation
[Figure: a straight-line hot path covered by traces 1, 2 and 3, shown before and after upgrade recompilation, for a badly formed case and the desired case.]
Problem: insufficient coverage by recompiled traces.
We want to cover the entire hot path by a recompiled trace.

9 Our Solution: Trace Transition Graph (TTgraph)
The Trace-Transition Graph (TTgraph) is a weighted directed graph representing the control flow among traces.
Node: a compiled trace.
Edge: a transition between two traces; its weight represents relative frequency.
We identify the hot paths to recompile using the TTgraph, capturing the entire hot path by traversing the edges backward.

10 Steps to Build TTgraph
[Figure: an example TTgraph with a cyclic trace and weighted edges.]
1. We add a node when a trace is compiled.
2. We add an edge between two traces when we link them (trace linking).
3. We increment the weight of an edge by the timer-based sampling profiler.
See the paper for more detail (e.g. how we profile trace transitions and how we employ bursty tracing).
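
As a rough illustration of the three build steps, the following Java sketch keeps the TTgraph as a map from each trace to its weighted incoming edges. Everything here is an assumption made for the example: the class `TTGraphSketch`, its method names, the integer trace ids, and the idea that each profiler sample arrives as a (from, to) pair. How the real profiler attributes samples to edges is described in the paper, not in the slides.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a trace-transition graph; not the authors' data structure. */
final class TTGraphSketch {
    // incoming.get(to).get(from) is the weight of the edge from -> to.
    private final Map<Integer, Map<Integer, Integer>> incoming = new HashMap<>();

    /** Step 1: add a node when a trace is compiled. */
    void addNode(int traceId) {
        incoming.putIfAbsent(traceId, new HashMap<>());
    }

    /** Step 2: add an edge (weight 0) when two traces are linked. */
    void addEdge(int from, int to) {
        addNode(from);
        addNode(to);
        incoming.get(to).putIfAbsent(from, 0);
    }

    /** Step 3: a profiler sample observed on an existing edge increments its weight. */
    void recordSample(int from, int to) {
        Map<Integer, Integer> preds = incoming.get(to);
        if (preds != null && preds.containsKey(from)) {
            preds.merge(from, 1, Integer::sum);
        }
    }

    /** Weight of the edge from -> to, or 0 if the edge does not exist. */
    int weight(int from, int to) {
        Map<Integer, Integer> preds = incoming.get(to);
        return preds == null ? 0 : preds.getOrDefault(from, 0);
    }
}
```

Storing edges by their destination is a choice made for this sketch because recompilation walks the graph backward, from a hot trace to its predecessors.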

13 Recompilation Based on TTgraph
[Figure: the example TTgraph with the traces selected for recompilation.]
We traverse the hot edges backward to identify the new trace head for recompilation.
We stop the backward traversal when hitting a cyclic trace (our system does not capture a cyclic trace and a path in one trace).
If there are multiple hot incoming edges for a node, we track both of them.
See the paper for more detail.
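
The backward traversal can be sketched as a small worklist algorithm. This is a hedged illustration only: `HeadFinderSketch`, the `hotWeight` cutoff for what counts as a hot edge, and the map-based graph representation are assumptions for the example, and the exact criteria used by the authors are in the paper. The sketch follows every hot incoming edge, refuses to walk into a cyclic trace, and reports each trace where the walk cannot be extended as a candidate head.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of backward traversal over a trace-transition graph. */
final class HeadFinderSketch {
    /**
     * @param start     id of the hot trace where the traversal begins
     * @param incoming  incoming.get(to).get(from) is the weight of edge from -> to
     * @param cyclic    ids of cyclic traces, which stop the traversal
     * @param hotWeight assumed cutoff: edges with at least this weight are hot
     * @return ids of traces that would become heads for recompilation
     */
    static Set<Integer> findHeads(int start,
                                  Map<Integer, Map<Integer, Integer>> incoming,
                                  Set<Integer> cyclic,
                                  int hotWeight) {
        Set<Integer> heads = new TreeSet<>();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            int node = work.pop();
            if (!visited.add(node)) {
                continue; // already processed
            }
            boolean extended = false;
            Map<Integer, Integer> preds =
                incoming.getOrDefault(node, Collections.<Integer, Integer>emptyMap());
            for (Map.Entry<Integer, Integer> edge : preds.entrySet()) {
                boolean hot = edge.getValue() >= hotWeight;
                // Follow every hot incoming edge, but never walk into a cyclic trace.
                if (hot && !cyclic.contains(edge.getKey())) {
                    work.push(edge.getKey());
                    extended = true;
                }
            }
            if (!extended) {
                heads.add(node); // the hot path cannot be extended further back
            }
        }
        return heads;
    }
}
```

When a node has several hot incoming edges, each predecessor is pushed onto the worklist, so the result may contain more than one head.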

14 Performance Evaluation
Hardware: IBM BladeCenter JS22, 4 cores (8 SMT threads) of POWER6 4.0GHz, 6 GB system memory.
Our trace-JIT: extended IBM JVM for Java 6 (32-bit) to support trace compilation.
Two optimization levels (for both trace selection and code generation): cold and hot.
We compare cold only, hot only, and our adaptive multi-level compilation.
Benchmark: DaCapo benchmark suite.

15 Steady-state Performance
Always using the hot level gives a 26% gain in steady state.
[Chart: relative steady-state performance (higher is faster) for cold only, hot only, and our adaptive multi-level compilation on avrora, batik, eclipse, fop, h2, jython, luindex, lusearch, pmd, sunflow, tomcat, tradebeans, xalan, and the geomean.]
Our technique achieved a 22% gain in steady state.

16 Startup Time (Execution time of first iteration)
Using the hot level makes the startup 50% (up to 3x) slower!
[Chart: relative startup time (shorter is faster) for cold only, hot only, and our adaptive multi-level compilation on avrora, batik, eclipse, fop, h2, jython, luindex, lusearch, pmd, sunflow, tomcat, tradebeans, xalan, and the geomean.]
Our technique does not hurt the startup performance!

17 Improvement by TTgraph-Based Recompilation
[Chart: relative peak performance (higher is faster) of basic recompilation and TTgraph-based recompilation on avrora, batik, eclipse, fop, h2, jython, luindex, lusearch, pmd, sunflow, tomcat, tradebeans, xalan, and the geomean.]

18 Summary
We showed that adaptive multi-level compilation is a practical way to balance the startup time and steady-state performance in a trace-JIT.
We developed an efficient technique based on the TTgraph to identify hot paths.
We described how the trace selection engine, the timer-based sampling profiler, and the code generator work together for effective recompilation.

19 Thank you!

20 Backup

21 CPU Time Breakdown
[Chart: normalized CPU time (shorter is faster), broken down into OS, native libraries, and JIT-compiled code (hot and cold), for single-level compilation (cold level), single-level compilation (hot level), and adaptive multi-level compilation (TTgraph-based recompilation) on avrora, batik, eclipse, fop, h2, jython, luindex, lusearch, pmd, sunflow, tomcat, tradebeans, xalan, and the geomean.]
Most of the execution time is spent in recompiled code.


More information

Program Calling Context. Motivation

Program Calling Context. Motivation Program alling ontext Motivation alling context enhances program understanding and dynamic analyses by providing a rich representation of program location ollecting calling context can be expensive The

More information

Ahead of Time (AOT) Compilation

Ahead of Time (AOT) Compilation Ahead of Time (AOT) Compilation Vaibhav Choudhary (@vaibhav_c) Java Platforms Team https://blogs.oracle.com/vaibhav Copyright 2018, Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement

More information

Tolerating Memory Leaks

Tolerating Memory Leaks Tolerating Memory Leaks UT Austin Technical Report TR-07-64 December 7, 2007 Michael D. Bond Kathryn S. McKinley Department of Computer Sciences The University of Texas at Austin {mikebond,mckinley}@cs.utexas.edu

More information

On The Limits of Modeling Generational Garbage Collector Performance

On The Limits of Modeling Generational Garbage Collector Performance On The Limits of Modeling Generational Garbage Collector Performance Peter Libič Lubomír Bulej Vojtěch Horký Petr Tůma Department of Distributed and Dependable Systems Faculty of Mathematics and Physics,

More information

A Performance Study of Java Garbage Collectors on Multicore Architectures

A Performance Study of Java Garbage Collectors on Multicore Architectures A Performance Study of Java Garbage Collectors on Multicore Architectures Maria Carpen-Amarie Université de Neuchâtel Neuchâtel, Switzerland maria.carpen-amarie@unine.ch Patrick Marlier Université de Neuchâtel

More information

Evolution of Virtual Machine Technologies for Portability and Application Capture. Bob Vandette Java Hotspot VM Engineering Sept 2004

Evolution of Virtual Machine Technologies for Portability and Application Capture. Bob Vandette Java Hotspot VM Engineering Sept 2004 Evolution of Virtual Machine Technologies for Portability and Application Capture Bob Vandette Java Hotspot VM Engineering Sept 2004 Topics Virtual Machine Evolution Timeline & Products Trends forcing

More information

Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 1 / 1

Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 1 / 1 Dynamic Binary Translation for Generation of Cycle Accurate Architecture Simulators Institut für Computersprachen Technische Universtät Wien Austria Andreas Fellnhofer Andreas Krall David Riegler Part

More information

MicroPhase: An Approach to Proactively Invoking Garbage Collection for Improved Performance

MicroPhase: An Approach to Proactively Invoking Garbage Collection for Improved Performance MicroPhase: An Approach to Proactively Invoking Garbage Collection for Improved Performance Feng Xian, Witawas Srisa-an, and Hong Jiang Department of Computer Science & Engineering University of Nebraska-Lincoln

More information

VIRTUAL MACHINE. Shuvabrata Saha

VIRTUAL MACHINE. Shuvabrata Saha A MULTI-OBJECTIVE AUTOTUNING FRAMEWORK FOR THE JAVA VIRTUAL MACHINE by Shuvabrata Saha A thesis submitted to the Graduate College of Texas State University for the degree of Master of Science with a Major

More information

PERFBLOWER: Quickly Detecting Memory-Related Performance Problems via Amplification

PERFBLOWER: Quickly Detecting Memory-Related Performance Problems via Amplification PERFBLOWER: Quickly Detecting Memory-Related Performance Problems via Amplification Lu Fang 1, Liang Dou 2, and Guoqing Xu 1 1 University of California, Irvine, USA {lfang3, guoqingx}@ics.uci.edu 2 East

More information

Fault Tolerant Java Virtual Machine. Roy Friedman and Alon Kama Technion Haifa, Israel

Fault Tolerant Java Virtual Machine. Roy Friedman and Alon Kama Technion Haifa, Israel Fault Tolerant Java Virtual Machine Roy Friedman and Alon Kama Technion Haifa, Israel Objective Create framework for transparent fault-tolerance Support legacy applications Intended for long-lived, highly

More information

Workload Optimization on Hybrid Architectures

Workload Optimization on Hybrid Architectures IBM OCR project Workload Optimization on Hybrid Architectures IBM T.J. Watson Research Center May 4, 2011 Chiron & Achilles 2003 IBM Corporation IBM Research Goal Parallelism with hundreds and thousands

More information

Tail Latency: Beyond Queuing Theory. Kathryn S McKinley Xi Yang, Stephen M Blackburn, Sameh Elnikety, Yuxiong He, Ricardo Bianchini

Tail Latency: Beyond Queuing Theory. Kathryn S McKinley Xi Yang, Stephen M Blackburn, Sameh Elnikety, Yuxiong He, Ricardo Bianchini Tail Latency: Beyond Queuing Theory Kathryn S McKinley Xi Yang, Stephen M Blackburn, Sameh Elnikety, Yuxiong He, Ricardo Bianchini Microsoft s Quincy datacenter 3 Servers in US datacenters 2 Servers in

More information

The Design, Implementation, and Evaluation of Adaptive Code Unloading for Resource-Constrained

The Design, Implementation, and Evaluation of Adaptive Code Unloading for Resource-Constrained The Design, Implementation, and Evaluation of Adaptive Code Unloading for Resource-Constrained Devices LINGLI ZHANG, CHANDRA KRINTZ University of California, Santa Barbara Java Virtual Machines (JVMs)

More information

Towards future method hotness prediction for Virtual Machines

Towards future method hotness prediction for Virtual Machines Towards future method hotness prediction for Virtual Machines Manjiri A. Namjoshi Submitted to the Department of Electrical Engineering & Computer Science and the Faculty of the Graduate School of the

More information

Finding the Limit: Examining the Potential and Complexity of Compilation Scheduling for JIT-Based Runtime Systems

Finding the Limit: Examining the Potential and Complexity of Compilation Scheduling for JIT-Based Runtime Systems Finding the Limit: Examining the Potential and Complexity of Compilation Scheduling for JIT-Based Runtime Systems Yufei Ding, Mingzhou Zhou, Zhijia Zhao, Sarah Eisenstat, Xipeng Shen Computer Science Department,

More information

A Status Update of BEAMJIT, the Just-in-Time Compiling Abstract Machine. Frej Drejhammar and Lars Rasmusson

A Status Update of BEAMJIT, the Just-in-Time Compiling Abstract Machine. Frej Drejhammar and Lars Rasmusson A Status Update of BEAMJIT, the Just-in-Time Compiling Abstract Machine Frej Drejhammar and Lars Rasmusson 140609 Who am I? Senior researcher at the Swedish Institute of Computer Science

More information

Managed runtimes & garbage collection

Managed runtimes & garbage collection Managed runtimes Advantages? Managed runtimes & garbage collection CSE 631 Some slides by Kathryn McKinley Disadvantages? 1 2 Managed runtimes Portability (& performance) Advantages? Reliability Security

More information

PennBench: A Benchmark Suite for Embedded Java

PennBench: A Benchmark Suite for Embedded Java WWC5 Austin, TX. Nov. 2002 PennBench: A Benchmark Suite for Embedded Java G. Chen, M. Kandemir, N. Vijaykrishnan, And M. J. Irwin Penn State University http://www.cse.psu.edu/~mdl Outline Introduction

More information

Data Structure Aware Garbage Collector

Data Structure Aware Garbage Collector Data Structure Aware Garbage Collector Nachshon Cohen Technion nachshonc@gmail.com Erez Petrank Technion erez@cs.technion.ac.il Abstract Garbage collection may benefit greatly from knowledge about program

More information

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley Managed runtimes & garbage collection CSE 6341 Some slides by Kathryn McKinley 1 Managed runtimes Advantages? Disadvantages? 2 Managed runtimes Advantages? Reliability Security Portability Performance?

More information

History of Compilers The term

History of Compilers The term History of Compilers The term compiler was coined in the early 1950s by Grace Murray Hopper. Translation was viewed as the compilation of a sequence of machine-language subprograms selected from a library.

More information

AtomChase: Directed Search Towards Atomicity Violations

AtomChase: Directed Search Towards Atomicity Violations AtomChase: Directed Search Towards Atomicity Violations Mahdi Eslamimehr, Mohsen Lesani Viewpoints Research Institute, 1025 Westwood Blvd 2nd flr, Los Angeles, CA 90024 t: (310) 208-0524 AtomChase: Directed

More information

Subsuming Methods: Finding New Optimisation Opportunities in Object-Oriented Software

Subsuming Methods: Finding New Optimisation Opportunities in Object-Oriented Software Subsuming Methods: Finding New Optimisation Opportunities in Object-Oriented Software ABSTRACT David Maplesden Dept. of Computer Science The University of Auckland dmap001@aucklanduni.ac.nz John Hosking

More information

Profiling and Workflow

Profiling and Workflow Profiling and Workflow Preben N. Olsen University of Oslo and Simula Research Laboratory preben@simula.no September 13, 2013 1 / 34 Agenda 1 Introduction What? Why? How? 2 Profiling Tracing Performance

More information

Performance Profiling

Performance Profiling Performance Profiling Minsoo Ryu Real-Time Computing and Communications Lab. Hanyang University msryu@hanyang.ac.kr Outline History Understanding Profiling Understanding Performance Understanding Performance

More information