Java Instrumentation for Dynamic Analysis

Similar documents
Compiling Techniques

CSCE 314 Programming Languages

02 B The Java Virtual Machine

High-Level Language VMs

Java byte code verification

Java Security. Compiler. Compiler. Hardware. Interpreter. The virtual machine principle: Abstract Machine Code. Source Code

CS2110 Fall 2011 Lecture 25. Under the Hood: The Java Virtual Machine, Part II

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

Java Class Loading and Bytecode Verification

CMPSC 497: Java Security

Java: framework overview and in-the-small features

Static Program Analysis

JavaSplit. A Portable Distributed Runtime for Java. Michael Factor Assaf Schuster Konstantin Shagin

Code Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Compiler construction 2009

Exercise 7 Bytecode Verification self-study exercise sheet

JVML Instruction Set. How to get more than 256 local variables! Method Calls. Example. Method Calls

An Introduction to Multicodes. Ben Stephenson Department of Computer Science University of Western Ontario

JVM. What This Topic is About. Course Overview. Recap: Interpretive Compilers. Abstract Machines. Abstract Machines. Class Files and Class File Format

Compiler construction 2009

JAM 16: The Instruction Set & Sample Programs

Improving Java Code Performance. Make your Java/Dalvik VM happier

Code Generation Introduction

Over-view. CSc Java programs. Java programs. Logging on, and logging o. Slides by Michael Weeks Copyright Unix basics. javac.

Program Dynamic Analysis. Overview

3/15/18. Overview. Program Dynamic Analysis. What is dynamic analysis? [3] Why dynamic analysis? Why dynamic analysis? [3]

CS577 Modern Language Processors. Spring 2018 Lecture Interpreters

SOFTWARE ARCHITECTURE 7. JAVA VIRTUAL MACHINE

Oak Intermediate Bytecodes

Course Overview. PART I: overview material. PART II: inside a compiler. PART III: conclusion

Java Code Coverage Mechanics. by Evgeny Mandrikov at EclipseCon Europe 2017

The Java Virtual Machine

Under the Hood: The Java Virtual Machine. Problem: Too Many Platforms! Compiling for Different Platforms. Compiling for Different Platforms

Context Threading: A flexible and efficient dispatch technique for virtual machine interpreters

Improving Java Performance

Run-time Program Management. Hwansoo Han

Java Code Coverage Mechanics Evgeny Mandrikov Marc Hoffmann #JokerConf 2017, Saint-Petersburg

Translating JVM Code to MIPS Code 1 / 43

Plan for Today. Safe Programming Languages. What is a secure programming language?

Let s make some Marc R. Hoffmann Eclipse Summit Europe

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 16

Java Code Coverage Mechanics

Lecture 10. Dynamic Analysis

Scala Benchmark: Invariant Verifier

JoeQ Framework CS243, Winter 20156

CSC 4181 Handout : JVM

CS263: Runtime Systems Lecture: High-level language virtual machines. Part 1 of 2. Chandra Krintz UCSB Computer Science Department

Implement detection of duplicate test coverage for Java applications

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2

Full file at

CSc 620 Debugging, Profiling, Tracing, and Visualizing Programs. Getting started... Getting started. 2 : Java Bytecode BCEL

Recap: Printing Trees into Bytecodes

CSE 431S Code Generation. Washington University Spring 2013

CSc 620 Debugging, Profiling, Tracing, and Visualizing Programs

Parallelism of Java Bytecode Programs and a Java ILP Processor Architecture

CSc 453 Interpreters & Interpretation

Parsing Scheme (+ (* 2 3) 1) * 1

Python Implementation Strategies. Jeremy Hylton Python / Google

Notes of the course - Advanced Programming. Barbara Russo

YETI. GraduallY Extensible Trace Interpreter VEE Mathew Zaleski, Angela Demke Brown (University of Toronto) Kevin Stoodley (IBM Toronto)

Static Analysis of Dynamic Languages. Jennifer Strater

JSR 292 backport. (Rémi Forax) University Paris East

Experiences Implementing Efficient Java Thread Serialization, Mobility and Persistence

Joeq Analysis Framework. CS 243, Winter

Why GC is eating all my CPU? Aprof - Java Memory Allocation Profiler Roman Elizarov, Devexperts Joker Conference, St.

Today. Instance Method Dispatch. Instance Method Dispatch. Instance Method Dispatch 11/29/11. today. last time

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Tutorial 3: Code Generation

Chapter 5. A Closer Look at Instruction Set Architectures

Topics. Structured Computer Organization. Assembly language. IJVM instruction set. Mic-1 simulator programming

The Java Virtual Machine. CSc 553. Principles of Compilation. 3 : The Java VM. Department of Computer Science University of Arizona

COMP 520 Fall 2009 Virtual machines (1) Virtual machines

Under the Hood: The Java Virtual Machine. Lecture 23 CS2110 Fall 2008

Shared Mutable State SWEN-220

Trace Compilation. Christian Wimmer September 2009

The Java Language Implementation

Java Internals. Frank Yellin Tim Lindholm JavaSoft

Space Exploration EECS /25

CSc 372 Comparative Programming Languages. Getting started... Getting started. B: Java Bytecode BCEL

Java TM Introduction. Renaud Florquin Isabelle Leclercq. FloConsult SPRL.

Chapter 5. A Closer Look at Instruction Set Architectures

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Problem: Too Many Platforms!

301AA - Advanced Programming [AP-2017]

Portable Resource Control in Java The J-SEAL2 Approach

Living in the Matrix with Bytecode Manipulation

Portable Resource Control in Java: Application to Mobile Agent Security

Delft-Java Dynamic Translation

Hardware-Supported Pointer Detection for common Garbage Collections

ByCounter: Portable Runtime Counting of Bytecode Instructions and Method Invocations

Mnemonics Type-Safe Bytecode Generation in Scala

Numerical static analysis with Soot

Optimizing Your Android Applications

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace

Before you start with this tutorial, you need to know basic Java programming.

On the Design of the Local Variable Cache in a Hardware Translation-Based Java Virtual Machine

CSC 2400: Computer Systems. Using the Stack for Function Calls

Genetic Programming in the Wild:

EXAMINATIONS 2008 MID-YEAR COMP431 COMPILERS. Instructions: Read each question carefully before attempting it.

Transcription:

Java Instrumentation for Dynamic Analysis and Michael Ernst MIT CSAIL Page 1

Java Instrumentation Approaches Instrument source files Java Debug Interface (JDI) Instrument class files Page 2

Advantages Instrument source files Instrumentation code is easy to create and understand Easy to debug Disadvantages Source is complex to parse (inner classes, generics) Source format changes often (e.g, 1.1-1.5) Complex to use (user must manage multiple copies of source and class files) Page 3

Advantages Easy to implement Doesn t change target Easy to use Disadvantages Very slow. Interface has changed often Java Debug Interface Page 4

Advantages Easy to use Class file format is stable Good editing tools exist Efficient Disadvantages Instrument class files Building instrumentation in bytecodes is cumbersome Page 5

Java Virtual Machine (JVM) Instruction Set Opcodes implement a straightforward stack machine Code is not optimized (1-1 relationship with source) Each method has a stack frame. Each parameter/local has a slot in the stack frame Example instructions load/store put/getfield invoke etc. Page 6

Constant pool Class File format Symbolic information is stored in the constant pool Instructions reference items in the pool Each Method in the class Code Local Variable Table (optional) Line Number table (optional) Inner classes are in separate files Page 7

Static translation Instrumentation Mechanisms Class files on the disk are translated Doesn t work with class file that are not available Somewhat more difficult to use Class loader New class loader delegates to the original class loader to get the original class file Target program class loaders require special support Pre-agent Instrumentation code is passed each class as it is loaded New classes cannot be created Page 8

Tools ASM (http://asm.objectweb.org/) Small and fast Supports Java 1.5 BCEL (http://jakarta.apache.org/bcel/) Works well No longer under active development SERP (http://serp.sourceforge.net/) Not designed for speed Somewhat memory intensive Soot (http://www.sable.mcgill.ca/soot/) Supports source Designed for optimizations Page 9

Source Example source and instructions public static int sum (int x, int y) { return (x + y); } Bytecodes iload 0 // load first argument iload 1 // load second argument iadd // Add operands on top of stack iret // return the sum Page 10

Example entry trace (source) public static int sum (int x, int y) { trace ("sum", new Object[] {x, y}); return (x + y); } Page 11

Example entry trace (bytecodes) trace ("sum", new Object[] {x, y}); Bytecodes ldc "sum" // push name of method iconst_2 // push number of args anewarray [java.lang.obj] // create an array for the args dup // du... Page 12

Source Example exit trace int result = x+y; trace ("sum", result); return (result); Bytecodes... ldc "sum" // push name of method istore 2 // store result in new local new [java.lang.integer] // Create wrapper dup // dup w... Page 13

Verification JVM checks each class as it is loaded Checks include Classfile format must be valid Jumps must have valid targets References to constant pool must be valid Each instruction requires appropriate types on the stack and in the local variable array Only initialized values can be accessed Stack must not underflow or overflow JVM verifier doesn t detail the exact location of errors BCEL verifier gives more information, but complains about valid code Page 14

JDK Issues Instrumenting the JDK is often required Heap Analysis Profiling Comparability JDK instrumentation problems The instrumentation itself should not use instrumented classes If a heap analysis used an instrumented version of HashMap it would recursivly call itself Some classes (200) are preloaded and can t be instrumented dynamically. Avoid when possible Page 15

New rt.jar file (static) JDK Approaches Can t be changed per target program Calls or classes must be doubled to avoid instrumenting the instrumentation Twin Class Hierarchy Duplicates all JDK classes Difficult to implement doesn t handle native calls well. Page 16

Instrumenting rt.jar Some classes cannot be modified Object cannot be changed at all Fields in String, Class, and others cannot be modified Methods can be added Difficult to debug JVM exits with machine level dump Slow turnaround Page 17

Algorithm Doubling Methods Approach Create a new copy of each method Add a distinguishing parameter to the copy Instrument the copy Change user code to reference the instrumented copies Instrumentation uses original code Handle fields by storing information externally Native calls work well Parameter can cause problems with reflection Class initializers cannot use this mechanism Page 18

Twin Class Hierarchy Factor, Schuster, and Shagin -- OOPSLA 2004 Parallel isomorphic instrumented class hierarchy Object TCH.Object TCH.String User code is modified to reference the new classes Many complex issues reflection arrays serialization root class (Object) Native calls require hand written solutions Page 19

Hints Do as little as possible as direct instrumentation. Call a java method instead. Reflection calls are surprisingly fast Pay attention to the different sizes of types Locals are easy to create and use. They help with stack level instrumentation javap dumps class file contents Dump original classes and new versions when instrumenting Page 20

Conclusion Java instrumentation tools are powerful JDK problems can be resolved Java instrumentation is surprisingly easy to do Page 21