Joeq Analysis Framework. CS 243, Winter

Similar documents
JoeQ Framework CS243, Winter 20156

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2

Shared Mutable State SWEN-220

Compiling Techniques

SOFTWARE ARCHITECTURE 7. JAVA VIRTUAL MACHINE

JVML Instruction Set. How to get more than 256 local variables! Method Calls. Example. Method Calls

02 B The Java Virtual Machine

Translating JVM Code to MIPS Code 1 / 43

Recap: Printing Trees into Bytecodes

Compiler construction 2009

CS243 Homework 2. Winter Due: January 31, 2018

Compiler construction 2009

Under the Hood: The Java Virtual Machine. Lecture 23 CS2110 Fall 2008

Java and C II. CSE 351 Spring Instructor: Ruth Anderson

An Introduction to Multicodes. Ben Stephenson Department of Computer Science University of Western Ontario

Exercise 7 Bytecode Verification self-study exercise sheet

COMP3131/9102: Programming Languages and Compilers

Over-view. CSc Java programs. Java programs. Logging on, and logging o. Slides by Michael Weeks Copyright Unix basics. javac.

Code Generation Introduction

Plan for Today. Safe Programming Languages. What is a secure programming language?

Java byte code verification

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Chapter 5. A Closer Look at Instruction Set Architectures

CSc 453 Interpreters & Interpretation

CSC 4181 Handout : JVM

JVM. What This Topic is About. Course Overview. Recap: Interpretive Compilers. Abstract Machines. Abstract Machines. Class Files and Class File Format

Tutorial 3: Code Generation

Run-time Program Management. Hwansoo Han

Static Program Analysis

CS2110 Fall 2011 Lecture 25. Under the Hood: The Java Virtual Machine, Part II

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

Improving Java Performance

Java: framework overview and in-the-small features

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

The Java Virtual Machine

EXAMINATIONS 2014 TRIMESTER 1 SWEN 430. Compiler Engineering. This examination will be marked out of 180 marks.

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Chapter 5. A Closer Look at Instruction Set Architectures

Java Security. Compiler. Compiler. Hardware. Interpreter. The virtual machine principle: Abstract Machine Code. Source Code

Parsing Scheme (+ (* 2 3) 1) * 1

The Java Virtual Machine. CSc 553. Principles of Compilation. 3 : The Java VM. Department of Computer Science University of Arizona

Course Overview. PART I: overview material. PART II: inside a compiler. PART III: conclusion

Improving Java Code Performance. Make your Java/Dalvik VM happier

Problem: Too Many Platforms!

Under the Hood: The Java Virtual Machine. Problem: Too Many Platforms! Compiling for Different Platforms. Compiling for Different Platforms

CSE 431S Final Review. Washington University Spring 2013

Java Class Loading and Bytecode Verification

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace

Context Threading: A flexible and efficient dispatch technique for virtual machine interpreters

Course Overview. Levels of Programming Languages. Compilers and other translators. Tombstone Diagrams. Syntax Specification

CSE 431S Code Generation. Washington University Spring 2013

CMSC430 Spring 2014 Midterm 2 Solutions

CS263: Runtime Systems Lecture: High-level language virtual machines. Part 1 of 2. Chandra Krintz UCSB Computer Science Department

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

javac 29: pop 30: iconst_0 31: istore_3 32: jsr [label_51]

JAM 16: The Instruction Set & Sample Programs

CS 11 java track: lecture 1

Informatik II (D-ITET)

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 16

Running class Timing on Java HotSpot VM, 1

CS577 Modern Language Processors. Spring 2018 Lecture Interpreters

Informatik II (D-ITET)

Java Instrumentation for Dynamic Analysis

Mnemonics Type-Safe Bytecode Generation in Scala

The Java Virtual Machine

The Java Language Implementation

CSCE 314 Programming Languages

Project Compiler. CS031 TA Help Session November 28, 2011

Plan for Today. Concepts. Next Time. Some slides are from Calvin Lin s grad compiler slides. CS553 Lecture 2 Optimizations and LLVM 1

Compilation 2012 Code Generation

Compiling for Different Platforms. Problem: Too Many Platforms! Dream: Platform Independence. Java Platform 5/3/2011

COMP 520 Fall 2009 Virtual machines (1) Virtual machines

Compiler-guaranteed Safety in Code-copying Virtual Machines

Final Exam. 12 December 2018, 120 minutes, 26 questions, 100 points

Part VII : Code Generation

Just In Time Compilation

Mixed Mode Execution with Context Threading

Soot: a framework for analysis and optimization of Java

Implementing on Object-Oriented Language 2.14

Lecture #31: Code Generation

Introduction to Programming Using Java (98-388)

Principles of Computer Architecture. Chapter 5: Languages and the Machine

CMPSC 497: Java Security

Programming Languages and Techniques (CIS120)

Announcements. Class 7: Intro to SRC Simulator Procedure Calls HLL -> Assembly. Agenda. SRC Procedure Calls. SRC Memory Layout. High Level Program

Code Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Genetic Programming in the Wild:

INDEX. A SIMPLE JAVA PROGRAM Class Declaration The Main Line. The Line Contains Three Keywords The Output Line

Termination Analysis of Java Bytecode

Oak Intermediate Bytecodes

Portable Resource Control in Java The J-SEAL2 Approach

YETI. GraduallY Extensible Trace Interpreter VEE Mathew Zaleski, Angela Demke Brown (University of Toronto) Kevin Stoodley (IBM Toronto)

Homework I - Solution

CSc 372 Comparative Programming Languages. Getting started... Getting started. B: Java Bytecode BCEL

Code Generation. The Main Idea of Today s Lecture. We can emit stack-machine-style code for expressions via recursion. Lecture Outline.

We can emit stack-machine-style code for expressions via recursion

CS321 Languages and Compiler Design I. Winter 2012 Lecture 2

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Operating Systems CMPSCI 377, Lec 2 Intro to C/C++ Prashant Shenoy University of Massachusetts Amherst

CS 536 Spring Programming Assignment 5 CSX Code Generator. Due: Friday, May 6, 2016 Not accepted after Midnight, Tuesday, May 10, 2016

Transcription:

Joeq Analysis Framework CS 243, Winter 2009-2010

Joeq Background Compiler backend for analyzing and optimizing Java bytecode Developed by John Whaley and others Implemented in Java Research project infrastructure: 10+ papers rely on Joeq implementations Also see: http://joeq.sourceforge.net

Java Toolchain Foo.java javac.exe Foo.class java.exe javac.exe is the frontend java.exe is the backend

Joeq Toolchain Foo.java javac.exe Foo.class java.exe Analyze bytecode, not Java source Joeq can also act as a VM Not using this functionality in this class Joeq

Rest of this Lecture Java: bytecode overview Joeq: quads overview quads: instruction set used for Joeq analyses Joeq: translating bytecode -> quads Joeq: analyzing quads Homework 2

Java code Everything begins as Java source code A very rich representation Good for reading and writing Hard to analyze Lots of high-level concepts with no hardware counterparts classes, virtual function calls, exceptions, threads, locks, structured control flow, etc.

Java bytecode (1) javac compiles source into machine-independent bytecode format (.class files) Coarse structure of the program is still maintained Each class is a file Sections for each method and field Each method has a bytecode sequence for its implementation Bytecode instructions are still high level Know about virtual methods, locks, etc.

Java bytecode (2) Each bytecode instruction uses one byte Instructions may have additional operands, stored immediately after the instruction Variables stored in abstract registers r0 is this, r1... are params followed by locals Stack machine model All intermediate values stored on a stack

Stack Machine Model Instructions push/pop values onto a stack x = y + 10: push y; push 10; add; pop x;

Bytecode instructions iload_1 Load first param/local and push it on the stack bipush <n> Push byte constant n onto the stack iadd Add the top two values on the stack, and push the result back onto the stack istore_2 Pop the stack and store its value into the second param/local

Example (1) class ExprTest { int test(int a) { int b, c; b = a + 10; c = a + b; if (c > 10) { c = 10; return c; > javac ExprTest.java > javap c ExprTest class ExprTest extends java.lang.object { ExprTest(); Code: (omitted) int test(int); Code: (omitted)

Example (2) ExprTest2(); Code: 0: aload_0 1: invokespecial #1; 4: return aload_0 Load address this and push it onto the stack invokespecial Invokes base class methods (among other uses) #1 is constructor

Example (3) int test(int); Code: 0: iload_1 1: bipush 10 3: iadd 4: istore_2 5: iload_1 6: iload_2 7: iadd 8: istore_3 9: iload_3 10: bipush 10 12: if_icmple 18 15: bipush 10 17: istore_3 18: iload_3 19: ireturn class ExprTest { int test(int a) { int b, c; b = a + 10; c = a + b; if (c > 10) { c = 10; return c;

Joeq Class Representation Joeq has classes for each component of the Java.class file jq_type: a Java type jq_class: a Java class jq_field, jq_method,... including the bytecodes Instead of analyzing bytecodes directly, we will use quads

Joeq Quads Joeq translates bytecodes to four-address instructions, called quads Quads: one operator, up to four operands OPERATOR op1 op2 op3 op4 joeq.compiler.quad.quad joeq.compiler.quad.operator joeq.compiler.quad.operand There is no stack All temporary data stored in registers

Joeq Operands RegisterOperand Abstract registers represent parameters, local variables, and temporary values ConstantOperand int/float/string/etc. constants TargetOperand Basic block target of a branch instruction MethodOperand, ParamListOperands Target and arguments to a method call FieldOperand, TypeOperand,...

Joeq Operators (1) Operator.Move Move a constant or register into another register Operator.Unary, Operator.Binary Operations on constants/registers, storing result in another register Operator.Invoke Method call with a result register, method operand and parameter list operand Operator.Branch, Operator.GetField, Operator.PutField, Operator.New,...

Joeq Operators (2) Runtime checks are explicit quads Not implicit as in bytecodes Can throw exceptions (as can method calls) Operator.NullCheck Check if an operand is NULL Operator.BoundsCheck Check if an array access is out of bounds Operator.CheckCast, Operator.StoreCheck,...

Joeq CFGs (1) For each method, Joeq generates a control flow graph over basic blocks joeq.compiler.quad.controlflowgraph joeq.compiler.quad.basicblock CFGs are graphs of basic blocks with an entry and exit block Blocks are lists of quads, and know their successors and predecessors in the CFG

Joeq CFGs (2) Exception control flow is not explicit in Joeq basic blocks An exception can jump out of the middle of a basic block You do not need to consider exceptions when writing Joeq analyses for this class

Example (1) class ExprTest { int test(int a) { int b, c; b = a + 10; c = a + b; if (c > 10) { c = 10; return c; > javac ExprTest.java > java PrintQuads ExprTest Class: ExprTest Control flow graph for ExprTest.<init> ()V: (omitted) Control flow graph for ExprTest.test (I)I: (omitted)

Example (2) Control flow graph for ExprTest.<init> ()V: BB0 (ENTRY) (in: <none>, out: BB2) BB2 (in: BB0 (ENTRY), out: BB1 (EXIT)) 2 NULL_CHECK T-1 <g>, R0 ExprTest 1 INVOKESPECIAL_V% java.lang.object.<init>()v, (R0 ExprTest) 3 RETURN_V BB1 (EXIT) (in: BB2, out: <none>) Original bytecode: ExprTest2(); Code: 0: aload_0 1: invokespecial #1; 4: return

Example (3) Control flow graph for ExprTest.test: BB0 (ENTRY) (in: <none>, out: BB2) BB2 (in: BB0 (ENTRY), out: BB3, BB4) 1 ADD_I T2 int, R1 int, IConst: 10 2 MOVE_I R3 int, T2 int 3 ADD_I T2 int, R1 int, R3 int 4 MOVE_I R4 int, T2 int 5 IFCMP_I R4 int, IConst: 10, LE, BB4 BB3 (in: BB2, out: BB4) 6 MOVE_I R4 int, IConst: 10 BB4 (in: BB2, BB3, out: BB1 (EXIT)) 7 RETURN_I R4 int class ExprTest { int test(int a) { int b, c; b = a + 10; c = a + b; if (c > 10) { c = 10; return c; BB1 (EXIT) (in: BB4, out: <none>)

Example (4) Control flow graph for ExprTest.test: BB0 (ENTRY) (in: <none>, out: BB2) BB2 (in: BB0 (ENTRY), out: BB3, BB4) 1 ADD_I T2 int, R1 int, IConst: 10 2 MOVE_I R3 int, T2 int 3 ADD_I T2 int, R1 int, R3 int 4 MOVE_I R4 int, T2 int 5 IFCMP_I R4 int, IConst: 10, LE, BB4 BB3 (in: BB2, out: BB4) 6 MOVE_I R4 int, IConst: 10 BB4 (in: BB2, BB3, out: BB1 (EXIT)) 7 RETURN_I R4 int BB1 (EXIT) (in: BB4, out: <none>) int test(int); Code: 0: iload_1 1: bipush 10 3: iadd 4: istore_2 5: iload_1 6: iload_2 7: iadd 8: istore_3 9: iload_3 10: bipush 10 12: if_icmple 18 15: bipush 10 17: istore_3 18: iload_3 19: ireturn

Writing Analyses with Joeq Most analyses can be written with visitors Traverse all the loaded CFGs, or all the quads in those CFGs ControlFlowGraphVisitor: interface for an analysis which makes a pass over the CFGs QuadIterator: iterate over every quad in a CFG QuadVisitor: interface for an analysis which makes a single pass over the quads in a CFG

joeq.main.helper class Helper class provides a clean interface to the complexities of Joeq packages up reading of.class files, generating type and bytecode data, and translating to quads load(string) takes the name of a class, returns the corresponding jq_class runpass(val, visitor) runs a CFG or quad visitor over val (which is a class, CFG, quad, etc.)

Analysis Example (1) public static class QuadCounter extends QuadVisitor.EmptyVisitor { public int count = 0; public void visitquad(quad q) { count++;

Analysis Example (2) class CountQuads { public static class QuadCounter... public static void main(string[] args) { jq_class[] classes = new jq_class[args.length]; for (int i=0; i < classes.length; i++) classes[i] = (jq_class)helper.load(args[i]); for (int i=0; i < classes.length; i++) { System.out.println("Class: " + classes[i].getname()); QuadCounter qc = new QuadCounter(); Helper.runPass(classes[i], qc); System.out.println(classes[i].getName() + " has " + qc.count + " quads");

Analysis Example (3) class CountQuads { public static class QuadCounter... > javac ExprTest.java public static void main(string[] > java CountQuads args) ExprTest { Class: ExprTest jq_class[] classes = new ExprTest jq_class[args.length]; has 10 quads for (int i=0; i < classes.length; i++) classes[i] = (jq_class)helper.load(args[i]); for (int i=0; i < classes.length; i++) { System.out.println("Class: " + classes[i].getname()); QuadCounter qc = new QuadCounter(); Helper.runPass(classes[i], qc); System.out.println(classes[i].getName() + " has " + qc.count + " quads");

Homework 2 Implement a data flow framework with Joeq We provide the interfaces for your framework We provide two dataflow analyses that must work with your framework Write a Reaching Definitions analysis which works using your framework Due next Friday (1/29/2010)