Soot, a Tool for Analyzing and Transforming Java Bytecode

Similar documents
Soot: a framework for analysis and optimization of Java

Introduction to Soot. Automated testing and verification. J.P. Galeotti - Alessandra Gorla. Thursday, November 22, 12

Soot overview/disassembling classfiles

a Not yet implemented in current version SPARK: Research Kit Pointer Analysis Parameters Soot Pointer analysis. Objectives

Soot A Java Bytecode Optimization Framework. Sable Research Group School of Computer Science McGill University

Androsia Securing 'data in process' for your Android Apps

Scaling Java Points-To Analysis using Spark

VISUALIZATION TOOLS FOR OPTIMIZING COMPILERS

Integrating the Soot compiler infrastructure into an IDE

On the Soot menagerie fundamental Soot objects

Shimple: An Investigation of Static Single Assignment Form

InvokeDynamic support in Soot

Compiler Analyses for Improved Return Value Prediction

Scaling Java Points-to Analysis Using Spark

Sta$c Analysis Dataflow Analysis

Using the Soot flow analysis framework

Of Graphs and Coffi Grounds: Decompiling Java 1

Research Interests. Education. Academic Experience. Software engineering, static program analysis, programming language design.

Points-to Analysis using BDDs

Nomair A. Naeem. Personal Data. Education. Teaching Experience. Course Instructor/Sessional

Optimizing AspectJ with abc

Building the abc AspectJ compiler with Polyglot and Soot

Exploring the Suitability of Existing Tools for Constructing Executable Java Slices

arxiv: v1 [cs.pl] 23 Apr 2013

Numerical static analysis with Soot

Intermediate Representations

A Comprehensive Approach to Array Bounds Check Elimination for Java

Compiler Analyses for Improved Return Value Prediction

Creating a class from scratch with Soot

Using Soot to instrument a class file

Annotating Java Bytecode

Instance keys: A technique for sharpening whole-program pointer analyses with intraprocedural information


Implementing Higher-Level Languages. Quick tour of programming language implementation techniques. From the Java level to the C level.

Context Threading: A flexible and efficient dispatch technique for virtual machine interpreters

Compiling Techniques

Context-sensitive points-to analysis: is it worth it?

A Practical MHP Information Analysis for Concurrent Java Programs

Programmer-friendly Decompiled Java

A Practical MHP Information Analysis for Concurrent Java Programs

Building a Whole-Program Type Analysis in Eclipse

Points-to Analysis of RMI-based Java Programs

SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine p. 1

CSE 401/M501 Compilers

CSE P 501 Compilers. Intermediate Representations Hal Perkins Spring UW CSE P 501 Spring 2018 G-1

EDA045F: Program Analysis LECTURE 1: INTRODUCTION. Christoph Reichenbach

A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler

SSA Based Mobile Code: Construction and Empirical Evaluation

02/03/15. Compile, execute, debugging THE ECLIPSE PLATFORM. Blanks'distribu.on' Ques+ons'with'no'answer' 10" 9" 8" No."of"students"vs."no.

INTRODUCTION TO LLVM Bo Wang SA 2016 Fall

Interprocedural Data Flow Analysis in Soot using Value Contexts

opt. front end produce an intermediate representation optimizer transforms the code in IR form into an back end transforms the code in IR form into

Bandera: Extracting Finite-state Models from Java Source Code

Context-sensitive points-to analysis: is it worth it?

Intermediate Representation (IR)

INFORMATION FLOW IN A JAVA INTERMEDIATE LANGUAGE

Pointer Analysis in the Presence of Dynamic Class Loading

Weiss Chapter 1 terminology (parenthesized numbers are page numbers)

COI: A First Step towards TrueType Bytecode Analysis

Dispatch techniques and closure representations

CMSC 430 Introduction to Compilers. Spring Intermediate Representations and Bytecode Formats

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Symbolic Solver for Live Variable Analysis of High Level Design Languages

Java TM Introduction. Renaud Florquin Isabelle Leclercq. FloConsult SPRL.

(just another operator) Ambiguous scalars or aggregates go into memory

CS5363 Final Review. cs5363 1

Intermediate Representations

LLVM: A Compilation Framework. Transformation

Page 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions

An Architectural Overview of GCC

/22. Clang Tutorial. Moonzoo Kim School of Computing KAIST. The original slides were written by Yongbae Park,

Object Histories in Java

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Assumptions. History

Partial evaluation based on three-address form

Intermediate Code Generation

Programmer-friendly Decompiled Java

CSE 401/M501 Compilers

Running class Timing on Java HotSpot VM, 1

Static Analysis of JavaScript. Ben Hardekopf

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

McSAF: A Static Analysis Framework for MATLAB

Pointer Analysis in the Presence of Dynamic Class Loading

JOVE. An Optimizing Compiler for Java. Allen Wirfs-Brock Instantiations Inc.

CSc 453 Interpreters & Interpretation

Lecture Compiler Middle-End

CSE 340 Fall 2014 Project 4

Combining Analyses, Combining Optimizations - Summary

Static Aspect Impact Analysis

Inter-procedural Data-flow Analysis with IFDS/IDE and Soot

Nullable Method Detection

CS553 Lecture Dynamic Optimizations 2

Intermediate representation

The abc Group. Optimising AspectJ. abc Technical Report No. abc BRICS. United Kingdom Denmark Montreal, Canada.

Improving locals stack placement in Java with Soot

About the Authors... iii Introduction... xvii. Chapter 1: System Software... 1

Ripple: Reflection Analysis for Android Apps in Incomplete Information Environments

Compiler Structure. Data Flow Analysis. Control-Flow Graph. Available Expressions. Data Flow Facts

Optimizing Java Bytecode Using the Soot Framework: Is It Feasible?

JoeQ Framework CS243, Winter 20156

Transcription:

Soot, a Tool for Analyzing and Transforming Java Bytecode Laurie Hendren, Patrick Lam, Jennifer Lhoták, Ondřej Lhoták and Feng Qian McGill University Special thanks to John Jorgensen and Navindra Umanee for help in preparing Soot 2.0 and this tutorial. Soot development has been supported, in part, by research grants from NSERC, FCAR and IBM http://www.sable.mcgill.ca/soot/ Soot, a Tool for Analyzing and Transforming Java Bytecode p.1/148

What is Soot? a free compiler infrastructure, written in Java (LGPL) was originally designed to analyze and transform Java bytecode original motivation was to provide a common infrastructure with which researchers could compare analyses (points-to analyses) has been extended to include decompilation and visualization www.sable.mcgill.ca/soot/ Soot, a Tool for Analyzing and Transforming Java Bytecode p.4/148

What is Soot? (2) Soot has many potential applications: used as a stand-alone tool (command line or Eclipse plugin) extended to include new IRs, analyses, transformations and visualizations as the basis of building new special-purpose tools Soot, a Tool for Analyzing and Transforming Java Bytecode p.5/148

Java source SML source Soot Overview Scheme source Eiffel source javac MLJ KAWA SmallEiffel Eclipse class files SOOT Produce Jimple 3 address IR Analyze, Optimize and Tag Generate Bytecode Java source Optimized class files + attributes Interpreter JIT Adaptive Engine Ahead of Time Compiler Soot, a Tool for Analyzing and Transforming Java Bytecode p.8/148

Soot IRs Baf: is a compact rep. of Bytecode (stack-based) Jimple: is Java s simple, typed, 3-addr (stackless) representation Shimple: is a SSA-version of Jimple Grimp: is like Jimple, but with expressions aggregated Dava: structured representation used for Decompiling Java Soot, a Tool for Analyzing and Transforming Java Bytecode p.9/148

Jimple Jimple is: principal Soot Intermediate Representation 3-address code in a control-flow graph a typed intermediate representation stackless Raja s Thesis, CASCON99, CC2000, SAS2000 Soot, a Tool for Analyzing and Transforming Java Bytecode p.19/148

Context: other Jimple Stmts public int foo(java.lang.string) { // locals r0 := @this; // IdentityStmt r1 := @parameter0; if r1!= null goto label0; // IfStmt $i0 = r1.length(); // AssignStmt r1.touppercase(); // InvokeStmt return $i0; // ReturnStmt label0: } return 2; // created by Printer Soot, a Tool for Analyzing and Transforming Java Bytecode p.23/148

Converting bytecode Jimple bytecode These transformations are relatively hard to design so that they produce correct, useful and efficient code. Worth the price, we do want a 3-addr typed IR. raw bytecode each inst has implicit effect on stack no types for local variables typed 3-address code (Jimple) each stmt acts explicitly on named variables types for each local variable 200 kinds of insts only 15 kinds of stmts Soot, a Tool for Analyzing and Transforming Java Bytecode p.24/148

Flow Analysis in Soot Flow analysis is key part of compiler framework Soot has easy-to-use framework for intraprocedural flow analysis Soot itself, and its flow analysis framework, are object-oriented. Soot, a Tool for Analyzing and Transforming Java Bytecode p.43/148

Four Steps to Flow Analysis 1. Forward or backward? Branched or not? 2. Decide what you are approximating. What is the domain s confluence operator? 3. Write equation for each kind of IR statement. 4. State the starting approximation. Soot, a Tool for Analyzing and Transforming Java Bytecode p.44/148

HOWTO: Soot Flow Analysis A checklist of your obligations: 1. Subclass *FlowAnalysis 2. Implement abstraction: merge(), copy() 3. Implement flow function flowthrough() 4. Implement initial values: newinitialflow() and entryinitialflow() 5. Implement constructor (it must call doanalysis()) Soot, a Tool for Analyzing and Transforming Java Bytecode p.45/148

Points-to analysis Default points-to analysis assumes that any pointer can point to any object Spark provides variations of context-insensitive subset-based points-to analysis Work in progress on context-sensitive analyses Soot, a Tool for Analyzing and Transforming Java Bytecode p.112/148