SSA Based Mobile Code: Construction and Empirical Evaluation

Similar documents
Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

CSc 453 Interpreters & Interpretation

Today. Instance Method Dispatch. Instance Method Dispatch. Instance Method Dispatch 11/29/11. today. last time

Field Analysis. Last time Exploit encapsulation to improve memory system performance

Compiling Techniques

Accelerating Ruby with LLVM

Untyped Memory in the Java Virtual Machine

COS 320. Compiling Techniques

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

Introduction to Programming Using Java (98-388)

Pierce Ch. 3, 8, 11, 15. Type Systems

Improving Mobile Program Performance Through the Use of a Hybrid Intermediate Representation

Outline. Java Models for variables Types and type checking, type safety Interpretation vs. compilation. Reasoning about code. CSCI 2600 Spring

Intermediate Code Generation

Short Notes of CS201

Announcements. My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM.

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Implementing Higher-Level Languages. Quick tour of programming language implementation techniques. From the Java level to the C level.

CS201 - Introduction to Programming Glossary By

Intermediate Code Generation (ICG)

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Intermediate representation

Type Checking and Type Inference

Soot A Java Bytecode Optimization Framework. Sable Research Group School of Computer Science McGill University

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

2 rd class Department of Programming. OOP with Java Programming

Integrated Java Bytecode Verification

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2

Improving Mobile Program Performance Through the Use of a Hybrid Intermediate Representation

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

Where We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization.

Semantic actions for expressions

Programming Languages Lecture 15: Recursive Types & Subtyping

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Programming refresher and intro to C programming

CSCE 314 Programming Languages. Type System

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Operators and Expressions

CSE 431S Type Checking. Washington University Spring 2013

The role of semantic analysis in a compiler

CS2110 Fall 2011 Lecture 25. Under the Hood: The Java Virtual Machine, Part II

C#: framework overview and in-the-small features

Principles of Compiler Design

Targeting LLVM IR. LLVM IR, code emission, assignment 4

Introduce C# as Object Oriented programming language. Explain, tokens,

Java Primer 1: Types, Classes and Operators

The structure of a compiler

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

B.V. Patel Institute of BMC & IT, UTU 2014

Lecture 8: Pointer Arithmetic (review) Endianness Functions and pointers

Translating JVM Code to MIPS Code 1 / 43

Java Language Basics: Introduction To Java, Basic Features, Java Virtual Machine Concepts, Primitive Data Type And Variables, Java Operators,

Intermediate Code Generation

The Java Programming Language

Chapter 1 GETTING STARTED. SYS-ED/ Computer Education Techniques, Inc.

Complex, concurrent software. Precision (no false positives) Find real bugs in real executions

Page 1. Stuff. Last Time. Today. Safety-Critical Systems MISRA-C. Terminology. Interrupts Inline assembly Intrinsics

Principles of Programming Languages

Semantic Analysis and Type Checking

Intermediate Representations

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

A program execution is memory safe so long as memory access errors never occur:

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

A Short Summary of Javali

MISRA-C. Subset of the C language for critical systems

Computer Components. Software{ User Programs. Operating System. Hardware

PROGRAMMING FUNDAMENTALS

CSE 307: Principles of Programming Languages

September 10,

UNIVERSITY OF CALIFORNIA Department of Electrical Engineering and Computer Sciences Computer Science Division. P. N. Hilfinger

Static Type Checking. Static Type Checking. The Type Checker. Type Annotations. Types Describe Possible Values

Topics Covered Thus Far CMSC 330: Organization of Programming Languages

Just-In-Time Compilation

Intermediate Code Generation

CS4215 Programming Language Implementation

Yale University Department of Computer Science

Static Checking and Type Systems

Writing Evaluators MIF08. Laure Gonnord

CS 406/534 Compiler Construction Putting It All Together

Intermediate Representation (IR)

The Java Language Implementation

Three-Address Code IR

8. Static Single Assignment Form. Marcus Denker

Important From Last Time

Combining Analyses, Combining Optimizations - Summary

Declaring Pointers. Declaration of pointers <type> *variable <type> *variable = initial-value Examples:

CSE Computer Security

CSE 401/M501 Compilers

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

Run-Time Data Structures

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Intermediate Representations

MethodHandlesArrayElementGetterBench.testCreate Analysis. Copyright 2016, Oracle and/or its affiliates. All rights reserved.

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

6. Intermediate Representation!

Transcription:

SSA Based Mobile Code: Construction and Empirical Evaluation Wolfram Amme Friedrich Schiller University Jena, Germany Michael Franz Universit of California, Irvine, USA Jeffery von Ronne Universtity of Texas, San Antonio, USA

General System for the Transport of Mobile Code Source Program Syntax/ Semantic Analyzer AST Interpreter IR Machine Code Optimizing Code Generator Optimization/ Annotation Verifier AST Transformer IR Producer Encoder File 1...... File n IR Decoder Consumer

Mobile Code Security most approaches are based on some type safe programming language program safety is usually defined as type safety objective: type safety, i.e. no invalid pointer accesses illegal field accesses operator application with illegal parameters calling routines imported from elsewhere with illegal parameters

De facto Standard: Java's Bytecode Java's bytecode format is the de facto standard for transporting mobile code however, it is far away from being an ideal mobile code representation stack model of the JVM leads to a time consuming verification phase on the consumer side limitation of accessing the top elements of the stack prevents the reuse of operands and code reordering optimizing JIT compilers often transform Java bytecode internally into code for a register machine many bytecode operations include sub operations (nullchecks, bounds checks)

SafeTSA: Facts transportation format consisting of a symbol table an abstract syntax tree SSA style instructions within basic blocks SSA based instruction format within basic blocks is reference safe and type safe with less verification effort than Java bytecode allows to move CSE from code consumer to code producer can transport results of null and bounds check elimination in a tamper proof manner can directly used for JIT compilation

SafeTSA Construction

Program in SSA Form Example:... i = i + 1; j = j + 1; if (i<= j) i = i + 1; else i = i 1; j = j + i; (6): add (i) #1 (7): add (j) #1 (8): cmp (6) (7) (9): ble (8) (10): add (6) #1 (11): sub (6) #1 (12): phi (10) (11) (13): add (7) (12)

Program in SSA Form Example:... i = i + 1; j = j + 1; if (i<= j) i = i + 1; else i = i 1; j = j + i; (6): add (i) #1 (7): add (j) #1 (8): cmp (6) (7) (9): ble (8) (10): add (6) #1 (11): sub (6) #1 (12): phi (10) (11) (13): add (8) (12) Type error

Program in SSA Form Example:... i = i + 1; j = j + 1; if (i<= j) i = i + 1; else i = i 1; j = j + i; (6): add (i) #1 (7): add (j) #1 (8): cmp (6) (7) (9): ble (8) (10): add (6) #1 (11): sub (6) #1 (12): phi (10) (11) (13): add (10) (12) Reference error

Extended Machine Model 0 1.. 0 1 2 an infinite register plane for each type types ( register planes ) register (value) numbers Constant pool 0 1 2 types Type table 0 integer 1 float 2 boolean.... imported types.. local types types 0 1 3.14 true 1 14

Type Separation in SafeTSA in SafeTSA all operations are strongly typed for all operations the following holds: a specific operation implicitly selects the register plane(s) from which the arguments are taken an operation merely specifies the register number(s) on that plane, but not the plane(s) involved the result is deposited in the next available register on the plane that corresponds to the result of the operation

Type Separated SSA Example:... i = i + 1; j = j + 1; if (i<= j) i = i + 1; else i = i 1; j = j + i; (int 6): int add (i) const 0 (int 7): int add (j) const 0 (bool 0): int cmp (6) (7) (void): bool ble (0) (int 8): int add (6) const 0 (int 10): int phi (8) (9) (int 11): int add (7) (10) (int 9): int sub (6) const 0

Reference Safety: Construction dominator tree of a program is used for safe access to values in a dominator tree all predecessors of a node, that represents a basic block A, stand for basic blocks which always will be executed before A in reference safe SSA Form an operand access is a pair (steps,value), where steps: number of nodes, that starting with the actual basic block, have to be traversed the dominator tree backwards (until the basic block is found which defines the value) value: a relative instruction number in that basic block

Reference Safe SSA Example:... i = i + 1; j = j + 1; if (i<= j) i = i + 1; else i = i 1; j = j + i; 1 (6): add (i) #1 (7): add (j) #1 (8): cmp (0 6) (0 7) (9): ble (0 8) 2 (0): add (1 6) #1 3 (0): sub (1 6) #1 Dominator tree: 1 2 3 4 4 (0): phi (0 0) (0 0) (1): add (1 7) (0 0)

SafeTSA: Type separated and Reference safe SSA Dominator tree: 1 2 3 4 1 (int 6): int add (i) const 0 (int 7): int add (j) const 0 (bool 0): int cmp (0 6) (0 7) (void): bool ble (0 0) 2 (int 0): int add (1 6) const 0 3 (int 0): int sub (1 6) const 0 4 (int 0): int phi (0 0) (0 0) (int 1): int add (1 7) (0 0) Each basic block is assigned its own set of register planes

Instruction Set

Instruction Set: Operators prim <type> <primfun> <param>* xprim <type> <primfun> <param>* Example: (int 3): prim int add (0 2) (0 1) difference between prim and xprim is whether or not the operation may cause an exception

Instruction Set: Cast Operators xupcast <type> <type> <object> downcast <type> <integer> <object> Example: class B extends A {}; (ref B): xupcast ref A ref B (...) (ref A): downcast ref B 1 (...)

Instruction Set: Other Kind of Instructions Memory Access getfield <type> <object> <symbol> setfield <type> <object> <symbol> <value> getelt <type> <object> <position> setelt <type> <object> <position> <value> Method Call xdispatch <type> <object> <fun> <param> * xcall <type> <object> <fun> <param> * Phi Instruction phi <type> <value> *

Null and Bounds Check Elimination

Construction of Memory Safety safe reference and safe index types for each reference type ref T we introduce a safe reference type safe T guaranteed not to be null for each array object A we have a safe index type safe index A guaranteeing that the array s index value is within range (created when array is allocated) null and range checking then become operations that take values from an unsafe value plane and copy them (to the first available register) of the corresponding safe reference type s plane memory and array accesses take their operands always from the corresponding safe value plane

Example: Memory Access and Nullcheck Elimination class A{ int f; } A obj;... (safe A 1): xupcast ref A safe A (...) obj.f; (int 1): getfield A (0, safe A 1) f...... obj.f; (safe A 2): xupcast ref A safe A (...) (int 2): getfield A (0, safe A 2) f

Example: Memory Access and Nullcheck Elimination class A{ int f; } A obj;... (safe A 1): xupcast ref A safe A (...) obj.f; (int 1): getfield A (0, safe A 1) f...... obj.f; (int 2): getfield A (0, safe A 1) f

Implementation and Evaluation

SafeTSA: Implementation a compiler that transforms Java programs into SafeTSA class files extension of Pizza s Java compiler Optimizations: constant propagation, deadcode elimination, and CSE a JVM which is capable of executing heterogeneous of bytecode and SafeTSA class files extension of IBM s Jikes RVM Optimizations: method inlining, load and store elimination, global code motion, etc.

Results: Runtime Behavior

Results: Compilation Times

Results: Optimizing JIT Compilation