The Java Virtual Machine

Similar documents
COMP 520 Fall 2009 Virtual machines (1) Virtual machines

Virtual Machines. COMP 520: Compiler Design (4 credits) Alexander Krolik MWF 9:30-10:30, TR

Compilation 2012 Code Generation

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace

JVM. What This Topic is About. Course Overview. Recap: Interpretive Compilers. Abstract Machines. Abstract Machines. Class Files and Class File Format

JVML Instruction Set. How to get more than 256 local variables! Method Calls. Example. Method Calls

Course Overview. PART I: overview material. PART II: inside a compiler. PART III: conclusion

CSC 4181 Handout : JVM

Tutorial 3: Code Generation

CS2110 Fall 2011 Lecture 25. Under the Hood: The Java Virtual Machine, Part II

Compiler construction 2009

The Java Virtual Machine. CSc 553. Principles of Compilation. 3 : The Java VM. Department of Computer Science University of Arizona

Part VII : Code Generation

javac 29: pop 30: iconst_0 31: istore_3 32: jsr [label_51]

Static Program Analysis

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Run-time Program Management. Hwansoo Han

An Introduction to Multicodes. Ben Stephenson Department of Computer Science University of Western Ontario

Recap: Printing Trees into Bytecodes

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

CMPSC 497: Java Security

SOFTWARE ARCHITECTURE 7. JAVA VIRTUAL MACHINE

COMP3131/9102: Programming Languages and Compilers

Improving Java Performance

Programming Language Systems

Java Class Loading and Bytecode Verification

02 B The Java Virtual Machine

Compiling Techniques

COMP 520 Fall 2013 Code generation (1) Code generation

Java byte code verification

CSCE 314 Programming Languages

A Quantitative Analysis of Java Bytecode Sequences

Under the Hood: The Java Virtual Machine. Lecture 23 CS2110 Fall 2008

Improving Java Code Performance. Make your Java/Dalvik VM happier

Static Analysis of Dynamic Languages. Jennifer Strater

JAM 16: The Instruction Set & Sample Programs

The Java Virtual Machine

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Let s make some Marc R. Hoffmann Eclipse Summit Europe

Program Dynamic Analysis. Overview

3/15/18. Overview. Program Dynamic Analysis. What is dynamic analysis? [3] Why dynamic analysis? Why dynamic analysis? [3]

High-Level Language VMs

Code Generation Introduction

Exercise 7 Bytecode Verification self-study exercise sheet

Over-view. CSc Java programs. Java programs. Logging on, and logging o. Slides by Michael Weeks Copyright Unix basics. javac.

Problem: Too Many Platforms!

CS263: Runtime Systems Lecture: High-level language virtual machines. Part 1 of 2. Chandra Krintz UCSB Computer Science Department

Compiler construction 2009

Under the Hood: The Java Virtual Machine. Problem: Too Many Platforms! Compiling for Different Platforms. Compiling for Different Platforms

CSc 453 Interpreters & Interpretation

Today. Instance Method Dispatch. Instance Method Dispatch. Instance Method Dispatch 11/29/11. today. last time

Java: framework overview and in-the-small features

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2

COMP3131/9102: Programming Languages and Compilers

Translating JVM Code to MIPS Code 1 / 43

CSc 620 Debugging, Profiling, Tracing, and Visualizing Programs. Compiling Java. The Java Class File Format 1 : JVM

Java Instrumentation for Dynamic Analysis

CS577 Modern Language Processors. Spring 2018 Lecture Interpreters

Java and C II. CSE 351 Spring Instructor: Ruth Anderson

Compiling Faster, Compiling Better with Falcon. Iván

EXAMINATIONS 2014 TRIMESTER 1 SWEN 430. Compiler Engineering. This examination will be marked out of 180 marks.

Plan for Today. Safe Programming Languages. What is a secure programming language?

CMSC430 Spring 2014 Midterm 2 Solutions

Shared Mutable State SWEN-220

Space Exploration EECS /25

Characterizing the SPEC JVM98 Benchmarks On The Java Virtual Machine

Compiling for Different Platforms. Problem: Too Many Platforms! Dream: Platform Independence. Java Platform 5/3/2011

Mnemonics Type-Safe Bytecode Generation in Scala

Java Security. Compiler. Compiler. Hardware. Interpreter. The virtual machine principle: Abstract Machine Code. Source Code

Compilation 2012 The What and Why of Compilers

Java TM. Multi-Dispatch in the. Virtual Machine: Design and Implementation. Computing Science University of Saskatchewan

The Java Language Implementation

Intermediate Representations

Code Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Jaos - Java on Aos. Oberon Event 03 Patrik Reali

Oak Intermediate Bytecodes

Code Generation COMP 520: Compiler Design (4 credits) Professor Laurie Hendren

Software Watermarking Through Register Allocation: Implementation, Analysis, and Attacks

Delft-Java Dynamic Translation

Code Profiling. CSE260, Computer Science B: Honors Stony Brook University

301AA - Advanced Programming [AP-2017]

Scala: Byte-code Fancypants. David Pollak JVM Language Summit 2009

EXAMINATIONS 2008 MID-YEAR COMP431 COMPILERS. Instructions: Read each question carefully before attempting it.

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

Topics. Structured Computer Organization. Assembly language. IJVM instruction set. Mic-1 simulator programming

to perform dependence analysis on Java bytecode, we must extend existing dependence analysis techniques for adapting Java bytecode. In this paper we p

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Design and Implementation of Pep, a Java Just-In-Time Translator

Chapter 5. A Closer Look at Instruction Set Architectures

Taming the Java Virtual Machine. Li Haoyi, Chicago Scala Meetup, 19 Apr 2017

Java Card Platform. Virtual Machine Specification, Classic Edition. Version 3.1. January 2019

CSE 431S Final Review. Washington University Spring 2013

Introduction to Compiler Construction in a Java World

JSR 292 backport. (Rémi Forax) University Paris East

Context Threading: A flexible and efficient dispatch technique for virtual machine interpreters

Just In Time Compilation

CSE 431S Code Generation. Washington University Spring 2013

Portable Resource Control in Java The J-SEAL2 Approach

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 16

Joeq Analysis Framework. CS 243, Winter

Transcription:

Virtual Machines in Compilation Abstract Syntax Tree Compilation 2007 The compile Virtual Machine Code interpret compile Native Binary Code Michael I. Schwartzbach BRICS, University of Aarhus 2 Virtual Machines in Compilation Compiling Virtual Machine Code Abstract Syntax Tree compile Virtual Machine Code interpret compile Virtual Machine Code interpret compile Virtual Machine Code Example: gcc translates into RTL, optimizes RTL, and then compiles RTL into native code Advantages: exposes many details of the underlying architecture facilitates code generators for many targets Disadvantage: a code generator must be built for each target interpret compile Native Binary Code 3 4 1

Interpreting Virtual Machine Code Designing A Virtual Machine Examples: P-code for Pascal interpreters Postscript code for display devices Java bytecode for the Advantages: easy to generate code the code is architecture independent bytecode can be more compact Disadvantage: poor performance (naively 5-100 times slower) The instruction set may be more or less high-level A balance must be found between: the work of the compiler the work of the interpreter In the extreme case, there is only one instruction: compiler guy: execute "program " interpreter guy: print "result" The resulting sweet spot involves: doing as much as possible at compile time exposing the program structure to the interpreter minimizing the size of the generated code 5 6 The Java Stack Components of the JVM: stack (per thread) heap constant pool code segment program counter (per thread) The stack consists of frames: local stack locals arguments this lsp sp The number of local slots and the size of the local stack are fixed at compile-time (we ignore multiple threads in this presentation) 7 8 2

The Java Heap The heap consists of objects: The Java Constant Pool The constant pool consists of all constant data: numbers strings symbolic names of classes, interfaces, and fields fields runtime type 9 10 The Java Code Segment Java Bytecodes The code segment consists of bytecodes of variable sizes: pc A bytecode instruction consists of: a one-byte opcode a variable number of arguments (offsets or pointers to the constant pool) It consumes and produces some stack elements Constants, locals, and stack elements are typed: addresses (a) primitive types (i,c,b,s,f,d,l) 11 12 3

Class Files Class Loading Java compilers generate class files: magic number (0xCAFEBABE) minor version/major version constant pool access flags this class super class interfaces fields methods attributes (extra hints for the JVM) Classes are loaded lazily when first accessed Class name must match file name Super classes are loaded first (transitively) The bytecode is verified Static fields are allocated and given default values Static initializers are executed 13 14 From Methods to Bytecode A Sketch of an Interpreter A simple Java method: public int Abs(int i) { if (i < 0) return(i * -1); else return(i); }.method public Abs(I)I // int argument, int result.limit stack 2 // stack with 2 locations.limit locals 2 // space for 2 locals // --locals-- --stack--- iload_1 // [ x -3 ] [ -3 * ] ifge Label1 // [ x -3 ] [ * * ] iload_1 // [ x -3 ] [ -3 * ] iconst_m1 // [ x -3 ] [ -3-1 ] imul // [ x -3 ] [ 3 * ] ireturn // [ x -3 ] [ * * ] Label1: iload_1 ireturn.end method Comments show trace of: x.abs(-3) pc = code.start; while(true) { npc = pc + instruction_length(code[pc]); switch (opcode(code[pc])) { case ILOAD_1: push(locals[1]); break; case ILOAD: push(locals[code[pc+1]]); break; case ISTORE: t = pop(); locals[code[pc+1]] = t; break; case IADD: t1 = pop(); t2 = pop(); push(t1 + t2); break; case IFEQ: t = pop(); if (t==0) npc = code[pc+1]; break;... } pc = npc; } 15 16 4

Instructions Arithmetic Operations The JVM has 256 instructions for: arithmetic operations branch operations constant loading operations locals operations stack operations class operations method operations See the JVM specification for the full list ineg [...:i] [...:-i] i2c [...:i] [...:i%65536] iadd [...:i:j] [...:i+j] isub [...:i:j] [...:i-j] imul [...:i:j] [...:i*j] idiv [...:i:j] [...:i/j] irem [...:i:j] [...:i%j] iinc k i [...] [...] locals[k]=locals[k]+i 17 18 Branch Operations Branch Operations goto L [...] [...] branch always if_icmpeq L [...:i:j] [...] branch if i==j ifeq L [...:i] [...] branch if i==0 ifne L [...:i] [...] branch if i!=0 inull L [...:a] [...] branch if a==null ifnonnull L [...:a] [...] branch if a!=null if_icmpne L if_icmplt L if_icmple L if_acmpeq L [...:a:b] [...] branch if a==b if_acpmne L if_icmpgt L if_icmpge L 19 20 5

Constant Loading Operations Locals Operations iconst_0 [...] [...:0] iload k [...] [...:locals[k]] iconst_1 [...] [...:1]... istore k [...:i] [...] locals[k]=i iconst_5 [...] [...:5] aconst_null [...] [...:null] ldc i [...] [...:i] aload k [...] [...:locals[k]] astore k [...:a] [...] locals[k]=a ldc s [...] [...:String(s)] 21 22 Field Operations Stack Operations getfield f sig [...:a] [...:a.f] dup [...:v] [...:v:v] putfield f sig [...:a:v] [...] a.f=v getstatic f sig [...] [...:C.f] putstatic f sig [...:v] [...] C.f=v pop [...:v] [...] swap [...v:w] [...:w:v] nop [...] [...] dup_x1 [...:v:w] [...:w:v:w] dup_x2 [...:u:v:w] [...:w:u:v:w] 23 24 6

Class Operations Method Operations new C [...] [...:a] instance_of C [...:a] [...:i] if (a==null) i==0 else i==(type(a)<=c) checkcast C [...:a] [...:a] if (a!=null) &&!type(a)<=c) throw ClassCastException invokevirtual name sig [...:a:v 1 :...:v n ] [...(:v)] m=lookup(type(a),name,sig) push frame of size m.locals+m.stacksize locals[0]=a locals[1]=v 1... locals[n]=v n pc=m.code invokestatic invokespecial invokeinterface 25 26 Method Operations A Java Method ireturn [...:i] [...] return i and pop stack frame areturn [...:a] [...] return a and pop stack frame return [...] [...] pop stack frame public boolean member(object item) { if (first.equals(item)) return true; else if (rest == null) return false; else return rest.member(item); } 27 28 7

Generated Bytecode Bytecode Verification.method public member(ljava/lang/object;)z.limit locals 2 // locals[0] = x // locals[1] = item.limit stack 2 // initial stack [ * * ] aload_0 // [ x * ] getfield Cons/first Ljava/lang/Object; // [ x.first *] aload_1 // [ x.first item] invokevirtual java/lang/object/equals(ljava/lang/object;)z // [bool *] ifeq else_1 // [ * * ] iconst_1 // [ 1 * ] ireturn // [ * * ] else_1: aload_0 // [ x * ] getfield Cons/rest LCons; // [ x.rest * ] aconst_null // [ x.rest null] if_acmpne else_2 // [ * * ] iconst_0 // [ 0 * ] ireturn // [ * * ] else_2: aload_0 // [ x * ] getfield Cons/rest LCons; // [ x.rest * ] aload_1 // [ x.rest item ] invokevirtual Cons/member(Ljava/lang/Object;)Z // [ bool * ] ireturn // [ * * ].end method Bytecode cannot be trusted to be well-behaved Before execution, it must be verified Verification is performed: at class loading time at runtime A Java compiler must generate verifiable code 29 30 Verification: Syntax Verification: Constants and Headers The first 4 bytes of a class file must contain the magic number 0xCAFEBABE The bytecodes must be syntactically correct Final classes are not subclassed Final methods are not overridden Every class except Object has a superclass All constants are legal Field and method references have valid signatures 31 32 8

Verification: Instructions Verification: Dataflow Analysis Branch targets are within the code segment Only legal offsets are referenced Constants have appropriate types All instructions are complete Execution cannot fall of the end of the code At each program point, the stack always has the same size and types of objects No uninitialized locals are referenced Methods are invoked with appropriate arguments Fields are assigned appropriate values All instructions have appropriate types of arguments on the stack and in the locals 33 34 Verification: Gotcha Verification: Gotcha Again.method public static main([ljava/lang/string;)v.throws java/lang/exception.limit stack 2.limit locals 1 ldc -21248564 invokevirtual java/io/inputstream/read()i return.method public static main([ljava/lang/string;)v.throws java/lang/exception.limit stack 2.limit locals 2 iload_1 return java Fake java Fake Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V) Expecting to find object/array on stack Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V) Accessing value from uninitialized register 1 35 36 9

Verification: Gotcha Once More Verification: Gonna Getcha Every Time ifeq A ldc 42 goto B A: ldc "fortytwo" B: A: iconst_5 goto A java Fake java Fake Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V Mismatched stack types Exception in thread "main" java.lang.verifyerror: (class: Fake, method: main signature: ([Ljava/lang/String;)V Inconsistent stack height 1!= 0 37 38 JVM Implementation Just-In In-Time Compilation A naive bytecode interpreter is slow Exemplified by SUN s HotSpot JVM State-of-the-art JVM implementations are not: Bytecode fragments are compiled at runtime targeted at the native platform based on runtime profiling customization is possible Offers more opportunities than a static compiler http://kano.net/javabench 39 40 10

Other Java Bytecode Tools assembler (jasmin) deassembler (javap) decompiler (cavaj, mocha, jad) obfuscator (dozens of these...) analyzer (soot) 41 11