COMP 520 Fall 2009 Virtual machines (1) Virtual machines


Compilation and execution modes of virtual machines:
[Figure: diagram relating abstract syntax trees, interpreter, AOT-compile, virtual machine code, interpret, JIT-compile, and native binary code.]

Compilers traditionally compiled to machine code ahead-of-time (AOT).
Example:
  gcc translates into RTL (Register Transfer Language), optimizes RTL, and then compiles RTL into native code.
Advantages:
  can exploit many details of the underlying architecture; and
  intermediate languages like RTL facilitate production of code generators for many target architectures.
Disadvantage:
  a code generator must be built for each target architecture.

Interpreting virtual machine code.
Examples:
  P-code for early Pascal interpreters;
  Postscript for display devices; and
  Java bytecode for the Java Virtual Machine.
Advantages:
  easy to generate the code;
  the code is architecture independent; and
  bytecode can be more compact.
Disadvantage:
  poor performance due to interpretative overhead (typically 5-20 times slower).
Reasons:
  every instruction is considered in isolation;
  interpretation confuses branch prediction;
  ... and many more.

VirtualRISC is a simple RISC machine with:
  memory;
  registers;
  condition codes; and
  execution unit.
In this model we ignore:
  caches;
  pipelines;
  branch prediction units; and
  advanced features.

VirtualRISC memory:
  a stack (used for function call frames);
  a heap (used for dynamically allocated memory);
  a global pool (used to store global variables); and
  a code segment (used to store VirtualRISC instructions).

VirtualRISC registers:
  unbounded number of general purpose registers;
  the stack pointer (sp) which points to the top of the stack;
  the frame pointer (fp) which points to the current stack frame; and
  the program counter (pc) which points to the current instruction.

VirtualRISC condition codes:
  store the result of the last instruction that can set condition codes (used for branching).
VirtualRISC execution unit:
  reads the VirtualRISC instruction at the current pc, decodes the instruction and executes it;
  this may change the state of the machine (memory, registers, condition codes);
  the pc is automatically incremented after executing an instruction; but
  function calls and branches explicitly change the pc.

Memory/register instructions:
  st Ri,[Rj]        [Rj] := Ri
  st Ri,[Rj+C]      [Rj+C] := Ri
  ld [Ri],Rj        Rj := [Ri]
  ld [Ri+C],Rj      Rj := [Ri+C]
Register/register instructions:
  mov Ri,Rj         Rj := Ri
  add Ri,Rj,Rk      Rk := Ri + Rj
  sub Ri,Rj,Rk      Rk := Ri - Rj
  mul Ri,Rj,Rk      Rk := Ri * Rj
  div Ri,Rj,Rk      Rk := Ri / Rj
  ...
Constants may be used in place of register values: mov 5,R1.

Instructions that set the condition codes:
  cmp Ri,Rj
Instructions to branch:
  b L
  bg L
  bge L
  bl L
  ble L
  bne L
To express:
  if R1 <= 9 goto L1
we code:
  cmp R1,9
  ble L1

Other instructions:
  save sp,-C,sp     save registers, allocating C bytes on the stack
  call L            R15 := pc; pc := L
  restore           restore registers
  ret               pc := R15+8
  nop               do nothing

[Figure: stack frame layout. Labels, in order from the previous frame down to sp: Previous Frame; fp (old sp); local variables and space from alloca(), [fp-offset]; scratch space for register spills, temps; Current Frame; outgoing params; space to store register params from callee, [sp+offset]; space to save register window; sp.]

Stack frames:
  store function activations;
  sp and fp point to stack frames;
  when a function is called, a new stack frame is created:
    push fp; fp := sp; sp := sp + C;
  when a function returns, the top stack frame is popped:
    sp := fp; fp := pop;
  local variables are stored relative to fp;
  the figure shows additional features of the SPARC architecture.
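The call and return discipline above can be sketched in Java. This is only an illustration under stated assumptions: the class name FrameStack, the 64-word memory, and the upward-growing stack (following the slide's sp := sp + C) are choices made for the sketch, not part of VirtualRISC.

```java
// Illustrative model of the frame discipline: push fp; fp := sp; sp := sp + C.
// Names and sizes are hypothetical; the stack grows upward as in the slide's formulas.
final class FrameStack {
    private final int[] mem = new int[64]; // word-addressed stack memory (arbitrary size)
    int sp = 0; // index of the next free slot
    int fp = 0; // base of the current frame

    // Function call: create a frame with room for c local words.
    void enter(int c) {
        mem[sp++] = fp; // push fp
        fp = sp;        // fp := sp
        sp += c;        // sp := sp + C
    }

    // Function return: pop the top frame.
    void leave() {
        sp = fp;        // sp := fp
        fp = mem[--sp]; // fp := pop
    }

    // Locals are addressed relative to fp.
    void setLocal(int offset, int value) { mem[fp + offset] = value; }
    int getLocal(int offset) { return mem[fp + offset]; }

    public static void main(String[] args) {
        FrameStack s = new FrameStack();
        s.enter(2);
        s.setLocal(0, 7);
        s.enter(3);          // nested call gets its own locals
        s.setLocal(0, 9);
        s.leave();           // caller's locals are visible again
        System.out.println(s.getLocal(0));
        s.leave();
    }
}
```

Because the saved fp sits just below each frame, a return needs no knowledge of the frame size.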

A simple C function:

  int fact(int n)
  { int i, sum;
    sum = 1;
    i = 2;
    while (i <= n)
      { sum = sum * i;
        i = i + 1;
      }
    return sum;
  }

Corresponding VirtualRISC code:

  _fact:
    save sp,-112,sp     // save stack frame
    st R0,[fp+68]       // save input arg n in frame of CALLER
    mov 1,R0            // R0 := 1
    st R0,[fp-16]       // [fp-16] is location for sum
    mov 2,R0            // R0 := 2
    st R0,[fp-12]       // [fp-12] is location for i
  L3:
    ld [fp-12],R0       // load i into R0
    ld [fp+68],R1       // load n into R1
    cmp R0,R1           // compare R0 to R1
    ble L5              // if R0 <= R1 goto L5
    b L4                // goto L4
  L5:
    ld [fp-16],R0       // load sum into R0
    ld [fp-12],R1       // load i into R1
    mul R0,R1,R0        // R0 := R0 * R1
    st R0,[fp-16]       // store R0 into sum
    ld [fp-12],R0       // load i into R0
    add R0,1,R1         // R1 := R0 + 1
    st R1,[fp-12]       // store R1 into i
    b L3                // goto L3
  L4:
    ld [fp-16],R0       // put return value of sum into R0
    restore             // restore register window
    ret                 // return from function

Java Virtual Machine has:
  memory;
  registers;
  condition codes; and
  execution unit.

Java Virtual Machine memory:
  a stack (used for function call frames);
  a heap (used for dynamically allocated memory);
  a constant pool (used for constant data that can be shared); and
  a code segment (used to store JVM instructions of currently loaded class files).

Java Virtual Machine registers:
  no general purpose registers;
  the stack pointer (sp) which points to the top of the stack;
  the local stack pointer (lsp) which points to a location in the current stack frame; and
  the program counter (pc) which points to the current instruction.

Java Virtual Machine condition codes:
  store the result of the last instruction that can set condition codes (used for branching).
Java Virtual Machine execution unit:
  reads the Java Virtual Machine instruction at the current pc, decodes the instruction and executes it;
  this may change the state of the machine (memory, registers, condition codes);
  the pc is automatically incremented after executing an instruction; but
  method calls and branches explicitly change the pc.

Java Virtual Machine stack frames have space for:
  a reference to the current object (this);
  the method arguments;
  the local variables; and
  a local stack used for intermediate results.
The number of local slots and the maximum size of the local stack are fixed at compile-time.

Java compilers translate source code to class files. Class files include the bytecode instructions for each method.

  foo.java -> Java Compiler -> foo.class

Contents of foo.class:
  magic number (0xCAFEBABE)
  minor version/major version
  constant pool
  access flags
  this class
  super class
  interfaces
  fields
  methods
  attributes
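As a small illustration of this layout, the first eight bytes of a class file can be decoded with the standard DataInputStream, which reads big-endian values as the class file format requires. The class name ClassHeader and the method readVersion are hypothetical names for this sketch; only the magic number and the two version fields are parsed.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Minimal sketch: read magic number, minor version, and major version.
final class ClassHeader {
    // Returns {minor, major}; rejects input that does not start with 0xCAFEBABE.
    static int[] readVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();            // u4 magic
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            int minor = in.readUnsignedShort();  // u2 minor version
            int major = in.readUnsignedShort();  // u2 major version
            return new int[] { minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // e.g. input shorter than 8 bytes
        }
    }

    public static void main(String[] args) {
        // Hand-written header bytes standing in for a real foo.class.
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50 };
        int[] v = readVersion(header);
        System.out.println("major=" + v[1] + " minor=" + v[0]);
    }
}
```

To inspect a real class file, the same method could be given the bytes returned by java.nio.file.Files.readAllBytes.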

A simple Java method:

  public int Abs(int x)
  { if (x < 0)
      return(x * -1);
    else
      return(x);
  }

Corresponding bytecode (in Jasmin syntax):

  .method public Abs(I)I   // one int argument, returns an int
  .limit stack 2           // has stack with 2 locations
  .limit locals 2          // has space for 2 locals
                           // --locals--   --stack---
                           // [ o -3 ]     [  *  * ]
    iload_1                // [ o -3 ]     [ -3  * ]
    ifge Label1            // [ o -3 ]     [  *  * ]
    iload_1                // [ o -3 ]     [ -3  * ]
    iconst_m1              // [ o -3 ]     [ -3 -1 ]
    imul                   // [ o -3 ]     [  3  * ]
    ireturn                // [ o -3 ]     [  *  * ]
  Label1:
    iload_1
    ireturn
  .end method

Comments show trace of o.Abs(-3).

A sketch of a bytecode interpreter:

  pc = code.start;
  while(true)
    { npc = pc + instruction_length(code[pc]);
      switch (opcode(code[pc]))
        { case ILOAD_1: push(local[1]);
                        break;
          case ILOAD:   push(local[code[pc+1]]);
                        break;
          case ISTORE:  t = pop();
                        local[code[pc+1]] = t;
                        break;
          case IADD:    t1 = pop(); t2 = pop();
                        push(t1 + t2);
                        break;
          case IFEQ:    t = pop();
                        if (t == 0) npc = code[pc+1];
                        break;
          ...
        }
      pc = npc;
    }
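The interpreter sketch can be turned into a small runnable Java program. The version below is an illustration, not the JVM's real design: the opcode numbers are invented, branch targets are absolute indices as in the sketch, and the class name MiniInterpreter is hypothetical. It runs a hand-encoded version of the Abs bytecode shown earlier.

```java
// Minimal switch-based interpreter over an int[] of invented opcodes.
final class MiniInterpreter {
    // Invented opcode numbers; these are NOT the real JVM encodings.
    static final int ILOAD = 0, ICONST = 1, ISTORE = 2, IADD = 3,
                     IMUL = 4, IFEQ = 5, IFGE = 6, IRETURN = 7;

    // Opcodes with one operand occupy two slots, the others one.
    static int length(int opcode) {
        switch (opcode) {
            case ILOAD: case ICONST: case ISTORE: case IFEQ: case IFGE: return 2;
            default: return 1;
        }
    }

    static int run(int[] code, int[] local, int maxStack) {
        int[] stack = new int[maxStack];
        int sp = 0; // number of values currently on the local stack
        int pc = 0;
        while (true) {
            int op = code[pc];
            int npc = pc + length(op); // default: fall through to next instruction
            switch (op) {
                case ILOAD:  stack[sp++] = local[code[pc + 1]]; break;
                case ICONST: stack[sp++] = code[pc + 1]; break;
                case ISTORE: local[code[pc + 1]] = stack[--sp]; break;
                case IADD: { int b = stack[--sp]; int a = stack[--sp]; stack[sp++] = a + b; break; }
                case IMUL: { int b = stack[--sp]; int a = stack[--sp]; stack[sp++] = a * b; break; }
                case IFEQ:   if (stack[--sp] == 0) npc = code[pc + 1]; break;
                case IFGE:   if (stack[--sp] >= 0) npc = code[pc + 1]; break;
                case IRETURN: return stack[--sp];
                default: throw new IllegalStateException("bad opcode " + op);
            }
            pc = npc;
        }
    }

    // Abs from the earlier slide; local[0] stands in for `this`, local[1] is x.
    static int abs(int x) {
        int[] code = {
            ILOAD, 1,    //  0: iload_1
            IFGE, 10,    //  2: ifge Label1
            ILOAD, 1,    //  4: iload_1
            ICONST, -1,  //  6: iconst_m1
            IMUL,        //  8: imul
            IRETURN,     //  9: ireturn
            ILOAD, 1,    // 10: Label1: iload_1
            IRETURN      // 12: ireturn
        };
        return run(code, new int[] { 0, x }, 2);
    }

    public static void main(String[] args) {
        System.out.println(abs(-3)); // prints 3
    }
}
```

The fetch, decode, and dispatch work done on every iteration of the loop is the interpretative overhead mentioned earlier.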

Unary arithmetic operations:
  ineg      [...:i] -> [...:-i]
  i2c       [...:i] -> [...:i%65536]
Binary arithmetic operations:
  iadd      [...:i1:i2] -> [...:i1+i2]
  isub      [...:i1:i2] -> [...:i1-i2]
  imul      [...:i1:i2] -> [...:i1*i2]
  idiv      [...:i1:i2] -> [...:i1/i2]
  irem      [...:i1:i2] -> [...:i1%i2]
Direct operations:
  iinc k a  [...] -> [...]   local[k]=local[k]+a
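The arithmetic rules can be checked directly in Java, since an int-to-char cast compiles to i2c and the / and % operators on ints compile to idiv and irem. The short sketch below (class name ArithmeticNotes is hypothetical) also shows one detail worth knowing: i2c keeps the low 16 bits as an unsigned value, so the i%65536 notation matches Java's % operator only for non-negative i.

```java
// Small checks of i2c, idiv, and irem behaviour using plain Java operators.
final class ArithmeticNotes {
    // An int-to-char cast keeps the low 16 bits, zero-extended.
    static int i2c(int i) { return (char) i; }

    public static void main(String[] args) {
        System.out.println(i2c(65536 + 65)); // 65: wraps around modulo 65536
        System.out.println(i2c(-1));         // 65535: low 16 bits, unsigned
        System.out.println(-1 % 65536);      // -1: Java's % keeps the dividend's sign
        System.out.println(-7 / 2);          // -3: integer division truncates toward zero
        System.out.println(-7 % 2);          // -1: remainder has the dividend's sign
    }
}
```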

Nullary branch operations:
  goto L        [...] -> [...]       branch always
Unary branch operations:
  ifeq L        [...:i] -> [...]     branch if i == 0
  ifne L        [...:i] -> [...]     branch if i != 0
  ifnull L      [...:o] -> [...]     branch if o == null
  ifnonnull L   [...:o] -> [...]     branch if o != null

Binary branch operations:
  if_icmpeq L   [...:i1:i2] -> [...]   branch if i1 == i2
  if_icmpne L   [...:i1:i2] -> [...]   branch if i1 != i2
  if_icmpgt L   [...:i1:i2] -> [...]   branch if i1 > i2
  if_icmplt L   [...:i1:i2] -> [...]   branch if i1 < i2
  if_icmple L   [...:i1:i2] -> [...]   branch if i1 <= i2
  if_icmpge L   [...:i1:i2] -> [...]   branch if i1 >= i2
  if_acmpeq L   [...:o1:o2] -> [...]   branch if o1 == o2
  if_acmpne L   [...:o1:o2] -> [...]   branch if o1 != o2

Constant loading operations:
  iconst_0       [...] -> [...:0]
  iconst_1       [...] -> [...:1]
  iconst_2       [...] -> [...:2]
  iconst_3       [...] -> [...:3]
  iconst_4       [...] -> [...:4]
  iconst_5       [...] -> [...:5]
  aconst_null    [...] -> [...:null]
  ldc_int i      [...] -> [...:i]
  ldc_string s   [...] -> [...:String(s)]

Locals operations:
  iload k          [...] -> [...:local[k]]
  istore k         [...:i] -> [...]        local[k]=i
  aload k          [...] -> [...:local[k]]
  astore k         [...:o] -> [...]        local[k]=o
Field operations:
  getfield f sig   [...:o] -> [...:o.f]
  putfield f sig   [...:o:v] -> [...]      o.f=v

Stack operations:
  dup    [...:v1] -> [...:v1:v1]
  pop    [...:v1] -> [...]
  swap   [...:v1:v2] -> [...:v2:v1]
  nop    [...] -> [...]
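The effect of dup and swap can be tried out on an ordinary Java Deque used as a stack. This is only a sketch with a hypothetical class name (StackOps); a real JVM operates on typed stack slots rather than boxed Integers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// dup and swap modelled on a Deque whose head is the top of the stack.
final class StackOps {
    // [...:v1] -> [...:v1:v1]
    static void dup(Deque<Integer> s) {
        s.push(s.peek());
    }

    // [...:v1:v2] -> [...:v2:v1]
    static void swap(Deque<Integer> s) {
        int v2 = s.pop(); // old top
        int v1 = s.pop();
        s.push(v2);
        s.push(v1);       // v1 is now on top
    }

    public static void main(String[] args) {
        Deque<Integer> s = new ArrayDeque<>();
        s.push(1);
        s.push(2);
        swap(s);
        System.out.println(s.pop()); // prints 1
        System.out.println(s.pop()); // prints 2
    }
}
```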

Class operations:
  new C          [...] -> [...:o]
  instanceof C   [...:o] -> [...:i]
                 if (o==null) i=0
                 else i=(C<=type(o))
  checkcast C    [...:o] -> [...:o]
                 if (o!=null && !C<=type(o))
                   throw ClassCastException
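The different treatment of null by instanceof and checkcast is visible at the Java source level, where the instanceof operator and a reference cast compile to these two instructions. A minimal sketch (the class name CastDemo is hypothetical):

```java
// instanceof yields false for null, while a cast of null always succeeds.
final class CastDemo {
    static boolean isString(Object o) {
        return o instanceof String;
    }

    static boolean castSucceeds(Object o) {
        try {
            String s = (String) o; // throws only if o is non-null and not a String
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isString(null));     // false
        System.out.println(castSucceeds(null)); // true
        System.out.println(castSucceeds(42));   // false: an Integer is not a String
    }
}
```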

Method operations:
  invokevirtual m sig   [...:o:a_1:...:a_n] -> [...]

    // overloading already resolved:
    // signature of m is known!
    entry=lookupHierarchy(m,sig,class(o));
    block=block(entry);
    push stack frame of size block.locals+block.stacksize;
    local[0]=o;      // local points to
    local[1]=a_1;    // beginning of frame
    ...
    local[n]=a_n;
    pc=block.code;

Method operations:
  invokespecial m sig   [...:o:a_1:...:a_n] -> [...]

    // overloading already resolved:
    // signature of m is known!
    entry=lookupClassOnly(m,sig,class(o));
    block=block(entry);
    push stack frame of size block.locals+block.stacksize;
    local[0]=o;      // local points to
    local[1]=a_1;    // beginning of frame
    ...
    local[n]=a_n;
    pc=block.code;

For which method calls is invokespecial used?
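One way to see the difference between the two instructions is a super call. In the sketch below (all class and method names are hypothetical), a call through a base-class reference is dispatched on the run-time class of the receiver, which is what invokevirtual does, while super.sound() must not be dispatched dynamically and is compiled to invokespecial. Constructor calls (<init>) also use invokespecial, as did private method calls in compilers before Java 11.

```java
// Dynamic dispatch versus a statically bound super call.
final class DispatchDemo {
    static class Animal {
        String sound() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String sound() { return "woof"; }

        // super.sound() always runs Animal.sound, whatever the receiver's class.
        String parentSound() { return super.sound(); }
    }

    // The static type is Animal, but the Dog override is selected at run time.
    static String viaBaseReference() {
        Animal a = new Dog();
        return a.sound();
    }

    public static void main(String[] args) {
        System.out.println(viaBaseReference());        // woof
        System.out.println(new Dog().parentSound());   // ...
    }
}
```

Running javap -c on the compiled classes shows which instruction the compiler chose for each call.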

Method operations:
  ireturn   [...:<frame>:i] -> [...:i]
            pop stack frame, push i onto frame of caller
  areturn   [...:<frame>:o] -> [...:o]
            pop stack frame, push o onto frame of caller
  return    [...:<frame>] -> [...]
            pop stack frame
Those operations also release locks in synchronized methods.

A Java method:

  public boolean member(Object item)
  { if (first.equals(item))
      return true;
    else if (rest == null)
      return false;
    else
      return rest.member(item);
  }

Corresponding bytecode (in Jasmin syntax):

  .method public member(Ljava/lang/Object;)Z
  .limit locals 2       // local[0] = o
                        // local[1] = item
  .limit stack 2        // initial stack [ * * ]
    aload_0             // [ o * ]
    getfield Cons/first Ljava/lang/Object;
                        // [ o.first * ]
    aload_1             // [ o.first item ]
    invokevirtual java/lang/Object/equals(Ljava/lang/Object;)Z
                        // [ b * ] for some boolean b
    ifeq else_1         // [ * * ]
    iconst_1            // [ 1 * ]
    ireturn             // [ * * ]
  else_1:
    aload_0             // [ o * ]
    getfield Cons/rest LCons;
                        // [ o.rest * ]
    aconst_null         // [ o.rest null ]
    if_acmpne else_2    // [ * * ]
    iconst_0            // [ 0 * ]
    ireturn             // [ * * ]
  else_2:
    aload_0             // [ o * ]
    getfield Cons/rest LCons;
                        // [ o.rest * ]
    aload_1             // [ o.rest item ]
    invokevirtual Cons/member(Ljava/lang/Object;)Z
                        // [ b * ] for some boolean b
    ireturn             // [ * * ]
  .end method
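For experimenting with this example, the method needs an enclosing class. The sketch below assumes a simple linked-list class Cons with fields first and rest, matching the field names in the getfield instructions; the constructor and the field modifiers are assumptions, since the slides show only the method.

```java
// A minimal Cons cell around the member method from the slide.
final class Cons {
    private final Object first; // head element (assumed non-null here)
    private final Cons rest;    // remaining list, or null at the end

    Cons(Object first, Cons rest) {
        this.first = first;
        this.rest = rest;
    }

    public boolean member(Object item) {
        if (first.equals(item))
            return true;
        else if (rest == null)
            return false;
        else
            return rest.member(item);
    }

    public static void main(String[] args) {
        Cons list = new Cons("a", new Cons("b", null));
        System.out.println(list.member("b")); // true
        System.out.println(list.member("c")); // false
    }
}
```

Compiling this class and running javap -c Cons gives bytecode that can be compared with the Jasmin listing, although a particular compiler may order the branches differently.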

Bytecode verification:
  bytecode cannot be trusted to be well-formed and well-behaved;
  before executing any bytecode, it should be verified, especially if that bytecode is received over the network;
  verification is performed partly at class loading time, and partly at run-time; and
  at load time, dataflow analysis is used to approximate the number and type of values in locals and on the stack.

Interesting properties of verified bytecode:
  each instruction must be executed with the correct number and types of arguments on the stack, and in locals (on all execution paths);
  at any program point, the stack is the same size along all execution paths;
  every method must have enough locals to hold the receiver object (except static methods) and the method's arguments; and
  no local variable can be accessed before it has been assigned a value.
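The stack-size property can be illustrated with a small worklist dataflow analysis. The sketch below is far simpler than the real verifier and rests on several assumptions: the opcodes and their numbering are invented, branch targets are absolute indices, and only stack heights are tracked, not types or locals. It rejects code that underflows the stack, exceeds the declared limit, or reaches some instruction with two different stack heights.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Checks that every reachable instruction sees one consistent stack height.
final class StackHeightVerifier {
    // Invented opcodes; NOT the real JVM encodings.
    static final int ILOAD = 0, ICONST = 1, IMUL = 2, IFGE = 3, GOTO = 4, IRETURN = 5;

    static int length(int op) { return (op == IMUL || op == IRETURN) ? 1 : 2; }
    static int pops(int op)   { return op == IMUL ? 2 : (op == IFGE || op == IRETURN) ? 1 : 0; }
    static int pushes(int op) { return (op == ILOAD || op == ICONST || op == IMUL) ? 1 : 0; }

    static boolean verify(int[] code, int maxStack) {
        if (code.length == 0) return false;
        int[] heightAt = new int[code.length];
        Arrays.fill(heightAt, -1);            // -1 means "not reached yet"
        Deque<Integer> work = new ArrayDeque<>();
        heightAt[0] = 0;                      // the stack is empty on entry
        work.push(0);
        while (!work.isEmpty()) {
            int pc = work.pop();
            int op = code[pc];
            if (op < ILOAD || op > IRETURN) return false;      // unknown opcode
            if (pc + length(op) > code.length) return false;   // truncated operand
            int h = heightAt[pc] - pops(op);
            if (h < 0) return false;                           // stack underflow
            h += pushes(op);
            if (h > maxStack) return false;                    // exceeds the stack limit
            int[] successors;
            if (op == IRETURN)   successors = new int[0];
            else if (op == GOTO) successors = new int[] { code[pc + 1] };
            else if (op == IFGE) successors = new int[] { code[pc + 1], pc + 2 };
            else                 successors = new int[] { pc + length(op) };
            for (int s : successors) {
                if (s < 0 || s >= code.length) return false;   // runs off the code
                if (heightAt[s] == -1) { heightAt[s] = h; work.push(s); }
                else if (heightAt[s] != h) return false;       // paths disagree on height
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The Abs method from the earlier slide, in the invented encoding.
        int[] abs = { ILOAD, 1, IFGE, 10, ILOAD, 1, ICONST, -1, IMUL, IRETURN, ILOAD, 1, IRETURN };
        System.out.println(verify(abs, 2)); // true
        System.out.println(verify(abs, 1)); // false: two stack slots are needed
    }
}
```

Each instruction is processed at most once here because a second visit with a different height is an error; the real analysis also has to merge type information, which is where the extra cost comes from.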

Java class loading and execution model:
  when a method is invoked, a ClassLoader finds the correct class and checks that it contains an appropriate method;
  if the method has not yet been loaded, then it is verified (remote classes);
  after loading and verification, the method body is interpreted.
  If the method is executed multiple times, the bytecode for that method is translated to native code.
  If the method becomes hot, the native code is optimized.
The last two steps are very involved, and companies like Sun and IBM have a thousand people working on optimizing these steps.
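The progression from interpretation to native code to optimized native code can be pictured as a per-method invocation counter. The sketch below is purely illustrative: the thresholds, the class name TieringSketch, and the three tier names are invented, and real JVMs use considerably more elaborate heuristics.

```java
import java.util.HashMap;
import java.util.Map;

// Counts invocations per method and reports which execution tier would be used.
final class TieringSketch {
    enum Tier { INTERPRETED, COMPILED, OPTIMIZED }

    // Invented thresholds, chosen only to keep the example small.
    static final int COMPILE_AFTER = 3;
    static final int OPTIMIZE_AFTER = 10;

    private final Map<String, Integer> calls = new HashMap<>();

    Tier invoke(String method) {
        int n = calls.merge(method, 1, Integer::sum); // increment and read the counter
        if (n > OPTIMIZE_AFTER) return Tier.OPTIMIZED; // the method has become hot
        if (n > COMPILE_AFTER) return Tier.COMPILED;   // executed multiple times
        return Tier.INTERPRETED;                       // first few calls
    }

    public static void main(String[] args) {
        TieringSketch vm = new TieringSketch();
        for (int i = 1; i <= 12; i++) {
            System.out.println(i + ": " + vm.invoke("fact"));
        }
    }
}
```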

Split-verification in Java 6+:
  Bytecode verification is easy but still polynomial, i.e. sometimes slow,
  and this can be exploited in denial-of-service attacks: http://www.bodden.de/research/javados/
  Java 6 (version 50.0 bytecodes) introduced StackMapTable attributes to make verification linear.
    Java compilers know the type of locals at compile time.
    Java 6 compilers store these types in the bytecode using StackMapTable attributes.
    This speeds up construction of the proof tree, an approach also called Proof-Carrying Code.
  Java 7 (version 51.0 bytecodes) JVMs will enforce presence of these attributes.

Future use of Java bytecode:
  the JOOS compiler will produce Java bytecode in Jasmin format; and
  the JOOS peephole optimizer transforms bytecode into more efficient bytecode.
Future use of VirtualRISC:
  Java bytecode can be converted into machine code at run-time using a JIT (Just-In-Time) compiler;
  we will study some examples of converting Java bytecode into a language similar to VirtualRISC;
  we will study some simple, standard optimizations on VirtualRISC.