JVM. What This Topic is About. Course Overview. Recap: Interpretive Compilers. Abstract Machines. Abstract Machines. Class Files and Class File Format

Similar documents
Course Overview. PART I: overview material. PART II: inside a compiler. PART III: conclusion

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

JVML Instruction Set. How to get more than 256 local variables! Method Calls. Example. Method Calls

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

Compiling Techniques

CSC 4181 Handout : JVM

301AA - Advanced Programming [AP-2017]

Programming Language Systems

The Java Virtual Machine. CSc 553. Principles of Compilation. 3 : The Java VM. Department of Computer Science University of Arizona

SOFTWARE ARCHITECTURE 7. JAVA VIRTUAL MACHINE

Course Overview. Levels of Programming Languages. Compilers and other translators. Tombstone Diagrams. Syntax Specification

Let s make some Marc R. Hoffmann Eclipse Summit Europe

Compiler construction 2009

COMP3131/9102: Programming Languages and Compilers

Run-time Program Management. Hwansoo Han

Tutorial 3: Code Generation

Under the Hood: The Java Virtual Machine. Problem: Too Many Platforms! Compiling for Different Platforms. Compiling for Different Platforms

The Java Virtual Machine

Topics. Structured Computer Organization. Assembly language. IJVM instruction set. Mic-1 simulator programming

Under the Hood: The Java Virtual Machine. Lecture 23 CS2110 Fall 2008

COMP3131/9102: Programming Languages and Compilers

02 B The Java Virtual Machine

Java byte code verification

JAM 16: The Instruction Set & Sample Programs

Over-view. CSc Java programs. Java programs. Logging on, and logging o. Slides by Michael Weeks Copyright Unix basics. javac.

CSCE 314 Programming Languages

Compiler construction 2009

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Chapter 5. A Closer Look at Instruction Set Architectures

CMPSC 497: Java Security

Java and C II. CSE 351 Spring Instructor: Ruth Anderson

COMP 520 Fall 2009 Virtual machines (1) Virtual machines

Java Class Loading and Bytecode Verification

Translating JVM Code to MIPS Code 1 / 43

Static Program Analysis

Improving Java Performance

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace

CSc 453 Interpreters & Interpretation

Taming the Java Virtual Machine. Li Haoyi, Chicago Scala Meetup, 19 Apr 2017

CS2110 Fall 2011 Lecture 25. Under the Hood: The Java Virtual Machine, Part II

Improving Java Code Performance. Make your Java/Dalvik VM happier

CSE 431S Final Review. Washington University Spring 2013

Problem: Too Many Platforms!

An Introduction to Multicodes. Ben Stephenson Department of Computer Science University of Western Ontario

javac 29: pop 30: iconst_0 31: istore_3 32: jsr [label_51]

Roadmap. Java: Assembly language: OS: Machine code: Computer system:

Recap: Printing Trees into Bytecodes

Compilation 2012 Code Generation

Compiling for Different Platforms. Problem: Too Many Platforms! Dream: Platform Independence. Java Platform 5/3/2011

CS263: Runtime Systems Lecture: High-level language virtual machines. Part 1 of 2. Chandra Krintz UCSB Computer Science Department

The Java Virtual Machine

Java: framework overview and in-the-small features

Program Dynamic Analysis. Overview

3/15/18. Overview. Program Dynamic Analysis. What is dynamic analysis? [3] Why dynamic analysis? Why dynamic analysis? [3]

CMSC 430 Introduction to Compilers. Spring Intermediate Representations and Bytecode Formats

Plan for Today. Safe Programming Languages. What is a secure programming language?

A Quantitative Analysis of Java Bytecode Sequences

Chapter 5. A Closer Look at Instruction Set Architectures

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Building a Compiler with. JoeQ. Outline of this lecture. Building a compiler: what pieces we need? AKA, how to solve Homework 2

Jaos - Java on Aos. Oberon Event 03 Patrik Reali

Delft-Java Dynamic Translation

CSc 620 Debugging, Profiling, Tracing, and Visualizing Programs. Compiling Java. The Java Class File Format 1 : JVM

Oak Intermediate Bytecodes

How do you create a programming language for the JVM?

Code Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Java Security. Compiler. Compiler. Hardware. Interpreter. The virtual machine principle: Abstract Machine Code. Source Code

Exercise 7 Bytecode Verification self-study exercise sheet

COMP 520 Fall 2013 Code generation (1) Code generation

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Copyright 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 16

Introduction Basic elements of Java

Code Generation Introduction

High-Level Language VMs

The Java Language Implementation

EXAMINATIONS 2014 TRIMESTER 1 SWEN 430. Compiler Engineering. This examination will be marked out of 180 marks.

Running class Timing on Java HotSpot VM, 1

Part VII : Code Generation

Code Profiling. CSE260, Computer Science B: Honors Stony Brook University

CMSC 430 Introduction to Compilers. Fall Language Virtual Machines

Short Notes of CS201

Mnemonics Type-Safe Bytecode Generation in Scala

CS201 - Introduction to Programming Glossary By

301AA - Advanced Programming [AP-2017]

Topics. Block Diagram of Mic-1 Microarchitecture. Review Sheet. Microarchitecture IJVM ISA

COE318 Lecture Notes Week 3 (Week of Sept 17, 2012)

This project is supported by DARPA under contract ARPA F C-0057 through a subcontract from Syracuse University. 2

JoeQ Framework CS243, Winter 20156

Introduction to Programming Using Java (98-388)

Final Exam. 12 December 2018, 120 minutes, 26 questions, 100 points

Code Generation COMP 520: Compiler Design (4 credits) Professor Laurie Hendren

Java TM Introduction. Renaud Florquin Isabelle Leclercq. FloConsult SPRL.

Design and Implementation of Pep, a Java Just-In-Time Translator

Joeq Analysis Framework. CS 243, Winter

Run time environment of a MIPS program

Computer Components. Software{ User Programs. Operating System. Hardware

Compiler Construction I

Shared Mutable State SWEN-220

Java Instrumentation for Dynamic Analysis

Virtual Machines. COMP 520: Compiler Design (4 credits) Alexander Krolik MWF 9:30-10:30, TR

Transcription:

Course Overview What This Topic is About PART I: overview material 1 Introduction 2 Language processors (tombstone diagrams, bootstrapping) 3 Architecture of a compiler PART II: inside a compiler 4 Syntax analysis 5 Contextual analysis 6 Runtime organization 7 Code generation PART III: conclusion 8 Interpretation 9 Review Supplementary material: s runtime organization and the Virtual Machine 1 We look at the JVM as an example of a real-world runtime system for a modern object-oriented programming language. JVM is probably the most common and widely used VM in the world, so you ll get a better idea what a real VM looks like. JVM is an abstract machine. What is the JVM architecture? What is the structure of.class files? How are JVM instructions executed? What is the role of the constant pool in dynamic linking? Also visit this site for more complete information about the JVM: http://java.sun.com/docs/books/vmspec/2nd-edition/html/vmspectoc.doc.html 2 Recap: Interpretive Compilers Abstract Machines Why? A tradeoff between fast(er) compilation and a reasonable runtime performance. How? Use an intermediate language more high-level than machine code => easier to compile to more low-level than source language => easy to implement as an interpreter Example: A Development Kit for machine M >JVM M JVM M Abstract machine implements an intermediate language in between the high-level language (e.g. ) and the low-level hardware (e.g. Pentium) High level Low level Pentium JVM (.class files) Pentium Implemented in : Machine independent compiler JVM interpreter or JVM JIT compiler 3 4 Abstract Machines Class Files and Class File Format An abstract machine is intended specifically as a runtime system for a particular (kind of) programming language. JVM is a virtual machine for programs. It directly supports object-oriented concepts such as classes, objects, methods, method invocation etc. Easy to compile to JVM => 1. easy to implement compiler 2. fast compilation Another advantage: portability External representation (platform independent).class files load JVM Internal representation (implementation dependent) classes objects arrays primitive types methods strings The JVM is an abstract machine in the truest sense of the word. The JVM specification does not give implementation details (can be dependent on target OS/platform, performance requirements, etc.) The JVM specification defines a machine independent class file format that all JVM implementations must support. 5 6 1

Data Types JVM (and ) distinguishes between two kinds of types: Primitive types: boolean: boolean numeric integral: byte, short, int, long, char numeric floating point:float, double internal, for exception handling: returnaddress JVM: Runtime Data Areas Besides OO concepts, JVM also supports multi-threading. Threads are directly supported by the JVM. => Two kinds of runtime data areas: 1. shared between all threads 2. private to a single thread Shared Thread 1 Thread 2 Reference types: class types array types interface types Note: Primitive types are represented directly, reference types are represented indirectly (as pointers to array or class instances). 7 Garbage Collected Heap Method area pc Native Method pc Native Method 8 s Frames JVM is a stack based machine, much like TAM. JVM instructions implicitly take arguments from the stack top put their result on the top of the stack The stack is used to pass arguments to methods return a result from a method store intermediate results while evaluating expressions store local variables This works similarly to (but not exactly the same as) what we previously discussed about stack-based storage allocation and routines. The stack consists of frames. The JVM specification does not say exactly how the stack and frames should be implemented. The JVM specification specifies that a stack frame has areas for: args + local vars operand stack Pointer to runtime constant pool A new call frame is created by executing some JVM instruction for invoking a method (e.g. invokevirtual, invokenonvirtual,...) The operand stack is initially empty, but grows and shrinks during execution. 9 10 pointer to constant pool args + local vars operand stack Frames The role/purpose of each of the areas in a stack frame: Used implicitly when executing JVM instructions that contain entries into the constant pool (more about this later). Space where the arguments and local variables of a method are stored. This includes a space for the receiver (this) at position/offset 0. for storing intermediate results during the execution of the method. Initially it is empty. The maximum depth is known at compile time. 11 Frames An implementation using registers such as SB, ST, and LB and a dynamic link is one possible implementation. SB to previous frame on the stack LB ST dynamic link args + local vars operand stack to runtime constant pool JVM instructions store and load (for accessing args and locals) use addresses which are numbers from 0 to #args + #locals - 1 12 2

JVM Interpreter Instruction-set: typed instructions! The core of a JVM interpreter is basically this: do { byte opcode = fetch an opcode; switch (opcode) { case opcode1 : fetch operands for opcode1; execute action for opcode1; break; case opcode2 : fetch operands for opcode2; execute action for opcode2; break; case... while (more to do) JVM instructions are explicitly typed: different opcodes for instructions for integers, floats, arrays, reference types, etc. This is reflected by a naming convention in the first letter of the opcode mnemonics: Example: different types of load instructions iload lload fload dload aload integer load long load float load double load reference-type load 13 14 Instruction set: kinds of operands Instruction-set: accessing arguments and locals JVM instructions have three kinds of operands: - from the top of the operand stack - from the bytes following the opcode - part of the opcode itself Each instruction may have different forms supporting different kinds of operands. Example: different forms of iload Assembly code iload_0 iload_1 iload_2 iload_3 iload n Binary instruction code layout 26 27 28 29 21 n wide iload n 196 21 n 15 0: 1: 2: 3: arguments and locals area inside a stack frame Instruction examples: iload_1 iload_3 aload 5 aload_0 istore_1 astore_1 fstore_3 args: indexes 0.. #args - 1 locals: indexes #args.. #args + #locals - 1 A load instruction takes something from the args/locals area and pushes it onto the top of the operand stack. A store instruction pops something from the top of the operand stack and places it in the args/locals area. 16 Instruction-set: non-local memory access Instruction-set: operations on numbers In the JVM, the contents of different kinds of memory can be accessed by different kinds of instructions. accessing locals and arguments: load and storeinstructions accessing fields in objects:getfield, putfield accessing static fields:getstatic, putstatic Note: Static fields are a lot like global variables. They are allocated in the method area where also code for methods and representations for classes (including method tables) are stored. Arithmetic add: iadd, ladd, fadd, dadd subtract: isub, lsub, fsub, dsub multiply: imul, lmul, fmul, dmul etc. Conversion i2l, i2f, i2d, l2f, l2d, f2d, f2i, d2i, Note: getfield and putfieldaccess memory in the heap. Note: JVM doesn t have anything similar to registers L1, L2, etc. 17 18 3

Instruction-set Operand stack manipulation pop, pop2, dup, dup2, swap, Control transfer Unconditional:goto, jsr, ret, Conditional: ifeq, iflt, ifgt, if_icmpeq, Instruction-set Method invocation: invokevirtual: usual instruction for calling a method on an object. invokeinterface: same as invokevirtual, but used when the called method is declared in an interface (requires a different kind of method lookup) invokespecial: for calling things such as constructors, which are not dynamically dispatched (this instruction is also known as invokenonvirtual). invokestatic: for calling methods that have the static modifier (these methods are sent to a class, not to an object). Returning from methods: return, ireturn, lreturn, areturn, freturn, 19 20 Instruction-set: Heap Memory Allocation Create new class instance (object): new Create new array: newarray: for creating arrays of primitive types. anewarray, multianewarray: for arrays of reference types. Many JVM instructions have operands which are indexes pointing to an entry in the so-called constant pool. The constant pool contains all kinds of entries that represent symbolic references for linking. This is the way that instructions refer to things such as classes, interfaces, fields, methods, and constants such as string literals and numbers. These are the kinds of constant pool entries that exist: Class_info Fieldref_info Methodref_info InterfaceMethodref_info String Integer Float Long Double Name_and_Type_info Utf8_info (Unicode characters) 21 22 Example: We examine the getfield instruction in detail. Format: CONSTANT_Fieldref_info { u1 tag; u2 class_index; u2 name_and_type_index; 180 indexbyte1 indexbyte2 CONSTANT_Name_and_Type_info { u1 tag; u2 name_index; u2 descriptor_index; Class_info { u1 tag; u2 name_index; name of field field descriptor fully qualified class name 23 That previous picture is rather complicated, let s simplify it a little: Format: Class fully qualified class name 180 indexbyte1 indexbyte2 name of field Fieldref Name_and_Type field descriptor 24 4

The constant entries format is part of the class file format. Luckily, we have a assembler that allows us to write a kind of textual assembly code and that is then transformed into a binary.class file. Fully qualified class names and descriptors in constant pool UTF8 entries. 1. Fully qualified class name: a package + class name string. Note this uses / instead of. to separate each level along the path. This assembler takes care of creating the constant pool entries for us. When an instruction operand expects a constant pool entry the assembler allows you to enter the entry in place in an easy syntax. Example: getfield mypackage/queue i I 2. Descriptor: a string that defines a type for a method or field. boolean integer Object String[] int foo(int,object) descriptor Z I Ljava/lang/Object; [Ljava/lang/String; (ILjava/lang/Object;)I 25 26 Linking Loading and Linking in JVM In general, linking is the process of resolving symbolic references in binary files. In JVM, loading and linking of class files happens at runtime, while the program is running! Most programming language implementations have what we call separate compilation. Modules or files can be compiled separately and transformed into some binary format. But since these separately compiled files may have connections to other files, they have to be linked. Classes are loaded as needed. The constant pool contains symbolic references that need to be resolved before a JVM instruction that uses them can be executed (this is the equivalent of linking). => The binary file is not yet executable, because it has some kind of symbolic links in it that point to things (classes, methods, functions, variables, etc.) in other files/modules. Linking is the process of resolving these symbolic links and replacing them by real addresses so that the code can be executed. 27 In JVM a constant pool entry is resolved the first time it is used by a JVM instruction. Example: When a getfieldis executed for the first time, the constant pool entry index in the instruction can be replaced by the offset of the field. 28 class Factorial { Closing Example As a closing example on the JVM, we will take a look at the compiled code of the following simple class declaration. int fac(int n) { int result = 1; for (int i=2; i<n; i++) { result = result * i; return result; Compiling and Disassembling % javac Factorial.java % javap -c -verbose Factorial Compiled from Factorial.java class Factorial extends java.lang.object { Factorial(); /* =1, Locals=1, Args_size=1 */ int fac(int); /* =2, Locals=4, Args_size=2 */ Method Factorial() 0 aload_0 1 invokespecial #1 <Method java.lang.object()> 4 return 29 30 5

Compiling and Disassembling... // address: 0 1 2 3 Method int fac(int) // stack: this n result i 0 iconst_1 // stack: this n result i 1 1 istore_2 // stack: this n result i 2 iconst_2 // stack: this n result i 2 3 istore_3 // stack: this n result i 4 goto 14 7 iload_2 // stack: this n result i result 8 iload_3 // stack: this n result i result i 9 imul // stack: this n result i result*i 10 istore_2 // stack: this n result i 11 iinc 3 1 // stack: this n result i 14 iload_3 // stack: this n result i i 15 iload_1 // stack: this n result i i n 16 if_icmplt 7 // stack: this n result i 19 iload_2 // stack: this n result i result 20 ireturn.class package Factorial.super java/lang/object Writing Factorial in jasmin Jasminis a Assembler Interface. It takes ASCII descriptions for classes, written in a simple assembler-like syntax and using the Virtual Machine instruction set. It converts them into binary class files suitable for loading into a JVM implementation..method package <init>( )V )V.limit stack 50 50.limit locals 1 aload_0 invokenonvirtual java/lang/object/<init>( )V )V return.end method 31 32 Writing Factorial in jasmin (continued).method package fac(i)i.limit stack 50.limit locals 4 iconst_1 istore 2 iconst_2 istore 3 Label_1: iload 3 iload 1 if_icmplt Label_4 iconst_0 goto Label_5 Label_4: iconst_1 Label_5: ifeq Label_2 iload 2 iload 3 imul dup istore 2 pop Label_3: iload 3 dup iconst_1 iadd istore 3 pop goto Label_1 Label_2: iload 2 ireturn iconst_0 ireturn.end method 33 6