High-Level Language VMs

Similar documents
Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Mechanisms and constructs for System Virtualization

Run-time Program Management. Hwansoo Han

Project. there are a couple of 3 person teams. a new drop with new type checking is coming. regroup or see me or forever hold your peace

SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine p. 1

CSc 453 Interpreters & Interpretation

Name, Scope, and Binding. Outline [1]

Acknowledgements These slides are based on Kathryn McKinley s slides on garbage collection as well as E Christopher Lewis s slides

Run-time Environments. Lecture 13. Prof. Alex Aiken Original Slides (Modified by Prof. Vijay Ganesh) Lecture 13

JamaicaVM Java for Embedded Realtime Systems

Compiling Techniques

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

Managed runtimes & garbage collection

Run-Time Environments/Garbage Collection

JAM 16: The Instruction Set & Sample Programs

Lecture 13: Garbage Collection

Garbage Collection. Hwansoo Han

Run-time Environments

Run-time Environments

Java Internals. Frank Yellin Tim Lindholm JavaSoft

Java: framework overview and in-the-small features

Compiler Construction

Automatic Memory Management

Manual Allocation. CS 1622: Garbage Collection. Example 1. Memory Leaks. Example 3. Example 2 11/26/2012. Jonathan Misurda

Compiler Construction

G Programming Languages - Fall 2012

Introduction to Virtual Machines. Michael Jantz

Static Program Analysis

Binding and Storage. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

Chapter 1 GETTING STARTED. SYS-ED/ Computer Education Techniques, Inc.

Hardware Emulation and Virtual Machines

Virtualization. Pradipta De

Code Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Garbage Collection. Akim D le, Etienne Renault, Roland Levillain. May 15, CCMP2 Garbage Collection May 15, / 35

Advanced Programming & C++ Language

LIA. Large Installation Administration. Virtualization

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

Chapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats

Java Instrumentation for Dynamic Analysis

Memory Allocation. Static Allocation. Dynamic Allocation. Dynamic Storage Allocation. CS 414: Operating Systems Spring 2008

Structure of Programming Languages Lecture 10

Implementation Garbage Collection

Chapter 5. A Closer Look at Instruction Set Architectures

Garbage Collection (1)

Last class: OS and Architecture. OS and Computer Architecture

Last class: OS and Architecture. Chapter 3: Operating-System Structures. OS and Computer Architecture. Common System Components

Virtual Machines and Dynamic Translation: Implementing ISAs in Software

Java Security. Compiler. Compiler. Hardware. Interpreter. The virtual machine principle: Abstract Machine Code. Source Code

CSCE 410/611: Virtualization!

CS61C : Machine Structures

Learning Outcomes. Extended OS. Observations Operating systems provide well defined interfaces. Virtual Machines. Interface Levels

Computer Systems A Programmer s Perspective 1 (Beta Draft)

Compilers. 8. Run-time Support. Laszlo Böszörmenyi Compilers Run-time - 1

Page 1. Structure of von Nuemann machine. Instruction Set - the type of Instructions

Parley: Federated Virtual Machines

JAVA PERFORMANCE. PR SW2 S18 Dr. Prähofer DI Leopoldseder

Smalltalk Implementation

CMSC 330: Organization of Programming Languages

[07] SEGMENTATION 1. 1

6.828: OS/Language Co-design. Adam Belay

Heap Management. Heap Allocation

Virtual Machines. Virtual Machines

Administration CS 412/413. Why build a compiler? Compilers. Architectural independence. Source-to-source translator

Compiler construction 2009

CMSC 330: Organization of Programming Languages

Memory Management. Chapter Fourteen Modern Programming Languages, 2nd ed. 1

Heap, Variables, References, and Garbage. CS152. Chris Pollett. Oct. 13, 2008.

CSE 504: Compiler Design. Runtime Environments

Intermediate Representations

Instruction-set Design Issues: what is the ML instruction format(s) ML instruction Opcode Dest. Operand Source Operand 1...

SOFTWARE ARCHITECTURE 7. JAVA VIRTUAL MACHINE

CS2110 Fall 2011 Lecture 25. Under the Hood: The Java Virtual Machine, Part II

UNIT 3

Subprograms, Subroutines, and Functions

Runtime. The optimized program is ready to run What sorts of facilities are available at runtime

Process s Address Space. Dynamic Memory. Backing the Heap. Dynamic memory allocation 3/29/2013. When a process starts the heap is empty

The Java Language Implementation

Advanced Object-Oriented Programming Introduction to OOP and Java

Memory Management. Didactic Module 14 Programming Languages - EEL670 1

NOTE: Answer ANY FOUR of the following 6 sections:

Using the Singularity Research Development Kit

The SURE Architecture

Java and C II. CSE 351 Spring Instructor: Ruth Anderson

UNIT-4 (COMPILER DESIGN)

CMSC 330: Organization of Programming Languages. Memory Management and Garbage Collection

Run-time Environments -Part 3

CSCE 410/611: Virtualization

CSCE 314 Programming Languages

Virtual Memory Primitives for User Programs

Where We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization.

Algorithms for Dynamic Memory Management (236780) Lecture 4. Lecturer: Erez Petrank

Practical Malware Analysis

JVM. What This Topic is About. Course Overview. Recap: Interpretive Compilers. Abstract Machines. Abstract Machines. Class Files and Class File Format

ISA: The Hardware Software Interface

Older geometric based addressing is called CHS for cylinder-head-sector. This triple value uniquely identifies every sector.

Lecture Notes on Garbage Collection

Jaguar: Enabling Efficient Communication and I/O in Java

Executing Legacy Applications on a Java Operating System

Transcription:

High-Level Language VMs

Outline Motivation What is the need for HLL VMs? How are these different from System or Process VMs? Approach to HLL VMs Evolutionary history Pascal P-code Object oriented HLL VMs HLL VM architecture and constructs

What is a High-Level Language (HLL) VM? A type of Process VM that facilitates application code portability by encapsulating the process runtime in the VM. The application code is built in a high-level language built to a virtual-isa model. The runtime for the HLL is translated during application execution on the physical ISA. Early example is Pascal P-code Popular examples are Java VM and Microsoft CLI.

HLL-VM

Concerns with conventional ISAs Conventional ISAs not developed for being virtualized! no built-in ISA support for virtualization after the fact solution for portability Operating system dependencies Issues with fixed-size address space, page-size Memory address formation Maintaining precise exceptions Instruction set features Instruction discovery during indirect jumps Self-modifying and self-referencing code

Virtual ISA and HLL-VMs HLL-VMs are built to a virtual-isa The virtual-isa is designed to support the virtualization model aimed at: VM based portability is a primary design goal generous use of metadata metadata allows better type-safe code verification, interoperability, and performance

Operating System Dependencies Conventional ISA difficult to emulate exact emulation may not be possible (different OS, different ISA) HLL Virtual-ISA find a least common denominator set of functions programs interact with the library API library interface is higher level than conventional OS interface

Memory Architecture Conventional ISA fixed-size address spaces specific addresses visible to user programs HLL Virtual-ISA abstract memory model of indefinite size memory regions allocated based on need actual memory addresses are never visible out-of-memory error reported if process requests more than what is available of platform at runtime

Memory Address Formation Conventional ISA unrestricted address computation difficult to protect runtime from unauthorized guest program accesses HLL Virtual-ISA pointer arithmetic not permitted memory access only through explicit memory pointers static/dynamic type checking employed

Precise Exceptions Conventional ISA many instructions trap, precise state needed global flags enable/disable exceptions HLL Virtual-ISA few instructions trap test for exception encoded in the program requirements for precise exceptions are relaxed

Instruction Set Features Conventional ISA guest ISA registers > host registers is a problem ISAs with condition codes are difficult to emulate HLL Virtual-ISA stack-oriented condition codes are avoided

Instruction Discovery Conventional ISA indirect jumps to potentially arbitrary locations variable-length instruction, embedded data, padding HLL Virtual-ISA restricted indirect jumps no mixing of code and data variable-length instructions permitted

Self-Referencing, Self-Modifying Code Conventional ISA pose problems for translated code High-level language V-ISA self-modifying and self-referencing code not permitted

HLL VMs Case Studies Pascal P-Code Object-Oriented HLL VMs Java and JVM CLI and.net

Pascal P-code Popularized the Pascal language simplified porting of a Pascal compiler Introduced several concepts used in HLL VMs stack-based instruction set memory architecture is implementation independent undefined stack and heap sizes standard libraries used to interface with the OS Objective was compiler portability (and application portability)

Pascal P-Code Features Protection via trusted interpreter. Advantages porting is simplified Same compiler for all platforms VM implementation is smaller/simpler than a compiler VM provides concise definition of semantics Disadvantages achieving OS independence reduces API functionality to least common denominator tendency to add platform-specific API extensions

Object Oriented HLL VMs Evolved with an end-use in networked computing environment Important features of HLL VMs security and protection Protect remote resources, local files, VM runtime robustness OOP model provides component-based programming, strong type-checking, and garbage collection networking Incremental loading, and small code-size performance Easy code discovery allows entire method compilation

Toolchains and Terminology Java Virtual Machine Architecture and Microsoft CLI analogous to an ISA Java Virtual Machine Implementation and CLR analogous to a computer implementation Java bytecodes and Microsoft Intermediate Language (MSIL), CIL, IL the instruction part of the ISA Java Platform and.net framework ISA + Libraries; a higher level ABI

Modern HLL VM Compiler frontend produces binary files standard format common to all architectures Binary files contain both code and metadata

HLL VMs Key Properties: Security A key aspect of modern networkoriented Vms protection sandbox Must protect: remote resources (files) local files runtime Java's first generation security method still the default

Java Protection Sandbox Remote resources protected by remote system Local resources protected by security manager VM software local file local file class file loaded method lib method loaded method class file loaded method class file loaded method loaded method loaded method class file Network or Local Filesystem loaded method loaded method protected via static/dynamic checking Security Agent native method native method Standard Libraries Java Emulation Engine Loader

Java 1.1 Security: Signing Identifies source of the input program can implement different security policies for programs from different vendors

Java 2 Security: Stack Walking Inspect privileges of all methods on stack append method permissions Method 4 attempts to write file B via io.method5 call fails since Method2 does not have privileges

Garbage collection Issues with traditional malloc/free, new/delete explicit memory allocation places burden on programmer dangling pointer, double free errors Garbage collection objects with no references are garbage must be collected to free up memory for future object allocation OS limits memory use by a process eliminates programmer pointer errors

Network Friendliness Support dynamic class loading on demand load classes only when needed spread loading over time Compact instruction encoding zero-address stack-based bytecode to reduce code size contain significant metadata maybe a slight code size win over RISC fixed-width ISAs

Java ISA Formalized in classfile specification. Includes instruction definitions (bytecodes). Includes data definitions and interrelationships (metadata).

Java Architected State Implied registers program counter, local variable pointer, operand stack pointer, current frame pointer, constant pool base Stack arguments, locals, and operands Heap objects and arrays implementation-dependent object representation Class file content constant pool holds immediates (and other constant information)

Data Items Types are defined in specification implementation free to choose representation reference (pointers) and primitive (byte, int, etc.) types Range of values that can be held are given e.g., byte is between -127 and +128 data is located via references; as fields of objects in heap offsets using constant pool pointer, stack pointer

Data Accessing

Bytecodes single byte opcode zero or more operands Can access operands from instruction current constant pool current frame local variables values on operand stack Instruction Set

Instruction Types Pushing constants onto the stack Moving local variable contents to and from the stack Managing arrays Generic stack instructions (dup, swap, pop & nop) Arithmetic and logical instructions Conversion instructions Control transfer and function return Manipulating object fields Method invocation Miscellaneous operations Monitors

Stack Tracking At any point in program operand stack has same number of operands of same types and in same order regardless of the control path getting there! Helps with static type checking

Stack Tracking: Example Valid bytecode sequence iload A //push int. A from local mem. iload B //push int. B from local mem. If_cmpne 0 else // branch if B ne 0 iload C // push int. C from local mem. goto endelse else: iload F //push F endelse: add // add from stack; result to stack istore D // pop sum to D

Stack Tracking: Example Invalid bytecode sequence stack at skip1 depends on control-flow path iload B // push int. B from local mem. If_cmpne 0 skip1 // branch if B ne 0 iload C // push int. C from local mem. skip1: iload D // push D iload E // push E if_cmpne 0 skip2 // branch if E ne 0 add // add stack; result to stack skip2: istore F // pop to F

Exception Table Exceptions identified by table with each method in the classfile address range where checking is in effect target if exception is thrown Operand stack is emptied If no table entry in current method pop stack frame and check calling method default handlers at main function From To Target Type 8 12 96 Arithmetic Exception

Java Binary Class Format Magic number and header Regions preceded by counts constant pool interfaces field information methods attributes Magic Number Version Information Constant Pool Size Constant Pool Access Flags This Class Super Class Interface Count Interfaces Field Count Field Information Methods count Methods Information Attributes Count Attributes Information

Java Virtual Machine Abstract entity that gives meaning to class files Has many concrete implementations hardware interpreter JIT compiler Persistence an instance is created when an application starts terminates when the application finishes

JVM Implementation A typical JVM implementation consists of class loader subsystem, memory subsystem, emulation/execution engine, garbage collector

Java Class Loader Functions find the binary class convert class data into implementation dependent memory image verify correctness and consistency of the loaded classes Security checks checks class magic number component sizes are as indicated in class file checks number/types of arguments verify integrity of the bytecode program

Java Protection Sandbox

Java Security Manager A trusted class containing check methods attached when Java program starts cannot be removed or changed User specifies checks to be made files, types of access, etc. Operation native methods that involve resource accesses (e.g. I/O) first call check method(s)

Verification in Java Class files are checked when loaded to ensure security and protection Internal Checks checks for magic number checks for truncation or extra bytes each component specifies a length make sure components are well-formed Bytecode checks check valid opcodes perform full path analysis regardless of path to an instruction contents of operand stack must have same number and types of items checks arguments of each bytecode check no local variables are accessed before assigned makes sure fields are assigned values of proper type

Java Native Interface (JNI) Allows java code and native code to interoperate access legacy code, system calls from Java access Java API from native functions each side compiles to its own binary format different java and native stacks maintained arguments can be passed; values/exceptions returned

JNI Schematic

Java Garbage collector Provides implicit heap object space reclamation policy. Collects objects that have all their references removed or destroyed. Invoked at regular intervals, or when low on memory. Root set point to objects in heap Objects not reachable from root set are garbage

Java Garbage Collector

Types of Garbage collectors Reference count collectors keep a count of the number of references to each object Tracing collectors using the root set of references

Mark and Sweep Collectors Basic tracing collector start with root set of references trace and mark all reachable objects sweep through heap collecting marked objects Advantages does not require moving object/pointers Disadvantages garbage objects combined into a linked list leads to fragmentation segregated free-lists can be used consolidation of free space can improve efficiency

Compacting Collector Make free space contiguous multiple passes through heap lot of object movement many pointer updates

Copying Collector Divide heap into halves collect when one half full copy into unused half during sweep phase Reduces passes through heap Wastes half the heap

Simplifying Pointer Updates

Generational Collectors Reduce number of objects moved during each collection cycle. Exploit the bi-modal distribution of object lifetimes. Divide heap into two sub-heaps nursery, for newly created objects tenured, for older objects Collect a smaller portion of the heap each time. Stop-the-world collectors time consuming, long pauses unsuitable for real-time applications

Concurrent Collectors GC concurrently with application execution partially collected heap may be unstable synchronization needed between the application (mutator) and the collector

Java Byte Code Emulation Interpretation simple, fast startup, slow steady-state Just-In-Time (JIT) compilation compile each method on first invocation simple optimizations, slow startup, fast steady-state Hot-spot compilation compile frequently executed code can apply more aggressive optimizations moderate startup, fast steady-state