Executing Legacy Applications on a Java Operating System

Similar documents
Untyped Memory in the Java Virtual Machine

High-Level Language VMs

Introduction to Virtual Machines. Michael Jantz

Darek Mihocka, Emulators.com Stanislav Shwartsman, Intel Corp. June

Full file at

The Slide does not contain all the information and cannot be treated as a study material for Operating System. Please refer the text book for exams.

CSE 237B Fall 2009 Virtualization, Security and RTOS. Rajesh Gupta Computer Science and Engineering University of California, San Diego.

Interaction of JVM with x86, Sparc and MIPS

Runtime Defenses against Memory Corruption

BEAMJIT, a Maze of Twisty Little Traces

CSc 453 Interpreters & Interpretation

Sista: Improving Cog s JIT performance. Clément Béra

Baggy bounds with LLVM

Constructing Virtual Architectures on Tiled Processors. David Wentzlaff Anant Agarwal MIT

Trace Compilation. Christian Wimmer September 2009

C03c: Linkers and Loaders

Notes of the course - Advanced Programming. Barbara Russo

SABLEJIT: A Retargetable Just-In-Time Compiler for a Portable Virtual Machine p. 1

Language Translation. Compilation vs. interpretation. Compilation diagram. Step 1: compile. Step 2: run. compiler. Compiled program. program.

Modern Buffer Overflow Prevention Techniques: How they work and why they don t

Virtual Machines and Dynamic Translation: Implementing ISAs in Software

Inject malicious code Call any library functions Modify the original code

Handling Self Modifying Code Using Software Dynamic Translation

For our next chapter, we will discuss the emulation process which is an integral part of virtual machines.

Operating Systems. Designed and Presented by Dr. Ayman Elshenawy Elsefy

Dynamic Translator-Based Virtualization

1 Little Man Computer

Parley: Federated Virtual Machines

CPE300: Digital System Architecture and Design

Virtualization. Dr. Yingwu Zhu

PESIT Bangalore South Campus Hosur road, 1km before Electronic City, Bengaluru Department of Electronics and Communication Engineering

Lecture 5: February 3

New Java performance developments: compilation and garbage collection

The A ssembly Assembly Language Level Chapter 7 1

Trace-based JIT Compilation

24-vm.txt Mon Nov 21 22:13: Notes on Virtual Machines , Fall 2011 Carnegie Mellon University Randal E. Bryant.

H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon. School of Electrical Engineering and Computer Science Seoul National University, Korea

Just-In-Time Compilers & Runtime Optimizers

JamaicaVM Java for Embedded Realtime Systems

A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler

stack Two-dimensional logical addresses Fixed Allocation Binary Page Table

CHAPTER 16 - VIRTUAL MACHINES

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui

On the Design of the Local Variable Cache in a Hardware Translation-Based Java Virtual Machine

Improving Mobile Program Performance Through the Use of a Hybrid Intermediate Representation

BlackBox. Lightweight Security Monitoring for COTS Binaries. Byron Hawkins and Brian Demsky University of California, Irvine, USA

Intermediate Representations

Assembly Language. Lecture 2 - x86 Processor Architecture. Ahmed Sallam

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

OS Virtualization. Why Virtualize? Introduction. Virtualization Basics 12/10/2012. Motivation. Types of Virtualization.

PROCESS VIRTUAL MEMORY. CS124 Operating Systems Winter , Lecture 18

CSCE 410/611: Virtualization!

Learning Outcomes. Extended OS. Observations Operating systems provide well defined interfaces. Virtual Machines. Interface Levels

Advanced Computer Architecture

Fault Tolerant Java Virtual Machine. Roy Friedman and Alon Kama Technion Haifa, Israel

Chapter Overview. Assembly Language for Intel-Based Computers, 4 th Edition. Chapter 1: Basic Concepts. Printing this Slide Show

The NetRexx Interpreter

Evolution of Virtual Machine Technologies for Portability and Application Capture. Bob Vandette Java Hotspot VM Engineering Sept 2004

Introduction to Virtual Machines

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler

An Introduction to Python (TEJ3M & TEJ4M)

Assembly Language. Lecture 2 x86 Processor Architecture

5.b Principles of I/O Hardware CPU-I/O communication

Lecture 9 Dynamic Compilation

From High Level to Machine Code. Compilation Overview. Computer Programs

Optimization of Vertical and Horizontal Beamforming Kernels on the PowerPC G4 Processor with AltiVec Technology

A program execution is memory safe so long as memory access errors never occur:

FAWN as a Service. 1 Introduction. Jintian Liang CS244B December 13, 2017

Computer Systems A Programmer s Perspective 1 (Beta Draft)

Optimization Techniques

Run-time Program Management. Hwansoo Han

Operating- System Structures

Exploiting Hardware Resources: Register Assignment across Method Boundaries

A Feasibility Study for Methods of Effective Memoization Optimization

Virtual Machines. 2 Disco: Running Commodity Operating Systems on Scalable Multiprocessors([1])

CHAPTER 2: SYSTEM STRUCTURES. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

Introduction to Java Programming

Chapter 2: System Structures

Buffer Overflows: Attacks and Defenses for the Vulnerability of the Decade Review

Operating Systems. Operating System Structure. Lecture 2 Michael O Boyle

AOT Vs. JIT: Impact of Profile Data on Code Quality

Fig 7.30 The Cache Mapping Function. Memory Fields and Address Translation

Emulation. Michael Jantz

Chapter 2: Operating-System Structures

Towards future method hotness prediction for Virtual Machines

Operating System: Chap2 OS Structure. National Tsing-Hua University 2016, Fall Semester

Intel PRO/1000 PT and PF Quad Port Bypass Server Adapters for In-line Server Appliances

LECTURE 2. Compilers and Interpreters

Enterprise Architect. User Guide Series. Profiling

Enterprise Architect. User Guide Series. Profiling. Author: Sparx Systems. Date: 10/05/2018. Version: 1.0 CREATED WITH

! Software ( kernel ) that runs at all times. ! OS performs two unrelated functions: Maria Hybinette, UGA. user! 1! Maria Hybinette, UGA. user! n!

Method-Level Phase Behavior in Java Workloads

Chapter 1 GETTING STARTED. SYS-ED/ Computer Education Techniques, Inc.

The Use of Cloud Computing Resources in an HPC Environment

Chapter 2 Operating-System Structures

File System Internals. Jo, Heeseung

Practical Malware Analysis

The Assembly Language Level. Chapter 7

PRESENTED BY: SANTOSH SANGUMANI & SHARAN NARANG

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

Transcription:

Executing Legacy Applications on a Java Operating System Andreas Gal, Michael Yang, Christian Probst, and Michael Franz University of California, Irvine {gal,mlyang,probst,franz}@uci.edu May 30, 2004 Abstract One of the biggest disadvantages of using type-safe languages and frameworks such as Java for the implementation of commercial operating system (OS) platforms is the lack of backward compatibility with legacy execution environments. Traditional Java operating systems are designed to execute Java programs. They struggle to support established application programming interfaces such as POSIX or Win32 as these are founded on the idea of direct execution of native machine code. Instead of using the host processor to execute legacy applications which would eliminate most of the security and portability benefits of using a type-safe language for the OS implementation in the first place we propose using a virtual processor implemented on top of the virtual machine infrastructure to execute legacy applications. 1 Rationale The recent surge in security exploits for the Microsoft Windows platform [5] demonstrates the need for foundationally safe operating systems (OSs) in internet-based computing. Language-based security is a promising new approach to rid applications from buffer-overflows, format string attacks and similar vulnerabilities. Currently, type-safe languages such as Java or C# are predominantly used in application code. Operating systems continue to be implemented in unsafe languages such as C or C++. To offer a viable alternative to the established operating system platforms Windows, Linux, etc., Java-based OSs need to learn to support the execution of legacy applications. Existing implementations such as JavaOS [6], JX [3], or JOS [4] implement the majority of OS code in type-safe Java code, but allow to execute Java applications only. Emulating Java-based Application Programming Interfaces (APIs) on top of a Java OS is simple and has been successfully demonstrated by many Java-based OS implementations. However, the vast majority of existing applications are written in type-unsafe languages and many of them are available as target-machine specific executables only. Being able to execute such applications on top of a Java-based OS would allow users to switch to a Java-based platform even before all applications have been ported. 1

JVM JIT translator execution application (host CPU) JVM virtual CPU machine code Figure 1: Executing native machine code on top of a Java Virtual Machine (JVM). The machine code is translated to Java byte-code and then forwarded to the JVM for execution. While the JVM will initially interpret the byte-code, the JIT compiler available in most JVMs will select often referenced byte-code fragments and compile them to directly executable machine code for the host CPU. Thus, for relevant hot spots, the proposed architecture is able to translate target machine code to directly executable host machine code via generating portable Java byte-code. 2 Performance Full system virtualization is a challenge and a number of optimizations have to be applied to achieve acceptable performance. Simple interpretation-based processor emulators such as the PowerPC750 Simulator [2] usually suffer from a slowdown from anywhere between 60 to 500 times in comparison to native execution. In contrast to byte-code that has been designed for interpretation, machine code instructions are often expensive to execute in software as they contain optional computations that are essentially free if done in hardware, but costly if performed step-by-step in software. A well-known example for this is the flag register of the Intel 386 CPU. While nearly all arithmetic instructions update the flag registers, less than 10% of the computed flag values are actually evaluated. In contrast to hardware, a software simulator has to pay a hefty price if redundant flag calculations need to be performed. To avoid this overhead, full system virtualizers such as VirtualPC [1] compile the target machine code to host machine code and optimize the host machine code by removing any redundant instructions. The same principle used by VirtualPC can be applied when executing in a Java environment. Instead of trying to build an interpreter in Java which would be even slower than existing interpreters written in C or C++, the machine code is instead translated to Java byte-code (Figure 1) and then forwarded to the Java Virtual Machine (JVM) for execution. While the JVM will initially interpret the byte-code, the JIT compiler available in most JVMs will select often referenced byte-code fragments and compile them to directly executable machine code for the host CPU. Thus, for relevant hot spots, the proposed architecture is able to translate target machine code to directly executable host machine code via generating portable Java byte-code. In contrast to existing emulators such as VirtualPC, the code translator depends on the 2

target system only and can generate efficient host machine code by relying on the JIT compiler as back-end for the code generation. 3 Implementation We are currently implementing a research prototype for the proposed legacy machine code execution framework. The prototype is able to execute statically linked Linux ELF binaries for PowerPC on top of a standard Java VM. A significant hurdle we had to overcome was the lack of appropriate reflection and dynamic code generation capabilities in Java. In fact, when dynamically generating Java byte-code on the fly, our framework has to emit Java class-files and load them into the VM using the standard class loader. This process causes a noticeable startup delay, especially for small applications that contain large amounts of run-once startup code. To counter this problem, we decided to add additionally to the code translator a direct interpreter. While the interpreter is many times slower than the execution of translated byte-code, it does not have any startup latency. During interpretation, a monitor detects basic block boundaries by recording all branch targets (basic block entries) and branch instructions (basic block exits) in a hash-table. If the interpreter detects that a basic block is executed repeatedly, the basic block is delegated to the code translator to be compiled to byte-code. The code translator compiles basic blocks of to methods of dynamically generated Java classes. Each of these methods takes an object holding the current CPU state (i.e. registers and flags) as parameter and returns the value of the program counter upon completion of the basic block execution. Using basic-block based code translation, we are able to achieve a speed-up of factor 2 in comparison to pure interpretation by an optimized emulator implemented in C. However, this still represents an overall slowdown of factor 30 in comparison to direct execution of the code by a hardware CPU. The biggest single cost factor, once all relevant hot spots have been translated to byte-code and successively to by the JIT, is the interpreter loop that is executed every time a basic block is exited (Figure 2). Each compiled basic block returns the new value of the program counter (PC) as return value. The interpreter loop uses a hash-table to locate the Java method that holds the corresponding byte-code to that location in the. Besides the cost for the hash-table lookup, the JIT compiler is also not able to optimize across basic block boundaries, as all the invocation of basic block methods through the hash-table are virtual. To further boost the performance of the emulator, this overhead has to be eliminated. For this, we are currently implementing a super-block based code translator that is not limited to mapping single basic blocks to Java methods, but tries to fit code located in frequently referenced basic blocks into the same Java method (Figure 2). Using this approach, branch instructions targeting basic blocks that reside in the same Java methods can be implemented using the Java goto instruction. Not only is the cost for this instruction much lower than exiting the method and going through the hash-table, but this also allows the JIT to detect loop headers and optimize the code accordingly. Our prototype system is not yet able to perform super-block based code translation on non-trivial code. However, preliminary benchmarks indicate that we will be able 3

A interpreter loop interpreter loop 1x code hashtable code hashtable B 1000x C 1000x A B C super block basic block based translation A B C super block based translation Figure 2: Basic block and super-block based translation of a loop in native machine code to Java byte-code. The basic-block based approach has to go through a hash-table twice per iteration to locate the successor basic block. The super-block based approach fits often referenced basic blocks into the same Java method and branch instructions can be modeled using the Java goto operation at much lower cost. Only references to external basic blocks still go through the hash-table. to reduce the slowdown in comparison to direct hardware execution to a factor of 3 to 15 depending on the type of application. The best performance can be expected for computational-intensive applications. Memory-intensive applications in contrast suffer significantly from the fact that our system has to emulated untyped memory on top of a safe virtual machine. 4 Conclusions and Future Work Java-based operating systems have not gained much momentum in the real world. We believe that this is partially caused by the lack of backward-compatibility to legacy execution environments and APIs. We have presented an approach that allows to emulate any target machine on top of a type-safe virtual machine. While we expect a significant slowdown when executing legacy applications, we believe that the proposed approach is still feasible for real-world applications. As far as future work is concerned, we plan to port our prototype system to the Microsoft.NET virtual machine, which has better capabilities in regards to dynamic code generation. Most importantly,.net has a standard API to dynamically generate new types and methods whereas Java requires the application to manually assemble class-files that are then loaded into the JVM through the class loader. We will also explore the feasibility of adding support for untyped memory blocks to the.net virtual machine to allow a more efficient emulation of machine code instructions. As the same functionality can be achieved through an equivalent implementation using existing VM constructs, we do not consider such a new language constructor to be a threat to the security of the VM. 4

References [1] Connectix Corporation. The Technology of Virtual Machines. 2001. [2] S. Gibral, G. Mouchard, A. Cohen, and O. Temam. PowerPC 750 Simulator, 2004. http://www.microlib.org. [3] M. Golm, M. Felser, C. Wawersich, and J. Kleinoeder. The JX Operating System. In Proceedings of the 2002 USENIX Annual Technical Conference, pages 45 58, Monterey, CA, June 2002. [4] G. Herschberger, T. Miller, and R. Fitzsimons. The JOS Java Operating System, 2004. http://sourceforge.net/projects/jos. [5] Microsoft Cooperation. Microsoft Security Bulletins, 2004. http://www.microsoft.com/technet/security/currentdl.aspx. [6] C. Sasaki. JavaOS Architecture. In Proceedings of the 1997 JavaOne World Wide Java Developer Conference, San Fransisco, CA, Apr. 1997. 5