CSc 372 Comparative Programming Languages. Getting started... Getting started. B: Java Bytecode BCEL

Similar documents
CSc 620 Debugging, Profiling, Tracing, and Visualizing Programs. Getting started... Getting started. 2 : Java Bytecode BCEL

CSc 620 Debugging, Profiling, Tracing, and Visualizing Programs

Hacking SandMark. Christian Collberg. January 28, 2003

Under the Hood: The Java Virtual Machine. Problem: Too Many Platforms! Compiling for Different Platforms. Compiling for Different Platforms

Compiling Techniques

Under the Hood: The Java Virtual Machine. Lecture 23 CS2110 Fall 2008

Introduction to Programming Using Java (98-388)

Java: framework overview and in-the-small features

CSC Java Programming, Fall Java Data Types and Control Constructs

Compiling for Different Platforms. Problem: Too Many Platforms! Dream: Platform Independence. Java Platform 5/3/2011

Java Bytecode (binary file)

Problem: Too Many Platforms!

The Java Virtual Machine. CSc 553. Principles of Compilation. 3 : The Java VM. Department of Computer Science University of Arizona

Java Class Loading and Bytecode Verification

Java Security. Compiler. Compiler. Hardware. Interpreter. The virtual machine principle: Abstract Machine Code. Source Code

Outline. Parts 1 to 3 introduce and sketch out the ideas of OOP. Part 5 deals with these ideas in closer detail.

Over-view. CSc Java programs. Java programs. Logging on, and logging o. Slides by Michael Weeks Copyright Unix basics. javac.

Lecture Set 4: More About Methods and More About Operators

Getting started with Java

JVML Instruction Set. How to get more than 256 local variables! Method Calls. Example. Method Calls

CSc 520 Principles of Programming Languages

CSC 1214: Object-Oriented Programming

Array. Prepared By - Rifat Shahriyar

Loops. CSE 114, Computer Science 1 Stony Brook University

Plan for Today. Safe Programming Languages. What is a secure programming language?

Static Program Analysis

CSC 4181 Handout : JVM

CS2110 Fall 2011 Lecture 25. Under the Hood: The Java Virtual Machine, Part II

Course information. Petr Hnětynka 2/2 Zk/Z

Tools : The Java Compiler. The Java Interpreter. The Java Debugger

Computer Programming, I. Laboratory Manual. Final Exam Solution

Java Programming Language Mr.Rungrote Phonkam

CSc 553. Principles of Compilation. 2 : Interpreters. Department of Computer Science University of Arizona

Course information. Petr Hnětynka 2/2 Zk/Z

High-Level Language VMs

Object-oriented programming. and data-structures CS/ENGRD 2110 SUMMER 2018

Compiler construction 2009

COMP 250: Java Programming I. Carlos G. Oliver, Jérôme Waldispühl January 17-18, 2018 Slides adapted from M. Blanchette

Object-Oriented Programming

Java Identifiers, Data Types & Variables

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Let s make some Marc R. Hoffmann Eclipse Summit Europe

INDEX. A SIMPLE JAVA PROGRAM Class Declaration The Main Line. The Line Contains Three Keywords The Output Line

An overview of Java, Data types and variables

CS313D: ADVANCED PROGRAMMING LANGUAGE

CSE 431S Final Review. Washington University Spring 2013

Background. Reflection. The Class Class. How Objects Work

SOFTWARE ARCHITECTURE 7. JAVA VIRTUAL MACHINE

CSc 453. Compilers and Systems Software. 18 : Interpreters. Department of Computer Science University of Arizona. Kinds of Interpreters

CS171:Introduction to Computer Science II

2 rd class Department of Programming. OOP with Java Programming

Lecture Set 4: More About Methods and More About Operators

Computer Components. Software{ User Programs. Operating System. Hardware

JoeQ Framework CS243, Winter 20156

Lecture #6-7 Methods

Part VII : Code Generation

Class, Variable, Constructor, Object, Method Questions

Lecture 14 CSE11 Fall 2013 For loops, Do While, Break, Continue

Object Oriented Programming: In this course we began an introduction to programming from an object-oriented approach.

Full file at

CS111: PROGRAMMING LANGUAGE II

Programming II (CS300)

Course Overview. PART I: overview material. PART II: inside a compiler. PART III: conclusion

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

CSE 201 JAVA PROGRAMMING I. Copyright 2016 by Smart Coding School

BIT Java Programming. Sem 1 Session 2011/12. Chapter 2 JAVA. basic

Expressions and Data Types CSC 121 Spring 2015 Howard Rosenthal

Java Primer. CITS2200 Data Structures and Algorithms. Topic 2

CS 11 java track: lecture 1

301AA - Advanced Programming [AP-2017]

Today. Instance Method Dispatch. Instance Method Dispatch. Instance Method Dispatch 11/29/11. today. last time

Fundamentals of Programming Data Types & Methods

Lecture 2. COMP1406/1006 (the Java course) Fall M. Jason Hinek Carleton University

Java Instrumentation for Dynamic Analysis

CMSC131. Introduction to your Introduction to Java. Why Java?

Interpreted vs Compiled. Java Compile. Classes, Objects, and Methods. Hello World 10/6/2016. Python Interpreted. Java Compiled

Introduction to Java

In this lab we will practice creating, throwing and handling exceptions.

PROGRAMMING FUNDAMENTALS

1 Shyam sir JAVA Notes

Programming II (CS300)

JVM. What This Topic is About. Course Overview. Recap: Interpretive Compilers. Abstract Machines. Abstract Machines. Class Files and Class File Format

CSCE 314 Programming Languages

AP COMPUTER SCIENCE JAVA CONCEPTS IV: RESERVED WORDS

Classes Basic Overview

School of Informatics, University of Edinburgh

Computer Science II (20082) Week 1: Review and Inheritance

Parsing Scheme (+ (* 2 3) 1) * 1

Zheng-Liang Lu Java Programming 45 / 79

CS 211: Methods, Memory, Equality

COMP 250 Winter 2011 Reading: Java background January 5, 2011

301AA - Advanced Programming

Java Exceptions Java, winter semester

Expressions & Flow Control

Object oriented programming. Instructor: Masoud Asghari Web page: Ch: 3

Outline. Overview. Control statements. Classes and methods. history and advantage how to: program, compile and execute 8 data types 3 types of errors

Crash Course Review Only. Please use online Jasmit Singh 2

CSc 453 Interpreters & Interpretation

Expressions and Data Types CSC 121 Fall 2015 Howard Rosenthal

2. Introducing Classes

Transcription:

BCEL CSc 372 Comparative Programming Languages B: Java Bytecode BCEL BCEL (formerly JavaClass) allows you to load a class, iterate through the methods and fields, change methods, add new methods and fields, etc. BCEL also makes it easy to create new classes from scratch. Christian Collberg collberg@cs.arizona.edu Department of Computer Science University of Arizona Copyright c 2004 Christian Collberg [1] 372 [2] Getting started Getting started... To open a class file for editing you do the following: de.fub.bytecode.classfile.classparser p = new de.fub.bytecode.classfile.classparser(classfile); de.fub.bytecode.classfile.javaclass jc = p.parse() de.fub.bytecode.generic.classgen cg = new de.fub.bytecode.generic.classgen(jc); de.fub.bytecode.generic.constantpoolgen cp = new de.fub.bytecode.generic.constantpoolgen( jc.getconstantpool()); A de.fub.bytecode.classfile.javaclass-object represents the parsed class file. You can query this object to get any information you need about the class, for example its methods, fields, code, super-class, etc. A de.fub.bytecode.generic.classgen-object represents a class that can be edited. You can add methods and fields, you can modify existing methods, etc. A de.fub.bytecode.generic.constantpoolgen-object represents a constant pool to which new constants can be added.

Finishing up Editing a Method When you are finished editing the class you must close and save it to the new class file: de.fub.bytecode.classfile.javaclass jc1 = cg.getjavaclass(); jc1.setconstantpool(cp.getfinalconstantpool()); jc1.dump(classfile1); de.fub.bytecode.classfile.method represents a method that has been read from a class file. de.fub.bytecode.generic.methodgen is the BCEL class that represents a method being edited: de.fub.bytecode.generic.constantpoolgen cp =...; de.fub.bytecode.generic.classgen cg =...; de.fub.bytecode.classfile.method method = cg.containsmethod(methodname, methodsignature); de.fub.bytecode.generic.methodgen mg = new de.fub.bytecode.generic.methodgen(method,cname,cp); [5] 372 [6] Editing Instructions Editing Instructions... Once you have opened a method for editing you can get its list of instructions: de.fub.bytecode.generic.methodgen mg =...; de.fub.bytecode.generic.instructionlist il = mg.getinstructionlist(); de.fub.bytecode.generic.instructionhandle[] ihs = il.getinstructionhandles(); for(int i=0; i < ihs.length; i++) { de.fub.bytecode.generic.instructionhandle ih = ihs[i]; de.fub.bytecode.generic.instruction instr = ih.getinstruction(); de.fub.bytecode.generic.instructionlists can be manipulated by appending or inserting new instruction, deleting instructions, etc. All bytecode instructions in BCEL are represented by their own class. de.fub.bytecode.generic.iadd is the class that represents integer addition, for example. This allows us to use instanceof to classify instructions.

Changing an Instruction Local variables de.fub.bytecode.generic.instruction inst =...; de.fub.bytecode.generic.instructionhandle ih =...; if (inst instanceof de.fub.bytecode.generic.invokestatic){ de.fub.bytecode.generic.invokestatic call = (de.fub.bytecode.generic.invokestatic) inst; String classname = call.getclassname(cp); String methodname = call.getmethodname(cp); String methodsig = call.getsignature(cp); A Java method can have a maximum of 256 local variables. (Actually 65536). In a virtual (non-static) method local variable zero ( slot #0 ) is this. Formal arguments take up the first slots. In other words, in a virtual method with 2 formal parameters, slot #0 is this, slot #1 is the first formal, slot #2 is the second formal, and any local variables are slots #3,#4,...,#255. ih.setinstruction(new de.fub.bytecode.generic.nop()); setinstruction replaces an instruction with a new one. [9] 372 [10] Local variables... Local variables... Local variables can be reused within a method. A clever compiler will allocate both x and y to the same slot, since they have non-overlapping lifetimes. For example, consider the following method: void P() { int x= 5; System.out.println(x); float y=5.6; System.out.println(y); BCEL requires that we indicate where a new local variable is accessible: de.fub.bytecode.generic.type T =...; de.fub.bytecode.generic.localvariablegen lg = mg.addlocalvariable(localname, T, null, null); int localindex = lg.getindex(); // push something here de.fub.bytecode.generic.instruction store = new de.fub.bytecode.generic.astore(localindex); de.fub.bytecode.generic.instructionhandle start =...; lg.setstart(start);

Local variables... Example 1: Creating a New Class de.fub.bytecode.generic.localvariablegen creates a new local variable. We start these locations out as null (unknown) and fill them in using the setstart method. It takes an de.fub.bytecode.generic.instructionhandle as input, namely the first instruction where the variable should be visible. de.fub.bytecode.generic.localvariablegen lg = mg.addlocalvariable(localname, T, null, null);... de.fub.bytecode.generic.instructionhandle start =...; lg.setstart(start); We create a new de.fub.bytecode.generic.classgen object representing the class. int flags = de.fub.bytecode.constants.acc_public; String class_name = "MyClass"; String file_name = "MyClass.java"; String super_class_name = "java.lang.object"; String[] interfaces = {; de.fub.bytecode.generic.classgen cg = new de.fub.bytecode.generic.classgen( class_name, super_class_name, file_name, flags, interfaces); [13] 372 [14] Example 1... Example 1 Creating a Method We can then add a field field1 of type int to the new class: de.fub.bytecode.generic.constantpoolgen cp = cg.getconstantpool(); de.fub.bytecode.generic.type field_type = de.fub.bytecode.generic.type.int; String field_name = "field1"; de.fub.bytecode.generic.fieldgen fg = new de.fub.bytecode.generic.fieldgen( access_flags, field_type, field_name, cp); cg.addfield(fg.getfield()); int method_access_flags = de.fub.bytecode.constants.acc_public de.fub.bytecode.constants.acc_static; de.fub.bytecode.generic.type return_type = de.fub.bytecode.generic.type.void; de.fub.bytecode.generic.type[] arg_types = de.fub.bytecode.generic.type.no_args; String[] arg_names = {;

Example 1 Creating a Method... Example 1 Creating a Method... String method_name = "method1"; String class_name = "MyClass"; de.fub.bytecode.generic.instructionlist il = new de.fub.bytecode.generic.instructionlist(); il.append( de.fub.bytecode.generic.instructionconstants.return); de.fub.bytecode.generic.methodgen mg = new de.fub.bytecode.generic.methodgen( method_access_flags, return_type, arg_types, arg_names, method_name, class_name, il, cp); The method has no arguments, returns void, and contains only one instruction, a return. Note the call to mg.setmaxstack(). With every method in a class file is stored the maximum stack-size is needed to execute the method. For example, a method whose body consists of the instructions iconst 1, iconst 2, iadd, pop (i.e. compute 1 + 2 and throw away the result) will have max-stack set to two. We can either compute this ourselves or let BCEL s mg.setmaxstack() do it for us. mg.setmaxstack(); cg.addmethod(mg.getmethod()); [17] 372 [18] Branches Branches... As is always the case with branches, we need to be able to handle forward jumps. The typical way of accomplishing this is to create a branch with unspecified target: de.fub.bytecode.generic.instructionlist il =...; de.fub.bytecode.generic.ifnull branch = new de.fub.bytecode.generic.ifnull(null); il.append(branch);... When we have gotten to the location we want to jump to we add a (bogus) NOP instruction which will serve our branch target. The append instruction returns a handle to the NOP and we use this handle to set the target of the branch: de.fub.bytecode.generic.instructionhandle h = il.append(new de.fub.bytecode.generic.nop()); branch.settarget(h);

Exceptions Exceptions... Here is the generic code for building a try-catch-block: de.fub.bytecode.generic.instructionlist il =...; de.fub.bytecode.generic.methodgen mg =...; String exception = "java.lang.exception"; de.fub.bytecode.generic.instructionhandle start_pc = il.append(new de.fub.bytecode.generic.nop()); de.fub.bytecode.generic.instructionhandle end_pc = il.append(branch); // Pop exception off stack when entering catch block. de.fub.bytecode.generic.instructionhandle handler_pc = il.append(new de.fub.bytecode.generic.pop()); // The code that builds the catch-body goes here. // The code that builds the try-body goes here. // Code to jump out of the try-block: de.fub.bytecode.generic.goto branch = new de.fub.bytecode.generic.goto(null); // Add a NOP after the exception handler. This is // where we will jump after we re through with the // try-block. de.fub.bytecode.generic.instructionhandle next_pc = il.append(new de.fub.bytecode.generic.nop()); branch.settarget(next_pc); [21] 372 [22] Exceptions... Types de.fub.bytecode.generic.objecttype catch_type = new de.fub.bytecode.generic.objecttype(exception); de.fub.bytecode.generic.codeexceptiongen eg = mg.addexceptionhandler( start_pc, end_pc, handler_pc, catch_type); start pc and end pc hold the beginning and the end of the try-body, respectively. handler pc holds the address of the beginning of the catch-block. BaseType Type Interpretation B byte signed 8-bit integer C char Unicode character D double 64-bit floating point value F float 32-bit floating point value I int 32-bit integer J long 64-bit integer L<classname>; reference instance of class <classname> S short signed 16-bit integer Z boolean true pr false [ reference one array dimension V (BaseType*)BaseType void method descriptor

Types... Types... A source-level Java type is encoded into a classfile type descriptor using the definitions on the previous slide. Types (such as method signatures) are defined by the class de.fub.bytecode.generic.type: de.fub.bytecode.generic.type return_type = de.fub.bytecode.generic.type.void; de.fub.bytecode.generic.type[] arg_types = new de.fub.bytecode.generic.type[] { ; new de.fub.bytecode.generic.arraytype( de.fub.bytecode.generic.type.string, 1) BCEL has quite a number of methods that convert back and forth between Java source format java.lang.object[] bytecode format [Ljava/lang/Object; and BCEL s internal format de.fub.bytecode.generic.type. de.fub.bytecode.generic.type.gettype converts from Java bytecode format to Java source format: String S = "[Ljava/lang/Object;"; de.fub.bytecode.generic.type T = de.fub.bytecode.generic.type.gettype(s); System.out.println(T); // ==> "java.lang.object[]". [25] 372 [26] Types... Types... The method de.fub.bytecode.classfile.utility.getsignature converts a type in Java bytecode format to Java source format: String type = "java.lang.object[]"; String S = de.fub.bytecode.classfile.utility.getsignature(type); System.out.println(S); ==> [Ljava/lang/Object; The methods de.fub.bytecode.generic.type.getargumenttypes and de.fub.bytecode.generic.type.getreturntype take a type in Java bytecode format as argument and extract the array of argument types and the return type, respectively: de.fub.bytecode.generic.type.getmethodsignature converts the return and argument types back to a Java bytecode type string.

Types... Types... String S = "(Ljava/lang/String;I)V;"; de.fub.bytecode.generic.type[] arg_types = de.fub.bytecode.generic.type.getargumenttypes(s); de.fub.bytecode.generic.type return_type = de.fub.bytecode.generic.type.getreturntype(s); String M = de.fub.bytecode.generic.type.getmethodsignature( return_type, arg_types); The method de.fub.bytecode.generic.type.getsignature converts a BCEL type to the equivalent Java bytecode signature. This code de.fub.bytecode.generic.type T = de.fub.bytecode.generic.type.stringbuffer; String M = T.getSignature(); System.out.println(M); will print out Ljava/lang/StringBuffer;. [29] 372 [30] Static Method Calls Static Method Calls... To make a static method call we push the arguments on the stack, create a constant pool reference to the method, and then generate an INVOKESTATIC opcode that performs the actual call: String classname =...; String methodname =...; String signature =...; de.fub.bytecode.generic.instructionlist il =...; de.fub.bytecode.generic.constantpoolgen cp =...; // Generate code that pushes the actual // arguments of the call. int index = cp.addmethodref(classname,methodname,signature); de.fub.bytecode.generic.invokestatic call = new de.fub.bytecode.generic.invokestatic(index); il.append(call);

Virtual Method Calls Example 2: List.java Making a virtual call is similar, except that we need an object to make the call through. This object reference is pushed on the stack prior to the arguments: // Generate code pushing the object on the stack. // Generate code pushing the actual arguments. int index = cp.addmethodref(classname,methodname,signature); de.fub.bytecode.generic.invokevirtual s = new de.fub.bytecode.generic.invokevirtual(index); il.append(call); public class List { public static void main(string args[]) throws java.io.ioexception { String classfile = args[0]; String classname = classfile.substring( 0,classFile.length()-".class".length()); de.fub.bytecode.classfile.classparser p = new de.fub.bytecode.classfile.classparser(classfile); de.fub.bytecode.classfile.javaclass jc = p.parse(); de.fub.bytecode.generic.classgen cg = new de.fub.bytecode.generic.classgen(jc); [33] 372 [34] Example 2: List.java Example 2: List.java de.fub.bytecode.classfile.constantpool cp = jc.getconstantpool(); de.fub.bytecode.generic.constantpoolgen cpg = new de.fub.bytecode.generic.constantpoolgen(cp); de.fub.bytecode.classfile.method[] methods = cg.getmethods(); for(int m=0; m<methods.length; m++) { de.fub.bytecode.generic.methodgen mg = new de.fub.bytecode.generic.methodgen( methods[m],classname,cpg); System.out.println("\nMETHOD: " + mg.getclassname() + "." + mg.getname() + ":" + mg.getsignature()); de.fub.bytecode.generic.instructionlist il = mg.getinstructionlist(); de.fub.bytecode.generic.instructionhandle[] ihs = il.getinstructionhandles(); int pc = 0; for(int i=0; i < ihs.length; i++) { de.fub.bytecode.generic.instructionhandle ih = ihs[i]; de.fub.bytecode.generic.instruction instr = ih.getinstruction(); System.out.println(pc + " : " + instr.tostring(cp)); pc += instr.getlength();

Readings and References BCEL is here: http://jakarta.apache.org/bcel/index.html. There is a manual and an on-line API description at http://bcel.sourceforge.net/docs/index.html. [37]