ByCounter: Portable Runtime Counting of Bytecode Instructions and Method Invocations


ByteCode 2008

Michael Kuperberg 1, Martin Krogmann 2, Ralf Reussner 3

Chair for Software Design and Quality
Institute for Program Structures and Data Organisation
Faculty of Informatics, Universität Karlsruhe (TH)

Abstract

For bytecode-based applications, runtime instruction counts can be used as a platform-independent application execution metric, and also can serve as the basis for bytecode-based performance prediction. However, different instruction types have different execution durations, so they must be counted separately, and method invocations should be identified and counted because of their substantial contribution to the total application performance. For Java bytecode, most JVMs and profilers do not provide such functionality at all, and existing bytecode analysis frameworks require expensive JVM instrumentation for instruction-level counting. In this paper, we present ByCounter, a lightweight approach for exact runtime counting of executed bytecode instructions and method invocations. ByCounter significantly reduces total counting costs by instrumenting only the application bytecode and not the JVM, and it can be used without modifications on any JVM. We evaluate the presented approach by successfully applying it to multiple Java applications on different JVMs, and discuss the runtime costs of applying ByCounter to these cases.

Key words: Java, bytecode, counting, portable, fine-grained

1 Introduction

The runtime behaviour of applications has functional aspects such as correctness, but also extra-functional aspects, such as performance. The runtime behaviour of a Java application can be described by analysing the execution of the application's Java bytecode instructions. Execution counts of these instructions are needed for bytecode-based performance prediction of Java applications [?], and also for dynamic bytecode metrics [?].
1 michael.kuperberg@informatik.uni-karlsruhe.de
2 martin.krogmann@informatik.uni-karlsruhe.de
3 reussner@ipd.uka.de

This paper is electronically published in Electronic Notes in Theoretical Computer Science URL:

As different instruction types have different execution durations, they must be counted separately. Also, method invocations should be identified due to the substantial contribution of methods to the total application performance. Thus, each method signature should have its own counter.

To obtain all these runtime counts, static analysis (i.e. without executing the application) could be used, but it would have to be augmented to evaluate runtime effects of control flow constructs like loops or branches. Even if control flow consideration is attempted with advanced techniques such as symbolic execution, additional effort is required for handling infinite symbolic execution trees [?, pp ]. Hence, it is often faster and easier to use dynamic (i.e. runtime) analysis for counting executed instructions and invoked methods.

However, dynamic counting of executed Java bytecode instructions is not offered by Java profilers or conventional Java Virtual Machines (JVMs). Existing program behaviour analysis frameworks for Java applications (such as JRAF [?]) do not differentiate between bytecode instruction types, do not identify method invocations performed from bytecode, or do not work at the level of bytecode instructions at all. These frameworks frequently rely on the instrumentation of the JVM; however, such instrumentation requires substantial effort and must be reimplemented for different JVMs.

The contribution of the paper is a novel approach for lightweight portable runtime counting of Java bytecode instructions and method invocations. Its implementation is called ByCounter and it works by instrumenting the application bytecode instead of instrumenting the JVM. Through this, ByCounter can be used with any JVM, and the instrumented application can be executed by any JVM, making the ByCounter approach truly portable.
Furthermore, ByCounter does not alter existing method signatures in instrumented classes, nor does it require wrappers, so the instrumentation does not lead to any structural changes in the existing application.

To make performance characterisation through bytecode counts more precise, runtime parameters of some bytecode instructions must be considered, as they can have significant impact on the performance of these instructions [?]. For these cases, ByCounter provides basic parameter recording (e.g. for the array-creating instructions), and it also offers extension hooks for the recording mechanism.

The presented approach is evaluated on two different Java virtual machines using applications that are subsets of three Java benchmarks. For these applications, our evaluation shows that despite accounting for individual bytecode instructions, the ByCounter overhead during the counting phase at runtime is reasonably low (between ca. 1% and 85% in all cases except one outlier), while instrumenting the bytecode requires < 0.3 s in all studied cases.

The paper is structured as follows: in Section 2, we outline the foundations of our approach. Section 3 provides an overview of how ByCounter works, while Section 4 describes its implementation. Section 5 presents our evaluation, before related work is presented in Section 6. Finally, we list our assumptions and limitations in Section 7 and conclude the paper in Section 8.

2 Foundations

Java bytecode is executed on the Java Virtual Machine (JVM), which abstracts the specific details of the underlying software/hardware platform, making compiled Java classes portable across different platforms for which JVMs are offered. Each JVM is supplied with a set of Java classes that form the (vendor-specific) implementation of the Java API.

In Java bytecode, four instructions are used to invoke Java methods, including those of the Java API: INVOKEINTERFACE, INVOKESPECIAL, INVOKESTATIC and INVOKEVIRTUAL (hereafter called INVOKE*). The signature of the invoked method appears as the parameter of the INVOKE* instruction, while the parameters of the invoked method are prepared on the stack before method invocation.

If an invoked method is part of the Java API, its implementation can be different across operating systems, as it may call platform-specific native methods (e.g. for file system access). To avoid platform-dependent counts, invocations of API and all other methods must initially be counted as they appear in the application's bytecode, without decomposing them into the elements of their implementation. This results in a flat view, which summarises the execution of the analysed method in a platform-independent way. If needed, counts for the invoked methods can be obtained using the same approach. Using this additional information, counts for the entire (expanded) calling tree of the analysed method can be computed, and such a stepwise approach promotes reuse of counting results.

For bytecode-based performance prediction, parameters of invoked methods, but also parameters of non-INVOKE* bytecode instructions can be significant, because they influence the execution speed of the instruction [?].
The latter parameters and their locations are described in the JVM specification [?]; for example, the MULTIANEWARRAY instruction is followed by the array element type and the array dimensionality directly in the bytecode, while the sizes of the individual array dimensions have to be prepared on the stack. Hence, in order to describe the runtime behaviour of programs as precisely as possible, the approach must be able to record such parameters. However, parameter recording slows down the execution of the instrumented methods, and parameters may be relevant only in specific cases and only for some instructions or methods.

As Java bytecode instructions or methods can have parameters of arbitrary object types, persistent parameter recording by simply saving the parameter value may be unreasonable, or even technically impossible. In such a case, a characterisation of the parameter object instance should be recorded: for a (custom) data structure, its size could be a suitable characterisation. Hence, to allow users to provide their own characterisations for Java classes of their choice, the approach must offer suitable extension hooks.

In the next section, we provide an overview of our implementation of this approach, and of how it handles the issues described in this section.
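The idea of user-supplied characterisations can be illustrated with a minimal sketch. The class CharacterisationRegistry and its methods register, characterise and withDefaults are hypothetical names chosen for this illustration; they are not part of ByCounter, and the sketch only assumes that a hook maps a parameter object to a small, storable value such as a size.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry of parameter characterisations (illustration only, not ByCounter's API). */
final class CharacterisationRegistry {
    // Maps a parameter type to a function that reduces an instance to a small, storable value.
    private final Map<Class<?>, Function<Object, Object>> hooks = new LinkedHashMap<>();

    /** Registers a characterisation for all instances of the given type. */
    <T> void register(Class<T> type, Function<? super T, ?> hook) {
        hooks.put(type, value -> hook.apply(type.cast(value)));
    }

    /** Returns the characterisation of a value, or its class name if no hook matches. */
    Object characterise(Object value) {
        if (value == null) {
            return "null";
        }
        for (Map.Entry<Class<?>, Function<Object, Object>> entry : hooks.entrySet()) {
            if (entry.getKey().isInstance(value)) {
                return entry.getValue().apply(value);
            }
        }
        return value.getClass().getName();
    }

    /** Example configuration: sizes and lengths stand in for the full parameter values. */
    static CharacterisationRegistry withDefaults() {
        CharacterisationRegistry registry = new CharacterisationRegistry();
        registry.register(String.class, String::length);
        registry.register(Collection.class, Collection::size);
        registry.register(int[].class, array -> array.length);
        return registry;
    }

    public static void main(String[] args) {
        CharacterisationRegistry registry = withDefaults();
        System.out.println(registry.characterise("bytecode"));       // length of the String
        System.out.println(registry.characterise(List.of(1, 2, 3))); // size of the List
        System.out.println(registry.characterise(new int[5]));       // length of the array
    }
}
```

Recording only such a characterisation keeps the stored data small and avoids having to serialise arbitrary objects, which matches the motivation given above.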

3 Overview of the ByCounter Process

[Fig. 1. Bytecode instrumentation and instruction counting using ByCounter. Phase "Instrument bytecode before execution": 1. Parse program bytecode; 2. Instrument parsed program representation (e.g. ILOAD is followed by IINC C1, IADD is followed by IINC C8); 3. Convert into executable bytecode. Phase "Execute instrumented bytecode": 4. Create parameters for class constructors + method invocations; 5. Replace original with instrumented bytecode classes; 6. Run instrumented bytecode, collect counting results (e.g. 27865*ILOAD, 11108*IADD).]

In Fig. 1, all 6 steps of the process performed by ByCounter are shown. Following these process steps, this section describes the design rationale behind ByCounter and explains important design decisions that were made. Further details of the ByCounter implementation and of executing the instrumented bytecode will be described in Section 4.

In step 1, ByCounter parses the existing bytecode class file into a navigable, structured representation, because direct manipulation of bytecode is very complex and error-prone. ByCounter uses the ASM bytecode engineering framework [?], which offers a bytecode class representation that includes semantic details (method signatures, fields, etc.). ASM's bytecode representation can be accessed and changed through the ASM API, which follows the visitor pattern. Using the ASM API, custom visitors can be created to add, change or delete the elements of the class representation down to the level of individual bytecode instructions.

In step 2, ByCounter inserts counting instrumentation into the bytecode representation using a special ASM class visitor that we have written. The basic principle behind the visitor is to add new counters to existing bytecode. Later, during the execution of the instrumented method, these counters will be initialised, incremented, evaluated and finally reported.
Step 2 has to be implemented with the following objectives in mind:
(i) it must not change the existing fields, variables, method signatures, class structure and execution semantics;
(ii) the instrumentation has to account for each instruction individually with as few additional instructions and as little overhead as possible; and
(iii) for methods with control flow constructs (such as loops) that depend on the input parameters, counts must be reported correctly for any execution path, i.e. for all allowed values of input parameters.

In step 3, the instrumented representation of the class is converted back into executable bytecode, which can be written into a class file and which can also be used by a ByCounter-provided ClassLoader. Step 3 concludes the instrumentation part and is followed by the execution part, which consists of steps 4 to 6.

In step 4, execution of the instrumented method is prepared. In any case, it is the responsibility of the user to provide the prerequisites for the execution of the instrumented class. If the instrumented class will be run in the same context as the uninstrumented one, the preparations are identical to those for executing the uninstrumented class, and should already be fulfilled by the existing context. However, for running the instrumented class and its methods in isolation, preparations would include providing the parameters for class construction/initialisation and for method invocation, as well as providing required services.

In step 5, the instrumented class must replace the original, uninstrumented class inside the deployed application. The most straightforward way to do this is to restart the application after replacing the original class bytecode with the new, instrumented version. If such a restart is not possible or not desirable, ByCounter can be connected to the JDK's Java Instrumentation API (java.lang.instrument package) or other tools that support class/method replacement and redefinition even after the application has started.

In step 6, the instrumented method is run to count and to report executed bytecode instructions and method invocations. If needed, this step can be repeated with different input parameters of the instrumented method. Step 6 is the final step performed by ByCounter, after which the original, uninstrumented class can be reinstated if needed.

4 Implementation of ByCounter

Following the overview in Fig. 1, the first step of ByCounter is to parse a Java class using the ASM framework [5]. Then, in step 2, ByCounter adds instrumentation for three tasks: for setup of the counting infrastructure, for counter incrementation and finally for reporting of the results.
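The class replacement of step 5 via the java.lang.instrument package can be sketched as follows. This is a minimal illustration under assumptions, not ByCounter's actual integration: the class ReplacingTransformer and the map of already-instrumented class bytes (standing for the output of steps 1 to 3) are hypothetical, while ClassFileTransformer and its transform method are the standard JDK interface.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch: swaps in already-instrumented class bytes when a class is loaded or redefined.
 * The replacement map is an assumption; it stands for the result of the instrumentation part.
 */
final class ReplacingTransformer implements ClassFileTransformer {
    // Keys use the JVM-internal name format with slashes, e.g. "demo/Linpack".
    private final Map<String, byte[]> instrumentedClasses;

    ReplacingTransformer(Map<String, byte[]> instrumentedClasses) {
        this.instrumentedClasses = new HashMap<>(instrumentedClasses);
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        if (className == null) {
            return null; // unnamed class: leave untouched
        }
        // Returning null tells the JVM to keep the original class bytes.
        return instrumentedClasses.get(className);
    }

    public static void main(String[] args) {
        byte[] placeholder = {1, 2, 3}; // placeholder bytes, not a real class file
        ReplacingTransformer transformer =
                new ReplacingTransformer(Map.of("demo/Linpack", placeholder));
        byte[] replaced = transformer.transform(null, "demo/Linpack", null, null, new byte[0]);
        byte[] untouched = transformer.transform(null, "demo/Other", null, null, new byte[0]);
        System.out.println("replaced: " + (replaced != null) + ", untouched: " + (untouched == null));
    }
}
```

In a real deployment, such a transformer would typically be registered from an agent's premain method through Instrumentation.addTransformer, with the agent passed to the JVM using the -javaagent option.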
Users can specify which methods should be instrumented for counting and which methods should be left in the original, uninstrumented state. Otherwise, no manual intervention is needed from the user.

4.1 Creating and Using the Counters

A suitable data structure must be selected for the counters. The JVM specification [?] lists 200 working bytecode instructions, including four INVOKE* instructions. Hence, these instructions require a fixed number of counters. In contrast to that, it depends on the application which and how many different methods will be invoked using INVOKE* in the instrumented method. Hence, in principle, method invocations inside the instrumented bytecode should be counted using a data structure which allows a dynamic addition of new counters for found method signatures.

For ByCounter, the counters for method invocations could be stored in a java.util.Map-like data structure. At runtime, this structure can be easily extended; however, each access to a Map-like structure for incrementing a counter is very expensive. Thus, a more efficient technique is used in ByCounter by creating int counters for both method invocations and other bytecode instructions. The basic idea behind this technique is to perform an initial discovery pass over the bytecode for identifying all method signatures that occur in the bytecode of the considered method. Using the results of this pass, ByCounter can create precisely the required number of int counters. Incrementing an int counter can be done efficiently using a single IINC instruction.

The list of method signatures will not grow at runtime, except in cases where other bytecode-instrumenting operations take place after ByCounter instrumentation. Hence, for correct counting results, we require that ByCounter is the last tool in the bytecode instrumentation chain. It must further be noted that the same discovery pass could identify the non-INVOKE* instructions that really occur in the considered bytecode, but this enhancement ultimately results in more overhead than simply creating counters for all instructions.

This list of found signatures might contain some methods that will not always be executed at runtime, because the execution path does not reach them for some values of input parameters passed to the instrumented method. The case-specific non-execution of these methods is not problematic, as the corresponding counts will simply maintain their initial value of 0.

After the list of found method signatures has been populated in the discovery pass, ByCounter performs its main pass over bytecode. In the main pass, counters of type int for method invocations (as many as different signatures found during the discovery pass) are added to bytecode through instrumentation.
Additionally, counters of type int are added for all 200 defined bytecode instructions. From the bytecode view, these counters are local variables. The maximum number of local variables in the bytecode of a Java method is (incl. those variables that existed before instrumentation), and this number should not be a limitation in realistic cases.

Additionally, to support bytecode-based performance prediction, the current version of ByCounter is able to record the dimension(s) and the element types of arrays created using ANEWARRAY, MULTIANEWARRAY and NEWARRAY instructions. The recording is optional; it is done using a set of additional Lists that save these details. In the future, functionality for recording parameters of other instructions and methods will be added to ByCounter.

After creating the counters, ByCounter adds instrumentation to update (i.e. increment) them when the corresponding instructions and methods are executed. The IINC instruction does not modify the stack (it directly increments the underlying local variable), so no side effects will occur at runtime. Recording of the dimensions of created arrays is also implemented in a way that does not produce any side effects.
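The two-pass scheme described in this subsection can be modelled in plain Java. The following is a simplified sketch under assumptions: a method body is reduced to an array holding, for each instruction position, either the invoked method signature or null for a non-INVOKE* instruction, and an execution is given as the sequence of visited positions. The class and method names (CounterLayoutSketch, discoverSignatures, assignSlots, run) are hypothetical and do not reflect ByCounter's implementation, which operates on ASM's bytecode representation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class CounterLayoutSketch {
    /** Discovery pass: one counter index per distinct invoked signature (null = no INVOKE*). */
    static Map<String, Integer> discoverSignatures(String[] invokedSignatures) {
        Map<String, Integer> indexOf = new LinkedHashMap<>();
        for (String signature : invokedSignatures) {
            if (signature != null) {
                indexOf.putIfAbsent(signature, indexOf.size());
            }
        }
        return indexOf;
    }

    /** Main pass: resolves once, per instruction position, which counter slot it increments. */
    static int[] assignSlots(String[] invokedSignatures, Map<String, Integer> indexOf) {
        int[] slots = new int[invokedSignatures.length];
        for (int pc = 0; pc < slots.length; pc++) {
            String signature = invokedSignatures[pc];
            slots[pc] = (signature == null) ? -1 : indexOf.get(signature);
        }
        return slots;
    }

    /** Simulated run: every executed position bumps a plain int, without any Map access. */
    static int[] run(int[] slots, int counterCount, int[] executedPositions) {
        int[] counters = new int[counterCount]; // all counters start at 0
        for (int pc : executedPositions) {
            if (slots[pc] >= 0) {
                counters[slots[pc]]++;
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        String sqrt = "java/lang/Math.sqrt(D)D";
        String println = "java/io/PrintStream.println(D)V";
        String[] body = {null, sqrt, null, println, sqrt};
        Map<String, Integer> indexOf = discoverSignatures(body);
        int[] slots = assignSlots(body, indexOf);
        // The simulated path loops over positions 0..2 twice, then jumps to position 4.
        int[] counters = run(slots, indexOf.size(), new int[] {0, 1, 2, 0, 1, 2, 4});
        System.out.println(indexOf.keySet() + " -> " + Arrays.toString(counters));
    }
}
```

In this simulated run, the println signature is discovered but never reached, so its counter keeps the initial value 0, which corresponds to the case-specific non-execution discussed above. The Map is consulted only while the slots are assigned, not while counting.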

4.2 Reporting the Counting Results

For reporting of counting results, two alternatives have been implemented in ByCounter, and both preserve the integrity of method signatures in the instrumented class. The first alternative instruments the method with code to directly write a log file with the counting results; for this, no additional classes must be loaded manually into the JVM. Details of the log file writing, such as the log file path, can be configured by the ByCounter user before the instrumentation starts. The second alternative is based on ByCounter's ResultCollector class, and has the advantage that it can aggregate and reference counts of different methods. In order to report the state of counters using ResultCollector, a call to its collectResults method is inserted by the instrumentation. Additionally, the ResultCollector class must be loaded into the JVM.

Instead of reporting counting results periodically (e.g. after a certain time, or after a certain number of instructions has been counted), ByCounter is implemented to report the complete results immediately before the instrumented method exits. However, if a method declares possible exceptions in its signature (instead of the exception table), there is no way to foresee from the bytecode where and when method execution will exit due to an exception; in such a case, it can be assumed that the obtained counts are not representative. At the same time, exceptions declared using try/catch/finally are handled properly in ByCounter, as they are a part of the normal control flow. Thus, the ByCounter implementation ensures that the counting results are reported if and only if the method exits properly (i.e. if it returns without an uncaught exception).

To achieve this, for both reporting alternatives (log file and ResultCollector), ByCounter adds instructions that report the result immediately preceding every return-like bytecode instruction.
These instructions include ARETURN, DRETURN etc., depending on the type of the returned value (bytecode of methods returning void also uses a RETURN instruction). As the proper execution of a method always terminates with exactly one *RETURN instruction, any such *RETURN instruction is accounted for properly by pre-initialising the corresponding counter with 1.

For the interpretation of the counting results, it can be important to have knowledge about the runtime parameters of the instrumented method itself. Hence, ByCounter is designed to store the characterisations of these parameters at the beginning of the method's execution and can report them together with the counting results. These characterisations can be the length of a String, the size of an array etc.

After the instrumentation has been completed, ByCounter converts the instrumented ASM bytecode representation into a Java class which is to substitute the original, uninstrumented class. The instrumented class can be saved as a class file, or passed to a suitable ClassLoader for immediate, reflection-based invocation.
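The reporting scheme of this subsection can be pictured with a source-level analogy. The sketch below is not what ByCounter generates (ByCounter works on bytecode and keeps one counter per instruction type); it only mirrors the shape of an instrumented method under simplifying assumptions. The class InstrumentedShapeSketch, its static REPORTS list and the collect method are hypothetical stand-ins for a result collector.

```java
import java.util.ArrayList;
import java.util.List;

final class InstrumentedShapeSketch {
    /** Hypothetical stand-in for a result collector; each entry is {comparisons, returns}. */
    static final List<int[]> REPORTS = new ArrayList<>();

    static void collect(int comparisons, int returns) {
        REPORTS.add(new int[] {comparisons, returns});
    }

    /** Source-level analogy of an instrumented method; its signature is unchanged. */
    static int firstIndexOf(int[] data, int wanted) {
        int comparisons = 0; // added local counter, incremented next to the counted operation
        int returns = 1;     // pre-initialised: a proper exit executes exactly one return
        for (int i = 0; i < data.length; i++) {
            comparisons++;
            if (data[i] == wanted) {
                collect(comparisons, returns); // inserted immediately before this return
                return i;
            }
        }
        collect(comparisons, returns);         // inserted immediately before this return
        return -1;
    }

    public static void main(String[] args) {
        firstIndexOf(new int[] {4, 7, 9}, 7);
        firstIndexOf(new int[] {4, 7, 9}, 5);
        try {
            firstIndexOf(null, 5);             // exits with an exception: nothing is reported
        } catch (NullPointerException expected) {
            // intentionally ignored in this demonstration
        }
        for (int[] report : REPORTS) {
            System.out.println("comparisons=" + report[0] + ", returns=" + report[1]);
        }
    }
}
```

Because the reporting call sits directly in front of each return, a run that leaves the method through an uncaught exception produces no report, which corresponds to the "if and only if the method exits properly" behaviour described above.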

5 Case Study and Evaluation

In this section, a case study is presented to show the efficiency and the precision of instrumentation and counting performed by ByCounter.

The precision of ByCounter was evaluated by comparing ByCounter counting results with manually obtained results. For manual calculation of counts, the input parameters and their impact on the control flow of the considered method have to be evaluated. To make counting results comparable, it was required that an evaluated method has a deterministic execution when it is executed with the same set of input parameters. As described in Section 5.1 below, suitable Java methods have been selected and evaluated.

The efficiency of ByCounter was evaluated by measuring (1) the duration of the instrumentation phase performed by ByCounter, (2) the execution duration of a method run before instrumentation and (3) the execution duration of the instrumented method. For (2) and (3), the same input parameters were used to be able to compare the results. From (2) and (3), the relative overhead caused by the counting process at runtime was computed and is reported below.

5.1 Study Setup

For evaluating ByCounter, three Java benchmarks that are publicly available with their source code were used. The following methods have been instrumented and measured:
(i) JavaGrande benchmark [?,?]: method JGFrun() of class JGFCastBench, 216 LOC (lines of Java source code excluding comments and empty lines)
(ii) Linpack benchmark [?] (Java implementation available from [?]): method run_benchmark() of class Linpack, 60 LOC
(iii) SciMark 2.0 benchmark [?]: method integrate(int Num_samples) of class MonteCarlo, only 9 LOC (parameter is passed to the method, so the contained loop with 5 LOC is repeated times)

To ensure deterministic behaviour in repeated runs, loop conditions in JGFCastBench.JGFrun() were slightly modified.
Furthermore, JGFCastBench bytecode contained many casting instructions without any effect, whose execution can be skipped (and is actually skipped after JIT compilation by some JVMs). Hence, we have enhanced and recompiled the source code of JGFCastBench to prevent skipping of these instructions by the JVMs that were used in this case study.

Of course, the platform on which ByCounter is executed has an influence on the performance, both for the instrumentation phase and for the execution of instrumented bytecode. For the JVM as a part of this platform, different vendors implement optimisations of bytecode execution in different ways, and the resulting impact on the performance is of particular interest.

Thus, we compared Bea JRockit JDK 6 Update 2 R (32bit) and Sun JDK JVM on a computer with the Intel Core Duo T2400 CPU at 1.83 GHz, 1.50 GB of RAM and with Windows XP Pro operating system. The load on the machine has been minimised during measurements, and thread/process priorities have not been changed. The JVMs were run in server mode using the -server flag.

If a measurement is repeated several times in a row, the JVM may have more opportunities to optimise the measured code. While the instrumentation phase will likely be performed only once, an (un)instrumented method can be executed several times in a realistic application. Thus, for evaluating the counting-caused overhead, measurements of both uninstrumented and instrumented methods must be repeated appropriately. As a consequence, for each of the three studied benchmarks and for each of the JVMs, we have performed 100 JVM runs with 200 repetitions of these measurements in each JVM run.

5.2 Measurements and Results

After the precision of counting results was successfully verified through the aforementioned manual bytecode inspection, measurements were performed. Table 1 aggregates the results of the measurements; due to space limitations, we report only the median values and not the full statistical evaluation.

The most interesting metric in Table 1 is the overhead introduced by ByCounter, which is between <1% (Linpack, Bea JVM) and 84.6% (JavaGrande, Sun JVM), except for one outlier (SciMark, Bea JVM).

[Table 1. ByCounter evaluation for 3 benchmarks and 2 JVMs (obtained individual counts of bytecode instructions and method invocations are not shown). Columns: benchmark name and Java VM, i.e. JavaGrande (Bea, Sun), Linpack (Bea, Sun), SciMark (Bea, Sun). Rows: instrumentation [ms]; uninstrumented method execution duration [ms]; instrumented method execution duration [ms]; counting overhead [%]; total count of executed bytecode instructions; total count of executed method invocations. Cell values surviving in the count rows: 103,050,724; 4,912; 20,784, ; ,000,001.]
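The way median durations and a relative overhead value of the kind reported in Table 1 can be derived from repeated measurements is sketched below. The paper does not spell out its formula, so the sketch assumes the usual definition, i.e. the difference between instrumented and uninstrumented duration relative to the uninstrumented duration. The class OverheadSketch, its methods and all sample durations are made up for illustration and are not measurements from this study.

```java
import java.util.Arrays;

final class OverheadSketch {
    /** Median of the measured durations; the input array is left unmodified. */
    static double median(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Assumed definition: extra time of the instrumented run relative to the original run. */
    static double overheadPercent(double uninstrumentedMs, double instrumentedMs) {
        return (instrumentedMs - uninstrumentedMs) / uninstrumentedMs * 100.0;
    }

    public static void main(String[] args) {
        // Made-up durations in milliseconds, for illustration only.
        double[] uninstrumented = {100.0, 98.0, 103.0};
        double[] instrumented = {151.0, 150.0, 149.0};
        double overhead = overheadPercent(median(uninstrumented), median(instrumented));
        System.out.println("counting overhead [%]: " + overhead);
    }
}
```

Using the median rather than the mean makes the result less sensitive to single slow repetitions, for example those that occur before the JVM has optimised the measured code.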

We have studied measurements of this outlier to understand the reason for such exceptionally large overhead, and have come to the conclusion that the optimisation mechanism of the Bea JVM does not work well for the instrumented version of SciMark: if optimisations are turned off using the -Xnoopt flag, the uninstrumented version of SciMark runs in ms, while the instrumented one runs in ms, resulting in a counting overhead of 68.9%. In fact, the counting overhead for running SciMark using -Xnoopt is very close to the overhead for SciMark executed in the Sun JVM. Also, when normal and -Xnoopt runs are compared for the instrumented SciMark version in the Bea JVM, the -Xnoopt run is slower by a factor of . But if normal and -Xnoopt Bea runs are compared for the uninstrumented SciMark version, the slowdown factor is 3.48 (28.92 ms vs ms), i.e. exceptionally high.

Observed differences between the overhead percentages in Table 1 are explained through the different structure of the instrumented methods. For example, in the JavaGrande case, there is a very large number of counted instructions, but a small number of method invocations (and the invoked methods are computationally expensive), so the relative counting overhead is between 27.4% (Bea) and 84.5% (Sun). In the Linpack case, significantly fewer instructions must be counted, yet the invoked methods are computationally expensive (i.e., they have longer execution durations). As these invoked methods have not been instrumented, they have the same execution duration no matter whether they are called from the instrumented or uninstrumented version of Linpack's run_benchmark method. Hence, for Linpack's run_benchmark method, the costs of counting these invoked methods are very small in relation to the execution duration of the invoked methods.

It can also be observed that the used JVMs lead to very different method execution durations, and that none of the JVMs is the fastest for all three benchmarks.
The cost of the instrumentation phase is also reasonably small (below 300 ms for all benchmarks in all JVMs). Before assumptions and limitations are discussed in Section 7, we describe related work and compare it to our approach in the next section.

6 Related Work

Bytecode instruction counts can be considered as a dynamic bytecode metric. In [?], a collection of other metrics for Java bytecode is presented, but that collection does not include execution counts for individual bytecode instructions and method invocations.

Existing approaches for dynamic (runtime) counting of Java bytecode instructions and method invocations can be grouped into three categories, according to the technology they rely upon:
(a) using monitoring/reporting interfaces provided by the JVM
(b) by instrumenting the JVM or its API-implementing library
(c) by instrumenting the actual application bytecode or source.

For case (a), different interfaces are explicitly exposed by JVMs, such as JVMTI [?], which must be programmed in a native language. These interfaces are used by standalone Java tools and profilers, such as Intel VTUNE [?]. In general, profilers measure resource usage and need manual supervision and interpretation. In contrast to that, ByCounter obtains exact counts of executed instructions without human supervision of the counting process.

Since Java 6, direct access to individual bytecode instructions with Java's own means is possible only with JVMTI; for this, execution of bytecode must be single-stepped, substantially slowing down bytecode execution. JVMTI is not a mandatory part of the JVM standard, and many virtual machines (such as Jikes RVM [?]) do not implement JVMTI at all. Hence, JVMTI is not suitable as a portable basis for platform-independent bytecode counting when compared to bytecode instrumentation.

In category (b), two parts of a JVM must be differentiated: the bytecode interpreter with its components and the JVM's Java API implementation, which consists of (partially platform-specific) Java classes. Instrumenting the first part means dealing with native (non-Java) code or binaries, which is generally a complicated task that is both platform-specific and JVM-specific. Instrumenting the API implementation means instrumenting Java bytecode or source code of a very large number of Java classes. For both JVM parts, commercial JVMs usually do not provide the source code.

JVM instrumentation is done for replaying the behaviour of multi-threaded Java programs, for example in [?] and similar approaches; however, only high-level constructs and not bytecode instructions or method invocations are considered. Vertical profiling approaches such as [?], [?] or [?] also use JVM instrumentation, and only consider high-level events, too. JRAF / FERRARI [?] instruments the entire Java API, but it could not be obtained for evaluation.
The available documentation shows that it does not offer counting of individual bytecode instructions and method invocations, as its instrumentation maintains only one counter for all bytecode instructions. Furthermore, FERRARI captures JVM-specific calling context trees and not an expandable flat view as ByCounter does.

The Java API itself does not provide any means to instrument bytecode, only methods to read/load already instrumented bytecode. Instead, external frameworks for bytecode engineering (such as ASM [?] or SOOT [?]) can be used, as they offer rich APIs for analysing and modifying bytecode. However, they do not include bytecode-counting functionality or instrumentation templates.

For case (c), the actual application code must be instrumented and then executed by the JVM. This approach is used in ByCounter. Generic frameworks for bytecode manipulation, such as SOOT [?], do not offer the functionality provided by ByCounter; they serve as tools to implement this functionality. For example, the ASM framework [?] was used for ByCounter. Aspect-oriented bytecode-analysing frameworks such as in [?] do not provide the instruction-counting functionality itself, but merely offer a different way to implement instrumentation when compared to ASM or other bytecode engineering frameworks.

7 Assumptions and Limitations

We assume that it is possible to pass the final class bytecode that will be executed to ByCounter for instrumentation. For applications where bytecode is generated on the fly and not by the Java compiler (for example in Java EE application servers), additional provisions must be taken. We also assume that the bytecode to instrument conforms to the JVM specification, even if it has been protected using obfuscation.

The ASM library that is used in ByCounter has one small limitation: ASM does not generate a 1:1 representation of parsed bytecode in a few cases. For example, ASM visitors consider the parameterless LLOAD_0 bytecode instruction to be the same as the (different) LLOAD instruction with parameter 0. Hence, ByCounter reports the four LLOAD_* instructions and the LLOAD instruction using one counter. However, as there is no semantic difference between the two instructions in the above example, this does not invalidate the semantic accuracy of ByCounter. If needed, this small limitation can be overcome by modifying the ASM library.

Another current limitation of ByCounter is grounded in polymorphism. For example, when methods of interfaces are invoked, only the type of the interface is visible at bytecode level. Hence, the class implementing the interface cannot be identified unambiguously in the current implementation of ByCounter. However, handling of polymorphism can be achieved with additional effort.

The obtained instruction counts depend on the input parameters that have been provided to the instrumented method, for example due to control flow constructs that depend on these parameters.
Currently, this dependency cannot be expressed by ByCounter because it neither recognises control flow constructs nor inspects the states of variables and fields during method execution.

Finally, superfluous bytecode instructions can exist in an application, i.e. bytecode which can be optimised away by the Just-In-Time (JIT) compiler of the JVM without effects on execution results. These instructions are instrumented by ByCounter as it cannot anticipate later JIT optimisations. The instrumentation instructions cannot be optimised away by JIT, with the effect that they increment counters even for those (superfluous) instructions that have been removed by JIT. We have observed such an effect for a benchmark where counting would seem to slow down the execution by a factor of more than 900 in some cases (which is unrealistic given the ByCounter implementation). Hence, we assume that ByCounter is run on code that has been engineered effectively and does not contain large portions of superfluous code. At the same time, ByCounter can hint at the existence of superfluous code through exceptionally high runtime counting costs.

8 Conclusions

This paper presents a novel, evaluated approach for dynamic counting of executed instructions and method invocations in bytecode-based applications. The approach is called ByCounter and it works by instrumenting the application bytecode, without the need to instrument or modify the JVM or the Java API implementation. One use of the counting results is bytecode-based performance prediction [?], and these results could also be used for characterising the execution of a bytecode application. ByCounter will be made publicly available at

By instrumenting the application bytecode and not the JVM, ByCounter simplifies the entire counting process and becomes truly portable across JVMs. The instrumentation added by ByCounter is lightweight, leading to low runtime costs of counting. The evaluation of these costs and the overall counting effort is performed in this paper for several Java applications on different JVMs.

In addition to being portable, the presented approach has been designed for easy use: no understanding of bytecode internals is needed to use it, and the application methods available for instrumentation are automatically identified and proposed to the user. To minimise disruptions, ByCounter instrumentation preserves the signatures of all methods and constructors, and it also preserves the application architecture. For reporting of counting results, ByCounter offers two alternatives: either using structured log files or using a result collector framework (the latter can aggregate counting results across methods and classes).

In the future, our work will be extended in several directions. First, it will be integrated into Palladio [?], which is an approach to predict the performance of component-based software architectures.
Palladio uses models of software components, for which behaviour specifications with performance annotations must be created. For existing bytecode components, these annotations can be obtained through bytecode-based performance prediction. Such prediction needs two inputs: execution durations of individual bytecode instructions and invoked methods, as well as the counts of these instructions and invocations. Hence, the role of ByCounter in the Palladio approach is to support bytecode-based performance prediction by providing the counts of executed instructions and of method invocations.

For the next version of ByCounter, several enhancements are being evaluated. For example, we have envisioned support for instrumenting user-defined sections of methods. Another feature could be polymorphic instrumentation: with this option, instrumentation of a method C.m() in class C will be automatically accompanied by instrumentation of all methods m() in classes that extend class C. Finally, extending our approach to other virtual machines and their bytecode languages (for example, the .NET runtime and its CIL bytecode) would allow the use of ByCounter in heterogeneous systems.

Acknowledgements

The authors would like to thank Klaus Krogmann for insightful discussions and the anonymous reviewers for their helpful comments.
