Kazunori Ogata, Dai Mikurube, Kiyokuni Kawachiya and Tamiya Onodera

RT0854 Computer Science 13 pages
Research Report
April 27, 2009

A Study of Java's Non-Java Memory

Kazunori Ogata, Dai Mikurube, Kiyokuni Kawachiya and Tamiya Onodera
IBM Research, Tokyo Research Laboratory
IBM Japan, Ltd.
Shimotsuruma, Yamato, Kanagawa, Japan

Research Division: Almaden - Austin - Beijing - Haifa - India - T. J. Watson - Tokyo - Zurich

Limited Distribution Notice
This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted. It has been issued as a Research Report for early dissemination of its contents. In view of the expected transfer of copyright to an outside publisher, its distribution outside IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or copies of the article legally obtained (for example, by payment of royalties).

A Study of Java's Non-Java Memory

Kazunori Ogata, Dai Mikurube, Kiyokuni Kawachiya, Tamiya Onodera
IBM Research, Tokyo Research Laboratory
Shimo-tsuruma, Yamato, Kanagawa, Japan
ogatak@jp.ibm.com

Abstract
A Java application sometimes raises an out-of-memory exception. This is usually because it has exhausted the Java heap. However, a Java application can also raise an out-of-memory exception when it exhausts the non-Java memory, which is memory used by Java but not in the Java heap. This can happen, for example, when it attempts to load too many classes into the virtual machine. Although exhausting the non-Java memory is relatively rare compared to exhausting the Java heap, a Java application actually consumes a considerable amount of non-Java memory. This paper is a quantitative analysis of non-Java memory. To the best of our knowledge, this is the first in-depth analysis of non-Java memory. To do this we created a tool, Marusa, which gathers memory statistics from both the operating system and the Java virtual machine, breaking down and visualizing the non-Java memory usage. We studied the use of non-Java memory for a wide range of Java applications, including the DaCapo benchmarks, a large enterprise application, and a JRuby on Rails application. Our study is based on the IBM Java J9 virtual machine for Linux. Although some of our results may be specific to this combination, we believe that most of our observations are applicable to other platforms as well.

Categories and Subject Descriptors C.4 [Performance of Systems]: Measurement techniques; D.2.5 [Software Engineering]: Testing and Debugging - debugging aids.

General Terms Measurement, Experimentation.

Keywords Java, memory analysis, non-Java memory, Marusa

1. Introduction
A Java application sometimes raises an out-of-memory exception. This is usually because it has exhausted the Java heap. A large application will often use gigabytes of Java heap due to memory leaks or bloat.
With varying degrees of sophistication, many tools are available for analyzing the Java heap and for debugging out-of-memory exceptions [19, 20, 30].

However, a Java application can sometimes raise an out-of-memory exception because it has exhausted the non-Java memory, the memory region outside the Java heap. This can happen, for example, when it attempts to load too many classes into the virtual machine. Although exhausting this non-Java memory is infrequent compared to exhausting the Java heap, a Java application actually consumes a considerable amount of non-Java memory. As we will show later, the non-Java memory becomes as large as the Java heap for more than half of the DaCapo benchmarks [6] when the heap sizes are twice the minimum heap sizes required for each of the benchmarks.

A Java Virtual Machine (JVM) uses the non-Java memory for various purposes. It holds shared libraries, the class metadata for the loaded Java classes, the just-in-time (JIT) compiled code for Java methods, and the dynamic memory used to interact with the underlying operating system. Interestingly, modern virtual machines tend to impose increasingly heavy demands on non-Java memory. For instance, beginning with Version 1.4.0, Sun's HotSpot Virtual Machine [29] optimizes reflective invocations [27] by dynamically generating classes, which consume non-Java memory. The same version also introduced direct byte buffers to improve I/O operations [28]. These buffers typically reside in non-Java memory. It is difficult for Java programmers to be aware of such implicit overhead in non-Java memory.

This paper presents a quantitative analysis of non-Java memory. While there are numerous reports and publications analyzing Java heaps by researchers and by practitioners [19, 20, 30], to the best of our knowledge this is the first study analyzing the non-Java memory.
To do this, we built a tool called Marusa, which gathers memory statistics from both the Java virtual machine and the operating system, using this data to visualize the non-Java memory usage. We modified the IBM Java J9 virtual machine [3, 9] for Linux to gather fine-grained, JVM-level statistics very efficiently.

We studied the usage of non-Java memory for a wide range of Java applications, including the DaCapo benchmarks [6], a large enterprise application, and a JRuby on Rails [15] application. We ran them with the modified IBM Java J9 virtual machine under Linux. Note that the use of non-Java memory inevitably depends on both the Java virtual machine and the operating system. Although some of our results may be specific to our JVM and Linux, we believe that most of our observations are relevant to other platforms. More specifically, the Java platforms we focus on in this paper are the Java Standard and Enterprise Editions (Java SE and EE), not the Java Micro Edition (Java ME). Today the majority of Java virtual machines for Java SE and EE are written in C and C++, run on general-purpose operating systems, and include adaptive JIT compilers with multiple optimization levels [3, 9, 22, 29]. We believe that our observations are also substantially relevant to these platforms.

Our contributions in this paper are:

We quantitatively analyzed the usage of non-Java memory for a variety of Java programs, including the DaCapo benchmarks, a large enterprise application, and a JRuby on Rails application. We ran them on a modified version of IBM's production virtual machine for Linux. Dividing non-Java memory into eight components, such as class metadata, we gathered time-series data on the size of the resident memory each component consumes.

We found that non-Java memory outgrows the Java heap for more than half of the DaCapo benchmarks when the heap size is set to twice the minimum heap size necessary to run each benchmark.

We found that, in all of the programs studied, the JIT work area fluctuates greatly, while the rest of the components soon become stable. This is because the JIT compiler from time to time demands significantly more memory for its work area when compiling methods at aggressive levels of optimization.
We observed that the behavior of the libc memory management system (MMS), that is, the malloc and free routines, has a profound impact on the usage of non-Java memory. Typically, a JVM-level MMS is built on top of the libc MMS, which in turn is built on top of the OS-level MMS. Even if the JVM-level MMS returns a chunk of memory to the libc MMS, this may not lead to reduced resident memory, since the libc MMS may fail to return it to the OS-level MMS.

The rest of the paper is organized as follows. Section 2 presents an anatomy of non-Java memory. Section 3 describes our methodology, including our tool, Marusa. Section 4 shows the results of the micro-benchmarks, while Section 5 presents the results of the macro-benchmarks. Section 6 discusses related work. Finally, Section 7 offers conclusions.

Figure 1. Breakdown of non-Java memory when Trade6 is running on IBM's WebSphere Application Server Version 7.0.

Table 1. Categories of non-Java memory
Category: Typical data
Code area: Code loaded from the executable file; shared libraries; data areas for shared libraries
JVM work area: Work area for the JVM; areas allocated by Java class libraries
Class metadata: Java classes
JIT compiled code: Native code generated by the JIT; runtime data for the generated code
JIT work area: Work areas for the JIT compiler
Malloc-then-freed area: The areas that were once allocated by malloc(), then free()ed, and still residing in memory (typically held in the free list)
Management overhead: The unused portion of a page where only a part of a page is used; the area to manage an artifact, such as the malloc header
Stack: C stack; Java stack

2. An Anatomy of Non-Java Memory
Figure 1 shows a breakdown of the non-Java memory of an enterprise Java application, WebSphere Application Server [13] running Trade 6 [12] for 8 minutes. The non-Java memory occupies about 260 MB, which is larger than the Java heap required by this application, 256 MB (not shown in Figure 1). However, Java programmers are typically unaware of such situations.
For deeper quantitative analysis, we divided the non-Java memory into eight categories. Table 1 summarizes the categories and typical data in each of them. This section describes each of these memory areas. In the example of Figure 1, five categories consume most of the non-Java memory.

2.1 Code area
This is the memory area that holds the native code from executable files and libraries, and also the data loaded from

shared libraries. This area does not include any of the code generated by the JIT compiler. The size of the code area increases when a dynamic link library is newly loaded or when code in a page that has not yet been accessed is executed and loaded into memory.

2.2 JVM work area
This is a memory area that holds the data used by the JVM itself and the memory allocated by the Java class library (JCL) and user-defined JNI methods. The memory used for direct byte buffers is an example of memory allocated by the JCL. This area does not include class metadata or the JIT-related areas explained below. The size of this area increases when the JVM needs more working storage or when a Java application allocates more memory through the JCL.

2.3 Class metadata
Class metadata is a memory area for the data loaded from Java class files, such as the bytecode, UTF-8 literals, constant pool, and method tables. The JVM creates metadata upon loading a Java class. Although Java applications never allocate it explicitly, using a class is not free: it requires some memory. This overhead can become significant for large applications using thousands of classes.

2.4 JIT compiled code
This is a memory area for the native code generated by the JIT compiler and the data for the generated code. The size of this area increases as the JIT compiler compiles more methods. Some JIT compilers can recompile methods to optimize them more aggressively and generate new versions of the compiled code, which usually consume even more memory. If a JIT compiler supports unloading of the generated code, the size of this area can decrease.

2.5 JIT work area
This is a memory area used to hold the data used by the JIT compiler, such as the intermediate representations of a method being compiled. The size of this area increases when the intermediate representation is large (perhaps as methods are inlined) or when the JIT does aggressive optimizations.
The size of this area decreases when the compilation of a method is completed, though some of the data may remain in memory for inter-procedural analysis or profiling.

2.6 Malloc-then-freed areas
This is the memory that was allocated using malloc() by the JVM or JIT, and then deallocated using free(). The malloc library typically manages such areas either by holding them in a free list or by returning them to the OS. If held in a free list, the deallocated memory resides in the non-Java memory in this malloc-then-freed area. If returned to the OS, the deallocated memory is removed from the process memory. Therefore, the size of this non-Java memory depends on how the standard C library (libc) and the OS handle the deallocated memory. We include any malloc-then-freed area as part of the non-Java memory, since it remains in the resident memory of the process and consumes actual memory pages. This area sometimes becomes quite large, as shown in Figure 1. Note that a large malloc-then-freed area is not a problem unique to JVMs; it can also be demonstrated with traditional C programs.

2.7 Management overhead
This is a memory area implicitly used by the OS or system libraries to manage process memory. A malloc header is an example of this kind of data. The unused parts of allocated pages are also included in this category.

2.8 Stack
This is a memory area for the Java stack and C stack. We combined these stacks into the same category because both are used to store the stack frames of Java methods and because the implementation of the JVM determines whether the Java stack frames are stored in the Java stack or C stack. The size of this area increases when many stack frames are allocated in deeply nested calls, when a stack frame contains many local variables, or when many threads are created.

3. Methodology to Measure Non-Java Memory
This section describes the analysis methodology used to divide the non-Java memory into these eight categories.
3.1 Our approach
The key to our memory analysis is to fully identify the usage of the resident memory of a JVM process based on these eight categories (plus the Java heap). The underlying OS manages the address ranges of a process's resident memory, while the JVM decides on the usage of memory. Thus, we need to gather the memory management information at both the OS and JVM levels. We use three steps to categorize the non-Java memory:
1. Gather OS-level information to enumerate all of the memory ranges owned by a JVM process and identify the attributes of each range.
2. Gather the JVM-level information to identify the use of each area based on the component that allocated it.

3. Combine these two levels of information and summarize the results using the eight categories.

Modern large programs, including JVMs, may have their own internal memory managers, which allocate chunks of memory from a pool, dividing them into smaller pieces to handle memory allocation requests from other components. In such a case, we need to identify each component that requested memory from the internal memory manager. Tracing only the memory allocation API calls, such as malloc() and free(), is insufficient to identify the memory usage in such a program because it only captures the operations of the internal memory manager, without identifying how the pool is used by those components.

Figure 2 shows examples of the correspondences between the memory allocation paths and the eight categories of non-Java memory. Since a memory request at a higher layer has more detailed knowledge about how the memory is used, we need to gather the information from all of the layers and combine it carefully, avoiding duplication.

Figure 2. Correspondence between memory allocation paths and the eight categories of non-Java memory

For that purpose, we built a tool called Marusa (an acronym for Memory Analyzer for Redundant, Unused, and String Areas), which gathers two levels of memory information, interprets it, and then visualizes the breakdown of the non-Java memory usage. The graph in Figure 1 is actual output from Marusa. This tool can also be used to analyze the Java heap area [17], though we focus on the non-Java memory in this paper.

3.2 Gathering OS-level memory management information
We first need to know the total size and attributes of the memory blocks assigned to the JVM process. The attributes typically include the access permission, the mapped file flag, and the file path if the memory is mapped to a file, though the specific attributes available depend on the OS. In this study, we focus on the resident size of process memory, where the physical memory is assigned. Therefore, we also need to gather information on which of the pages in the process's memory blocks have physical pages.

For Linux, Marusa uses maps in the /proc file system to gather the address ranges and their attributes. For Linux kernels from version and higher, we can collect the physical page states using pageinfo in the /proc file system. For older kernels, we can use a kernel module included in the open source software exmap [5].

3.3 Gathering memory usage in JVM
If the JVM provides detailed information about its memory usage for debugging the JVM, we can use it for categorizing non-Java memory. If the information is insufficient, we need to add probes to the JVM by using plug-ins or by modifying the source code of the JVM. Marusa uses a mix of these approaches. We use the debugging information of the IBM Java J9 VM to get the sizes of the class metadata and the JIT compiled code, and we modified the IBM JVM to gather detailed information about memory allocations and deallocations, including the requests to the internal memory manager. This fine-grained data allows us to capture all of the.

3.4 Computing non-Java memory usage
To combine both the OS-level and JVM-level information, the Marusa analyzer uses a map structure that holds all of the gathered information for each memory byte in the JVM process.
The map uses the virtual address of the byte as a key to combine the information gathered from different sources. We call this map the memory attribute map. For example, it can identify that a memory byte was allocated using malloc() by the internal memory manager for loading class metadata, and that it is in a page backed by physical memory.

To compute the breakdown of the non-Java memory usage, Marusa counts the bytes with the same memory attributes. Marusa uses a prioritized list of attributes to avoid counting any bytes twice. It first counts the bytes with the highest-priority attribute, then counts the bytes with the second-highest-priority attribute among those not yet counted, and so on. We can obtain other views of the memory breakdown by changing the ordering of the list.
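The three steps and the prioritized counting can be illustrated with a minimal sketch. It is not Marusa's implementation: the sample mapping text, the labelling rule in `os_label`, the JVM-level records in `jvm_ranges`, and all attribute names are hypothetical, and physical-page residency is left out to keep the example small. Only the line format of the maps file in the /proc file system is taken as given.

```python
from collections import Counter

# Hypothetical sample in the format of /proc/<pid>/maps (tiny ranges for clarity).
SAMPLE_MAPS = """\
00001000-00001040 r-xp 00000000 08:01 123 /opt/jvm/libjvm.so
00002000-00002080 rw-p 00000000 00:00 0 [heap]
"""

def parse_maps(text):
    """Step 1: list (start, end, perms, path) for every mapped range."""
    ranges = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        ranges.append((start, end, fields[1], path))
    return ranges

def os_label(perms, path):
    """Assumed labelling rule: executable file mappings count as code."""
    if "x" in perms and path.startswith("/"):
        return "code"
    return "c_heap" if path == "[heap]" else "other"

def build_attribute_map(labelled_ranges):
    """Key every byte by its virtual address and collect its attributes."""
    attrs = {}
    for start, end, label in labelled_ranges:
        for addr in range(start, end):
            attrs.setdefault(addr, set()).add(label)
    return attrs

def count_by_priority(attrs, priority):
    """Step 3: count each byte once, under its highest-priority attribute."""
    counts = Counter()
    for labels in attrs.values():
        for name in priority:
            if name in labels:
                counts[name] += 1
                break
    return counts

os_ranges = [(s, e, os_label(p, path)) for s, e, p, path in parse_maps(SAMPLE_MAPS)]
# Step 2: hypothetical JVM-level records of which component owns which bytes.
jvm_ranges = [(0x2000, 0x2020, "class_metadata"), (0x2020, 0x2030, "jit_work")]
attribute_map = build_attribute_map(os_ranges + jvm_ranges)
breakdown = count_by_priority(
    attribute_map, ["class_metadata", "jit_work", "code", "c_heap", "other"])
print(dict(breakdown))
```

Because the JVM-level attributes come first in the priority list, bytes inside the C heap that the JVM used for class metadata are reported as class metadata rather than as generic heap. Reordering the list yields a different view of the same map.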

Table 2. Execution environment
Hardware environment
  Machine: IBM BladeCenter LS21
  CPU: Dual-core Opteron x 2 (2.4 GHz)
  RAM: 4 GB
Software environment
  OS: SUSE Linux Enterprise Server 10.0
  Kernel version:
  JVM: IBM Java J9 VM, version 6 (SR3)

4. Micro-Benchmarks
This section describes the correspondence between the size of the non-Java memory and the operations in Java programs. Although this correspondence depends on the implementation of the Java VM, many other implementations of the Java VM should show similar trends.

We developed several micro-benchmarks for the non-Java memory, and evaluated them using the IBM Java J9 VM for Java 6 in SLES 10.0 (kernel ) on an IBM BladeCenter LS21. Table 2 gives details of our measurement environment. In these measurements, we show the size of the non-Java memory where physical memory is actually allocated. Since no memory was swapped out during these measurements, this is the same as the resident set size (RSS) of each JVM process after subtracting the size of its Java heap area.

4.1 Micro-benchmark for the class metadata
The first micro-benchmark shows how the size of class metadata changes with the number of loaded classes. We created a micro-benchmark that loads 1, 1,000, and 10,000 classes. The size of each loaded class is about 600 bytes, because they are tiny classes containing only one small method.

Figure 3 shows the results for this micro-benchmark. The class metadata area grew as more classes were loaded. There is only a small difference between the first two results for 1 class and 1,000 classes. This is because the JVM automatically loads about 500 system classes at startup time, consuming about 2 MB of the class metadata area. When 10,000 classes were loaded, the size of the class metadata area increased to 8.1 MB. Although we loaded almost empty classes, the memory overhead (6 MB) was close to the size of the code area (7.2 MB).
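The figures in this micro-benchmark can be cross-checked with simple arithmetic. The sketch below assumes the reported sizes are in megabytes (taking 1 MB as 10^6 bytes) and that metadata grows in proportion to the number of loaded classes. The function name and the proportional model are illustrative assumptions, not something the measurement tool computes.

```python
def metadata_estimate_mb(num_classes, avg_class_bytes, baseline_mb=0.0):
    """Estimate class metadata size in MB (1 MB taken as 10**6 bytes)."""
    return baseline_mb + num_classes * avg_class_bytes / 1_000_000

# 10,000 tiny classes of about 600 bytes each.
added = metadata_estimate_mb(10_000, 600)
# Same load on top of the roughly 2 MB used by the ~500 system classes.
total = metadata_estimate_mb(10_000, 600, baseline_mb=2.0)
print(f"added: {added:.1f} MB, total: {total:.1f} MB")
```

Under these assumptions the estimate lands close to the measured 8.1, which suggests that for such tiny classes the metadata footprint is roughly the class size times the class count plus the startup baseline.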
Since the class sizes are usually larger for real applications, the size of class metadata is sometimes significant. For example, the average class size for WAS 7.0 running Trade 6 was about 5 KB and the JVM loads more than 16,000 classes, resulting in 83 MB of class metadata, as shown in Figure 1.

Figure 3. Change in non-Java memory size by loading various numbers of classes. (Bars: load 1 class, load 1K classes, load 10K classes.)

In Figure 3, the size of the JIT work area also increased dramatically when the program loaded 10,000 classes. This is because the JIT compiler started compilation of the java.lang.class.loadclass() method. The size of the code area also increased as more pages in the code area were touched by the compilation.

4.2 Micro-benchmark for the JVM work and malloc-then-freed areas
Next we studied how the size of the JVM work area changes due to the allocations of direct byte buffers. We created a micro-benchmark that allocates and deallocates specified sizes of direct byte buffers. We executed two scenarios of allocation and deallocation, and measured the non-Java memory size for each.

Figure 4. Change in non-Java memory size by allocating and freeing direct byte buffers of the same size. (Bars: Step 1: Alloc 32KB x 1000; Step 2: Delete 32KB x 500; Step 3: Alloc 32KB x 1000; Step 4: Delete 32KB x 500; Step 5: Alloc 32KB x 1000; Step 6: Delete 32KB x 500; Step 7: Delete all buf.)

Figure 5. Change in non-Java memory size by allocating and freeing direct byte buffers of various sizes. (Bars: Step 1: Alloc 32KB x 250; Step 2: Delete 32KB x 125; Step 3: Alloc 16KB x 125; Step 4: Delete 16KB x 125; Step 5: Alloc 96KB x 125; Step 6: Delete 96KB x 125; Step 7: Delete all buf.)

Figure 4 shows the result of the first scenario:
1. Allocate 1,000 direct byte buffers, each of 32 KB,
2. Delete every other buffer, which will result in many freed area fragments,
3. Allocate another 1,000 direct byte buffers of 32 KB,
4. Delete every other buffer of the new buffers allocated in Step 3,

5. Allocate another 1,000 direct byte buffers of 32 KB,
6. Delete half of the buffers allocated in Step 5,
7. Delete all of the remaining buffers.

Each bar in Figure 4 shows the non-Java memory breakdown just after one step. In this scenario, each allocation step increased the size of the JVM work area, which became part of the malloc-then-freed area in the following deletion step. The malloc-then-freed area remained as resident memory even after all of the direct byte buffers were freed.

Figure 5 shows the results of another scenario with various sizes of direct byte buffers:
1. Allocate 250 direct byte buffers of 32 KB,
2. Delete the last half, making a large contiguous address range available for reuse or for returning memory to the OS,
3. Allocate 125 direct byte buffers of 16 KB,
4. Delete all of the 16-KB buffers allocated in Step 3,
5. Allocate 125 direct byte buffers of 64 KB,
6. Delete all of the 64-KB byte buffers allocated in Step 5,
7. Delete all of the remaining buffers.

Each bar in Figure 5 shows the non-Java memory breakdown just after one step. In Steps 3 and 5, the allocation of byte buffers seems to reuse the malloc-then-freed area created in Step 2. In Step 5, the total size of the non-Java memory increased because the memory in the malloc-then-freed area was too small for all of the requested buffers. Interestingly, after freeing the 64-KB buffers at Step 6, the size of the malloc-then-freed area increased to 6.1 MB, which is close to the size after Step 2 (5.5 MB). Although the total size of the non-Java memory was reduced, the size at Step 6 is almost the same as the size before allocating the 16-KB buffers in Step 4. Although we are still investigating this, we believe this behavior is caused by the inner workings of malloc() in Linux, which allocates blocks of the same size from a memory pool and deallocates the pool as a unit when all of the chunks allocated from it by malloc() are freed.

5. Macro-Benchmarks
This section shows our experimental results using larger programs.
We measured WebSphere Application Server (WAS) 7.0 [13] running Trade 6 [12], the DaCapo benchmarks [6], and a simple JRuby on Rails application. For DaCapo, we present and discuss only the results of the benchmark named bloat, with the other results appearing in the Appendix. The hardware environment of these measurements is the same as for the micro-benchmarks, as shown in Table 2.

5.1 WAS 7.0 running Trade 6
Figure 6 shows how non-Java memory changes during the execution of Trade 6 in WAS 7.0. This graph shows the non-Java memory at nine execution points in a single invocation of WAS: just after starting WAS, after the first access to the scenario page of the Trade 6 application, and then at seven points up to 8 minutes while Trade 6 is accessed by the 16 threads of the load generator. Note that the measurement intervals are not equal. The maximum heap size was set to 256 MB, though the Java heap area is not shown in the graph.

Figure 6. Non-Java memory of WAS 7.0 running Trade 6. (Measurement points: just booted WAS, after 1st access, 30 s, 1 min, 1 min 30 s, 2 min, 4 min, 6 min, 8 min.)

In this application, class metadata and the malloc-then-freed area were among the major areas in the non-Java memory. The JIT work area occasionally became large, but it was small at many of the measurement points. We will discuss the JIT work and malloc-then-freed areas in Section 5.2.

The JVM work area grew from 22.6 MB to 39.4 MB in 8 minutes. The largest increase of 11.9 MB was in the first 30 seconds after the first access. Figure 7 is a detailed breakdown of the JVM work area, separating out the sizes of the direct byte buffers. We can see that the increases in the direct byte buffers caused most of the increases in the JVM work area. They increased by 11.0 MB in the test period.

The class metadata increased from 83.4 MB to 87.1 MB at the time of the first access to Trade 6. Then it decreased to 86.7 MB over the next 30 seconds and remained unchanged for the rest of the execution.
The reason for this increase in the class metadata is the loading of the classes needed for handling the clients' requests, such as those in the javax.servlet.* packages. Dynamically generated classes for optimizing reflection were also loaded. The class metadata then decreased because the classes for optimizing reflection were unloaded at that time.

Figure 7. Further breakdown of the JVM work area. (Measurement points: just started WAS, after 1st access, 30 sec., 8 min.; components: direct byte buffers, other JVM work.)

Table 3. Heap sizes of the DaCapo benchmarks
Benchmark: Heap size [MB]
antlr: 32
bloat: 32
chart: 32
eclipse: 32
fop: 16
hsqldb: 128
jython: 32
luindex: 16
lusearch: 32
pmd: 32
xalan: 32

Figure 8. Results of DaCapo bloat.

5.2 DaCapo
We analyzed the non-Java memory of the DaCapo benchmarks. We measured programs in the DaCapo benchmarks with the heap sizes shown in Table 3. We will discuss the result of bloat in this section, and present the other results in the Appendix. We started a fresh JVM for each of the benchmarks, and repeated each benchmark ten times with the -n 10 option to see how the non-Java memory changes during iterative executions of the benchmark.

Figure 8 shows how the size of the non-Java memory changes during the execution of bloat. The vertical axis is the percentage of the total object allocation in the benchmark. For example, the first bar shows the result when the JVM had allocated objects whose cumulative size was to the total allocation in bloat, which was about 9.6 GB for the 10 iterations. We call this point the allocation point. As shown in Table 3, the maximum heap size was set to 32 MB for this benchmark. This shows that the Java and non-Java memory consumptions were similar.

The size of the class metadata increased between the and allocation points. The reason for this increase was the allocation of classes for calculating the SHA-1 digest of the result of the first calculation. In this 53 period, the first iteration ends and the control component of DaCapo creates a java.security.MessageDigest object and calculates the digest value. Because more than 200 classes are loaded for this calculation, the class metadata area grew by 1.2 MB.

The sizes of the JIT work area and the malloc-then-freed area varied widely within a single execution. Note that our measurement approach captures snapshots of the memory as it continuously changes during the execution of the program. Therefore, the sizes shown in Figure 8 do not necessarily show the maximum size in each period.
The JIT compiler consumes a large work area when it compiles a large method, which may be due to inlining many methods or due to aggressive optimization. The largest JIT work areas were between 20 and 40 MB in most of the intervals until the allocation point. In later intervals, few methods were compiled aggressively and the maximum JIT work areas were near 10 MB or less.

The size of the malloc-then-freed area occasionally increased, though it was around 7 MB in most of the intervals after the allocation point. This is still under investigation, but we believe most of the malloc-then-freed area was the same memory that had been used as the JIT work area. Since the size of the JIT work area was large in some compilations, the size of the malloc-then-freed area increases after those compilations. However, as we noted in Section 4.2, not all of the freed area is held in the malloc-then-freed area. The size of this area is the result of the combination of the memory allocation and deallocation in the JIT compiler and the algorithm used to maintain the free list in libc.

5.3 JRuby-on-Rails
We also measured the non-Java memory use of JRuby on Rails, which is an environment for running a Ruby on Rails (RoR) [24] application in the JVM using the JRuby [14] runtime. For this measurement, we created a small address book application that can show, create, update, delete, and list address entries. The application and the RoR middleware were packaged into a WAR file by using warbler [32] and deployed in the Tomcat [2] application server. The database is in a separate MySQL [21] process. Table 4 shows the details of the software configuration for this measurement.

We used Apache JMeter [1] as a load generator. The scenario is to repeatedly send the following sequence of requests:
List all of the entries in the database,
Create a new entry,
List all of the entries in the database,
Show the new entry and update it,
List all of the entries in the database,
Delete the entry, and
List all of the entries in the database.

Table 4. Execution environment for the JRuby on Rails sample

JVM configuration
  JVM heap size: 512
  GC mode: Flat heap GC
Versions of software
  JRuby
  Apache Tomcat
  Warbler version
  MySQL version: Ver Distrib
  Apache JMeter

Table 5. JRuby execution modes

Interpreter: The JRuby runtime executes Ruby programs using only the interpreter written in Java, and no Ruby programs are compiled to Java bytecode.
JIT compilation: The JRuby runtime compiles frequently executed Ruby programs to Java bytecode. The default threshold is 50 executions. The current version of JRuby limits compilation to at most 4,096 Java classes (and our sample application did not reach this limit).
AOT compilation: The JRuby runtime compiles all of the Ruby programs before starting execution of the program. This mode may compile Ruby programs that are never executed.

JRuby has three modes for executing Ruby programs in a JVM: the interpreter mode, the JIT compilation mode, and the AOT compilation mode. Table 5 gives details on these modes.

Non-Java memory for JRuby on Rails using the AOT compilation mode

Figure 9 shows how the size of non-Java memory changes during the execution. In these measurements, we used the AOT compilation mode in JRuby. We measured the non-Java memory just after starting Tomcat, after the first access to the application, then every minute while fewer than 1,000 requests had been processed, and finally after processing the 1,000th request. We sent the requests from one client thread, except when testing concurrent accesses between five and six minutes.

[Figure 9. Non-Java memory breakdown of JRuby on Rails during execution. Horizontal axis: Started Tomcat, First accessed, 1 min to 7 min, After 1000 requests. Annotation: Adding another concurrent access resulted in large increase in class metadata.]

After the first access to the application, the size of class metadata increased from its initial 17.5. The reason for this increase is that JRuby compiles the Ruby programs into Java bytecode. Since the AOT compilation mode generates Java classes for all of the loaded Ruby programs before starting execution of the application, these classes account for the increase. Another large increase in class metadata, starting from 51.7, occurred between five and six minutes. It was caused by increasing the number of concurrent accesses from one to two. We observed that JRuby in the AOT mode compiled and loaded another RoR runtime when the number of simultaneous accesses increased. This duplicate compilation resulted in an increase of class metadata.

Difference in non-Java memory use based on the JRuby compilation mode

[Figure 10. Non-Java memory breakdown of JRuby on Rails for various numbers of concurrent accesses. Bars: Interpreter 1-thread; JIT 1-thread, 2-threads, 4-threads; AOT 1-thread, 2-threads, 4-threads.]

Figure 10 shows the non-Java memory of the same address book application in each of the three execution modes of JRuby. For the JIT and AOT compilation modes, we also changed the number of concurrent accesses to 1, 2, and 4 by changing the number of client threads in JMeter. We measured the non-Java memory use after processing 1,000 requests for each client thread.
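The JIT compilation mode in Table 5 is described by two numbers: an execution threshold (50 by default) and a cap on generated classes (4,096). A counter-based policy with those two parameters might look like the sketch below. It is not JRuby's code: the class `ThresholdCompilePolicy` and its methods are hypothetical, and the sketch assumes compilation is triggered on the execution that makes the count reach the threshold.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical counter-based compile policy; an illustration, not JRuby's implementation. */
final class ThresholdCompilePolicy {
    private final int threshold;    // executions before compiling, e.g. 50
    private final int maxCompiled;  // cap on compiled units, e.g. 4,096
    private final Map<String, Integer> counts = new HashMap<>();
    private final Set<String> compiled = new HashSet<>();

    ThresholdCompilePolicy(int threshold, int maxCompiled) {
        this.threshold = threshold;
        this.maxCompiled = maxCompiled;
    }

    /** Records one execution and returns true if this call triggers compilation. */
    boolean recordExecution(String method) {
        if (compiled.contains(method)) {
            return false; // already compiled, nothing more to do
        }
        int n = counts.merge(method, 1, Integer::sum);
        // Assumption: compile once the count reaches the threshold and the cap allows it.
        if (n >= threshold && compiled.size() < maxCompiled) {
            compiled.add(method);
            counts.remove(method);
            return true;
        }
        return false; // stays interpreted for now
    }

    int compiledCount() {
        return compiled.size();
    }

    public static void main(String[] args) {
        ThresholdCompilePolicy policy = new ThresholdCompilePolicy(50, 4096);
        int triggeredAt = -1;
        for (int i = 1; i <= 60 && triggeredAt < 0; i++) {
            if (policy.recordExecution("AddressBook#list")) {
                triggeredAt = i;
            }
        }
        System.out.println("compiled on execution " + triggeredAt);
    }
}
```

Under such a policy only the methods that are actually hot generate classes, which is consistent with the small difference between the interpreter and JIT modes reported in the text.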

Table 6. The number of Java classes generated by Ruby-to-Java compilation, by JRuby compilation mode (AOT, JIT) and number of concurrent accesses. (The upper row is the total number of generated classes, and the lower row shows the number of Ruby methods and blocks compiled.)

When we used one client thread, the sizes of the class metadata were 17.9, 24.8, and 51.5 in the interpreter, JIT, and AOT compilation modes, respectively. The difference between the JIT compilation mode and the interpreter mode was small because the JRuby runtime only compiled the methods executed more than the threshold of 50 times. In this measurement, the frequently executed methods were only those of the address book application itself, and most of the programs of the RoR runtime were not compiled. In fact, the number of compiled Ruby programs never reached the limit of 4,096. Since the code for this application is small, the differences were small, too. For the AOT compilation mode, the JRuby runtime compiled almost all of the RoR runtime. Thus, far more Ruby programs were compiled to Java bytecode, and that resulted in a larger class metadata area.

Increasing the number of concurrent accesses produced completely different results. For the JIT compilation mode, the size of class metadata was nearly constant even when the number of concurrent accesses increased. There are two reasons for this. The simple one is that the number of JIT compiled methods was small. The other one is that duplicate compilation occurred only for Ruby blocks, not for Ruby methods. In contrast, the size of the class metadata in the AOT compilation mode increased from 51.5 with one concurrent access to 83.0 with two, and increased further with four. The size increase was almost linear, around 30 for each concurrent access. The reason for this large increase is the way Ruby code is compiled to Java bytecode in JRuby. The same Ruby programs are compiled into Java bytecode for each concurrent access.
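The cost of generating the same code once per concurrent access follows from how the JVM identifies classes: a class generated in a different class loader is a different class, with its own metadata. The sketch below illustrates that general JVM behavior using `java.lang.reflect.Proxy` as a stand-in class generator. It is an assumption-laden illustration, not JRuby's compilation mechanism, and the class `PerLoaderClasses` is hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashSet;
import java.util.Set;

/** Illustrates that classes generated in separate loaders are distinct; not JRuby code. */
final class PerLoaderClasses {
    /** Generates a Runnable proxy class in the given loader and returns an instance of it. */
    private static Object newProxy(ClassLoader loader) {
        InvocationHandler noop = (proxy, method, args) -> null;
        return Proxy.newProxyInstance(loader, new Class<?>[] { Runnable.class }, noop);
    }

    /** One fresh class loader per simulated concurrent access. */
    static int distinctClasses(int loaders) {
        Set<Class<?>> classes = new HashSet<>();
        for (int i = 0; i < loaders; i++) {
            ClassLoader loader = new ClassLoader(PerLoaderClasses.class.getClassLoader()) { };
            classes.add(newProxy(loader).getClass());
        }
        return classes.size();
    }

    /** The same requests served through one shared class loader. */
    static int distinctClassesSameLoader(int requests) {
        ClassLoader shared = new ClassLoader(PerLoaderClasses.class.getClassLoader()) { };
        Set<Class<?>> classes = new HashSet<>();
        for (int i = 0; i < requests; i++) {
            classes.add(newProxy(shared).getClass());
        }
        return classes.size();
    }

    public static void main(String[] args) {
        System.out.println("separate loaders: " + distinctClasses(4) + " classes");
        System.out.println("shared loader:    " + distinctClassesSameLoader(4) + " classes");
    }
}
```

With separate loaders the number of generated classes grows with the number of loaders, while a shared loader reuses one class, which mirrors the contrast between the AOT and JIT rows discussed in the text.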
For example, four classes of the same (Tomcat_directory_name)/webapps/(Application_name)/WEB-INF/lib/jruby-complete jar /uri/http were loaded using four classloader instances.

Table 6 shows the numbers of classes generated by Ruby-to-Java compilation in JRuby and the numbers of Ruby-on-Rails methods and blocks. For the AOT compilation mode, the number of generated classes increases in proportion to the number of concurrent accesses, while the number of Ruby methods is almost constant. This is because JRuby in the AOT compilation mode generates separate classes for each concurrent session from the same Ruby code. In contrast, in the JIT compilation mode, the increase in the number of generated classes was very small. This is because JRuby in the JIT compilation mode only regenerates the classes for the blocks that are actually invoked in the concurrent sessions. Table 6 also shows that the number of Ruby methods and blocks is much larger in the AOT compilation mode. This is because the AOT compilation mode compiles most of the Ruby files in the RoR package during the initialization of the server.

6. Related Work

There have been numerous papers and reports analyzing the Java heap, so we will only review a few of the most important ones. Sun's Java Development Kit Version 1.2 introduced the Java Virtual Machine Profiler Interface (JVMPI), and included the HPROF agent, which interacts with the JVMPI to profile the use of the Java heap and the CPU [18]. For example, this agent can generate a heap allocation profile that shows the numbers and sizes in bytes of the allocated and live objects for each allocation site. The agent relates the allocation sites to the source code by tracking the dynamic stack traces that led to the allocations. The HPROF agent can also generate a complete heap dump to find unnecessary object retentions or memory leaks.
In JDK 5.0, the JVMPI was replaced by the Java Virtual Machine Tool Interface (JVMTI) [31], and the HPROF [26] agent was re-implemented in the JVMTI.

IBM Dump Analyzer for Java [4] analyzes the dump produced by a JVM, helping developers identify common problems such as out-of-memory conditions, deadlocks, and crashes. It provides the basic support for diagnosing memory problems, such as showing the statistics of live objects in the Java heap and of class metadata. The tool is available as a plug-in for the IBM Support Assistant (ISA) [11], a free software serviceability workbench.

Even if complete heap dumps are available and tooling is provided for viewing such dumps, diagnosing memory leaks is a significant challenge for developers. The Java Heap Analysis Tool, jhat, supports an SQL-like query language to query the heap dumps, and allows developers to browse heap dumps with Web browsers [30]. Beginning in JDK 6.0, jhat is included in the standard distribution. Mitchell and Sevitsky [19] proposed an automated and lightweight tool, LeakBot, for diagnosing memory leaks. It ranks data structures by their likelihood of containing leaks, identifies suspicious regions, characterizes the expected evolution of memory use, and tracks the actual evolution at run time. LeakBot is now incorporated into another tool named Memory Dump Diagnostic for Java (MDD4J) [23], which is also available as a plug-in for ISA. Jump and McKinley [16] proposed an accurate, scalable, online, and low-overhead leak detector called Cork. They introduced a new heap summarization technique based on types. They build a type points-from graph to summarize, identify, and report the data structures with systematic heap growth.

Mitchell and Sevitsky [20] analyzed the Java heap, focusing on the overhead of collections. They introduced a health signature to distinguish the roles of the bytes based on the roles of the objects in collections, and provided concise and application-neutral summaries of the heap usage. Kawachiya et al. [17] did another analysis of Java heaps, focusing on Java strings. Analyzing Java heap snapshots, they found that there are many identical strings, and proposed three different techniques to eliminate them, including one to "unify" the duplicates at garbage collection time.

Java's non-Java memory, also called Java's native heap, is not well described or documented. Chawla [7] provides a brief overview of how IBM's 32-bit Java virtual machine uses the address space in AIX, though the JVM version it covers can behave differently from IBM's Java 5 and Java 6 VMs. Hanik [10] describes the memory layout of a JVM process, and considers the causes of and solutions for out-of-memory errors.

When multiple Java applications run in a single machine, it is good to share class metadata among the Java processes. This helps reduce the startup time and the memory use of each Java application. Sun's JDK 5.0 introduces Class Data Sharing, building a shared archive from a set of classes in the system JAR files [25]. IBM's implementation of the 5.0 JVM takes this further, and allows all system and application classes to be stored in a persistent, shared cache [8].

7. Conclusion

We quantitatively analyzed the usage of non-Java memory for a wide range of Java applications.
Using a modified version of a production Java virtual machine for Linux, we verified that a Java application consumes a considerable amount of non-Java memory. We found the non-Java memory could become as large as the Java heap in many Java programs.

A Java virtual machine uses non-Java memory for various purposes. The non-Java memory holds shared libraries and class metadata, provides the work area for generating JIT-compiled code, and supplies the dynamic memory used to interact with the operating system. Although a plethora of memory problems affect the Java heap, similar problems can also appear in the non-Java memory. For example, an out-of-memory exception will be raised when the virtual machine loads or dynamically generates too many classes based on the requests from an application.

In time series analysis, we observed that the JIT work area caused significant fluctuations in the use of non-Java memory, because the JIT compiler intermittently requires large amounts of temporary memory for aggressive optimizations. We also observed that the libc memory management system (MMS) has a profound impact on the resident memory of non-Java memory, because it may retain the memory chunks freed by an upper-level MMS. This suggests that the layers of MMSes should be more carefully integrated. For example, the upper-level MMS may need a capability to force the libc MMS to return free memory to the OS-level MMS.

Modern Java virtual machines tend to use relatively more non-Java memory. For example, they may dynamically generate classes to optimize reflective invocations, while also allocating direct byte buffers to improve I/O performance. In addition, it is becoming popular to build scripting language runtimes on top of JVMs. Examples include JRuby, Jython, and Groovy. These runtimes often generate Java classes dynamically. Thus, we believe that properly understanding Java's non-Java memory will have increasing importance.
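As a concrete instance of the direct byte buffers mentioned above, the sketch below contrasts `ByteBuffer.allocate`, which is backed by an array in the Java heap, with `ByteBuffer.allocateDirect`, whose contents may reside outside the garbage-collected heap according to the `java.nio.ByteBuffer` documentation. The class name `DirectBufferDemo` and the 1 MiB size are arbitrary choices for illustration.

```java
import java.nio.ByteBuffer;

/** Contrasts heap-backed and direct buffers; the class name and size are illustrative. */
final class DirectBufferDemo {
    /** Allocates a buffer of the requested size, either direct or heap-backed. */
    static ByteBuffer allocate(int bytes, boolean direct) {
        return direct ? ByteBuffer.allocateDirect(bytes) : ByteBuffer.allocate(bytes);
    }

    public static void main(String[] args) {
        int size = 1 << 20; // 1 MiB, an arbitrary example size
        ByteBuffer heap = allocate(size, false);
        ByteBuffer direct = allocate(size, true);

        // A direct buffer counts toward non-Java memory rather than the Java heap.
        System.out.println("heap buffer direct?   " + heap.isDirect());
        System.out.println("direct buffer direct? " + direct.isDirect());
        System.out.println("capacity of each:     " + direct.capacity() + " bytes");
    }
}
```

An application that allocates many such buffers can therefore grow its process size without any visible growth in the Java heap, which is the kind of usage the analysis in this paper is meant to expose.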
Acknowledgments

We would like to thank Andrew Low, Trent Gray-Donald, Mark Stoodley and Marius Pirvu of IBM Canada for helpful discussions on this research.

References

[1] The Apache Software Foundation. Apache JMeter.
[2] The Apache Software Foundation. Apache Tomcat.
[3] Chris Bailey. Java technology, IBM style: Introduction to the IBM Developer Kit, j-ibmjava1.html
[4] Helen Beeken, Daniel Julin, Julie Stalley and Martin Trotter. Java diagnostics, IBM style, Part 1: Introducing the IBM Diagnostic and Monitoring Tools for Java - Dump Analyzer.
[5] John Berthels. Exmap memory analysis tool.
[6] Stephen M. Blackburn, et al. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. In Proceedings of the 21st ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA '06), pp ,
[7] Sumit Chawla. Getting more memory in AIX for your Java applications, aix4java1.html
[8] Ben Corrie. Java technology, IBM style: Class sharing,
[9] Nikola Grcevski, Allan Kielstra, Kevin Stoodley, Mark Stoodley, and Vijay Sundaresan. Java Just-In-Time Compiler and Virtual Machine Improvements for Server and Middleware Applications. In Proceedings of the 3rd USENIX Virtual Machine Research and Technology Symposium (VM '04), pp ,
[10] Filip Hanik. Inside the Java Virtual Machine,
[11] IBM Corporation. IBM Support Assistant.
[12] IBM Corporation. IBM Trade Performance Benchmark.
[13] IBM Corporation. WebSphere Application Server.
[14] JRuby: Java powered Ruby implementation.
[15] JRubyWiki. JRuby on Rails.
[16] Maria Jump and Kathryn S. McKinley. Cork: Dynamic Memory Leak Detection for Garbage-Collected Languages. In Proceedings of the 34th ACM Symposium on Principles of Programming Languages (POPL '07), pp ,
[17] Kiyokuni Kawachiya, Kazunori Ogata, and Tamiya Onodera. Analysis and Reduction of Memory Inefficiencies in Java Strings. In Proceedings of the 23rd ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA '08), pp ,
[18] Sheng Liang and Deepa Viswanathan. Comprehensive Profiling Support in the Java Virtual Machine. In Proceedings of the 5th USENIX Conference on Object-Oriented Technologies and Systems (COOTS '99), pp ,
[19] Nick Mitchell and Gary Sevitsky. LeakBot: An Automated and Lightweight Tool for Diagnosing Memory Leaks in Large Java Applications. In Proceedings of the 17th European Conference on Object-Oriented Programming (ECOOP '03), pp ,
[20] Nick Mitchell and Gary Sevitsky. The Causes of Bloat, The Limits of Health. In Proceedings of the 22nd ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA '07), pp ,
[21] MySQL: The world's most popular open source database.
[22] Oracle. JRockit.
[23] Indrajit Poddar and Robbie John Minshall. Memory leak detection and analysis in WebSphere Application Server: Part 1: Overview of memory leaks. techarticles/0606_poddar/0606_poddar.html
[24] Ruby on Rails.
[25] Sun Microsystems. Class Data Sharing.
[26] Sun Microsystems. HPROF: A Heap/CPU Profiling Tool in J2SE /HPROF.html
[27] Sun Microsystems. Java 2 Platform, Standard Edition v 1.4 Performance and Scalability Guide.
[28] Sun Microsystems. Java API reference, java.nio.bytebuffer. ml
[29] Sun Microsystems. Java SE HotSpot at a Glance.
[30] Sun Microsystems. jhat - Java Heap Analysis Tool. ml
[31] Sun Microsystems. JVM Tool Interface (JVM TI).
[32] Warbler

Appendix: All results for DaCapo

We show the experimental results for each of the programs in DaCapo. We measured them in the same environment described in Table

[Figure 11. antlr]
[Figure 12. bloat]
[Figure 13. chart]
[Figure 14. eclipse]
[Figure 15. fop]
[Figure 16. hsqldb]
[Figure 17. jython]
[Figure 18. luindex]
[Figure 19. lusearch]
[Figure 20. pmd]
[Figure 21. xalan]


xtc Robert Grimm Making C Safely Extensible New York University

xtc Robert Grimm Making C Safely Extensible New York University xtc Making C Safely Extensible Robert Grimm New York University The Problem Complexity of modern systems is staggering Increasingly, a seamless, global computing environment System builders continue to

More information

JAVA PERFORMANCE. PR SW2 S18 Dr. Prähofer DI Leopoldseder

JAVA PERFORMANCE. PR SW2 S18 Dr. Prähofer DI Leopoldseder JAVA PERFORMANCE PR SW2 S18 Dr. Prähofer DI Leopoldseder OUTLINE 1. What is performance? 1. Benchmarking 2. What is Java performance? 1. Interpreter vs JIT 3. Tools to measure performance 4. Memory Performance

More information

Java On Steroids: Sun s High-Performance Java Implementation. History

Java On Steroids: Sun s High-Performance Java Implementation. History Java On Steroids: Sun s High-Performance Java Implementation Urs Hölzle Lars Bak Steffen Grarup Robert Griesemer Srdjan Mitrovic Sun Microsystems History First Java implementations: interpreters compact

More information

Gplus Adapter 6.1. Gplus Adapter for WFM. Hardware and Software Requirements

Gplus Adapter 6.1. Gplus Adapter for WFM. Hardware and Software Requirements Gplus Adapter 6.1 Gplus Adapter for WFM Hardware and Software Requirements The information contained herein is proprietary and confidential and cannot be disclosed or duplicated without the prior written

More information

Last class: OS and Architecture. OS and Computer Architecture

Last class: OS and Architecture. OS and Computer Architecture Last class: OS and Architecture OS and Computer Architecture OS Service Protection Interrupts System Calls IO Scheduling Synchronization Virtual Memory Hardware Support Kernel/User Mode Protected Instructions

More information

Last class: OS and Architecture. Chapter 3: Operating-System Structures. OS and Computer Architecture. Common System Components

Last class: OS and Architecture. Chapter 3: Operating-System Structures. OS and Computer Architecture. Common System Components Last class: OS and Architecture Chapter 3: Operating-System Structures System Components Operating System Services System Calls System Programs System Structure Virtual Machines System Design and Implementation

More information

HANA Performance. Efficient Speed and Scale-out for Real-time BI

HANA Performance. Efficient Speed and Scale-out for Real-time BI HANA Performance Efficient Speed and Scale-out for Real-time BI 1 HANA Performance: Efficient Speed and Scale-out for Real-time BI Introduction SAP HANA enables organizations to optimize their business

More information

Java Garbage Collector Performance Measurements

Java Garbage Collector Performance Measurements WDS'09 Proceedings of Contributed Papers, Part I, 34 40, 2009. ISBN 978-80-7378-101-9 MATFYZPRESS Java Garbage Collector Performance Measurements P. Libič and P. Tůma Charles University, Faculty of Mathematics

More information

BMC Remedy OnDemand

BMC Remedy OnDemand BMC Remedy OnDemand 2011.01 Bandwidth usage and latency benchmark results Page 1 TABLE OF CONTENTS Executive summary... 3 Test environment... 4 Scenarios... 5 Workload... 5 Data volume... 9 Results...

More information

Full file at

Full file at Import Settings: Base Settings: Brownstone Default Highest Answer Letter: D Multiple Keywords in Same Paragraph: No Chapter: Chapter 2 Multiple Choice 1. A is an example of a systems program. A) command

More information

Product Guide. McAfee Performance Optimizer 2.2.0

Product Guide. McAfee Performance Optimizer 2.2.0 Product Guide McAfee Performance Optimizer 2.2.0 COPYRIGHT Copyright 2017 McAfee, LLC TRADEMARK ATTRIBUTIONS McAfee and the McAfee logo, McAfee Active Protection, epolicy Orchestrator, McAfee epo, McAfee

More information

1 of 8 14/12/2013 11:51 Tuning long-running processes Contents 1. Reduce the database size 2. Balancing the hardware resources 3. Specifying initial DB2 database settings 4. Specifying initial Oracle database

More information

Solution overview VISUAL COBOL BUSINESS CHALLENGE SOLUTION OVERVIEW BUSINESS BENEFIT

Solution overview VISUAL COBOL BUSINESS CHALLENGE SOLUTION OVERVIEW BUSINESS BENEFIT BUSINESS CHALLENGE There is an increasing demand from users of business software for easier to use applications which integrate with other business systems. As a result IT organizations are being asked

More information

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24 FILE SYSTEMS, PART 2 CS124 Operating Systems Fall 2017-2018, Lecture 24 2 Last Time: File Systems Introduced the concept of file systems Explored several ways of managing the contents of files Contiguous

More information

Garbage Collection (2) Advanced Operating Systems Lecture 9

Garbage Collection (2) Advanced Operating Systems Lecture 9 Garbage Collection (2) Advanced Operating Systems Lecture 9 Lecture Outline Garbage collection Generational algorithms Incremental algorithms Real-time garbage collection Practical factors 2 Object Lifetimes

More information

Zing Vision. Answering your toughest production Java performance questions

Zing Vision. Answering your toughest production Java performance questions Zing Vision Answering your toughest production Java performance questions Outline What is Zing Vision? Where does Zing Vision fit in your Java environment? Key features How it works Using ZVRobot Q & A

More information

Ryan Sciampacone Senior Software Developer August 1 st Multitenant JVM. JVM Languages Summit IBM Corporation

Ryan Sciampacone Senior Software Developer August 1 st Multitenant JVM. JVM Languages Summit IBM Corporation Ryan Sciampacone Senior Software Developer August 1 st 2012 Multitenant JVM JVM Languages Summit 2012 Important Disclaimers THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL

More information

Method-Level Phase Behavior in Java Workloads

Method-Level Phase Behavior in Java Workloads Method-Level Phase Behavior in Java Workloads Andy Georges, Dries Buytaert, Lieven Eeckhout and Koen De Bosschere Ghent University Presented by Bruno Dufour dufour@cs.rutgers.edu Rutgers University DCS

More information

Older geometric based addressing is called CHS for cylinder-head-sector. This triple value uniquely identifies every sector.

Older geometric based addressing is called CHS for cylinder-head-sector. This triple value uniquely identifies every sector. Review: On Disk Structures At the most basic level, a HDD is a collection of individually addressable sectors or blocks that are physically distributed across the surface of the platters. Older geometric

More information

ECE 598 Advanced Operating Systems Lecture 10

ECE 598 Advanced Operating Systems Lecture 10 ECE 598 Advanced Operating Systems Lecture 10 Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 17 February 2015 Announcements Homework #1 and #2 grades, HW#3 Coming soon 1 Various

More information

WebOTX Batch Server. November, NEC Corporation, Cloud Platform Division, WebOTX Group

WebOTX Batch Server. November, NEC Corporation, Cloud Platform Division, WebOTX Group WebOTX Batch Server November, 2015 NEC Corporation, Cloud Platform Division, WebOTX Group Index 1. Product Overview 2. Solution with WebOTX Batch Server 3. WebOTX Batch Server V8.4 enhanced features 4.

More information

Introduction to Java Programming

Introduction to Java Programming Introduction to Java Programming Lecture 1 CGS 3416 Spring 2017 1/9/2017 Main Components of a computer CPU - Central Processing Unit: The brain of the computer ISA - Instruction Set Architecture: the specific

More information

Heimdall Data Access Platform Installation and Setup Guide

Heimdall Data Access Platform Installation and Setup Guide Heimdall Data Access Platform Installation and Setup Guide Heimdall Data Access Platform Installation and Setup Guide Heimdall Data Access Platform Installation and Setup Guide 1. General Information 1

More information

SAY-Go: Towards Transparent and Seamless Storage-As-You-Go with Persistent Memory

SAY-Go: Towards Transparent and Seamless Storage-As-You-Go with Persistent Memory SAY-Go: Towards Transparent and Seamless Storage-As-You-Go with Persistent Memory Hyeonho Song, Sam H. Noh UNIST HotStorage 2018 Contents Persistent Memory Motivation SAY-Go Design Implementation Evaluation

More information

Oracle Database 10g The Self-Managing Database

Oracle Database 10g The Self-Managing Database Oracle Database 10g The Self-Managing Database Benoit Dageville Oracle Corporation benoit.dageville@oracle.com Page 1 1 Agenda Oracle10g: Oracle s first generation of self-managing database Oracle s Approach

More information

Notes of the course - Advanced Programming. Barbara Russo

Notes of the course - Advanced Programming. Barbara Russo Notes of the course - Advanced Programming Barbara Russo a.y. 2014-2015 Contents 1 Lecture 2 Lecture 2 - Compilation, Interpreting, and debugging........ 2 1.1 Compiling and interpreting...................

More information

Name, Scope, and Binding. Outline [1]

Name, Scope, and Binding. Outline [1] Name, Scope, and Binding In Text: Chapter 3 Outline [1] Variable Binding Storage bindings and lifetime Type bindings Type Checking Scope Lifetime vs. Scope Referencing Environments N. Meng, S. Arthur 2

More information

Use of profilers for studying Java dynamic optimizations

Use of profilers for studying Java dynamic optimizations Use of profilers for studying Java dynamic optimizations Kevin Arhelger, Fernando Trinciante, Elena Machkasova Computer Science Discipline University of Minnesota Morris Morris MN, 56267 arhel005@umn.edu,

More information

JVM Performance Study Comparing Java HotSpot to Azul Zing Using Red Hat JBoss Data Grid

JVM Performance Study Comparing Java HotSpot to Azul Zing Using Red Hat JBoss Data Grid JVM Performance Study Comparing Java HotSpot to Azul Zing Using Red Hat JBoss Data Grid Legal Notices JBoss, Red Hat and their respective logos are trademarks or registered trademarks of Red Hat, Inc.

More information

CS5015 Object-oriented Software Development. Lecture: Overview of Java Platform. A. O Riordan, 2010 Most recent revision, 2014 updated for Java 8

CS5015 Object-oriented Software Development. Lecture: Overview of Java Platform. A. O Riordan, 2010 Most recent revision, 2014 updated for Java 8 CS5015 Object-oriented Software Development Lecture: Overview of Java Platform A. O Riordan, 2010 Most recent revision, 2014 updated for Java 8 Java Programming Language Java is an object-oriented programming

More information

Cross-Layer Memory Management to Reduce DRAM Power Consumption

Cross-Layer Memory Management to Reduce DRAM Power Consumption Cross-Layer Memory Management to Reduce DRAM Power Consumption Michael Jantz Assistant Professor University of Tennessee, Knoxville 1 Introduction Assistant Professor at UT since August 2014 Before UT

More information

Oracle Developer Studio 12.6

Oracle Developer Studio 12.6 Oracle Developer Studio 12.6 Oracle Developer Studio is the #1 development environment for building C, C++, Fortran and Java applications for Oracle Solaris and Linux operating systems running on premises

More information

Hardware-Supported Pointer Detection for common Garbage Collections

Hardware-Supported Pointer Detection for common Garbage Collections 2013 First International Symposium on Computing and Networking Hardware-Supported Pointer Detection for common Garbage Collections Kei IDEUE, Yuki SATOMI, Tomoaki TSUMURA and Hiroshi MATSUO Nagoya Institute

More information

Four Components of a Computer System

Four Components of a Computer System Four Components of a Computer System Operating System Concepts Essentials 2nd Edition 1.1 Silberschatz, Galvin and Gagne 2013 Operating System Definition OS is a resource allocator Manages all resources

More information

Run-time Program Management. Hwansoo Han

Run-time Program Management. Hwansoo Han Run-time Program Management Hwansoo Han Run-time System Run-time system refers to Set of libraries needed for correct operation of language implementation Some parts obtain all the information from subroutine

More information

Introduction to Java. Lecture 1 COP 3252 Summer May 16, 2017

Introduction to Java. Lecture 1 COP 3252 Summer May 16, 2017 Introduction to Java Lecture 1 COP 3252 Summer 2017 May 16, 2017 The Java Language Java is a programming language that evolved from C++ Both are object-oriented They both have much of the same syntax Began

More information

1. Introduction. Java. Fall 2009 Instructor: Dr. Masoud Yaghini

1. Introduction. Java. Fall 2009 Instructor: Dr. Masoud Yaghini 1. Introduction Java Fall 2009 Instructor: Dr. Masoud Yaghini Outline Introduction Introduction The Java Programming Language The Java Platform References Java technology Java is A high-level programming

More information

What a Year! Java 10 and 10 Big Java Milestones

What a Year! Java 10 and 10 Big Java Milestones What a Year! Java 10 and 10 Big Java Milestones Java has made tremendous strides in the past 12 months, with exciting new features and capabilities for developers of all kinds. Table of Contents INTRODUCTION

More information

Towards Garbage Collection Modeling

Towards Garbage Collection Modeling Towards Garbage Collection Modeling Peter Libič Petr Tůma Department of Distributed and Dependable Systems Faculty of Mathematics and Physics, Charles University Malostranské nám. 25, 118 00 Prague, Czech

More information

VIProf: A Vertically Integrated Full-System Profiler

VIProf: A Vertically Integrated Full-System Profiler VIProf: A Vertically Integrated Full-System Profiler NGS Workshop, April 2007 Hussam Mousa Chandra Krintz Lamia Youseff Rich Wolski RACELab Research Dynamic software adaptation As program behavior or resource

More information

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation Weiping Liao, Saengrawee (Anne) Pratoomtong, and Chuan Zhang Abstract Binary translation is an important component for translating

More information

J2EE Development Best Practices: Improving Code Quality

J2EE Development Best Practices: Improving Code Quality Session id: 40232 J2EE Development Best Practices: Improving Code Quality Stuart Malkin Senior Product Manager Oracle Corporation Agenda Why analyze and optimize code? Static Analysis Dynamic Analysis

More information

Server Status Dashboard

Server Status Dashboard The Cisco Prime Network Registrar server status dashboard in the web user interface (web UI) presents a graphical view of the system status, using graphs, charts, and tables, to help in tracking and diagnosis.

More information