MethodHandlesArrayElementGetterBench.testCreate Analysis. Copyright 2016, Oracle and/or its affiliates. All rights reserved.

MethodHandlesArrayElementGetterBench.testCreate Analysis

Overview

Benchmark: nom.indy.MethodHandlesArrayElementGetterBench.testCreate

Results with JDK8 (ops/us):

  Configuration                    ops/us   Note
  Intel                            234
  T8                               90       Intel is 2.5x this
  T8 with -XX:FreqInlineSize=325   189      Intel is 1.2x this

-XX:FreqInlineSize=<size>: integer specifying the maximum bytecode size, in bytes, of a frequently executed method that gets inlined.
- Default value for Intel is 325
- Default value for SPARC is 175
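To confirm which FreqInlineSize value a given JVM is actually running with, the flag can be read back at run time. The following is a minimal sketch, assuming a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name FreqInlineSizeProbe and the helper flagValue are hypothetical names chosen for this illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class FreqInlineSizeProbe {
    /** Returns the current value of a HotSpot flag, or null if this JVM does not expose it. */
    public static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? null : bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Either the flag is unknown or this is not a HotSpot JVM (assumption: treat both as "absent").
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("FreqInlineSize = " + flagValue("FreqInlineSize"));
    }
}
```

Running it once with no options and once with -XX:FreqInlineSize=325 on the command line should show the platform default and the overridden value respectively.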

MethodHandlesArrayElementGetterBench.testCreate Benchmark source code

    @Benchmark
    public MethodHandle testCreate() {
        return MethodHandles.arrayElementGetter(int[].class);
    }
    /* Call the same function with the same argument multiple times */

    MethodHandle arrayElementGetter(Class<?> arrayClass) throws IllegalArgumentException {
        return MethodHandleImpl.makeArrayElementAccessor(arrayClass, false);
    }
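For context, the handle the benchmark creates is an ordinary array-read function of type (int[],int)int. A small sketch of how such a handle is used follows; the class name ArrayGetterDemo and its helper methods are hypothetical and are not part of the benchmark.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ArrayGetterDemo {
    /** Reads array[index] through the handle returned by arrayElementGetter. */
    public static int readElement(int[] array, int index) {
        MethodHandle getter = MethodHandles.arrayElementGetter(int[].class);
        try {
            // invokeExact requires the call site to match (int[],int)int exactly.
            return (int) getter.invokeExact(array, index);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    /** The method type of the getter handle for int[]. */
    public static MethodType getterType() {
        return MethodHandles.arrayElementGetter(int[].class).type();
    }

    public static void main(String[] args) {
        System.out.println(getterType());                              // (int[],int)int
        System.out.println(readElement(new int[] {10, 20, 30}, 1));    // 20
    }
}
```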

MethodHandlesArrayElementGetterBench.testCreate Benchmark source code

    static MethodHandle makeArrayElementAccessor(Class<?> arrayClass, boolean isSetter) {
        if (arrayClass == Object[].class)
            return (isSetter ? ArrayAccessor.OBJECT_ARRAY_SETTER : ArrayAccessor.OBJECT_ARRAY_GETTER);
        if (!arrayClass.isArray())
            throw newIllegalArgumentException("not an array: " + arrayClass);
        MethodHandle[] cache = ArrayAccessor.TYPED_ACCESSORS.get(arrayClass);
        int cacheIndex = (isSetter ? ArrayAccessor.SETTER_INDEX : ArrayAccessor.GETTER_INDEX);
        MethodHandle mh = cache[cacheIndex];
        if (mh != null) return mh;
        mh = ArrayAccessor.getAccessor(arrayClass, isSetter);
        MethodType correctType = ArrayAccessor.correctType(arrayClass, isSetter);
        if (mh.type() != correctType) {
            assert(mh.type().parameterType(0) == Object[].class);
            assert((isSetter ? mh.type().parameterType(2) : mh.type().returnType()) == Object.class);
            assert(isSetter || correctType.parameterType(0).getComponentType() == correctType.returnType());
            // safe to view non-strictly, because element type follows from array type
            mh = mh.viewAsType(correctType, false);
        }
        mh = makeIntrinsic(mh, (isSetter ? Intrinsic.ARRAY_STORE : Intrinsic.ARRAY_LOAD));
        // Atomically update accessor cache.
        synchronized(cache) {
            if (cache[cacheIndex] == null) {
                cache[cacheIndex] = mh;
            } else {
                // Throw away newly constructed accessor and use cached version.
                mh = cache[cacheIndex];
            }
        }
        return mh;
    }

Source: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/lang/invoke/methodhandleimpl.java#methodhandleimpl.makearrayelementaccessor%28java.lang.class%2cboolean%29
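Because the benchmark always asks for the same array class, every call after the first should take the early "cached" return in the method above. The caching pattern itself (unsynchronized read, build outside the lock, publish under the lock) can be sketched in isolation as below. This is an illustrative stand-in, not the JDK code: AccessorCacheSketch, its two-slot array, and the Supplier-based factory are all assumptions made for the example.

```java
import java.util.function.Supplier;

class AccessorCacheSketch {
    public static final int GETTER_INDEX = 0;
    public static final int SETTER_INDEX = 1;

    // One slot per accessor kind, mirroring the two-element cache in the slide's code.
    private final Object[] cache = new Object[2];

    /** Returns the cached value for the slot, building it with the factory only if the slot is empty. */
    public Object get(int index, Supplier<Object> factory) {
        Object cached = cache[index];
        if (cached != null) return cached;   // fast path: this is what a repeated call hits

        Object fresh = factory.get();        // slow path: construct outside the lock
        synchronized (cache) {
            if (cache[index] == null) {
                cache[index] = fresh;        // first writer publishes its value
            } else {
                fresh = cache[index];        // lost the race: discard ours, use the cached one
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        AccessorCacheSketch sketch = new AccessorCacheSketch();
        Object first = sketch.get(GETTER_INDEX, Object::new);
        Object second = sketch.get(GETTER_INDEX, Object::new);
        System.out.println(first == second); // true
    }
}
```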

Intel disassembly (JDK8)
[Figure: disassembly listings, before inline optimization and with inline optimization]

T8 disassembly (JDK8)
[Figure: disassembly listings, default and after forcing inline optimization]
- Default: does not do inline optimization; invokes call and ret. The method being called is the hot spot and is not optimized. It has redundant branch instructions.

Analysis
- Inlining a method provides more opportunity to optimize.
- Inlining saves the call, ret and stack allocation, and also uses fewer branches.
- -XX:FreqInlineSize controls the maximum size of a method that can be inlined; the default value for x86 is 325, for SPARC it is 175.
- The hot-spot method in this case is 285 bytes, so x86 used inline optimization and SPARC did not.
- Adding -XX:FreqInlineSize=325 on T8 doubled the performance.
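The size comparison behind this analysis can be written out directly. The sketch below models only the single check discussed here (bytecode size against FreqInlineSize); HotSpot's real inlining policy applies further criteria, so the class InlineThresholdSketch and its method are a deliberate simplification, not the JVM's implementation.

```java
class InlineThresholdSketch {
    /** Simplified model: a hot method is an inlining candidate if its bytecode size fits the limit. */
    public static boolean fitsFrequentInlineLimit(int bytecodeSizeBytes, int freqInlineSize) {
        return bytecodeSizeBytes <= freqInlineSize;
    }

    public static void main(String[] args) {
        int hotMethodSize = 285; // bytes, as reported in the analysis
        System.out.println("x86 default (325):   " + fitsFrequentInlineLimit(hotMethodSize, 325)); // true
        System.out.println("SPARC default (175): " + fitsFrequentInlineLimit(hotMethodSize, 175)); // false
    }
}
```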

SPARC: -XX:FreqInlineSize=325 vs. -XX:FreqInlineSize=175

Run on java 1.8.0_77

  -XX:FreqInlineSize=325     # of benchmarks   Percentage
  Performance gain >= 0      859               73.2%
  Performance gain > 10%     153               13.0%
  Performance loss           314               26.7%
  Performance loss > 10%     59                5.0%

All Benchmark Performance Gain Chart

  Performance gain   # of benchmarks
  -0.5 to -0.4       1
  -0.4 to -0.3       5
  -0.3 to -0.2       10
  -0.2 to -0.1       43
  -0.1 to 0          255
  0 to 0.1           706
  0.1 to 0.2         49
  0.2 to 0.3         21
  0.3 to 0.4         20
  0.4 to 0.5         13
  0.5 to 0.6         10
  0.6 to 0.7         6
  0.7 to 0.8         20
  0.8 to 0.9         6
  0.9 to 1           3
  1 to 1.1           1
  1.1 to 1.2         1
  1.4 to 1.5         1
  1.5 to 1.6         1
  1.9 to 2           1
  Grand Total        1173

Intel: -XX:FreqInlineSize=175 vs. -XX:FreqInlineSize=325

Run on java 1.8.0_77

  -XX:FreqInlineSize=325     # of benchmarks   Percentage
  Performance gain >= 0      734               62.6%
  Performance gain > 10%     55                4.7%
  Performance loss           439               37.4%
  Performance loss > 10%     39                3.3%

All Benchmark Performance Gain Chart

  Performance gain   # of benchmarks
  -0.6 to -0.5       3
  -0.4 to -0.3       2
  -0.3 to -0.2       10
  -0.2 to -0.1       24
  -0.1 to 0          400
  0 to 0.1           679
  0.1 to 0.2         28
  0.2 to 0.3         11
  0.3 to 0.4         5
  0.4 to 0.5         1
  0.5 to 0.6         2
  0.6 to 0.7         1
  0.8 to 0.9         1
  1 to 1.1           1
  1.6 to 1.7         2
  1.7 to 1.8         2
  2.8 to 2.9         1
  Grand Total        1173
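The summary rows on the SPARC and Intel slides follow from the histogram bins: everything below zero counts as a loss, everything from zero up as a gain, and the bins adjacent to zero separate the "within 10%" cases from the rest. The sketch below re-derives the counts from the bin values quoted on the slides; the class GainHistogramCheck and its helpers are hypothetical names for this check.

```java
class GainHistogramCheck {
    // Bin counts copied from the slides, ordered from most negative to most positive.
    // The last LOSS bin is "-0.1 to 0"; the first GAIN bin is "0 to 0.1".
    public static final int[] SPARC_LOSS = {1, 5, 10, 43, 255};
    public static final int[] SPARC_GAIN = {706, 49, 21, 20, 13, 10, 6, 20, 6, 3, 1, 1, 1, 1, 1};
    public static final int[] INTEL_LOSS = {3, 2, 10, 24, 400};
    public static final int[] INTEL_GAIN = {679, 28, 11, 5, 1, 2, 1, 1, 1, 2, 2, 1};

    public static int sum(int[] bins) {
        int total = 0;
        for (int count : bins) total += count;
        return total;
    }

    /** Benchmarks that lost more than 10%: all loss bins except the one adjacent to zero. */
    public static int lossOver10(int[] lossBins) {
        return sum(lossBins) - lossBins[lossBins.length - 1];
    }

    /** Benchmarks that gained more than 10%: all gain bins except the one adjacent to zero. */
    public static int gainOver10(int[] gainBins) {
        return sum(gainBins) - gainBins[0];
    }

    public static void main(String[] args) {
        System.out.println("SPARC gain/loss: " + sum(SPARC_GAIN) + " / " + sum(SPARC_LOSS));
        System.out.println("SPARC >10% gain/loss: " + gainOver10(SPARC_GAIN) + " / " + lossOver10(SPARC_LOSS));
        System.out.println("Intel gain/loss: " + sum(INTEL_GAIN) + " / " + sum(INTEL_LOSS));
        System.out.println("Intel >10% gain/loss: " + gainOver10(INTEL_GAIN) + " / " + lossOver10(INTEL_LOSS));
    }
}
```

On both platforms the gain and loss counts add up to the grand total of 1173 benchmarks.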

Performance Gain for Each Benchmark on SPARC
Performance Gain for Each Benchmark on x86
[Charts. X axis: benchmark ID; Y axis: performance gain]

Comparison Analysis
- x86: with -XX:FreqInlineSize=325, 62.6% of benchmarks improved and 37.4% lost performance.
- SPARC: with -XX:FreqInlineSize=325, 73.2% of benchmarks improved and 26.7% lost performance.

Conclusion
Should -XX:FreqInlineSize=325 be made the default JVM configuration for SPARC?