The Z Garbage Collector An Introduction

Similar documents
The Z Garbage Collector Low Latency GC for OpenJDK

The Z Garbage Collector Scalable Low-Latency GC in JDK 11

JDK 9/10/11 and Garbage Collection

Shenandoah: Theory and Practice. Christine Flood Roman Kennke Principal Software Engineers Red Hat

The G1 GC in JDK 9. Erik Duveblad Senior Member of Technical Staf Oracle JVM GC Team October, 2017

The C4 Collector. Or: the Application memory wall will remain until compaction is solved. Gil Tene Balaji Iyengar Michael Wolf

Pause-Less GC for Improving Java Responsiveness. Charlie Gracie IBM Senior Software charliegracie

2011 Oracle Corporation and Affiliates. Do not re-distribute!

A new Mono GC. Paolo Molaro October 25, 2006

Concurrent Garbage Collection

NUMA in High-Level Languages. Patrick Siegler Non-Uniform Memory Architectures Hasso-Plattner-Institut

Shenandoah: An ultra-low pause time garbage collector for OpenJDK. Christine Flood Principal Software Engineer Red Hat

Scaling the OpenJDK. Claes Redestad Java SE Performance Team Oracle. Copyright 2017, Oracle and/or its afliates. All rights reserved.

Ahead of Time (AOT) Compilation

Shenandoah: An ultra-low pause time garbage collector for OpenJDK. Christine Flood Roman Kennke Principal Software Engineers Red Hat

Java Performance Tuning and Optimization Student Guide

JAVA PERFORMANCE. PR SW2 S18 Dr. Prähofer DI Leopoldseder

Harmony GC Source Code

Hierarchical PLABs, CLABs, TLABs in Hotspot

Welcome to the session...

JVM Troubleshooting MOOC: Troubleshooting Memory Issues in Java Applications

Optimising Multicore JVMs. Khaled Alnowaiser

G1 Garbage Collector Details and Tuning. Simone Bordet

JVM Performance Study Comparing Java HotSpot to Azul Zing Using Red Hat JBoss Data Grid

Java Without the Jitter

New Java performance developments: compilation and garbage collection

High Performance Managed Languages. Martin Thompson

Shenandoah An ultra-low pause time Garbage Collector for OpenJDK. Christine H. Flood Roman Kennke

Real Time: Understanding the Trade-offs Between Determinism and Throughput

TRASH DAY: COORDINATING GARBAGE COLLECTION IN DISTRIBUTED SYSTEMS

OS-caused Long JVM Pauses - Deep Dive and Solutions

Garbage Collection. Hwansoo Han

SPECjbb2005. Alan Adamson, IBM Canada David Dagastine, Sun Microsystems Stefan Sarne, BEA Systems

Java On Steroids: Sun s High-Performance Java Implementation. History

6.828: OS/Language Co-design. Adam Belay

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

JamaicaVM Java for Embedded Realtime Systems

High-Level Language VMs

Managed runtimes & garbage collection

Hard Real-Time Garbage Collection in Java Virtual Machines

Java Application Performance Tuning for AMD EPYC Processors

Copyright 2014 Oracle and/or its affiliates. All rights reserved.

JVM Performance Study Comparing Oracle HotSpot and Azul Zing Using Apache Cassandra

TECHNOLOGY WHITE PAPER. Azul Pauseless Garbage Collection. Providing continuous, pauseless operation for Java applications

Attila Szegedi, Software

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

Finally! Real Java for low latency and low jitter

Azul Pauseless Garbage Collection

Modern and Fast: A New Wave of Database and Java in the Cloud. Joost Pronk Van Hoogeveen Lead Product Manager, Oracle

Java in a World of Containers

Enabling Java in Latency Sensitive Environments

JVM Memory Model and GC

Scaling Up Performance Benchmarking

Low latency & Mechanical Sympathy: Issues and solutions

IN-PERSISTENT-MEMORY COMPUTING WITH JAVA ERIC KACZMAREK INTEL CORPORATION

Implementation Garbage Collection

Runtime Application Self-Protection (RASP) Performance Metrics

Efficient Runtime Tracking of Allocation Sites in Java

Using Transparent Compression to Improve SSD-based I/O Caches

OpenJDK Adoption Group

What's new in MySQL 5.5? Performance/Scale Unleashed

Oracle JD Edwards EnterpriseOne Object Usage Tracking Performance Characterization Using JD Edwards EnterpriseOne Object Usage Tracking

End-to-End Java Security Performance Enhancements for Oracle SPARC Servers Performance engineering for a revenue product

Java in a World of Containers

Workload Characterization and Optimization of TPC-H Queries on Apache Spark

Runtime. The optimized program is ready to run What sorts of facilities are available at runtime

PERFORMANCE ANALYSIS AND OPTIMIZATION OF SKIP LISTS FOR MODERN MULTI-CORE ARCHITECTURES

A RESTful Java Framework for Asynchronous High-Speed Ingest

Automatic Memory Management

Jaguar: Enabling Efficient Communication and I/O in Java

Garbage Collection Algorithms. Ganesh Bikshandi

Truffle A language implementation framework

New Features Overview

Garbage Collection. Akim D le, Etienne Renault, Roland Levillain. May 15, CCMP2 Garbage Collection May 15, / 35

Understanding Garbage Collection

Acknowledgements These slides are based on Kathryn McKinley s slides on garbage collection as well as E Christopher Lewis s slides

Hardware-Supported Pointer Detection for common Garbage Collections

Chapter 1 GETTING STARTED. SYS-ED/ Computer Education Techniques, Inc.

Run Anywhere. The Hardware Platform Perspective. Ben Pollan, AMD Java Labs October 28, 2008

About Terracotta Ehcache. Version 10.1

Dynamic Vertical Memory Scalability for OpenJDK Cloud Applications

High Performance Managed Languages. Martin Thompson

<Insert Picture Here> Symmetric multilanguage VM architecture: Running Java and JavaScript in Shared Environment on a Mobile Phone

Kotlin/Native concurrency model. nikolay

MySQL InnoDB Cluster. MySQL HA Made Easy! Miguel Araújo Senior Software Developer MySQL Middleware and Clients. FOSDEM 18 - February 04, 2018

Lesson 2 Dissecting Memory Problems

JVM and application bottlenecks troubleshooting

Java ME Directions. JCP F2F - Austin. Florian Tournier - Oracle May 9, Copyright 2017, Oracle and/or its affiliates. All rights reserved.

Using Intel VTune Amplifier XE and Inspector XE in.net environment

Memory management has always involved tradeoffs between numerous optimization possibilities: Schemes to manage problem fall into roughly two camps

Oracle WebCenter Portal Performance Tuning

Lecture 15 Garbage Collection

NOSQL DATABASE CLOUD SERVICE. Flexible Data Models. Zero Administration. Automatic Scaling.

What every DBA needs to know about JDBC connection pools Bridging the language barrier between DBA and Middleware Administrators

Myths and Realities: The Performance Impact of Garbage Collection

Safe Harbor Statement

SPECjbb2015 Benchmark User Guide

Using the Singularity Research Development Kit

Application Container Cloud

Kodewerk. Java Performance Services. The War on Latency. Reducing Dead Time Kirk Pepperdine Principle Kodewerk Ltd.

Transcription:

The Z Garbage Collector An Introduction Per Lidén & Stefan Karlsson HotSpot Garbage Collection Team FOSDEM 2018

Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle s products remains at the sole discretion of Oracle. 2

Agenda 1 2 3 4 5 What is ZGC? Some Numbers A Peek Under The Hood Going Forward How To Get Started 3

Agenda 1 2 3 4 5 What is ZGC? Some Numbers A Peek Under The Hood Going Forward How To Get Started 4

A Scalable Low Latency Garbage Collector 5

Goals TB Multi-terabyte heaps 10ms Max GC pause time Lay the foundation for future GC features 15% Max application throughput reduction 6

GC pause times do not increase with heap or live-set size 7

At a Glance New garbage collector Load barriers Colored pointers Single generation Partial compaction Region-based Immediate memory reuse NUMA-aware Concurrent Marking Relocation/Compaction Relocation Set Selection Reference Processing JNI WeakRefs Cleaning - StringTable/SymbolTable Cleaning - Class Unloading 8

Current Status Design and implementation approaching mature and stable Main focus on Linux/x86_64 Other platforms can be added if there s enough demand Performance looks very good Both in terms of latency and throughput 9

Agenda 1 2 3 4 5 What is ZGC? Some Numbers A Peek Under The Hood Going Forward How To Get Started 10

SPECjbb 2015 Score (Higher is better) 100% 90% 80% 70% 60% Mode: Composite Heap Size: 128G OS: Oracle Linux 7.4 HW: Intel Xeon E5-2690 2.9GHz 2 sockets, 16 cores (32 hw-threads) 50% 40% 30% 20% 10% 0% ZGC Parallel G1 max-jops (Throughput) critical-jops (Throughput with latency requirements) SPECjbb 2015 is a registered trademark of the Standard Performance Evaluation Corporation (spec.org). The actual results are not represented as compliant because the SUT may not meet SPEC's requirements for general availability. 11

GC Pause Times (ms) GC Pause Times (ms) SPECjbb 2015 Pause Times Linear scale (Lower is better) 900 1000 Logarithmic scale 800 700 Same data, different scales 600 500 400 300 200 100 0 ZGC Parallel G1 Average 95th percentile 99th percentile 99.9th percentile Max 100 10 1 ZGC Parallel G1 Average 95th percentile 99th percentile 99.9th percentile Max 12

Agenda 1 2 3 4 5 What is ZGC? Some Numbers A Peek Under The Hood Going Forward How To Get Started 13

ZGC Phases Pause Mark Start Pause Mark End Pause Relocate Start Concurrent Mark/Remap Concurrent Prepare for Reloc. Concurrent Relocate GC Cycle 14

ZGC Phases Pause Mark Start Pause Mark End Pause Relocate Start Concurrent Mark/Remap Concurrent Prepare for Reloc. Concurrent Relocate Mark objects pointed to by roots GC Cycle 15

ZGC Phases Pause Mark Start Pause Mark End Pause Relocate Start Concurrent Mark/Remap Concurrent Prepare for Reloc. Concurrent Relocate Walk the object graph and mark objects GC Cycle 16

ZGC Phases Pause Mark Start Pause Mark End Pause Relocate Start Concurrent Mark/Remap Concurrent Prepare for Reloc. Concurrent Relocate Synchronization point (Weak roots cleaning) GC Cycle 17

ZGC Phases Pause Mark Start Pause Mark End Pause Relocate Start Concurrent Mark/Remap Concurrent Prepare for Reloc. Concurrent Relocate GC Cycle Reference processing Weak root cleaning Relocation set selection 18

ZGC Phases Pause Mark Start Pause Mark End Pause Relocate Start Concurrent Mark/Remap Concurrent Prepare for Reloc. Concurrent Relocate GC Cycle Handle roots pointing into the relocation set 19

ZGC Phases Pause Mark Start Pause Mark End Pause Relocate Start Concurrent Mark/Remap Concurrent Prepare for Reloc. Concurrent Relocate GC Cycle Relocate objects in the relocation set 20

Heap Address Space Max heap size Heap Memory/Regions Maps into Heap Address Space Large address space reservation 21

Colored Pointers Core design concept in ZGC Metadata stored in unused bits in 64-bit pointers No support for 32-bit platforms No support for CompressedOops Virtual Address-masking either in hardware, OS or software Heap multi-mapping on Linux/x86_64 Supported in hardware on Solaris/SPARC 22

Colored Pointers Layout on x86_64 Finalizable Remapped Marked1 Marked0 Unused (18 bits) Object Address (42 bits, 4TB address space) 64-bit Object Pointer 23

Colored Pointers Layout on x86_64 Known to be marked? Finalizable Remapped Marked1 Marked0 Unused (18 bits) Object Address (42 bits, 4TB address space) 64-bit Object Pointer 24

Colored Pointers Layout on x86_64 Known to not point into the relocation set? Finalizable Remapped Marked1 Marked0 Unused (18 bits) Object Address (42 bits, 4TB address space) 64-bit Object Pointer 25

Colored Pointers Layout on x86_64 Only reachable through a Finalizer? Finalizable Remapped Marked1 Marked0 Unused (18 bits) Object Address (42 bits, 4TB address space) 64-bit Object Pointer 26

Heap Multi-Mapping on Linux/x86_64 Colorless pointer 0x0000000012345678 Address Space 0x00007FFFFFFFFFFF (128TB) 0x0000140000000000 (20TB) Colored pointer (Remapped) 0x0000100012345678 Heap Remapped View 0x0000100000000000 (16TB) Colored pointer (Marked1) 0x0000080012345678 Colored pointer (Marked0) 0x0000040012345678 Heap Marked1 View Heap Marked0 View 0x00000C0000000000 (12TB) 0x0000080000000000 (8TB) 0x0000040000000000 (4TB) Heap Memory 0x0000000000000000 Same memory mapped in 3 different locations 27

Load Barrier Applied when loading an object reference from the heap Not when later using that reference to access the object Conceptually similar to the decoding of compressed oops Looks at the color of the pointer Take action if the pointer has a bad color (mark/relocate/remap) Change to the good color (repair/heal) Optimized for the common case Most object references will have the good color 28

Load Barrier Object o = obj.fielda; // Loading an object reference from heap 29

Load Barrier Object o = obj.fielda; <load barrier needed here> // Loading an object reference from heap 30

Load Barrier Object o = obj.fielda; <load barrier needed here> Object p = o; o.dosomething(); int i = obj.fieldb; // Loading an object reference from heap // No barrier, not a load from heap // No barrier, not a load from heap // No barrier, not an object reference 31

Load Barrier Object o = obj.fielda; <load barrier needed here> // Loading an object reference from heap 32

Load Barrier mov 0x20(%rax), %rbx // Object o = obj.fielda; test %rbx, (0x16)%r15 // Bad color? jnz slow_path // Yes -> Enter slow path and // mark/relocate/remap, adjust // 0x20(%rax) and %rbx 33

Load Barrier mov 0x20(%rax), %rbx // Object o = obj.fielda; test %rbx, (0x16)%r15 // Bad color? jnz slow_path // Yes -> Enter slow path and // mark/relocate/remap, adjust // 0x20(%rax) and %rbx ~4% execution overhead on SPECjbb 2015 34

Mark Concurrent & Parallel Load barrier Detects loads of non-marked object pointers Finalizable mark Enabler for Concurrent Reference Processing Thread local handshakes Used to synchronize end of concurrent mark 35

Relocation Concurrent & Parallel Load barrier Detects loads of object pointers pointing into the relocation set Java threads help out with relocation if needed Off-heap forwarding tables No forwarding information stored in old copies of objects Important for immediate reuse of heap memory 36

Agenda 1 2 3 4 5 What is ZGC? Some Numbers A Peek Under The Hood Going Forward How To Get Started 37

In The Works GC Barrier API Make it easier to plug in new GCs (ZGC, Shenandoah, Epsilon) Concurrent class unloading & weak roots Traditionally done in a Stop-The-World pause Impacts JITs and Runtime subsystems Addressing non-gc induced latencies Time to safepoint/unsafepoint, object monitor deflation, etc. 38

Foundation for Future GC Features Colored Pointers + Load Barriers Thread local GC scheme Track heap access patterns Use non-volatile memory for rarely used parts of the heap Compress or archive parts of the heap Object properties encoded in pointers Allocation tricks etc. 39

Agenda 1 2 3 4 5 What is ZGC? Some Numbers A Peek Under The Hood Going Forward How To Get Started 40

How To Get Started Download Official early access builds will be available soon-ish, but until then Download & build Run $ hg clone http://hg.openjdk.java.net/zgc/zgc $ cd zgc $ sh configure $ make images $./build/linux-x86_64-<...>/images/jdk/bin/java 41

How To Get Started JVM Options Enable ZGC: -XX:+UseZGC Tuning If you care about latency, do not overprovision your machine Max heap size: -Xmx<size> Number of concurrent GC threads: -XX:ConcGCThreads=<number> Logging Basic logging: -Xlog:gc Detailed logging useful when tuning: -Xlog:gc* 42

Feedback Welcome! http://openjdk.java.net/projects/zgc/ 43