Garbage Collection

Garbage collection makes memory management easier for programmers by automatically reclaiming unused memory. The garbage collector in the CLR makes tradeoffs to ensure reasonable performance in a wide variety of situations.

A Fundamental Tradeoff

Memory management has always involved tradeoffs between numerous optimization possibilities:
- CPU overhead
- Working set size
- Determinism
- Pause times
- Cache coherency
- Ease of development

Schemes to manage the problem fall into roughly two camps: the deterministic camp vs. the heuristic camp.

Basic Tradeoffs by Different Strategies

(Diagram: strategies plotted from deterministic to heuristic against their CPU and RAM costs. Manual management and reference counting sit at the deterministic end; mark/compact, mark/sweep, and copying collection sit at the heuristic end.)

GC and the CLR

The CLR uses a mark/compact collector as a good choice for handling a wide variety of situations, with optimizations to reduce pause times.
- Mark/compact leads to good object locality: over time, objects tend to coalesce together
- Tight locality leads to fewer page faults
- Mark/compact has very low memory overhead per object

CLR Heap Usage

Objects are always allocated sequentially on the heap; heap allocation is very fast as a result.

    A func() {
        A a = new A();
        a.b = new B();
        A a2 = new A();
        a2.b = new B();
        return a2;
    }

(Diagram: the heap holds a, a.b, a2, and a2.b laid out sequentially, followed by the start of free space.)

GC

When the heap becomes too full, a GC cycle ensues; the exact criteria are guarded by MS. A GC can be ordered by the programmer with a call to GC.Collect, though this is generally discouraged. Collection proceeds in two phases:
- Phase 1 (Mark): find memory that can be reclaimed
- Phase 2 (Compact): move all live objects to the bottom of the heap, leaving free space at the top
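As a minimal sketch of ordering a collection explicitly (discouraged in general code, but occasionally useful at a known quiet point), the following program uses only the System.GC API; the loop and buffer sizes are arbitrary choices for illustration:

```csharp
using System;

class CollectDemo
{
    static void Main()
    {
        // Create some garbage: each buffer becomes unreachable
        // as soon as its loop iteration ends.
        for (int i = 0; i < 1000; i++)
        {
            var buffer = new byte[1024];
        }

        long before = GC.GetTotalMemory(false);

        GC.Collect();                    // order a full collection
        GC.WaitForPendingFinalizers();   // let any queued finalizers run

        long after = GC.GetTotalMemory(false);
        Console.WriteLine($"Allocated bytes before: {before}, after: {after}");
    }
}
```

Note that `after` is not guaranteed to be smaller than `before`; the collector makes no promises about when memory is returned.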

Phase I: Mark

The GC begins by identifying live object references in well-known places:
- AppDomain properties
- CPU registers
- TLS slots
- Local variables on the stack
- Static members of classes

This set of objects is called the root set. The collector then recursively follows references to find all live objects in the process. Cycles are handled correctly.
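The marking traversal can be sketched in a few lines. This toy model (the Node and Collector types and their reference lists are assumptions for illustration, not CLR internals) shows why a visited flag makes cycles harmless:

```csharp
using System;
using System.Collections.Generic;

// A toy heap object with outgoing references.
class Node
{
    public List<Node> Refs = new List<Node>();
    public bool Marked;
}

static class Collector
{
    // Phase 1: starting from the root set, mark every reachable object.
    // The Marked check means each object is visited at most once,
    // so reference cycles cannot cause infinite traversal.
    public static void Mark(IEnumerable<Node> roots)
    {
        var stack = new Stack<Node>(roots);
        while (stack.Count > 0)
        {
            var n = stack.Pop();
            if (n.Marked) continue;   // already visited: handles cycles
            n.Marked = true;
            foreach (var r in n.Refs) stack.Push(r);
        }
    }
}
```

After Mark runs, any Node still unmarked is unreachable from the roots and would be reclaimed in the compact phase.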

Marking Objects: Cycles Are Not a Problem

(Diagram: the root set points into a graph of live objects, including a cycle; the remaining objects are available for collection, followed by free space.)

Phase II: Compact

After finding the live objects, the garbage collector initiates a compaction phase. All live objects are relocated to the bottom of the heap, squeezing free space to the top. All pointers to the live objects are adjusted as well.

After Compaction: Reclaimed Free Space

(Diagram: live objects compacted at the bottom of the heap, free space above, with the root set pointing at the relocated objects.)

A Real-Life Example of Memory Usage

(Chart: a program that simply spews garbage objects shows sequential allocation intermingled with regular compaction.)

Finalization

Dead objects with finalizers can't be collected at the first GC: the finalizer has to run first, so finalizer execution is deferred until after collection. During collection, the goal is to quickly find memory that is available immediately.
- A reference to the object is placed on the freachable queue; these references become part of the root set
- The GC dequeues the references in the background and calls the finalizers
- Call GC.WaitForPendingFinalizers to wait synchronously for the finalization queue to empty
- Beware the order of finalization! There are no guarantees in a parent/child relationship as to which gets called first
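A small sketch of this lifecycle (the Noisy class and its flag are illustrative assumptions; exactly when a finalizer runs is up to the runtime, which is why the program waits explicitly):

```csharp
using System;

class Noisy
{
    public static volatile bool Finalized;

    // Runs on the dedicated finalization thread some time
    // after the collector finds the object dead.
    ~Noisy()
    {
        Finalized = true;
    }
}

class FinalizeDemo
{
    static void MakeGarbage()
    {
        var n = new Noisy();   // unreachable once this method returns
    }

    static void Main()
    {
        MakeGarbage();
        GC.Collect();                    // finds the dead Noisy, queues its finalizer
        GC.WaitForPendingFinalizers();   // block until the finalization queue drains
        Console.WriteLine(Noisy.Finalized);
    }
}
```

Without the WaitForPendingFinalizers call there would be no guarantee the flag is set by the time Main checks it.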

Finalizers and Compaction

(Diagram: the freachable queue holds references to objects pending finalization, which remain among the live objects above the free space; the root set is shown alongside.)

The Memory Cost of Finalizers

(Chart comparing a run with finalizers against one without, with pending finalizers piling up.) Finalized objects take longer to be collected, resulting in greater memory usage for an otherwise identical program.

Controlling Finalization

Objects with finalizers can request that their finalizer not be called via GC.SuppressFinalize, which sets the finalize bit in the object's sync block. This is a good thing to call in IDisposable.Dispose. SuppressFinalize is currently pretty expensive, but should be faster post-Beta 2. Objects that change their mind can call GC.ReRegisterForFinalize.

Example: IDisposable and SuppressFinalize

    public class Transaction : IDisposable {
        private int lowleveltx;

        public Transaction() {
            lowleveltx = raw.begintransaction();
        }

        public void Commit() {
            raw.committransaction(lowleveltx);
            lowleveltx = 0;
        }

        private void cleanup() {
            if (lowleveltx != 0)
                raw.aborttransaction(lowleveltx);
        }

        public void Dispose() {
            System.GC.SuppressFinalize(this);
            cleanup();
            // call base.Dispose() if necessary
        }

        ~Transaction() {
            cleanup();
        }
    }

Generations

Compaction and object relocation can lead to large pauses, which is not good for programs that need to be responsive. The GC should concentrate on areas where objects are likely to be dead: in most programs, most objects die young. This assertion is known as the weak generational hypothesis. It therefore makes sense to concentrate on recently allocated objects for reclamation. The .NET collector segregates objects by age on the heap, into segments called generations. Generations allow the collector to spend more time scavenging young objects, where more garbage is likely to be found.

Generations, cont'd

Objects that survive a single GC are promoted to a higher generation. The current implementation uses three generations, and objects are always compacted down.
- Gen2: objects that have survived two or more collections
- Gen1: objects that have survived one collection
- Gen0: objects that have never been collected

Generations, cont'd

GC.GetGeneration() returns the current generation of an object; the value may change. Each older generation is collected much more rarely than the one before it, up to ten times less often, depending on the app. During collection, the collector does not follow references that point into older generations. GC.Collect() takes an optional parameter to specify how many generations to search. Note that finalized objects always get promoted at least once, since they always survive their first collection attempt.
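A small sketch of watching promotion happen. Exactly when an object is promoted is a runtime decision, so the comments hedge; the only firm guarantee used here is that a freshly allocated small object starts in generation 0:

```csharp
using System;

class GenerationDemo
{
    static void Main()
    {
        object o = new object();
        Console.WriteLine(GC.GetGeneration(o));   // 0: freshly allocated

        GC.Collect();   // o survives; typically promoted to gen 1
        Console.WriteLine(GC.GetGeneration(o));

        GC.Collect();   // surviving again typically promotes o to gen 2
        Console.WriteLine(GC.GetGeneration(o));

        GC.Collect(0);  // the optional argument: collect only the youngest generation

        GC.KeepAlive(o);   // keep o reachable through all the calls above
    }
}
```

The GC.KeepAlive call matters: without it the JIT could treat `o` as dead before the collections run, and the object would be reclaimed instead of promoted.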

The Large Object Heap

Really large objects are too expensive to move around, and are usually somewhat rare. The CLR maintains a large object heap (LOH) for really big objects; the threshold is ~20 KB. Objects on the LOH are never relocated, and the heap is managed more like a classic C/C++ heap.
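The ~20 KB figure reflects the pre-release CLR these slides describe; shipped .NET runtimes use a threshold of 85,000 bytes. A quick way to observe the split is that LOH objects report as belonging to the oldest generation:

```csharp
using System;

class LohDemo
{
    static void Main()
    {
        var small = new byte[1_000];      // ordinary small-object heap, gen 0
        var large = new byte[200_000];    // over the LOH threshold

        Console.WriteLine(GC.GetGeneration(small));   // 0
        // LOH objects are reported as the oldest generation:
        Console.WriteLine(GC.GetGeneration(large));   // 2
    }
}
```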

Large Object Heap Behavior

(Chart of object addresses over time.) Once the LOH is filled, new requests are satisfied by searching for available space.

Threads and GC

Different versions of the runtime have different GC behavior: mscorwks.dll is used on single-CPU boxes, mscorsvr.dll on multi-CPU boxes. Every process has one dedicated finalization thread, which runs at the highest non-realtime priority. GC can happen on a dedicated background thread; finalization always happens on the dedicated thread. When the GC runs, all managed threads must be brought to a halt. The runtime has several methods for doing this:
- Thread hijacking
- Safe points
- Fully interruptible code

Thread Hijacking

For short methods, the runtime can wallop the return address on the stack so that the method returns into GC code.

    int bar() {
        return 5;
    }

    int foo() {
        int ret = bar();
        return ret;
    }

(Diagram: the return address of the call to bar is redirected into the GC.)

Safe Points

For longer methods, the jitter adds callouts to the code to check for a GC. The callouts are made at places deemed safe for a GC to occur. If the runtime is beginning a GC cycle, the method halts at the safe point until the GC is completed.

Fully Interruptible Code

For tight loops, neither of the above methods really works. Instead, the JIT compiler emits tables that the collector uses to determine which object references are live at each point in the method. The thread is simply suspended, the GC takes place, and the thread is then resumed.

    1 void method()
    2 {
    3     foo a = new foo();
    4     bar b = new bar();
    5     foo c = new foo();
    6     a.method();
    7     c.method();
    8 }

    Line | Objects
       3 | { }
       4 | { a }
       5 | { a }
       6 | { a, c }
       7 | { c }
       8 | { }

Arenas

On multi-CPU systems, heap contention is an issue. The current implementation deals with this by divvying the heap into processor-specific arenas:
- Initially each arena is ~24 KB; arenas are allocated contiguously on the heap
- Each CPU's objects get allocated in that CPU's arena
- Each CPU manages GC of its own arenas, though GC is still synchronized across CPUs
- If one arena fills up, new ones are allocated further up the heap; new arenas are currently ~8 KB in size

Arenas: Single CPU

(Diagram: with threads 1 and 2 on a one-CPU box, memory is allocated contiguously regardless of thread.)

Arenas: Two CPUs

(Diagram: with threads 1 and 2 on a two-CPU box, memory is allocated from two dynamically managed arenas.)

Summary

GC involves tradeoffs; the techniques chosen by MS give reasonable performance for a wide variety of languages and patterns. GC is mostly automatic, but can be influenced via the System.GC class. Try to minimize the use of finalizers, which saves both CPU and memory. The details are certain to change in the future as MS tweaks the implementation.