Generational GC

Review: Partitioning divides the heap and uses a different strategy per partition. Generational GC partitions by age, because most objects die young.

Single-partition scanning (diagram: stack; heap with Partition #1 and Partition #2): Don't follow pointers out of the partition: not our job. But our partition's objects could be referenced by other partitions.

Single-partition scanning (diagram: stack; heap with Partition #1 and Partition #2): An object that is referenced from another partition cannot be collected!

Alternate approach (diagram: stack; heap with Partition #1 and Partition #2): Treat references coming from the other partition as pseudoroots.

The big idea: Most objects die young, so partition the heap by age. The nursery partition is collected frequently; the old partition is collected only during full-heap GC.

Inter-partition references: Never collect the old partition without the young one, so we only need to remember old-to-young inter-partition references. The write barrier is back! (Details later.) Treat old objects with young refs as roots.

Generational GC (diagram: stack; heap with nursery and old generation): One pictured reference needs no special marking; the other (old-to-young, per the rule above) must be remembered when doing a young GC.

Promotion (diagram sequence: stack; heap with nursery and old generation)

En masse generational: When the nursery is full, collect the nursery with a copying GC, treating the old generation as tospace. The old gen can use any allocator. After GC, the nursery is empty by definition. When the old generation is full, do a full collection.

When to GC: Allocation in the old generation is done only by young GC, and GC of the old generation is triggered only by allocation in the old generation. Therefore, a full collection is always caused by a partial young collection.

Weird promotions: The old generation is full, so GC everything. But we cannot promote now: the old gen is full! The nursery is in a half-collected state, but we can't continue copying.

Solution: No worries! Update references to already-copied objects. Scan but don't copy newly found young objects. When finished with the full collection, collect young again. Be careful of collection loops!

En masse advantages: Copying without semispaces, so no wasted heap. Young collection is very fast. Objects that die young don't impact performance.

GC timeline (in time order): O1 alloc, O2 alloc, O3 alloc, O1 dies, O4 alloc, GC, O2 dies, O5 alloc, O3 dies, O6 alloc. Objects that die shortly after the GC (here O2 and O3) were wrongly promoted and will take a long time to collect!

Tempered generational GC: Don't promote objects on their first GC. Decide whether to promote based on age. Leave very young objects in the nursery.

Determining age. Wall clock: easy (put the time in the header and check it), but machine- and program-dependent: young for one app may be old for another. Memory age (words allocated since the object was created): robust to app differences, but difficult or impossible to implement.

Determining age. Collection count: every time an object survives a collection, increment its collection count. This approximates memory age and is easy to track: add a counter to the header.
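The collection-count idea can be sketched in a few lines of C. This is only an illustration: the header layout, the name survive_collection and the threshold PROMOTE_AGE are assumptions made for the example, not part of the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object header; the field name is illustrative. */
struct Header {
    uint8_t collection_count; /* young collections survived so far */
};

/* Assumed promotion threshold; a real collector would tune this. */
enum { PROMOTE_AGE = 2 };

/* Record that the object survived one more young collection.
 * Returns true once the object is old enough to be promoted. */
bool survive_collection(struct Header *h) {
    assert(h != NULL);
    if (h->collection_count < UINT8_MAX)
        h->collection_count++; /* saturate instead of wrapping around */
    return h->collection_count >= PROMOTE_AGE;
}
```

A young collection would call survive_collection on each live object and copy it either within the nursery or into the old generation depending on the result.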

Collection counter (diagram sequence: stack; nursery with fromspace and tospace; old generation): Objects carry collection counts (0, 1, 2) as they are copied between the nursery semispaces across collections. Promoting only some of them created an inter-generational link during GC! With young-gen copying we still need a semispace: wasted heap.

Collection counter: Semispace problems are back: we waste half of the young space. Not all objects are promoted together, so GC can create inter-generational links.

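The bucket brigade (diagram): Allocation goes into bucket 0; a GC moves survivors to bucket 1; the next GC moves them to the old generation. Buckets 0 and 1 make up the nursery.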

The broken bucket brigade (diagram): The same pipeline, with bucket 0 used as fromspace and the next bucket used as tospace.

Bucket brigade: Copy objects from bucket n to bucket n+1. We cannot just use semispace copying with from=0, to=1: tospace is already partially used, and the emptied tospace prefix is not available.

The Java bucket brigade (diagram): Allocation goes into bucket 0; the next bucket is itself split into a fromspace and a tospace; its survivors go to the old generation.

Java [HotSpot] collection. Nursery: bucket 0 ("Eden") is a flat space; bucket 1 ("Survivor") uses semispace copying. Old generation: mark-and-compact.

Young collection algorithm
youngcollect():
    (scan roots)
    b1fromspace, b1tospace := b1tospace, b1fromspace
    while loc := worklist.pop():
        obj := *loc
        if !obj->header.forward:
            if obj in bucket 0:
                obj->header.forward := (copy obj to b1tospace)
            else if obj in bucket 1:
                obj->header.forward := (copy obj to old gen)
            (add references in newly-copied object)
        *loc := obj->header.forward

Overhead: But we are still wasting heap on semispaces! Most objects die young: even by bucket 1, most objects are dead. Remember: you allocate pools, so you choose the heap, generation and bucket sizes! Typically, bucket 0 is 32x larger than one space in bucket 1.

Caveat: Of course, the survivors of bucket 0 could overfill b1tospace. Expand the b1 spaces during collection: want H=n*L (all objects copied in are in L). Can't expand more? Fall back to copying to the old gen.

Breather (diagram: the Java bucket brigade again)

Write barrier: To collect just the nursery, remember the objects in the old generation that hold inter-generational references. In practice, over-approximate: treat portions of the old generation as roots for nursery collection.

Remembered set: Remember a portion of the old generation as "interesting". During young GC, scan the objects in the remembered set. During full GC, clear the remembered set.

Cards: Typical (read: Java) GCs use cards. Divide pools into smaller chunks, called cards, of power-of-2 size. The remembered set is a bit array with one bit per card in the pool. To remember a card, mark its bit.

Card barrier
write(obj, loc, val):
    if genof(obj) == old and genof(val) == young:
        poolof(obj)->remember[cardof(obj)] := 1
    *loc := val
poolof(ptr): ptr & POOL_MASK
cardof(ptr): (ptr & ~POOL_MASK) >> CARD_SIZE_POW2
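The barrier above can be sketched in C roughly as follows. The pool and card sizes, the is_old flag and the helper pool_new are assumptions made for this example; the sketch stores one byte per card rather than one bit, and relies on pools being aligned to their own size so that address masking finds the pool header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative sizes: 64 KiB pools, 512-byte cards. */
#define POOL_SIZE_POW2 16
#define CARD_SIZE_POW2 9
#define POOL_SIZE ((uintptr_t)1 << POOL_SIZE_POW2)
#define POOL_MASK (~(POOL_SIZE - 1))
#define CARDS_PER_POOL (1u << (POOL_SIZE_POW2 - CARD_SIZE_POW2))

/* Hypothetical pool header, stored at the start of each aligned pool. */
struct Pool {
    int is_old;                             /* generation of the whole pool */
    unsigned char remember[CARDS_PER_POOL]; /* one flag per card */
};

static struct Pool *pool_of(void *p) {
    return (struct Pool *)((uintptr_t)p & POOL_MASK);
}

static unsigned card_of(void *p) {
    return (unsigned)(((uintptr_t)p & ~POOL_MASK) >> CARD_SIZE_POW2);
}

/* Allocate a pool aligned to its own size so that masking finds it. */
struct Pool *pool_new(int is_old) {
    struct Pool *pool = aligned_alloc(POOL_SIZE, POOL_SIZE);
    assert(pool != NULL);
    memset(pool, 0, sizeof *pool);
    pool->is_old = is_old;
    return pool;
}

/* Card-marking write barrier: obj is the object that contains *loc. */
void write_ref(void *obj, void **loc, void *val) {
    if (val != NULL && pool_of(obj)->is_old && !pool_of(val)->is_old)
        pool_of(obj)->remember[card_of(obj)] = 1; /* old-to-young store */
    *loc = val;
}
```

Only stores from an old pool into a young pool mark a card; every other store pays just for the guard.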

Card use: During young GC, scan each old pool's remembered set. For each 1, scan the objects in the corresponding card. Note: cards must be parsable!

Parsable cards: Objects may span cards. To parse a card, we need to know the offset of the first object in the card. So each pool gets an additional element: the offsets of the first objects within its cards.

Parsable cards
oldallocate():
    bump-pointer:
        ret := end
        end += size
        if cardof(ret) != cardof(end):
            pool->firstobjs[cardof(end)] := offsetincard(end)
    split free-list:
        if cardof(ret) != cardof(split):
            pool->firstobjs[cardof(split)] := offsetincard(split)
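The bump-pointer case can be sketched in C as below. The tiny card size, the use of byte offsets instead of addresses, and the NO_OBJECT sentinel for cards in which no object starts are all assumptions chosen to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes; NO_OBJECT marks a card in which no object starts. */
enum { CARD_SIZE = 64, CARDS = 4, POOL_BYTES = CARD_SIZE * CARDS,
       NO_OBJECT = CARD_SIZE };

#define ALLOC_FAILED ((size_t)-1)

struct Pool {
    size_t end;                      /* bump pointer, as a byte offset */
    unsigned short firstobjs[CARDS]; /* offset of first object per card */
};

void pool_init(struct Pool *p) {
    p->end = 0;
    p->firstobjs[0] = 0; /* the first allocation will start at offset 0 */
    for (size_t c = 1; c < CARDS; c++)
        p->firstobjs[c] = NO_OBJECT;
}

/* Bump-pointer allocation that keeps firstobjs up to date.
 * Returns the byte offset of the new object, or ALLOC_FAILED. */
size_t old_allocate(struct Pool *p, size_t size) {
    assert(size > 0);
    if (size > POOL_BYTES - p->end)
        return ALLOC_FAILED;
    size_t ret = p->end;
    p->end += size;
    /* If the bump pointer moved into a later card, the next object
     * allocated will be the first one starting in that card. */
    if (p->end < POOL_BYTES && p->end / CARD_SIZE != ret / CARD_SIZE)
        p->firstobjs[p->end / CARD_SIZE] = (unsigned short)(p->end % CARD_SIZE);
    return ret;
}
```

An object that spans a whole card leaves that card at NO_OBJECT, which tells a card scanner there is nothing to start parsing from there.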

Summary: The young gen is divided into buckets; objects are promoted to the old gen from the oldest bucket. The oldest bucket is a semispace. Pointers from the old gen are remembered by cards. Cards are treated like roots (but scanned as objects).

Cards (part deux)

Cards: A partitioned (e.g. generational) GC must remember certain inter-partition refs. Remembering each object is too expensive, so divide each generation into cards.

Pools with cards
struct Pool {
    struct Pool *next;
    struct FreeObject *freeobjects;
    void *freespace;
    char remember[CARDS_PER_POOL];
    unsigned short firstobjs[CARDS_PER_POOL];
};

Allocation with cards: Remember, firstobjs is there to make cards parsable, so every allocation path needs to keep firstobjs up to date. This matters during bump-pointer allocation, free-list allocation (splitting), and coalescence.

Parsing cards
scancard(pool, cardno):
    loc := pool + cardno*CARD_SIZE + pool->firstobjs[cardno]
    while cardof(loc) == cardno:
        obj := loc
        (scan this object)
        loc += obj->header.size
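A small C sketch of the same walk follows. The object header here is nothing but a size, objects live in a byte array, a zero size is taken to mean unallocated space, and sizes are assumed to be multiples of sizeof(size_t); none of these choices come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative sizes; NO_OBJECT marks a card in which no object starts. */
enum { CARD_SIZE = 64, CARDS = 4, NO_OBJECT = CARD_SIZE };

struct Pool {
    unsigned short firstobjs[CARDS];       /* offset of first object per card */
    unsigned char data[CARD_SIZE * CARDS]; /* object storage */
};

/* Store an object header (just a size in bytes) at a byte offset. */
void put_object(struct Pool *pool, size_t off, size_t size) {
    memcpy(pool->data + off, &size, sizeof size);
}

/* Visit every object that starts in card cardno; return how many. */
size_t scan_card(struct Pool *pool, size_t cardno,
                 void (*visit)(size_t off, size_t size)) {
    assert(cardno < CARDS);
    size_t off = cardno * CARD_SIZE + pool->firstobjs[cardno];
    size_t end = (cardno + 1) * CARD_SIZE;
    size_t count = 0;
    while (off < end) {
        size_t size;
        memcpy(&size, pool->data + off, sizeof size);
        if (size == 0)
            break; /* assumption: zero size marks unallocated space */
        if (visit != NULL)
            visit(off, size);
        count++;
        off += size; /* step to the next object header */
    }
    return count;
}
```

The last object visited may extend past the end of the card; that is fine, because the walk only needs object starts.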

youngcollect():
    (add roots to worklist)
    (add remembered set to worklist)    // using e.g. scancard
    fromspace, tospace := tospace, fromspace
    while loc := worklist.pop():
        obj := *loc
        if obj is young and !obj->header.forward:    // ignore old objects
            if obj->header.collectioncount < 1:
                (move obj to newobj in tospace)
                obj->header.collectioncount++
                obj->header.forward := newobj
            else:
                newobj := (try old gen allocator)
                if newobj == NULL:
                    collectfull()
                    (retry young collection)
                obj->header.forward := newobj
            (scan newobj)
        if obj->header.forward:
            *loc := obj->header.forward
Remember: this could have created inter-partition references!

Runtime interface

Review: Unreachable objects are dead. Techniques: sweep, copying, compacting, ref counting. Combine with partitioning, e.g. generational. Most objects die young, so partitioning by age is universal. It is also useful to partition large objects, threads, executable objects, etc.

Runtime interface: The compiler, memory manager and (possibly) programmer must agree. The compiler is motivated by the language, so the memory manager is often motivated by the language too. GGGGC's interface is simple by design.

Runtime interface. Important components: allocation; where references are in roots and objects; when collection can occur; barriers on writing to (reading from?) objects.

Allocation
C malloc: "Just give me some bytes." Memory manager needs no type info. Object returned full of garbage.
Java new: "Good enough is good enough." Memory manager needs refs. Object returned full of zeroes.
Haskell allocation: "I promise not to touch!" Memory manager knows all! Object returned fully initialized, immutable.

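Allocation and initialization (diagram: who does what over time, across the phases allocation, initialization, use). C: the manager allocates; the mutator initializes and uses. Java: the manager allocates and zeroes; the mutator takes over from there. Haskell: the manager allocates and fully initializes; the mutator only uses.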

Constructor problem: GC depends on correct objects. "Correct" for us means that references refer to valid objects or are NULL. If an object is scanned mid-initialization, it may not be correct! Options: prevent GC mid-initialization, or initialize objects ourselves.

Mid-initialization GC: If initialization cannot allocate objects, we could prevent collection, but this does not generalize. Initialization is often not guaranteed correct anyway (it's just user code). Generally, the manager must initialize.

Zeroing: The simple initialization is all bits zero: a mundane value for virtually all types, and simple for the manager to apply. Consequence: objects are initialized twice! (Once by the manager, once by the program.)

When to zero
During allocation: Makes allocation slower. Inefficient to spread the work over many allocations.
Ahead of allocation: Zero chunks beyond the current object during allocation. Choose the chunk size wisely (cache line or page) for efficiency.
During collection: Makes collection (much!) slower. Impossible with a free list. Efficient.
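The "ahead of allocation" option can be sketched as a bump allocator that zeroes whole chunks past the object being allocated. The heap size, the chunk size ZERO_CHUNK and all names below are assumptions for illustration; alignment of returned pointers is ignored to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative sizes; HEAP_SIZE must be a multiple of ZERO_CHUNK. */
enum { HEAP_SIZE = 4096, ZERO_CHUNK = 256 };

struct Bump {
    unsigned char heap[HEAP_SIZE];
    size_t end;    /* next free byte */
    size_t zeroed; /* every byte in [end, zeroed) is already zero */
};

/* Allocate size bytes of zeroed memory, zeroing ahead in whole chunks. */
void *bump_alloc(struct Bump *b, size_t size) {
    assert(b->end <= HEAP_SIZE);
    if (size > HEAP_SIZE - b->end)
        return NULL;
    size_t new_end = b->end + size;
    if (new_end > b->zeroed) {
        /* Zero up to the next chunk boundary at or beyond the new end. */
        size_t target = (new_end + ZERO_CHUNK - 1) / ZERO_CHUNK * ZERO_CHUNK;
        if (target > HEAP_SIZE)
            target = HEAP_SIZE;
        memset(b->heap + b->zeroed, 0, target - b->zeroed);
        b->zeroed = target;
    }
    void *ret = b->heap + b->end;
    b->end = new_end;
    return ret;
}
```

Most allocations then skip the memset entirely, and the zeroing that does happen touches one contiguous chunk at a time.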

Runtime interface. Important components: allocation; where references are in roots and objects; when collection can occur; barriers on writing to (reading from?) objects.

How to identify references
Conservative: "If it looks like it might be a reference, it's a reference."
Tagged: References stored differently from data. Known only at runtime. Usual for dynamically-typed languages.
Precise: Compiler must know the location of all references and tell the GC. Usual for statically-typed languages.

Tagged references
function FancyString(x) {
    return "\"" + x.toString() + "\"";
}
This code works whether x is a reference or not. (That's JavaScript!)

Refresher: Bit-sneakiness. There are three wasted bits in our header (on 32-bit systems, two). (Diagram: a 64-bit pointer, split into a pool address in the high bits and an object address in the low bits; the lowest bits are always 0!) All pointers have this extra space!

Tagged references: Use the extra bits as a type tag, e.g.: references end in 000, integers end in 1, strings end in 010, and so on. The compiler and GC must agree on the tags, and the compiler must untag values to use them.

Tagged references
trace(obj):
    for loc in obj to obj + (size):
        if (*loc & 0b111) == 0:
            trace(*loc)
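The tag test can be sketched in C as follows, using the example encoding from the slides (references end in 000, integers end in 1). The function names are hypothetical, a zero word is treated as NULL rather than as a reference, and untag_int assumes an arithmetic right shift for negative values, which is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_MASK ((uintptr_t)7) /* the three low bits carry the tag */

/* A word is a reference if it is non-NULL and its low three bits are 000. */
int is_reference(uintptr_t word) {
    return word != 0 && (word & TAG_MASK) == 0;
}

/* Integers are stored shifted left by one with the low bit set. */
uintptr_t tag_int(intptr_t n) {
    return ((uintptr_t)n << 1) | 1;
}

/* Undo tag_int; assumes arithmetic right shift for negative values. */
intptr_t untag_int(uintptr_t word) {
    assert((word & 1) == 1);
    return (intptr_t)word >> 1;
}

/* Count the fields a tracing GC would follow in an object's payload. */
size_t count_references(const uintptr_t *fields, size_t n) {
    size_t refs = 0;
    for (size_t i = 0; i < n; i++)
        if (is_reference(fields[i]))
            refs++;
    return refs;
}
```

A real trace routine would recurse or push onto a worklist where this sketch merely counts.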

Precise object references: You've seen plenty of this! Bitmaps are pretty much how it's done. In some (usually functional) languages, the object type includes a trace function: the compiler creates a trace function per type, and the GC calls the trace function referred to in the descriptor.

Stack references: Pointer stacks (a la GGGGC) are not the most common choice. Options: secondary reference stack, parsable stack, heap-allocated stack, object-shaped stack.

Secondary reference stack: The normal stack has no references; the second stack has only references. (Diagram: as foo() calls bar() calls baf(), each function gets a frame on both the C stack and the ref stack.) The GC scans only the ref stack, and simply reads the entire stack: there are only references.
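A secondary reference stack might look like the following C sketch. The fixed capacity and the names ref_push, ref_pop and count_roots are assumptions; a compiler would emit the pushes and pops in function prologues and epilogues rather than have the programmer call them.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative fixed-size reference stack holding only references. */
enum { REF_STACK_MAX = 256 };
static void *ref_stack[REF_STACK_MAX];
static size_t ref_sp;

/* Reserve n reference slots for a function's frame, initialized to NULL. */
void **ref_push(size_t n) {
    assert(n <= REF_STACK_MAX - ref_sp);
    void **frame = &ref_stack[ref_sp];
    for (size_t i = 0; i < n; i++)
        frame[i] = NULL;
    ref_sp += n;
    return frame;
}

/* Release the n slots reserved by the matching ref_push. */
void ref_pop(size_t n) {
    assert(n <= ref_sp);
    ref_sp -= n;
}

/* What the GC does: read the whole stack; every non-NULL slot is a root. */
size_t count_roots(void) {
    size_t roots = 0;
    for (size_t i = 0; i < ref_sp; i++)
        if (ref_stack[i] != NULL)
            roots++;
    return roots;
}
```

Because the stack contains nothing but references, the scan needs no frame descriptors at all.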

Reference stack caveats: Usually needs a dedicated reference-stack register, and CPU registers are scarce! Complicates other parts of the compiler, e.g. exception handling.

Parsable stack: Like a parsable heap: make sure the GC can find its info in the stack. (Diagram: as foo() calls bar() calls baf(), each frame on the C stack is accompanied by an entry labelled h.)

Parsable stack caveats: Does not play nicely with others (C, C++).

Heap-allocated stack: Every function allocates a stack object and chains those objects together into a pseudo-stack. (Diagram: as foo() calls bar() calls baf(), a root points into a chain of frame objects foo, bar, baf in the heap.)

Heap-allocated stack caveats: Stack frames outlive function execution. Easy closures! Need descriptors for every possible stack frame. Restrictive to the compiler.

Object-shaped stack: Exactly like a heap-allocated stack, but allocate the stack frame objects on the stack. Caveat: the GC must expect objects in the stack. The stack is now a partition.

What is a reference? Thus far: a reference is a pointer to the beginning of an object. Possibilities to consider: interior pointers, indirect pointers.

Interior pointer
struct List {
    struct List *next;
    int val;
};
void breakthings(struct List *l) {
    int *nasty = &l->val;
    l = l->next;
    // yield to GC
    printf("%d\n", *nasty);
}

Interior pointers: The usual solution: not allowed. When allowed: 1. Get the chunk (e.g. card) associated with the interior pointer. 2. Parse the chunk to find the containing object. 3. If moving, preserve the offset.

Object tables: An alternative contract between compiler and memory manager: instead of the program using (e.g.) struct List *, always use (e.g.) struct List **.

Object tables: Accessing an object is a double dereference: the user pointer refers to an object table entry, and the object table entry refers to the object. The object table forms its own partition. Object table entries don't move.

Object table thoughts: Slows down object access. Allows atomic object replacement, i.e. one object may replace another. Makes moving objects simple (only one ref to update). Makes mark-and-compact require only one sweep!
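The double dereference and the "only one ref to update" property can be sketched in C like this. The table size, the Handle typedef and the helper names are assumptions for the example; slot management is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative object type and fixed-size object table. */
enum { TABLE_SIZE = 16 };
struct Obj { int val; };
typedef struct Obj **Handle; /* what the program holds instead of struct Obj * */

static struct Obj *object_table[TABLE_SIZE];

/* Bind a table slot to an object and hand out a handle to the slot. */
Handle make_handle(size_t slot, struct Obj *obj) {
    assert(slot < TABLE_SIZE);
    object_table[slot] = obj;
    return &object_table[slot];
}

/* Every access is a double dereference: handle, then table entry. */
int get_val(Handle h) {
    return (*h)->val;
}

/* Moving an object updates exactly one pointer: its table entry. */
void move_object(Handle h, struct Obj *dest) {
    *dest = **h;
    *h = dest;
}
```

However many copies of a handle the program holds, they all see the object at its new location after a move, because none of them points at the object directly.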

Runtime interface. Important components: allocation; where references are in roots and objects; when collection can occur; barriers on writing to (reading from?) objects.

Any-time collection: Threads may be doing anything at any time. Can we stop the mutator at any time to collect? Think about stack references: the mutator must never deviate from its description. Very limiting to optimizations.

Yieldpoint collection: The compiler adds yieldpoints to allow collection. Between yieldpoints, the stack may be in an inconsistent state. To stop, wait for all threads to reach a yieldpoint.

When to yield. Mandatory: during allocation, at function calls. Optional: in tight loops or other long-running operations. Threads must yield frequently to allow GC to occur promptly.

How to yield
Polling:
    void yield() {
        if (stoptheworld) collect();
    }
Patching: overwrite the yield function with a version that stops the thread.
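The polling variant might be sketched in C as below. The names stop_the_world, collect, yieldpoint and sum_to are illustrative, collect is a stand-in that only counts invocations, and polling every 1024 iterations is an arbitrary choice for a tight loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Flag set by whichever thread wants a collection (assumed name). */
static atomic_bool stop_the_world;
static int collections; /* how many times the stand-in collector ran */

/* Stand-in collector: counts the call and lets the world resume. */
static void collect(void) {
    collections++;
    atomic_store(&stop_the_world, false);
}

/* Polling yieldpoint: a cheap flag check on the fast path. */
static void yieldpoint(void) {
    if (atomic_load(&stop_the_world))
        collect();
}

/* A long-running loop with an optional yieldpoint every 1024 iterations. */
long sum_to(long n) {
    assert(n >= 0);
    long sum = 0;
    for (long i = 1; i <= n; i++) {
        sum += i;
        if ((i & 1023) == 0)
            yieldpoint();
    }
    return sum;
}
```

The trade-off is the one the slides name: polling costs a load and a branch at every yieldpoint, whereas patching makes the common case free at the price of rewriting code when a collection is requested.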

Runtime interface. Important components: allocation; where references are in roots and objects; when collection can occur; barriers on writing to (reading from?) objects.

Write barriers: Used for either reference counting or inter-partition remembering. For inter-partition remembering there are two parts: a guard (check whether the ref must be remembered) and a remember step (e.g. write the bit for the card).

Write barrier guard
write(obj, loc, val):
    if genof(obj) == old and genof(val) == young:
genof isn't cheap (bit math to get the pool, then read the gen from the pool header).

Arranging for better barriers: Sometimes we can ask for pools at certain locations in memory, then approximate the generation check with a location check. An imperfect guard means more remembered-set writing, but less time in the write barrier.

Sequential heap guard
write(obj, loc, val):
    if obj > val:
The write barrier can trigger improperly, but the check is (usually) one CPU operation. The young gen must have a (pointless) remembered set.
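The difference between the precise guard and the address-comparison guard can be sketched in C as follows. The single arena with the young generation in its lower half and the old generation in its upper half is an assumed layout that matches the "obj > val" check; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: one arena, young gen in the lower half, old gen above. */
enum { HALF = 1024 };
static unsigned char arena[2 * HALF];

static int is_old(const void *p) {
    const unsigned char *c = p;
    assert(c >= arena && c < arena + 2 * HALF);
    return c >= arena + HALF;
}

/* Precise guard: remember only genuine old-to-young stores. */
int precise_guard(const void *obj, const void *val) {
    return is_old(obj) && !is_old(val);
}

/* Sequential guard: a single address comparison. It fires for every
 * old-to-young store, and also for some stores that need no remembering. */
int sequential_guard(const void *obj, const void *val) {
    return (uintptr_t)obj > (uintptr_t)val;
}
```

Whenever precise_guard fires, sequential_guard fires too, so the cheap check never misses an old-to-young store. It does also fire for old-to-old and young-to-young stores from a higher address to a lower one, which is why the young generation ends up needing a remembered set of its own.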

Unguarded write barrier
write(obj, loc, val):
    poolof(obj)->(remember obj)
    *loc := val
All writes are more expensive and the remembered set is less precise, but write time is predictable.

Runtime interface. Summary: allocation; where references are in roots and objects; when collection can occur; barriers on writing to (reading from?) objects.

Presentations: Next week we start presentations. [If any presentation doesn't happen, I will blather in the interim.] I will have some final words on the last day. Make sure to talk to me about your final projects!