Lecture 13: Garbage Collection


COS 320 Compiling Techniques, Princeton University, Spring 2016. Lennart Beringer / Mikkel Kringelbach.

Garbage Collection. Every modern programming language allows programmers to allocate new storage dynamically: new records, arrays, tuples, objects, closures, etc. Every modern language therefore needs facilities for reclaiming and recycling the storage used by programs. Garbage collection is usually the most complex aspect of the run-time system for any modern language (Java, ML, Lisp, Scheme, Modula, ...).

Memory Layout. (Diagram: each process has a virtual address space containing its stack, heap, and static data; addresses are translated through the TLB to physical memory. New pages are allocated via calls to the OS, and the stack grows to a preset limit.)

GC What is garbage? A value is garbage if it will not be used in any subsequent computation by the program. Is it easy to determine which objects are garbage? No: it's undecidable in general. E.g.: if long-and-tricky-computation then use v else don't use v.

GC Since determining exactly which objects are garbage is undecidable, people have come up with many different techniques: make it the programmer's problem (explicit allocation/deallocation); reference counting; tracing garbage collection (mark-sweep, copying collection); generational GC.

Explicit Memory Management. A user library manages memory; the programmer decides when and where to allocate and deallocate: void* malloc(long n), void free(void *addr). The library calls the OS for more pages when necessary. Advantage: people are smart. Disadvantage: people are dumb, and they really don't want to bother with such details if they can avoid it.

Explicit MM How does malloc/free work? Blocks of unused memory are stored on a freelist. malloc: search the free list for a big enough memory block. free: put the block onto the head of the freelist.
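The freelist scheme above can be sketched in C. This is a minimal first-fit allocator over a fixed arena, with no block splitting or coalescing; Block, fl_init, fl_alloc, and fl_free are hypothetical names for illustration, not the real malloc/free implementation.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_WORDS 1024

typedef struct Block {
    size_t size;          /* payload size in words */
    struct Block *next;   /* next block on the free list */
} Block;

static size_t arena[ARENA_WORDS];
static Block *freelist = NULL;

static void fl_init(void) {
    freelist = (Block *)arena;
    freelist->size = ARENA_WORDS - 2;  /* whole arena minus the 2-word header */
    freelist->next = NULL;
}

/* malloc: search the free list for a big-enough block (first fit) */
static void *fl_alloc(size_t words) {
    Block **prev = &freelist;
    for (Block *b = freelist; b != NULL; prev = &b->next, b = b->next) {
        if (b->size >= words) {
            *prev = b->next;          /* unlink (no splitting, to stay short) */
            return (void *)(b + 1);   /* payload starts after the header */
        }
    }
    return NULL;                      /* no block big enough */
}

/* free: put the block back onto the head of the free list */
static void fl_free(void *p) {
    Block *b = (Block *)p - 1;        /* recover the header */
    b->next = freelist;
    freelist = b;
}
```

Note that fl_alloc must walk the list, which is exactly the "malloc is not free" search cost discussed next.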

Explicit MM Drawbacks. malloc is not free: we might have to do a search to find a big enough block. As the program runs, the heap fragments, leaving many small, unusable pieces.

Explicit MM Solutions: Use multiple free lists, one for each block size: malloc and free become O(1), but we can run out of size-4 blocks even though there are many size-6 blocks or size-2 blocks! Make blocks powers of 2: subdivide blocks to get the right size, and merge adjacent free blocks into the next biggest size; still possibly 30% wasted space. Crucial point: there is no magic bullet. Memory management always has a cost. We want to minimize costs and, these days, maximize reliability.
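The power-of-two size classes above can be illustrated with a small helper that rounds a request up to the next power of two, as a buddy-style allocator would; round_up_pow2 is a hypothetical name, not a standard function. The gap between the request and the rounded size is the internal fragmentation the slide's "wasted space" figure refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Round a requested size up to the next power of two
   (the size class a buddy-style allocator would use). */
static size_t round_up_pow2(size_t n) {
    size_t p = 1;
    while (p < n)
        p <<= 1;       /* double until we cover the request */
    return p;
}
```

For example, a 9-word request lands in the 16-word class, wasting 7 words until the block is freed.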

Automatic MM Languages with explicit MM are harder to program in: programmers are always worrying about dangling pointers and memory leaks, a huge software-engineering burden. It is impossible to develop a secure system, and impossible to use these languages in emerging applications involving mobile code. New languages tend to have automatic MM; e.g., Microsoft is pouring $$$ into developing safe language technology, including a new research project on dependable operating system construction.

Automatic MM Question: how do we decide which objects are garbage? We can't do it exactly, so we conservatively approximate. The normal solution: an object is garbage when it becomes unreachable from the roots. The roots = registers, stack, global static data. If there is no path from the roots to an object, it cannot be used later in the computation, so we can safely recycle its memory.

Object Graph. (Diagram: heap objects reachable from stack roots r1 and r2.) How should we test reachability?

Reference Counting Keep track of the number of pointers to each object (the reference count). When the reference count goes to 0, the object is unreachable: garbage.

Object Graph. (Diagram: a cycle of heap objects unreachable from roots r1 and r2.) Reference counting can't detect cycles.

Reference Counting In place of a single assignment x.f = p, the compiler emits: z = x.f; z.count = z.count - 1; if z.count = 0 then call putonfreelist(z); x.f = p; p.count = p.count + 1. Ouch, that hurts performance-wise! Dataflow analysis can eliminate some increments and decrements, but many remain. Reference counting is used in some special cases, but not usually as the primary GC mechanism in a language implementation.
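The compiled assignment sequence above can be sketched in C. Obj, rc_assign, rc_retain, and rc_release are hypothetical names for illustration, and "putting the object on the free list" is simulated with a counter.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj {
    int count;           /* reference count */
    struct Obj *f;       /* one pointer field */
} Obj;

static int freed = 0;    /* counts objects "put on the free list" */

static void rc_release(Obj *z) {
    if (z == NULL) return;
    z->count--;                  /* z.count = z.count - 1      */
    if (z->count == 0)
        freed++;                 /* stand-in for putonfreelist(z) */
}

static void rc_retain(Obj *p) {
    if (p != NULL)
        p->count++;              /* p.count = p.count + 1      */
}

/* What the compiler emits in place of the single store x.f = p: */
static void rc_assign(Obj *x, Obj *p) {
    Obj *z = x->f;               /* z = x.f                    */
    rc_release(z);               /* drop the old reference     */
    x->f = p;                    /* x.f = p                    */
    rc_retain(p);                /* take the new reference     */
}
```

Note the single store has become a load, a decrement with a conditional call, a store, and an increment, which is exactly the overhead the slide complains about.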

Copying Collection Basic idea: use 2 heaps, one used by the program and the other unused until GC time. GC: start at the roots and traverse the reachable data, copying it from the active heap (from-space) to the other heap (to-space). Dead objects are left behind in from-space. The heaps then switch roles.

Copying Collection. (Diagram: roots point into from-space; to-space is initially empty.)

Copying GC Cheney's algorithm for copying collection: traverse the data breadth-first, copying objects from from-space to to-space. Two pointers into to-space: next (the allocation point) and scan (the next copied object whose fields still need forwarding). (Animation: starting from the roots, objects are copied one at a time as scan advances toward next.) Done when next = scan.
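Cheney's two-pointer traversal can be sketched in C over a toy heap of two-field records. All names here (Rec, forward_obj, collect) are hypothetical; a dedicated field of each record holds the forwarding pointer, where a real collector would overwrite the object's first word instead.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP 64

typedef struct Rec {
    struct Rec *field[2];   /* two pointer fields (NULL = nothing) */
    struct Rec *forward;    /* forwarding pointer, set once copied */
} Rec;

static Rec to_space[HEAP];
static size_t next;          /* allocation point in to-space */

/* Copy one object to to-space (unless already copied) and
   return its new address via the forwarding pointer. */
static Rec *forward_obj(Rec *p) {
    if (p == NULL) return NULL;
    if (p->forward != NULL) return p->forward;   /* already copied */
    Rec *copy = &to_space[next++];
    copy->field[0] = p->field[0];                /* fields still point */
    copy->field[1] = p->field[1];                /* into from-space    */
    copy->forward = NULL;
    p->forward = copy;
    return copy;
}

/* Forward the roots, then scan to-space left to right, forwarding
   everything the copied objects point to. Done when scan = next. */
static void collect(Rec **roots, size_t nroots) {
    size_t scan = 0;
    next = 0;
    for (size_t i = 0; i < nroots; i++)
        roots[i] = forward_obj(roots[i]);
    while (scan < next) {                        /* breadth-first queue */
        Rec *obj = &to_space[scan++];
        obj->field[0] = forward_obj(obj->field[0]);
obj->field[1] = forward_obj(obj->field[1]);
    }
}
```

The region between scan and next acts as the breadth-first work queue, so no extra stack or queue structure is needed; the forwarding pointers also make the algorithm terminate on cyclic data.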

Copying GC Pros: Simple, and collects cycles. Run-time proportional to the number of live objects. Automatic compaction eliminates fragmentation. Fast allocation: just a pointer increment by the object size. Cons: Precise type information is required (is a word a pointer or not?); tag bits take extra space, so normally a header word is used. Twice as much memory is used as the program requires. Usually we anticipate live data will be only a small fraction of the store: allocate until 70% full, so from-space = 70% of the heap and to-space = 30%. Long GC pauses are bad for interactive and real-time apps.

Baker's Concurrent GC GC pauses are avoided by doing GC incrementally. Invariant: the program only holds pointers into to-space. On a field fetch, if the pointer points into from-space, copy the object and fix the pointer. On swap, copy the roots as before. The extra fetch code costs about a 20% performance penalty, but there are no long pauses, so response time is better.

Generational GC Empirical observation: if an object has been reachable for a long time, it is likely to remain so. Empirical observation: in many languages (especially functional languages), most objects die young. Conclusion: we save work by scanning the young objects frequently and the old objects infrequently.

Generational GC Assign objects to different generations G0, G1, ... G0 contains the young objects, which are the most likely to be garbage, so G0 is scanned more often than G1. The common case is two generations (new, tenured). The roots for a GC of G0 include all objects in G1, in addition to the stack and registers.

Generational GC How do we avoid scanning all the tenured objects? Observation: old objects rarely point to new objects. Normally, an object points to older objects when it is initialized, not newer ones; an old-to-new pointer only appears if an old object is modified well after it is created. In functional languages, which use mutation infrequently, pointers from old to new are very uncommon. The compiler inserts extra code on each pointer write to an object field to catch modifications to old objects. A remembered set keeps track of the objects that point into the younger generation, and the remembered set is included in the set of roots for scanning.
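The inserted write-barrier code can be sketched in C. All names here (Obj, write_field, remembered) are hypothetical, and the remembered set is a simple array; real collectors typically use card tables or sequential store buffers instead.

```c
#include <assert.h>
#include <stddef.h>

#define YOUNG 0
#define OLD   1

typedef struct Obj {
    int gen;             /* which generation the object lives in */
    struct Obj *f;       /* one pointer field */
} Obj;

/* Remembered set: old objects that point into the young generation,
   scanned as extra roots during a minor collection. */
static Obj *remembered[16];
static size_t nremembered = 0;

/* What the compiler emits instead of the bare store x->f = p: */
static void write_field(Obj *x, Obj *p) {
    x->f = p;
    if (p != NULL && x->gen == OLD && p->gen == YOUNG)
        remembered[nremembered++] = x;   /* old object now points young */
}
```

Only old-to-young stores pay for a remembered-set entry; young-to-old and young-to-young stores fall through with just the comparison, which is why the barrier is cheap when old-to-new pointers are rare.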

Generational GC Other issues: When do we promote objects from the young generation to the old generation? Usually, after an object survives a collection, it is promoted. How big should the generations be? Appel says each should be exponentially larger than the last. When do we collect the old generation? After several minor collections, we do a major collection.

Generational GC Other issues: Sometimes different GC algorithms are used for the new and old generations. Why? Because they have different characteristics. Copying collection for the new generation: less than 10% of the new data is usually live, and copying collection cost is proportional to the live data. Mark-sweep for the old generation: mark what is reachable, then sweep (reclaim) what is not marked.
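The mark and sweep phases can be sketched in C over a toy heap of two-field nodes. Names (Node, mark, sweep, in_use) are hypothetical, and marking uses plain recursion; a real collector would use an explicit stack or pointer reversal to avoid unbounded recursion.

```c
#include <assert.h>
#include <stddef.h>

#define NNODES 8

typedef struct Node {
    int marked;
    struct Node *field[2];   /* two pointer fields (NULL = nothing) */
} Node;

static Node heap[NNODES];
static int in_use[NNODES];   /* 1 = allocated, 0 = on the free list */

/* Mark phase: mark everything reachable from a root. */
static void mark(Node *n) {
    if (n == NULL || n->marked) return;
    n->marked = 1;
    mark(n->field[0]);
    mark(n->field[1]);
}

/* Sweep phase: walk the whole heap; reclaim what is not marked,
   and clear the mark bit on what is. Returns number reclaimed. */
static int sweep(void) {
    int reclaimed = 0;
    for (int i = 0; i < NNODES; i++) {
        if (!in_use[i]) continue;
        if (heap[i].marked) {
            heap[i].marked = 0;       /* survivor: reset for next GC */
        } else {
            in_use[i] = 0;            /* garbage: back to the free list */
            reclaimed++;
        }
    }
    return reclaimed;
}
```

Unlike copying collection, the sweep touches every allocated block, live or dead, so its cost is proportional to the whole heap; that trade-off suits the old generation, where most objects survive anyway.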