Quantifying the Performance of Garbage Collection vs. Explicit Memory Management

Similar documents
Opera&ng Systems CMPSCI 377 Garbage Collec&on. Emery Berger and Mark Corner University of Massachuse9s Amherst

Garbage Collection Techniques

Myths and Realities: The Performance Impact of Garbage Collection

QUANTIFYING AND IMPROVING THE PERFORMANCE OF GARBAGE COLLECTION

Runtime. The optimized program is ready to run What sorts of facilities are available at runtime

Free-Me: A Static Analysis for Automatic Individual Object Reclamation

CMSC 330: Organization of Programming Languages

Acknowledgements These slides are based on Kathryn McKinley s slides on garbage collection as well as E Christopher Lewis s slides

CMSC 330: Organization of Programming Languages

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

Managed runtimes & garbage collection

CMSC 330: Organization of Programming Languages. Memory Management and Garbage Collection

Garbage Collection. Steven R. Bagley

Garbage Collection (1)

Performance of Non-Moving Garbage Collectors. Hans-J. Boehm HP Labs

CS 345. Garbage Collection. Vitaly Shmatikov. slide 1

Garbage Collection. Akim D le, Etienne Renault, Roland Levillain. May 15, CCMP2 Garbage Collection May 15, / 35

Lecture 15 Garbage Collection

Deallocation Mechanisms. User-controlled Deallocation. Automatic Garbage Collection

Run-time Environments -Part 3

Exploiting the Behavior of Generational Garbage Collector

Lecture Notes on Garbage Collection

Lecture 13: Garbage Collection

Programming Language Implementation

G Programming Languages - Fall 2012

Automatic Memory Management

Heap Management. Heap Allocation

Robust Memory Management Schemes

Manual Allocation. CS 1622: Garbage Collection. Example 1. Memory Leaks. Example 3. Example 2 11/26/2012. Jonathan Misurda

C11: Garbage Collection and Constructors

CS61C : Machine Structures

Older-First Garbage Collection in Practice: Evaluation in a Java Virtual Machine

CS2210: Compiler Construction. Runtime Environment

Garbage Collection. Weiyuan Li

CA341 - Comparative Programming Languages

Dynamic Allocation of Memory

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

CS558 Programming Languages

arxiv: v1 [cs.pl] 30 Apr 2015

Garbage Collection. Hwansoo Han

CSE P 501 Compilers. Memory Management and Garbage Collec<on Hal Perkins Winter UW CSE P 501 Winter 2016 W-1

Azul Systems, Inc.

One-Slide Summary. Lecture Outine. Automatic Memory Management #1. Why Automatic Memory Management? Garbage Collection.

Dynamic Memory Allocation

Run-Time Environments/Garbage Collection

Program Analysis. Uday Khedker. ( uday) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Garbage Collection Algorithms. Ganesh Bikshandi

CRAMM: Virtual Memory Support for Garbage-Collected Applications

Operating Systems CMPSCI 377, Lec 2 Intro to C/C++ Prashant Shenoy University of Massachusetts Amherst

the gamedesigninitiative at cornell university Lecture 9 Memory Management

CS 241 Honors Memory

Run-time Environments - 3

6.172 Performance Engineering of Software Systems Spring Lecture 9. P after. Figure 1: A diagram of the stack (Image by MIT OpenCourseWare.

Dynamic Storage Allocation

Memory Management: The Details

C10: Garbage Collection and Constructors

Memory Allocation III

Algorithms for Dynamic Memory Management (236780) Lecture 4. Lecturer: Erez Petrank

Habanero Extreme Scale Software Research Project

Algorithms for Dynamic Memory Management (236780) Lecture 1

Structure of Programming Languages Lecture 10

Memory Management. Memory Management... Memory Management... Interface to Dynamic allocation

CS Computer Systems. Lecture 8: Free Memory Management

CS842: Automatic Memory Management and Garbage Collection. Mark and sweep

Memory Allocation III

CS558 Programming Languages

Short-term Memory for Self-collecting Mutators. Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg

Dynamic Selection of Application-Specific Garbage Collectors

IBM Research Report. Efficient Memory Management for Long-Lived Objects

Automatic Heap Sizing: Taking Real Memory Into Account

Lecture 7: Binding Time and Storage

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11


Internal Fragmentation

Advanced Programming & C++ Language

Dynamic Memory Allocation I Nov 5, 2002

CPSC 213. Introduction to Computer Systems. Instance Variables and Dynamic Allocation. Unit 1c

Dynamic Memory Allocation: Advanced Concepts

Dynamic Memory Allocation II October 22, 2008

Design Issues. Subroutines and Control Abstraction. Subroutines and Control Abstraction. CSC 4101: Programming Languages 1. Textbook, Chapter 8

Dynamic Memory Allocation I

Limitations of the stack

Lecture Notes on Garbage Collection

Dynamic Allocation of Memory

Memory Allocation II. CSE 351 Autumn Instructor: Justin Hsia

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions?

I J C S I E International Science Press

Dynamic Memory Allocation. Gerson Robboy Portland State University. class20.ppt

CPSC 213. Introduction to Computer Systems. Winter Session 2017, Term 2. Unit 1c Jan 24, 26, 29, 31, and Feb 2

Dynamic Memory Allocation

Memory Allocation III CSE 351 Spring (original source unknown)

Memory Allocation III

CMSC 330: Organization of Programming Languages. Ownership, References, and Lifetimes in Rust

ACM Trivia Bowl. Thursday April 3 rd (two days from now) 7pm OLS 001 Snacks and drinks provided All are welcome! I will be there.

EMERY BERGER Research Statement

CS558 Programming Languages

Book keeping. Will post HW5 tonight. OK to work in pairs. Midterm review next Wednesday

Review. Partitioning: Divide heap, use different strategies per heap Generational GC: Partition by age Most objects die young

Today. Dynamic Memory Allocation: Advanced Concepts. Explicit Free Lists. Keeping Track of Free Blocks. Allocating From Explicit Free Lists

CS2210: Compiler Construction. Runtime Environment

Transcription:

Quantifying the Performance of Garbage Collection vs. Explicit Memory Management Matthew Hertz Canisius College Emery Berger University of Massachusetts Amherst

Explicit Memory Management malloc / new allocates space for an object free / delete returns memory to system Simple, but tricky to get right Forget to free memory leak free too soon dangling pointer

Dangling Pointers Node x = new Node ( happy ); Node ptr = x; delete x; // // But I m not dead yet! Node y = new Node ( sad ); cout << ptr->data << endl; // // sad Insidious, hard-to-track down bugs

Solution: Garbage Collection No need to call free Garbage collector periodically scans objects on heap Reclaims unreachable objects Won t reclaim objects until it can prove object will not be used again

No More Dangling Pointers Node x = new Node ( happy ); Node ptr = x; // // x still live (reachable through ptr) Node y = new Node ( sad ); cout << ptr->data << endl; // // happy! So why not use GC all the time?

Conventional Wisdom GC worse than malloc, because Extra processing (collection) Poor cache performance (ibid) Bad page locality (ibid) Increased footprint (delayed reclamation)

Conventional Wisdom GC improves performance, by Quicker allocation (fast path inlining & bump pointer alloc.) Better cache performance (object reordering) Improved page locality (heap compaction)

Outline Motivation Quantifying GC performance A hard problem Oracular memory management Selecting & generating the Oracles Experimental methodology Results

Quantifying GC Performance Apples-to-apples comparison Examine unaltered applications Runs differ only in memory manager Examine impact on time & space

Quantifying GC Performance Evaluate state-of-art algorithms Garbage collectors Generational collectors Copying collectors Standard for Java, not compatible with C/C++ Explicit memory managers Fast, single-threaded allocators

Comparing Memory Managers Node v = malloc(sizeof(node)); v->data= malloc(sizeof(nodedata)); memcpy(v->data, old->data, sizeof(nodedata)); free(old->data); v->next = old->next; v->next->prev = v; v->prev = old->prev; v->prev->next = v; free(old); BDW Collector Using GC in C/C++ is easy:

Comparing Memory Managers Node v = malloc(sizeof(node)); v->data= malloc(sizeof(nodedata)); memcpy(v->data, old->data, sizeof(nodedata)); free(old->data); v->next = old->next; v->next->prev = v; v->prev = old->prev; v->prev->next = v; free(old); BDW Collector ignore calls to free.

Comparing Memory Managers Node node = new Node(); node.data = new NodeData(); usenode(node); node = null;... node = new Node();... node.data = new NodeData();... Lea Allocator Adding malloc/free to Java: not easy

Comparing Memory Managers Node node = new Node(); node.data = new NodeData(); usenode(node); node = null;... node = new Node();... node.data = new NodeData();... free(node)? free(node.data)? Lea Allocator... where should free be inserted?

Inserting Free Calls Do not know where programmer would call free Hints provided from null-ing pointers Restructure code to avoid memory leaks? Tests programming skills, not memory manager Want unaltered applications

Oracular Memory Manager Java C malloc/free execute program here Simulator allocation perform actions at no cost below here Oracle Consult oracle to place free calls Oracle does not disrupt hardware state Simulator inserts free

Object Lifetime & Oracle Placement obj obj = new new Object; live reachable free(obj) freed can be byfreed lifetime-based oracle freed by reachability-based free(obj) free(??) can be collected oracle dead unreachable Oracles bracket placement of frees Lifetime-based: most aggressive Reachability-based: most conservative

Reachability Oracle Generation Java C malloc/free execute program here allocations, ptr updates, prog roots PowerPC Simulator perform actions at no cost below here trace file Merlin analysis Oracle Illegal instructions mark heap events Simulated identically to legal instructions

Liveness Oracle Generation Java C malloc/free execute program here allocations, mem access, prog roots PowerPC Simulator perform actions at no cost below here trace file Postprocess Oracle Record allocations, memory accesses Preserve code, type objects, etc. May use objects without accessing them

Liveness Oracle Generation Java C malloc/free execute program here allocation, mem access, prog. roots if (f.x == y) { } PowerPC Simulator uses address of f.x, perform actions at no cost below here trace Postprocess not touch f.x or Oracle but file may f Record allocations, memory accesses Preserve code, type objects, etc. May use objects without accessing them

Oracular Memory Manager Java C malloc/free execute program here PowerPC Simulator allocation perform actions at no cost below here oracle Consult oracle before each allocation When needed, modify instructions to call free Extra costs hidden by simulator

Experimental Methodology Java platform: MMTk/Jikes RVM(2.3.2) Simulator: Dynamic SimpleScalar (DSS) Simulates 2GHz PowerPC processor G5 cache configuration

Experimental Methodology Garbage collectors: GenMS, GenCopy, GenRC, SemiSpace, CopyMS, MarkSweep Explicit memory managers: Lea, MSExplicit (MS + explicit deallocation)

Experimental Methodology Perfectly repeatable runs Pseudoadaptive compiler Same sequence of optimizations Advice generated from mean of 5 runs Deterministic thread switching Deterministic system clock Use illegal instructions in all runs

Execution Time for pseudojbb GC performance can be competitive

Footprint at Quickest Run 800% 700% 600% 500% 400% 300% 200% 100% 0% Lea w/ Reach Lea w/ Li f e MSExpl i ci t w/ Reach Ki ngsl ey GenMS GenCopy CopyMS Semi Space Mar ksweep GC uses much more memory

Footprint at Quickest Run 800% 700% 7.69 7.09 600% 5.66 500% 5.10 4.84 400% 300% 200% 100% 1.00 0.63 1.38 1.61 0% Lea w/ Reach Lea w/ Li f e MSExpl i ci t w/ Reach Ki ngsl ey GenMS GenCopy CopyMS Semi Space Mar ksweep GC uses much more memory

Avg. Relative Cycles and Footprint GC trades space for time

Javac Paging Performance Much slower in limited physical RAM

pseudojbb Paging Performance Lifetime analysis adds little

Summary of Results Best collector equals Lea's performance Up to 10% faster on some benchmarks... but uses more memory Quickest runs use 5x or more memory At least twice mean footprint

Take-home: Practitioners GC ok if: system has more than 3x needed RAM, and no competition with other processes GC not so good if: Limited RAM Competition for physical memory Relies upon RAM for performance In-memory database Search engines, etc.

Take-home: Researchers GC performance already good enough with enough RAM Problems: Paging is a killer Performance suffers when RAM limited

Future Work Obvious dimensions Other collectors: Bookmarking collector [PLDI 05] Parallel collectors Other allocators: New version of DLmalloc (2.8.2) VAM [ISMM 05] Other architectures: Examine impact of cache size

Future Work Other memory management methods Regions, reaps

Conclusion Code available at: http://www-cs.canisius.edu/~hertzm