CMSC 330: Organization of Programming Languages

Similar documents
CMSC 330: Organization of Programming Languages. Memory Management and Garbage Collection

CMSC 330: Organization of Programming Languages

Lecture 13: Garbage Collection

Runtime. The optimized program is ready to run What sorts of facilities are available at runtime

Garbage Collection. Akim D le, Etienne Renault, Roland Levillain. May 15, CCMP2 Garbage Collection May 15, / 35

Acknowledgements These slides are based on Kathryn McKinley s slides on garbage collection as well as E Christopher Lewis s slides

Programming Language Implementation

Automatic Memory Management

Garbage Collection. Steven R. Bagley

One-Slide Summary. Lecture Outine. Automatic Memory Management #1. Why Automatic Memory Management? Garbage Collection.

Heap storage. Dynamic allocation and stacks are generally incompatible.

Garbage Collection (1)

Run-Time Environments/Garbage Collection

Memory Allocation. Static Allocation. Dynamic Allocation. Dynamic Storage Allocation. CS 414: Operating Systems Spring 2008

G Programming Languages - Fall 2012

Garbage Collection. Hwansoo Han

CS 345. Garbage Collection. Vitaly Shmatikov. slide 1

Heap Management. Heap Allocation

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

CA341 - Comparative Programming Languages

Deallocation Mechanisms. User-controlled Deallocation. Automatic Garbage Collection

Managed runtimes & garbage collection

Algorithms for Dynamic Memory Management (236780) Lecture 4. Lecturer: Erez Petrank

Lecture 15 Garbage Collection

A.Arpaci-Dusseau. Mapping from logical address space to physical address space. CS 537:Operating Systems lecture12.fm.2

Performance of Non-Moving Garbage Collectors. Hans-J. Boehm HP Labs

Garbage Collection Algorithms. Ganesh Bikshandi

CSE P 501 Compilers. Memory Management and Garbage Collec<on Hal Perkins Winter UW CSE P 501 Winter 2016 W-1

Automatic Garbage Collection

Structure of Programming Languages Lecture 10

CS577 Modern Language Processors. Spring 2018 Lecture Garbage Collection

CS61C : Machine Structures

Garbage Collection. Weiyuan Li

Memory Management. Memory Management... Memory Management... Interface to Dynamic allocation

Robust Memory Management Schemes

CS 241 Honors Memory

In Java we have the keyword null, which is the value of an uninitialized reference type

Garbage Collection Techniques

Lecture Notes on Garbage Collection

Design Issues. Subroutines and Control Abstraction. Subroutines and Control Abstraction. CSC 4101: Programming Languages 1. Textbook, Chapter 8

ACM Trivia Bowl. Thursday April 3 rd (two days from now) 7pm OLS 001 Snacks and drinks provided All are welcome! I will be there.

Motivation for Dynamic Memory. Dynamic Memory Allocation. Stack Organization. Stack Discussion. Questions answered in this lecture:

Compiler Construction D7011E

Habanero Extreme Scale Software Research Project

CS61C : Machine Structures

Compiler Construction

Class Information ANNOUCEMENTS

CMSC 330: Organization of Programming Languages. Ownership, References, and Lifetimes in Rust

Compiler Construction

6.172 Performance Engineering of Software Systems Spring Lecture 9. P after. Figure 1: A diagram of the stack (Image by MIT OpenCourseWare.

Lecture Notes on Garbage Collection

Garbage Collection. Vyacheslav Egorov

Exploiting the Behavior of Generational Garbage Collector

Programmer Directed GC for C++ Michael Spertus N2286= April 16, 2007

CS61, Fall 2012 Section 2 Notes

Dynamic Memory Management

Lecture 13: Complex Types and Garbage Collection

Dynamic Storage Allocation

Dynamic Memory Management

Manual Allocation. CS 1622: Garbage Collection. Example 1. Memory Leaks. Example 3. Example 2 11/26/2012. Jonathan Misurda

Dynamic Memory Management! Goals of this Lecture!

Run-time Environments -Part 3

CS558 Programming Languages

CS 4120 Lecture 37 Memory Management 28 November 2011 Lecturer: Andrew Myers

CSC 1600 Memory Layout for Unix Processes"

Lecture 23: Object Lifetime and Garbage Collection

CS61C : Machine Structures

CS558 Programming Languages

CS61C : Machine Structures

Programming Languages Third Edition. Chapter 10 Control II Procedures and Environments

Storage. Outline. Variables and Updating. Composite Variables. Storables Lifetime : Programming Languages. Course slides - Storage

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

Lecture 7 More Memory Management Slab Allocator. Slab Allocator

Concepts Introduced in Chapter 7

Limitations of the stack

The issues. Programming in C++ Common storage modes. Static storage in C++ Session 8 Memory Management

Name, Scope, and Binding. Outline [1]

Types II. Hwansoo Han

CPSC 213. Introduction to Computer Systems. Winter Session 2017, Term 2. Unit 1c Jan 24, 26, 29, 31, and Feb 2

Operating Systems CMPSCI 377, Lec 2 Intro to C/C++ Prashant Shenoy University of Massachusetts Amherst

CS558 Programming Languages

CSCI-1200 Data Structures Fall 2011 Lecture 24 Garbage Collection & Smart Pointers

Advanced Programming & C++ Language

Memory Allocation III

Compilers. 8. Run-time Support. Laszlo Böszörmenyi Compilers Run-time - 1

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

Reference Counting. Reference counting: a way to know whether a record has other users

Heap, Variables, References, and Garbage. CS152. Chris Pollett. Oct. 13, 2008.

Quantifying the Performance of Garbage Collection vs. Explicit Memory Management

Algorithms for Dynamic Memory Management (236780) Lecture 1

CSC 533: Organization of Programming Languages. Spring 2005

C11: Garbage Collection and Constructors

CS Computer Systems. Lecture 8: Free Memory Management

The Generational Hypothesis

Implementation Garbage Collection

Memory Management. Didactic Module 14 Programming Languages - EEL670 1

Memory Management. Chapter Fourteen Modern Programming Languages, 2nd ed. 1

Opera&ng Systems CMPSCI 377 Garbage Collec&on. Emery Berger and Mark Corner University of Massachuse9s Amherst

Dynamic Memory Allocation

Transcription:

CMSC 330: Organization of Programming Languages Memory Management and Garbage Collection CMSC 330 Spring 2017 1

Memory Attributes Memory to store data in programming languages has the following lifecycle Allocation Ø When the memory is allocated to the program Lifetime Ø How long allocated memory is used by the program Recovery Ø When the system recovers the memory for reuse The allocator is the system feature that performs allocation and recovery CMSC 330 Spring 2017 2

Memory Attributes (cont.) Most programming languages are concerned with some subset of the following 4 memory classes 1. Static (or fixed) memory 2. Automatic memory 3. Dynamically allocated memory 4. Persistent memory CMSC 330 Spring 2017 3

Memory Classes Static memory Usually at a fixed address Lifetime The execution of program Allocation For entire execution Recovery By system when program terminates Allocator Compiler Automatic memory Usually on a stack Lifetime Activation of method using that data Allocation When method is invoked Recovery When method terminates Allocator Typically compiler, sometimes programmer CMSC 330 Spring 2017 4

Memory Classes (cont.) Dynamic memory Addresses allocated on demand in an area called the heap Lifetime As long as memory is needed Allocation Explicitly by programmer, or implicitly by compiler Recovery Either by programmer or automatically (when possible and depends upon language) Allocator Manages free/available space in heap CMSC 330 Spring 2017 5

Memory Classes (cont.) Persistent memory Usually the file system Lifetime Multiple executions of a program Ø E.g., files or databases Allocation By program or user Ø Often outside of program execution Recovery When data no longer needed Dealing with persistent memory databases Ø CMSC 424 CMSC 330 Spring 2017 6

Memory Management in C Local variables live on the stack Allocated at function invocation time Deallocated when function returns Storage space reused after function returns Space on the heap allocated with malloc() Must be explicitly freed with free() Called explicit or manual memory management Ø Deletions must be done by the user CMSC 330 Spring 2017 7

Memory Management Errors May forget to free memory (memory leak) { int *x = (int *) malloc(sizeof(int)); } May retain ptr to freed memory (dangling pointer) { int *x =...malloc(); } free(x); *x = 5; /* oops! */ May try to free something twice { int *x =...malloc(); free(x); free(x); } Ø This may corrupt the memory management data structures E.g., the memory allocator maintains a free list of space on the heap that s available CMSC 330 Spring 2017 8

Ways to Avoid Mistakes in C Don t allocate memory on the heap Could lead to confusing code Never free memory OS will reclaim process s memory anyway at exit Memory is cheap; who cares about a little leak? But: Both of the above two may be impractical Can avoid all three problems by using automatic memory management Key technology to ensure type safety Does not prevent all leaks, as we will see CMSC 330 Spring 2017 9

Fragmentation Another memory management problem Example sequence of calls allocate(a); allocate(x); allocate(y); free(a); allocate(z); free(y); allocate(b); Þ Not enough contiguous space for b CMSC 330 Spring 2017 10

Automatic memory management Primary goal: automatically reclaim dynamic memory Secondary goal: also avoid fragmentation Insight: You can do reclamation and avoid fragmentation if you can identify every pointer in a program You can move the allocated storage, then redirect pointers to it Ø Compact it, to avoid fragmentation Compiler ensures perfect knowledge LISP, OCAML, Java, Prolog (with caveats), but not in C, C++, Pascal, Ada CMSC 330 Spring 2017 11

Strategy At any point during execution, can divide the objects in the heap into two classes Live objects will be used later Dead objects will never be used again Ø They are garbage Thus we need garbage collection (GC) algorithms that can 1.Distinguish live from dead objects 2.Reclaim the dead objects and retain the live ones CMSC 330 Spring 2017 12

Determining Liveness In most languages we can t know for sure which objects are really live or dead Undecidable, like solving the halting problem Thus we need to make a safe approximation OK if we decide something is live when it s not But we d better not deallocate an object that will be used later on CMSC 330 Spring 2017 13

Liveness by Reachability An object is reachable if it can be accessed by dereferencing ( chasing ) pointers from live data Safe policy: delete unreachable objects An unreachable object can never be accessed again by the program Ø The object is definitely garbage A reachable object may be accessed in the future Ø The object could be garbage but will be retained anyway Ø Could lead to memory leaks CMSC 330 Spring 2017 14

Roots At a given program point, we define liveness as being data reachable from the root set Global variables Ø What are these in Java? Ruby? OCaml? Local variables of all live method activations Ø I.e., the stack At the machine level Also consider the register set Ø Usually stores local or global variables Next Techniques for determining reachability CMSC 330 Spring 2017 15

Reference Counting Idea: Each object has count of number of pointers to it from the roots or other objects When count reaches 0, object is unreachable Count tracking code may be manual or automatic In regular use C++ (smart pointer library), Cocoa (manual), Python Method doesn t address fragmentation problem Invented by Collins in 1960 A method for overlapping and erasure of lists. Communications of the ACM, December 1960 CMSC 330 Spring 2017 16

Reference Counting Example stack 1 2 1 CMSC 330 Spring 2017 17

Reference Counting Example (cont.) stack 1 2 1 1 1 CMSC 330 Spring 2017 18

Reference Counting Example (cont.) stack 1 2 1 1 1 CMSC 330 Spring 2017 19

Reference Counting Example (cont.) stack 1 2 1 1 0 1 CMSC 330 Spring 2017 20

Reference Counting Example (cont.) stack 1 2 1 1 CMSC 330 Spring 2017 21

Reference Counting Example (cont.) stack 1 2 1 0 1 CMSC 330 Spring 2017 22

Reference Counting Example (cont.) stack 1 CMSC 330 Spring 2017 23

Reference Counting Tradeoffs Advantage Incremental technique Ø Generally small, constant amount of work per memory write Ø With more effort, can even bound running time Disadvantages Cascading decrements can be expensive Requires extra storage for reference counts Need other means to collect cycles, for which counts never go to 0 CMSC 330 Spring 2017 24

Tracing Garbage Collection Idea: Determine reachability as needed, rather than by stored counts Every so often, stop the world and Follow pointers from live objects (starting at roots) to expand the live object set Ø Repeat until no more reachable objects Deallocate any non-reachable objects Two main variants of tracing GC Mark/sweep (McCarthy 1960) and stop-and-copy (Cheney 1970) CMSC 330 Spring 2017 25

Mark and Sweep GC Two phases Mark phase: trace the heap and mark all reachable objects Sweep phase: go through the entire heap and reclaim all unmarked objects CMSC 330 Spring 2017 26

Mark and Sweep Example stack CMSC 330 Spring 2017 27

Mark and Sweep Example (cont.) stack CMSC 330 Spring 2017 28

Mark and Sweep Example (cont.) stack CMSC 330 Spring 2017 29

Mark and Sweep Example (cont.) stack CMSC 330 Spring 2017 30

Mark and Sweep Example (cont.) stack CMSC 330 Spring 2017 31

Mark and Sweep Example (cont.) stack CMSC 330 Spring 2017 32

Mark and Sweep Example (cont.) stack CMSC 330 Spring 2017 33

Mark and Sweep Advantages No problem with cycles Memory writes have no cost Non-moving Live objects stay where they are Makes conservative GC possible Ø Used when identification of pointer vs. non-pointer uncertain Ø More later CMSC 330 Spring 2017 34

Mark and Sweep Disadvantages Fragmentation Available space broken up into many small pieces Ø Thus many mark-and-sweep systems may also have a compaction phase (like defragmenting your disk) Cost proportional to heap size Sweep phase needs to traverse whole heap it touches dead memory to put it back on to the free list CMSC 330 Spring 2017 35

Copying GC Like mark and sweep, but only touches live objects Divide heap into two equal parts (semispaces) Only one semispace active at a time At GC time, flip semispaces 1. Trace the live data starting from the roots 2. Copy live data into other semispace 3. Declare everything in current semispace dead 4. Switch to other semispace CMSC 330 Spring 2017 36

Copying GC Example stack CMSC 330 Spring 2017 37

Copying GC Example (cont.) stack CMSC 330 Spring 2017 38

Copying GC Example (cont.) stack CMSC 330 Spring 2017 39

Copying GC Example (cont.) stack CMSC 330 Spring 2017 40

Copying GC Tradeoffs Advantages Only touches live data No fragmentation (automatically compacts) Ø Will probably increase locality Disadvantages Requires twice the memory space CMSC 330 Spring 2017 41

Quiz 1 Which garbage collection implementation requires more storage? A.Mark and Sweep B.Copying GC CMSC 330 Spring 2017 42

Quiz 1 Which garbage collection implementation requires more storage? A.Mark and Sweep B.Copying GC CMSC 330 Spring 2017 43

Quiz 2 Which compacts the heap to prevent fragmentation? A.Mark and Sweep B.Reference Counting C.Copying GC CMSC 330 Spring 2017 44

Quiz 2 Which compacts the heap to prevent fragmentation? A.Mark and Sweep B.Reference Counting C.Copying GC CMSC 330 Spring 2017 45

Quiz 3 The computational cost of Copying GC is proportional to the heap size A.True B.False CMSC 330 Spring 2017 46

Quiz 3 The computational cost of Copying GC is proportional to the heap size A.True B.False CMSC 330 Spring 2017 47

Quiz 4 Which of the following happens most frequently? A.Reference Count Updating B.Mark and Sweep checking for dead memory C.Copying GC copying live data CMSC 330 Spring 2017 48

Quiz 4 Which of the following happens most frequently? A.Reference Count Updating B.Mark and Sweep checking for dead memory C.Copying GC copying live data CMSC 330 Spring 2017 49

Stop the World: Potentially Long Pause Both of the previous algorithms stop the world by prohibiting program execution during GC Ensures that previously processed memory is not changed or accessed, creating inconsistency But the execution pause could be too long Bad if your car s braking system performs GC while you are trying to stop at a busy intersection! How can we reduce the pause time of GC? Don t collect the whole heap at once (incremental) CMSC 330 Spring 2017 50

The Generational Principle More objects live Þ Young objects die quickly; old objects keep living Object lifetime increases Þ CMSC 330 Spring 2017 51

Generational Collection Long lived objects visited multiple times Idea: Have more than one heap region, divide into generations Ø Older generations collected less often Ø Objects that survive many collections get promoted into older generations Ø Need to track pointers from old to young generations to use as roots for young generation collection Tracking one in the remembered set One popular setup: Generational, copying GC CMSC 330 Spring 2017 52

Conservative Garbage Collection (for C) For C, we cannot be sure which elements of an object are pointers Because of incomplete type information, the use of unsafe casts, etc. Idea: suppose it is a pointer if it looks like one Most pointers are within a certain address range, they are word aligned, etc. May retain memory spuriously Different styles of conservative collector Mark-sweep: important that objects not moved Mostly-copying: can move objects you are sure of CMSC 330 Spring 2017 53

Memory Management in Ruby Local variables live on the stack Storage reclaimed when method returns Objects live on the heap Created with calls to Class.new Objects never explicitly freed Ruby uses automatic memory management CMSC 330 Spring 2017 54

Memory Management in OCaml Local variables live on the stack Tuples, closures, and constructed types live on the heap let x = (3, 4) (* heap-allocated *) let f x y = x + y in f 3 (* result heap-allocated *) type a t = None Some of a None (* not on the heap just a primitive *) Some 37 (* heap-allocated *) Garbage collection reclaims memory CMSC 330 Spring 2017 55

Memory Management in Java Local variables live on the stack Allocated at method invocation time Deallocated when method returns Other data lives on the heap Memory is allocated with new But never explicitly deallocated Ø Java uses automatic memory management CMSC 330 Spring 2017 56

Java HotSpot SDK 1.4.2 Collector Multi-generational, hybrid collector Young generation Ø Stop and copy collector Tenured generation Ø Mark and sweep collector Permanent generation Ø No collection CMSC 330 Spring 2017 57

More Issues in GC (cont.) Stopping the world is a big hit Unpredictable performance Ø Bad for real-time systems Need to stop all threads Ø Without a much more sophisticated GC One-size-fits-all solution Sometimes, GC just gets in the way But correctness comes first CMSC 330 Spring 2017 58

What Does GC Mean to You? Ideally, nothing GC should make programming easier GC should not affect performance (much) Usually bad idea to manage memory yourself Using object pools, free lists, object recycling, etc GC implementations have been heavily tuned Ø May be more efficient than explicit deallocation If GC becomes a problem, hard to solve You can set parameters of the GC You can modify your program CMSC 330 Spring 2017 59

Increasing Memory Performance Don t allocate as much memory Less work for your application Less work for the garbage collector Don t hold on to references Null out pointers in data structures Example Object a = new Object; use a a = null; // when a is no longer needed CMSC 330 Spring 2017 60

Find the Memory Leak class Stack { private Object[] stack; private int index; public Stack(int size) { stack = new Object[size]; } public void push(object o) { stack[index++] = o; } public void pop() { stack[index] = null; // null out ptr return stack[index--]; } } From Haggar, Garbage Collection and the Java Platform Memory Model Answer: pop() leaves item on stack array; storage not reclaimed CMSC 330 Spring 2017 61