Lecture 13: Complex Types and Garbage Collection


COMP 524 Programming Language Concepts
Stephen Olivier, March 17, 2009
Based on slides by A. Block and notes by N. Fisher, F. Hernandez-Campos, and D. Stotts

Goals
Discuss arrays, pointers, recursive types, strings, sets, lists, equality, and garbage collection.

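Arrays
Arrays are usually stored in contiguous locations, in one of two layouts.
Row-major order: the elements of each row are adjacent in memory, and rows follow one another.
Column-major order: the elements of each column are adjacent in memory, and columns follow one another.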

Multidimensional Arrays: Address Calculations
A: array [L1..U1] of array [L2..U2] of array [L3..U3] of element_type
(Figure: the array A drawn as a three-dimensional block with indices i, j, k and lower bounds L1, L2, L3.)

$S_3 = \text{size of element}$
$S_2 = (U_3 - L_3 + 1) \times S_3$
$S_1 = (U_2 - L_2 + 1) \times S_2$

Address of A[i, j, k]:
$\text{address of } A + (i - L_1) \times S_1 + (j - L_2) \times S_2 + (k - L_3) \times S_3$

Optimized:
$(i \times S_1) + (j \times S_2) + (k \times S_3) + \text{address of } A - [(L_1 \times S_1) + (L_2 \times S_2) + (L_3 \times S_3)]$
The bracketed term $[(L_1 \times S_1) + (L_2 \times S_2) + (L_3 \times S_3)]$ can be determined at compile time.
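As a rough check of the two address formulas, the following C sketch computes the byte offset of A[i, j, k] from the start of A both ways. The type Range and the function names are made up for this example, and a real compiler would fold the bracketed constant at compile time rather than compute it on each call.

```c
#include <assert.h>

/* Inclusive bounds of one dimension: [lo..hi]. Hypothetical helper type. */
typedef struct { long lo, hi; } Range;

/* Direct form: (i-L1)*S1 + (j-L2)*S2 + (k-L3)*S3. */
long offset_direct(Range d1, Range d2, Range d3, long elem_size,
                   long i, long j, long k)
{
    long s3 = elem_size;
    long s2 = (d3.hi - d3.lo + 1) * s3;
    long s1 = (d2.hi - d2.lo + 1) * s2;
    assert(i >= d1.lo && i <= d1.hi);   /* bounds check on the first index */
    return (i - d1.lo) * s1 + (j - d2.lo) * s2 + (k - d3.lo) * s3;
}

/* Optimized form: i*S1 + j*S2 + k*S3 minus a constant that depends
   only on the declared bounds and the element size. */
long offset_optimized(Range d1, Range d2, Range d3, long elem_size,
                      long i, long j, long k)
{
    long s3 = elem_size;
    long s2 = (d3.hi - d3.lo + 1) * s3;
    long s1 = (d2.hi - d2.lo + 1) * s2;
    long constant = d1.lo * s1 + d2.lo * s2 + d3.lo * s3;
    return i * s1 + j * s2 + k * s3 - constant;
}
```

For A: array [1..3] of array [0..4] of array [2..5] of a 4-byte element, both functions give offset 136 for A[2, 3, 4] and offset 0 for A[1, 0, 2].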

Heap-based Allocation
The heap is a region of storage in which sub-blocks can be allocated and deallocated.

Issues with Heap Allocation
Pointers: used in the value model of variables; not necessary for the reference model.
Dangling references.
Garbage collection.

Pointers
Pointers serve two purposes:
Efficient (and sometimes intuitive) access to elaborated objects (as in C).
Dynamic creation of linked data structures, in conjunction with a heap storage manager.
Several languages (e.g., Pascal) restrict pointers to accessing things in the heap.
Pointers are used with a value model of variables. They aren't needed with a reference model, where they are already implicit.

C Pointers and Arrays
Pointers and arrays are closely linked in C. In a parameter declaration:
int *a is equivalent to int a[]
int **a is equivalent to int *a[]

C Pointers and Arrays
But the equivalences don't always hold. Specifically, a declaration allocates an array if it specifies a size for the first dimension; otherwise it allocates a pointer.
int **a, int *a[]   // pointer to pointer to int
int *a[n]           // n-element array of row pointers
int a[n][m]         // 2-D array

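Contiguous 2D Arrays vs. Row Pointers (C)
char schools[][5] = { "UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU" };  /* 35 B data */
char *schools[] = { "UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU" };    /* 28 B data + 28 B pointers */
(Figure: the first declaration stores seven fixed-width 5-byte rows back to back, with unused bytes after the shorter names; the second stores seven pointers, each referring to a string that occupies only as many bytes as it needs.)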
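The byte counts on this slide can be checked with sizeof and strlen. In the sketch below, the names schools_2d and schools_ptr are introduced only so that both declarations can coexist in one file. The slide's figure of 28 bytes of pointers assumes 4-byte pointers, so on a 64-bit machine the pointer array will be larger.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Contiguous 2-D array: 7 rows of exactly 5 bytes each. */
char schools_2d[][5] = { "UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU" };

/* Row pointers: 7 pointers, each to a string stored elsewhere. */
const char *schools_ptr[] = { "UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU" };

/* Total size of the contiguous array, padding included. */
size_t contiguous_bytes(void) { return sizeof schools_2d; }

/* Size of the pointer array alone (platform dependent). */
size_t pointer_bytes(void) { return sizeof schools_ptr; }

/* Character data reachable through the pointers, terminators included. */
size_t pointed_data_bytes(void)
{
    size_t total = 0;
    size_t n = sizeof schools_ptr / sizeof schools_ptr[0];
    for (size_t i = 0; i < n; i++)
        total += strlen(schools_ptr[i]) + 1;
    return total;
}
```

contiguous_bytes() is 35 and pointed_data_bytes() is 28, matching the slide; pointer_bytes() is 7 times the size of a pointer on the target machine.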

Recursive Types (now with Pointers!): Binary Tree
C:
struct chr_tree {
    struct chr_tree *l, *r;
    char var;
};
Ada:
type chr_tree;
type chr_tree_ptr is access chr_tree;
type chr_tree is record
    left, right : chr_tree_ptr;
    val : character;
end record;

Binary Tree with Explicit Pointers
(Figure: a binary tree in C with root R, whose children are X and Y; Y has children Z and W.)
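A minimal C sketch of how such a tree might be built on the heap with the chr_tree struct from the previous slide. The tree shape (R with children X and Y, and Z and W under Y) follows the Lisp expression on the next slide; the helper names node, count, destroy, and build_example are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

struct chr_tree {
    struct chr_tree *l, *r;
    char var;
};

/* Allocate one node on the heap; returns NULL if malloc fails. */
struct chr_tree *node(char var, struct chr_tree *l, struct chr_tree *r)
{
    struct chr_tree *n = malloc(sizeof *n);
    if (n != NULL) {
        n->var = var;
        n->l = l;
        n->r = r;
    }
    return n;
}

/* Number of nodes reachable from t. */
int count(const struct chr_tree *t)
{
    return t == NULL ? 0 : 1 + count(t->l) + count(t->r);
}

/* Explicit reclamation: free children first, then the node itself. */
void destroy(struct chr_tree *t)
{
    if (t == NULL)
        return;
    destroy(t->l);
    destroy(t->r);
    free(t);
}

/* Builds R with children X and Y, and Z and W under Y.
   (Error handling for a failed malloc is omitted in this sketch.) */
struct chr_tree *build_example(void)
{
    return node('R',
                node('X', NULL, NULL),
                node('Y', node('Z', NULL, NULL), node('W', NULL, NULL)));
}
```

Every node obtained from node() must eventually be passed to destroy(); forgetting to do so is exactly the kind of leak that explicit reclamation invites.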

Binary Tree in Reference Model (Lisp)
(Figure: the same tree built from cons cells, labeled C, and atoms, labeled A.)
(#\R (#\X () ()) (#\Y (#\Z () ()) (#\W () ())))

Problems with Explicit Reclamation
Explicit reclamation of heap objects is problematic.
The programmer may forget to deallocate some objects, causing memory leaks. For example, in the previous example, the programmer may forget to include the delete statement.
References to deallocated objects may not be reset, creating dangling references.
(Figure, two frames: ptr1 and ptr2 refer to the same heap object; after the object is deallocated through ptr1, ptr2 still refers to the reclaimed storage.)


Dealing with Dangling References
Tombstones: use an intermediary device. Pointers refer to the tombstone rather than to the actual object, so every access pays an indirection cost. The tombstone is invalidated on deallocation. Problem: when do we reclaim the tombstones themselves?
Locks and keys: each pointer carries a key and each heap object carries a lock. The key is checked against the lock on each access (costly). The lock is reset to zero on deallocation, and it should get a different value when the storage is reused.

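Tombstones
new(ptr1);      ptr1 refers to a tombstone, which refers to the object.
ptr2 := ptr1;   ptr1 and ptr2 both refer to the same tombstone.
delete(ptr1);   the object is reclaimed and the tombstone is marked invalid, so a later use of ptr1 or ptr2 can be detected.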
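The three steps can be imitated in C with one extra level of indirection. This is only a sketch under simplifying assumptions: the Tombstone type and the ts_ functions are invented for illustration, and the tombstone itself is never reclaimed automatically, which mirrors the open question of when tombstones can be freed.

```c
#include <assert.h>
#include <stdlib.h>

/* The intermediary: program pointers refer to this, not to the object. */
typedef struct { void *obj; } Tombstone;

/* new(ptr): allocate the object and a tombstone that refers to it. */
Tombstone *ts_new(size_t size)
{
    Tombstone *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->obj = malloc(size);
    if (t->obj == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

/* delete(ptr): reclaim the object but keep the tombstone, marked invalid. */
void ts_delete(Tombstone *t)
{
    free(t->obj);
    t->obj = NULL;
}

/* Every access goes through the tombstone (the indirection cost).
   A NULL result means the reference is dangling. */
void *ts_deref(const Tombstone *t)
{
    return t->obj;
}
```

After ts_delete, every copy of the pointer sees the invalidated tombstone, so a dangling use shows up as a NULL from ts_deref instead of silently touching reclaimed storage.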

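Locks and Keys
new(ptr1);      ptr1 has key 135942; the object has lock 135942.
ptr2 := ptr1;   ptr1 and ptr2 both have key 135942; the object has lock 135942.
delete(ptr1);   ptr1 and ptr2 still have key 135942; the lock is now 0, so the keys no longer match.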
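A small C sketch of the same idea, assuming a fixed pool of objects so that the lock word stays readable after deallocation (reading a block after free() would be undefined behavior in C). The pool, the Ptr type, and the lk_ functions are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

typedef struct { unsigned long lock; int value; } Object;   /* lock lives in the object */
typedef struct { Object *addr; unsigned long key; } Ptr;    /* pointer = address + key */

static Object pool[POOL_SIZE];       /* lock == 0 means the slot is free */
static unsigned long next_key = 1;   /* 0 is reserved for "deallocated" */

/* new: take the first free slot and give it a fresh lock value. */
Ptr lk_new(int value)
{
    Ptr p = { NULL, 0 };
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].lock == 0) {
            pool[i].lock = next_key++;
            pool[i].value = value;
            p.addr = &pool[i];
            p.key = pool[i].lock;
            break;
        }
    }
    return p;
}

/* Access: check key against lock first. Returns 1 on success, 0 if dangling. */
int lk_read(Ptr p, int *out)
{
    if (p.addr == NULL || p.addr->lock != p.key)
        return 0;
    *out = p.addr->value;
    return 1;
}

/* delete: reset the lock to zero so that old keys stop matching. */
void lk_delete(Ptr p)
{
    if (p.addr != NULL && p.addr->lock == p.key)
        p.addr->lock = 0;
}
```

Because a reused slot receives a new lock value, a stale pointer keeps failing the check even after the storage has been handed out again.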

Garbage Collection
Automatic reclamation of the space used by objects that are no longer useful.
Developed for functional languages, where it is essential.
Increasingly popular in imperative languages: Java, C#, Python.
Generally slower than manual reclamation, but it eliminates a very frequent class of programming errors (memory leaks and dangling references).

Garbage Collection: Techniques
When is an object no longer useful? There are several garbage collection techniques that answer this question in different ways:
Reference counting
Mark-and-sweep collection
Store-and-copy collection
Generational collection

Reference Counting
Each object has an associated reference counter. The run-time system keeps the reference counter up to date and deallocates an object when its counter is zero.
(Figure, three frames: stack pointers ptr1 and ptr2 refer to heap object foo, whose counter is 2; after one pointer is removed the counter is 1; after both are removed the counter is 0 and foo is deallocated.)
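The counter updates in the figure can be written out by hand in C. Note that this is manual reference counting: the programmer must call retain and release explicitly, since C itself cannot track pointer copies. The names Obj, rc_new, rc_retain, rc_release, and rc_live are assumptions for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int refs; int payload; } Obj;

static int live_objects = 0;   /* instrumentation for the example only */

/* Create an object with one reference (the pointer being returned). */
Obj *rc_new(int payload)
{
    Obj *o = malloc(sizeof *o);
    if (o != NULL) {
        o->refs = 1;
        o->payload = payload;
        live_objects++;
    }
    return o;
}

/* Copying a pointer increments the count. */
Obj *rc_retain(Obj *o)
{
    if (o != NULL)
        o->refs++;
    return o;
}

/* Dropping a pointer decrements the count; at zero the object is freed. */
void rc_release(Obj *o)
{
    if (o != NULL && --o->refs == 0) {
        free(o);
        live_objects--;
    }
}

int rc_live(void) { return live_objects; }
```

With ptr1 = rc_new(...) and ptr2 = rc_retain(ptr1) the count is 2; releasing each pointer in turn takes it to 1 and then to 0, at which point the object is deallocated.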

Reference Counting: Problems
Extra overhead of storing and updating reference counts.
Strong typing is required, because the implementation must know where every pointer is. This makes it impossible in a language like C, and it cannot be used for variant records.
It doesn't work with circular data structures. This is a problem with the definition of a useful object as an object with one or more references.

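Reference Counting: Circular Data Structures
(Figure, two frames: the stack variable stooges refers to larry; larry refers to moe, moe to curly, and curly back to larry. The counts are larry 2, moe 1, curly 1. After stooges is removed, larry's count drops to 1, so no counter reaches zero and the cycle is never reclaimed.)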
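The failure can be reproduced with a hand-written reference count in C. The sketch below builds the three-node cycle and then drops the only outside reference; the Stooge type and the stooge_ functions are invented for this example, and the live-node counter exists only to make the leak visible.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Stooge {
    int refs;
    struct Stooge *next;
} Stooge;

static int live = 0;   /* number of nodes not yet freed */

Stooge *stooge_new(void)
{
    Stooge *s = malloc(sizeof *s);
    assert(s != NULL);
    s->refs = 0;
    s->next = NULL;
    live++;
    return s;
}

/* Storing a pointer to `to` inside `from` counts as one reference. */
void stooge_link(Stooge *from, Stooge *to)
{
    from->next = to;
    to->refs++;
}

/* Drop one reference; free the node and release its successor at zero. */
void stooge_release(Stooge *s)
{
    if (s != NULL && --s->refs == 0) {
        Stooge *next = s->next;
        free(s);
        live--;
        stooge_release(next);
    }
}

int stooges_live(void) { return live; }

/* larry -> moe -> curly -> larry, plus one reference from the stack. */
Stooge *make_stooges(void)
{
    Stooge *larry = stooge_new(), *moe = stooge_new(), *curly = stooge_new();
    stooge_link(larry, moe);
    stooge_link(moe, curly);
    stooge_link(curly, larry);
    larry->refs++;   /* the stack variable "stooges" */
    return larry;
}
```

Releasing the pointer returned by make_stooges() lowers larry's count from 2 to 1, so nothing is freed and stooges_live() still reports 3 even though the program can no longer reach any of the nodes.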

Mark-and-Sweep Collection
A better definition: an object is useless if it cannot be reached by following a chain of valid pointers starting from outside the heap.
Mark-and-sweep GC applies this definition.

Mark-and-Sweep Algorithm
1. Mark every block in the heap as useless.
2. Starting with all pointers outside the heap, recursively explore all linked data structures, marking each block that is reached as useful.
3. Add every block that remains marked useless to the free list.
Run whenever free space is low.
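A compact C sketch of the three steps over a toy heap of fixed-size cells. The heap layout, the allocated flag standing in for the free list, and the ms_ names are all assumptions made for illustration; the recursive mark() is the version that needs a stack proportional to the longest reference chain.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CELLS 8

typedef struct Cell {
    struct Cell *left, *right;   /* outgoing pointers */
    int useful;                  /* mark bit */
    int allocated;               /* 0 means the cell is on the free list */
} Cell;

static Cell heap[HEAP_CELLS];

/* Hand out the first free cell, or NULL if the heap is full. */
Cell *ms_alloc(void)
{
    for (int i = 0; i < HEAP_CELLS; i++) {
        if (!heap[i].allocated) {
            heap[i].allocated = 1;
            heap[i].left = heap[i].right = NULL;
            return &heap[i];
        }
    }
    return NULL;
}

/* Step 2 helper: recursively mark everything reachable from c. */
static void mark(Cell *c)
{
    if (c == NULL || c->useful)
        return;
    c->useful = 1;
    mark(c->left);
    mark(c->right);
}

/* Runs one collection; returns the number of cells reclaimed. */
int ms_collect(Cell **roots, int nroots)
{
    int freed = 0;
    for (int i = 0; i < HEAP_CELLS; i++)      /* step 1: all useless */
        heap[i].useful = 0;
    for (int r = 0; r < nroots; r++)          /* step 2: mark from roots */
        mark(roots[r]);
    for (int i = 0; i < HEAP_CELLS; i++) {    /* step 3: sweep */
        if (heap[i].allocated && !heap[i].useful) {
            heap[i].allocated = 0;
            freed++;
        }
    }
    return freed;
}
```

Unlike reference counting, this reclaims an unreachable cycle: two cells that point only at each other are never marked, so the sweep frees both.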

Mark-and-Sweep Collection: Problems
Each block must begin with an indication of its size. If we use type descriptors that indicate the size, then we don't need to do this.
A stack of depth proportional to the longest reference chain is required, and we are usually running low on space when the GC runs. It is possible to implement the traversal without a stack.
The collector must stop the world to run. Much work on parallel garbage collection addresses this.
Mark-and-sweep can also be used periodically in combination with reference counting.

Mark-and-Sweep Collection: Pointer Reversal
(Figure, seven frames: a traversal of a structure with nodes R, X, Y, Z, W, and A. As the collector descends into a node, it reverses the pointer it followed so that it points back to the parent; the pointer is restored on the way back up, so no separate stack is needed.)
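One way to realize pointer reversal for nodes with two pointer fields is sketched below in C. It assumes each node has room for a mark bit and a small state field recording which child is currently being explored; the Node layout and the name mark_reversal are choices made for this example, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    struct Node *left, *right;
    unsigned char marked;
    unsigned char state;   /* 0: fresh, 1: left in progress, 2: right in progress */
} Node;

/* Marks everything reachable from root using no stack and no recursion. */
void mark_reversal(Node *root)
{
    Node *prev = NULL, *cur = root;
    if (cur == NULL || cur->marked)
        return;
    cur->marked = 1;
    cur->state = 0;
    for (;;) {
        if (cur->state < 2) {
            /* Pick the next child to try: left first, then right. */
            Node **field = (cur->state == 0) ? &cur->left : &cur->right;
            Node *child = *field;
            cur->state++;
            if (child != NULL && !child->marked) {
                *field = prev;      /* reverse: field now points to the parent */
                prev = cur;
                cur = child;        /* descend */
                cur->marked = 1;
                cur->state = 0;
            }
        } else {
            /* Both children done: climb back, restoring the reversed field. */
            if (prev == NULL)
                return;
            Node *parent = prev;
            Node **field = (parent->state == 1) ? &parent->left : &parent->right;
            prev = *field;          /* the saved grandparent */
            *field = cur;           /* restore the original pointer */
            cur = parent;
        }
    }
}
```

When the function returns, every pointer field holds its original value again; the only lasting change is the mark bits (and the state fields, which a later collection resets as it marks).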

Store-and-Copy
More commonly called stop-and-copy. Used to reduce external fragmentation.
The collector divides the available space in half and allocates everything in one half until it is full.
When that happens, it copies each useful block to the other half, treats everything left behind as free, and switches the roles of the two halves.
Drawback: only half of the heap is usable at a time.
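The copy step might look like the following C sketch, which uses two small arrays as the halves and a forwarding pointer in each cell so that shared and cyclic structures are copied only once. The breadth-first scan is Cheney's technique, which the slides do not specify; the Cell layout and the sc_ names are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

#define HALF 8

typedef struct Cell {
    struct Cell *left, *right;
    struct Cell *forward;   /* set once this cell has been copied */
    int value;
} Cell;

static Cell space_a[HALF], space_b[HALF];
static Cell *from_space = space_a, *to_space = space_b;
static int next_free = 0;   /* bump pointer into from_space */

/* Allocate in the active half; NULL means it is full (time to collect). */
Cell *sc_alloc(int value)
{
    if (next_free == HALF)
        return NULL;
    Cell *c = &from_space[next_free++];
    c->left = c->right = c->forward = NULL;
    c->value = value;
    return c;
}

/* Copy c into to_space unless already copied; return its new address. */
static Cell *copy(Cell *c, int *top)
{
    if (c == NULL)
        return NULL;
    if (c->forward == NULL) {
        Cell *n = &to_space[(*top)++];
        *n = *c;
        n->forward = NULL;
        c->forward = n;
    }
    return c->forward;
}

/* Copies everything reachable from *root, swaps the halves, and
   returns the number of surviving cells. */
int sc_collect(Cell **root)
{
    int top = 0, scan = 0;
    *root = copy(*root, &top);
    while (scan < top) {                 /* scan copied cells in order */
        Cell *c = &to_space[scan++];
        c->left = copy(c->left, &top);
        c->right = copy(c->right, &top);
    }
    Cell *tmp = from_space;
    from_space = to_space;
    to_space = tmp;
    next_free = top;                     /* survivors are packed at the front */
    return top;
}
```

Survivors end up packed contiguously at the start of the new active half, which is how copying removes external fragmentation; the cost is that each half holds at most HALF cells.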

Generational Collection
Based on the observation that objects are often short-lived.
Heap space is divided into regions based on how many times objects have been seen by the garbage collector.
The region of objects allocated since the last GC (called the nursery) is explored for reclamation before the regions containing older objects.

Lists
We've seen lists in ML. In Lisp, homogeneity is not required.
A list is defined recursively as either the empty list or a pair consisting of an object (which may be either a list or an atom) and another (shorter) list.
In Lisp, in fact, a program is a list, and can extend itself at run time by constructing a list and executing it.
Lists are also supported in some imperative languages.

List Comprehensions
Specify an expression, an enumerator, and one or more filters.
Example: squares of odd numbers less than 100:
[i*i | i <- [1..100], i `mod` 2 == 1]
Provided in Haskell, Miranda, Python.
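For comparison, here is roughly what the three parts of the comprehension turn into in a language without the feature. This C sketch writes the results into a caller-supplied array; the function name odd_squares and its parameters are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Writes i*i for every odd i in [lo..hi] into out[], up to cap entries.
   Returns the number of entries written. */
size_t odd_squares(int lo, int hi, int *out, size_t cap)
{
    size_t n = 0;
    for (int i = lo; i <= hi && n < cap; i++)   /* enumerator: i <- [lo..hi] */
        if (i % 2 != 0)                         /* filter: i is odd */
            out[n++] = i * i;                   /* expression: i*i */
    return n;
}
```

Calling odd_squares(1, 100, buf, 64) fills buf with 50 values, starting at 1 and ending at 9801.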

Strings
A string may be regarded as merely an array of characters or as a distinct data type.
Some languages that consider strings to be character arrays nonetheless provide some syntactic sugar for them. For example, in C:
char str[] = "hello world";
Orthogonality problem: a string literal can't be assigned to the array after the declaration.
Support for variable-length strings:
Built-in type in functional languages (Lisp, Scheme, ML).
String class in object-oriented languages (C++, Java, C#).
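The orthogonality problem shows up as soon as the array needs a new value: initialization from a literal is allowed, but assignment is not, so the characters have to be copied. A short C sketch, where set_string and demo are hypothetical helpers written for this example:

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst (capacity cap bytes), always NUL-terminating.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int set_string(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(src);
    if (cap == 0)
        return 0;
    if (len >= cap) {
        memcpy(dst, src, cap - 1);
        dst[cap - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);
    return 1;
}

/* Returns the length of str after it has been given a new value. */
size_t demo(void)
{
    char str[] = "hello world";     /* initialization from a literal: allowed */
    /* str = "goodbye";                assignment: does not compile in C */
    set_string(str, sizeof str, "goodbye");
    return strlen(str);
}
```

The array keeps its original 12-byte capacity, so a replacement longer than 11 characters would be truncated; that fixed capacity is one reason other languages offer variable-length strings.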

Sets
A set is an unordered collection of an arbitrary number of distinct values of a common type.
The universe comprises all the possible values that could be in the set, e.g., the 256 possible values of an 8-bit character.
A set is often represented as a bit vector for fast set operations; each bit represents one possible member of the set.
This is not feasible for a large universe, e.g., the integers. Use a hash table or tree instead.
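For a 256-value universe the bit vector needs only 32 bytes, and union and intersection become a handful of word-wide OR and AND operations. A minimal C sketch, with CharSet and the cs_ functions as assumed names:

```c
#include <assert.h>
#include <stdint.h>

/* One bit per possible 8-bit character: 4 x 64 = 256 bits. */
typedef struct { uint64_t bits[4]; } CharSet;

void cs_add(CharSet *s, unsigned char c)
{
    s->bits[c >> 6] |= (uint64_t)1 << (c & 63);
}

int cs_has(const CharSet *s, unsigned char c)
{
    return (int)((s->bits[c >> 6] >> (c & 63)) & 1);
}

/* Union: bitwise OR of corresponding words. */
CharSet cs_union(CharSet a, CharSet b)
{
    for (int i = 0; i < 4; i++)
        a.bits[i] |= b.bits[i];
    return a;
}

/* Intersection: bitwise AND of corresponding words. */
CharSet cs_intersect(CharSet a, CharSet b)
{
    for (int i = 0; i < 4; i++)
        a.bits[i] &= b.bits[i];
    return a;
}
```

The same layout for 32-bit integers would need 2^32 bits (512 MiB) per set, which is why a hash table or tree is used for large universes.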

Equality and Assignment
What does equality mean for complex types?
Shallow comparison: the two expressions refer to the same object.
Deep comparison: the two objects have identical structure with identical values throughout all data members.
Assignment can also be shallow or deep.
Tricky with pointers: in a shallow copy, pointers in both copies point to the same objects; in a deep copy, the objects they point to are copied as well.
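The distinction is easy to see on a linked list in C, where a shallow copy is just a pointer assignment. The List type and the helper names below are assumptions for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct List {
    int value;
    struct List *next;
} List;

/* Prepend value to next; returns NULL if malloc fails. */
List *cons(int value, List *next)
{
    List *l = malloc(sizeof *l);
    if (l != NULL) {
        l->value = value;
        l->next = next;
    }
    return l;
}

/* Shallow comparison: do both refer to the same object? */
int shallow_equal(const List *a, const List *b)
{
    return a == b;
}

/* Deep comparison: same length and same values throughout. */
int deep_equal(const List *a, const List *b)
{
    while (a != NULL && b != NULL) {
        if (a->value != b->value)
            return 0;
        a = a->next;
        b = b->next;
    }
    return a == NULL && b == NULL;
}

/* Deep copy: every node is duplicated, so nothing is shared. */
List *deep_copy(const List *src)
{
    List *head = NULL, **tail = &head;
    for (; src != NULL; src = src->next) {
        *tail = cons(src->value, NULL);
        if (*tail == NULL)
            break;                  /* out of memory: return a partial copy */
        tail = &(*tail)->next;
    }
    return head;
}

void list_free(List *l)
{
    while (l != NULL) {
        List *next = l->next;
        free(l);
        l = next;
    }
}
```

A deep copy compares deep-equal but not shallow-equal to its source, and changing a node in the copy leaves the original untouched; with a shallow copy (List *s = a;) a change through either name is visible through both.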