Compiler Construction D7011E


Compiler Construction D7011E. Lecture 14: Memory Management. Viktor Leijon. Slides largely by Johan Nordlander, with material generously provided by Mark P. Jones. 1

First: Run-time Systems 2

The Final Component: We have followed the workings of a compiler: from source input to code generation. We can use an assembler to turn the compiler's output into an object file. To produce an executable program, we must link the object file to the compiler's run-time system. 3

Why use a Run-time System? Every programming language provides a different view of what the computer can do: built-in types and associated operations; I/O facilities; concurrency and multiple threads; dynamic memory allocation; etc. Different computers and different operating systems do not necessarily support all of these features directly, and will often use different interfaces, and even different semantics. 4

Examples from Java: Not all CPUs have built-in support for floating point division, or for standard mathematical functions like square root, sin, or log. Unix systems use file descriptors to identify files, and don't know anything about the File, OutputStream, etc. objects used in Java. Many operating systems include support for multiple threads of execution, but they don't all use the same interface. 5

Using a Run-time System: [Diagram: the Compiled Program sits on top of the Run-time System, which sits on top of the Operating System / Hardware.] A run-time system bridges the gap between: the language designer's expectation of what facilities the underlying system will provide; and the set of features that are actually supported by the machine or its operating system. 6

Run-time Libraries: Conceptually, a run-time system has two parts: A set of conventions about the way that different kinds of value are represented, and about the data structures that are used; A library of code to implement the required features, or to wrap up operating system features according to the conventions of the run-time system. Compiled programs: Must follow the conventions of the run-time system; May include references to code in the run-time library. 7

Linking: A linking process is used to build executable programs by connecting compiled object files to the appropriate run-time system libraries. The goal is to fill each reference in an object file with the corresponding code from the library. [Diagram: an object code file with unresolved references is linked against the run-time library, copying in just the library routines it needs to produce a self-contained executable program.] 8

Static Linking: Static linking produces executable code by inserting sections of run-time library code into each executable program. Thus portions of the run-time library may be duplicated many times over, which takes extra space on disk, and in memory (if multiple processes are executing). If there is a bug in a run-time library routine, then all compiled programs will need to be rebuilt. 9

Dynamic Linking: With dynamic linking, libraries are separate units that can be loaded into memory and connected to executable code as needed. It is enough to have just one copy of a dynamically linked library (DLL) on disk, and a single copy in memory can be shared between multiple programs. In effect, DLLs behave like extensions of the operating system. But programs using DLLs won't run properly without suitable versions of their libraries. 10

Run-time System Pros: Portability: A run-time system isolates a program from the details of a particular operating environment, and so makes it easier to port code between different platforms. Reuse: A run-time system provides standardized libraries and abstractions that can be used in many different programs. Higher levels of abstraction: A run-time system can package up low-level features in a way that makes them easier to understand and use. 11

Run-time System Cons: Size: A comprehensive run-time system may be quite large, and so require a lot of effort to implement and port. Overhead: A run-time system can add overhead, increasing the size of executable programs. Coding Difficulties: The implementation of some features may conflict with the semantics of features in the underlying system, and so require an inefficient or indirect encoding. 12

A Question of Scale: For the MiniJava compiler, only three run-time system features are required: Initialization of a program; Output of an integer value; Allocation of a new object. In real compilers, the run-time system is often much larger than the compiler, and requires at least as much effort to implement, maintain, and document. 13

MiniJava Run-Time Support: Initialization: No special needs: by naming our top-level assembly language routine main (and exporting it via .globl main), we can rely on the standard C/Unix program startup code. Printing integers: Simplest solution: just issue a call to the standard C library routine printf. The first argument is a pointer to the constant string "%d\n"; the second argument is the integer to print. Allocating memory: Simplification: no memory recycling is necessary for the project. This means that memory allocation can most easily be done using the standard C library routine void* malloc(int). A sketch of such a run-time support file appears below. 14
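As a concrete illustration, here is a minimal sketch of what this run-time support might look like as a C source file. The file name runtime.c and the function names print_int and alloc_object are hypothetical; only printf and malloc come from the slides:

    /* runtime.c -- a minimal sketch of the MiniJava run-time support.
       It is linked against the compiler's output, whose top-level
       routine supplies main. */
    #include <stdio.h>
    #include <stdlib.h>

    /* Output of an integer value: wrap printf with the "%d\n" format. */
    void print_int(int n) {
        printf("%d\n", n);
    }

    /* Allocation of a new object: since no recycling is needed for the
       project, malloc is enough. Abort if the heap is exhausted. */
    void *alloc_object(int size) {
        void *p = malloc(size);
        if (p == NULL) {
            fprintf(stderr, "out of memory\n");
            exit(1);
        }
        return p;
    }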

Accessing the RTS Functions: From assembly language: No "import" or "external" declarations are needed. However, some platforms insist that printf and malloc are referred to as _printf and _malloc (you will quickly find out which is the case!). From LLVM: include the following declarations:

    declare i32 @printf(i8*, i32)
    declare i8* @malloc(i32)

(Note that we're giving printf a quite restrictive type here, but that's approximating in the safe direction.) 15

Declaring a String Literal: In assembly code, add the following declaration:

    str: .asciz "%d\n"

and just refer to the string address as str. In LLVM, add the following global declaration (yes, odd syntax!):

    @str = global [4 x i8] c"%d\0a\00"

Complication: @str now has type [4 x i8]*, so it must be type-cast to i8* before calling @printf:

    %str1 = getelementptr [4 x i8]* @str, i32 0, i32 0

16

Calculating Object Sizes: In assembly code you have full control over the object sizes: just give malloc the correct byte-count. In LLVM, the size of pointer types is not under direct control, so the proper way to obtain the size of a type %T is:

    %1 = getelementptr %T* null, i32 1
    %sizet = ptrtoint %T* %1 to i32

Note: the malloc result is an i8* and must be cast to a %T*:

    %2 = call i8* @malloc(i32 %sizet)
    %ptrt = bitcast i8* %2 to %T*

17
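The same pointer-arithmetic trick can be spelled out in C, which may make the LLVM idiom easier to see. This is an illustrative sketch only; the struct layout is made up, and in portable C one would simply write sizeof:

    #include <stdio.h>
    #include <stdint.h>

    struct T { struct T *vtab; int x; int y; };   /* a made-up object layout */

    int main(void) {
        /* "Address of element 1, starting from a null pointer" is the byte
           size of T: the C analog of getelementptr %T* null, i32 1 followed
           by ptrtoint. (Strictly undefined behavior in ISO C, where sizeof
           is the portable spelling; LLVM sanctions the getelementptr form.) */
        uintptr_t size = (uintptr_t)((struct T *)0 + 1);
        printf("%zu %zu\n", (size_t)size, sizeof(struct T));  /* same value */
        return 0;
    }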

Declaring Method Tables: Assume we have a class T:

    class T { ...; int m1(int x) {...}; int m2(int x, int y) {...}; }

In assembly code:

    .section DATA, const
    T_vtab: .long m1
            .long m2

In LLVM (note the syntax for function pointers!):

    %T = type { %T_vt*, ... }
    %T_vt = type { i32(%T*,i32)*, i32(%T*,i32,i32)* }
    @T_vtab = global %T_vt { i32(%T*,i32)* @m1, i32(%T*,i32,i32)* @m2 }

18
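The same layout can be modeled in C with a struct of function pointers, which is essentially what the LLVM declarations describe. A minimal sketch, assuming the two-method class T above (all C names here are illustrative):

    #include <stdio.h>

    struct T;   /* the object type, declared up front */

    /* The method table type: one function pointer per method, each
       taking the receiver as its first argument. */
    struct T_vt {
        int (*m1)(struct T *self, int x);
        int (*m2)(struct T *self, int x, int y);
    };

    /* Each object starts with a pointer to its class's method table. */
    struct T {
        const struct T_vt *vtab;
        int field;
    };

    static int T_m1(struct T *self, int x) { return self->field + x; }
    static int T_m2(struct T *self, int x, int y) { return x * y; }

    /* One statically allocated method table per class. */
    static const struct T_vt T_vtab = { T_m1, T_m2 };

    int main(void) {
        struct T obj = { &T_vtab, 10 };
        /* A virtual call obj.m1(5) becomes an indirect call via the table: */
        printf("%d\n", obj.vtab->m1(&obj, 5));   /* prints 15 */
        return 0;
    }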

Dynamic Memory Allocation and Garbage Collection: 19

Dynamic Memory Allocation: Dynamic memory allocation is used when the amount of memory that will be needed to store a program's data cannot be predicted at compile-time. Examples of programs where this is useful include compilers, web browsers, and word processors. 20

Allocation in Run-time Systems: Some languages do not support dynamically allocated memory; instead, programmers must anticipate/guess the requirements at compile-time and pre-allocate storage accordingly. Many operating systems do not support (fine-grained) dynamic memory allocation well, but many languages require it. As a result, dynamic memory allocation is one of the most commonly supported features in modern run-time systems. 21

Explicit Allocation: Different languages provide different ways to allocate memory:

    bytes = (int*)malloc(120);   /* C */
    ints = new int[30];          // Java, C++
    expr = new IntExpr(120);     // Java, C++
    list = x : xs                -- Haskell (cons x onto xs)

22

Allocating from a Heap: Where does dynamically allocated memory come from? When the run-time system is initialized, it requests a large block of memory from the operating system, which is known as the heap. The run-time system maintains a heap pointer that identifies the next free location. [Diagram: the heap, with an allocated region on the left, a not-yet-allocated region on the right, and the heap pointer marking the boundary.] To allocate n bytes, we return the current heap pointer setting, and advance the heap pointer by n. A sketch of this bump allocator follows below. 23
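A minimal sketch of such a bump allocator in C. The names heap_init and heap_alloc and the heap size are illustrative assumptions, and malloc stands in for the initial request to the operating system:

    #include <stdio.h>
    #include <stdlib.h>

    #define HEAP_SIZE (1 << 20)   /* assumed heap size: 1 MB */

    static char *heap_ptr;        /* next free location */
    static char *heap_end;        /* first address past the heap */

    void heap_init(void) {
        char *heap_start = malloc(HEAP_SIZE);  /* the block from the OS */
        heap_ptr = heap_start;
        heap_end = heap_start + HEAP_SIZE;
    }

    /* To allocate n bytes: return the current heap pointer, advance by n.
       (A real allocator would also round n up for alignment.) */
    void *heap_alloc(size_t n) {
        if (heap_ptr + n > heap_end)
            return NULL;          /* out of memory: the topic of what follows */
        void *result = heap_ptr;
        heap_ptr += n;
        return result;
    }

    int main(void) {
        heap_init();
        int *p = heap_alloc(sizeof(int));
        if (p) { *p = 42; printf("%d\n", *p); }
        return 0;
    }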

Allocation is Only Half the Story: What happens when we run out of memory? Can we reclaim and recycle memory when we finish using it? 24

Memory Usage in a Compiler: Here is a (hypothetical) graph to illustrate how memory is used by a typical compiler: [Graph, not to scale: memory usage over time, rising and falling through the parsing, static analysis, and output phases.] When a program exits, any memory that it has used is returned to the operating system. We probably don't need to worry about reclaiming memory in programs that run only for a short time. 25

Memory Usage in a Browser: Each time you visit a new web page, the browser needs to allocate memory to store the text, images, and other items on that page. You might run a browser for a long time and visit many web pages. If the browser doesn't take steps to reclaim memory, then, eventually, your browser will not be able to load any new web pages. 26

Reclaiming Memory Explicitly: In some languages, programmers can tell the run-time system that they have finished with a piece of memory, and that it can be recycled:

    free(ptr);    /* C */
    delete obj;   // C++

This can be risky; the programmer must ensure that: the specified memory was allocated dynamically; and no part of the program will attempt to access that section of memory again. 27
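To make the risks concrete, here is a small C sketch of the classic mistakes; the dangerous lines are commented out so that the program still runs:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        int *p = malloc(sizeof(int));
        *p = 42;
        free(p);

        /* Use-after-free: p still holds the old address, but the memory
           may already have been recycled -- undefined behavior. */
        /* printf("%d\n", *p); */

        /* Double free: releasing the same block twice can corrupt the
           allocator's internal structures. */
        /* free(p); */

        /* Freeing memory that was never allocated dynamically: */
        int x = 5;
        (void)x;
        /* free(&x); */

        return 0;
    }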

Reclaiming Memory in a Browser: It is easy to manage memory in a browser: Keep data for the web pages that can be reached using the Back and Forward buttons. If memory gets tight, we can reclaim the storage used by some of the web pages: we can store the data on disk, or download it again if it is needed. In other words, it is easy to see where the calls to free or destroy should go. 28

Reclaiming Memory Automatically: In general, however, it is hard to know when memory can be reclaimed. If memory is reclaimed too early, the run-time system's structures will be corrupted, and the program could crash; if it is reclaimed too late, then the program will have a space leak and use more memory than it needs. Incorrect attempts to reclaim memory are one of the biggest sources of bugs in C++ programs. Could a run-time system do better in deciding when memory can be reclaimed? 29

Garbage Collection: Garbage collection is the term used to describe automatic reclamation of computer storage. An object is garbage if it will not be used again; in other words, if it is not live. Conceptually, garbage collection is a two-phase process: Garbage detection: distinguish live objects from those which are garbage. Garbage reclamation: reclaim memory used by garbage so that the running program can reuse it. In practice, these phases may be interleaved. 30

How do we Detect Garbage? First Attempt: An object is garbage if there are no pointers to it. This leads us to the technique of reference counting. 31

Reference Counting: A run-time system can attach a reference count to each chunk of memory that is allocated. The reference count is the number of pointers to this object from elsewhere in the program. Every time we duplicate a pointer to an object, we increment the reference count. Every time we eliminate a pointer to an object, we decrement the reference count. When the reference count is zero, the object can be reclaimed. 32
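A minimal sketch in C of the bookkeeping a compiler would emit; the header layout and the names rc_retain and rc_release are illustrative assumptions:

    #include <stdlib.h>

    /* Each allocated object carries a reference count in a header word. */
    struct rc_obj {
        int refcount;
        /* ... object fields follow ... */
    };

    /* Emitted every time the program duplicates a pointer to the object. */
    void rc_retain(struct rc_obj *obj) {
        obj->refcount++;
    }

    /* Emitted every time the program eliminates a pointer to the object.
       When the count reaches zero, the object can be reclaimed. (A full
       version would first release any objects this one points to.) */
    void rc_release(struct rc_obj *obj) {
        if (--obj->refcount == 0)
            free(obj);
    }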

Example: [Diagram, built up in steps: pointers p and q each point to a separate object, each with reference count 1. After the assignment p = q, the object that p used to point to has count 0 and is now garbage, while q's object has count 2.] Reference counts take up one word of memory per object. A compiler can generate code to maintain the reference counts, but the overhead might be high. 33

The Problem With Cycles: [Diagram: two objects point at each other, so each has reference count 1.] The two values shown here are garbage; neither one can be reached from anywhere else in the program. But neither one has a zero reference count, so neither one will be reclaimed: a memory leak! Thus additional steps must be taken to deal with cycles, one further reason why reference counting is not popular in run-time systems. 34

How do we Detect Garbage? First Attempt: An object is garbage if there are no pointers to it. This leads us to the technique of reference counting. But an object can be garbage even if there are pointers to it, if those pointers are in other pieces of garbage. Second Attempt: An object is garbage if it is unreachable. This is still conservative, but usually works quite well. 35

Reachability: Suppose that we could interrupt a MiniJava computation at any stage. Which objects might be live at that point? We can identify a set of roots for live data: any object that is pointed to from a global variable (i.e., a static field in a class, but we don't have those in MiniJava); any object that is pointed to from an active frame on the stack. Any object that can be reached from one (or more) of the roots might be used in a future computation. 36

Understanding Reachability: [Diagram, built up in steps: starting from the root set (which includes p and q), the object graph is traversed and every object reached is marked, one step at a time.] All that remains unmarked is garbage! 37

Mark-Sweep Garbage Collection: This is almost exactly how a mark-sweep garbage collector works: The mark phase: Traverse the graph, starting at the roots, and mark every object that is reached; The sweep phase: Storage for any object that has not been marked can be reclaimed. The time to garbage collect is proportional to the size of the heap: we have to sweep the whole heap to find unmarked objects. 38
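A minimal sketch of the two phases in C, assuming each object carries a mark bit, a fixed array of pointer fields, and a link that chains every allocated object together; all names here (gc_obj, mark, sweep, all_objects) are illustrative assumptions:

    #include <stdlib.h>

    #define MAX_FIELDS 4

    struct gc_obj {
        int marked;                          /* the mark bit */
        struct gc_obj *fields[MAX_FIELDS];   /* pointer fields (NULL if unused) */
        struct gc_obj *next;                 /* chains all allocated objects */
    };

    static struct gc_obj *all_objects;       /* every object in the heap */

    /* Mark phase: traverse the graph from a root, marking every object
       that is reached. (A real collector would avoid deep recursion.) */
    void mark(struct gc_obj *obj) {
        if (obj == NULL || obj->marked)
            return;
        obj->marked = 1;
        for (int i = 0; i < MAX_FIELDS; i++)
            mark(obj->fields[i]);
    }

    /* Sweep phase: walk the whole heap; reclaim anything unmarked. */
    void sweep(void) {
        struct gc_obj **link = &all_objects;
        while (*link != NULL) {
            struct gc_obj *obj = *link;
            if (obj->marked) {
                obj->marked = 0;             /* clear for the next collection */
                link = &obj->next;
            } else {
                *link = obj->next;           /* unlink and reclaim */
                free(obj);
            }
        }
    }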

How do we Reclaim Memory? Once the marking phase is over, the heap will typically be broken into a mixture of marked and unmarked areas. We can reclaim memory by linking the unused areas together into a free list. [Diagram: alternating marked and unmarked regions of the heap, with the unmarked regions chained together to form the free list.] 39

Allocating From a Free List: Now we must allocate memory from the free list too; we can't just advance a heap pointer. To allocate n bytes: search for the first free chunk with n bytes in it; allocate the required memory from that chunk; return any unused portion to the free list. There are some techniques that we can use to make this more efficient. Exactly the same problems occur in systems with explicit memory reclamation (e.g., malloc()). A first-fit sketch follows below. 40
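A minimal first-fit sketch in C; the chunk layout and the name freelist_alloc are illustrative assumptions, and alignment and allocation headers are ignored for brevity:

    #include <stddef.h>

    /* Each free chunk starts with its size and a link to the next chunk. */
    struct chunk {
        size_t size;
        struct chunk *next;
    };

    static struct chunk *free_list;   /* head of the free list */

    /* First fit: find the first chunk with n bytes in it, allocate from
       it, and return any unused portion to the free list. */
    void *freelist_alloc(size_t n) {
        struct chunk **link = &free_list;
        for (struct chunk *c = *link; c != NULL; link = &c->next, c = c->next) {
            if (c->size < n)
                continue;                     /* too small; keep searching */
            if (c->size >= n + sizeof(struct chunk)) {
                /* Split: carve n bytes off the front; the remainder stays
                   on the free list as a smaller chunk. */
                struct chunk *rest = (struct chunk *)((char *)c + n);
                rest->size = c->size - n;
                rest->next = c->next;
                *link = rest;
            } else {
                *link = c->next;              /* use the whole chunk */
            }
            return c;
        }
        return NULL;                          /* no chunk is large enough */
    }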

Fragmentation: Another serious problem here is the risk of fragmentation, which happens when memory is broken into small pieces that are hard to reuse. For example, we can't allocate a large object in a heap where the free memory is scattered across many small gaps: [Diagram: a heap whose free space is split into several small, non-adjacent regions.] Although there is enough unused memory in total, it isn't available in one contiguous block. Several compaction techniques have been developed to overcome this problem. 41

A Copying Collector: A copying collector works by copying all the reachable data to a safe place, and then discarding the original heap altogether! Copying collectors usually alternate between two heaps. At each garbage collection, reachable values in one heap ("from space") are copied into new locations in the other ("to space"). 42

Pros and Cons: Pros: A copying collector ensures that the heap is compacted at each garbage collection, so there is no fragmentation, and we can go back to allocating using a simple heap pointer. The time to garbage collect is proportional to the amount of memory that is reachable, which may be much less than the size of the heap. Con: We have to split available memory resources between two large heaps of equal size, even though we only use one at a time. 43

Some Details: We will look at a copying garbage collection algorithm in a little more detail. Let's assume that the runtime system maintains the following variables: fromspace, the address of the active heap; tospace, the address of the second heap; and hp, the heap pointer. Normally, hp points into fromspace. At the start of a garbage collection, we will reset it to point to the start of tospace. 44

Object Representations: We will also add some extra details to the representation of objects to support garbage collection. Besides its fields, each object's table now provides: size, the number of bytes in this object; forward, code to copy this object into tospace; scavenge, code to copy this object's fields to tospace; and the virtual methods, listed as before. 45

Forwarding An Object: To copy an object from fromspace to tospace:

    Addr forward(Addr obj) {
        Addr dest = hp;               // the object's new address in tospace
        for (i = 0; i < obj.tag.size; i++)
            mem[hp++] = obj[i];
        obj.tag = FORWARDED;          // assumes every object is at least
        obj.field1 = dest;            // 8 bytes long
        return dest;
    }

46

The FORWARDED tag: Once an object has been forwarded, it should not be forwarded again. We deal with this by overwriting the tag of each forwarded object with the address FORWARDED, which points to a special virtual function table whose only used entry is forward:

    Addr forwarded(Addr obj) {
        return obj.field1;   // the same address as the first time
    }                        // this object was forwarded

47

Scavenging An Object: Scavenging means forwarding the pointer-fields of an object:

    void scavenge(Addr obj) {
        obj.field1 = obj.field1.forward();
        obj.field2 = obj.field2.forward();
    }

Only pointers to objects should be scavenged, and not all fields contain such pointers. A compiler can generate an appropriate scavenge function for each class using the list of fields in that class. 48

Using tospace as a Queue: Initially, tospace is empty. Once the roots have been forwarded, tospace holds the forwarded root objects, with tovisit pointing at the first of them and hp just past the last. Now we scavenge each object, left to right, using tospace as a queue: [Diagram: tospace divided into a scavenged region, then the objects between tovisit and hp that are waiting to be scavenged, then unused space into which further objects might be forwarded.] 49

Putting it all Together:

    hp = tospace;
    for each root r {           // make sure all the roots are forwarded
        r = r.forward();
    }
    tovisit = tospace;
    while (tovisit < hp) {      // scavenge each forwarded object for pointers
        tovisit.scavenge();
        tovisit += tovisit.tag.size;
    }
    exchange tospace and fromspace;

50

Incremental Garbage Collection: The techniques that we have looked at so far put the main computation on hold while garbage collection is taking place. For an interactive program with a large heap size, this might cause a noticeable pause in execution. For real-time applications, a long pause at random is not acceptable. Much effort has been invested in the design of more sophisticated, incremental garbage collection algorithms that solve these problems by interleaving detection and reclamation. 51

Generational Garbage Collection: Experiments suggest most data is short-lived. Generational collectors exploit this by breaking the heap into several generations. [Diagram: the heap divided into old, middle, and new generations.] The new generation is smaller and takes less time to garbage collect. Most objects die during a collection of the new generation. Those that survive are promoted to the middle generation, which needs less frequent collections. Objects that survive a middle collection are promoted to the old generation, which needs even less frequent collections. A sketch of this promotion policy follows below. 52
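The promotion policy can be sketched in C; the generation labels and the function name promote are illustrative assumptions:

    /* Which generation an object currently lives in. */
    enum generation { GEN_NEW, GEN_MIDDLE, GEN_OLD };

    struct gen_obj {
        enum generation gen;
        /* ... object fields ... */
    };

    /* Called for each object that survives a collection of its own
       generation: per the policy above, survivors simply move up one
       generation, into a space that is collected less frequently. */
    void promote(struct gen_obj *obj) {
        if (obj->gen == GEN_NEW)
            obj->gen = GEN_MIDDLE;
        else if (obj->gen == GEN_MIDDLE)
            obj->gen = GEN_OLD;
    }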

The Cost of Garbage Collection: Appel has argued that garbage collection can sometimes be cheaper than stack allocation. Other estimates suggest that use of garbage collection can increase execution time by 10%. In any case: the cost depends on the quality of the garbage collector, and on the program that uses it. There are also overheads with schemes for explicitly reclaimed memory. Perhaps the overheads of garbage collection are justified by the reduction in bugs? 53

Further Reading: There is a large literature on garbage collection; we have only scraped the surface here! There is some material in Appel's book. There is a book devoted to garbage collection by Richard Jones and Rafael Lins: http://www.cs.ukc.ac.uk/people/staff/rej/gc.html But you could do a lot worse than start with Paul Wilson's long but excellent survey at: ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps 54