Structure of Programming Languages Lecture 10

CS 6636 4536 Lecture 10: Classes... Spring 2017

Outline
1. Types
   - Type Coercion and Conversion
   - Type Classes, Generics, and Polymorphism
2. Memory Management
   - The Stack, the Static, and the Heap
   - Non-Managed Memory Problems
   - Garbage Collection
3. Homework

1. Types: Type Coercion and Conversion
- Casts, Conversions, Coercions
- Some coercions will create nonsense
- PL/1 was liberal with coercion
- C implemented coercion between numeric types
- Ada threw it all out
- C++ adds new coercions, under programmer control

Casts, Conversions, Coercions
- The C type cast hides a rat's nest of problems. C++ got it straight, with four separate casts:
  - Static cast (static_cast): type conversion; changes size or encoding, keeps semantics.
  - Reinterpret cast (reinterpret_cast): pointer cast; changes semantics, keeps representation.
  - Const cast (const_cast): removes or adds a restriction to a pointer.
  - Dynamic cast (dynamic_cast): movement up or down a type hierarchy. Downward movement causes a run-time type test and may fail.
- A type coercion is a cast that is applied implicitly by the compiler.
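The four C++ casts listed above can be sketched as follows. This is only an illustration: the Shape, Circle, and Square types and all function names are hypothetical and do not come from the lecture.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class hierarchy, used only for this illustration.
struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// static_cast: a real conversion. The encoding changes (double -> int),
// the value keeps its numeric meaning, truncated toward zero.
int truncate(double d) { return static_cast<int>(d); }

// reinterpret_cast: the bits stay the same, the interpretation changes.
// Reading an object through an unsigned char pointer is permitted in C++.
unsigned char first_byte(const std::uint32_t& word) {
    return *reinterpret_cast<const unsigned char*>(&word);
}

// const_cast: removes the const restriction. Writing through the result
// is only valid if the object referred to was not declared const.
int& writable(const int& r) { return const_cast<int&>(r); }

// dynamic_cast: a downward cast with a run-time type test.
// The cast yields a null pointer when s does not point at a Circle.
double radius_or_zero(Shape* s) {
    if (Circle* c = dynamic_cast<Circle*>(s)) return c->radius;
    return 0.0;
}
```

Passing the address of a Circle where a Shape* is expected needs no cast at all: the upward conversion is a coercion applied by the compiler.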

Some coercions will create nonsense
- Some languages will coerce argument types when they do not match the corresponding parameter types.
- Coercions are supposed to be applied ONLY when they preserve semantics.
- Lengthening a short value to a longer type preserves semantics.
- Shortening might or might not preserve the semantics, so it should not be used freely.
- Some changes of representation preserve semantics (int to double). Some generally do not (double to short int).
- Coercion is, therefore, a questionable practice.

PL/1 was liberal with coercion
- In PL/1, any argument would be coerced to any parameter type if the compiler could find a chain of conversions to get from one to the other. In the process, garbage often happened.
- Consider this PL/1 statement:
    IF (a<=b<=c) THEN x = 1; ELSE t = 1
  1. The result of a<=b is a single-bit truth value, 1 or 0.
  2. The 1 or 0 is cast to the underlying type, a bitstring of length 1.
  3. The bitstring is promoted to an integer to match the type of c.
  4. The integer is lengthened to the length of c.
  5. Now it is compared to c, and the answer is always true if c is greater than 0!
- Analysis: Casting to an underlying representation type removes the original semantics. Then casting to a higher-level represented type ADDS semantics, which may be inappropriate. The final step in this evaluation compared apples (painted orange) to oranges.
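The same trap can be reproduced in C++, which also turns the truth value of the first comparison into an integer. The sketch below is a C++ analogue, not PL/1, and the function names are made up for the illustration.

```cpp
#include <cassert>

// How C++ evaluates the chained test a <= b <= c: the comparison a <= b
// yields a bool, which is promoted to the int 0 or 1 and then compared
// with c. The parentheses make that grouping explicit.
bool chained(int a, int b, int c) {
    return (a <= b) <= c;
}

// What the programmer almost certainly meant.
bool intended(int a, int b, int c) {
    return a <= b && b <= c;
}
```

For a = 5, b = 3, c = 1 the chained form is true (0 <= 1) even though the values are in descending order, while the intended form is false.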

C implemented coercion between numeric types
- The idea in C is that numbers can retain all or most of their meaning when converted to a different representation.
- Lengthening a representation (float --> double) is safe and is used freely for coercion.
- Shortening a representation is not safe, and is only used if necessary to carry out an assignment or call-by-value. It should trigger a warning.
- Converting to a floating representation (int --> double or float) is usually safe and is used freely for coercion.
- Converting to an integer representation (double or float --> int) usually loses precision. It is considered unsafe and used only for assignment. It should trigger a warning.
- Analysis: This set of conventions works and is convenient. Warnings are never given if the cast is explicit.
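A minimal sketch of these conventions, written in C++ (which adopted C's numeric coercion rules). The function names are hypothetical.

```cpp
#include <cassert>

// Lengthening (float --> double) is applied silently by the compiler;
// every float value is exactly representable as a double.
double widen(float f) { return f; }

// Converting to a floating representation is also applied freely:
// n is coerced from int to double before the division.
double half(int n) { return n / 2.0; }

// Converting to an integer representation discards the fraction.
// The explicit cast documents the intent, so no warning is given.
// Writing "int n = d;" would also compile, and a compiler can be asked
// to warn about it (for example with GCC's -Wconversion option).
// The result is only defined when the truncated value fits in an int.
int to_integer(double d) { return static_cast<int>(d); }
```

Here half(7) yields 3.5, whereas the all-integer expression 7 / 2 yields 3; to_integer(3.99) yields 3.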

Ada threw it all out
- Semantic validity was a primary goal in the design of Ada.
- The coercion messes made by PL/1 were to be avoided.
- So all coercion was thrown out in the design of Ada.
- If you wanted to add 1 to a double, you could define a method for operator+ that converted the 1 to a double representation, then did the addition and returned a real number.
- Ada programmers learned to be careful of the types of their constants.
- Analysis: The semantic nonsense problem was solved at the cost of convenience.

C++ adds new coercions, under programmer control
- C++ adopted the type coercion rules of C and added the ability to define conversions for new types.
- A class constructor can be used to convert from any type to the class type.
- The cast operator can be used to convert from the class type to any other type.
- Programmer-defined casts will be used for coercion, where needed. Overuse of this facility can make code very hard to penetrate.
- Casts up a class hierarchy (toward the base class) are used for coercion. Downward casts must be explicit and may bomb at run time.
- C++ preserves object semantics by doing a run-time test for the validity of a downward cast on a derivation hierarchy.
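A small sketch of programmer-defined conversions in both directions. The Celsius class and the warmer function are hypothetical examples, not part of the lecture.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate the two conversion hooks.
class Celsius {
public:
    // Converting constructor: lets the compiler coerce double --> Celsius.
    Celsius(double degrees) : degrees_(degrees) {}

    // Cast (conversion) operator: lets the compiler coerce Celsius --> double.
    operator double() const { return degrees_; }

private:
    double degrees_;
};

// The parameter is a Celsius; inside, c is coerced back to double so the
// built-in addition on doubles can be used.
double warmer(Celsius c) { return c + 10.0; }
```

The call warmer(21.5) compiles even though 21.5 is a double: the compiler inserts the constructor call. Marking the constructor or the operator explicit would switch these coercions off and require a written cast, which is one way to keep such code easier to read.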

Type Classes, Generics, and Polymorphism

What is a Class?
- If a language supports classes, it is called object-oriented. However, this covers a wide range of behaviors.
- A class is a set of data members, with member names, plus the operators and functions that work on the data.
- It may be fully defined or be abstract because some of its functions are prototyped but not defined.
- The privacy/visibility of data can be restricted to a class.
- It should provide a stable interface to the world. (Not so in Ruby or Python.)
- It may or may not have subclass relationships with other classes.
- If one class is derived from another, they may or may not form a polymorphic class.
- At run time, each object belongs to at least one class: the class used to allocate it. It also belongs to all the superclasses of that class.

Why are Classes Useful?
- Grouping related data and functions (lexical coherence) is better than spreading them all over a program.
- Classes make it possible to develop and debug programs in a team environment, as long as the language enforces a stable interface for each class.
- Classes provide a way to make programmer-defined types behave like primitive types w.r.t. input, output, and operations.
- Derived classes and polymorphism allow efficient creation of many variations on a basic type.
- Derivation gives access to sophisticated pre-written packages (Swing, collections).
- Classes provide an analog of database tables. This is essential in web application programming.

Classes in C++
- In C++, the interface provided by a class is fully defined at compile time and stable thereafter.
- Classes are used for strong type checking and type coercion.
- Cast operators may be programmer-defined and will be used for coercion, where appropriate.
- New methods can be defined for all the existing operators, including input and output.
- In a class hierarchy, methods can be either virtual (dispatched at run time) or not (dispatched at compile time). This allows for optimization of non-virtual functions.
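The difference between virtual and non-virtual dispatch can be seen in a short sketch. The Animal and Dog classes and the helper functions are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy for illustration.
struct Animal {
    virtual ~Animal() = default;
    // Virtual: the function to call is chosen at run time from the
    // actual class of the object.
    virtual std::string speak() const { return "..."; }
    // Non-virtual: the function to call is chosen at compile time from
    // the declared type of the expression.
    std::string kind() const { return "animal"; }
};

struct Dog : Animal {
    std::string speak() const override { return "Woof"; }
    std::string kind() const { return "dog"; }   // hides Animal::kind
};

// Both helpers see the object only through a reference to the base class.
std::string speak_through_base(const Animal& a) { return a.speak(); }
std::string kind_through_base(const Animal& a)  { return a.kind(); }
```

Given a Dog, speak_through_base returns "Woof" (run-time dispatch) while kind_through_base returns "animal" (compile-time dispatch on the declared type Animal).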

Classes in Ruby
- Ruby classes have methods, including constructors (initialize), getters (which return @var), and setters (var=).
- Syntax is provided (attr_accessor) for defining part names with getters and setters.
- Data members (untyped names) can be added during execution by mentioning a new name in a method. Different execution paths could create different sets of data members.
- New methods can be added to a class at any time.
- Thus, a Ruby class does not provide a consistent interface. Each instance of a Ruby class could have different functionality.
- To Ruby people, this is an advantage: "This language feature makes the language incredibly flexible."

Part 2. Memory Management
- The Stack, the Static, and the Heap
- Non-Managed Memory Problems
- Garbage Collection

The Stack, the Static, and the Heap

The System, the Process, and Virtual Memory
- When a process is loaded, a virtual memory operating system will create segments for:
  - The code (read only) and literal strings.
  - The stack, possibly combined with global variables.
  - The dynamic heap.
  - A static storage area, if the language supports local static variables. Globals could also be placed in this segment.
- Each segment is implemented by a page table and can be as short or as long as necessary.

The Stack, the Static, and the Heap
- The code segment and static segment do not grow or shrink during execution.
- The stack grows automatically as functions are called; if more pages are needed to implement it, they are added automatically.
- As functions return, the stack shrinks. If a stack page is no longer needed because its functions have all returned, it will be paged out and not become part of the memory bloat. The stack is always compact.
- The heap grows when new objects are dynamically allocated. It doesn't automatically shrink.
- As objects become inaccessible (garbage) or are freed explicitly, the heap becomes sparse. This becomes a problem if objects are constantly created and discarded, since more and more pages will be required to store the remaining useful data objects.

Non-Managed Memory Problems

Memory Leaks
- A memory leak is an area of memory that has been allocated and is no longer in use, but has not been deallocated.
- Memory leaks happen when programmers allocate new objects and fail to incorporate them into the program's data structures. Then they forget to free them.
- This is not a problem for little applications that run and terminate. It becomes a problem only for applications that you turn on and leave open on your desktop for days or weeks. (A mail program?)
- Gradually, the number of dead memory blocks builds up and the heap segment becomes larger and larger, sparser and sparser.
- Eventually, the memory needs of the application become so great that it crowds out other applications and thrashing starts.

Dangling Pointers
- A dangling pointer is a pointer that is still in use but points at an object that has been deallocated.
- These happen when a short-lived object is pointed at by an object with a longer lifetime.
- The typical error is to set a pointer to a local variable or array, and return that pointer from the function.
- This is a plague on intermediate programmers who have no clear idea of when objects are born, how long they live, and when they die.
- Managed-memory systems avoid this problem by forcing arrays and objects to all be dynamically allocated (permanent lifetime).
- The result is that lots of extra run time and run space is used creating very temporary objects that then need collection.
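The typical error described above, and one common way to avoid it in C++, might look like this. The function names are hypothetical, and the faulty version is deliberately left as a comment because using its result is undefined behavior.

```cpp
#include <cassert>
#include <vector>

// The typical error (kept as a comment on purpose):
//
//   int* make_squares_dangling() {
//       int squares[3] = {1, 4, 9};   // local array, lives on the stack
//       return squares;               // the array dies when the function returns
//   }
//
// The caller would receive a pointer to storage that no longer belongs
// to any object.

// One alternative: return an object that owns its storage. The vector's
// elements live on the heap and are released when the vector is destroyed.
std::vector<int> make_squares() {
    std::vector<int> squares = {1, 4, 9};
    return squares;
}
```

Returning by value ties the lifetime of the data to the object the caller holds, so there is no shorter-lived object left to point at.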

Efficient Management of Dynamic Objects
- Make your own freelist.
- For every type of object you allocate dynamically, create a recycling system and an allocation function called mynew.
- Pre-allocate a block, called pool, of this type of object and create a linked list called freelist to hold objects that have been handed back for reuse. Initialize freelist to null.
- When mynew is called, return the first object on the freelist. If it is empty, return the next object from the pool. If the pool is also empty, allocate another block of objects and store it in pool.
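A minimal sketch of such a recycling system, under some assumptions that are not in the lecture: the object type Node, the class name NodeRecycler, the function myfree that returns an object to the freelist, and the block size of 4 are all hypothetical. The names mynew, pool, and freelist follow the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical object type. The next field doubles as the freelist link
// while an object is waiting to be recycled.
struct Node {
    int   value = 0;
    Node* next  = nullptr;
};

class NodeRecycler {
public:
    NodeRecycler() { add_block(); }            // pre-allocate the first pool

    Node* mynew() {
        if (freelist_ != nullptr) {            // 1. reuse a recycled object
            Node* n = freelist_;
            freelist_ = n->next;
            *n = Node{};                       // reset to a fresh state
            return n;
        }
        if (used_ == kBlockSize) add_block();  // 2. pool empty: new block
        return &pool_.back()[used_++];         // 3. next object from the pool
    }

    // Hypothetical counterpart to mynew: push the object onto the freelist.
    void myfree(Node* n) {
        n->next = freelist_;
        freelist_ = n;
    }

    std::size_t blocks_allocated() const { return pool_.size(); }

private:
    static constexpr std::size_t kBlockSize = 4;   // assumed block size

    void add_block() {
        pool_.push_back(std::make_unique<Node[]>(kBlockSize));
        used_ = 0;
    }

    Node* freelist_ = nullptr;                     // starts out empty (null)
    std::vector<std::unique_ptr<Node[]>> pool_;    // all blocks ever allocated
    std::size_t used_ = 0;                         // objects used in newest block
};
```

Objects are never returned to the system heap while the recycler is alive; they cycle between the program and the freelist, so allocation and release cost only a few pointer operations.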

Garbage Collection

Mark-and-Sweep Garbage Collection
- A garbage collector (GC) identifies the memory blocks that are still accessible and collects the discarded areas for recycling.
- It starts with the roots of the objects in use: on the stack, in registers, in system data structures.
- It follows each pointer attached to anything attached to a root. This means walking down entire data structures at run time. At each step, the current block is marked.
- When this is finished, the garbage is swept up by starting at the beginning of each memory segment and reclaiming blocks that have not been marked recently.
- Compaction is easy and efficient during this process IF ALL POINTERS were the results of calling new. (No pointers were calculated by programmer-written code.)
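The two phases can be sketched on a toy heap. This is a simulation under stated assumptions, not how a production collector is built: blocks are entries in a vector, "pointers" are indices into that vector, and the names Block, Heap, mark, sweep, and collect are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a heap block.
struct Block {
    std::vector<std::size_t> pointers;  // indices of the blocks this one points to
    bool marked = false;                // set during the mark phase
    bool in_use = true;                 // false once the block has been reclaimed
};

using Heap = std::vector<Block>;

// Mark phase: start at the roots and follow every pointer, marking each
// block that is reached. An explicit work list replaces recursion.
void mark(Heap& heap, const std::vector<std::size_t>& roots) {
    std::vector<std::size_t> pending(roots);
    while (!pending.empty()) {
        std::size_t i = pending.back();
        pending.pop_back();
        if (heap[i].marked) continue;   // already visited (handles cycles)
        heap[i].marked = true;
        for (std::size_t target : heap[i].pointers) pending.push_back(target);
    }
}

// Sweep phase: walk the heap from the beginning and reclaim every block
// that is in use but was not marked. Marks are cleared for the next run.
std::size_t sweep(Heap& heap) {
    std::size_t reclaimed = 0;
    for (Block& b : heap) {
        if (b.in_use && !b.marked) {
            b.in_use = false;
            b.pointers.clear();
            ++reclaimed;
        }
        b.marked = false;
    }
    return reclaimed;
}

// One full collection; returns the number of blocks reclaimed.
std::size_t collect(Heap& heap, const std::vector<std::size_t>& roots) {
    mark(heap, roots);
    return sweep(heap);
}
```

Two blocks that point only at each other are still reclaimed, because neither can be reached from a root; this is the case that simple reference counting would miss.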

Advantages of Managed Memory
- A programmer can use a managed-memory language without having any firm idea of how objects, pointers, allocation, and deallocation work.
- Some programmers have trouble getting programs to work when they need to use pointers. They write code that won't compile or code that crashes at run time.
- Reliance on managed memory lets more programmers debug more code faster and with less expertise.
- Our systems are so big and so fast now, maybe we don't need to worry about code bloat and inefficient code any more.

Disadvantages of Managed Memory
- The programmer cannot control when the GC works or what it collects.
- The GC has a complex relationship with new allocation and initialization: the garbage collector must not try to collect a half-initialized new object. For this reason, there are times when a GC may execute safely and times when it may not.
- Some programmers are not aware of the performance problems caused by managed memory. They write code that creates many unnecessary objects that must soon be collected, decreasing application performance.
- While collection is happening, the process is suspended. This makes managed memory inappropriate for some real-time applications.
- Use of a compacting collector strongly restricts the kind of data structures you can use.

Homework

Homework 10
Read Chapters 14 18 in the textbook.
1. How is a static variable like a global variable? How are they different?
2. Explain the most important benefit of managed memory.
3. Explain an important fault or limitation of managed memory.
4. Choose some detail about types and compare (in half a page) how it is supported (or not supported) in two different languages, either languages mentioned in the reading or in this lecture.
5. Write a half-page essay about something you learned about types from reading this material.