Naming in OOLs and Storage Layout Comp 412

Similar documents
What about Object-Oriented Languages?

Runtime Support for OOLs Object Records, Code Vectors, Inheritance Comp 412

Procedure and Function Calls, Part II. Comp 412 COMP 412 FALL Chapter 6 in EaC2e. target code. source code Front End Optimizer Back End

Runtime Support for Algol-Like Languages Comp 412

The Procedure Abstraction

Optimizer. Defining and preserving the meaning of the program

CS415 Compilers. Procedure Abstractions. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

Arrays and Functions

Separate compilation. Topic 6: Runtime Environments p.1/21. CS 526 Topic 6: Runtime Environments The linkage convention

The Processor Memory Hierarchy

CSE 504: Compiler Design. Runtime Environments

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

Run-time Environments

Run-time Environments

The Software Stack: From Assembly Language to Machine Code

Run-time Environments. Lecture 13. Prof. Alex Aiken Original Slides (Modified by Prof. Vijay Ganesh) Lecture 13

The Procedure Abstraction Part I: Basics

Intermediate Representations

Handling Assignment Comp 412

Run Time Environment. Procedure Abstraction. The Procedure as a Control Abstraction. The Procedure as a Control Abstraction

Procedure and Object- Oriented Abstraction

Short Notes of CS201

CS201 - Introduction to Programming Glossary By

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Run Time Environment. Activation Records Procedure Linkage Name Translation and Variable Access

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem:

CS 406/534 Compiler Construction Intermediate Representation and Procedure Abstraction

Runtime management. CS Compiler Design. The procedure abstraction. The procedure abstraction. Runtime management. V.

! Those values must be stored somewhere! Therefore, variables must somehow be bound. ! How?

The Procedure Abstraction Part I Basics

Instruction Selection: Preliminaries. Comp 412

Intermediate Representations

Programming Languages

G Programming Languages - Fall 2012

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Intermediate Representations & Symbol Tables

Computing Inside The Parser Syntax-Directed Translation. Comp 412

Implementing Control Flow Constructs Comp 412

Run-Time Data Structures

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

CA Compiler Construction

Lecture 7: Binding Time and Storage

Implementing Subprograms

Qualifying Exam in Programming Languages and Compilers

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

Principles of Programming Languages Topic: Scope and Memory Professor Louis Steinberg Fall 2004

Example. program sort; var a : array[0..10] of integer; procedure readarray; : function partition (y, z :integer) :integer; var i, j,x, v :integer; :

Loops and Locality. with an introduc-on to the memory hierarchy. COMP 506 Rice University Spring target code. source code OpJmizer

Object Oriented Languages. Hwansoo Han

System Software Assignment 1 Runtime Support for Procedures

The compilation process is driven by the syntactic structure of the program as discovered by the parser

12/4/18. Outline. Implementing Subprograms. Semantics of a subroutine call. Storage of Information. Semantics of a subroutine return

CPS311 Lecture: Procedures Last revised 9/9/13. Objectives:

CS 314 Principles of Programming Languages

Storage in Programs. largest. address. address

Types II. Hwansoo Han

CIT Week13 Lecture

Control Abstraction. Hwansoo Han

Implementing Subprograms

Chapter 10. Implementing Subprograms

Building a Parser Part III

Chapter 6 Introduction to Defining Classes

! What is main memory? ! What is static and dynamic allocation? ! What is segmentation? Maria Hybinette, UGA. High Address (0x7fffffff) !

Design issues for objectoriented. languages. Objects-only "pure" language vs mixed. Are subclasses subtypes of the superclass?

Review! Lecture 5 C Memory Management !

Heap Management portion of the store lives indefinitely until the program explicitly deletes it C++ and Java new Such objects are stored on a heap

CS61C : Machine Structures

Compilers. 8. Run-time Support. Laszlo Böszörmenyi Compilers Run-time - 1

CSE351 Winter 2016, Final Examination March 16, 2016

Programming Languages: Lecture 12

Name, Scope, and Binding. Outline [1]

CS 314 Principles of Programming Languages. Lecture 13

Data Abstraction. Hwansoo Han

Concepts Introduced in Chapter 7

The So'ware Stack: From Assembly Language to Machine Code

Parsing II Top-down parsing. Comp 412

NOTE: Answer ANY FOUR of the following 6 sections:

Implementing Subroutines. Outline [1]

Digital Forensics Lecture 3 - Reverse Engineering

Inheritance, Polymorphism and the Object Memory Model

Run-time Environment

Type Checking Binary Operators

CPS 506 Comparative Programming Languages. Programming Language

Just-In-Time Compilers & Runtime Optimizers

CS Programming In C

Grade Weights. Language Design and Overview of COOL. CS143 Lecture 2. Programming Language Economics 101. Lecture Outline

Semantic Analysis and Type Checking

Habanero Extreme Scale Software Research Project

1 Lexical Considerations

Chapter 1: Object-Oriented Programming Using C++

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Syntax Analysis, V Bottom-up Parsing & The Magic of Handles Comp 412

Syntax Analysis, VI Examples from LR Parsing. Comp 412

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Transcription:

COMP 412 FALL 2018 Naming in OOLs and Storage Layout Comp 412 source IR IR target Front End Optimizer Back End Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Chapters 4, 5, 6 & 7 in EaC2e

What Are The Issues? In an ALL, the compiler needs Compile-time mechanism for name resolution Runtime mechanism to compute an address from a name Compiler must emit that builds & maintains the runtime structures for addressability In an OOL, the compiler needs Compile-time mechanism for name resolution Runtime mechanism to compute an address from a name Compiler must emit that builds & maintains the runtime structures for addressability As stated earlier, this lecture focuses on the compile-time mechanisms. COMP Runtime 412, support Fall 2018for ALL & OOL name spaces will appear later. 1 *

The Java Name Space Class Point { public int x, y; public void draw(); } Class ColorPoint extends Point { Color c; public void draw() { } public void test() { y = x; draw(); } } Point a ColorPoint b The Concept x y draw() different x y draw() c test() Class C { // uses ColorPoint int x, y; public void m() { Point cp = new ColorPoint(); y = cp.x; cp.draw(); } } Point and ColorPoint have different runtime representations. The compiler needs: Names, types, and locations for members Location is an offset from cp A mechanism to ensure that a call to draw() reaches the right instance In the example, Point cp is assigned a new ColorPoint, so it reference s ColorPoint s COMP draw() 412,. Fall 2018 5

The Java Name Space The implementation is more complex, but that s a story for another day. Class Point { public int x, y; public void draw(); } Class ColorPoint extends Point { Color c; public void draw() { } public void test() { y = x; draw(); } } Class C { // uses ColorPoint int x, y; public void m() { Point cp = new ColorPoint(); y = cp.x; cp.draw(); } } Point p Point cp class One implementation COMP 412, Fall 2018 6 x y class x y c Code vectors for Point and ColorPoint class ColorPoint draw() draw() test() Every object has an object record, including each class Items shared amongst a class are recorded in the class object record Class private data members, if any Point s vector ColorPoint s vector

The fundamental question Name Resolution in an OOL What names can an executing member (a function) access? Names defined by the member And its surrounding lexical context The receiving object s data members Smalltalk terminology: instance variables The & data members of the class that defines it And its context from inheritance Smalltalk terminology: class variables and methods Any object defined in the global name space The object use to locate the member is termed the receiver. The method might need the address for any or all of these objects Inheritance adds some twists. An OOL resembles an ALL, with a different name space In an ALL, scoping is relative to hierarchy of the In an OLL, scoping is relative to hierarchy of both the & the data COMP 412, Fall 2018 7

class hierarchy Concrete Example: The Java Name Space Code within a method M for object O of class C can see: Local variables declared within M (standard lexical scoping) All instance variables of O & class variables of C All public and protected variables of any superclass of C Classes defined in the same package as C or in any explicitly imported package public class variables and public instance variables of imported classes package class and instance variables in the package containing C lexical Class declarations can be nested! Member declarations hide outer class declarations of the same name Accessibility options: public, private, protected, package Both lexical nesting & class hierarchy at play Superclass is an ancestor in the inheritance hierarchy COMP 412, Fall 2018 8

Compile-time Structures for OOLs Java To compile method M of object O in class C, the compiler needs: 1. Lexically scoped symbol table for the current block and its surrounding scopes Just like ALL inner declarations hide outer declarations 2. Chain of symbol tables for inheritance Class C and all of its superclasses Need to find methods and instance variables in any superclass 3. Symbol tables for all global classes (package scope) Entries for all members with visibility Need to construct symbol tables for imported packages and link them into the structure in appropriate places Three sets of tables are needed for name resolution. In an ALL, we can combine 1 & 3 for a single unified set of tables. In Java, we need to split them so that the compiler can check the inheritance hierarchy between the lexical hierarchy & the global name space. COMP 412, Fall 2018 9

Compile-time Structures for OOLs Java Conceptually Lexical Hierarchy Class Hierarchy Global Scope Search Order: lexical, class, global Again, the sheaf of tables implementation simplifies the conceptual picture. Static COMP coordinate 412, Fall 2018 needs a hierarchy field, as well. 10

Compile-time Structures for OOLs Java To find the address for a reference to x in method M for an object O of class C, the compiler must: For an unqualified use (i.e., x): Search the symbol table for the method s lexical hierarchy Search the symbol tables for the receiver s class hierarchy Search global symbol table (current package and imported) For each hit, check visibility attribute of x For a qualified use (i.e.: Q.x): Find Q by the method above Search from Q for x Must be a class or instance variable of Q or some class it extends Check visibility attribute of x COMP Compile-time 412, Fall 2018 cost increases by a small constant factor. 11

What About Storage Layout? Where does all of this stuff go? And how does it get there? The compiler must classify all & data so that it can assign storage Scopes: local, global, subject to some set of lexical scoping constraints Lifetimes: entire execution, execution of a procedure, or indeterminate Given these classifications & the state of the naming model, the compiler can assign data to specific data areas Each procedure has a local data area Local data from a scope smaller than a procedure usually goes here Declarations, procedures, files, & modules may have a static data area Depends on the specific declarations in the A program may have one or more global data areas & constant pools Each object has an object record A class, being an object, has an object record of class class COMP 412, Fall 2018 12

When Does the Compiler Assign Storage? The compiler must assign storage before using addresses in the IR Code contains implicit information about storage Compiler must make its decisions after parsing declarations Can envision batch schemes or incremental schemes Either way, the compiler assigns addresses Procedure Header { DeclList StmtList } DeclList DeclList Declaration Declaration StmtList StmtList Statement Statement Typical Grammar for a Procedure Compiler wants to perform storage assignment at this point in the parse Without a declarations before statements rule, the compiler must either build an IR that abstracts away addressing, or make two passes. COMP 412, Fall 2018 13

When Does the Compiler Assign Storage? The compiler must assign storage before using addresses in the IR Code contains implicit information about storage Compiler must make its decisions after parsing declarations Can envision batch schemes or incremental schemes Either way, the compiler assigns addresses Procedure Header { Decls StmtList } Decls DeclList DeclList DeclList Declaration Declaration StmtList StmtList Statement Statement Typical Grammar for a Procedure Modify the grammar to create a reduction to assign storage. Final points: 1. When does this happen? At compile time COMP 2. Where 412, does Fall 2018 stuff go? Next lecture 14

Compile time representation of names Translating Names How does the compiler represent a specific instance of a variable x? Name is translated into a static coordinate < level,offset > pair level is lexical nesting level of the declaration offset is unique within that scope Subsequent will use the static coordinate to generate addresses and references level is a function of the table in which x is found Stored in the entry for each x offset must be assigned and stored in the symbol table Assigned at compile time Known at compile time Used to generate that executes at run-time Store static coordinate in the symbol table entry Store shortcut to symbol table entry in the IR Global names are in a table for the global name space, COMP 412, Fall 2018 with its own offset say negative one. 15

Storage Layout The compiler must assign a location to each named variable and each temporary value that it introduces? Where do all these values live? The answer should depend on the value s visibility & its lifetime Lifetimes Automatic variables Implicit allocation & deallocation Lifetime matches lifetime of declaring scope s activation Static variables Implicit allocation and deallocation Lifetime is the complete execution of the program Global variables are static, in almost any PL. Values with irregular lifetimes Explicit allocation and deallocation (new, or malloc, or ) COMP 412, Fall 2018 16

Storage Layout The compiler must assign a location to each named variable and each temporary value that it introduces? Where do all these values live? The answer should depend on the value s visibility & its lifetime Visibility Local Scopes Procedure scope: name is visible in defining procedure (& nested ones) Block scope (C, scheme): name is visible in defining block (& nested ones) Global Scope Names are visible anywhere that they are declared, unless obscured by a local declaration Unusual scopes C s file scope : a variable declared global & static is visible throughout the file where it is declared, but nowhere else COMP 412, Fall 2018 17

Where do all these values live? AR refers to a procedure s activation record Automatic & Local Automatic local values live in the local data area, in the procedure s AR * Or in a register, if it is an unamibiguous scalar value Compiler lays out a map for the local data area, including spill locations Local values from surrounding scopes live in the AR for their defining scope Runtime addressability mechanism establishes a way to reach them Static (implies lifetime of entire execution) Procedure static procedure-specific data area (e.g., &_.procname_da) File static file-specific data area (e.g., &_.filename_da) Necessary to ensure that they remain live across different activations Global (implies lifetime of entire execution) Global values live in one or more global data area(s) One label per variable, or per file, or per program, With consistently mangled labels to connect them across compiles Equivalent to an assembly language defined storage pseudo-operation Compilers obtain these labels by mangling the original name. COMP The linker 412, will Fall report 2018 conflicting definitions of the same label. 18

n k What is this activation record? Activation Record Basics ARP Activation Record Pointer In most systems, Activation Records have a similar layout 0 space for parameters Space for parameters to the current procedure 0 register save area slot for return value Contents of saved registers If function, space for return value slot for return address Address to resume caller ARP slot for addressability slot for caller s ARP Help with non-local access To restore caller s AR on return If ARs are stack allocated, the stack grows in this direction 0 1 m space for local variables Space for local variables, temporaries, & spills One AR for each invocation of a procedure One COMP register 412, Fall dedicated 2017 to hold the current ARP 19

Name Mangling One of the most useful tools the compiler has is an assembly- label Compiler can emit a label in the Assembler will turn it into a relocatable symbol Linker will replace the label with the appropriate virtual address Hardware will translate virtual address to a physical address Name mangling is the standard trick The compiler assumes that some source-language names are unique Global variables, procedure names Use fully-qualified names for nested objects The compiler builds strings based on the unique name Code for procedure fee might be labelled _fee._ or _&fee or Specific mangles for specific purposes Static data area, entry point to the, return address at a call,... Use character combinations that are not legal in the source language COMP 412, Fall 2018 20

Alignment Issues: One More Standard Trick Having values of multiple types in the same data area creates potential alignment issues Each type (size) of value has its own alignment restriction Laying out a data area in arbitrary order can require padding Consider a & c as single-byte characters and b & d as single-word integers a b c d X X X X X X 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 A Layout That Wastes Space b d c d unused 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 A Better Layout COMP 412, Fall 2018 21

Alignment Issues: One More Standard Trick To create a layout that minimizes space wasted in padding Sort variables by alignment restrictions Most strict (e.g., quad word or double word) to least strict (e.g., byte) From most strict to least strict, lay out variables in contiguous memory Alignment constraints generally decrease by ½ at a transiton This scheme avoids almost all padding for scalar variables. Structures that have atypical alignment constraints may need padding A seventeen-byte long structure followed by a quadword aligned double double b d c d 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 whole words bytes unused memory Good Stopping Point COMP 412, Fall 2018 22

Where Do These Data Areas Go? The compiler lays out data areas Where are they located at runtime? How does the find the data areas? The executable assumes that it has its own virtual address space Large, isolated, uninitialized tract of memory Compiled manages the layout of that virtual address space In collaboration with the operating system, which creates & maintains it Virtual Address Space 0 1 2 3 n-2 n-1 n COMP 412, Fall 2018 23

Address Space Layout Most language runtimes layout the address space in a similar way Stacks Growth space for stacks Heap Code Globals Java Memory Layout Pieces (stack, heap,, & globals) may move, but all will be there Stack and heap grow toward each other (if heap grows) Arrays live on one of the stacks, in the global area, or in the heap The picture shows one virtual address space. The hardware supports one virtual address space per process. How does a virtual address space map into physical memory? COMP 412, Fall 2018 24

Multiple Virtual Address Spaces? The Big Picture Compiler s view virtual address spaces (one per process) stack heap stack heap stack heap... stack heap OS view... TLB 0 high 1980 Hardware view Physical address space (big vector of memory) TLB is an address cache used by the OS to speed virtual-to-physical address translation. A processor may have COMP > 1 level 412, of TLB. Fall 2018 25

Mapping Virtual Address Spaces Add a couple of high-end bits so we can treat the process address spaces as one large address space for mapping purposes Unified address space stack heap stack heap stack heap... stack heap OS view... TLB 0 high 1980 Hardware view Physical address space (big vector of memory) TLB is an address cache used by the OS to speed virtual-to-physical address translation. A processor may have COMP > 1 level 412, of TLB. Fall 2018 26

Cache structure matters for performance, not correctness Mapping Virtual Address Spaces Main Memory L3 L2 Data & Code Data & Code Data & Code Typically shared among 2 cores Of course, the Hardware view is no longer that simple Multiple levels of cache Caches shared among cores Caches exclusive to one core Multiple levels of (shared) TLB L1 Data Code Core Functional unit Functional unit Registers Functional unit Functional unit TLB All of the addresses must map in a way that works for the generated to run in a single virtual address space COMP Most 412, processors Fall 2018 have > 1 core 27 Hard STOP