Optimizer. Defining and preserving the meaning of the program

Similar documents
The Procedure Abstraction Part I: Basics

The Procedure Abstraction

CS415 Compilers. Procedure Abstractions. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

The Procedure Abstraction Part I Basics

CS 406/534 Compiler Construction Intermediate Representation and Procedure Abstraction

Naming in OOLs and Storage Layout Comp 412

Procedure and Function Calls, Part II. Comp 412 COMP 412 FALL Chapter 6 in EaC2e. target code. source code Front End Optimizer Back End

Runtime Support for Algol-Like Languages Comp 412

Run Time Environment. Activation Records Procedure Linkage Name Translation and Variable Access

Intermediate Representations & Symbol Tables

Runtime Support for OOLs Object Records, Code Vectors, Inheritance Comp 412

Run Time Environment. Procedure Abstraction. The Procedure as a Control Abstraction. The Procedure as a Control Abstraction

Separate compilation. Topic 6: Runtime Environments p.1/21. CS 526 Topic 6: Runtime Environments The linkage convention

The Software Stack: From Assembly Language to Machine Code

What about Object-Oriented Languages?

Arrays and Functions

CSE 504: Compiler Design. Runtime Environments

The Processor Memory Hierarchy

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Run-time Environments

Run-time Environments

! Those values must be stored somewhere! Therefore, variables must somehow be bound. ! How?

CS 314 Principles of Programming Languages. Lecture 13

Run-time Environments. Lecture 13. Prof. Alex Aiken Original Slides (Modified by Prof. Vijay Ganesh) Lecture 13

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

Procedure and Object- Oriented Abstraction

Instruction Selection: Preliminaries. Comp 412

Runtime management. CS Compiler Design. The procedure abstraction. The procedure abstraction. Runtime management. V.

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

Intermediate Representations

CS415 Compilers Context-Sensitive Analysis Type checking Symbol tables

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

G Programming Languages - Fall 2012

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Lecture 7: Binding Time and Storage

Implementing Control Flow Constructs Comp 412

Compilers and computer architecture: A realistic compiler to MIPS

Handling Assignment Comp 412

Local Register Allocation (critical content for Lab 2) Comp 412

Concepts Introduced in Chapter 7

CS 314 Principles of Programming Languages

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

Lecture #16: Introduction to Runtime Organization. Last modified: Fri Mar 19 00:17: CS164: Lecture #16 1

Compiler Design. Code Shaping. Hwansoo Han

Example. program sort; var a : array[0..10] of integer; procedure readarray; : function partition (y, z :integer) :integer; var i, j,x, v :integer; :

Symbol Tables. For compile-time efficiency, compilers often use a symbol table: associates lexical names (symbols) with their attributes

CPS311 Lecture: Procedures Last revised 9/9/13. Objectives:

Computing Inside The Parser Syntax-Directed Translation. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Symbol Table Information. Symbol Tables. Symbol table organization. Hash Tables. What kind of information might the compiler need?

Compiler construction

Special Topics: Programming Languages

Programming Languages Third Edition. Chapter 10 Control II Procedures and Environments

Just-In-Time Compilers & Runtime Optimizers

Building a Parser Part III

CS415 Compilers. Intermediate Represeation & Code Generation

Scope. Chapter Ten Modern Programming Languages 1

Control Abstraction. Hwansoo Han

Code Shape II Expressions & Assignment

Principles of Programming Languages Topic: Scope and Memory Professor Louis Steinberg Fall 2004

Weeks 6&7: Procedures and Parameter Passing

Short Notes of CS201

Chapter 3:: Names, Scopes, and Bindings (cont.)

Implementing Subprograms

Computing Inside The Parser Syntax-Directed Translation. Comp 412

Chapter 3:: Names, Scopes, and Bindings (cont.)

Compilers. 8. Run-time Support. Laszlo Böszörmenyi Compilers Run-time - 1

CS201 - Introduction to Programming Glossary By

CSE 504: Compiler Design. Intermediate Representations Symbol Table

Parsing II Top-down parsing. Comp 412

The basic operations defined on a symbol table include: free to remove all entries and free the storage of a symbol table

Chapter 9 Subroutines and Control Abstraction. June 22, 2016

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412

CIT Week13 Lecture

CA Compiler Construction

The So'ware Stack: From Assembly Language to Machine Code

Habanero Extreme Scale Software Research Project

Intermediate Representations

CS 406/534 Compiler Construction Putting It All Together

3. Process Management in xv6

5. Semantic Analysis!

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem:


5. Semantic Analysis. Mircea Lungu Oscar Nierstrasz

Programming Languages

Announcements. Scope, Function Calls and Storage Management. Block Structured Languages. Topics. Examples. Simplified Machine Model.

COMP 181. Prelude. Intermediate representations. Today. High-level IR. Types of IRs. Intermediate representations and code generation

Typical Runtime Layout. Tiger Runtime Environments. Example: Nested Functions. Activation Trees. code. Memory Layout

Lexical Analysis. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

Module 27 Switch-case statements and Run-time storage management

CS 4100 Block Structured Languages. Fixed vs Variable. Activation Record. Activation Records. State of an Activation

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

Attributes of Variable. Lecture 13: Run-Time Storage Management. Lifetime. Value & Location

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators.

Syntax Analysis, V Bottom-up Parsing & The Magic of Handles Comp 412

16 Sharing Main Memory Segmentation and Paging

Binding and Storage. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

Transcription:

Where are we? Well understood Engineering Source Code Front End IR Optimizer IR Back End Machine code Errors The latter half of a compiler contains more open problems, more challenges, and more gray areas than the front half This is compilation, as opposed to parsing or translation Implementing promised behavior Defining and preserving the meaning of the program Managing target machine resources Registers, memory, issue slots, locality, power, These issues determine the quality of the compiled code Comp 412, Fall 2010 1

Conceptual Overview The compiler must provide, for each programming language construct, an implementation (or at least a strategy). Those constructs fall into two major categories Individual statements Procedures We will look at procedures first, since they provide the surrounding context needed to implement statements Object-oriented languages add some peculiar twists We will treat OOL features in a separate lecture or two In EaC, Chapter 6 covers implementation of procedures & Chapter 7 covers statements. Comp 412, Fall 2010 2

Conceptual Overview Procedures provide the fundamental abstractions that make programming practical & large software systems possible Information hiding Distinct and separable name spaces Uniform interfaces Hardware does little to support these abstractions Part of the compiler s job is to implement them Compiler makes good on lies that we tell programmers Part of the compiler s job is to make it efficient Role of code optimization Comp 412, Fall 2010 3

Practical Overview The compiler must decide almost everything Location for each value (named and unnamed) Method for computing each result For example, how should it compute y x or a case statement? Compile-time versus runtime behavior How to locate objects & values created & manipulated by code that the compiler cannot see? (other files, libraries) Dynamic loading and linking add more complications All of these issues come to the forefront when we consider the implementation of procedures Pay close attention to compile-time versus runtime Confuses students more than any other issue Comp 412, Fall 2010 4

The Procedure Abstraction The compiler must deal with interface between compile time and run time (static versus dynamic) Most of the tricky issues arise in implementing procedures Issues Compile-time versus run-time behavior Assign storage for everything & map names to addresses Generate code to address any value Compiler knows where some of them are Compiler cannot know where others are Interfaces with other programs, other languages, & the OS Efficiency of implementation Comp 412, Fall 2010 5

The Procedure & Its Three Abstractions The compiler produces code for each procedure Compiled Code Procedure The individual code bodies must fit together to form a working program Comp 412, Fall 2010 6

The Procedure & Its Three Abstractions Naming Environment Compiled Code Procedure Each procedure inherits a set of names Variables, values, procedures, objects, locations, Naming includes the ability to find and access objects in memory Clean slate for new names, scoping can hide other names Comp 412, Fall 2010 7

The Procedure & Its Three Abstractions Naming Environment Control History Compiled Code Procedure Each procedure inherits a control history Chain of calls that led to its invocation Mechanism to return control to caller Some notion of parameterization (ties back to naming) Comp 412, Fall 2010 8

The Procedure & Its Three Abstractions Naming Environment Control History Compiled Code APIs System Services (allocation, communication, I/O, control, naming, ) Procedure Each procedure has access to external interfaces Access by name, with parameters (may include dynamic link & load) Protection for both sides of the interface Comp 412, Fall 2010 9

The Procedure: Three Abstractions Control Abstraction Well defined entries & exits Mechanism to return control to caller Some notion of parameterization (usually) Clean Name Space Clean slate for writing locally visible names Local names may obscure identical, non-local names Local names cannot be seen outside External Interface Access is by procedure name & parameters Clear protection for both caller & callee Invoked procedure can ignore calling context Procedures permit a critical separation of concerns Comp 412, Fall 2010 10

The Procedure (Realist s View) Procedures are the key to building large systems Requires system-wide compact Conventions on memory layout, protection, resource allocation calling sequences, & error handling Must involve architecture (ISA), OS, & compiler Provides shared access to system-wide facilities Storage management, flow of control, interrupts Interface to input/output devices, protection facilities, timers, synchronization flags, counters, Establishes a private context Create private storage for each procedure invocation Encapsulate information about control flow & data abstractions Comp 412, Fall 2010 11

The Procedure (Realist s View) Procedures allow us to use separate compilation Separate compilation allows us to build non-trivial programs Keeps compile times reasonable Lets multiple programmers collaborate Requires independent procedures Without separate compilation, we would not build large systems The procedure linkage convention Ensures that each procedure inherits a valid run-time environment and that the callers environment is restored on return The compiler must generate code to ensure this happens according to conventions established by the system Comp 412, Fall 2010 12

The Procedure (More Abstract View) A procedure is an abstract structure constructed via software Underlying hardware directly supports little of the abstraction it understands bits, bytes, integers, reals, & addresses, but not: Entries and exits Interfaces Call and return mechanisms Typical machine supports the transfer of control (call and return) but not the rest of the calling sequence (e.g., preserving context) Name space Nested scopes All these are established by carefully-crafted mechanisms provided by compiler, run-time system, linker, loader, and OS; The compiler s job is to make good on the lies Comp 412, Fall 2010 told by the programming language design! 13

Run Time versus Compile Time These concepts are often confusing to the newcomer Linkages (and code for procedure body) execute at run time Code for the linkage is emitted at compile time The linkage is designed long before either of these This issue (compile time versus run time) confuses students more than any other issue in Comp 412 We will emphasize the distinction between them Comp 412, Fall 2010 14

The Procedure as a Control Abstraction Procedures have well-defined control-flow The Algol-60 procedure call Invoked at a call site, with some set of actual parameters Control returns to call site, immediately after invocation s = p(10,t,u); int p(a,b,c) int a, b, c; { int d; d = q(c,b);... Most languages allow recursion int q(x,y) int x,y; { if ( ) x = q(x-1,y); return x + y; Comp 412, Fall 2010 15

The Procedure as a Control Abstraction Implementing procedures with this behavior Requires code to save and restore a return address Must map actual parameters to formal parameters (c x, b y) Must create storage for local variables (&, maybe, parameters) p needs space for d (&, maybe, a, b, & c) where does this space go in recursive invocations? s = p(10,t,u); int p(a,b,c) int a, b, c; { int d; d = q(c,b);... int q(x,y) int x,y; { if ( ) x = q(x-1,y); return x + y; Compiler emits code that causes all this to happen at run time Comp 412, Fall 2010 16

The Procedure as a Control Abstraction Implementing procedures with this behavior Must preserve p s state while q executes recursion causes the real problem here Strategy: Create unique location for each procedure activation In simple situations, can use a stack of memory blocks to hold local storage and return addresses (closures heap allocate) s = p(10,t,u); int p(a,b,c) int a, b, c; { int d; d = q(c,b);... int q(x,y) int x,y; { if ( ) x = q(x-1,y); return x + y; Compiler emits code that causes all this to happen at run time Comp 412, Fall 2010 17

The Procedure as a Control Abstraction In essence, the procedure linkage wraps around the unique code of each procedure to give it a uniform interface Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Compiled Code Similar to building a brick wall rather than a rock wall Comp 412, Fall 2010 18 STOP

COMP 412 FALL 2010 The Procedure Abstraction: Part II Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.

Review From last lecture The Procedure serves as A control abstraction A naming abstraction An external interface Access to system services, libraries, code from others We covered the control abstraction last lecture. Today, we will focus on naming. Comp 412, Fall 2010 20

The Procedure as a Name Space Each procedure creates its own name space Any name (almost) can be declared locally Local names obscure identical non-local names The Java twist: allow fully qualified names to reach around scope rules Local names cannot be seen outside the procedure Nested procedures are inside by definition We call this set of rules & conventions lexical scoping Examples C has global, static, local, and block scopes (Fortran-like) Blocks can be nested, procedures cannot Scheme has global, procedure-wide, and nested scopes (let) Procedure scope (typically) contains formal parameters Comp 412, Fall 2010 21

The Procedure as a Name Space Why introduce lexical scoping? Provides a compile-time mechanism for binding free variables Simplifies rules for naming & resolves conflicts Lets the programmer introduce local names with impunity How can the compiler keep track of all those names? The Problem At point p, which declaration of x is current? At run-time, where is x found? As parser goes in & out of scopes, how does it delete x? The Answer The compiler must model the name space Lexically scoped symbol tables (see 5.7 in EaC 1e) Comp 412, Fall 2010 22

Do People Use This Stuff? C macro from the MSCP compiler #define fix_inequality(oper, new_opcode) Even in C, a language \ not if (value0 < value1) known for abstraction, \ { people use scopes to hide \ Unsigned_Int temp = value0; details! \ value0 = value1; \ value1 = temp; \ opcode_name = new_opcode; \ temp = oper->arguments[0]; \ oper->arguments[0] = oper->arguments[1]; Declares \ a new oper->arguments[1] = temp; name \ temp oper->opcode = new_opcode; \ Comp 412, Fall 2010 23

Do People Use This Stuff? Of course, it might have been more clear written as: #define swap_values( val0, val1 ) \ { \ Unsigned_Int tem = val0; \ val0 = val1; \ val1 = temp; \ Even in C, we can build abstractions that are useful and tasteful. #define fix_inequality(oper, new_opcode) \ if (value0 < value1) \ { \ swap_values( val0, val1 ); \ opcode_name = new_opcode; Comp 412, Fall 2010 24

Lexically-scoped Symbol Tables 5.7 in EaC 1e The problem The compiler needs a distinct record for each declaration Nested lexical scopes admit duplicate declarations The interface insert(name, level ) creates record for name at level lookup(name, level) returns pointer or index delete(level ) removes all names declared at level Many implementation schemes have been proposed We ll stay at the conceptual level Hash table implementation is tricky, detailed, & (yes) fun Good alternatives exist (multiset discrimination, acyclic DFAs) Symbol tables are compile-time structures that the compiler uses to resolve references to names. We ll see the corresponding run-time structures that are used to establish addressability later. Comp 412, Fall 2010 25

Example procedure p { int a, b, c procedure q { int v, b, x, w procedure r { int x, y, z. procedure s { int x, a, v r s q B0: { int a, b, c B1: { int v, b, x, w B2: { int x, y, z. B3: { int x, a, v r s q Comp 412, Fall 2010 26

Lexically-scoped Symbol Tables High-level idea Create a new table for each scope Chain them together for lookup r x y z q v b x w p b a c...... Sheaf of tables implementation insert() may need to create table it always inserts at current level lookup() walks chain of tables & returns first occurrence of name delete() throws away level p table if it is top table in the chain If the compiler must preserve the table (for, say, the debugger ), this idea is actually practical. Individual tables are hash tables. Comp 412, Fall 2010 This high-level idea can be implemented as shown, or it can be 27 implemented in more space-efficient (albeit complex) ways.

Implementing Lexically Scoped Symbol Tables Threaded stack organization h(x) Sparse index z y x w x b v c b a Dense table growth r q p Implementation insert () puts new entry at the head of the list for the name lookup () goes direct to location delete () processes each element in level being deleted to remove from head of list use sparse index for speed use dense table to limit space Advantage lookup is fast Disadvantage delete takes time proportional to number of declared variables in level Parser (or treewalk) encounters names by scope, Comp 412, Fall 2010 28 so each scope forms a block at stack top.

The Procedure as an External Interface Naming plays a critical role in our ability to use procedure calls as a general interface OS needs a way to start the program s execution Programmer needs a way to indicate where it begins The procedure main in most languages When user invokes grep at a command line OS finds the executable OS creates a process and arranges for it to run grep Conceptually, it does a fork() and an exec() of the executable grep grep is code from the compiler, linked with run-time system Starts the run-time environment & calls main After main, it shuts down run-time environment & returns When grep needs system services It makes a system call, such as fopen() UNIX/Linux specific discussion Comp 412, Fall 2010 29

Where Do All These Variables Go? Automatic & Local Keep them in the procedure activation record or in a register Automatic lifetime matches procedure s lifetime Static Procedure scope storage area affixed with procedure name &_p.x File scope storage area affixed with file name Lifetime is entire execution Global One or more named global data areas One per variable, or per file, or per program, Lifetime is entire execution Comp 412, Fall 2010 30

Placing Run-time Data Structures Classic Organization C o d e S G t l a & o t b i a c l S t a c k 0 high Single Logical Address Space H e a p Better utilization if stack & heap grow toward each other Very old result (Knuth) Code & data separate or interleaved Uses address space, not allocated memory Code, static, & global data have known size Use symbolic labels in the code Heap & stack both grow & shrink over time This is a virtual address space Comp 412, Fall 2010 31

How Does This Really Work? The Big Picture Compiler s view virtual address spaces C o d e S G t l a & o t b i a c l S t a c k H e a p C o d e S G t l a & o t b i a c l S t a c k H e a p C o d e S G t l a & o t b i a c l S t a c k H e a p... C o d e S G t l a & o t b i a c l S t a c k H e a p OS view... 0 high 1980 Hardware view Physical address space_ Comp 412, Fall 2010 On many modern processors, L1 cache uses physical addresses 32 Higher level caches typically use virtual addresses

How Does This Really Work? Most systems now include L3 caches. L4 is on its way. Of course, the Hardware view is no longer that simple Main Memory L2 Cache... 0 high... Caches can use physical or virtual addresses Caches can be inclusive or exclusive Caches can be shared L1 Caches Code Data Code Data Processor Cores Functional unit Functional unit Functional unit Functional unit Functional unit Functional unit Functional unit Functional unit Comp 412, Fall 2010 Cache structure matters for performance, not correctness 33

Where Do Local Variables Live? A Simplistic model Allocate a data area for each distinct scope One data area per sheaf in scoped table What about recursion? Need a data area per invocation (or activation) of a scope We call this the scope s activation record The compiler can also store control information there! More complex scheme One activation record (AR) per procedure instance All the procedure s scopes share a single AR (may share space) Static relationship between scopes in single procedure Used this way, static means knowable at compile time (and, therefore, fixed). Comp 412, Fall 2010 34

Translating Local Names How does the compiler represent a specific instance of x? Name is translated into a static coordinate < level,offset> pair level is lexical nesting level of the procedure offset is unique within that scope Subsequent code will use the static coordinate to generate addresses and references level is a function of the table in which x is found Stored in the entry for each x offset must be assigned and stored in the symbol table Assigned at compile time Known at compile time Used to generate code that executes at run-time Comp 412, Fall 2010 35

Storage for Blocks within a Single Procedure B0: { int a, b, c B1: { int v, b, x, w B2: { int x, y, z B3: { int x, a, v Fixed length data can always be at a constant offset from the beginning of a procedure In our example, the a declared at level 0 will always be the first data element, stored at byte 0 in the fixed-length data area The x declared at level 1 will always be the sixth data item, stored at byte 20 in the fixed data area The x declared at level 2 will always be the eighth data item, stored at byte 28 in the fixed data area But what about the a declared in the second block at level 2? Comp 412, Fall 2010 36

Variable-length Data B0: { int a, b assign value to a B1: { int v(a), b, x B2: { int x, y(8) Arrays If size is fixed at compile time, store in fixed-length data area If size is variable, store descriptor in fixed length area, with pointer to variable length area Variable-length data area is assigned at the end of the fixed length area for the block in which it is allocated (including all contained blocks) a b v b x x y(8) v(a) Includes data for all fixed length objects in all blocks Variable-length data Comp 412, Fall 2010 37

Activation Record Basics ARP parameters register save area return value return address addressability caller s ARP local variables Space for parameters to the current routine Saved register contents If function, space for return value Address to resume caller Help with non-local access To restore caller s AR on a return Space for local values & variables (including spills) One AR for each invocation of a procedure Comp 412, Fall 2010 ARP Activation Record Pointer 38

Activation Record Details How does the compiler find the variables? They are at known offsets from the AR pointer The static coordinate leads to a loadai operation Level specifies an ARP, offset is the constant Variable-length data If AR can be extended, put it above local variables Leave a pointer at a known offset from ARP Otherwise, put variable-length data on the heap Initializing local variables Must generate explicit code to store the values Among the procedure s first actions Comp 412, Fall 2010 39

Activation Record Details Where do activation records live? If lifetime of AR matches lifetime of invocation, AND S G If code normally executes a return t l Keep ARs on a stack If a procedure can outlive its caller, OR If it can return an object that can reference its execution state ARs must be kept in the heap If a procedure makes no calls AR can be allocated statically Efficiency prefers static, stack, then heap C o d e a & o t b i a c l S t a c k H e a p Yes! That stack STOP Comp 412, Fall 2010 40

Unused Slides Comp 412, Fall 2010 41

Implementing Lexically Scoped Symbol Tables Stack organization nextfree r (level 2) q (level 1) p (level 0) z y x w x b v c b a growth Implementation insert () creates new level pointer if needed and inserts at nextfree lookup () searches linearly from nextfree 1 forward delete () sets nextfree to the equal the start location of the level deleted. Advantage Uses much less space Disadvantage Lookups can be expensive Comp 412, Fall 2010 42