Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/

Generation of Intermediate Code: Conceptual Structure of a Compiler
Source code
→ Lexical analysis (Scanner)
→ Syntax analysis (Parser)
→ Semantic analysis
→ Generation of intermediate code (tree translations), e.g. LOAD y2; LIT 1; ADD; STO x1
→ Code optimisation
→ Generation of target code
→ Target code
[Figure: type-annotated syntax tree with node labels Asg, Var, Exp, Sum, Con and annotations int, ok]

Generation of Intermediate Code: Modularisation of Code Generation I
Splitting of code generation for programming language PL:
  PL --trans--> IC --code--> MC
- Frontend: trans generates machine-independent intermediate code (IC) for an abstract (stack) machine
- Backend: code generates actual machine code (MC)
Advantages: IC is machine independent ⇒
- Portability: it is much easier to write an IC compiler/interpreter for a new machine than to rewrite the whole compiler
- Fast compiler implementation: generating IC is much easier than generating MC
- Code size: IC programs are usually smaller than the corresponding MC programs
- Code optimisation: division into machine-independent and machine-dependent parts

Generation of Intermediate Code: Modularisation of Code Generation II
Example 15.1
1. UNiversal Computer-Oriented Language (UNCOL; 1960; http://en.wikipedia.org/wiki/uncol): universal intermediate language for compilers (never fully specified or implemented; too ambitious)
     PL_1, ..., PL_m  →  UNCOL  →  MC_1, ..., MC_n
   ⇒ only n + m translations (in place of n · m)
2. Pascal's portable code (P-code; 1970; http://en.wikipedia.org/wiki/p-code_machine)
3. Java Virtual Machine (JVM; Sun; 1995; http://en.wikipedia.org/wiki/java_virtual_machine)
4. Common Intermediate Language (CIL; Microsoft .NET; 2002; http://en.wikipedia.org/wiki/common_intermediate_language)

Generation of Intermediate Code: Language Structures I
Structures in high-level programming languages:
- Basic data types and basic operations
- Static and dynamic data structures
- Expressions and assignments
- Control structures (branching, loops, ...)
- Procedures and functions
- Modularity: blocks, modules, and classes
Use of procedures and blocks:
- FORTRAN: non-recursive and non-nested procedures ⇒ static memory management (requirements determined at compile time)
- C: recursive and non-nested procedures ⇒ dynamic memory management using runtime stack (requirements only known at runtime), no static links
- Algol-like languages (Pascal, Modula): recursive and nested procedures ⇒ dynamic memory management using runtime stack with static links
- Object-oriented languages (C++, Java): object creation and removal ⇒ dynamic memory management using heap

Generation of Intermediate Code: Language Structures II
Structures in machine code (von Neumann/SISD):
- Memory hierarchy: accumulators, registers, caches, main memory, background storage
- Instruction types: arithmetic/boolean/... operation, test/jump instruction, transfer instruction, I/O instruction, ...
- Addressing modes: direct/indirect, absolute/relative, ...
- Architectures: RISC (few [fast but simple] instructions, many registers), CISC (many [complex but slow] instructions, few registers)
Structures in intermediate code:
- Data types and operations like PL
- Data stack with basic operations
- Jumping instructions for control structures
- Runtime stack for blocks, procedures, and static data structures
- Heap for dynamic data structures

The Example Programming Language EPL
Structures of EPL:
- Only integer and Boolean values
- Arithmetic and Boolean expressions with strict and non-strict semantics
- Control structures: sequence, branching, iteration
- Nested blocks and recursive procedures with local and global variables (⇒ dynamic memory management using runtime stack with static links)
- Not considered, or considered later: procedure parameters and [dynamic] data structures

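The Example Programming Language EPL: Syntax of EPL
Definition 15.2 (Syntax of EPL)
The syntax of EPL is defined as follows:
\[
\begin{array}{lrcl}
\mathit{Z}:    & z   &     & \text{(* $z$ is an integer *)} \\
\mathit{Ide}:  & I   &     & \text{(* $I$ is an identifier *)} \\
\mathit{AExp}: & A   & ::= & z \mid I \mid A_1 + A_2 \mid \ldots \\
\mathit{BExp}: & B   & ::= & A_1 < A_2 \mid \texttt{not } B \mid B_1 \texttt{ and } B_2 \mid B_1 \texttt{ or } B_2 \\
\mathit{Cmd}:  & C   & ::= & I := A \mid C_1 ; C_2 \mid \texttt{if } B \texttt{ then } C_1 \texttt{ else } C_2 \mid \texttt{while } B \texttt{ do } C \mid I() \\
\mathit{Dcl}:  & D   & ::= & D_C \; D_V \; D_P \\
               & D_C & ::= & \varepsilon \mid \texttt{const } I_1 := z_1, \ldots, I_n := z_n ; \\
               & D_V & ::= & \varepsilon \mid \texttt{var } I_1, \ldots, I_n ; \\
               & D_P & ::= & \varepsilon \mid \texttt{proc } I_1 ; K_1 ; \ldots ; \texttt{proc } I_n ; K_n ; \\
\mathit{Blk}:  & K   & ::= & D \; C \\
\mathit{Pgm}:  & P   & ::= & \texttt{in/out } I_1, \ldots, I_n ; K .
\end{array}
\]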

The Example Programming Language EPL: EPL Example (Factorial Function)
Example 15.3 (Factorial function)
  in/out x;
  var y;
  proc F;
    if x > 1 then
      y := y * x;
      x := x - 1;
      F()
  y := 1;
  F();
  x := y.
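To make the control flow of Example 15.3 concrete, here is a minimal Python sketch of the same program. It is an illustration only, not part of the lecture: the function name `run_factorial` and the dictionary `state` are hypothetical, and the sketch assumes that all three statements after `then` belong to the conditional branch and that the missing `else` branch does nothing.

```python
def run_factorial(x: int) -> int:
    """Mirror of the EPL program in Example 15.3; returns the final value of x."""
    # Global variables of the EPL program. The initial value of y is
    # irrelevant because the main block assigns y := 1 before calling F.
    state = {"x": x, "y": 0}

    def F() -> None:
        # proc F; if x > 1 then y := y * x; x := x - 1; F()
        # Assumption: the whole sequence is guarded by the condition.
        if state["x"] > 1:
            state["y"] = state["y"] * state["x"]
            state["x"] = state["x"] - 1
            F()

    # Main block: y := 1; F(); x := y.
    state["y"] = 1
    F()
    state["x"] = state["y"]
    return state["x"]


if __name__ == "__main__":
    for n in (0, 1, 5):
        print(n, run_factorial(n))
```

Under these assumptions the procedure multiplies `y` by `x`, `x - 1`, ..., `2`, so the final value of `x` is the factorial of the input for inputs greater than 1 and 1 otherwise.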

Semantics of EPL: Static Semantics of EPL
- Usage of identifiers must be consistent with declaration:
    I := ... J ...  ⇒  I variable, J constant or variable
    I()  ⇒  I procedure
- All identifiers in a declaration D have to be different.
- Every identifier occurring in the command C of a block D C must be declared in D or in the declaration list of a surrounding block.
- Static scoping: the usage of an identifier in the body of a called procedure refers to its declaration environment (and not to its calling environment).
- Multiple declarations of an identifier in different blocks are possible. Each usage in a command refers to the innermost declaration.

Semantics of EPL: Scoping and Visibility of Identifiers
Scope of an identifier
The scope (or: range of validity/visibility) of an identifier's declaration refers to the part of the program where a usage of that identifier can refer to that declaration.
Static vs. dynamic scoping
- Static (aka lexical) scoping: name resolution at compile time, depends on location in source code and lexical context (used here)
  - part = area of source code (block)
  - usually admits at most one declaration of the same identifier per block
  - but allows additional declarations within nested blocks
  - innermost principle: inner declarations hide outer ones (still valid, but not visible)
- Dynamic scoping: name resolution at runtime, depends on execution context
  - part = runtime history, esp. procedure calls
- Levels of scope: expression/block/procedure/file/module/...

Semantics of EPL: Static Scoping by Example
Example 15.4
  in/out x;
  const c = 10;
  var y;
  proc P;
    var y, z;
    proc Q;
      var x, z;
      [... z := 1; P() ...]
    [... P() ... R() ...]
  proc R;
    [... P() ...]
  [... x := 0; P() ...].
- Innermost principle: the use of x in the main program refers to the in/out variable x
- Innermost principle: the use of z in Q refers to the z declared in Q
- Static scoping: the body of P, when called from Q, can refer to x, y, z as visible at P's declaration (the in/out variable x and P's own y and z), not to the x and z declared in Q
- Later declaration: the call of R in P precedes the declaration of R (in older languages: forward declarations for one-pass compilation)
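The difference between the two scoping disciplines can be tried out on Example 15.4 with a small Python sketch. It is a hypothetical model, not the lecture's formalisation: the class `Env` and the function `x_seen_by_p_called_from_q` are illustrative names, and each identifier is bound to the name of its declaring block instead of a value. The only difference between the two disciplines in this sketch is which environment becomes the parent of the callee's environment.

```python
from typing import Dict, Optional


class Env:
    """A chain of declaration tables; lookup walks outwards (innermost principle)."""

    def __init__(self, bindings: Dict[str, str], parent: Optional["Env"] = None) -> None:
        self.bindings = bindings
        self.parent = parent

    def lookup(self, name: str) -> str:
        env: Optional[Env] = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise NameError(name)


def x_seen_by_p_called_from_q(dynamic: bool) -> str:
    """Which declaration of x does the body of P see when P is called from Q?"""
    # Each identifier maps to the block that declares it (Example 15.4).
    main_env = Env({"x": "main", "c": "main", "y": "main"})
    first_p = Env({"y": "P", "z": "P"}, parent=main_env)   # main calls P
    q_env = Env({"x": "Q", "z": "Q"}, parent=first_p)      # P calls Q
    # Q calls P: static scoping links to P's declaration environment (main),
    # dynamic scoping links to the calling environment (Q).
    second_p = Env({"y": "P", "z": "P"}, parent=q_env if dynamic else main_env)
    return second_p.lookup("x")


if __name__ == "__main__":
    print("static :", x_seen_by_p_called_from_q(dynamic=False))
    print("dynamic:", x_seen_by_p_called_from_q(dynamic=True))
```

With static scoping the lookup ends at the declaration in the main program; with dynamic scoping it would end at the `x` declared in Q, which is exactly what EPL rules out.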

Semantics of EPL: Dynamic Semantics of EPL
(omitting the details; cf. Semantics of Programming Languages)
- To run a program, execute the main block in the state (memory location → value) determined by the input values
- Effect of declaration = modification of environment
  - constant ↦ value, variable ↦ memory location, procedure ↦ state transformation
- Effect of statement = modification of state
  - assignment I := A: update of I's location by the current value of A
  - composition C_1;C_2: sequential execution
  - branching if B then C_1 else C_2: test of B, followed by jump to the respective branch
  - iteration while B do C: execution of C as long as B is true
  - call I(): transfer control to the body of I and return to the subsequent statement afterwards
- An EPL program $P = \texttt{in/out } I_1, \ldots, I_n ; K. \in \mathit{Pgm}$ has as semantics a function $\llbracket P \rrbracket : \mathbb{Z}^n \dashrightarrow \mathbb{Z}^n$
Example 15.5 (Factorial function; cf. Example 15.3)
Here $n = 1$ and $\llbracket P \rrbracket(x) = x!$ (where $x! := 1$ for $x \le 0$).

Intermediate Code for EPL: The Abstract Machine AM
Definition 15.6 (Abstract machine for EPL)
The abstract machine for EPL (AM) is defined by the state space
\[ S := \mathit{PC} \times \mathit{DS} \times \mathit{PS} \]
with
- the program counter $\mathit{PC} := \mathbb{N}$,
- the data stack $\mathit{DS} := \mathbb{Z}^*$ (top of stack to the right), and
- the procedure stack (or: runtime stack) $\mathit{PS} := \mathbb{Z}^*$ (top of stack to the left).
Thus a state $s = (pc, d, p) \in S$ is given by
- a program counter $pc \in \mathit{PC}$,
- a data stack $d = d.r : \ldots : d.1 \in \mathit{DS}$, and
- a procedure stack $p = p.1 : \ldots : p.t \in \mathit{PS}$.

Intermediate Code for EPL: AM Instructions
Definition 15.7 (AM instructions)
The set of AM instructions is divided into
- arithmetic instructions: ADD, MULT, ...
- Boolean instructions: NOT, AND, OR, LT, ...
- jumping instructions: JMP(ca), JFALSE(ca) (ca ∈ PC)
- procedure instructions: CALL(ca,dif,loc) (ca ∈ PC; dif, loc ∈ ℕ), RET
- transfer instructions: LOAD(dif,off), STORE(dif,off) (dif, off ∈ ℕ), LIT(z) (z ∈ ℤ)

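Intermediate Code for EPL: Semantics of Instructions
Definition 15.8 (Semantics of AM instructions (1st part))
The semantics of an AM instruction $O$ is defined as follows:
\[
\begin{aligned}
\llbracket O \rrbracket &: S \dashrightarrow S \\
\llbracket \mathrm{ADD} \rrbracket(pc, d : z_1 : z_2, p) &:= (pc + 1, d : z_1 + z_2, p) \\
\llbracket \mathrm{NOT} \rrbracket(pc, d : b, p) &:= (pc + 1, d : \neg b, p) && \text{if } b \in \{0, 1\} \\
\llbracket \mathrm{AND} \rrbracket(pc, d : b_1 : b_2, p) &:= (pc + 1, d : b_1 \wedge b_2, p) && \text{if } b_1, b_2 \in \{0, 1\} \\
\llbracket \mathrm{OR} \rrbracket(pc, d : b_1 : b_2, p) &:= (pc + 1, d : b_1 \vee b_2, p) && \text{if } b_1, b_2 \in \{0, 1\} \\
\llbracket \mathrm{LT} \rrbracket(pc, d : z_1 : z_2, p) &:= \begin{cases} (pc + 1, d : 1, p) & \text{if } z_1 < z_2 \\ (pc + 1, d : 0, p) & \text{if } z_1 \ge z_2 \end{cases} \\
\llbracket \mathrm{JMP}(ca) \rrbracket(pc, d, p) &:= (ca, d, p) \\
\llbracket \mathrm{JFALSE}(ca) \rrbracket(pc, d : b, p) &:= \begin{cases} (ca, d, p) & \text{if } b = 0 \\ (pc + 1, d, p) & \text{if } b = 1 \end{cases}
\end{aligned}
\]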
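The first part of the instruction semantics can be turned into a small interpreter almost line by line. The following Python sketch is an illustration under stated assumptions, not the lecture's implementation: instructions are encoded as tuples, addresses are 0-based indices into the program list, execution stops when the program counter leaves the program, and `LIT(z)` is assumed to push `z` onto the data stack (its semantics belongs to a later part of the definition and is not given above). The names `step` and `run` are hypothetical.

```python
from typing import List, Sequence, Tuple

# A state (pc, d, p): program counter, data stack (top at the right end),
# procedure stack (unused by the instructions covered here).
State = Tuple[int, List[int], List[int]]


def _check_bool(b: int) -> int:
    """The Boolean instructions are only defined on truth values 0 and 1."""
    if b not in (0, 1):
        raise ValueError(f"not a truth value: {b}")
    return b


def step(instr: tuple, state: State) -> State:
    """Apply one AM instruction to a state, following Definition 15.8."""
    pc, d, p = state
    op = instr[0]
    if op == "LIT":      # assumption: LIT(z) pushes z
        return pc + 1, d + [instr[1]], p
    if op == "JMP":
        return instr[1], d, p
    if op in ("NOT", "JFALSE"):
        *rest, b = d     # pop one truth value
        _check_bool(b)
        if op == "NOT":
            return pc + 1, rest + [1 - b], p
        return (instr[1] if b == 0 else pc + 1), rest, p
    *rest, z1, z2 = d    # pop two operands, z2 being the top of stack
    if op == "ADD":
        return pc + 1, rest + [z1 + z2], p
    if op == "LT":
        return pc + 1, rest + [1 if z1 < z2 else 0], p
    if op == "AND":
        return pc + 1, rest + [_check_bool(z1) & _check_bool(z2)], p
    if op == "OR":
        return pc + 1, rest + [_check_bool(z1) | _check_bool(z2)], p
    raise ValueError(f"unknown instruction: {op}")


def run(program: Sequence[tuple], d: Sequence[int] = ()) -> State:
    """Execute from address 0 until pc leaves the program (assumed halting rule)."""
    state: State = (0, list(d), [])
    while 0 <= state[0] < len(program):
        state = step(program[state[0]], state)
    return state


if __name__ == "__main__":
    # if 2 < 3 then push 10 else push 20
    demo = [("LIT", 2), ("LIT", 3), ("LT",), ("JFALSE", 6),
            ("LIT", 10), ("JMP", 7), ("LIT", 20)]
    print(run(demo))
```

Raising an exception on a non-Boolean operand is one way to model the side conditions of the definition, where the semantic function is simply undefined for such states.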

The Procedure Stack: Structure of Procedure Stack I
The semantics of procedure and transfer instructions requires a particular structure of the procedure stack p ∈ PS: it must be composed of frames (or: activation records) of the form
  sl : dl : ra : v_1 : ... : v_k
where
- static link sl: points to the frame of the surrounding declaration environment
  ⇒ used to access non-local variables
- dynamic link dl: points to the previous frame (i.e., of the calling procedure)
  ⇒ used to remove the topmost frame after termination of the procedure call
- return address ra: program counter after termination of the procedure call
  ⇒ used to continue program execution after termination of the procedure call
- local variables v_i: values of locally declared variables

The Procedure Stack: Structure of Procedure Stack II
- Frames are created whenever a procedure call is performed
- Two special frames:
  - I/O frame: for keeping values of in/out variables (sl = dl = ra = 0)
  - MAIN frame: for keeping values of top-level block (sl = dl = I/O frame)

The Procedure Stack: Structure of Procedure Stack III
Example 15.9 (cf. Example 15.4)
  in/out x;
  const c = 10;
  var y;
  proc P;
    var y, z;
    proc Q;
      var x, z;
      [... P() ...]
    [... Q() ...]
  proc R;
    [... P() ...]
  [... P() ...].
Procedure stack after second call of P (top of stack to the left; sl and dl values as read from the figure, which are consistent with relative offsets to the target frame):
  Frame   Layout              sl   dl   ra
  P()     sl dl ra y z        15   4
  Q()     sl dl ra x z        5    4
  P()     sl dl ra y z        5    4
  MAIN    sl dl ra y          4    3
  I/O     sl dl ra x          0    0    0

The Procedure Stack: Structure of Procedure Stack IV
Observation:
- The usage of a variable in a procedure body refers to its innermost declaration.
- If the level difference between the usage and the declaration is dif, then a chain of dif static links has to be followed to access the corresponding frame.
Example 15.10 (cf. Example 15.9)
  in/out x;
  const c = 10;
  var y;
  proc P;
    var y, z;
    proc Q;
      var x, z;
      [... P() ...]
    [... x ... y ... Q() ...]
  proc R;
    [... P() ...]
  [... P() ...].
Procedure stack after second call of P (top of stack to the left; sl and dl values as read from the figure, which are consistent with relative offsets to the target frame):
  Frame   Layout              sl   dl   ra
  P()     sl dl ra y z        15   4
  Q()     sl dl ra x z        5    4
  P()     sl dl ra y z        5    4
  MAIN    sl dl ra y          4    3
  I/O     sl dl ra x          0    0    0
- P uses x ⇒ dif = 2
- P uses y ⇒ dif = 0
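The observation about level differences can be checked on the stack of Example 15.10 with a short Python sketch. It is a simplified model under stated assumptions: frames are Python objects linked by references instead of cells in ℤ* with numeric links, return addresses are dummy values, variable contents are made up for the demonstration, and `load`/`store` only mimic the addressing part of LOAD(dif,off) and STORE(dif,off), whose full semantics is not given above. All names (`Frame`, `frame_at`, `load`, `store`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """Activation record sl : dl : ra : v_1 : ... : v_k (links as references)."""
    name: str
    sl: Optional["Frame"]   # static link: frame of the declaration environment
    dl: Optional["Frame"]   # dynamic link: frame of the caller
    ra: int                 # return address (dummy value in this sketch)
    values: List[int] = field(default_factory=list)


def frame_at(current: Frame, dif: int) -> Frame:
    """Follow a chain of dif static links starting at the current frame."""
    frame = current
    for _ in range(dif):
        if frame.sl is None:
            raise ValueError("static link chain too short")
        frame = frame.sl
    return frame


def load(current: Frame, dif: int, off: int) -> int:
    """Read the variable at offset off in the frame dif levels outwards."""
    return frame_at(current, dif).values[off]


def store(current: Frame, dif: int, off: int, value: int) -> None:
    """Write the variable at offset off in the frame dif levels outwards."""
    frame_at(current, dif).values[off] = value


# Stack of Example 15.10 after the second call of P (made-up variable values).
io = Frame("I/O", None, None, 0, [3])            # x
main = Frame("MAIN", io, io, 0, [10])            # y
first_p = Frame("P", main, main, 0, [11, 12])    # y, z
q = Frame("Q", first_p, first_p, 0, [21, 22])    # x, z
second_p = Frame("P", main, q, 0, [31, 32])      # y, z

if __name__ == "__main__":
    print("P uses x (dif = 2):", load(second_p, 2, 0))
    print("P uses y (dif = 0):", load(second_p, 0, 0))
```

Note that the static link of the second P frame points to MAIN although its dynamic link points to Q: the access to x therefore reaches the I/O frame and never sees the x declared in Q.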

The Procedure Stack: Displays
Optimisation technique to replace static links
- Implementation: array of pointers to frames: disp
  - Size of the array = maximum block nesting depth of the program
  - disp[i] points to the frame associated with the current scope at nesting depth i
- Usage: to access a non-local variable declared at nesting depth i,
  1. go to the frame pointed to by disp[i]
  2. access the variable via its offset in this frame
- Advantage: constant-time access (exactly two steps regardless of level difference)
- Maintenance:
  - use a global display for the current procedure activation
  - the caller saves the display in its frame, and restores it when the callee returns
  - the display for the callee is constructed using techniques for static link handling, using static scoping
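A display can be sketched in a few lines of Python. This is a simplified illustration, not the lecture's construction: the class `DisplayMachine` and its methods are hypothetical names, frames hold only local variables, the caller's saved display lives in a Python local variable instead of its frame, and the nesting depths 0 (I/O), 1 (MAIN), 2 (P) and 3 (Q) are an assumed numbering for the program of Example 15.9. The sketch relies on the standard observation that, under static scoping, a callee declared at depth k shares the caller's display entries below k, so only disp[k] has to be set.

```python
from typing import Callable, List, Optional

Frame = List[int]  # only the local variables of an activation


class DisplayMachine:
    def __init__(self, max_depth: int) -> None:
        # disp[i] points to the frame of the current scope at nesting depth i.
        self.disp: List[Optional[Frame]] = [None] * max_depth

    def load(self, depth: int, off: int) -> int:
        frame = self.disp[depth]          # step 1: go to disp[depth]
        if frame is None:
            raise ValueError(f"no visible scope at depth {depth}")
        return frame[off]                 # step 2: access via offset

    def store(self, depth: int, off: int, value: int) -> None:
        frame = self.disp[depth]
        if frame is None:
            raise ValueError(f"no visible scope at depth {depth}")
        frame[off] = value

    def call(self, depth: int, n_locals: int, body: Callable[[], None]) -> None:
        saved = list(self.disp)           # caller saves the display
        self.disp[depth] = [0] * n_locals
        for i in range(depth + 1, len(self.disp)):
            self.disp[i] = None           # deeper scopes are not visible in the callee
        try:
            body()
        finally:
            self.disp = saved             # caller restores it on return


m = DisplayMachine(max_depth=4)
trace = []


def second_p() -> None:
    m.store(2, 0, 7)                                   # y of the second P
    trace.append(("second P sees x", m.load(0, 0)))    # in/out x, depth 0
    trace.append(("second P sees y", m.load(2, 0)))    # its own y


def q() -> None:
    m.store(3, 0, 5)                                   # x declared in Q
    m.call(2, 2, second_p)                             # Q calls P
    trace.append(("Q sees y of first P", m.load(2, 0)))
    trace.append(("Q sees its x", m.load(3, 0)))


def first_p() -> None:
    m.store(2, 0, 1)                                   # y of the first P
    m.call(3, 2, q)                                    # P calls Q


def main_block() -> None:
    m.call(2, 2, first_p)                              # main calls P


def io_block() -> None:
    m.store(0, 0, 42)                                  # input value of x
    m.call(1, 1, main_block)


m.call(0, 1, io_block)
```

After the run, `trace` records which variable each activation reached: every access takes the same two steps, and restoring the saved display on return makes Q see the first P's frame at depth 2 again.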