Lecture 15 CIS 341: COMPILERS

Similar documents
CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

Lecture 13 CIS 341: COMPILERS

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages. Operational Semantics

CIS 341 Midterm March 2, 2017 SOLUTIONS

Project 2: Scheme Interpreter

CIS 341 Midterm February 28, Name (printed): Pennkey (login id): SOLUTIONS

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Programming Languages and Compilers (CS 421)

Type Inference Systems. Type Judgments. Deriving a Type Judgment. Deriving a Judgment. Hypothetical Type Judgments CS412/CS413

CIS 341 Final Examination 4 May 2017

Topic 6: Types COS 320. Compiling Techniques. Princeton University Spring Prof. David August. Adapted from slides by Aarne Ranta

The Substitution Model

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

7. Introduction to Denotational Semantics. Oscar Nierstrasz

Fall Lecture 3 September 4. Stephen Brookes

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Flang typechecker Due: February 27, 2015

Parsing Scheme (+ (* 2 3) 1) * 1

Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

CS558 Programming Languages

Programming Language Features. CMSC 330: Organization of Programming Languages. Turing Completeness. Turing Machine.

CMSC 330: Organization of Programming Languages

The Substitution Model. Nate Foster Spring 2018

CIS 341 Midterm March 2, Name (printed): Pennkey (login id): Do not begin the exam until you are told to do so.

Principles of Programming Languages

CSE341: Programming Languages Lecture 17 Implementing Languages Including Closures. Dan Grossman Autumn 2018

CIS 341 Final Examination 4 May 2018

6.184 Lecture 4. Interpretation. Tweaked by Ben Vandiver Compiled by Mike Phillips Original material by Eric Grimson

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Lecture 2: Big-Step Semantics

Programming Languages Third Edition

Formal Semantics. Prof. Clarkson Fall Today s music: Down to Earth by Peter Gabriel from the WALL-E soundtrack

Programming Languages and Compilers (CS 421)

Operational Semantics. One-Slide Summary. Lecture Outline

COS 320. Compiling Techniques

Comp 411 Principles of Programming Languages Lecture 7 Meta-interpreters. Corky Cartwright January 26, 2018

CS558 Programming Languages

CVO103: Programming Languages. Lecture 5 Design and Implementation of PLs (1) Expressions

Chapter 3 (part 3) Describing Syntax and Semantics

CMSC 330: Organization of Programming Languages

Lecture 09: Data Abstraction ++ Parsing is the process of translating a sequence of characters (a string) into an abstract syntax tree.

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

CMSC 330: Organization of Programming Languages. Lambda Calculus

Computer Science 21b (Spring Term, 2015) Structure and Interpretation of Computer Programs. Lexical addressing

CSE413: Programming Languages and Implementation Racket structs Implementing languages with interpreters Implementing closures

Semantic Analysis. Lecture 9. February 7, 2018

Programming Languages Fall 2013

Midterm II CS164, Spring 2006

Boolean expressions. Elements of Programming Languages. Conditionals. What use is this?

cs242 Kathleen Fisher Reading: Concepts in Programming Languages, Chapter 6 Thanks to John Mitchell for some of these slides.

Intermediate Code Generation

# true;; - : bool = true. # false;; - : bool = false 9/10/ // = {s (5, "hi", 3.2), c 4, a 1, b 5} 9/10/2017 4

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages. Lambda Calculus

Lecture 2: SML Basics

Hard deadline: 3/28/15 1:00pm. Using software development tools like source control. Understanding the environment model and type inference.

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Notation. The rules. Evaluation Rules So Far.

Programming Languages

A Small Interpreted Language

Pierce Ch. 3, 8, 11, 15. Type Systems

Recap: Functions as first-class values

CSCI-GA Scripting Languages

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

Lecture Compiler Middle-End

Programs as data first-order functional language type checking

Mutable References. Chapter 1

Procedures. EOPL3: Section 3.3 PROC and App B: SLLGEN

CSCE 314 Programming Languages. Type System

Functional Programming. Pure Functional Programming

LECTURE 16. Functional Programming

n n Try tutorial on front page to get started! n spring13/ n Stack Overflow!

Introduction to ML. Mooly Sagiv. Cornell CS 3110 Data Structures and Functional Programming

CSE 341 Section 5. Winter 2018

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

Lecture 16: Static Semantics Overview 1

COMP520 - GoLite Type Checking Specification

Induction and Semantics in Dafny

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

The Environment Model

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Mid-Term 2 Grades

G Programming Languages - Fall 2012

COP4020 Programming Languages. Functional Programming Prof. Robert van Engelen

Reasoning About Imperative Programs. COS 441 Slides 10

The SPL Programming Language Reference Manual

Comp 311: Sample Midterm Examination

Haske k ll An introduction to Functional functional programming using Haskell Purely Lazy Example: QuickSort in Java Example: QuickSort in Haskell

Introduction to Axiomatic Semantics

CSE-321: Assignment 8 (100 points)

A Simple Syntax-Directed Translator

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Lecture 13. Notation. The rules. Evaluation Rules So Far

Lexical Considerations

CMSC 330: Organization of Programming Languages. OCaml Imperative Programming

G Programming Languages Spring 2010 Lecture 4. Robert Grimm, New York University

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

6.037 Lecture 4. Interpretation. What is an interpreter? Why do we need an interpreter? Stages of an interpreter. Role of each part of the interpreter

Booleans (aka Truth Values) Programming Languages and Compilers (CS 421) Booleans and Short-Circuit Evaluation. Tuples as Values.

Recap. Recap. If-then-else expressions. If-then-else expressions. If-then-else expressions. If-then-else expressions

Transcription:

Lecture 15 CIS 341: COMPILERS

Announcements HW4: OAT v. 1.0 Parsing & basic code generation Due: March 28 th No lecture on Thursday, March 22 Dr. Z will be away Zdancewic CIS 341: Compilers 2

Adding Integers to Lambda Calculus exp ::= n constant integers exp 1 + exp 2 binary arithmetic operation val ::= fun x -> exp functions are values n integers are values n{v/x} = n constants have no free vars. (e 1 + e 2 ){v/x} = (e 1 {v/x} + e 2 {v/x}) substitute everywhere exp 1 n 1 exp2 n 2 exp 1 + exp 2 (n1 + n2) Object-level + Meta-level + Zdancewic CIS 341: Compilers 3

Compiling lambda calculus to straight-line code. Representing evaluation environments at runtime. CLOSURE CONVERSION Zdancewic CIS 341: Compilers 4

Compiling First-class Functions To implement first-class functions on a processor, there are two problems: First: we must implement substitution of free variables Second: we must separate code from data Reify the substitution: Move substitution from the meta language to the object language by making the data structure & lookup operation explicit The environment-based interpreter is one step in this direction Closure Conversion: Eliminates free variables by packaging up the needed environment in the data structure. Hoisting: Separates code from data, pulling closed code to the top level. Zdancewic CIS 341: Compilers 5

Example of closure creation Recall the add function: let add = fun x -> fun y -> x + y Consider the inner function: fun y -> x + y When run the function application: add 4 the program builds a closure and returns it. The closure is a pair of the environment and a code pointer. ptr Code(env, y, body) (4) code body The code pointer takes a pair of parameters: env and y The function code is (essentially): fun (env, y) -> let x = nth env 0 in x + y CIS 341: Compilers 6

Representing Closures As we saw, the simple closure conversion algorithm doesn t generate very efficient code. It stores all the values for variables in the environment, even if they aren t needed by the function body. It copies the environment values each time a nested closure is created. It uses a linked-list datastructure for tuples. There are many options: Store only the values for free variables in the body of the closure. Share subcomponents of the environment to avoid copying Use vectors or arrays rather than linked structures CIS 341: Compilers 7

Array-based Closures with N-ary Functions (fun (x y z) -> (fun (n m) -> (fun p -> (fun q -> n + z) x) x,y,z Closure A n,m fun 2 fun 1 nil x y z p fun 0 Closure B app Note how free variables are addressed relative to the closure due to shared env. fun q Closure A nxt n m env code nxt p + Closure B env code 1,0 2,2 1,0 follow 1 nxt ptr then look up index 0 follow 2 nxt ptrs then look up index 2

Scope, Types, and Context STATIC ANALYSIS Zdancewic CIS 341: Compilers 9

Variable Scoping Consider the problem of determining whether a programmer-declared variable is in scope. Issues: Which variables are available at a given point in the program? Shadowing is it permissible to re-use the same identifier, or is it an error? Example: The following program is syntactically correct but not wellformed. (y and q are used without being defined anywhere) int fact(int x) { var acc = 1; while (x > 0) { acc = acc * y; x = q - 1; } return acc; } Q: Can we solve this problem by changing the parser to rule out such programs? Zdancewic CIS 341: Compilers 10

Contexts and Inference Rules Need to keep track of contextual information. What variables are in scope? What are their types? How do we describe this? In the compiler there's a mapping from variables to information we know about them. Zdancewic CIS 341: Compilers 11

Why Inference Rules? They are a compact, precise way of specifying language properties. E.g. ~20 pages for full Java vs. 100 s of pages of prose Java Language Spec. Inference rules correspond closely to the recursive AST traversal that implements them Type checking (and type inference) is nothing more than attempting to prove a different judgment ( G;L e : t ) by searching backwards through the rules. Compiling in a context is nothing more than a collection of inference rules specifying yet a different judgment ( G src target ) Moreover, the compilation judgment is similar to the typechecking judgment Strong mathematical foundations The Curry-Howard correspondence : Programming Language ~ Logic, Program ~ Proof, Type ~ Proposition See CIS 500 next Fall if you re interested in type systems! CIS 341: Compilers 12

Inference Rules We can read a judgment G;L e : t as the expression e is well typed and has type t For any environment G, expression e, and statements s 1, s 2. G;L;rt if (e) s 1 else s 2 holds if G ;L e : bool and G;L;rt s 1 and G;L;rt s 2 all hold. More succinctly: we summarize these constraints as an inference rule: Premises Conclusion G;L e : bool G;L;rt s 1 G;L;rt s 2 G;L;rt if (e) s 1 else s 2 This rule can be used for any substitution of the syntactic metavariables G, e, s 1 and s 2. CIS 341: Compilers 13

Checking Derivations A derivation or proof tree has (instances of) judgments as its nodes and edges that connect premises to a conclusion according to an inference rule. Leaves of the tree are axioms (i.e. rules with no premises) Example: the INT rule is an axiom Goal of the type checker: verify that such a tree exists. Example1: Find a tree for the following program using the inference rules in oat0-defn.pdf: var x1 = 0; var x2 = x1 + x1; x1 = x1 x2; return(x1); Example2: There is no tree for this ill-scoped program: var x2 = x1 + x1; return(x2); CIS 341: Compilers 14

Example Derivation var x1 = 0; var x2 = x1 + x1; x1 = x1 x2; return(x1); Zdancewic CIS 341: Compilers 15

Example Derivation Zdancewic CIS 341: Compilers 16

Example Derivation D 4 = x 1 :int 2, x 1 :int, x 2 :int G 0 ;, x 1 :int, x 2 :int ` x 1 : int [var] G 0 ;, x 1 :int, x 2 :int;int ` return x 1 ; ), x 1 :int, x 2 :int [Ret] Zdancewic CIS 341: Compilers 17

Why Inference Rules? They are a compact, precise way of specifying language properties. E.g. ~20 pages for full Java vs. 100 s of pages of prose Java Language Spec. Inference rules correspond closely to the recursive AST traversal that implements them Compiling in a context is nothing more an interpretation of the inference rules that specify typechecking*: C e : t Compilation follows the typechecking judgment Strong mathematical foundations The Curry-Howard correspondence : Programming Language ~ Logic, Program ~ Proof, Type ~ Proposition See CIS 500 next Fall if you re interested in type systems! *Here (and later) we ll write context C for G;L, the combination of the CIS 341: Compilers global and local contexts. 18

Compilation As Translating Judgments Consider the source typing judgment for source expressions: C e : t How do we interpret this information in the target language? C e : t =? t is a target type e translates to a (potentially empty) sequence of instructions, that, when run, computes the result into some operand INVARIANT: if C e : t = ty, operand, stream then the type (at the target level) of the operand is ty= t Zdancewic CIS 341: Compilers 19

Example C 341 + 5 : int what is C 341 + 5 : int? 341 : int = (i64, Const 341, []) 5 : int = (i64, Const 5, []) ---------------------------------------- --------------------------------------- C 341 : int = (i64, Const 341, []) C 5 : int = (i64, Const 5, []) ------------------------------------------------------------------------------------------ C 341 + 5 : int = (i64, %tmp, [%tmp = add i64 (Const 341) (Const 5)]) Zdancewic CIS 341: Compilers 20

What is C? What about the Context? Source level C has bindings like: x:int, y:bool We think of it as a finite map from identifiers to types What is the interpretation of C at the target level? C maps source identifiers, x to source types and x What is the interpretation of a variable x at the target level? How are the variables used in the type system? ` x: t 2 L G;L ` x : t as expressions (which denote values) G L ` exp : t ` ) x: t 2 L G;L ` exp : t typ_var typ_assn G;L;rt ` x = exp; ) L G L ` as addresses (which can be assigned) 2 ` ` Zdancewic CIS 341: Compilers 21

Interpretation of Contexts C = a map from source identifiers to types and target identifiers INVARIANT: x:t C means that (1) lookup C x = (t, %id_x) (2) the (target) type of %id_x is t * (a pointer to t ) Zdancewic CIS 341: Compilers 22

Interpretation of Variables Establish invariant for expressions: ` x: t 2 L G;L ` x : t typ_var as expressions (which denote values) G L ` exp : t What about statements? ` ) x: t 2 L G;L ` exp : t G;L;rt ` x = exp; ) L G L ` as addresses (which can be assigned) = (%tmp, [%tmp = load i64* %id_x]) typ_assn 2 ` ` where (i64, %id_x) = lookup L x = stream @ [store t opn, t * %id_x] where (t, %id_x) = lookup L x and G;L exp : t = ( t, opn, stream) Zdancewic CIS 341: Compilers 23

Other Judgments? Statement: C; rt stmt C = C, stream Declaration: G;L t x = exp G;L,x:t = G;L,x:t, stream INVARIANT: stream is of the form: stream @ [ %id_x = alloca t ; store t opn, t * %id_x ] and G;L exp : t = ( t, opn, stream ) Rest follow similarly Zdancewic CIS 341: Compilers 24

COMPILING CONTROL Zdancewic CIS 341: Compilers 25

Translating while Consider translating while(e) s : Test the conditional, if true jump to the body, else jump to the label after the body. C;rt while(e) s C = C, lpre: opn = C e : bool %test = icmp eq i1 opn, 0 br %test, label %lpost, label %lbody lbody: C;rt s C br %lpre lpost: Note: writing opn = C e : bool is pun translating C e : bool generates code that puts the result into opn In this notation there is implicit collection of the code CIS 341: Compilers 26

Translating if-then-else Similar to while except that code is slightly more complicated because if-then-else must reach a merge and the else branch is optional. C;rt if (e 1 ) s 1 else s 2 C = C opn = C e : bool %test = icmp eq i1 opn, 0 br %test, label %else, label %then then: C;rt s 1 C br %merge else: C; rt s 2 C br %merge merge: CIS 341: Compilers 27

Instruction streams: Connecting this to Code Must include labels, terminators, and hoisted global constants Must post-process the stream into a control-flow-graph See frontend.ml from HW4 Zdancewic CIS 341: Compilers 28

OPTIMIZING CONTROL Zdancewic CIS 341: Compilers 29

Standard Evaluation Consider compiling the following program fragment: if (x &!y!w) z = 3; else z = 4; return z; %tmp1 = icmp Eq y, 0 ;!y %tmp2 = and x tmp1 %tmp3 = icmp Eq w, 0 %tmp4 = or %tmp2, %tmp3 %tmp5 = icmp Eq %tmp4, 0 br %tmp4, label %else, label %then then: store z, 3 br %merge else: store z, 4 br %merge merge: %tmp5 = load z ret %tmp5 CIS 341: Compilers 30

Observation Usually, we want the translation e to produce a value C e : t = (ty, operand, stream) e.g. C e 1 + e 2 : int = (i64, %tmp, [%tmp = add e 1 e 2 ]) But when the expression we re compiling appears in a test, the program jumps to one label or another after the comparison but otherwise never uses the value. In many cases, we can avoid materializing the value (i.e. storing it in a temporary) and thus produce better code. This idea also lets us implement different functionality too: e.g. short-circuiting boolean expressions CIS 341: Compilers 31

Idea: Use a different translation for tests Usual Expression translation: C e : t = (ty, operand, stream) Conditional branch translation of booleans, without materializing the value: C e : bool@ ltrue lfalse = stream C, rt if (e) then s1 else s2 C = C, Notes: takes two extra arguments: a true branch label and a false branch label. Doesn t return a value insns 3 then: s1 br %merge else: s 2 br %merge merge: Aside: this is a form of continuation-passing translation where C, rt s 1 C = C, insns 1 C, rt s 2 C = C, insns 2 C e : bool@ then else = insns 3 CIS 341: Compilers 32

Short Circuit Compilation: Expressions C e : bool@ ltrue lfalse = insns C false : bool@ ltrue lfalse = [br %lfalse] FALSE C true : bool@ ltrue lfalse = [br %ltrue] TRUE C e : bool@ lfalse ltrue = insns C!e : bool@ ltrue lfalse = insns NOT Zdancewic CIS 341: Compilers 33

Short Circuit Evaluation Idea: build the logic into the translation C e1 : bool@ ltrue right = insns 1 C e2 : bool@ ltrue lfalse = insns 2 C e1 e2 : bool@ ltrue lfalse = insns 1 right: insn 2 C e1 : bool@ right lfalse = insns 1 C e2 : bool@ ltrue lfalse = insns 2 C e1&e2 : bool@ ltrue lfalse = insns 1 right: insn 2 where right is a fresh label Zdancewic CIS 341: Compilers 34

Short-Circuit Evaluation Consider compiling the following program fragment: if (x &!y!w) z = 3; else z = 4; return z; %tmp1 = icmp Eq x, 0 br %tmp1, label %right2, label %right1 right1: %tmp2 = icmp Eq y, 0 br %tmp2, label %then, label %right2 right2: %tmp3 = icmp Eq w, 0 br %tmp3, label %then, label %else then: store z, 3 br %merge else: store z, 4 br %merge merge: %tmp5 = load z ret %tmp5 CIS 341: Compilers 35