CSE341: Programming Languages Lecture 17 Implementing Languages Including Closures. Dan Grossman Autumn 2018


Typical workflow

concrete syntax (string): "(fn x => x + x) 4"
    |  Parsing (possible errors / warnings)
    v
abstract syntax (tree):
    Call
      Function
        x
        +
          Var x
          Var x
      Constant 4
    |  Type checking? (possible errors / warnings)
    v
Rest of implementation

Interpreter or compiler

So the rest of the implementation takes the abstract syntax tree (AST) and runs the program to produce a result

Fundamentally, two approaches to implement a PL B:
- Write an interpreter in another language A
  - Better names: evaluator, executor
  - Take a program in B and produce an answer (in B)
- Write a compiler in another language A to a third language C
  - Better name: translator
  - Translation must preserve meaning (equivalence)

We call A the metalanguage
- Crucial to keep A and B straight

Reality more complicated

Evaluation (interpreter) and translation (compiler) are your options
- But in modern practice have both and multiple layers

A plausible example:
- Java compiler to bytecode intermediate language
- Have an interpreter for bytecode (itself in binary), but compile frequent functions to binary at run-time
- The chip is itself an interpreter for binary
  - Well, except these days the x86 has a translator in hardware to more primitive micro-operations it then executes

DrRacket uses a similar mix

Sermon

Interpreter versus compiler versus combinations is about a particular language implementation, not the language definition
- So there is no such thing as a "compiled language" or an "interpreted language"
- Programs cannot see how the implementation works

Unfortunately, you often hear such phrases
- "C is faster because it's compiled and LISP is interpreted"
- This is nonsense; politely correct people
- (Admittedly, languages with eval must ship with some implementation of the language in each program)


Skipping parsing

If implementing PL B in PL A, we can skip parsing
- Have B programmers write ASTs directly in PL A
- Not so bad with ML constructors or Racket structs
- Embeds B programs as trees in A

    Call
      Function
        x
        +
          Var x
          Var x
      Constant 4

    ; define B's abstract syntax
    (struct call ...)
    (struct function ...)
    (struct var ...)

    ; example B program
    (call (function (list "x")
                    (add (var "x") (var "x")))
          (const 4))

Already did an example!

- Let the metalanguage A = Racket
- Let the language-implemented B = "Arithmetic Language"
- Arithmetic programs written with calls to Racket constructors
- The interpreter is eval-exp

    (struct const (int) #:transparent)
    (struct negate (e) #:transparent)
    (struct add (e1 e2) #:transparent)
    (struct multiply (e1 e2) #:transparent)

    (define (eval-exp e)
      (cond [(const? e) e]
            [(negate? e)
             (const (- (const-int
                        (eval-exp (negate-e e)))))]
            [(add? e) ...]
            [(multiply? e) ...]))

Racket data structure is Arithmetic Language program, which eval-exp runs
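The slide elides the add and multiply cases. One plausible way to fill them in is sketched below; the bodies of those two cases and the final error case are assumptions written by analogy with the negate case, not text from the slide.

```racket
#lang racket

; Abstract syntax of the Arithmetic Language (from the slide)
(struct const (int) #:transparent)
(struct negate (e) #:transparent)
(struct add (e1 e2) #:transparent)
(struct multiply (e1 e2) #:transparent)

; Interpreter: takes an Arithmetic Language AST and returns a const
(define (eval-exp e)
  (cond [(const? e) e] ; a constant is already a value
        [(negate? e)
         (const (- (const-int (eval-exp (negate-e e)))))]
        ; assumed bodies for the two elided cases
        [(add? e)
         (const (+ (const-int (eval-exp (add-e1 e)))
                   (const-int (eval-exp (add-e2 e)))))]
        [(multiply? e)
         (const (* (const-int (eval-exp (multiply-e1 e)))
                   (const-int (eval-exp (multiply-e2 e)))))]
        [else (error "eval-exp expected an exp")]))

; Example program: (-(3 + 4)) * 2
(define example
  (multiply (negate (add (const 3) (const 4))) (const 2)))

(eval-exp example) ; (const -14)
```

Note that this version assumes every recursive result is a const, which holds only because this language has no other kind of value.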

What we know

- Define (abstract) syntax of language B with Racket structs
  - B called MUPL in homework
- Write B programs directly in Racket via constructors
- Implement interpreter for B as a (recursive) Racket function

Now, a subtle-but-important distinction:
- Interpreter can assume input is a legal AST for B
  - Okay to give wrong answer or inscrutable error otherwise
- Interpreter must check that recursive results are the right kind of value
  - Give a good error message otherwise

Legal ASTs

Trees the interpreter must handle are a subset of all the trees Racket allows as a dynamically typed language

    (struct const (int) #:transparent)
    (struct negate (e) #:transparent)
    (struct add (e1 e2) #:transparent)
    (struct multiply (e1 e2) #:transparent)

Can assume right types for struct fields
- const holds a number
- negate holds a legal AST
- add and multiply hold 2 legal ASTs

Illegal ASTs can crash the interpreter; this is fine

    (multiply (add (const 3) "uh-oh") (const 4))
    (negate -7)

Interpreter results

- Our interpreters return expressions, but not any expressions
  - Result should always be a value, a kind of expression that evaluates to itself
  - If not, the interpreter has a bug
- So far, only values are from const, e.g., (const 17)
- But a larger language has more values than just numbers
  - Booleans, strings, etc.
  - Pairs of values (definition of value recursive)
  - Closures

Example

See code for language that adds booleans, number-comparison, and conditionals:

    (struct bool (b) #:transparent)
    (struct eq-num (e1 e2) #:transparent)
    (struct if-then-else (e1 e2 e3) #:transparent)

What if the program is a legal AST, but evaluation of it tries to use the wrong kind of value?
- For example, "add a boolean"
- You should detect this and give an error message not in terms of the interpreter implementation
- Means checking a recursive result whenever a particular kind of value is needed
  - No need to check if any kind of value is okay
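A minimal sketch of what "checking a recursive result" might look like for this extended language. The struct definitions come from the slides; the interpreter body and the wording of the error messages are assumptions made for illustration, not the course's code.

```racket
#lang racket

(struct const (int) #:transparent)
(struct bool (b) #:transparent)
(struct add (e1 e2) #:transparent)
(struct eq-num (e1 e2) #:transparent)
(struct if-then-else (e1 e2 e3) #:transparent)

(define (eval-exp e)
  (cond [(const? e) e]
        [(bool? e) e]
        [(add? e)
         (let ([v1 (eval-exp (add-e1 e))]
               [v2 (eval-exp (add-e2 e))])
           ; numbers are required here, so check the recursive results
           (if (and (const? v1) (const? v2))
               (const (+ (const-int v1) (const-int v2)))
               (error "add applied to non-number")))]
        [(eq-num? e)
         (let ([v1 (eval-exp (eq-num-e1 e))]
               [v2 (eval-exp (eq-num-e2 e))])
           (if (and (const? v1) (const? v2))
               (bool (= (const-int v1) (const-int v2)))
               (error "eq-num applied to non-number")))]
        [(if-then-else? e)
         (let ([v-test (eval-exp (if-then-else-e1 e))])
           ; only the test must be a boolean; either branch may
           ; produce any kind of value, so no check is needed there
           (if (bool? v-test)
               (if (bool-b v-test)
                   (eval-exp (if-then-else-e2 e))
                   (eval-exp (if-then-else-e3 e)))
               (error "if-then-else applied to non-boolean")))]
        [else (error "eval-exp expected an exp")]))

; A legal AST that uses values correctly
(define ok-program
  (if-then-else (eq-num (const 3) (add (const 1) (const 2)))
                (const 10)
                (const 20)))

(eval-exp ok-program) ; (const 10)
```

With this sketch, a legal AST such as (add (const 1) (bool #t)) raises the "add applied to non-number" error instead of failing inside Racket's + with a message about the interpreter's internals.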

Dealing with variables

- Interpreters so far have been for languages without variables
  - No let-expressions, functions-with-arguments, etc.
  - Language in homework has all these things
- This segment describes in English what to do
  - Up to you to translate this to code
- Fortunately, what you have to implement is what we have been stressing since the very, very beginning of the course

Dealing with variables

- An environment is a mapping from variables (Racket strings) to values (as defined by the language)
  - Only ever put pairs of strings and values in the environment
- Evaluation takes place in an environment
  - Environment passed as argument to interpreter helper function
  - A variable expression looks up the variable in the environment
  - Most subexpressions use same environment as outer expression
  - A let-expression evaluates its body in a larger environment

The Set-up

So now a recursive helper function has all the interesting stuff:

    (define (eval-under-env e env)
      (cond ... ; case for each kind of expression
            ))

- Recursive calls must pass down correct environment

Then eval-exp just calls eval-under-env with same expression and the empty environment

On homework, environments themselves are just Racket lists containing Racket pairs of a string (the MUPL variable name, e.g., "x") and a MUPL value (e.g., (int 17))
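To make the set-up concrete, here is a small sketch for a toy language with constants, addition, variables, and let-expressions. The struct names var and let-exp and the helper envlookup are assumptions chosen for this illustration; the homework language (MUPL) has its own struct definitions, and this is not its solution.

```racket
#lang racket

(struct const (int) #:transparent)
(struct add (e1 e2) #:transparent)
(struct var (name) #:transparent)            ; name is a Racket string
(struct let-exp (name e body) #:transparent) ; hypothetical let-expression

; An environment is a Racket list of (string . value) pairs.
; The first match wins, so newer bindings shadow older ones.
(define (envlookup env name)
  (cond [(null? env) (error "unbound variable during evaluation" name)]
        [(equal? (car (car env)) name) (cdr (car env))]
        [else (envlookup (cdr env) name)]))

(define (eval-under-env e env)
  (cond [(const? e) e]
        ; a variable expression looks up the variable
        [(var? e) (envlookup env (var-name e))]
        ; subexpressions of add use the same environment
        [(add? e)
         (let ([v1 (eval-under-env (add-e1 e) env)]
               [v2 (eval-under-env (add-e2 e) env)])
           (if (and (const? v1) (const? v2))
               (const (+ (const-int v1) (const-int v2)))
               (error "add applied to non-number")))]
        ; a let-expression evaluates its body in a larger environment
        [(let-exp? e)
         (let ([v (eval-under-env (let-exp-e e) env)])
           (eval-under-env (let-exp-body e)
                           (cons (cons (let-exp-name e) v) env)))]
        [else (error "eval-under-env expected an exp")]))

; eval-exp starts evaluation in the empty environment
(define (eval-exp e) (eval-under-env e null))

; let x = 3 in x + x
(eval-exp (let-exp "x" (const 3) (add (var "x") (var "x")))) ; (const 6)
```

Consing the new binding onto the front of the list leaves the old environment untouched, which is what lets different subexpressions share list tails.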

A grading detail

- Stylistically eval-under-env would be a helper function one could define locally inside eval-exp
- But do not do this on your homework
  - We have grading tests that call eval-under-env directly, so we need it at top-level

The best part

- The most interesting and mind-bending part of the homework is that the language being implemented has first-class closures
  - With lexical scope of course
- Fortunately, what you have to implement is what we have been stressing since we first learned about closures

Higher-order functions

The "magic": How do we use the "right environment" for lexical scope when functions may return other functions, store them in data structures, etc.?

Lack of magic: The interpreter uses a closure data structure (with two parts) to keep the environment it will need to use later

    (struct closure (env fun) #:transparent)

Evaluate a function expression:
- A function is not a value; a closure is a value
  - Evaluating a function returns a closure
- Create a closure out of (a) the function and (b) the current environment when the function was evaluated

Evaluate a function call:

Function calls

    (call e1 e2)

- Use current environment to evaluate e1 to a closure
  - Error if result is a value that is not a closure
- Use current environment to evaluate e2 to a value
- Evaluate closure's function's body in the closure's environment, extended to:
  - Map the function's argument-name to the argument-value
  - And for recursion, map the function's name to the whole closure

This is the same semantics we learned a few weeks ago "coded up"

Given a closure, the code part is only ever evaluated using the environment part (extended), not the environment at the call-site
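The rules above can be sketched as follows. Only the closure struct is taken from the slides. The other struct names (fun, call, var, let-exp), their fields, the use of #f for an anonymous function's name, and the choice to let the argument binding shadow the function's own name are all assumptions made for this illustration, not the homework's definitions.

```racket
#lang racket

(struct const (int) #:transparent)
(struct add (e1 e2) #:transparent)
(struct var (name) #:transparent)
(struct let-exp (name e body) #:transparent)
(struct fun (name formal body) #:transparent) ; name: string, or #f if anonymous
(struct call (funexp actual) #:transparent)
(struct closure (env fun) #:transparent)      ; as on the slide

(define (envlookup env name)
  (cond [(null? env) (error "unbound variable during evaluation" name)]
        [(equal? (car (car env)) name) (cdr (car env))]
        [else (envlookup (cdr env) name)]))

(define (eval-under-env e env)
  (cond [(const? e) e]
        [(closure? e) e]
        [(var? e) (envlookup env (var-name e))]
        [(add? e)
         (let ([v1 (eval-under-env (add-e1 e) env)]
               [v2 (eval-under-env (add-e2 e) env)])
           (if (and (const? v1) (const? v2))
               (const (+ (const-int v1) (const-int v2)))
               (error "add applied to non-number")))]
        [(let-exp? e)
         (let ([v (eval-under-env (let-exp-e e) env)])
           (eval-under-env (let-exp-body e)
                           (cons (cons (let-exp-name e) v) env)))]
        ; a function evaluates to a closure holding the current environment
        [(fun? e) (closure env e)]
        [(call? e)
         (let ([c (eval-under-env (call-funexp e) env)]
               [arg (eval-under-env (call-actual e) env)])
           (if (closure? c)
               (let* ([f (closure-fun c)]
                      ; extend the closure's environment, not env
                      [env1 (if (fun-name f)
                                (cons (cons (fun-name f) c) (closure-env c))
                                (closure-env c))]
                      [env2 (cons (cons (fun-formal f) arg) env1)])
                 (eval-under-env (fun-body f) env2))
               (error "call applied to non-closure")))]
        [else (error "eval-under-env expected an exp")]))

(define (eval-exp e) (eval-under-env e null))

; let x = 1 in let f = (fn y => x + y) in let x = 100 in f 5
(define scope-test
  (let-exp "x" (const 1)
    (let-exp "f" (fun #f "y" (add (var "x") (var "y")))
      (let-exp "x" (const 100)
        (call (var "f") (const 5))))))

(eval-exp scope-test) ; (const 6): f's body sees x = 1, not the call-site's 100
```

Had the call case extended env (the call-site environment) instead of (closure-env c), the example would compute 105, which is dynamic scope rather than lexical scope.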

Is that expensive?

- Time to build a closure is tiny: a struct with two fields
- Space to store closures might be large if environment is large
  - But environments are immutable, so natural and correct to have lots of sharing, e.g., of list tails (cf. lecture 3)
  - Still, end up keeping around bindings that are not needed
- Alternative used in practice: When creating a closure, store a possibly-smaller environment holding only the variables that are free variables in the function body
  - Free variables: Variables that occur, not counting shadowed uses of the same variable name
  - A function body would never need anything else from the environment

Free variables examples

    (lambda () (+ x y z))                      ; {x, y, z}
    (lambda (x) (+ x y z))                     ; {y, z}
    (lambda (x) (if x y z))                    ; {y, z}
    (lambda (x) (let ([y 0]) (+ x y z)))       ; {z}
    (lambda (x y z) (+ x y z))                 ; {}
    (lambda (x) (+ y (let ([y z]) (+ y y))))   ; {y, z}

Computing free variables

- So does the interpreter have to analyze the code body every time it creates a closure?
- No: Before evaluation begins, compute free variables of every function in program and store this information with the function
- Compared to naïve store-entire-environment approach, building a closure now takes more time but less space
  - And time proportional to number of free variables
  - And various optimizations are possible
- [Also use a much better data structure for looking up variables than a list]
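As an illustration, a free-variable computation over a small AST might look like the sketch below. The struct names and the use of Racket's immutable sets are assumptions for this example. A full implementation would also store each function's result alongside the function before evaluation begins (for instance in an extra struct field), which this sketch leaves out.

```racket
#lang racket

(struct const (int) #:transparent)
(struct add (e1 e2) #:transparent)
(struct var (name) #:transparent)
(struct let-exp (name e body) #:transparent)
(struct fun (name formal body) #:transparent) ; name: string, or #f if anonymous
(struct call (funexp actual) #:transparent)

; Returns the set of variable names that occur free in e
(define (free-vars e)
  (cond [(const? e) (set)]
        [(var? e) (set (var-name e))]
        [(add? e)
         (set-union (free-vars (add-e1 e)) (free-vars (add-e2 e)))]
        [(call? e)
         (set-union (free-vars (call-funexp e))
                    (free-vars (call-actual e)))]
        [(let-exp? e)
         ; the bound name is not free in the body,
         ; but it may still be free in the bound expression
         (set-union (free-vars (let-exp-e e))
                    (set-remove (free-vars (let-exp-body e))
                                (let-exp-name e)))]
        [(fun? e)
         ; the argument and the function's own name are bound in the body
         (let ([body-fvs (set-remove (free-vars (fun-body e))
                                     (fun-formal e))])
           (if (fun-name e)
               (set-remove body-fvs (fun-name e))
               body-fvs))]
        [else (error "free-vars expected an exp")]))

; Analogue of (lambda (x) (+ y (let ([y z]) (+ y y)))), free variables {y, z}
(define example
  (fun #f "x"
       (add (var "y")
            (let-exp "y" (var "z")
                     (add (var "y") (var "y"))))))

(free-vars example) ; a set containing "y" and "z"
```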

Optional: compiling higher-order functions

- If we are compiling to a language without closures (like assembly), cannot rely on there being a "current environment"
- So compile functions by having the translation produce "regular" functions that all take an extra explicit argument called "environment"
- And compiler replaces all uses of free variables with code that looks up the variable using the environment argument
  - Can make these fast operations with some tricks
- Running program still creates closures and every function call passes the closure's environment to the closure's code
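To give a feel for the output of such a translation, here is a hand-written sketch in Racket itself, pretending Racket had no closures. The names clos, adder-code, and call-clos, and the choice of a vector as the environment, are all hypothetical. A real compiler would emit code like this in its target language rather than in Racket.

```racket
#lang racket

; Source program, which relies on built-in closures:
;   (define (make-adder n) (lambda (x) (+ x n)))

; Hypothetical run-time representation of a closure: code plus environment
(struct clos (code env))

; Translated code for (lambda (x) (+ x n)).
; It is a "regular" function with no free variables of its own:
; the free variable n is read from the explicit environment argument.
(define (adder-code env x)
  (+ x (vector-ref env 0))) ; slot 0 holds n

; Creating the function now builds a closure holding only its free variables
(define (make-adder n)
  (clos adder-code (vector n)))

; Every call passes the closure's environment to the closure's code
(define (call-clos c arg)
  ((clos-code c) (clos-env c) arg))

(call-clos (make-adder 3) 4) ; 7
```

Looking up n by a fixed vector index instead of by name is one example of the kind of trick that makes these lookups fast.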

Recall

Our approach to language implementation:
- Implementing language B in language A
- Skipping parsing by writing language B programs directly in terms of language A constructors
- An interpreter written in A recursively evaluates

What we know about macros:
- Extend the syntax of a language
- Use of a macro expands into language syntax before the program is run, i.e., before calling the main interpreter function

Put it together

With our set-up, we can use language A (i.e., Racket) functions that produce language B abstract syntax as language B "macros"
- Language B programs can use the "macros" as though they are part of language B
- No change to the interpreter or struct definitions
- Just a programming idiom enabled by our set-up
  - Helps teach what macros are
- See code for example "macro" definitions and "macro" uses
  - "Macro expansion" happens before calling eval-exp
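A small sketch of the idiom, using the Arithmetic Language from earlier in the lecture. The "macros" double and subtract are hypothetical examples invented here, and the add and multiply cases of eval-exp are filled in by assumption.

```racket
#lang racket

(struct const (int) #:transparent)
(struct negate (e) #:transparent)
(struct add (e1 e2) #:transparent)
(struct multiply (e1 e2) #:transparent)

(define (eval-exp e)
  (cond [(const? e) e]
        [(negate? e)
         (const (- (const-int (eval-exp (negate-e e)))))]
        [(add? e)
         (const (+ (const-int (eval-exp (add-e1 e)))
                   (const-int (eval-exp (add-e2 e)))))]
        [(multiply? e)
         (const (* (const-int (eval-exp (multiply-e1 e)))
                   (const-int (eval-exp (multiply-e2 e)))))]
        [else (error "eval-exp expected an exp")]))

; "Macros": ordinary Racket functions from ASTs to ASTs.
; They run while the program is being built, before eval-exp is called.
(define (double e) (multiply e (const 2)))
(define (subtract e1 e2) (add e1 (negate e2)))

; A program written with the "macros" as if they were part of the language
(define program (subtract (double (const 5)) (const 3)))

; After "expansion", program is an AST using only the original structs:
;   (add (multiply (const 5) (const 2)) (negate (const 3)))
(eval-exp program) ; (const 7)
```

The interpreter never sees double or subtract; it only sees the structs they produce, which is why neither eval-exp nor the struct definitions need to change.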

Hygiene issues

- Earlier we had material on hygiene issues with macros
  - (Among other things), problems with shadowing variables when using local variables to avoid evaluating expressions more than once
- The "macro" approach described here does not deal well with this