CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009


Lecture notes for CS 6110 (Spring 09) taught by Andrew Myers at Cornell; edited by Amal Ahmed, Fall 09.

1 Static vs. dynamic scoping

The scope of a variable is the region of a program in which that variable can be mentioned and used. Until now we could look at a program as written and immediately determine where any variable was bound. This was possible because the λ-calculus uses static scoping (also known as lexical scoping): the places where a variable can be used are determined by the lexical structure of the program.

An alternative to static scoping is dynamic scoping, in which a variable is bound to the most recent (in time) value assigned to that variable. The difference becomes apparent when a function is applied. In static scoping, any free variables in the function body are evaluated in the context of the defining occurrence of the function, whereas in dynamic scoping, any free variables in the function body are evaluated in the context of the function call. The difference is illustrated by the following program:

    let d = 2 in
    let f = λx. x + d in
    let d = 1 in
    f 2

In ML, which uses lexical scoping, the block above evaluates to 4:

1. The outer d is bound to 2.
2. f is bound to λx. x + d. Since d is statically bound, this will always be equivalent to λx. x + 2 (the value of d cannot change, since there is no variable assignment in this language).
3. The inner d is bound to 1.
4. f 2 is evaluated using the environment in which f was defined; that is, f is evaluated with d bound to 2. We get 2 + 2 = 4.

If the block is evaluated using dynamic scoping, it evaluates to 3:

1. The outer d is bound to 2.
2. f is bound to λx. x + d. The occurrence of d in the body of f is not locked to the outer declaration of d.
3. The inner d is bound to 1.
4. f 2 is evaluated using the environment of the call, in which d is 1. We get 2 + 1 = 3.

Dynamically scoped languages are quite common, and include many interpreted scripting languages.
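The contrast can be simulated in a short Python sketch. Python stands in for the toy language here, and the representation of functions as parameter/body pairs and the helper names are illustrative, not from the notes; the only difference between the two application rules is which environment the body's free variables are looked up in.

```python
# A sketch (not from the notes) contrasting the two disciplines on the example
# above. Environments are dicts from names to values; a "function" is a dict
# holding its parameter name and a body that reads free variables from
# whatever environment it is handed.

def make_fn(param, body):
    return {"param": param, "body": body}

def apply_static(fn, arg, defining_env):
    # Static scoping: free variables resolve in the environment in which
    # the function was *defined*.
    env = dict(defining_env)
    env[fn["param"]] = arg
    return fn["body"](env)

def apply_dynamic(fn, arg, calling_env):
    # Dynamic scoping: free variables resolve in the environment in effect
    # at the *call site*.
    env = dict(calling_env)
    env[fn["param"]] = arg
    return fn["body"](env)

# let d = 2 in let f = λx. x + d in let d = 1 in f 2
outer = {"d": 2}                        # outer d = 2
f = make_fn("x", lambda env: env["x"] + env["d"])
env_at_def = dict(outer)                # environment where f is defined
inner = dict(outer)
inner["d"] = 1                          # inner d = 1 shadows the outer d

print(apply_static(f, 2, env_at_def))   # 4 = 2 + 2
print(apply_dynamic(f, 2, inner))       # 3 = 2 + 1
```

Python itself, like ML, is lexically scoped, so the dynamic behavior has to be simulated by threading the call-site environment explicitly.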
Examples of languages with dynamic scoping, in roughly chronological order, are early versions of LISP, APL, PostScript, TeX, and Perl. Early versions of Python also had dynamic scoping before the designers corrected the mistake. Dynamic scoping does have some advantages:

- Certain language features are easier to implement.
- It becomes possible to extend almost any piece of code by overriding the values of variables that are used internally by that piece.

These advantages, however, come with a price:

- Since it is impossible to determine statically which variables are accessible at a particular point in a program, the compiler cannot determine where to find the correct value of a variable, necessitating a more expensive variable lookup mechanism. With static scoping, variable accesses can be implemented more efficiently, as array accesses.
- Implicit extensibility makes it very difficult to keep code modular: the true interface of any block of code becomes the entire set of variables used by that block.

1.1 Scope and the interpretation of free variables

Scoping rules are all about how to evaluate free variables in a program fragment. With static scope, free variables of a function λx. e are interpreted according to the lexical (syntactic) context in which the term λx. e occurs. With dynamic scope, free variables of λx. e are interpreted according to the environment in effect when λx. e is applied. These are not the same in general.

We can demonstrate the difference by defining two translations S[[·]] and D[[·]] for the two scoping rules, static and dynamic. These translations convert λCBV (with the corresponding scoping rule) into uml. (Because uml already has static scoping, the static scoping translation should have no effect on the meaning of the program.)

For both translations, we use an environment to capture the interpretation of names. Here, an environment is simply a function from variables x to values:

    Environment  ρ : Var → Value

For example, the empty environment is

    ρ₀ = λx. error

because no variables are bound in the empty environment. If we wanted an environment that bound only the variable y to 2, we could represent it as a uml term as follows:

    {y ↦ 2} = λx. if x = y then 2 else error

The meaning of a language expression e is relative to the environment in which e occurs. Therefore, its meaning [[e]] is a function from an environment to the computation of a value:

    [[e]] : Environment → Expr

We obtain a target-language expression (in Expr) by applying [[e]] to some environment:

    [[e]]ρ : Expr

The translations take a term e and an environment ρ and produce a target-language expression involving values and environments that can be evaluated under the usual uml rules to produce a value. The translation of the key expressions (the ones from the λ-calculus) for static scoping follows. We use one new piece of syntactic sugar in the target language: we write ⌜x⌝ to mean a representation of the identifier x as an integer. Of course, there are many possible ways to encode an identifier as an integer, for example by thinking of the identifier as a stream of bits that are the binary representation of the integer.

    S[[x]]ρ = ρ ⌜x⌝
    S[[e₁ e₂]]ρ = (S[[e₁]]ρ) (S[[e₂]]ρ)
    S[[λx. e]]ρ = λy. S[[e]] (EXTEND ρ ⌜x⌝ y)

where (EXTEND ρ x v) adds a new binding to the environment ρ, mapping x to v:

    EXTEND = λρxv. (λy. if x = y then v else ρ y)

There are a couple of things to notice about the translation. It eliminates all of the variable names from the source program, and replaces them with new names that are bound immediately at the same level. All the lambda terms are closed, and there is no longer any role for the scoping mechanism of the target language in deciding what to do with free variables.
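A minimal sketch of this environment representation, assuming strings rather than integer codes for identifiers and a Python sentinel object in place of error:

```python
# Sketch of environments-as-functions (strings stand in for the integer
# encoding of identifiers; LOOKUP_ERROR stands in for the notes' "error").

LOOKUP_ERROR = object()

def empty_env(x):
    # ρ0 = λx. error — no variables are bound
    return LOOKUP_ERROR

def extend(rho, name, value):
    # EXTEND = λρxv. (λy. if x = y then v else ρ y)
    return lambda y: value if y == name else rho(y)

rho = extend(extend(empty_env, "y", 2), "z", 5)
print(rho("y"))                  # 2
print(rho("z"))                  # 5
print(rho("w") is LOOKUP_ERROR)  # True
```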

1.2 Dynamic scoping

We can construct a translation that captures dynamic scoping through a few small changes:

    D[[x]]ρ = ρ ⌜x⌝
    D[[λx. e]]ρ = λyρ′. D[[e]] (EXTEND ρ′ ⌜x⌝ y)    (throw out the lexical environment!)
    D[[e₁ e₂]]ρ = (D[[e₁]]ρ) (D[[e₂]]ρ) ρ

The key is that the translation of a function is no longer just a function expecting the formal parameter; the function also expects to be provided with an environment ρ′ describing the variable bindings at the call site. The translation of a λ-abstraction discards the lexical environment ρ existing at the point where the abstraction is evaluated. This makes it easier to represent functions, but creates the need to pass the dynamic environment explicitly to the function when it is called, as shown in the translation of application. Because a function can be applied in different and unpredictable locations, it is difficult in general to come up with an efficient representation of the dynamic environment.

1.3 Correctness of the static scoping translation

That static scoping is the scoping discipline of λCBV is captured in the following theorem.

Theorem. For any λCBV expression e and environment ρ, S[[e]]ρ is (β, η)-equivalent to e{ρ(y)/y, y ∈ Var}.

Proof. By structural induction on e. We write ρ[x ↦ v] for (EXTEND ρ ⌜x⌝ v).

    S[[x]]ρ = ρ(x) = x{ρ(y)/y, y ∈ Var}

    S[[e₁ e₂]]ρ = (S[[e₁]]ρ) (S[[e₂]]ρ)
                = (e₁{ρ(y)/y, y ∈ Var}) (e₂{ρ(y)/y, y ∈ Var})
                = (e₁ e₂){ρ(y)/y, y ∈ Var}

    S[[λx. e]]ρ = λv. S[[e]] ρ[x ↦ v]
                = λv. e{ρ[x ↦ v](y)/y, y ∈ Var}
                = λv. e{ρ[x ↦ v](y)/y, y ∈ Var ∖ {x}}{ρ[x ↦ v](x)/x}
                = λv. e{ρ(y)/y, y ∈ Var ∖ {x}}{v/x}
                =β λv. (λx. e{ρ(y)/y, y ∈ Var ∖ {x}}) v
                =η λx. e{ρ(y)/y, y ∈ Var ∖ {x}}
                = (λx. e){ρ(y)/y, y ∈ Var}

The pairing of a function λx. e with an environment ρ is called a closure. The theorem above says that S[[·]] can be implemented by forming a closure consisting of the term e and an environment ρ that determines how to interpret the free variables of e. By contrast, in dynamic scoping, the translated function does not record the lexical environment, so closures are not needed.
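Read operationally, the two translations become two small interpreters that differ only in the λ and application cases. The sketch below uses a hypothetical tuple encoding of terms, with literals and addition added so the example from Section 1 can run; it is illustrative, not the notes' own construction.

```python
# Sketch of the two translations, run directly as interpreters. Terms:
#   ("var", x) | ("lam", x, e) | ("app", e1, e2) | ("lit", n) | ("add", e1, e2)

def S(e, rho):
    # Static scoping: λ closes over rho, the environment at its definition.
    tag = e[0]
    if tag == "var": return rho(e[1])
    if tag == "lit": return e[1]
    if tag == "add": return S(e[1], rho) + S(e[2], rho)
    if tag == "lam":
        x, body = e[1], e[2]
        return lambda v: S(body, lambda y: v if y == x else rho(y))
    if tag == "app": return S(e[1], rho)(S(e[2], rho))

def D(e, rho):
    # Dynamic scoping: λ discards rho; the caller passes its own
    # environment at every application.
    tag = e[0]
    if tag == "var": return rho(e[1])
    if tag == "lit": return e[1]
    if tag == "add": return D(e[1], rho) + D(e[2], rho)
    if tag == "lam":
        x, body = e[1], e[2]
        return lambda v, rho2: D(body, lambda y: v if y == x else rho2(y))
    if tag == "app": return D(e[1], rho)(D(e[2], rho), rho)

def err(y):
    raise KeyError(y)  # empty environment: every lookup is an error

# let d = 2 in let f = λx. x + d in let d = 1 in f 2, desugared into λ:
prog = ("app",
        ("lam", "d",
         ("app",
          ("lam", "f",
           ("app",
            ("lam", "d", ("app", ("var", "f"), ("lit", 2))),
            ("lit", 1))),
          ("lam", "x", ("add", ("var", "x"), ("var", "d"))))),
        ("lit", 2))

print(S(prog, err))  # 4 under static scoping
print(D(prog, err))  # 3 under dynamic scoping
```

Under S, a λ-term closes over the environment at its definition, which is exactly the closure formation described above; under D, that environment is thrown away and the caller's environment is supplied at each application.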

CSCI B522 Lecture 11 Recursive Names and Modules 8 Oct, 2009

Lecture notes for CS 6110 (Spring 09) taught by Andrew Myers at Cornell; edited by Amal Ahmed, Fall 09.

1 Introduction

In the last two lectures, we studied the static and dynamic approaches to variable scoping. These scoping disciplines are mechanisms for binding names to values so that the values can later be retrieved by their assigned names. Both static and dynamic naming strategies are hierarchical, in the sense that variables enter and leave scope according to the program's abstract syntax tree (in the case of static scoping) or the tree of function calls (in dynamic scoping). This dependence on hierarchy can prove restrictive or inflexible for writing certain kinds of programs. In such cases we might want a more liberal naming discipline that is independent of any hierarchy induced by the syntactic structure or call structure.

Such a non-hierarchical scoping structure is provided by modules. A module is like a software black box with its own local namespace. It can export resources as a set of names without revealing its internal composition. Thus the names that can be used at a certain point in a program need not come down from above but can also be names exported by modules.

Good programming practice encourages modularity, especially in the construction of large systems. Programs should be composed of discrete components that communicate with one another along small, well-defined interfaces and are reusable. Modules are consistent with this idea. Each module can treat the others as black boxes; that is, it knows nothing about what is inside another module except what is revealed by its interface.

Early programming languages had one global namespace, in which the names of all functions in source files and libraries were visible to all parts of the program. This was the approach of FORTRAN and C, for example. There are certain problems with this:

- The various parts of the program can become tightly coupled. In other words, the global namespace does not enforce the modularity of the program. Replacing any particular part of the program with an enhanced equivalent can require a lot of effort.
- Undesired name collisions occur frequently, since names inevitably tend to coincide.
- In very large programs, it is difficult to come up with unique names, so names tend to become non-mnemonic and hard to remember.

A solution to this problem is for the language to provide a module mechanism that allows related functions, values, and types to be grouped together into a common context. This allows programmers to create a local namespace, thus minimizing naming conflicts. Examples of modules are classes in Java and C++, packages in Java, and structures in ML. Given a module, we can access the variables in it by qualifying the variable names with the name of the module, or we can import the whole namespace of the module into our code, so we can use the module's names as if they had been declared locally.

2 Modules

A module is a collection of named things (such as values, functions, and types) that are somehow related to one another. The programmer can choose which names are public (exported to other parts of the program) and which are private (inaccessible outside the module).

There are usually two ways to access the contents of a module. The first is with a selector expression, in which the name of the module is prefixed to the variable in a certain way. For instance, we write m.x in Java and m::x in C++ to refer to the entity with name x in the module m. The second is to use an expression that brings names from a module into scope for a section of code. For example, we can write something like with m do e, which means that x can be used in the block of code e without prefixing it with m. In ML, for instance,

the declaration open List brings names from the module List into scope. In C++ we write using namespace ⟨module name⟩; and in Java we write import ⟨module name⟩; for similar purposes.

Another issue is whether modules are first-class or second-class objects. First-class objects are entities that can be passed to a function as arguments, bound to variables, or returned from functions. In ML, modules are not first-class objects, whereas in Java, modules can be treated as first-class objects using the reflection mechanism. While first-class treatment of modules increases the flexibility of a language, it usually requires some extra overhead at run time.

3 A simple module mechanism

We now extend uml, our simple ML-like language, to support modules. We call the new language uml+m to denote that it supports modules. There must be some values that we can use as names with an equality test. The syntax of the new language is:

    e ::= ...
        | module (x₁ = e₁, ..., xₙ = eₙ)    (module definition)
        | eₘ.e                              (selector expression)
        | with eₘ e                         (bringing into scope)
        | lookup-error                      (error)

We now want to define a translation from uml+m to uml.¹ To do this, we notice that a module is really just an environment, since it is a mapping from names to values. Here is a translation of the module definition:

    [[module (x₁ = e₁, x₂ = e₂, ..., xₙ = eₙ)]]ρ =
        λx. if x = ⌜x₁⌝ then [[e₁]]ρ
            else if x = ⌜x₂⌝ then [[e₂]]ρ
            else ...
            if x = ⌜xₙ⌝ then [[eₙ]]ρ
            else lookup-error

The above is one possible translation. Note that ρ is passed as the environment to the translations of e₁, ..., eₙ. This has an important consequence: variables defined within the module are not visible within the initialization expressions of other variables in the module. For instance, in the above translation, we cannot refer to any of the xᵢ within any of the eᵢ.
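As a sketch, the non-recursive translation can be run directly, with strings for names, a sentinel for lookup-error, and initialization expressions represented as functions of the outer environment (all illustrative encodings, not the notes' own syntax):

```python
# Sketch of the non-recursive module translation: a module is just an
# environment, i.e. a function from exported names to values.

LOOKUP_ERROR = object()

def module(bindings, rho):
    # [[module (x1 = e1, ..., xn = en)]]ρ: each initializer runs in the
    # *outer* ρ, so members cannot see their siblings.
    def lookup(x):
        for name, init in bindings:
            if name == x:
                return init(rho)
        return LOOKUP_ERROR
    return lookup

outer = lambda x: 10 if x == "base" else LOOKUP_ERROR
m = module([("a", lambda rho: rho("base") + 1),
            ("b", lambda rho: 2)], outer)
print(m("a"))                  # 11: 'a' sees the outer 'base'
print(m("b"))                  # 2
print(m("c") is LOOKUP_ERROR)  # True: 'c' is not exported by the module
```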
Nor does it seem possible to use the resulting environment within itself for the purpose of accessing the module variables, since this leads to circularity problems. However, we could translate module using techniques from the translation of letrec. The environment that results from the translation should be the same one that is used within the module. It can be found by taking the fixed point of this function:

    λρ. λx. if x = ⌜x₁⌝ then [[e₁]]ρ
            else if x = ⌜x₂⌝ then [[e₂]]ρ
            else ...
            if x = ⌜xₙ⌝ then [[eₙ]]ρ
            else lookup-error

This works fine if the eᵢ are functions. However, it is not clear how to handle variables whose initialization expressions reference one another. For instance, consider

    module (x₁ = x₂, x₂ = x₁)

¹ Actually our translation is to the target language uml+s+lookup-error, a simple extension of uml that contains equality operators for strings and an extra token called lookup-error, which is returned when a variable name is not found in a module.
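The letrec-style fixed point can be sketched by tying the knot with plain self-reference instead of an explicit fixed-point operator (a swapped-in implementation technique; strings for names and a sentinel for lookup-error remain illustrative encodings). As the notes observe, this works when the initializers are functions, while a definition like module (x₁ = x₂, x₂ = x₁) would simply fail to terminate.

```python
# Sketch of the recursive variant: the module's own environment is the
# fixed point, obtained here by letting rho pass itself to each initializer.

LOOKUP_ERROR = object()

def rec_module(bindings):
    def rho(x):
        # rho hands *itself* to the chosen initializer, tying the knot
        for name, init in bindings:
            if name == x:
                return init(rho)
        return LOOKUP_ERROR
    return rho

# Mutually recursive function members work:
m = rec_module([
    ("even", lambda rho: lambda n: True if n == 0 else rho("odd")(n - 1)),
    ("odd",  lambda rho: lambda n: False if n == 0 else rho("even")(n - 1)),
])
print(m("even")(10))  # True
print(m("odd")(10))   # False
# By contrast, ("x1", lambda rho: rho("x2")) with ("x2", lambda rho: rho("x1"))
# would recurse forever, mirroring the circularity problem above.
```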

Java and C++ avoid this problem by allowing functions within a class to call any other function within the class, while initialization expressions of variables may refer only to variables declared earlier in the class. In terms of our translation, these languages start by inductively building up an environment for variable bindings, then use that environment as the starting point for computing the fixed point.

The remainder of the translation is as follows:

    [[eₘ.e]]ρ = ([[eₘ]]ρ) ([[e]]ρ)
    [[with eₘ e]]ρ = [[e]] (MERGE ρ ([[eₘ]]ρ))

    MERGE ρ ρ′ = λx. let y = ρ′ x in if y = lookup-error then ρ x else y

In the translation of [[eₘ.e]], presumably [[e]]ρ evaluates to a name. Note that these translations are really functions of environments. That is, the translation above for [[eₘ.e]] can be written:

    [[eₘ.e]] = λρ. ([[eₘ]]ρ) ([[e]]ρ)
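A sketch of MERGE on top of the module-as-environment idea (same illustrative encodings as before: strings for names, a sentinel for lookup-error): the module's bindings are tried first, and lookup falls back to the enclosing environment only on lookup-error.

```python
# Sketch of MERGE and the with-translation.

LOOKUP_ERROR = object()

def merge(rho, rho_m):
    # MERGE ρ ρ' = λx. let y = ρ' x in if y = lookup-error then ρ x else y
    def lookup(x):
        y = rho_m(x)
        return rho(x) if y is LOOKUP_ERROR else y
    return lookup

outer = lambda x: {"d": 2, "k": 7}.get(x, LOOKUP_ERROR)
mod = lambda x: {"d": 100}.get(x, LOOKUP_ERROR)  # the module environment

env = merge(outer, mod)  # the environment used inside "with m e"
print(env("d"))          # 100: the module's d shadows the outer d
print(env("k"))          # 7: falls back to the enclosing environment
print(mod("d"))          # 100: the selector m.d is just application
```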