CS558 Programming Languages

Similar documents
CS558 Programming Languages

CS558 Programming Languages

CS558 Programming Languages. Winter 2013 Lecture 3

CS558 Programming Languages Winter 2018 Lecture 4a. Andrew Tolmach Portland State University

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

Programming Languages Third Edition. Chapter 7 Basic Semantics

The SPL Programming Language Reference Manual

Informal Semantics of Data. semantic specification names (identifiers) attributes binding declarations scope rules visibility

CS558 Programming Languages

In Java we have the keyword null, which is the value of an uninitialized reference type

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

CS321 Languages and Compiler Design I Winter 2012 Lecture 13

Semantic Analysis and Type Checking

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

CS558 Programming Languages

Short Notes of CS201

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

CS558 Programming Languages

CS201 - Introduction to Programming Glossary By

CSE 307: Principles of Programming Languages

Chapter 5. Names, Bindings, and Scopes

Semantic Analysis. Lecture 9. February 7, 2018

The PCAT Programming Language Reference Manual

The role of semantic analysis in a compiler

CS 430 Spring Mike Lam, Professor. Data Types and Type Checking

CPSC 3740 Programming Languages University of Lethbridge. Data Types

Programming Languages Third Edition. Chapter 10 Control II Procedures and Environments

CSCE 314 Programming Languages. Type System


G Programming Languages - Fall 2012

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

Attributes, Bindings, and Semantic Functions Declarations, Blocks, Scope, and the Symbol Table Name Resolution and Overloading Allocation, Lifetimes,

Lecture Notes on Memory Management

Topics Covered Thus Far CMSC 330: Organization of Programming Languages

CS 360 Programming Languages Interpreters

1 Lexical Considerations

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Lecture 13. Notation. The rules. Evaluation Rules So Far

THE GOOD, BAD AND UGLY ABOUT POINTERS. Problem Solving with Computers-I

Lexical Considerations

Weeks 6&7: Procedures and Parameter Passing

Programmin Languages/Variables and Storage

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

Variables. Substitution

Class Information ANNOUCEMENTS

Outline. Introduction Concepts and terminology The case for static typing. Implementing a static type system Basic typing relations Adding context

CMSC 330: Organization of Programming Languages

CS321 Languages and Compiler Design I. Winter 2012 Lecture 2

CS558 Programming Languages

G Programming Languages - Fall 2012

CS558 Programming Languages. Winter 2013 Lecture 4

Functional Programming. Pure Functional Programming

Lecture Notes on Memory Management

Lexical Considerations

Names, Scope, and Bindings

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Topics Covered Thus Far. CMSC 330: Organization of Programming Languages. Language Features Covered Thus Far. Programming Languages Revisited

Chapter 5 Names, Binding, Type Checking and Scopes

CMSC 330: Organization of Programming Languages. Memory Management and Garbage Collection

CSE 504. Expression evaluation. Expression Evaluation, Runtime Environments. One possible semantics: Problem:

Lecture 2: SML Basics

Names, Bindings, Scopes

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

CMSC 330: Organization of Programming Languages

The Environment Model

Name, Scope, and Binding. Outline [1]

Names, Scope, and Bindings

G Programming Languages - Fall 2012

CPSC 213. Introduction to Computer Systems. Instance Variables and Dynamic Allocation. Unit 1c

CMSC 341 Lecture 2 Dynamic Memory and Pointers

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

CSC 533: Organization of Programming Languages. Spring 2005

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

CMSC 330: Organization of Programming Languages

Compiler construction

Lecture 13: Complex Types and Garbage Collection

The Environment Model. Nate Foster Spring 2018

CS321 Languages and Compiler Design I. Fall 2013 Week 8: Types Andrew Tolmach Portland State University

CMSC 330: Organization of Programming Languages

Design Issues. Subroutines and Control Abstraction. Subroutines and Control Abstraction. CSC 4101: Programming Languages 1. Textbook, Chapter 8

CPSC 213. Introduction to Computer Systems. Winter Session 2017, Term 2. Unit 1c Jan 24, 26, 29, 31, and Feb 2

Operational Semantics. One-Slide Summary. Lecture Outline

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions?

Heap, Variables, References, and Garbage. CS152. Chris Pollett. Oct. 13, 2008.

Principles of Programming Languages Topic: Scope and Memory Professor Louis Steinberg Fall 2004

Static Program Analysis Part 1 the TIP language

CSE 307: Principles of Programming Languages

The compilation process is driven by the syntactic structure of the program as discovered by the parser

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

Qualifying Exam in Programming Languages and Compilers

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

CS 241 Honors Memory

Lecture Outline. COOL operational semantics. Operational Semantics of Cool. Motivation. Notation. The rules. Evaluation Rules So Far.

Outline. Java Models for variables Types and type checking, type safety Interpretation vs. compilation. Reasoning about code. CSCI 2600 Spring

Runtime. The optimized program is ready to run What sorts of facilities are available at runtime

Operating Systems CMPSCI 377, Lec 2 Intro to C/C++ Prashant Shenoy University of Massachusetts Amherst

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

Project Compiler. CS031 TA Help Session November 28, 2011

Transcription:

CS558 Programming Languages Fall 2016 Lecture 3a Andrew Tolmach Portland State University 1994-2016

Formal Semantics Goal: rigorous and unambiguous definition in terms of a wellunderstood formalism (e.g. logic, set theory, other math) Goal: independence from implementation Can be used as basis for: correct implementations program verification language comparison language design (usually not good for learning a language)

Varieties of Formal Semantics Axiomatic Programs are modeled as transformers on states described using logic formulas Operational Program execution is modeled as a machine taking steps or as sequence of rewriting transformations Denotational Programs are modeled as functions in an abstract mathematical domain

Semantics and Erroneous Programs Important part of language specification is distinguishing valid from invalid programs Useful to define three classes of errors that make programs invalid Of course, even valid programs may not act as the programmer intended!

Static Errors Static errors can be detected before the program is run (at compile or pre-interpretation time) Includes lexical errors, syntactic errors, type errors, etc. Error checker can give precise feedback about erroneous location in source code Language semantics are usually defined only for programs that have no static errors

Checked Runtime Errors Checked runtime errors are violations that the language implementation is required to detect and report at runtime, in a clean way. E.g. in Scala or Java: division by 0, array bounds violations, dereferencing a null pointer Depending on language, might cause an error message + abort, or raise an exception (which in principle can be caught by program) Language semantics must specify what runtime errors are checked and how

Unchecked Runtime Errors Unchecked runtime errors are violations that the implementation does not have to detect. Subsequent behavior of the computation is arbitrary (language semantics typically silent about this) No fail-stop behavior: error might not be manifested until long after it occurs E.g. in C: division by 0, array bounds violations, dereferencing a null pointer Safe languages like Scala, Java, Python have no such errors!

Today: Binding, Scope, Storage Part of being a high-level language is letting the programmer name things: variables constants types functions classes modules fields operators... Generically, we call names identifiers An identifier binding makes an association between the identifier and the thing it names An identifier use refers to the thing named The scope of a binding is the part of the program where it can be used 8

Scala Example object Printer { def print(expr: Expr) : String = unparse(expr).tostring() } def unparse(expr: Expr) : SExpr = expr match { case Num(n) => SNum(n) case (l,r) => SList(SSym("+")::unparse(l)::unparse(r)::Nil) case Mul(l,r) => SList(SSym("*")::unparse(l)::unparse(r)::Nil) case Div(l,r) => SList(SSym("/")::unparse(l)::unparse(r)::Nil) } binding use keyword Identifier syntax is language-specific Usually unbounded sequence of alpha numeric symbol(?) Further rules/conventions for different categories Identifiers are distinct from keywords! Some identifiers are pre-defined 9

Names, values, variables Most languages let us bind variable names to memory cells containing values Name gives access to cell for read or update Many languages also let us bind names directly to (immutable) values computed by expressions Sometimes (confusingly) also called variables Scala var vs. val They let us share expressions to save repeated writing and, maybe, evaluation 10

Local Value Bindings expr ::= num expr + expr... (expr) id let id = expr in expr scope (let a = 8 + 5 in a * 3) + 3 Leta Num3 Mul binding use Num8 Num5 Vara Num3 11

Semantics via Substitution (let a = 8 + 5 in a * 3) + 3 Leta Num3 Rewrite the Mul program text Num8 Num5 Vara Num3 ((8 + 5) * 3) + 3 Mul Num3 Num3 Num8 Num5

Bound vs. Free A variable use x is bound if it appears in the scope of a binding for x Otherwise, it is free Bound and free are relative to an enclosing subexpression, e.g. a is bound in (let a = 8 + 5 in a * 3) but free in a * 3 We cannot evaluate a free variable 13

scopea scopeb Parallel Scopes (let a = 8 + 5 in a * 3) + (let b = 1 in b + 2) Leta Letb Mul Num1 Num8 Num5 Vara Num3 Varb Num2 ((8 + 5) * 3) + (1 + 2) Mul Num3 Num1 Num2 What if both let s bind a? Num8 Num5 14

Nested Scopes (let a = 8 + 5 in let b = a - 10 in a * b) + 2 Leta Num2 Letb scopea scopeb Mul Num8 Num5 Sub Vara Varb scopea scopea&b Vara Num10 15

scopea Shadowing (let a = 8 + 5 in let a = a - 10 in 36 + a) + 3 Leta Num3 Leta scopea Num8 Num5 Sub Vara Num10 Num36 Vara Need more careful definition of substitution: - only substitute for free variables - rename bindings that cause accidental capture 16

Mutually Recursive Definitions letrec a = b + 2 and b = a - 5 in b + a scopea&b Sub Letreca,b (not a very sensible expression) Varb Num2 Vara Num5 Varb Vara Many earlier languages were designed to be compiled by a single pass through the source code and therefore use forward declarations void g (double y); /* declares g but doesn t define it */ void f(double x) { g(x+1.0); } void g(double y) { f(y-1.0); } /* definition is here */ C

Functions and parameters Consider adding functions with parameters to our expression language We give names to these parameters The scope of a parameter is the function body The value of each parameter is provided at the function call (or application ) site declaration AST (f x (+ x 3)) application AST (@ f (* 13 3)) { { function name formal parameter body 18 function name actual parameter

Function scoping (f x (+ x 3)) Fundeff,x scopex Varx Num3

Application via Substitution To evaluate a function application: (@ f (* 13 3)) Evaluate the argument: (@ f 39) Replace application with a copy of the body; then substitute the actual parameter value for the formal parameter in that copy: (+ 39 3) (f x (+ x 3)) (call-by-value) 42

Dynamic Scope What should happen in the following program? (f x (+ x y)) (@ f 42) How about this one? (f x (+ x y)) (let y 2 (@ f 42)) One possible answer: let the value of y leak into f This is an example of dynamic scope Bad idea! 21 Why?

Static scope / Lexical scope Better if this program remains erroneous (f x (+ x y)) (let y 2 (@ f 42)) Looking at a function declaration, we can always determine if and where a variable is bound without considering the dynamic execution of the program! Some scripting languages still use dynamic scope, but as programs get larger, its dangers become obvious 22

Re-using names What happens when the same name is bound twice in the same scope? If the bindings are to different kinds of things (e.g. types vs. variables), can often disambiguate based on syntax, so no problem arises (except maybe readability): type Foo = Int val Foo : Foo = 10 val Bar : Foo = Foo + 1 Scala Here we say that types and variables live in different name spaces If the bindings are in the same namespace, typically an error. But sometimes additional info (such as types) can be used to pick the right binding; this is called overloading

Named scopes: modules, classes Often, the construct that delimits a scope can itself has a name, allowing the programer to manage explicitly the visibility of the names inside it OCaml modules module Env = struct type env = (string * int) list let empty : env = [] let rec lookup (e:env) (k:string) : int =... end let e0 : Env.env = Env.empty in Env.lookup e0 "abc" Java classes class Foo { static int x; static void f(int x); } int z = Foo.f(Foo.x)

Semantics via Environments An environment is a mapping from names to their bindings The environment at a program point describes all the bindings in scope at that point Environment is extended when binding constructs are evaluated Environment is consulted to determine the meaning of names during evaluation More realistic than using substitutions Environments are easily implemented in an interpreter Evaluation doesn t change program text

Environments for everything Environments can hold binding information for all kinds of names a variable name is (typically) bound to location [in the store] containing the variable a value (constant) name may be bound directly bound to the value [environment = store] a function name is bound to description of the function s parameters and body a type name is bound to a type description, including the layout of its values a class name is bound to a list of the class s content etc.

Variables and the Store In most imperative languages, variable names are bound to locations, which in turn contain values. So evaluating a variable declaration involves two things: 1. allocating a new store location (and perhaps initializing its contents) 2. creating a new binding from variable name to location In most languages, there are other ways to allocate storage too, such as explicit new operations or implicit boxing operations Simplistic store model: mutable map from locations to values Better models require distinguishing different classes of storage

Storage Lifetimes Typical computations use far more memory locations in total than they use at any one point So most language implementations support re-use of memory locations that are no longer needed Allocation Point Lifetime Deallocation Point time The lifetime of every object should cover all moments when the object is being used Otherwise, we get a memory safety bug

Storage Classes: Static Data Lifetime = Entire Execution Typically used for global variables and constants If language has no recursion, can also be used for function-local variables Fixed address known before program executes No runtime allocation/deallocation costs

Storage Classes: Stack Data Nested Lifetimes (last allocated is first deallocated) Typically used for function-local variables (and internal control data for function calls) Works because function call lifetimes also nest Allocation/deallocation are very cheap (just adjust the stack pointer) Produces good locality for caches, virtual memory

Storage Classes: Heap Data Arbitrary Lifetimes Typically used for explicitly allocated objects Some languages implicitly heap-allocate other data structures, e.g. bignums, closures, etc. Allocation/deallocation are relatively expensive Run-time library must decide where to allocate Deallocation can be done manually (risking memory bugs) or by a garbage collector

Scope, Lifetime, Memory Safety Lifetime and scope are closely connected For a language to be memory safe, it suffices to make sure that in-scope identifiers never point (directly or indirectly) to deallocated objects For stack-allocated local variables, this happens naturally Stack locations are deallocated only when function returns and its local variables go out of scope forever For heap data, easiest to enforce safety using a garbage collector (GC) GC typically works by recursively tracing all objects reachable from names that are currently in scope (or that might come back into scope later) Only unreachable objects are deallocated, making their locations available for future re-allocation

Explicit Deallocation Many older languages (notably C/C++) support explicit deallocation of heap objects Somewhat more efficient than GC char *foo() { char *p = malloc(100); free(p); return p;} But makes language unsafe: dangling pointer bug occurs if we deallocate an object that is still in use [unchecked runtime error] Converse problem: space leak bug occurs if we don t deallocate an unneeded object. Not a safety problem, but may unnecessarily make program run slower or crash with out of memory error

Pairs To study the essence of heap data structures, we can focus on a single new kind of value, the pair Like a record with two fields, each containing another value Written using infix dot notation We can build larger records of a fixed size just by nesting pairs (1. ((2. 3). 4)) corresponds to 1 4 2 3

Lists We can also build all kinds of interesting arbitrarysized recursive structures using pairs For example, to represent (singly-linked) lists we can use a pair for each node in the list. First field contains an element; second field points to the next link, or is 0 to indicate end-of-list Example: 1,2,3 (1.(2.(3.0))) 1 2 3 0 Note that for programs to detect end-of-list, we need a test that distinguishes integers from pairs