Semantics of programming languages

Similar documents
Semantics of programming languages

Semantics of programming languages

Types and Static Type Checking (Introducing Micro-Haskell)

Types and Static Type Checking (Introducing Micro-Haskell)

LL(1) predictive parsing

CS 4110 Programming Languages & Logics. Lecture 17 Programming in the λ-calculus

LL(1) predictive parsing

Modules, Structs, Hashes, and Operational Semantics

Note that in this definition, n + m denotes the syntactic expression with three symbols n, +, and m, not to the number that is the sum of n and m.

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

7. Introduction to Denotational Semantics. Oscar Nierstrasz

Chapter 3 (part 3) Describing Syntax and Semantics

Syntax and Grammars 1 / 21

axiomatic semantics involving logical rules for deriving relations between preconditions and postconditions.

Variables. Substitution

CMSC 330: Organization of Programming Languages. Operational Semantics

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

Compiler Construction D7011E

CSCE 314 Programming Languages

Types, Expressions, and States

The semantics of a programming language is concerned with the meaning of programs, that is, how programs behave when executed on computers.

Introduction to Syntax Analysis Recursive-Descent Parsing

CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

Compilers and computer architecture From strings to ASTs (2): context free grammars

1 Introduction. 3 Syntax

Semantics via Syntax. f (4) = if define f (x) =2 x + 55.

Semantic Analysis. Lecture 9. February 7, 2018

Operational Semantics 1 / 13

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

3.7 Denotational Semantics

Operational Semantics. One-Slide Summary. Lecture Outline

CSC324 Principles of Programming Languages

Syntax-Directed Translation. Lecture 14

Introduction to Lexical Analysis

1. The output of lexical analyser is a) A set of RE b) Syntax Tree c) Set of Tokens d) String Character

CS 6110 S11 Lecture 25 Typed λ-calculus 6 April 2011

Programming Languages Third Edition

Com S 541. Programming Languages I

Chapter 3. Describing Syntax and Semantics

Introduction to the λ-calculus

Stating the obvious, people and computers do not speak the same language.

CS 6353 Compiler Construction Project Assignments

Automatic generation of LL(1) parsers

Mutable References. Chapter 1

This book is licensed under a Creative Commons Attribution 3.0 License

Abstract Syntax Trees & Top-Down Parsing

A simple syntax-directed

COMPILER DESIGN. For COMPUTER SCIENCE

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

Control Structures. Lecture 4 COP 3014 Fall September 18, 2017

Boolean expressions. Elements of Programming Languages. Conditionals. What use is this?

Writing an Interpreter Thoughts on Assignment 6

Part 5 Program Analysis Principles and Techniques

CSc 520 Principles of Programming Languages

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

1. (a) What are the closure properties of Regular sets? Explain. (b) Briefly explain the logical phases of a compiler model. [8+8]

A Simple Syntax-Directed Translator

The Substitution Model

CS152: Programming Languages. Lecture 11 STLC Extensions and Related Topics. Dan Grossman Spring 2011

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Denotational Semantics. Domain Theory

Implementation of Lexical Analysis

CSE P 501 Exam 8/5/04

COS 320. Compiling Techniques

Parsing II Top-down parsing. Comp 412

Working of the Compilers

Lecture 3: Recursion; Structural Induction

Lecture 9: Typed Lambda Calculus

Tail Calls. CMSC 330: Organization of Programming Languages. Tail Recursion. Tail Recursion (cont d) Names and Binding. Tail Recursion (cont d)

COMP 410 Lecture 1. Kyle Dewey

UNIT -2 LEXICAL ANALYSIS

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens

Chapter 4. Lexical and Syntax Analysis

Intro to semantics; Small-step semantics Lecture 1 Tuesday, January 29, 2013

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

Formal Semantics of Programming Languages

CIS 194: Homework 3. Due Wednesday, February 11, Interpreters. Meet SImPL

Outline. What is semantics? Denotational semantics. Semantics of naming. What is semantics? 2 / 21

Topic 6: Types COS 320. Compiling Techniques. Princeton University Spring Prof. David August. Adapted from slides by Aarne Ranta

14.1 Encoding for different models of computation

Lexical and Syntax Analysis

COSC252: Programming Languages: Semantic Specification. Jeremy Bolton, PhD Adjunct Professor

4. Lexical and Syntax Analysis

Fundamentals and lambda calculus

Formal Semantics of Programming Languages

6.001 Notes: Section 15.1

CPS 506 Comparative Programming Languages. Syntax Specification

Handout 9: Imperative Programs and State

These are notes for the third lecture; if statements and loops.

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?

CS 6353 Compiler Construction Project Assignments

Syntax. 2.1 Terminology

Formal Languages and Compilers Lecture VI: Lexical Analysis

CS 381: Programming Language Fundamentals

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Wednesday, September 9, 15. Parsers

Transcription:

Semantics of programming languages Informatics 2A: Lecture 28 Mary Cryan School of Informatics University of Edinburgh mcryan@inf.ed.ac.uk 21 November 2018 1 / 18

Two parallel pipelines A large proportion of the course thus far can be organised into two parallel language processing pipelines. Formal Language Natural Language Program text Text lexing segmentation, tagging Lexemes Tagged words parsing parsing Syntax tree Parse tree type checking agreement checking (Annotated) (Annotated) syntax tree parse tree semantics Logical form 2 / 18

Two parallel pipelines A large proportion of the course thus far can be organised into two parallel language processing pipelines. Formal Language Natural Language Program text Text lexing segmentation, tagging Lexemes Tagged words parsing parsing Syntax tree Parse tree type checking agreement checking (Annotated) (Annotated) syntax tree parse tree semantics semantics Program behaviour Logical form 2 / 18

Two parallel pipelines A large proportion of the course thus far can be organised into two parallel language processing pipelines. Formal Language Natural Language Program text Text lexing segmentation, tagging Lexemes Tagged words parsing parsing Syntax tree Parse tree type checking agreement checking (Annotated) (Annotated) syntax tree parse tree semantics semantics Program behaviour Logical form Today we look at methods of specifying program behaviour. 2 / 18

Semantics for programming languages The syntax of NLs (as described by CFGs etc.) is concerned with what sentences are grammatical and what structure they have, whilst their semantics are concerned with what sentences mean. A similar distinction can be made for programming languages. Rules associated with lexing, parsing and typechecking concern the form and structure of legal programs, but say nothing about what programs should do when you run them. The latter is what programming language semantics is about. It thus concerns the later stages of the formal language processing pipeline. 3 / 18

Specification vs. implementation In principle, one way to give a semantics (or meaning ) for a programming language is to provide a working implementation of it, e.g. an interpreter or compiler for the language. However, such an implementation will probably consist of thousands of lines of code, and so isn t very suitable as a readable definition or reference specification of the language. The latter is what we re interested in here. In other words, we want to fill the blank in the following table: Specification Implementation Lexical structure Regular exprs. Lexer impl. Grammatical structure CFGs Parser impl. Execution behaviour??? Interpreter/compiler 4 / 18

Semantic paradigms We ll look at two styles of formal programming language semantics: Operational semantics. Typically consists of a bunch of rules for executing programs given by syntax trees. Oriented towards implementations of the language: indeed, an op. sem. often gives rise immediately to a toy implementation. Denotational semantics. Typically consists of a compositional description of the meaning of program phrases (close in spirit to what we ve seen for NLs). Oriented towards mathematical reasoning about the language and about programs written in it. May be executable or not. These two styles are complementary: ideally, it s nice to have both. There are also other styles (e.g. axiomatic semantics), but we won t discuss them here. 5 / 18

Micro-Haskell: recap We use Micro-Haskell (recall Lecture 13 and Assignment 1) as a vehicle for introducing the methods of operational and denotational semantics. The format of MH declarations is illustrated by: div :: Integer -> Integer -> Integer ; div x y = if x < y then 0 else 1 + div (x - y) y ; This declares a function div, of the type specified, such that, when applied to two (non-negative) integer literals m and n, the function application div m n returns, as result, the integer literal representing the integer division of m by n. 6 / 18

Micro-Haskell: recap We use Micro-Haskell (recall Lecture 13 and Assignment 1) as a vehicle for introducing the methods of operational and denotational semantics. The format of MH declarations is illustrated by: div :: Integer -> Integer -> Integer ; div x y = if x < y then 0 else 1 + div (x - y) y ; This declares a function div, of the type specified, such that, when applied to two (non-negative) integer literals m and n, the function application div m n returns, as result, the integer literal representing the integer division of m by n. Q: How would div 2 0 behave? 6 / 18

Micro-Haskell: recap We use Micro-Haskell (recall Lecture 13 and Assignment 1) as a vehicle for introducing the methods of operational and denotational semantics. The format of MH declarations is illustrated by: div :: Integer -> Integer -> Integer ; div x y = if x < y then 0 else 1 + div (x - y) y ; This declares a function div, of the type specified, such that, when applied to two (non-negative) integer literals m and n, the function application div m n returns, as result, the integer literal representing the integer division of m by n. Q: How would div 2 0 behave? A: Loop indefinitely! 6 / 18

Semantic paradigms in the case of MH Operational semantics: This explains the computational process by which MH calculates the value of a function application, such as div m n. Denotational semantics: This defines a mathematical denotation [[ div ]] [[ Integer -> Integer -> Integer ]] Roughly, [[ Integer->Integer->Integer ]] is some set of binary functions on integers, and [[ div ]] is the integer-division function. In reality, denotational semantics is more complicated than this. 7 / 18

Operational semantics We model the execution behaviour of programs as a series of reduction steps. E.g. for Micro-Haskell: if 4+5 <= 8 then 4 else 6+7 if 9 <= 8 then 4 else 6+7 if False then 4 else 6+7 6+7 13 A (small-step) operational semantics is basically a bunch of rules for performing such reductions. 8 / 18

More complex example Consider the Micro-Haskell declaration f x y = x+y+x ; This effectively introduces the definition f = λx.λy. x+y+x Now consider the evaluation of f 3 4: f 3 4 (λx.λy. x+y+x) 3 4 (λy. 3+y+3) 4 3+4+3 7 + 3 10 Notice that two of these steps are β-reductions! 9 / 18

Operational semantics for Micro-Haskell: general rules Suppose E is a runtime environment associating a definition to each function symbol, e.g. E(f) = λx.λy.x+y+x. Also let v range over variables of MH, and write n to mean the integer literal for n. Relative to E, we can define as follows: v E(v) (v a variable defined in E) (λv.m)n M[v N] (β-reduction) m + n m + n, and similarly for other infixes. if True then M else N M if False then M else N N Continued on next slide... 10 / 18

Operational semantics for Micro-Haskell (continued) Let s say a term M is a value if it s an integer literal, a boolean literal, or a λ-abstraction. Let V range over values, Intuition: values are terms that can t be reduced any further. We try to reduce all other terms to values. To complete the definition of, we decree that if M M then: MN M N M N M N ( any infix symbol) V M V M (ditto) if M then N else P if M then N else P We then say M V ( M evaluates to V ) if there s a sequence M M 0 M 1 M r V That defines the intended behaviour of Micro-Haskell programs. It s also (roughly) how the Assignment 1 evaluator for MH works. 11 / 18

Longer example Within a run-time environment that records the definition of div, we have: div 3 2 (λx.λy. if x < y then 0 else 1 + div (x + -y) y) 3 2 (λy. if 3 < y then 0 else 1 + div (3 - y) y) 2 if 3 < 2 then 0 else 1 + div (3-2) 2 if False then 0 else 1 + div (3-2) 2 1 + div (3-2) 2 1 + (λx.λy. if x < y then 0 else 1 + div (x + -y) y) (3-2) 2 Exercise: Finish this off! 12 / 18

Operational semantics: further remarks What happens if we encounter an expression that isn t a value but can t be reduced? E.g. 5 True, or (λx.x)+4?!!! If our original program typechecks, this can never happen!!! Indeed, it can be proved that: if M can be typed, either it s a value or it can be reduced; if M has type t and M M, then M has type t. That s one reason why type systems are so valuable: they can guarantee programs won t derail at runtime. The general form of operational semantics we ve described is immensely flexible. It works beautifully for functional languages like MH. But it can also be adapted to most other kinds of programming language. 13 / 18

Denotational semantics An operational semantics provides a kind of idealized implementation of the language in terms of symbolic rules. That s fine, but doesn t give much structural understanding. Conceptually and mathematically, it is more satisfying to assign meaning to (parts of) a program in roughly the way that mathematical expressions (or indeed NL expressions) have meaning. This is the idea behind denotational semantics: associate a denotation [[ P ]] to each program phrase P in a compositional way. 14 / 18

Denotational semantics for MH: first attempt Let s try interpreting MH types by sets in an natural way: [[ Integer ]] = Z [[ Bool ]] = B = {T, F } [[ σ->τ ]] = [[ τ ]] [[ σ ]] (set of all functions from [[ σ ]] to [[ τ ]]) A closed term M :: τ will receive a denotation [[ M ]] [[ τ ]]. More generally, if M :: τ is a term in the type environment Γ = x 1 :: σ 1,..., x n :: σ n, its denotation will be a function [[ M ]] Γ : [[ σ 1 ]] [[ σ n ]] [[ τ ]] We define [[ M ]] Γ compositionally (just as in NL semantics). E.g. writing a for a 1,..., a n : [[ n ]] Γ : a n [[ x i ]] Γ : a a i [[ M+N ]] Γ : a [[ M ]] Γ ( a) + [[ N ]] Γ ( a) [[ M N ]] Γ : a [[ M ]] Γ ( a)([[ N ]] Γ ( a)), etc. 15 / 18

Denotational semantics for MH: the challenge That works well as far as it goes. The problem comes when we try to interpret recursive definitions, e.g. div = λx.λy.if x < y then 0 else 1 + div (x - y) y ; Two issues here: 1. Value of div 2 0 is undefined. So should now include a special value ( undefined ) in the sets [[ Integer ]] and [[ Bool ]]. 2. The definition of [[ div ]] will be circular: we get an equation that defines [[ div ]] in terms of itself. How can we be sure this equation even has a solution? What if it has more than one? To make the idea work, we need to change the way we define [[ σ->τ ]]. We should take some set of functions that is... rich enough to interpret all programs in our language, but constrained enough that circular definitions make sense. At this point (reluctantly) we move on to something simpler... 16 / 18

Denotational semantics for regular expressions Let s turn to an easier example. Recall our (meta)language of regular expressions: R ɛ a RR R + R R In fact, we ve already met two good den. sems. for this! [[ R ]] 1 = L(R), the language (i.e. set of strings) defined by R. [[ R ]] 2 = the particular (ɛ-)nfa for R constructed by the methods of Lecture 5. Both of these are defined compositionally: e.g. L(R + R ) is defined as L(R) L(R ), and the standard NFA for R + R is constructed out of NFAs for R and R. Note that: [[ ]] 1 is more abstract than [[ ]] 2 : can have [[ R ]] 2 [[ R ]] 2 but [[ R ]] 1 = [[ R ]] 1. So [[ ]] 1 is more useful for arguing that two regular expressions are equivalent. However, [[ ]] 2 is naturally executable, while [[ ]] 1 is not. 17 / 18

Summary Formal semantics can be used to give a concise and precise reference specification for the intended behaviour of programs. Operational semantics is nowadays quite widely used. Denotational semantics gets quite mathematical, and is at present more of a research topic. Operational semantics, and some kinds of denotational semantics, also offer a starting-point for building working implementations of the language. Denotational semantics also offers a framework for proving things about programs. E.g. if [[ P ]] = [[ P ]], that shows that P can be replaced by P in any program context without changing the program s behaviour. Ideas from both op. and den. semantics have had a significant effect on the design of programming languages. 18 / 18