Types and Static Type Checking (Introducing Micro-Haskell)


Types and Static Type Checking (Introducing Micro-Haskell)
Informatics 2A: Lecture 14
John Longley, School of Informatics, University of Edinburgh
jrl@inf.ed.ac.uk
17 October 2017


So far in the course, we have examined the machinery that, in the case of a programming language, takes us from a program text to a parse tree, via the stages of lexing and parsing. This lecture looks at two further stages of the pipeline:

The parse tree may be converted into an abstract syntax tree (AST), a kind of simplified parse tree which contains just the information needed for further processing.

The resulting AST may then be subjected to various checks to ensure that certain obvious errors are avoided (static analysis). One common form of static analysis is type-checking.

After this, the AST will be fed either to an evaluator or interpreter to execute the program, or to a compiler to translate it into executable low-level code.

Types

Consider the expression

    3 + True

How is a compiler or interpreter supposed to execute this? It does not make sense to apply the numerical addition operation to the argument True, which is a boolean. This is an example of a type error. Different programming languages take different approaches to such errors.

Approaches to type errors

Laissez faire: Even if an operation does not make sense for the data it is being applied to, just go ahead and apply it to the (binary) machine representation of the data. In some cases this will do something harmful. In other cases it might even be useful. (Adopted, e.g., in C.)

Dynamic checking: At the point during execution at which a type mismatch (between operation and argument) is encountered, raise an error. This gives rise to helpful runtime errors. (Adopted, e.g., in Python.)

Static checking: Check (the AST of) the program to ensure that all operations are applied in a type-meaningful way. If not, identify the error(s), and disallow the program from being run until corrected. This allows many program errors to be identified before execution. (Adopted, e.g., in Java and Haskell.)
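To make the contrast concrete, here is a minimal sketch of the dynamic-checking approach, written in Haskell. The names Value, IntV, BoolV and addDyn are hypothetical and do not come from the lecture. Each runtime value carries a tag, and the addition operation inspects the tags at the moment it is applied. Under static checking no such runtime test is needed: full Haskell rejects the expression 3 + True before the program ever runs.

```haskell
-- Hypothetical runtime values that carry a tag saying what kind of data they hold.
data Value = IntV Integer | BoolV Bool
  deriving (Eq, Show)

-- Dynamically checked addition: the tags are inspected at run time,
-- and a mismatch is reported as an error instead of being executed.
addDyn :: Value -> Value -> Either String Value
addDyn (IntV m) (IntV n) = Right (IntV (m + n))
addDyn v w = Left ("type error: cannot add " ++ show v ++ " and " ++ show w)
```

Here addDyn (IntV 3) (BoolV True) produces a Left value describing the mismatch, which plays the role of the runtime error.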

In this lecture we look at static type-checking, using a fragment of Haskell as the illustrative programming language. We call this fragment Micro-Haskell (MH for short). MH is the basis of this year's Inf2A Assignment 1, which uses it to illustrate the full formal-language-processing pipeline. For those who have never previously met Haskell, or who could benefit from a Haskell refresher, we start with a gentle introduction to MH.

Micro-Haskell: a crash course

In mathematics, we are used to defining functions via equations, e.g. f(x) = 3x + 7. The idea in functional programming is that programs should look somewhat similar to mathematical definitions:

    f x = x+x+x + 7 ;

This function expects an argument x of integer type (let's say), and returns a result of integer type. We therefore say the type of f is Integer -> Integer ("integer to integer"). By contrast, the definition

    g x = x+x <= x+7 ;

returns a boolean result, so the type of g is Integer -> Bool.

Multi-argument functions

What about a function of two arguments, say x :: Integer and y :: Bool? E.g.

    h x y = if y then x else 0-x ;

Think of h as a function that accepts arguments one at a time. It accepts an integer and returns another function, which itself accepts a boolean and returns an integer. So the type of h is Integer -> (Bool -> Integer). By convention, we treat -> as right-associative, so we can write this just as Integer -> Bool -> Integer.

Note incidentally the use of if to create expressions rather than commands. In Java, the above if-expression could be written as (y ? x : 0-x).

Typechecking in Micro-Haskell

In (Micro-)Haskell, the type of h is explicitly given as part of the function definition:

    h :: Integer -> Bool -> Integer ;
    h x y = if y then x else 0-x ;

The typechecker then checks that the expression on the RHS does indeed have type Integer, assuming x and y have the specified argument types Integer and Bool respectively.

Function definitions can also be recursive:

    div :: Integer -> Integer -> Integer ;
    div x y = if x < y then 0 else 1 + div (x - y) y ;

Here the typechecker will check that the RHS has type Integer, assuming that x and y have type Integer and also that div itself has the stated type.
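The MH definitions seen so far can be tried out in full Haskell with small changes. The following is a sketch under two assumptions: the terminating semicolons of MH are dropped, and div is renamed to divide (a hypothetical name) to avoid a clash with the div function of the standard Prelude. The name hFive is likewise hypothetical and is included only to show the "one argument at a time" reading of h.

```haskell
f :: Integer -> Integer
f x = x + x + x + 7

g :: Integer -> Bool
g x = x + x <= x + 7

h :: Integer -> Bool -> Integer
h x y = if y then x else 0 - x

-- Partial application: supplying only the first argument of h
-- yields a function of type Bool -> Integer.
hFive :: Bool -> Integer
hFive = h 5

-- The slides' div, renamed. It computes the integer quotient by repeated
-- subtraction and only terminates when y > 0.
divide :: Integer -> Integer -> Integer
divide x y = if x < y then 0 else 1 + divide (x - y) y
```

Changing a signature so that it no longer matches its equation, for instance declaring g :: Integer -> Integer, makes the compiler reject the program before it can be run.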

Higher-order functions

The arguments of a function in MH can themselves be functions!

    F :: (Integer -> Integer) -> Integer ;
    F g = g 0 + g 1 + g 2 + g 3 ;

The typechecker then checks that the expression on the RHS does indeed have type Integer, assuming g has the specified argument type Integer -> Integer.

For an example application of F, consider the following MH function.

    inc :: Integer -> Integer ;
    inc x = x+1 ;

If we then type F inc into an evaluator (i.e., interpreter) for MH, the evaluator will compute that the result of the expression F inc is 10.
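The same example can be run in full Haskell, with one adjustment: full Haskell requires function names to begin with a lowercase letter, so the sketch below renames F to bigF (a hypothetical name chosen here).

```haskell
-- The slides' F, renamed because names starting with an uppercase letter
-- are reserved for constructors in full Haskell.
bigF :: (Integer -> Integer) -> Integer
bigF g = g 0 + g 1 + g 2 + g 3

inc :: Integer -> Integer
inc x = x + 1
```

Evaluating bigF inc unfolds to inc 0 + inc 1 + inc 2 + inc 3, that is 1 + 2 + 3 + 4, which agrees with the result 10 stated above.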

In principle, the -> constructor can be iterated to produce very complex types, e.g.

    (((Integer->Bool)->Bool)->Integer)->Integer

Such monsters rarely arise in ordinary programs. Nevertheless, MH (and full Haskell) has a precise way of checking whether the function definitions in the program correctly respect the types that have been assigned to them. Before discussing this process, we summarize the types of MH.

MH Types

The official grammar of MH types (in the Assignment 1 handout) is

    Type     → Type0 TypeRest
    TypeRest → ε | -> Type
    Type0    → Integer | Bool | ( Type )

This is an LL(1) grammar for convenient parsing. However, a parse tree for this grammar contains more detail than is required for understanding a type expression. The following conceptually simpler grammar gives the abstract syntax of types:

    Type → Integer | Bool | Type -> Type
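The abstract syntax grammar corresponds directly to a Haskell data type with one constructor per production. The sketch below is illustrative only: the constructor names TInteger, TBool and TFun and the function render are assumptions made here, not names from the assignment. The render function shows how brackets, which are absent from the abstract syntax, can be recovered when a type is printed.

```haskell
-- Abstract syntax of MH types: Type -> Integer | Bool | Type -> Type
data Type = TInteger | TBool | TFun Type Type
  deriving (Eq, Show)

-- Print a type in concrete syntax. Because -> is right-associative,
-- brackets are needed only around a function type on the left of an arrow.
render :: Type -> String
render TInteger = "Integer"
render TBool = "Bool"
render (TFun a b) = left a ++ " -> " ++ render b
  where
    left t@(TFun _ _) = "(" ++ render t ++ ")"
    left t = render t
```

For example, TFun TInteger (TFun TBool TInteger) is the type of h and prints without brackets, while TFun (TFun TInteger TInteger) TInteger is the type of F and prints with brackets around its argument type.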

Abstract Syntax Trees

The abstract syntax grammar is not appropriate for parsing:

It is ambiguous.

It does not include all aspects of the concrete syntax. In particular, there are no brackets.

However, parse trees for the abstract syntax grammar unambiguously correspond to types. Instead of working with parse trees for the concrete LL(1) grammar, we convert such parse trees to parse trees for the abstract syntax grammar. Such parse trees are called abstract syntax trees (ASTs).

Concrete versus abstract syntax

The distinction between concrete and abstract syntax is not specific to types, but applies generally to formal and natural languages. In the case of an LL(1)-predictively parsed formal language, we have the following parsing pipeline:

    Lexed language phrase (sequence of lexemes)
        ↓ (LL(1) predictive parsing)
    LL(1)-grammar parse tree (uniquely determined)
        ↓ (conversion of parse trees: see Tutorial 4 Question 2)
    AST
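The last step of the pipeline can be sketched for MH types as follows. This is one possible conversion, written under the assumption that parse trees for the LL(1) grammar of types are represented by the hypothetical data types PType, PTypeRest and PType0 below, one per nonterminal. It is not necessarily the formulation intended in the tutorial.

```haskell
-- Parse trees for the concrete LL(1) grammar of MH types:
--   Type     -> Type0 TypeRest
--   TypeRest -> epsilon | -> Type
--   Type0    -> Integer | Bool | ( Type )
data PType = PType PType0 PTypeRest
data PTypeRest = RestEmpty | RestArrow PType
data PType0 = PInteger | PBool | PParens PType

-- Abstract syntax trees for types.
data Type = TInteger | TBool | TFun Type Type
  deriving (Eq, Show)

-- Convert a concrete parse tree into an AST. The TypeRest node decides
-- whether the Type0 to its left is a whole type or the argument of an arrow.
toAST :: PType -> Type
toAST (PType t0 rest) =
  case rest of
    RestEmpty -> toAST0 t0
    RestArrow t -> TFun (toAST0 t0) (toAST t)

-- Brackets disappear: their grouping is already recorded by the tree shape.
toAST0 :: PType0 -> Type
toAST0 PInteger = TInteger
toAST0 PBool = TBool
toAST0 (PParens t) = toAST t
```

Right-associativity of -> needs no special treatment here, because the production TypeRest → -> Type already nests everything after an arrow to the right.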

Type checking

Main ideas.

1. Type checking is done compositionally by breaking down expressions into their subexpressions, type-checking the subexpressions, and ensuring that the top-level compound expression can then be given a type itself.

2. Throughout the process, a type environment is maintained which records the types of all variables in the expression.

Illustrative example

    h :: Integer -> Bool -> Integer ;
    h x y = if y then x else 1+x ;

First the type environment Γ is set according to the type declaration.

    Γ := h :: Integer -> Bool -> Integer

Next, the type environment is extended to assign types to the argument variables x and y.

    Γ := h :: Integer -> Bool -> Integer, x :: Integer, y :: Bool
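The construction of Γ can be sketched in Haskell as follows. This is an illustration under stated assumptions, not the assignment's implementation: the environment is modelled as an association list, and the names Env and bindArgs are hypothetical. Given the declared type of a function and the argument names from its defining equation, bindArgs peels one argument type off the declared type per argument name, and returns the extended environment together with the type that the RHS is expected to have.

```haskell
data Type = TInteger | TBool | TFun Type Type
  deriving (Eq, Show)

-- A type environment: an association list from variable names to types.
type Env = [(String, Type)]

-- Build the environment for checking the RHS of  name x1 ... xn = rhs,
-- given the declaration  name :: ty.  Also returns the expected type of
-- the RHS. The result is Nothing if the equation has more arguments than
-- the declared type allows.
bindArgs :: String -> Type -> [String] -> Maybe (Env, Type)
bindArgs name ty args = go [(name, ty)] ty args
  where
    go env t [] = Just (env, t)
    -- New bindings are put at the front, so that lookup finds an
    -- argument before an outer binding of the same name.
    go env (TFun a b) (x : xs) = go ((x, a) : env) b xs
    go _ _ _ = Nothing
```

For the declaration of h with arguments x and y, the resulting environment binds y to Bool, x to Integer and h to its declared type, and the expected type of the RHS is Integer.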

Illustrative example (continued)

This is done in order to be consistent with the general rule:

In any expression e1 e2 (a function application) we need e1 to have a function type t1 -> t2, with e2 having the correct type t1 for its argument. The resulting type of e1 e2 is then t2.

Thus, in our example, we have types

    h x :: Bool -> Integer

and

    h x y :: Integer

Illustrative example (continued)

    h :: Integer -> Bool -> Integer ;
    h x y = if y then x else 1+x ;

We have

    h x y :: Integer

with the type environment

    Γ := h :: Integer -> Bool -> Integer, x :: Integer, y :: Bool

Our remaining task is to type-check (relative to Γ) the expression:

    if y then x else 1+x :: Integer

Illustrative example (continued)

General rule: In any expression if e1 then e2 else e3 we need e1 to have type Bool, and e2 and e3 to have the same type t. The resulting type of if e1 then e2 else e3 is then t.

In our example, we need to type-check

    if y then x else 1+x :: Integer

We have y :: Bool and x :: Integer declared in Γ, so it remains only to type-check

    1+x :: Integer

Illustrative example (completed)

General rule: In any expression e1 + e2 we need e1 and e2 to have type Integer. The resulting type of e1 + e2 is then Integer.

In our example, we need to type-check

    1+x :: Integer

We have x :: Integer declared in Γ, and the numeral 1 is (of course) given type Integer. Thus indeed we have verified

    1+x :: Integer

whence, putting everything together,

    if y then x else 1+x :: Integer

as required.
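The three general rules used in the example (function application, if-then-else and +) can be collected into a small compositional type checker. The following is a minimal sketch, assuming a hypothetical expression AST Expr that covers only the constructs appearing in the example. The names Expr, Env and typeOf are chosen here for illustration and are not taken from the assignment code.

```haskell
data Type = TInteger | TBool | TFun Type Type
  deriving (Eq, Show)

-- A hypothetical AST for a small fragment of MH expressions.
data Expr
  = Var String
  | Num Integer
  | BoolLit Bool
  | Add Expr Expr
  | If Expr Expr Expr
  | App Expr Expr

-- The type environment, as an association list.
type Env = [(String, Type)]

-- Compute the type of an expression relative to an environment,
-- or report a type error. Each clause checks the subexpressions first.
typeOf :: Env -> Expr -> Either String Type
typeOf env (Var x) = maybe (Left ("unbound variable " ++ x)) Right (lookup x env)
typeOf _ (Num _) = Right TInteger
typeOf _ (BoolLit _) = Right TBool
-- Rule for e1 + e2: both operands must have type Integer.
typeOf env (Add e1 e2) = do
  t1 <- typeOf env e1
  t2 <- typeOf env e2
  if t1 == TInteger && t2 == TInteger
    then Right TInteger
    else Left "+ expects two Integer operands"
-- Rule for if e1 then e2 else e3: e1 is Bool, e2 and e3 share a type t.
typeOf env (If e1 e2 e3) = do
  t1 <- typeOf env e1
  t2 <- typeOf env e2
  t3 <- typeOf env e3
  if t1 /= TBool
    then Left "condition must have type Bool"
    else if t2 /= t3
      then Left "branches of if must have the same type"
      else Right t2
-- Rule for application e1 e2: e1 has type t1 -> t2 and e2 has type t1.
typeOf env (App e1 e2) = do
  t1 <- typeOf env e1
  t2 <- typeOf env e2
  case t1 of
    TFun a b | a == t2 -> Right b
    _ -> Left "ill-typed function application"
```

With the environment Γ of the example, typeOf assigns Integer to the AST of if y then x else 1+x, and Bool -> Integer to the AST of h x. The AST of 3 + True is rejected with an error.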

Static type checking summary

The program is type-checked purely by looking at the AST of the program. Thus type errors are picked up before the program is executed. Indeed, execution is disallowed for programs that do not type-check.

Static type checking gives us a guarantee: no type errors will occur during execution. This guarantee can be rigorously established as a mathematical theorem, using a mathematical model of program execution called operational semantics.

Operational semantics lies at the heart of the Evaluator provided for Part D of Assignment 1. We shall meet operational semantics later in the course (Lecture 27).