Type Safety


CMPSCI 630: Programming Languages, Spring 2009 (with thanks to Robert Harper)

Java and ML are type safe, or strongly typed, languages. C and C++ are often described as weakly typed languages. What does this mean?

Informally, a type-safe language is one for which
- There is a clearly specified notion of type correctness.
- Type correct programs are free of run-time type errors.
But this begs the question!

What is a run-time type error?
- Bus error?
- Division by zero?
- Arithmetic overflow?
- Array bounds check?
- Uncaught exception?

Type safety is a matter of coherence between the static and dynamic semantics.
- The static semantics makes predictions about the execution behavior.
- The dynamic semantics must comply with those predictions.

For example, if the type system tracks sizes of arrays, then out-of-bounds subscript is a run-time type error.
- The type system ensures that access is within allowable limits.
- If the run-time model exceeds these bounds, you have a run-time type error.
Similarly, if the type system tracks value ranges, then division by zero or arithmetic overflow is a run-time type error.

Demonstrating that a program is well-typed means proving a theorem about its behavior.
- A type checker is therefore a theorem prover.
- Non-computability theorems limit the strength of theorems that a mechanical type checker can prove.

Fundamentally there is a tension between the expressiveness of the type system and the difficulty of proving that a program is well-typed. Therein lies the art of type system design.

Formalization of Type Safety

The coherence of the static and dynamic semantics is neatly summarized by two related properties:
1. Preservation. Well-typed programs do not go off into the weeds. A well-typed program remains well-typed during execution.
2. Progress. Well-typed programs do not get stuck. If an expression is well-typed, then either it is a value or there is a well-defined next instruction.

More precisely, type safety is the conjunction of two properties:
1. Preservation. If $e : \tau$ and $e \mapsto e'$, then $e' : \tau$.
2. Progress. If $e : \tau$, then either $e\ \mathsf{val}$, or there exists $e'$ such that $e \mapsto e'$.
Consequently, if $e : \tau$ and $e \mapsto^* v$, then $v : \tau$.

Moreover, the type of a (closed) value determines its form. For L{num str}, if $v : \tau$, then
- If $\tau = \mathtt{num}$, then $v = \mathtt{num}[n]$ for some $n$.
- If $\tau = \mathtt{str}$, then $v = \mathtt{str}[s]$ for some $s$.
Thus if $e : \mathtt{num}$ and $e \mapsto^* v$, then $v = \mathtt{num}[n]$ for some $n$. In words: expressions of type num evaluate to natural numbers.

Proof of Preservation for L{num str}

Theorem 1 (Preservation) If $e : \tau$ and $e \mapsto e'$, then $e' : \tau$.

Proof: The proof proceeds by induction on evaluation, that is, induction on the transition judgement. This means
1. We must prove it outright for axioms (rules with no premises).
2. For each rule, we may assume the theorem for the premises, and show it is true for the conclusion.
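To make the two properties concrete, here is a minimal Python sketch of a type checker and a one-step evaluator for a fragment of L{num str} (numbers, strings, plus, cat and by-value let). The tagged-tuple encoding and the names `typeof`, `step`, `subst` and `check_safety` are assumptions made for this illustration and do not come from the lecture. The function `check_safety` tests preservation and progress along one execution; it is a test on a single program, not a proof.

```python
# Expressions are tagged tuples (an encoding assumed for this sketch):
#   ("num", n)  ("str", s)  ("var", x)
#   ("plus", e1, e2)  ("cat", e1, e2)  ("let", e1, x, e2)

def is_val(e):
    return e[0] in ("num", "str")

def typeof(e, ctx=None):
    """Static semantics: return "num" or "str", or raise TypeError."""
    ctx = ctx or {}
    tag = e[0]
    if tag in ("num", "str"):
        return tag
    if tag == "var":
        if e[1] not in ctx:
            raise TypeError("unbound variable " + e[1])
        return ctx[e[1]]
    if tag in ("plus", "cat"):
        want = "num" if tag == "plus" else "str"
        if typeof(e[1], ctx) != want or typeof(e[2], ctx) != want:
            raise TypeError("ill-typed operand of " + tag)
        return want
    if tag == "let":
        t1 = typeof(e[1], ctx)
        return typeof(e[3], {**ctx, e[2]: t1})
    raise TypeError("unknown form " + str(tag))

def subst(v, x, e):
    """[v/x]e for a closed value v (so no variable capture can occur)."""
    tag = e[0]
    if tag in ("num", "str"):
        return e
    if tag == "var":
        return v if e[1] == x else e
    if tag in ("plus", "cat"):
        return (tag, subst(v, x, e[1]), subst(v, x, e[2]))
    # let: if the bound name equals x, x is shadowed in the body
    body = e[3] if e[2] == x else subst(v, x, e[3])
    return ("let", subst(v, x, e[1]), e[2], body)

def step(e):
    """Dynamic semantics: return e' with e |-> e', or None if no rule applies."""
    tag = e[0]
    if tag in ("plus", "cat"):
        e1, e2 = e[1], e[2]
        if not is_val(e1):                       # search rule, left operand
            s = step(e1)
            return None if s is None else (tag, s, e2)
        if not is_val(e2):                       # search rule, right operand
            s = step(e2)
            return None if s is None else (tag, e1, s)
        if tag == "plus" and e1[0] == "num" and e2[0] == "num":
            return ("num", e1[1] + e2[1])        # instruction step
        if tag == "cat" and e1[0] == "str" and e2[0] == "str":
            return ("str", e1[1] + e2[1])        # instruction step
        return None
    if tag == "let":
        e1, x, e2 = e[1], e[2], e[3]
        if not is_val(e1):                       # search rule for let
            s = step(e1)
            return None if s is None else ("let", s, x, e2)
        return subst(e1, x, e2)                  # let instruction
    return None  # values and variables do not step

def check_safety(e):
    """Run e to a value, asserting preservation and progress at every step."""
    t = typeof(e)
    while not is_val(e):
        nxt = step(e)
        assert nxt is not None, "progress violated"
        assert typeof(nxt) == t, "preservation violated"
        e = nxt
    return e

prog = ("let", ("plus", ("num", 1), ("num", 2)), "x",
        ("plus", ("var", "x"), ("var", "x")))
print(typeof(prog), check_safety(prog))
```

Running `check_safety` on an ill-typed program raises `TypeError` before any step is taken, which mirrors the fact that both theorems only speak about well-typed expressions.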

Instruction Steps

The primitive operations are straightforward. We have $e = \mathtt{plus}(\mathtt{num}[n_1]; \mathtt{num}[n_2])$, $\tau = \mathtt{num}$, and $e' = \mathtt{num}[n_1 + n_2]$. Clearly $e' : \mathtt{num}$, as required. The cases for multiplication, concatenation and length are handled similarly.

The let instruction is a bit more complex. We require both the inversion and the substitution lemmas. We have $e = \mathtt{let}(e_1; x.e_2) : \tau$ and $e' = [e_1/x]e_2$. By inverting the typing of $e$, we have $e_1 : \tau_1$ for some $\tau_1$ such that $x : \tau_1 \vdash e_2 : \tau$. By substitution (see the first Structural Properties of Typing slide from the previous lecture) we have $[e_1/x]e_2 : \tau$, as required.

$$\frac{e_1 \mapsto e_1'}{\mathtt{plus}(e_1; e_2) \mapsto \mathtt{plus}(e_1'; e_2)}$$

We have $e = \mathtt{plus}(e_1; e_2)$, $e' = \mathtt{plus}(e_1'; e_2)$, and $e_1 \mapsto e_1'$. By inversion $e_1 : \mathtt{num}$ and $e_2 : \mathtt{num}$. By induction $e_1' : \mathtt{num}$, and hence $e' : \mathtt{num}$, as required. The case for the similar rule for multiplication is handled similarly.

$$\frac{e_1\ \mathsf{val} \qquad e_2 \mapsto e_2'}{\mathtt{plus}(e_1; e_2) \mapsto \mathtt{plus}(e_1; e_2')}$$

We have $e = \mathtt{plus}(e_1; e_2)$, $e' = \mathtt{plus}(e_1; e_2')$, and $e_2 \mapsto e_2'$. By inversion $e_1 : \mathtt{num}$ and $e_2 : \mathtt{num}$, so that by induction $e_2' : \mathtt{num}$, and hence $e' : \mathtt{num}$, as required. The case for the similar rule for multiplication is handled similarly.

$$\frac{e_1 \mapsto e_1'}{\mathtt{cat}(e_1; e_2) \mapsto \mathtt{cat}(e_1'; e_2)}$$

We have $e = \mathtt{cat}(e_1; e_2)$, $e' = \mathtt{cat}(e_1'; e_2)$, and $e_1 \mapsto e_1'$. By inversion $e_1 : \mathtt{str}$ and $e_2 : \mathtt{str}$. By induction $e_1' : \mathtt{str}$, and hence $e' : \mathtt{str}$, as required.

$$\frac{e_1\ \mathsf{val} \qquad e_2 \mapsto e_2'}{\mathtt{cat}(e_1; e_2) \mapsto \mathtt{cat}(e_1; e_2')}$$

We have $e = \mathtt{cat}(e_1; e_2)$, $e' = \mathtt{cat}(e_1; e_2')$, and $e_2 \mapsto e_2'$. By inversion $e_1 : \mathtt{str}$ and $e_2 : \mathtt{str}$, so that by induction $e_2' : \mathtt{str}$, and hence $e' : \mathtt{str}$, as required.

There is just one case for the by value interpretation of let expressions.

$$\frac{e_1 \mapsto e_1'}{\mathtt{let}(e_1; x.e_2) \mapsto \mathtt{let}(e_1'; x.e_2)}$$

We have $e = \mathtt{let}(e_1; x.e_2) : \tau$, $e' = \mathtt{let}(e_1'; x.e_2)$, and $e_1 \mapsto e_1'$. By inversion $e_1 : \tau_1$, for some type $\tau_1$ such that $x : \tau_1 \vdash e_2 : \tau$. By induction $e_1' : \tau_1$, and hence $e' : \tau$.

This completes the proof. How might it have failed? Only if some instruction is mis-defined. For example, if we had defined

$$\mathtt{plus}(\mathtt{num}[m]; \mathtt{num}[n]) \mapsto \begin{cases} \mathtt{str}[\text{zero}] & \text{if } m = n = 0 \\ \mathtt{num}[m + n] & \text{otherwise} \end{cases}$$

then preservation would fail. In other words, preservation says that the steps of evaluation are well-behaved.

Notice that if an instruction is undefined, this does not disturb preservation! For example, if we omitted the instruction for $\mathtt{plus}(\mathtt{num}[0]; \mathtt{num}[0])$, the proof would still go through! In other words, preservation alone is not enough to characterize safety.

Canonical Forms Lemma

The type system for L{num str} predicts the forms of values:

Lemma 2 (Canonical Forms for L{num str}) Suppose that $e : \tau$ and $e\ \mathsf{val}$.
1. If $\tau = \mathtt{str}$, then $e = \mathtt{str}[s]$ for some $s$.
2. If $\tau = \mathtt{num}$, then $e = \mathtt{num}[n]$ for some $n$.

Proof of Canonical Forms Lemma

The proof is by induction on typing. For example, for $e : \mathtt{str}$,
- $e$ cannot be a natural number, because $\mathtt{num} \neq \mathtt{str}$.
- $e$ cannot be a variable, because it is closed.
- $e$ can be a string constant, as specified.
- $e$ cannot be an application of a primitive operation, nor a let expression.

Theorem 3 (Progress) If $e : \tau$, then either $e$ is a value, or there exists $e'$ such that $e \mapsto e'$.

Proof: The proof is by induction on typing. We consider each typing rule in turn. For axioms, we must demonstrate the theorem directly. Otherwise, for each rule, we assume the theorem for the premises, and show it holds for the conclusion.
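The two remarks about how the preservation proof could fail can be tried out directly. The sketch below is an illustration only, restricted to numbers, strings and plus. The names `plus_good`, `plus_bad`, `plus_partial`, `make_step` and `preserves` are hypothetical; `plus_bad` is the mis-defined instruction from the text, and `plus_partial` omits the instruction for plus(num[0]; num[0]).

```python
# Tagged-tuple encoding assumed for this sketch:
#   ("num", n), ("str", s), ("plus", e1, e2)

def is_val(e):
    return e[0] in ("num", "str")

def typeof(e):
    if is_val(e):
        return e[0]
    if e[0] == "plus" and typeof(e[1]) == "num" and typeof(e[2]) == "num":
        return "num"
    raise TypeError("ill-typed expression")

def plus_good(m, n):
    return ("num", m + n)

def plus_bad(m, n):
    # The mis-defined instruction: yields a string when m = n = 0.
    return ("str", "zero") if m == 0 and n == 0 else ("num", m + n)

def plus_partial(m, n):
    # Instruction omitted for plus(num[0]; num[0]): no transition at all.
    return None if m == 0 and n == 0 else ("num", m + n)

def make_step(instr):
    """Build a one-step evaluator whose plus instruction is `instr`."""
    def step(e):
        if is_val(e):
            return None
        _, e1, e2 = e
        if not is_val(e1):
            s = step(e1)
            return None if s is None else ("plus", s, e2)
        if not is_val(e2):
            s = step(e2)
            return None if s is None else ("plus", e1, s)
        if e1[0] == "num" and e2[0] == "num":
            return instr(e1[1], e2[1])
        return None
    return step

def preserves(step, e):
    """True if e has no step, or if its step has the same type as e."""
    nxt = step(e)
    return nxt is None or typeof(nxt) == typeof(e)

zero_plus_zero = ("plus", ("num", 0), ("num", 0))
print(preserves(make_step(plus_good), zero_plus_zero))     # True
print(preserves(make_step(plus_bad), zero_plus_zero))      # False
print(preserves(make_step(plus_partial), zero_plus_zero))  # True (vacuously)
```

The third result is the point of the remark: with the instruction omitted there is no step to check, so preservation holds vacuously even though evaluation can no longer continue.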

The expression cannot be a variable, because it is closed. For natural numbers or string constants the result is immediate because they are values.

Consider the rule for typing addition expressions. We have $e = \mathtt{plus}(e_1; e_2)$ and $\tau = \mathtt{num}$, with $e_1 : \mathtt{num}$ and $e_2 : \mathtt{num}$. By induction, either $e_1$ is a value, or $e_1 \mapsto e_1'$ for some expression $e_1'$. We consider these two cases in turn.
- If $e_1 \mapsto e_1'$, then $e \mapsto e'$, where $e' = \mathtt{plus}(e_1'; e_2)$, which completes this case.
- If $e_1$ is a value, then we note that by the canonical forms lemma $e_1 = \mathtt{num}[n_1]$ for some $n_1$, and we consider $e_2$. By induction either $e_2$ is a value, or $e_2 \mapsto e_2'$ for some expression $e_2'$. If $e_2$ is a value, then by the canonical forms lemma $e_2 = \mathtt{num}[n_2]$ for some $n_2$, and we note that $e \mapsto e'$, where $e' = \mathtt{num}[n_1 + n_2]$. If $e_2$ is not a value, then $e \mapsto e'$, where $e' = \mathtt{plus}(e_1; e_2')$.

Suppose that $e = \mathtt{let}(e_1; x.e_2)$. Consider the case for the by value interpretation. By the inductive hypothesis, either $e_1$ is a value, or there exists $e_1'$ such that $e_1 \mapsto e_1'$.
- If $e_1$ is not a value, then $e \mapsto \mathtt{let}(e_1'; x.e_2)$ by the search rule for let expressions, as required.
- If $e_1$ is a value, then by the inversion for typing lemma $e_1 : \tau_1$ for some $\tau_1$ such that $x : \tau_1 \vdash e_2 : \tau$, and $e \mapsto e'$, where $e' = [e_1/x]e_2$, by the rule for executing let expressions.

The other cases are handled similarly. How could the proof have failed?
1. Some instruction step was omitted. If there were no instructions for $\mathtt{plus}(\mathtt{num}[n_1]; \mathtt{num}[n_2])$, then progress would fail.
2. Some search rule was omitted. If there were no rule for, say, $\mathtt{plus}(e_1; e_2)$, where $e_1$ is not a value, then we cannot make progress.
In other words, progress implies that we cannot find ourselves in an embarrassing situation!

Extending the L{num str} Language

We deliberately omitted division from L{num str}.
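Suppose we add div as a primitive operation and define the following evaluation rules for it:

$$\frac{n_2 \neq 0}{\mathtt{div}(\mathtt{num}[n_1]; \mathtt{num}[n_2]) \mapsto \mathtt{num}[n_1 \div n_2]}$$

$$\frac{e_1 \mapsto e_1'}{\mathtt{div}(e_1; e_2) \mapsto \mathtt{div}(e_1'; e_2)} \qquad \frac{e_1\ \mathsf{val} \qquad e_2 \mapsto e_2'}{\mathtt{div}(e_1; e_2) \mapsto \mathtt{div}(e_1; e_2')}$$

Suppose the static semantics gives the following typing to div:

$$\frac{\Gamma \vdash e_1 : \mathtt{num} \qquad \Gamma \vdash e_2 : \mathtt{num}}{\Gamma \vdash \mathtt{div}(e_1; e_2) : \mathtt{num}}$$

Is the language still safe?
- Preservation continues to hold: the new instruction preserves type.
- Progress fails: $\mathtt{div}(\mathtt{num}[10]; \mathtt{num}[0]) \not\mapsto$ (it cannot take a step), yet it has type num.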
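The failure of progress can be observed with a small Python sketch of the div rules above. This is only an illustration: the tagged-tuple encoding, the helper `is_stuck`, and the use of floor division `//` for the quotient on natural numbers are assumptions of the sketch.

```python
# Tagged-tuple encoding assumed for this sketch: ("num", n), ("div", e1, e2)

def is_val(e):
    return e[0] == "num"

def typeof(e):
    """The typing rule for div: both operands must have type num."""
    if e[0] == "num":
        return "num"
    if e[0] == "div" and typeof(e[1]) == "num" and typeof(e[2]) == "num":
        return "num"
    raise TypeError("ill-typed expression")

def step(e):
    """One transition e |-> e', or None if no rule applies."""
    if is_val(e):
        return None
    _, e1, e2 = e
    if not is_val(e1):                      # search rule, left operand
        s = step(e1)
        return None if s is None else ("div", s, e2)
    if not is_val(e2):                      # search rule, right operand
        s = step(e2)
        return None if s is None else ("div", e1, s)
    if e2[1] != 0:                          # instruction, side condition n2 != 0
        return ("num", e1[1] // e2[1])
    return None                             # no rule covers a zero divisor

def is_stuck(e):
    """Not a value, yet no transition: exactly what progress forbids."""
    return not is_val(e) and step(e) is None

ok = ("div", ("num", 10), ("num", 2))
bad = ("div", ("num", 10), ("num", 0))
print(typeof(ok), step(ok), is_stuck(ok))
print(typeof(bad), step(bad), is_stuck(bad))
```

Both expressions have type num, but only the first can take a step; the second is well-typed and stuck, so the language as extended is not safe.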

How can we recover safety?
1. Strengthen the type system to rule out the offending case.
2. Change the dynamic semantics to avoid getting stuck when the denominator is zero.

Extending the Type System

A natural idea: add a type nzint of non-zero integers. Revise the typing rule for division to:

$$\frac{\Gamma \vdash e_1 : \mathtt{num} \qquad \Gamma \vdash e_2 : \mathtt{nzint}}{\Gamma \vdash \mathtt{div}(e_1; e_2) : \mathtt{num}}$$

But how do we create expressions of type nzint?
- This type does not have good closure properties, e.g. it is not closed under subtraction.
- It is undecidable in general whether $e : \mathtt{num}$ evaluates to a non-zero integer.

Modifying the Dynamic Semantics

One idea: an inductively defined judgement, $e\ \mathsf{err}$, stating that expression $e$ incurred a checked error, such as zero denominator or array index out of bounds, at run time.
- Add rules to detect run-time errors.
- Add rules to propagate errors through program execution.
- Revise the statement of safety to account for errors. A program has an answer that is either a value or an error.

Adding Errors

For example, we add a rule for handling a zero divisor:

$$\frac{e_1\ \mathsf{val}}{\mathtt{div}(e_1; \mathtt{num}[0])\ \mathsf{err}}$$

Then we must propagate errors upwards:

$$\frac{e_1\ \mathsf{err}}{\mathtt{plus}(e_1; e_2)\ \mathsf{err}} \qquad \frac{e_1\ \mathsf{val} \qquad e_2\ \mathsf{err}}{\mathtt{plus}(e_1; e_2)\ \mathsf{err}}$$

and so on for the other non-value expression forms.

Proving Progress

Theorem 4 (Progress With Error) If $e : \tau$, then either $e\ \mathsf{err}$, or $e\ \mathsf{val}$, or there exists $e'$ such that $e \mapsto e'$.

Proof: The proof is by induction on typing and proceeds much as before, except that there are now three cases to consider at each point in the proof.

Adding a separate set of evaluation rules to check for errors seems unnatural. An alternative is to fold error checking into evaluation.
- Introduce a special expression, error, that signals a checked error has arisen, such as zero denominator or array index out of bounds.
- Revise typing, dynamic semantics and statement of safety to account for errors.
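The separate judgement e err described above can be sketched as a predicate that sits next to the transition function. In the Python illustration below, the encoding and the names `err`, `step` and `status` are assumptions of the sketch. Every expression in this fragment has type num, so no type checker is included. The function `status` reports which of the alternatives of Progress With Error holds, and would report "stuck" only if none did.

```python
# Tagged-tuple encoding assumed for this sketch:
#   ("num", n), ("plus", e1, e2), ("div", e1, e2)

def is_val(e):
    return e[0] == "num"

def err(e):
    """The judgement e err: e has incurred a checked error."""
    if is_val(e):
        return False
    tag, e1, e2 = e
    if err(e1):                          # propagate from the left operand
        return True
    if is_val(e1) and err(e2):           # propagate from the right operand
        return True
    # detection rule: e1 val and a zero divisor
    return tag == "div" and is_val(e1) and e2 == ("num", 0)

def step(e):
    """One transition e |-> e', or None if no transition rule applies."""
    if is_val(e):
        return None
    tag, e1, e2 = e
    if not is_val(e1):
        s = step(e1)
        return None if s is None else (tag, s, e2)
    if not is_val(e2):
        s = step(e2)
        return None if s is None else (tag, e1, s)
    n1, n2 = e1[1], e2[1]
    if tag == "plus":
        return ("num", n1 + n2)
    return ("num", n1 // n2) if n2 != 0 else None

def status(e):
    """Which alternative of Progress With Error holds for e."""
    if err(e):
        return "err"
    if is_val(e):
        return "val"
    return "step" if step(e) is not None else "stuck"

zero_div = ("div", ("num", 10), ("num", 0))
later = ("plus", ("plus", ("num", 1), ("num", 2)), zero_div)
print(status(zero_div))       # err
print(status(later))          # step: the left operand is evaluated first
print(status(step(later)))    # err: now the left operand is a value
```

The second example shows why the propagation rule for the right operand requires the left operand to be a value: with left-to-right evaluation, the zero divisor is only reached once the left operand has been reduced.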

Adding Errors

Since it aborts computation, the static semantics assigns an arbitrary type to error:

$$\mathtt{error} : \tau$$

The dynamic semantics must be modified in two ways:
- Primitive operations must yield an error in an otherwise undefined state.
- Search rules must propagate errors once they arise.

For example, we add an error transition for zero divisor:

$$\frac{e_1\ \mathsf{val}}{\mathtt{div}(e_1; \mathtt{num}[0]) \mapsto \mathtt{error}}$$

Then we must propagate errors upwards:

$$\mathtt{plus}(\mathtt{error}; e_2) \mapsto \mathtt{error} \qquad \frac{e_1\ \mathsf{val}}{\mathtt{plus}(e_1; \mathtt{error}) \mapsto \mathtt{error}}$$

and so on for the other non-value expression forms. Defining $e\ \mathsf{err}$ to hold exactly when $e = \mathtt{error}$, the Progress With Error theorem still holds.

Summary

Type safety expresses the coherence of the static and dynamic semantics.
- Coherence is elegantly expressed as the conjunction of preservation and progress.

Checked errors ensure that behavior is well-defined, even in the presence of undefined operations.
- Explicitly circumscribe error transitions.
- Explicitly define which states lead to an error.
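The alternative with an explicit error expression, folded into evaluation, can be sketched as follows. The encoding, the constant `ERROR` and the function names `step` and `run` are assumptions made for this illustration. The sketch covers only numbers, plus, div and error, so every expression has type num and no type checker is shown.

```python
# Tagged-tuple encoding assumed for this sketch:
#   ("num", n), ("plus", e1, e2), ("div", e1, e2), ("error",)

ERROR = ("error",)

def is_val(e):
    return e[0] == "num"

def step(e):
    """One transition e |-> e'; None only for values and for error itself."""
    if is_val(e) or e == ERROR:
        return None
    tag, e1, e2 = e
    if e1 == ERROR:                       # propagate: op(error; e2) |-> error
        return ERROR
    if not is_val(e1):                    # search rule, left operand
        return (tag, step(e1), e2)
    if e2 == ERROR:                       # propagate: op(v; error) |-> error
        return ERROR
    if not is_val(e2):                    # search rule, right operand
        return (tag, e1, step(e2))
    n1, n2 = e1[1], e2[1]
    if tag == "plus":
        return ("num", n1 + n2)
    # div: a zero divisor now yields error instead of getting stuck
    return ERROR if n2 == 0 else ("num", n1 // n2)

def run(e):
    """Iterate step until an answer is reached: a value or error."""
    while not (is_val(e) or e == ERROR):
        e = step(e)
    return e

good = ("plus", ("num", 1), ("div", ("num", 10), ("num", 2)))
bad = ("plus", ("num", 1), ("div", ("num", 10), ("num", 0)))
print(run(good))   # ('num', 6)
print(run(bad))    # ('error',)
```

In this version every non-value expression other than error has a transition, so `run` always ends in an answer, matching the revised statement of safety.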