CSE 230: Winter 200
Principles of Programming Languages
Lecture 4: Sum, Product, Recursive Types

Ranjit Jhala
UC San Diego

The end is nigh
- HW 3
- No HW 4 (= Final)
- Project (Meeting + Talk)

Recap
Goal: relate static types with dynamic semantics.
Theorem: the result has the same type as the expression.
The theorem does not say that evaluation yields a result, i.e. that:
1. Evaluation never gets stuck (applying a non-function, adding non-integers)
2. Evaluation terminates

Preservation and Progress
Type preservation theorem: if $\vdash e : \tau$ and $e \to e'$ then $\vdash e' : \tau$.
- Follows from the decomposition lemma
Progress theorem: if $\vdash e : \tau$ and $e$ is a non-value, then there exists $e'$ such that $e$ can progress: $e \to e'$.
In English:
1. A well-typed expression can make progress
2. Type is preserved, i.e. the result of progress is well-typed
3. Go to 1 (progress continues, i.e. the expression never gets stuck)
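An Alternative: Explicit Errors
Same results via big-step semantics:
- Introduce an error value wrong
- wrong is the result (meaning) of stuck expressions
- Prove the theorem: well-typed programs don't go wrong (i.e. never evaluate to wrong)

Formalize first-order type systems
- Simple types (integers and booleans)
- Function types (simply typed λ-calculus)
- Structured types (products and sums)
- Recursive types (lists, trees)
- Imperative types (pointers and exceptions)
- Subtyping

Product Types: Static Semantics
Extend the syntax with (binary) tuples:
$$e ::= \dots \mid (e_1, e_2) \mid \mathsf{fst}\ e \mid \mathsf{snd}\ e$$
$$\tau ::= \dots \mid \tau_1 \times \tau_2$$
This language is sometimes called $F_1^{\times}$.
Typing judgment $\Gamma \vdash e : \tau$:
$$\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash (e_1, e_2) : \tau_1 \times \tau_2}$$
$$\frac{\Gamma \vdash e : \tau_1 \times \tau_2}{\Gamma \vdash \mathsf{fst}\ e : \tau_1} \qquad \frac{\Gamma \vdash e : \tau_1 \times \tau_2}{\Gamma \vdash \mathsf{snd}\ e : \tau_2}$$

Product Types: Dynamic Semantics
New form of values:
$$v ::= \dots \mid (v_1, v_2)$$
New (big-step) evaluation rules:
$$\frac{e_1 \Downarrow v_1 \qquad e_2 \Downarrow v_2}{(e_1, e_2) \Downarrow (v_1, v_2)} \qquad \frac{e \Downarrow (v_1, v_2)}{\mathsf{fst}\ e \Downarrow v_1} \qquad \frac{e \Downarrow (v_1, v_2)}{\mathsf{snd}\ e \Downarrow v_2}$$
New contexts:
$$H ::= \dots \mid (H_1, e_2) \mid (v_1, H_2) \mid \mathsf{fst}\ H \mid \mathsf{snd}\ H$$
New redexes:
$$\mathsf{fst}\ (v_1, v_2) \to v_1 \qquad \mathsf{snd}\ (v_1, v_2) \to v_2$$
Type soundness holds just as before.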
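The product rules above can be tried out directly in OCaml, whose built-in pairs behave like the binary tuples of the slides. This is only a sketch: the names `p`, `first`, `second`, and `swap` are made up for illustration and do not come from the lecture.

```ocaml
(* A pair of an int and a string has the product type int * string.
   Both components are evaluated to values before the pair is built,
   as in the big-step rule for (e1, e2). *)
let p : int * string = (1 + 2, "moo")

(* fst and snd are the two projections; OCaml's standard library provides both. *)
let first : int = fst p
let second : string = snd p

(* Hypothetical helper: swapping the components uses both projections
   (here written as a pattern on the pair). *)
let swap ((a, b) : 'a * 'b) : 'b * 'a = (b, a)
```

The type checker rejects `fst 5`, which is the static counterpart of "evaluation never gets stuck on a projection".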
Records = tuples with labels
New form of expressions:
$$e ::= \dots \mid \{L_1 = e_1, \dots, L_n = e_n\} \mid e.L$$
New form of values:
$$v ::= \dots \mid \{L_1 = v_1, \dots, L_n = v_n\}$$
New form of types:
$$\tau ::= \dots \mid \{L_1 : \tau_1, \dots, L_n : \tau_n\}$$
Everything else is similar to $F_1^{\times}$:
- typing rules
- evaluation rules
- type soundness

Sum Types
Types of the form:
- either an int or a string
- either 0 or moo
Also known as disjoint union types.

Sum Types
New form of expressions and types:
$$e ::= \dots \mid \mathsf{injl}\ e \mid \mathsf{injr}\ e \mid (\mathsf{match}\ e\ \mathsf{with}\ \mathsf{injl}\ x \Rightarrow e_1 \mid \mathsf{injr}\ y \Rightarrow e_2)$$
$$\tau ::= \dots \mid \tau_1 + \tau_2$$
A value of type $\tau_1 + \tau_2$ is either a $\tau_1$ or a $\tau_2$.
Like unions in C or Pascal, but safe: the compiler knows which kind of value it has.
match-with is a binding operator:
- $x$ is bound in $e_1$ (if the injl case holds)
- $y$ is bound in $e_2$ (if the injr case holds)

Examples with Sum Types
Consider the type unit, with a single element ().
type optional integer = unit + int
Useful for optional arguments or return values:
- No argument: injl ()
- Argument is 5: injr 5
To use the argument, test it:
  match arg with injl x ⇒ no-arg-case | injr y ⇒ ...y...
injl and injr are tags; match is tag checking.
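The optional-integer example can be sketched in OCaml with a hand-rolled binary sum. The type `sum`, its constructors `Injl` and `Injr`, and the helper `get_or_default` are illustrative names chosen here to mirror the slide's injl and injr; OCaml's own `option` type would be the idiomatic choice in practice.

```ocaml
(* A generic binary sum type: Injl and Injr play the role of the tags. *)
type ('a, 'b) sum = Injl of 'a | Injr of 'b

(* optional integer = unit + int *)
type optional_int = (unit, int) sum

let no_arg : optional_int = Injl ()      (* no argument *)
let arg_five : optional_int = Injr 5     (* argument is 5 *)

(* Hypothetical helper: to use the argument, test its tag with match. *)
let get_or_default (default : int) (arg : optional_int) : int =
  match arg with
  | Injl () -> default   (* no-arg case *)
  | Injr y -> y          (* y is bound to the payload in this branch only *)
```

For instance, `get_or_default 0 arg_five` takes the `Injr` branch and returns the payload, while `get_or_default 0 no_arg` falls back to the default.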
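Examples with Sum Types
- bool ≜ unit + unit
- true ≜ injl ()
- false ≜ injr ()
- if e then e1 else e2 ≜ match e with injl x ⇒ e1 | injr y ⇒ e2

Static Semantics of Sum Types
Typing rules:
$$\frac{\Gamma \vdash e : \tau_1}{\Gamma \vdash \mathsf{injl}\ e : \tau_1 + \tau_2} \qquad \frac{\Gamma \vdash e : \tau_2}{\Gamma \vdash \mathsf{injr}\ e : \tau_1 + \tau_2}$$
$$\frac{\Gamma \vdash e : \tau_1 + \tau_2 \qquad \Gamma, x : \tau_1 \vdash e_l : \tau \qquad \Gamma, y : \tau_2 \vdash e_r : \tau}{\Gamma \vdash \mathsf{match}\ e\ \mathsf{with}\ \mathsf{injl}\ x \Rightarrow e_l \mid \mathsf{injr}\ y \Rightarrow e_r : \tau}$$
Types are not unique (without annotations for sums):
- injl 1 : int + bool
- injl 1 : int + (int → int)

Dynamic Semantics of Sum Types
New values:
$$v ::= \dots \mid \mathsf{injl}\ v \mid \mathsf{injr}\ v$$
New evaluation rules:
$$\frac{e \Downarrow v}{\mathsf{injl}\ e \Downarrow \mathsf{injl}\ v} \qquad \frac{e \Downarrow v}{\mathsf{injr}\ e \Downarrow \mathsf{injr}\ v}$$
$$\frac{e \Downarrow \mathsf{injl}\ v \qquad [v/x]e_l \Downarrow v'}{\mathsf{match}\ e\ \mathsf{with}\ \mathsf{injl}\ x \Rightarrow e_l \mid \mathsf{injr}\ y \Rightarrow e_r \Downarrow v'}$$
$$\frac{e \Downarrow \mathsf{injr}\ v \qquad [v/y]e_r \Downarrow v'}{\mathsf{match}\ e\ \mathsf{with}\ \mathsf{injl}\ x \Rightarrow e_l \mid \mathsf{injr}\ y \Rightarrow e_r \Downarrow v'}$$

Type Soundness for $F_1^{+}$
Type soundness still holds:
- Similar, more tedious proof (more cases)
Cannot use a $\tau_1 + \tau_2$ inappropriately.
- Key: the only way to use a $\tau_1 + \tau_2$ is with match-with, which ensures that one cannot use a $\tau_1$ as a $\tau_2$
- In C/Java, tag checking is up to the programmer, i.e. unsafe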
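The point that match-with is the only way to consume a sum, together with the encoding of bool as unit + unit from this section, can be sketched in OCaml. All names below (`sum`, `boolean`, `tru`, `fls`, `if_then_else`, `not_`) are illustrative assumptions. The branches are passed as thunks because OCaml is call-by-value and would otherwise evaluate both arms.

```ocaml
(* A generic binary sum type, standing in for tau1 + tau2. *)
type ('a, 'b) sum = Injl of 'a | Injr of 'b

(* bool = unit + unit, true = injl (), false = injr () *)
type boolean = (unit, unit) sum
let tru : boolean = Injl ()
let fls : boolean = Injr ()

(* if e then e1 else e2 = match e with injl x => e1 | injr y => e2.
   Each branch is a thunk so that only the selected one runs. *)
let if_then_else (c : boolean) (e1 : unit -> 'a) (e2 : unit -> 'a) : 'a =
  match c with
  | Injl _ -> e1 ()
  | Injr _ -> e2 ()

(* Hypothetical example: negation written with the encoded conditional. *)
let not_ (b : boolean) : boolean =
  if_then_else b (fun () -> fls) (fun () -> tru)
```

There is no way to read the payload of a `boolean` without a `match`, so the tag is always checked; this is the safety property a C union lacks.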
Recursive Types (e.g. Lists)
What is a list? How can we describe it using known types?
A list of elements of type α (i.e. an α list) is:
- either empty
- or a pair of an α and an α list
$$\alpha\ \mathsf{list} = \mathsf{unit} + (\alpha \times \alpha\ \mathsf{list})$$
What does this remind you of?
Write $t$ for α list. Hmm, another recursive equation:
$$t = \mathsf{unit} + (\alpha \times t)$$
Write it as $t = \tau(t)$.
- The type variable $t$ occurs in, and is bound in, $\tau$
Introduce a recursive type constructor:
- $\mu t.\tau$ = least fixpoint solution of the equation
- α list is defined as $\mu t.(\mathsf{unit} + \alpha \times t)$
- Allows unnamed recursive types

Manipulating $\mu t.\tau$
Introduce syntactic operations to convert between $\mu t.\tau$ and $[\mu t.\tau/t]\tau$,
e.g. between α list and unit + α × α list.
$$\tau ::= \dots \mid t \mid \mu t.\tau$$
$$e ::= \dots \mid \mathsf{fold}_{\mu t.\tau}\ e \mid \mathsf{unfold}_{\mu t.\tau}\ e$$
Intuition:
- $\mathsf{fold}_{\mu t.\tau}$ takes a value of the unfolded type $[\mu t.\tau/t]\tau$ and turns it into a $\mu t.\tau$
- $\mathsf{unfold}_{\mu t.\tau}$ takes a $\mu t.\tau$ value and turns it into a value of the unfolded type $[\mu t.\tau/t]\tau$
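One way to see fold and unfold concretely is to model the recursive type with a single-constructor OCaml datatype whose payload is the unfolded type. This is a sketch under that assumption, specialised to integer lists; `sum`, `int_list`, `Fold`, `fold`, `unfold`, and `one_cell` are names invented for the illustration.

```ocaml
(* A generic binary sum type, standing in for tau1 + tau2. *)
type ('a, 'b) sum = Injl of 'a | Injr of 'b

(* int_list stands for  mu t. (unit + int * t).
   The payload of Fold is exactly the unfolding  unit + int * int_list. *)
type int_list = Fold of (unit, int * int_list) sum

(* fold: from the unfolded type to the recursive type. *)
let fold (x : (unit, int * int_list) sum) : int_list = Fold x

(* unfold: from the recursive type back to its unfolding. *)
let unfold (Fold x : int_list) : (unit, int * int_list) sum = x

(* A one-element list, built by folding at every level. *)
let one_cell : int_list = fold (Injr (7, fold (Injl ())))
```

`unfold` undoes `fold`, so the two types are isomorphic, yet the type checker keeps them distinct: a `match` on tags is only allowed after an explicit `unfold`.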
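Example with Recursive Types
Lists:
$$\alpha\ \mathsf{list} = \mu t.(\mathsf{unit} + \alpha \times t)$$
$$\mathsf{nil}_\alpha = \mathsf{fold}_{\alpha\ \mathsf{list}}\ (\mathsf{injl}\ ())$$
$$\mathsf{cons}_\alpha = \lambda x{:}\alpha.\ \lambda L{:}\alpha\ \mathsf{list}.\ \mathsf{fold}_{\alpha\ \mathsf{list}}\ (\mathsf{injr}\ (x, L))$$
List length function:
$$\mathsf{length}_\alpha = \lambda L{:}\alpha\ \mathsf{list}.\ \mathsf{match}\ (\mathsf{unfold}_{\alpha\ \mathsf{list}}\ L)\ \mathsf{with}\ \mathsf{injl}\ x \Rightarrow 0 \mid \mathsf{injr}\ y \Rightarrow 1 + \mathsf{length}_\alpha\ (\mathsf{snd}\ y)$$
Check that:
- $\mathsf{nil}_\alpha : \alpha\ \mathsf{list}$
- $\mathsf{cons}_\alpha : \alpha \to \alpha\ \mathsf{list} \to \alpha\ \mathsf{list}$
- $\mathsf{length}_\alpha : \alpha\ \mathsf{list} \to \mathsf{int}$

Static Semantics of Recursive Types
$$\frac{\Gamma \vdash e : \mu t.\tau}{\Gamma \vdash \mathsf{unfold}_{\mu t.\tau}\ e : [\mu t.\tau/t]\tau} \qquad \frac{\Gamma \vdash e : [\mu t.\tau/t]\tau}{\Gamma \vdash \mathsf{fold}_{\mu t.\tau}\ e : \mu t.\tau}$$
- The rules are syntax directed
- Often, for simplicity, fold/unfold are omitted

Dynamics of Recursive Types
Add a new form of values:
$$v ::= \dots \mid \mathsf{fold}_{\mu t.\tau}\ v$$
- fold ensures that the value has the recursive type, not its unfolding
The evaluation rules:
$$\frac{e \Downarrow v}{\mathsf{fold}_{\mu t.\tau}\ e \Downarrow \mathsf{fold}_{\mu t.\tau}\ v} \qquad \frac{e \Downarrow \mathsf{fold}_{\mu t.\tau}\ v}{\mathsf{unfold}_{\mu t.\tau}\ e \Downarrow v}$$
- The folding annotations are for type checking only
- They can be dropped after type checking

Recursive Types in ML
A syntactic trick avoids explicit un/fold: combine recursive and union types!
  datatype t = C1 of τ1 | C2 of τ2 | ... | Cn of τn
- Recursive: t can appear in τi
  datatype intlist = Nil of unit | Cons of int * intlist
- Programmer writes: Cons (5, l)
- Compiler reads: fold_intlist (injr (5, l))
- Programmer writes: match e with Nil _ ⇒ ... | Cons (h, t) ⇒ ...
- Compiler reads: match unfold_intlist e with injl _ ⇒ ... | injr (h, t) ⇒ ...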
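The intlist datatype and the length function carry over to OCaml almost verbatim (OCaml writes `type` where the slide's ML syntax writes `datatype`). The value `example` is a made-up test list; the rest follows the slides.

```ocaml
(* Recursive and sum types combined in one declaration:
   intlist appears in the argument type of its own constructor Cons. *)
type intlist = Nil of unit | Cons of int * intlist

(* length, with the fold/unfold steps left implicit in the constructors
   and in the pattern match. *)
let rec length (l : intlist) : int =
  match l with
  | Nil _ -> 0
  | Cons (_, t) -> 1 + length t

(* Hypothetical example list: 5 :: 7 :: nil *)
let example : intlist = Cons (5, Cons (7, Nil ()))
```

Each use of `Cons` corresponds to a fold of an injr, and each `match` to an unfold followed by a tag check, so the programmer never writes either operation by hand.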
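Encoding CBV λ-calculus in $F_1^{\mu}$
$F_1$ cannot encode non-terminating computations:
- Cannot encode recursion
- Cannot write λx. x x (self-application)
Recursive types level the playing field:
- The calculus is called $F_1^{\mu}$
- This typed λ-calculus is as expressive as the untyped λ-calculus!
Convert CBV λ-calculus terms to CBV $F_1^{\mu}$.

Untyped programming in $F_1^{\mu}$
$\underline{e}$: conversion of the term $e$ to $F_1^{\mu}$
The trick? The type of $\underline{e}$ is $V = \mu t.\ t \to t$.
Conversion rules:
$$\underline{x} = x$$
$$\underline{\lambda x.\ e} = \mathsf{fold}_V\ (\lambda x{:}V.\ \underline{e})$$
$$\underline{e_1\ e_2} = (\mathsf{unfold}_V\ \underline{e_1})\ \underline{e_2}$$
Verify that:
1. $\vdash \underline{e} : V$
2. $e \Downarrow v$ if and only if $\underline{e} \Downarrow \underline{v}$
Non-terminating computation:
$$D = (\lambda x{:}V.\ (\mathsf{unfold}_V\ x)\ x)\ (\mathsf{fold}_V\ (\lambda x{:}V.\ (\mathsf{unfold}_V\ x)\ x))$$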
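The universal type V and the conversion rules can be sketched in OCaml with a single-constructor datatype playing the role of fold. The names `v`, `Fold`, `unfold`, `app`, `id`, `delta`, and `omega` are assumptions made for this illustration. `omega` is wrapped in a function so that defining it does not start the infinite computation.

```ocaml
(* V = mu t. t -> t : every value is a (folded) function from V to V. *)
type v = Fold of (v -> v)

let unfold (Fold f : v) : v -> v = f

(* Translation of an application e1 e2: (unfold e1) e2. *)
let app (e1 : v) (e2 : v) : v = (unfold e1) e2

(* Translation of \x. x *)
let id : v = Fold (fun x -> x)

(* Translation of the self-application \x. x x,
   which is not typable without the recursive type. *)
let delta : v = Fold (fun x -> app x x)

(* The non-terminating computation D, delayed behind a unit argument.
   Calling omega () never returns. *)
let omega () : v = app delta delta
```

Applying `delta` to `id` terminates, since it reduces to `id` applied to itself; applying `delta` to `delta` reproduces the same application forever, which is the behaviour of D above.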