CS 4110 Programming Languages & Logics Lecture 35 Typed Assembly Language 1 December 2014
Announcements 2 Foster Office Hours today 4-5pm Optional Homework 10 out Final practice problems out this week
Schedule 3 One and a half weeks ago... Typed Assembly Language Last Monday Polymorphism Stack Types Today Compilation
Certified Compilation 4 An automatic way to generate certifiable low-level code Type safety implies memory, security properties
Certified Compilation 4 An automatic way to generate certifiable low-level code Type safety implies memory, security properties Tiny Source language
Certified Compilation 4 An automatic way to generate certifiable low-level code Type safety implies memory, security properties Tiny Source language Typed Assembly Language Compiler target
Certified Compilation 4 An automatic way to generate certifiable low-level code Type safety implies memory, security properties Tiny Source language Typed Assembly Language Compiler target Optimized Typed Assembly Language Optimizer target
The Tiny Language 5 Features Integer expressions Conditionals Recursive functions Function pointers (but no closures) A strong, static type system
The Tiny Language 5 Features Integer expressions Conditionals Recursive functions Function pointers (but no closures) A strong, static type system Example program letrec fun fact (n:int) : int = if n = 0 then 1 else n * fact (n 1) in fact (42)
Tiny Type System 6 Expressions Φ x : Φ(x)
Tiny Type System 6 Expressions Φ x : Φ(x) Φ f : Φ(f)
Tiny Type System 6 Expressions Φ x : Φ(x) Φ f : Φ(f) Φ n : int
Tiny Type System 6 Expressions Φ x : Φ(x) Φ f : Φ(f) Φ n : int Φ e 1 : int Φ e 2 : int Φ e 1 + e 2 : int
Tiny Type System 6 Expressions Φ x : Φ(x) Φ f : Φ(f) Φ n : int Φ e 1 : int Φ e 1 + e 2 : int Φ e 2 : int Φ e 1 : τ 1 τ 2 Φ e 2 : τ 1 Φ e 1 e 2 : τ 2
Tiny Type System 6 Expressions Φ x : Φ(x) Φ f : Φ(f) Φ n : int Φ e 1 : int Φ e 1 + e 2 : int Φ e 2 : int Φ e 1 : τ 1 τ 2 Φ e 2 : τ 1 Φ e 1 e 2 : τ 2 Φ e 1 : int Φ e 2 : τ Φ e 3 : τ Φ if e 1 = 0 then e 2 else e 3 : τ
Tiny Type System 6 Expressions Φ x : Φ(x) Φ f : Φ(f) Φ n : int Φ e 1 : int Φ e 1 + e 2 : int Φ e 2 : int Φ e 1 : τ 1 τ 2 Φ e 2 : τ 1 Φ e 1 e 2 : τ 2 Φ e 1 : int Φ e 2 : τ Φ e 3 : τ Φ if e 1 = 0 then e 2 else e 3 : τ Φ e 1 : τ 1 Φ, x : τ 1 e 2 : τ 2 Φ let x = e 1 in e 2 : τ 1
Tiny Type System 7 Declarations Φ, x : τ 1 e : τ 2 Φ fun f(x : τ 1 ) : τ 2 = e : (f : τ 1 τ 2 )
Tiny Type System 7 Declarations Φ, x : τ 1 e : τ 2 Φ fun f(x : τ 1 ) : τ 2 = e : (f : τ 1 τ 2 ) Programs Φ = f 1 : τ 11 τ 12,..., f k : τ k1 τ k2 Φ d i : (f i : τ i1 τ i2 ) Φ e : int letrec d 1 d k in e
Tiny Type System 7 Declarations Φ, x : τ 1 e : τ 2 Φ fun f(x : τ 1 ) : τ 2 = e : (f : τ 1 τ 2 ) Programs Φ = f 1 : τ 11 τ 12,..., f k : τ k1 τ k2 Φ d i : (f i : τ i1 τ i2 ) Φ e : int letrec d 1 d k in e Exercise: Check that fact is well typed
Type-preserving Compilation 8 Most compilers consist of a series of transformations In our compiler, each step will propagate types A transformation consists of two sub-translations: From source types to target types From source terms to target terms After each step, can typecheck the intermediate output code to detect errors in the compiler Key correctness property will be that transformations preserve types (as well as semantics)
Type Translation 9 T [[ ]] maps Tiny types to TAL Types
Type Translation 9 T [[ ]] maps Tiny types to TAL Types Integers: T [[int]] = int
Type Translation 9 T [[ ]] maps Tiny types to TAL Types Integers: T [[int]] = int Functions: Caller pushes argument and return address on stack Callee pops the return address and argument and places result in ra T [[τ 1 τ 2 ]] = ρ.{sp : K[[τ 2, ρ]] :: T [[τ 1 ]] :: ρ} {} where K[[τ, σ]] = {sp : σ, r a : T [[τ]]} {} K[[τ]] = {r a : T [[τ]]} {}
Expression Translation 10 E[[ ]] maps Tiny expressions (really typing derivations) to labeled code blocks
Expression Translation 10 E[[ ]] maps Tiny expressions (really typing derivations) to labeled code blocks We use the stack extensively: To evaluate an expression, evaluate subexpressions then push them onto the stack Return final value in ra M maps expression variables to stack offsets I(M) increments the offset associated with each variable in M Overall, uses just two registers, r a and r t, and the stack Shape of the translation is E[[e]] M,σ = J, where J is a sequence of typed, labeled blocks. Write L f for the TAL label of function f
Expression Translation 11 Expression variables E[[x]] M,σ = sld r a, M(x)
Expression Translation 11 Expression variables E[[x]] M,σ = sld r a, M(x) Function variables E[[f]] M,σ = mov r a, L f
Expression Translation 11 Expression variables E[[x]] M,σ = sld r a, M(x) Function variables E[[f]] M,σ = mov r a, L f Integers E[[n]] M,σ = mov r a, n
Expression Translation 11 Expression variables E[[x]] M,σ = sld r a, M(x) Function variables E[[f]] M,σ = mov r a, L f Addition E[[e 1 + e 2 ]] M,σ = E[[e 1 ]] M,σ push r a E[[e 2 ]] I(M),int :: σ pop r t add r a, r t, r a Integers E[[n]] M,σ = mov r a, n
Expression Translation 12 Application E[[e 1 e 2 ]] M,σ = E[[e 1 ]] M,σ push r a E[[e 2 ]] I(M),T [[τ1 τ 2 ]] :: σ pop r t push r a push L r [ρ] jmp r t [σ] L r : ρ. K[[τ 2, σ]] where Φ e 1 : τ 1 τ 2 and L r fresh
Expression Translation 13 Conditional E[[if e 1 = 0 then e 2 else e 3 ]] M,σ = E[[e 1 ]] M,σ bneq r a, L else [ρ] E[[e 2 ]] M,σ jmp L end [ρ] L else : ρ.{sp : σ} {} E[[e 3 ]] M,σ jmp L end [ρ] L end : ρ.k[[τ, σ]] where Φ e 2 : τ and Φ e 2 : τ and L else and L end fresh
Declaration Translation 14 Function F[[fun f (x : τ 1 ) : τ 2 = e]] = L f : T [[τ 1 τ 2 ]] E[[e 2 ]] x := 2,K[[τ2,ρ]] :: T [[τ 1 ]] :: ρ pop r t sfree 1 jmp r t
Program Translation 15 Program P[[letrec d 1 d k in e]] = F[[d 1 ]]. F[[d k ]] L main : ρ.{sp : K[[int, ρ]] :: ρ} {} E[[e]],K[[int,ρ]] :: ρ pop r t jmp r t To run the program, push return address on stack and jump to L main Result is returned in r a
Factorial L fact : ρ. {sp : K[[int]] :: int :: ρ} sld r a,2 bneq r a,l else [ρ] mov r a,1 jmp L end L else : ρ. {sp : K[[int]] :: int :: ρ} sld r a,2 push r a mov r a,l fact push r a sld r a,4 push r a mov r a,1 pop r t sub r a,r t,r a pop r t push L r [ρ] jmp r t [int :: K[[int, ρ]] :: int :: ρ] 16
Factorial L fact : ρ. {sp : K[[int]] :: int :: ρ} sld r a,2 bneq r a,l else [ρ] mov r a,1 jmp L end L else : ρ. {sp : K[[int]] :: int :: ρ} sld r a,2 push r a mov r a,l fact push r a sld r a,4 push r a mov r a,1 pop r t sub r a,r t,r a pop r t push L r [ρ] jmp r t [int :: K[[int, ρ]] :: int :: ρ] L r : ρ. {sp : K[[int]] :: int :: ρ, r a : int} pop r t mul r a,r t,r t jmp L end [ρ] L end : ρ. {sp : K[[int]] :: int :: ρ, r a : int} pop r t sfree 1 jmp r t 16
Compiler properties 17 Theorem If P then there exists Ψ such that P[[P]] : Ψ Proof by induction on P... Main idea: We don t have to trust the compiler because we can typecheck the assembly code emitted as output!
Optimizations Almost any compiler will produce better code than ours! (But how many compilers fit on four slides?) The type system makes it possible to implement many standard optimizations to generate better code: Instruction selection and scheduling Register allocation Common subexpression elimination Redundant load/store elimination Strength reduction Loop-invariant removal Tail-calls and others... Types do not interfere with the most common optimizations 18
Tail-call Optimization 19 A crucial optimization for functional languages Applies when the final operation in a function f is a call to another function g Instead of having f push the return address onto the stack and engage in the normal calling sequence, f pops all of its temporary values, jumps directly to g and never returns!
Tail-call Optimization 20 Unoptimized code L f : ρ. {sp : K[[τ, ρ]] :: τ f :: ρ, r a : τ g } {} salloc 2 % setup stack frame sst L r % push return address sst r a,2 % push argument jmp L g [τ :: τ f :: ρ] L r : ρ. {sp : τ :: τ f :: ρ, r a : τ} {} pop r t % pop return address sfree 1 % discard f s argument jmp r t % return Optimized code L f : ρ. {sp : K[[τ, ρ]] :: τ f :: ρ, r a : τ g } {} sst r a,2 jmp L g [ρ] % g returns to f s caller