Formal semantics of loosely typed languages. Joep Verkoelen Vincent Driessen


Formal semantics of loosely typed languages Joep Verkoelen Vincent Driessen June, 2004


Contents

1 Introduction
2 Syntax
  2.1 Formalities
  2.2 Example language LooselyWhile
3 Semantics
4 The function Equals
  4.1 Rejected alternatives
  4.2 Accepted alternative
  4.3 Premises
  4.4 Functions needed
  4.5 Type hierarchy
  4.6 Definition of Equals
5 Equivalence with While
  5.1 Formalities
  5.2 Proof
A Casting values
  A.1 Atomic casts
  A.2 The wrapper function Cast
  A.3 Casting to specific types
B String conversions


CHAPTER 1 Introduction

In most programming languages, symbols are assigned a type either statically or dynamically. Statically assigning types to variables is quite simple and is most commonly done by declaring a variable and specifying its type at the same time. With dynamically typed variables, the types are unknown at the time of declaration and become known during execution. Loosely typed programming languages allow for much less restrictive declaration patterns. In fact, in PHP, you cannot even specify the type of a symbol. A symbol is considered to be of any type it can be interpreted as. For example, in PHP, you could write something like this:

1 function fib($n) {
2     if ($n <= 1)
3         return 1;
4     else
5         return $n * fib($n - 1);
6 }
7
8 $x = fib("10");        // $x holds the integer 3628800
9 $x = substr($x, 0, 3); // $x now holds the string "362"

In this small PHP script, the types of the function fib (which, despite its name, computes the factorial of its argument) and of the variables $x and $n are still unknown before execution. When fib is called for the first time (line 8), it is invoked with the string parameter "10", which implies that the variable $n is of type string when the function body is entered. However, in line 2, $n is compared to 1, and PHP understands that we want to perform an integer comparison here. Therefore, it interprets the string "10" as the integer 10. Note, however, that $n is still a string. In line 3, the integer 1 is returned, so in that case the function fib returns an integer. Note that, in PHP, functions may return different types depending on the execution path (although this does not happen in the above example).

In the first invocation of fib, we already saw that the variable $n is still a string. In line 5, something extraordinary happens: when $n - 1 is evaluated, $n is again interpreted as the integer 10 and 1 is subtracted from it, resulting in a recursive call to fib with the integer (!) 9 as its only argument. This means that all subsequent recursive calls to fib receive integers as arguments instead of strings. Finally, we return to line 5 (in the first invocation of fib), where the multiplication requires an interpretation of "10" as 10. The end result (the integer 3628800) is stored in the variable $x (line 8). This variable is then used in the substr function (line 9), and because this (internal) PHP function requires its first argument to be a string, $x is interpreted as such. The result of the substr call is a string containing the first three characters of the (interpreted) string $x. Finally, the result of this action (the string "362") is stored in the variable $x, which until then held an integer but now contains a string.

From this example, we can summarize PHP's behavior:

1. All variables have types. They are assigned their types the moment they are assigned their values.
2. Symbols are never fixed at a specific type. The type of a symbol can change over time, namely at every new assignment.
3. Symbols can always be interpreted as other types, if the context requires it. Note that this does not affect the type of the symbol itself.
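The three observations can be mimicked in a few lines of Python. This is only a rough model under stated assumptions, not a description of how PHP works internally: the helper names `tag` and `as_int` are hypothetical, and the string-to-integer rule (take the leading run of decimal digits) is a simplification.

```python
# Hypothetical model of the three observations; this is not how PHP is implemented.

def tag(value):
    """Pair a value with the name of its current type."""
    return (type(value).__name__, value)


def as_int(tagged):
    """Interpret a tagged value as an integer; the tagged value itself is not modified."""
    kind, value = tagged
    if kind == "str":
        # Assumption: use the leading run of decimal digits, or 0 if there is none.
        digits = ""
        for ch in value:
            if ch not in "0123456789":
                break
            digits += ch
        return int(digits) if digits else 0
    return int(value)


store = {}
store["n"] = tag("10")               # 1. the type arrives together with the value
seen_as_int = as_int(store["n"])     # 3. an integer context interprets the string as 10
type_after_read = store["n"][0]      #    while the stored type is still "str"
store["n"] = tag(seen_as_int - 1)    # 2. a new assignment changes the type
type_after_assign = store["n"][0]
```

Reading the variable in an integer context leaves the stored pair untouched; only the later assignment replaces it with an integer-typed pair.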

CHAPTER 2 Syntax

Now that we have seen how loosely typed languages behave, we will introduce a way of describing their syntax and semantics on a more abstract level, leaving out the fiddly details. To that end, we extend the toy language While to the new language LooselyWhile. We assume that the reader is familiar with the language While. If not, it is advised to study [1].

2.1 Formalities

Before we introduce the syntax and semantics of LooselyWhile, we first discuss some formalities that will be used throughout the rest of the document.

Binary integers
All numbers (integers) used in LooselyWhile programs are required to be written in binary notation. This keeps the definitions of a few trivial functions short and easy to survey. In our examples, however, we will use decimal notation for all integers. Adapting the definitions to decimal notation would be a trivial change to the ones used here.

Strict separation of expressions and statements
Furthermore, we assume that expressions have no side effects. They can have side effects in PHP and many other programming languages, but side effects are not the focus of our research, so this assumption is made purely for the sake of simplicity. It also allows for a strict separation of statements and expressions. It is exactly because of this that the types of variables are not affected when they are interpreted as another type (see the third note on the behavior of PHP in the previous chapter).

No undeclared variables
In our examples, we do not allow the use of variables before they are assigned a value. Although this is allowed in PHP, where variables are always initialized to some NULL value, we explicitly forbid it in LooselyWhile. After all, using unassigned variables makes no sense anyway.

2.2 Example language LooselyWhile

We begin by defining the syntax of LooselyWhile, using a syntactic notation based on BNF. First we list a number of meta-variables which will be used to range over syntactic categories. These meta-variables and categories are as follows:

n will range over numerals, Num,
b will range over booleans, Bool,
w will range over strings, String,
x will range over variables, Var,
e will range over expressions, Exp,
S will range over statements, Stm.

The context-free grammar for LooselyWhile then looks as follows:

n ::= 0 | 1 | n 0 | n 1
b ::= true | false
c ::= c alpha | c num | c whitespace | c punctuation | λ
w ::= " c "
p ::= p num | p alpha | λ
x ::= alpha p
e ::= n | b | w | x | e1 ⋆ e2 | e1 + e2 | e1 − e2 | e1 = e2 | e1 ≤ e2 | ¬e | e1 ∧ e2 | e1 ∨ e2 | head e | tail e | e1 ++ e2 | e1 ≡ e2 | e1 =ₜ e2
S ::= x := e | skip | S1 ; S2 | if e then S1 else S2 | while e do S

Here alpha, num, whitespace and punctuation are sets of characters. Furthermore, there are three string operators, head, tail and concatenate (++), with the following behavior:

Let s be a string. If s is empty, then head s and tail s are both empty strings. Otherwise, head s is a string of length 1 containing the first character of s, and tail s is the remainder of the string. For example: let s = "foo", then head s = "f" and tail s = "oo".

Let s1 and s2 be strings. Then s1 ++ s2 is the concatenation of the two strings. For example: let s1 = "foo" and s2 = "bar". Then s1 ++ s2 = "foobar".
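The string operators and two of the lexical categories can be sketched in Python as follows. This is a minimal illustration, not part of the formal definition: the function names are chosen here for readability, and the regular expressions assume that alpha is the set of ASCII letters and num the set of decimal digits, which the grammar itself leaves open.

```python
import re


def head(s: str) -> str:
    """First character as a string of length 1, or "" for the empty string."""
    return s[:1]


def tail(s: str) -> str:
    """Everything after the first character, or "" for the empty string."""
    return s[1:]


def concat(s1: str, s2: str) -> str:
    """Concatenation of two strings (the ++ operator)."""
    return s1 + s2


# Assumption: alpha = ASCII letters, num = decimal digits.
NUMERAL = re.compile(r"[01]+")                   # n ::= 0 | 1 | n 0 | n 1
VARIABLE = re.compile(r"[A-Za-z][A-Za-z0-9]*")   # x ::= alpha p


def is_numeral(text: str) -> bool:
    """True if text is a non-empty sequence of binary digits."""
    return NUMERAL.fullmatch(text) is not None


def is_variable(text: str) -> bool:
    """True if text is a letter followed by letters and digits."""
    return VARIABLE.fullmatch(text) is not None
```

Slicing is used instead of indexing so that the empty string needs no special case.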

CHAPTER 3 Semantics

The resulting semantics are described below. Most of this is rather straightforward. The only notable exception is the definition of the = relation, which relies on the function Equals described in the next chapter.

Table 3.1: Semantics of expressions in LooselyWhile

N⟦0⟧ = (Int, 0)
N⟦1⟧ = (Int, 1)
N⟦n 0⟧ = (Int, 2i) where N⟦n⟧ = (Int, i)
N⟦n 1⟧ = (Int, 2i + 1) where N⟦n⟧ = (Int, i)
B⟦true⟧ = (Bool, tt)
B⟦false⟧ = (Bool, ff)
S⟦" c "⟧ = S⟦c⟧
S⟦c a⟧ = (String, x1, ..., xn, a) where S⟦c⟧ = (String, x1, ..., xn)
S⟦c b⟧ = (String, x1, ..., xn, b) where S⟦c⟧ = (String, x1, ..., xn)
...
S⟦c z⟧ = (String, x1, ..., xn, z) where S⟦c⟧ = (String, x1, ..., xn)
S⟦c A⟧ = (String, x1, ..., xn, A) where S⟦c⟧ = (String, x1, ..., xn)
S⟦c B⟧ = (String, x1, ..., xn, B) where S⟦c⟧ = (String, x1, ..., xn)
...
S⟦c Z⟧ = (String, x1, ..., xn, Z) where S⟦c⟧ = (String, x1, ..., xn)
S⟦c 0⟧ = (String, x1, ..., xn, 0) where S⟦c⟧ = (String, x1, ..., xn)
S⟦c 1⟧ = (String, x1, ..., xn, 1) where S⟦c⟧ = (String, x1, ..., xn)
...
S⟦λ⟧ = (String, )
E⟦n⟧s = N⟦n⟧

E⟦b⟧s = B⟦b⟧
E⟦w⟧s = S⟦w⟧
E⟦x⟧s = s x
E⟦e1 ⋆ e2⟧s = (Int, ToInt(E⟦e1⟧s) ⋆ ToInt(E⟦e2⟧s))
E⟦e1 + e2⟧s = (Int, ToInt(E⟦e1⟧s) + ToInt(E⟦e2⟧s))
E⟦e1 − e2⟧s = (Int, ToInt(E⟦e1⟧s) − ToInt(E⟦e2⟧s))
E⟦e1 = e2⟧s = Equals(E⟦e1⟧s, E⟦e2⟧s)
E⟦e1 ≤ e2⟧s = (Bool, tt) if ToInt(E⟦e1⟧s) ≤ ToInt(E⟦e2⟧s)
              (Bool, ff) if ToInt(E⟦e1⟧s) > ToInt(E⟦e2⟧s)
E⟦¬e⟧s = (Bool, tt) if ToBool(E⟦e⟧s) = ff
         (Bool, ff) if ToBool(E⟦e⟧s) = tt
E⟦e1 ∧ e2⟧s = (Bool, tt) if ToBool(E⟦e1⟧s) = tt and ToBool(E⟦e2⟧s) = tt
              (Bool, ff) if ToBool(E⟦e1⟧s) = ff or ToBool(E⟦e2⟧s) = ff
E⟦e1 ∨ e2⟧s = (Bool, tt) if ToBool(E⟦e1⟧s) = tt or ToBool(E⟦e2⟧s) = tt
              (Bool, ff) if ToBool(E⟦e1⟧s) = ff and ToBool(E⟦e2⟧s) = ff
E⟦head e⟧s = (String, head(ToString(E⟦e⟧s)))
E⟦tail e⟧s = (String, tail(ToString(E⟦e⟧s)))
E⟦e1 ++ e2⟧s = (String, concat(ToString(E⟦e1⟧s), ToString(E⟦e2⟧s)))
E⟦e1 ≡ e2⟧s = (Bool, tt) if E⟦e1⟧s = E⟦e2⟧s
              (Bool, ff) if E⟦e1⟧s ≠ E⟦e2⟧s
E⟦e1 =ₜ e2⟧s = (Bool, tt) if t1 = t2, where E⟦e1⟧s = (t1, w1) and E⟦e2⟧s = (t2, w2)
               (Bool, ff) otherwise

The functions ToInt, ToBool and ToString are further detailed in Appendix A. The string functions head, tail and concat are detailed in Appendix B.
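A small evaluator makes the shape of Table 3.1 concrete. The sketch below covers only numerals, literals, variables, arithmetic, ≤, ¬, ∧ and ++. The conversions `to_int`, `to_bool` and `to_string` are placeholders for the functions of Appendix A, which is not reproduced here, so their behavior (decimal numeric prefix, non-empty string counts as true, and so on) is an assumption, as is the tuple encoding of expressions.

```python
def numeral(n: str):
    """N: meaning of a binary numeral, following the four N clauses of Table 3.1."""
    if n in ("0", "1"):
        return ("Int", int(n))
    if n == "" or n[-1] not in "01":
        raise ValueError(f"not a binary numeral: {n!r}")
    _, i = numeral(n[:-1])
    return ("Int", 2 * i + int(n[-1]))


def to_int(v):
    """Assumed stand-in for ToInt: strings contribute their leading decimal digits."""
    t, w = v
    if t == "Int":
        return w
    if t == "Bool":
        return 1 if w else 0
    digits = ""
    for ch in w:
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0


def to_bool(v):
    """Assumed stand-in for ToBool: zero and the empty string count as ff."""
    t, w = v
    if t == "Bool":
        return w
    return w != 0 if t == "Int" else w != ""


def to_string(v):
    """Assumed stand-in for ToString."""
    t, w = v
    if t == "String":
        return w
    if t == "Bool":
        return "true" if w else "false"
    return str(w)


def evaluate(e, s):
    """E: expressions are nested tuples such as ("+", e1, e2); s maps names to tagged values."""
    op = e[0]
    if op == "num":
        return numeral(e[1])
    if op == "bool":
        return ("Bool", e[1])
    if op == "str":
        return ("String", e[1])
    if op == "var":
        return s[e[1]]
    if op in ("+", "-", "*"):
        a, b = to_int(evaluate(e[1], s)), to_int(evaluate(e[2], s))
        return ("Int", a + b if op == "+" else a - b if op == "-" else a * b)
    if op == "<=":
        return ("Bool", to_int(evaluate(e[1], s)) <= to_int(evaluate(e[2], s)))
    if op == "not":
        return ("Bool", not to_bool(evaluate(e[1], s)))
    if op == "and":
        return ("Bool", to_bool(evaluate(e[1], s)) and to_bool(evaluate(e[2], s)))
    if op == "++":
        return ("String", to_string(evaluate(e[1], s)) + to_string(evaluate(e[2], s)))
    raise ValueError(f"unsupported operator: {op!r}")
```

For example, `evaluate(("+", ("str", "10"), ("num", "1")), {})` adds a string and a numeral: both operands pass through `to_int`, and the result is tagged Int regardless of the operand types. Evaluation never modifies the state `s`, which mirrors the assumption that expressions have no side effects.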

The following table describes the natural semantics of all statements in LooselyWhile.

Table 3.2: Natural semantics of statements in LooselyWhile

[ass_lw]       ⟨x := e, s⟩ → s[x ↦ E⟦e⟧s]

[skip_lw]      ⟨skip, s⟩ → s

[comp_lw]      ⟨S1, s⟩ → s′, ⟨S2, s′⟩ → s″
               ------------------------------
               ⟨S1 ; S2, s⟩ → s″

[if_lw^tt]     ⟨S1, s⟩ → s′
               ------------------------------------   if ToBool(E⟦e⟧s) = tt
               ⟨if e then S1 else S2, s⟩ → s′

[if_lw^ff]     ⟨S2, s⟩ → s′
               ------------------------------------   if ToBool(E⟦e⟧s) = ff
               ⟨if e then S1 else S2, s⟩ → s′

[while_lw^tt]  ⟨S, s⟩ → s′, ⟨while e do S, s′⟩ → s″
               --------------------------------------   if ToBool(E⟦e⟧s) = tt
               ⟨while e do S, s⟩ → s″

[while_lw^ff]  ⟨while e do S, s⟩ → s                    if ToBool(E⟦e⟧s) = ff
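The rules of Table 3.2 translate almost directly into a recursive interpreter. The following is a sketch under assumptions: statements are encoded as tuples, expressions are passed as Python functions from a state to a tagged value (so that the block stays independent of any particular expression evaluator), and `to_bool` is an assumed stand-in for ToBool from Appendix A.

```python
def to_bool(v):
    """Assumed stand-in for ToBool: zero and the empty string count as ff."""
    t, w = v
    if t == "Bool":
        return w
    return w != 0 if t == "Int" else w != ""


def run(stmt, s):
    """Return the final state for <stmt, s>, one branch per rule of Table 3.2."""
    kind = stmt[0]
    if kind == "ass":                       # [ass_lw]: s[x -> E[e]s]
        _, x, e = stmt
        return {**s, x: e(s)}
    if kind == "skip":                      # [skip_lw]
        return s
    if kind == "comp":                      # [comp_lw]: run S1, then S2 in the new state
        _, s1, s2 = stmt
        return run(s2, run(s1, s))
    if kind == "if":                        # [if_lw^tt] and [if_lw^ff]
        _, e, s1, s2 = stmt
        return run(s1, s) if to_bool(e(s)) else run(s2, s)
    if kind == "while":                     # [while_lw^tt] and [while_lw^ff]
        _, e, body = stmt
        if to_bool(e(s)):
            return run(stmt, run(body, s))
        return s
    raise ValueError(f"unknown statement: {kind!r}")


# acc := 0; while n do (acc := acc + n; n := n - 1)
program = (
    "comp",
    ("ass", "acc", lambda s: ("Int", 0)),
    ("while", lambda s: s["n"],
     ("comp",
      ("ass", "acc", lambda s: ("Int", s["acc"][1] + s["n"][1])),
      ("ass", "n", lambda s: ("Int", s["n"][1] - 1)))),
)
final = run(program, {"n": ("Int", 3)})
```

Note that the loop condition is the integer-typed variable n itself: the side condition of the while rules applies ToBool to whatever value the expression yields, so no explicit comparison is needed. The state is copied on assignment rather than mutated, matching the notation s[x ↦ v].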


CHAPTER 4 The function Equals

There are multiple ways of defining the semantics of =. Whatever definition is chosen, it essentially needs some means of converting the values being compared, either to each other's type or both to some other type, before the actual comparison is performed.

4.1 Rejected alternatives

Right-to-left type reduction
First, one can look at the type of the left argument, convert the right argument to that type, and then do the comparison. Take, for instance, the expression:

true = "true"

In this interpretation, the first argument is of type Bool, which means that the second argument, the String, is converted to a Bool. However, this can cause strange behavior. Consider the following expression:

5 = "5t"

Using right-to-left type reduction, the string "5t" is converted to an integer. This conversion could be done in multiple ways (which are mentioned in 4.5), but here we choose to scan the string for a numeric prefix and use that as the integer value, which is the integer 5 in this case. Thus the expression evaluates to tt, because the string "5t" is converted to the integer 5. However, the expression:

"5t" = 5

would evaluate to ff, because the integer 5 would be converted to the string "5".
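The asymmetry is easy to reproduce. The sketch below implements right-to-left type reduction for the two types involved in the example. The function names are hypothetical, and the two conversions (numeric prefix for String to Int, decimal rendering for Int to String) follow the choices described in the text rather than the definitions of Appendix A.

```python
def string_to_int(w: str) -> int:
    """Numeric-prefix conversion: "5t" becomes 5, a string without a digit prefix becomes 0."""
    digits = ""
    for ch in w:
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0


def cast(value, target_type):
    """Convert a tagged value to target_type; only the casts needed for the example exist."""
    t, w = value
    if t == target_type:
        return w
    if t == "String" and target_type == "Int":
        return string_to_int(w)
    if t == "Int" and target_type == "String":
        return str(w)
    raise NotImplementedError(f"no cast from {t} to {target_type}")


def equals_right_to_left(left, right):
    """Convert the right argument to the left argument's type, then compare."""
    t_left, w_left = left
    return w_left == cast(right, t_left)


five_vs_string = equals_right_to_left(("Int", 5), ("String", "5t"))   # compares 5 with 5
string_vs_five = equals_right_to_left(("String", "5t"), ("Int", 5))   # compares "5t" with "5"
```

Swapping the operands changes the result, which is exactly the behavior the symmetry requirement in section 4.3 is meant to rule out.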

Left-to-right type reduction
Likewise, left-to-right type reduction leaves us with the same problems as right-to-left type reduction.

Two-sided type reduction
Using two-sided type reduction, we introduce one virtually universal data type to which both the left-hand and the right-hand argument are converted, after which the comparison is performed within the universal data type domain. For this, we would have to define a function Univ that converts a value of an actual data type to a value in the universal domain. This approach, however, postpones all difficulties to the concrete implementation of the Univ function, because it is in that function that we need to ensure that, for example, both the integer 5 and the string "5" map to the same abstract object in the universal domain. This also means that the universal domain must be chosen in such a way that all newly invented and introduced data types can be mapped to it as well. If this is not possible, then every time the language is extended with a new data type there is a risk that the universal domain has to be adjusted, which in turn means that all casts from all existing types to this universal domain have to be adjusted too. This kind of construct is purely theoretical and does not allow for a concrete implementation, so we are not able to define semantics using such a mechanism in practice.

4.2 Accepted alternative

Relative type reduction
Relative type reduction converts either the left or the right side of the equation to the other side's type (or to some lesser type that both sides can be converted to). Which of these conversions needs to be performed depends on the types of the arguments and possibly on (one of) the values. How these conversion rules are defined is covered in more detail in the next sections. Furthermore, we will formally specify what is meant by a lesser type.
4.3 Premises

Before we go into too much detail, we set up some premises for the behavior of the =-relation. We believe that the =-relation should conform to these premises at all times. First of all, we want the =-relation to be symmetric, in order to avoid the strange behavior mentioned in section 4.1.

Premise 1 (Symmetry). Let s be a state. Then, for all expressions e1 and e2:

E⟦e1 = e2⟧s = E⟦e2 = e1⟧s

Another problem that we came across was that, in some definitions, a comparison with true and a comparison with false would give the same result. In general, one would always expect these to give opposite results. This principle resulted in the following premise:

Premise 2 (Opposites). Let s be a state. Then, for all expressions e, for all expressions e_tt with E⟦e_tt⟧s = (Bool, tt), and for all expressions e_ff with E⟦e_ff⟧s = (Bool, ff), the following holds:

E⟦e = e_tt⟧s = (Bool, tt) ⇔ E⟦e = e_ff⟧s = (Bool, ff)

4.4 Functions needed

The string "false" is something you typically want to use as containing the boolean false. Similarly, you would want to use the string "5" as containing the integer 5. For this, we introduce two functions which determine whether a string is suited for use as an integer or as a boolean. These functions are IsNumeric() and IsBoolean(), and they determine whether a string begins with a non-empty integer or boolean value. Their definitions are as follows:

IsNumeric(S) := StartsWith(S, "0") ∨ StartsWith(S, "1")
IsBoolean(S) := StartsWith(S, "true") ∨ StartsWith(S, "false")

In order to give a solid definition of the =-relation, we need a number of auxiliary functions and relations. The first things we need to define are the elementary casting functions. These are functions that cast values of one type to another type. Because we currently have 3 types, there are 3 × 3 = 9 possible casts. For simplicity's sake, we define a function Cast that takes 2 types as arguments and returns a function that casts values of the first type to values of the second type. The precise definition of this function can be found in Appendix A.

4.5 Type hierarchy

The type hierarchy is based on the amount of information a type can hold or, more precisely, on the amount of information lost when converting to another type.
This is best made clear with an example. Consider the types Int and Bool. When a boolean is cast to an integer, no information is lost, and the original boolean can be recovered by casting the integer back to a boolean. However, when an integer is cast to a boolean, a lot of information is lost, and the original integer value cannot be recovered by casting back to an integer. This is because a boolean has only 2 possible values, while there are countably infinitely many integer values. Because of this, the type Int is higher in the type hierarchy than the type Bool. In a similar fashion, we can see that the type String is higher in the type hierarchy than the type Int. Even though one could argue that one can always construct a lexicographical enumeration of strings, and thus that there always is a means of casting back and forth between integers and strings, this would result in very odd string conversions and would certainly not be consistent

with the way that strings are used in our example language LooselyWhile. Therefore, in LooselyWhile, we consider String to be higher in the type hierarchy than Int.

Definition 1. Let t1 and t2 be types. We define the hierarchical ordering relation ≺ ("is a lesser type than") as follows: t1 ≺ t2 if and only if t1 is lower in the type hierarchy than t2. Using the Cast function, this results in the following formal definition:

t1 ≺ t2 ⇔ ∀w1 [ Cast(t2, t1)(Cast(t1, t2)(w1)) = w1 ] ∧ ¬∀w2 [ Cast(t1, t2)(Cast(t2, t1)(w2)) = w2 ]

where w1 and w2 are valid values of types t1 and t2, respectively (i.e. where the tuples (t1, w1) and (t2, w2) are valid).

4.6 Definition of Equals

Now that we have all these functions, we can finally put them together in the Equals function that we used in the semantics of the = operator.

Equals((t1, w1), (t2, w2)) =
  Equals((t2, w2), (t1, w1))                 if t2 ≺ t1
  (Bool, tt)                                 if t1 = t2 and w1 = w2
  (Bool, ff)                                 if t1 = t2 and w1 ≠ w2
  Equals((t1, w1), (Int, StringToInt(w2)))   if t2 = String and IsNumeric(w2) = tt
  Equals((t1, w1), (Bool, StringToBool(w2))) if t2 = String and IsBoolean(w2) = tt
  Equals((t1, w1), (t1, Cast(t2, t1)(w2)))   otherwise
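The case analysis of Equals can be sketched in Python as below. Several parts are assumptions rather than the paper's definitions: the hierarchy is encoded as a rank table (Bool below Int below String), the casts and the functions StringToInt and StringToBool stand in for Appendix A, IsNumeric accepts decimal digits to match the decimal examples, and the IsBoolean guard is read as applying to a second argument of type String, since IsBoolean and StringToBool operate on strings.

```python
RANK = {"Bool": 0, "Int": 1, "String": 2}   # assumed encoding of the type hierarchy


def string_to_int(w):
    """Assumed StringToInt: value of the leading decimal digits, 0 if there are none."""
    digits = ""
    for ch in w:
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0


def string_to_bool(w):
    """Assumed StringToBool, used only on strings that satisfy is_boolean."""
    return w.startswith("true")


def is_numeric(w):
    return w[:1] != "" and w[:1] in "0123456789"


def is_boolean(w):
    return w.startswith("true") or w.startswith("false")


def cast(t_from, t_to):
    """Return a function casting from t_from down to t_to (assumed behavior)."""
    if t_from == t_to:
        return lambda w: w
    table = {
        ("Int", "Bool"): lambda w: w != 0,
        ("String", "Bool"): lambda w: w != "",
        ("String", "Int"): string_to_int,
    }
    return table[(t_from, t_to)]


def equals(v1, v2):
    """One branch per case of the definition; the result is a tagged boolean."""
    (t1, w1), (t2, w2) = v1, v2
    if RANK[t2] < RANK[t1]:                      # put the lesser type on the left
        return equals(v2, v1)
    if t1 == t2:
        return ("Bool", w1 == w2)
    if t2 == "String" and is_numeric(w2):
        return equals(v1, ("Int", string_to_int(w2)))
    if t2 == "String" and is_boolean(w2):
        return equals(v1, ("Bool", string_to_bool(w2)))
    return equals(v1, (t1, cast(t2, t1)(w2)))    # reduce the right side to the lesser type
```

Because the first case normalizes the argument order before anything else happens, `equals(("Int", 5), ("String", "5t"))` and `equals(("String", "5t"), ("Int", 5))` follow the same path afterwards, which is what makes the relation symmetric in this sketch.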

CHAPTER 5 Equivalence with While

LooselyWhile is an extension of the language While. This means that everything that can be expressed in While can be expressed in LooselyWhile as well. In this chapter, we will prove that every program in While syntax can be evaluated with LooselyWhile semantics and yields a result that is equivalent to the result obtained with While semantics.

Since LooselyWhile is an extension of While, all programs that are syntactically accepted in While are also accepted by LooselyWhile. We will show that the semantics of these programs remains equivalent. We use the term equivalent on purpose, because mathematically the semantics of the two languages are not equal. In While, all variables are of type integer, whereas in LooselyWhile they can be of non-integer types as well. Therefore, we have to use the type-data tuple representation for variables. As a consequence, states in LooselyWhile are affected, and their definition differs slightly from that in While.

Definition 2. Let s be a LooselyWhile-state. Then, as in While, s[y ↦ v] is the state s except that the value bound to y is v:

(s[y ↦ v]) x = v    if x = y
               s x  if x ≠ y

The difference between the languages is that the meta-variable v always holds an integer value in While, but a type-data tuple in LooselyWhile. This requires a notion that links the states.

Definition 3. We say that a While-integer v and a LooselyWhile-integer v′ are equivalent if:

v′ = (Int, v)

We say that a While-state is equivalent to a LooselyWhile-state if, for all variables, the value of that variable in the While-state is equivalent to the value of the variable in the LooselyWhile-state. Formally:

Definition 4. Let s be a While-state and s′ be a LooselyWhile-state. We say that s and s′ are equivalent if and only if s′ = LW(s), where LW is defined as follows:

LW(s) = s′ ⇔ ∀x ∈ Var [ s′ x = (Int, s x) ]

Now we are able to formulate our thesis.

Thesis. For all statements S and all While-states s and s′: if ⟨S, s⟩ → s′ in While, then ⟨S, u⟩ → u′ in LooselyWhile, where u and u′ are the equivalent LooselyWhile-states: u = LW(s) and u′ = LW(s′).

5.1 Formalities

In order to distinguish between the semantic functions of While and LooselyWhile, we write the While functions with the subscript While. For example, the semantic function A in While becomes A_While in the rest of the document. The LooselyWhile functions keep their names, as defined in Table 3.1. When using the natural semantics rules, we add the subscript w to a rule to indicate that the rule is from the natural semantics of While. This is consistent with the subscript lw on the natural semantics rules of LooselyWhile. Throughout the rest of this document we write u instead of LW(s), u′ instead of LW(s′), etc. This is done to improve the readability of the proofs. Finally, we assume that While programs never use undeclared variables.

5.2 Proof

To ease the proof of this thesis, we introduce some lemmas.

Lemma 1. The BoolToBool and IntToInt casting functions are identity functions. Formally:
1. for all booleans b: BoolToBool(b) = b; and
2. for all integers x: IntToInt(x) = x.

Proof. This is trivial by the definitions of BoolToBool and IntToInt (see Appendix A).

Lemma 2. For all n ∈ Num and all integers w: if N_While⟦n⟧ = w, then N⟦n⟧ = (Int, w).

Proof. We prove this by induction on n.

Base. We distinguish the following cases:

The case n = 0: We need to prove that if N_While⟦0⟧ = w, then N⟦0⟧ = (Int, w). We have w = 0 by the definition of N_While and N⟦0⟧ = (Int, 0) by Table 3.1. Thus, in this case the lemma holds.

The case n = 1: This is similar to the case n = 0 and we omit the details.

Induction. The induction hypothesis is: for all integers w′: if N_While⟦n′⟧ = w′, then N⟦n′⟧ = (Int, w′) (where n′ is a simpler case). We distinguish the following cases:

The case n = n′ 0: We need to prove that if N_While⟦n′ 0⟧ = w, then N⟦n′ 0⟧ = (Int, w). Let w′ = N_While⟦n′⟧. We have w = 2 · N_While⟦n′⟧ = 2 · w′ by the definition of N_While. Furthermore, we have N⟦n′ 0⟧ = (Int, 2 · i) where N⟦n′⟧ = (Int, i) by Table 3.1. By the induction hypothesis, we have i = w′, so N⟦n′ 0⟧ = (Int, 2 · w′) = (Int, w). Thus, in this case the lemma holds.

The case n = n′ 1: This is similar to the case n = n′ 0 and we omit the details.

Thus, the lemma holds.

Lemma 3. For all a ∈ Aexp, all integers w and all While-states s: if A_While⟦a⟧s = w, then E⟦a⟧u = (Int, w).

Proof. We prove this by induction on a.

Base. We distinguish the following cases:

The case a = n: We need to prove that if A_While⟦n⟧s = w, then E⟦n⟧u = (Int, w). We have A_While⟦n⟧s = N_While⟦n⟧ by the definition of A_While. Furthermore, we have E⟦n⟧u = N⟦n⟧ by Table 3.1. It remains to prove that if N_While⟦n⟧ = w, then N⟦n⟧ = (Int, w). This is exactly Lemma 2. Thus, the lemma holds in this case.

The case a = x: We need to prove that if A_While⟦x⟧s = w, then E⟦x⟧u = (Int, w). We have A_While⟦x⟧s = s x by the definition of A_While. Furthermore, we have E⟦x⟧u = u x by Table 3.1. It remains to prove that if s x = w then u x = (Int, w). We will assume that this holds (for more details on this, see the note).

Thus the lemma holds in this case.

NOTE: Strictly speaking, we need to prove that if s x = w then u x = (Int, w). To prove this, we need to prove that if ⟨x := w, s⟩ → s′ then ⟨x := w, u⟩ → u′. However, to prove that, we need a proof that A_While⟦a⟧s = w implies E⟦a⟧u = (Int, w), which is what we are proving now. This is provable, but it would make things too complex. The idea is to use the assumption that variables are only used after they are declared. We can then point out an assignment (if there is any) in which no other variables are used, so that this case of the lemma cannot occur, and prove that this assignment in While is the same as the assignment of the variable in LooselyWhile. From there we can prove that, since variables are always declared before they are used, every assignment of a variable in While is the same as the assignment of this variable in LooselyWhile. Thus, for all variables x: if s x = w then u x = (Int, w).

Induction. The induction hypothesis is: for all integers w′: if A_While⟦a′⟧s = w′, then E⟦a′⟧u = (Int, w′) (where a′ is a simpler case). We distinguish the following cases:

The case a = a1 + a2: We need to prove that if A_While⟦a1 + a2⟧s = w, then E⟦a1 + a2⟧u = (Int, w). Let w1 = A_While⟦a1⟧s and w2 = A_While⟦a2⟧s. We have

A_While⟦a1 + a2⟧s = A_While⟦a1⟧s + A_While⟦a2⟧s = w1 + w2

by the definition of A_While. Furthermore, we have

E⟦a1 + a2⟧u = (Int, ToInt(E⟦a1⟧u) + ToInt(E⟦a2⟧u))

by Table 3.1. By the induction hypothesis, we have

E⟦a1 + a2⟧u = (Int, ToInt((Int, w1)) + ToInt((Int, w2)))

Thus, by the definition of ToInt, we have

E⟦a1 + a2⟧u = (Int, Cast(Int, Int)(w1) + Cast(Int, Int)(w2))

Following the definition of Cast(Int, Int), this expands to

E⟦a1 + a2⟧u = (Int, IntToInt(w1) + IntToInt(w2))

Finally we have

E⟦a1 + a2⟧u = (Int, w1 + w2)

by Lemma 1. Thus, in this case the lemma holds.

The case a = a1 ⋆ a2: This is similar to the case a = a1 + a2 and we omit the details. Thus, in this case, the lemma holds.

The case a = a1 − a2: This is similar to the case a = a1 + a2 and we omit the details.

Thus, the lemma holds.

Lemma 4. For all b ∈ Bexp, all booleans w and all While-states s: if B_While⟦b⟧s = w, then E⟦b⟧u = (Bool, w).

Proof. We prove this by induction on b.

Base. We distinguish the following cases:

The case b = true: We need to prove that if B_While⟦true⟧s = w, then E⟦true⟧u = (Bool, w). We have w = tt by the definition of B_While and E⟦true⟧u = (Bool, tt) by Table 3.1. Thus, in this case, the lemma holds.

The case b = false: This is similar to the case b = true and we omit the details. Thus, in this case, the lemma holds.

The case b = (a1 = a2): We need to prove that if B_While⟦a1 = a2⟧s = w, then E⟦a1 = a2⟧u = (Bool, w). From Lemma 3, we already have that if A_While⟦a1⟧s = p, then E⟦a1⟧u = (Int, p), and if A_While⟦a2⟧s = q, then E⟦a2⟧u = (Int, q). From the definition of B_While, we have:

B_While⟦a1 = a2⟧s = tt if p = q
                    ff if p ≠ q

Furthermore, we have E⟦a1 = a2⟧u = Equals(E⟦a1⟧u, E⟦a2⟧u) from Table 3.1. So we have E⟦a1 = a2⟧u = Equals((Int, p), (Int, q)). Because in this case t1 = t2 (Int = Int), the equation expands to:

E⟦a1 = a2⟧u = (Bool, tt) if p = q
              (Bool, ff) if p ≠ q

Thus, in this case, the lemma holds.

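The case $b = (a_1 \leq a_2)$: We need to prove that if $\mathcal{B}_{While}\llbracket a_1 \leq a_2 \rrbracket s = w$, then $\mathcal{E}\llbracket a_1 \leq a_2 \rrbracket u = (\mathrm{Bool}, w)$. From Lemma 3, we already have that if $\mathcal{A}_{While}\llbracket a_1 \rrbracket s = p$, then $\mathcal{E}\llbracket a_1 \rrbracket u = (\mathrm{Int}, p)$, and if $\mathcal{A}_{While}\llbracket a_2 \rrbracket s = q$, then $\mathcal{E}\llbracket a_2 \rrbracket u = (\mathrm{Int}, q)$. From the definition of $\mathcal{B}_{While}$, we have:
\[ \mathcal{B}_{While}\llbracket a_1 \leq a_2 \rrbracket s = \begin{cases} \mathrm{tt} & \text{if } p \leq q \\ \mathrm{ff} & \text{if } p > q \end{cases} \]
Furthermore, from Table 3.1, we have
\[ \mathcal{E}\llbracket a_1 \leq a_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{ToInt}(\mathcal{E}\llbracket a_1 \rrbracket u) \leq \mathrm{ToInt}(\mathcal{E}\llbracket a_2 \rrbracket u) \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{ToInt}(\mathcal{E}\llbracket a_1 \rrbracket u) > \mathrm{ToInt}(\mathcal{E}\llbracket a_2 \rrbracket u) \end{cases} \]
From Lemma 3 we get:
\[ \mathcal{E}\llbracket a_1 \leq a_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{ToInt}((\mathrm{Int}, p)) \leq \mathrm{ToInt}((\mathrm{Int}, q)) \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{ToInt}((\mathrm{Int}, p)) > \mathrm{ToInt}((\mathrm{Int}, q)) \end{cases} \]
From the definition of $\mathrm{ToInt}$ we get:
\[ \mathcal{E}\llbracket a_1 \leq a_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{Cast}(\mathrm{Int}, \mathrm{Int})(p) \leq \mathrm{Cast}(\mathrm{Int}, \mathrm{Int})(q) \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{Cast}(\mathrm{Int}, \mathrm{Int})(p) > \mathrm{Cast}(\mathrm{Int}, \mathrm{Int})(q) \end{cases} \]
From the definition of $\mathrm{Cast}(\mathrm{Int}, \mathrm{Int})$ we get:
\[ \mathcal{E}\llbracket a_1 \leq a_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{IntToInt}(p) \leq \mathrm{IntToInt}(q) \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{IntToInt}(p) > \mathrm{IntToInt}(q) \end{cases} \]
Finally, by Lemma 1, we find:
\[ \mathcal{E}\llbracket a_1 \leq a_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } p \leq q \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } p > q \end{cases} \]
Thus, in this case, the lemma holds.

Induction. The induction hypothesis is: for all booleans $w$, if $\mathcal{B}_{While}\llbracket b' \rrbracket s = w$, then $\mathcal{E}\llbracket b' \rrbracket u = (\mathrm{Bool}, w)$, where $b'$ is a simpler expression than $b$. We distinguish the following cases:

The case $b = \neg b'$: We need to prove that if $\mathcal{B}_{While}\llbracket \neg b' \rrbracket s = w$, then $\mathcal{E}\llbracket \neg b' \rrbracket u = (\mathrm{Bool}, w)$. Let $w' = \mathcal{B}_{While}\llbracket b' \rrbracket s$. From the definition of $\mathcal{B}_{While}$, we have:
\[ \mathcal{B}_{While}\llbracket \neg b' \rrbracket s = \begin{cases} \mathrm{tt} & \text{if } w' = \mathrm{ff} \\ \mathrm{ff} & \text{if } w' = \mathrm{tt} \end{cases} \]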

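Furthermore, from Table 3.1, we have:
\[ \mathcal{E}\llbracket \neg b' \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{ToBool}(\mathcal{E}\llbracket b' \rrbracket u) = \mathrm{ff} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{ToBool}(\mathcal{E}\llbracket b' \rrbracket u) = \mathrm{tt} \end{cases} \]
Using the induction hypothesis, this expands to:
\[ \mathcal{E}\llbracket \neg b' \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{ToBool}((\mathrm{Bool}, w')) = \mathrm{ff} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{ToBool}((\mathrm{Bool}, w')) = \mathrm{tt} \end{cases} \]
Through the $\mathrm{Cast}$ function, this leads to:
\[ \mathcal{E}\llbracket \neg b' \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{BoolToBool}(w') = \mathrm{ff} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{BoolToBool}(w') = \mathrm{tt} \end{cases} \]
Using Lemma 1, we have:
\[ \mathcal{E}\llbracket \neg b' \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } w' = \mathrm{ff} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } w' = \mathrm{tt} \end{cases} \]
And thus, in this case, the lemma holds.

The case $b = b_1 \wedge b_2$: We need to prove that if $\mathcal{B}_{While}\llbracket b_1 \wedge b_2 \rrbracket s = w$, then $\mathcal{E}\llbracket b_1 \wedge b_2 \rrbracket u = (\mathrm{Bool}, w)$. Let $w_1 = \mathcal{B}_{While}\llbracket b_1 \rrbracket s$ and $w_2 = \mathcal{B}_{While}\llbracket b_2 \rrbracket s$. From the definition of $\mathcal{B}_{While}$, we have:
\[ \mathcal{B}_{While}\llbracket b_1 \wedge b_2 \rrbracket s = \begin{cases} \mathrm{tt} & \text{if } w_1 = \mathrm{tt} \text{ and } w_2 = \mathrm{tt} \\ \mathrm{ff} & \text{if } w_1 = \mathrm{ff} \text{ or } w_2 = \mathrm{ff} \end{cases} \]
Furthermore, from Table 3.1, we have:
\[ \mathcal{E}\llbracket b_1 \wedge b_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{ToBool}(\mathcal{E}\llbracket b_1 \rrbracket u) = \mathrm{tt} \text{ and } \mathrm{ToBool}(\mathcal{E}\llbracket b_2 \rrbracket u) = \mathrm{tt} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{ToBool}(\mathcal{E}\llbracket b_1 \rrbracket u) = \mathrm{ff} \text{ or } \mathrm{ToBool}(\mathcal{E}\llbracket b_2 \rrbracket u) = \mathrm{ff} \end{cases} \]
Using the induction hypothesis, this expands to:
\[ \mathcal{E}\llbracket b_1 \wedge b_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{ToBool}((\mathrm{Bool}, w_1)) = \mathrm{tt} \text{ and } \mathrm{ToBool}((\mathrm{Bool}, w_2)) = \mathrm{tt} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{ToBool}((\mathrm{Bool}, w_1)) = \mathrm{ff} \text{ or } \mathrm{ToBool}((\mathrm{Bool}, w_2)) = \mathrm{ff} \end{cases} \]
Through the $\mathrm{Cast}$ function, this leads to:
\[ \mathcal{E}\llbracket b_1 \wedge b_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } \mathrm{BoolToBool}(w_1) = \mathrm{tt} \text{ and } \mathrm{BoolToBool}(w_2) = \mathrm{tt} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } \mathrm{BoolToBool}(w_1) = \mathrm{ff} \text{ or } \mathrm{BoolToBool}(w_2) = \mathrm{ff} \end{cases} \]
Using Lemma 1, we have:
\[ \mathcal{E}\llbracket b_1 \wedge b_2 \rrbracket u = \begin{cases} (\mathrm{Bool}, \mathrm{tt}) & \text{if } w_1 = \mathrm{tt} \text{ and } w_2 = \mathrm{tt} \\ (\mathrm{Bool}, \mathrm{ff}) & \text{if } w_1 = \mathrm{ff} \text{ or } w_2 = \mathrm{ff} \end{cases} \]
And thus, in this case, the lemma holds.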

The case $b = b_1 \vee b_2$: This is similar to the case $b = b_1 \wedge b_2$ and we omit the details. Thus, the lemma holds.

Proof of the thesis. We will prove this by induction on $S$.

Base. We distinguish the following cases:

The case $S = \mathtt{skip}$: We need to prove that if $\langle \mathtt{skip}, s \rangle \to s'$ in While, then $\langle \mathtt{skip}, u \rangle \to u'$ in LooselyWhile. From the natural semantics of While we get that the only possible deduction rule is $[\mathrm{skip}_{w}]$. From this rule we get:
\[ \langle \mathtt{skip}, s \rangle \to s \]
Thus, $s' = s$. From Table 3.2 we get that the only possible deduction rule is $[\mathrm{skip}_{lw}]$. From this rule we get:
\[ \langle \mathtt{skip}, u \rangle \to u \]
so $u' = u$. Thus, the thesis holds in this case.

The case $S = x := a$: We need to prove that if $\langle x := a, s \rangle \to s'$ in While, then $\langle x := a, u \rangle \to u'$ in LooselyWhile. From the natural semantics of While we get that the only possible deduction rule is $[\mathrm{ass}_{w}]$. From this rule we get:
\[ \langle x := a, s \rangle \to s[x \mapsto \mathcal{A}_{While}\llbracket a \rrbracket s] \]
Let $w = \mathcal{A}_{While}\llbracket a \rrbracket s$, so we can write:
\[ \langle x := a, s \rangle \to s[x \mapsto w] \]
From Table 3.2 we get that the only possible deduction rule is $[\mathrm{ass}_{lw}]$. From this rule we get:
\[ \langle x := a, u \rangle \to u[x \mapsto \mathcal{E}\llbracket a \rrbracket u] \]
From Lemma 3 we get:
\[ \langle x := a, u \rangle \to u[x \mapsto (\mathrm{Int}, w)] \]
We can rewrite $u$ as $LW(s)$ (see section 5.1), so we get:
\[ \langle x := a, LW(s) \rangle \to LW(s)[x \mapsto (\mathrm{Int}, w)] \]
From the definition of $LW$ it is trivial to see that we get:
\[ \langle x := a, LW(s) \rangle \to LW(s[x \mapsto w]) \]
Thus, the thesis holds in this case.
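The final rewriting step of the assignment case, $LW(s)[x \mapsto (\mathrm{Int}, w)] = LW(s[x \mapsto w])$, can be checked on a concrete state. The Python sketch below is only an illustration under assumptions: states are modelled as finite dictionaries, and $LW$ is assumed to tag every While value with the type Int (as suggested by the note in the proof of Lemma 3: if $s\,x = w$ then $u\,x = (\mathrm{Int}, w)$). The names `lw` and `update` are hypothetical and not taken from section 5.1.

```python
def lw(s):
    """Translate a While state into a LooselyWhile state.

    Assumed definition: every integer value w becomes the pair ("Int", w).
    """
    return {x: ("Int", w) for x, w in s.items()}


def update(state, x, value):
    """Return state[x -> value] without mutating the original state."""
    new_state = dict(state)
    new_state[x] = value
    return new_state


s = {"x": 1, "y": 7}
w = 42

left = update(lw(s), "x", ("Int", w))  # LW(s)[x -> (Int, w)]
right = lw(update(s, "x", w))          # LW(s[x -> w])
print(left == right)
```

Updating before or after the translation gives the same LooselyWhile state, which is the property the proof calls trivial.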

Induction. The induction hypothesis is: for all statements $S$, and all states $s$ and $s'$:
\[ \text{If } \langle S, s \rangle \to s', \text{ then } \langle S, u \rangle \to u' \qquad (\ast) \]
We distinguish the following cases:

The case $S = S_1 ; S_2$: We need to prove that if $\langle S_1 ; S_2, s \rangle \to s'$ in While, then $\langle S_1 ; S_2, u \rangle \to u'$ in LooselyWhile. Assume that $\langle S_1 ; S_2, s \rangle \to s'$ in While. Now we need to prove that $\langle S_1 ; S_2, u \rangle \to u'$ in LooselyWhile. From the natural semantics of While we get that the only possible deduction rule is $[\mathrm{comp}_{w}]$. From this rule we get:
\[ \langle S_1, s \rangle \to s'' \quad \text{and} \quad \langle S_2, s'' \rangle \to s' \]
From the induction hypothesis and the above we get that in LooselyWhile:
\[ \langle S_1, u \rangle \to u'' \quad \text{and} \quad \langle S_2, u'' \rangle \to u' \]
From Table 3.2 we get the rule $[\mathrm{comp}_{lw}]$, which says that because we have a deduction of $\langle S_1, u \rangle \to u''$ and of $\langle S_2, u'' \rangle \to u'$, we have a deduction of $\langle S_1 ; S_2, u \rangle \to u'$. Thus, the thesis holds in this case.

The case $S = \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2$: We need to prove that if $\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, s \rangle \to s'$ in While, then $\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, u \rangle \to u'$ in LooselyWhile. We assume that $\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, s \rangle \to s'$ in While. Now we need to prove that $\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, u \rangle \to u'$ in LooselyWhile. We distinguish two cases:

1. $\mathcal{B}_{While}\llbracket b \rrbracket s = \mathrm{tt}$. From the natural semantics of While we get that the only possible deduction rule is $[\mathrm{if}^{tt}_{w}]$. From this we get $\langle S_1, s \rangle \to s'$. From Lemma 4 we get that $\mathcal{E}\llbracket b \rrbracket u = (\mathrm{Bool}, \mathrm{tt})$. From Table 3.2 we get that the only possible deduction rule for $\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, u \rangle$ is $[\mathrm{if}^{tt}_{lw}]$. This means that we need to prove that $\langle S_1, u \rangle \to u'$. We get this from the induction hypothesis. Thus, the thesis holds in this case.

2. $\mathcal{B}_{While}\llbracket b \rrbracket s = \mathrm{ff}$. From the natural semantics of While we get that the only possible deduction rule is $[\mathrm{if}^{ff}_{w}]$. From this we get $\langle S_2, s \rangle \to s'$.

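From Lemma 4 we get that $\mathcal{E}\llbracket b \rrbracket u = (\mathrm{Bool}, \mathrm{ff})$. From Table 3.2 we get that the only possible deduction rule for $\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, u \rangle$ is $[\mathrm{if}^{ff}_{lw}]$. This means that we need to prove that $\langle S_2, u \rangle \to u'$. We get this from the induction hypothesis. Thus, the thesis holds in this case.

The case $S = \mathtt{while}\ b\ \mathtt{do}\ S'$: We need to prove that if $\langle \mathtt{while}\ b\ \mathtt{do}\ S', s \rangle \to s'$ in While, then $\langle \mathtt{while}\ b\ \mathtt{do}\ S', u \rangle \to u'$ in LooselyWhile. We assume that $\langle \mathtt{while}\ b\ \mathtt{do}\ S', s \rangle \to s'$. We now need to prove that $\langle \mathtt{while}\ b\ \mathtt{do}\ S', u \rangle \to u'$. We will prove this by induction on the construction of the deduction tree of the while statement.

Base. The last deduction rule used was $[\mathrm{while}^{ff}_{w}]$. From this rule we get that $\mathcal{B}_{While}\llbracket b \rrbracket s = \mathrm{ff}$ and that $s' = s$. Thus we need to prove that $\langle \mathtt{while}\ b\ \mathtt{do}\ S', u \rangle \to u$. From Lemma 4 we get that $\mathcal{E}\llbracket b \rrbracket u = (\mathrm{Bool}, \mathrm{ff})$, so the only possible deduction rule is $[\mathrm{while}^{ff}_{lw}]$, which yields exactly what we needed to prove. Thus, the thesis holds in this case.

Induction. The induction hypothesis is that for smaller subtrees of the while statement, the thesis holds. Thus, for a certain $s''$:
\[ \text{If } \langle \mathtt{while}\ b\ \mathtt{do}\ S', s'' \rangle \to s', \text{ then } \langle \mathtt{while}\ b\ \mathtt{do}\ S', u'' \rangle \to u' \qquad (\ast\ast) \]
From the natural semantics of While we get that the only possible last step in the deduction tree of the while statement is $[\mathrm{while}^{tt}_{w}]$, because the other possibility ($[\mathrm{while}^{ff}_{w}]$) is covered in the base step of this induction. From this rule we get that $\mathcal{B}_{While}\llbracket b \rrbracket s = \mathrm{tt}$. We also get that there are subtrees for $\langle S', s \rangle \to s''$ and $\langle \mathtt{while}\ b\ \mathtt{do}\ S', s'' \rangle \to s'$. By Lemma 4, $\mathcal{E}\llbracket b \rrbracket u = (\mathrm{Bool}, \mathrm{tt})$, so from Table 3.2 the only possible deduction rule for $\langle \mathtt{while}\ b\ \mathtt{do}\ S', u \rangle$ is $[\mathrm{while}^{tt}_{lw}]$. From $[\mathrm{while}^{tt}_{lw}]$ we get that if we can prove that there are deductions of $\langle S', u \rangle \to u''$ and $\langle \mathtt{while}\ b\ \mathtt{do}\ S', u'' \rangle \to u'$, we have proven that $\langle \mathtt{while}\ b\ \mathtt{do}\ S', u \rangle \to u'$. The first, $\langle S', u \rangle \to u''$, follows from the induction hypothesis on statements $(\ast)$. The second, $\langle \mathtt{while}\ b\ \mathtt{do}\ S', u'' \rangle \to u'$, follows from $(\ast\ast)$. Thus, the thesis holds.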
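The statement proved above can also be tested on concrete programs. The following Python sketch is a rough executable reading of both semantics, not the definitions from Tables 3.1 and 3.2. It rests on several assumptions: programs are encoded as nested tuples, $LW$ tags every value with Int, numerals evaluate to `("Int", n)`, and only the identity casts are implemented, since those are the only ones the proof uses for programs that come from While. All function names are hypothetical.

```python
ARITH = {"add": lambda p, q: p + q,
         "mul": lambda p, q: p * q,
         "sub": lambda p, q: p - q}


def a_while(a, s):
    """A_While: arithmetic expressions over a While state (variable -> int)."""
    if a[0] == "num":
        return a[1]
    if a[0] == "var":
        return s[a[1]]
    return ARITH[a[0]](a_while(a[1], s), a_while(a[2], s))


def b_while(b, s):
    """B_While: boolean expressions over a While state."""
    if b[0] in ("true", "false"):
        return b[0] == "true"
    if b[0] == "eq":
        return a_while(b[1], s) == a_while(b[2], s)
    if b[0] == "le":
        return a_while(b[1], s) <= a_while(b[2], s)
    if b[0] == "not":
        return not b_while(b[1], s)
    return b_while(b[1], s) and b_while(b[2], s)  # "and"


def to_type(t, tagged):
    """ToInt / ToBool, restricted to the identity casts (assumption)."""
    if tagged[0] != t:
        raise TypeError("non-identity casts are outside this sketch")
    return tagged[1]


def e_loose(e, u):
    """E: expressions over a LooselyWhile state (variable -> (type, value))."""
    tag = e[0]
    if tag == "num":
        return ("Int", e[1])  # assumed rule for numerals
    if tag == "var":
        return u[e[1]]
    if tag in ("true", "false"):
        return ("Bool", tag == "true")
    if tag == "not":
        return ("Bool", not to_type("Bool", e_loose(e[1], u)))
    if tag == "and":
        left = to_type("Bool", e_loose(e[1], u))
        return ("Bool", left and to_type("Bool", e_loose(e[2], u)))
    if tag == "eq":  # Equals, only for operands of the same type
        l, r = e_loose(e[1], u), e_loose(e[2], u)
        if l[0] != r[0]:
            raise TypeError("mixed-type Equals is outside this sketch")
        return ("Bool", l[1] == r[1])
    p, q = to_type("Int", e_loose(e[1], u)), to_type("Int", e_loose(e[2], u))
    if tag == "le":
        return ("Bool", p <= q)
    return ("Int", ARITH[tag](p, q))


def run(S, state, eval_a, test_b):
    """Natural semantics of statements, parametrised by expression evaluation."""
    tag = S[0]
    if tag == "skip":
        return state
    if tag == "ass":
        return {**state, S[1]: eval_a(S[2], state)}
    if tag == "comp":
        return run(S[2], run(S[1], state, eval_a, test_b), eval_a, test_b)
    if tag == "if":
        branch = S[2] if test_b(S[1], state) else S[3]
        return run(branch, state, eval_a, test_b)
    while test_b(S[1], state):  # "while": repeat the body until b is false
        state = run(S[2], state, eval_a, test_b)
    return state


def exec_while(S, s):
    return run(S, s, a_while, b_while)


def exec_loose(S, u):
    return run(S, u, e_loose, lambda b, st: e_loose(b, st) == ("Bool", True))


def lw(s):
    """Assumed translation LW: tag every While value with Int."""
    return {x: ("Int", w) for x, w in s.items()}


# y := 1; while not (x = 1) do (y := y * x; x := x - 1)
factorial = ("comp", ("ass", "y", ("num", 1)),
             ("while", ("not", ("eq", ("var", "x"), ("num", 1))),
              ("comp", ("ass", "y", ("mul", ("var", "y"), ("var", "x"))),
               ("ass", "x", ("sub", ("var", "x"), ("num", 1))))))

s = {"x": 5}
s_final = exec_while(factorial, s)
u_final = exec_loose(factorial, lw(s))
print(lw(s_final) == u_final)
```

The statement rules are shared between the two interpreters because, in the proof, each While rule is matched by a LooselyWhile rule of the same shape; only the evaluation of expressions differs. Such a check covers terminating programs only and is no substitute for the proof.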

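APPENDIX A Casting values

To be able to cast values of a given type to any other given type, we need to introduce, for every type, functions to every other type that describe this cast. For readability's sake, we will introduce a wrapper function that returns the actual atomic cast function that performs the cast.

A.1 Atomic casts

The functions that perform (dumb) atomic casts are defined below:
\[ \mathrm{BoolToBool}(b) = b \]
\[ \mathrm{BoolToInt}(b) = \begin{cases} 0 & \text{if } b = \mathrm{true} \\ 1 & \text{if } b = \mathrm{false} \end{cases} \]
\[ \mathrm{BoolToString}(b) = \begin{cases} \texttt{"true"} & \text{if } b = \mathrm{true} \\ \texttt{"false"} & \text{if } b = \mathrm{false} \end{cases} \]
\[ \mathrm{IntToBool}(x) = \begin{cases} \mathrm{tt} & \text{if } x \neq 0 \\ \mathrm{ff} & \text{if } x = 0 \end{cases} \]
\[ \mathrm{IntToInt}(x) = x \]
\[ \mathrm{IntToString}(x) = \begin{cases} \texttt{"0"} & \text{if } x = 0 \\ \texttt{"1"} & \text{if } x = 1 \\ \texttt{"0"}\,\mathrm{IntToString}(\frac{x}{2}) & \text{if } x \bmod 2 = 0 \\ \texttt{"1"}\,\mathrm{IntToString}(\frac{x}{2}) & \text{if } x \bmod 2 = 1 \end{cases} \]
\[ \mathrm{StringToBool}(w) = \begin{cases} \mathrm{tt} & \text{if } \mathrm{StartsWith}(w, \texttt{"true"}) \\ \mathrm{ff} & \text{if } \mathrm{StartsWith}(w, \texttt{"false"}) \\ \mathrm{ff} & \text{otherwise} \end{cases} \]
\[ \mathrm{StringToInt}(w) = \begin{cases} \mathrm{GetIntFromString}(w) & \text{if } \mathrm{IsNumeric}(w) = \mathrm{tt} \\ 0 & \text{if } \mathrm{IsNumeric}(w) = \mathrm{ff} \end{cases} \]
\[ \mathrm{StringToString}(w) = w \]
Here, the function $\mathrm{GetIntFromString}(w)$ returns the integer value of the longest head of the string $w$ that is an integer. Working out these trivial details to the finest would only decrease the readability. For example:
\[ \begin{aligned} \mathrm{GetIntFromString}(\texttt{"701"}) &= 701 \\ \mathrm{GetIntFromString}(\texttt{"12 or more"}) &= 12 \\ \mathrm{GetIntFromString}(\texttt{"2 1 4"}) &= 2 \\ \mathrm{GetIntFromString}(\texttt{"whatever"}) &= 0 \end{aligned} \]

A.2 The wrapper function Cast

The function $\mathrm{Cast}$ is trivial. It takes two types $t_1$ and $t_2$ as input and returns the atomic cast function that needs to be called for casting values of expressions of type $t_1$ to $t_2$:
\[ \mathrm{Cast}(t_1, t_2) = \begin{cases} \mathrm{BoolToBool} & \text{if } t_1 = \mathrm{Bool} \text{ and } t_2 = \mathrm{Bool} \\ \mathrm{BoolToInt} & \text{if } t_1 = \mathrm{Bool} \text{ and } t_2 = \mathrm{Int} \\ \mathrm{BoolToString} & \text{if } t_1 = \mathrm{Bool} \text{ and } t_2 = \mathrm{String} \\ \mathrm{IntToBool} & \text{if } t_1 = \mathrm{Int} \text{ and } t_2 = \mathrm{Bool} \\ \mathrm{IntToInt} & \text{if } t_1 = \mathrm{Int} \text{ and } t_2 = \mathrm{Int} \\ \mathrm{IntToString} & \text{if } t_1 = \mathrm{Int} \text{ and } t_2 = \mathrm{String} \\ \mathrm{StringToBool} & \text{if } t_1 = \mathrm{String} \text{ and } t_2 = \mathrm{Bool} \\ \mathrm{StringToInt} & \text{if } t_1 = \mathrm{String} \text{ and } t_2 = \mathrm{Int} \\ \mathrm{StringToString} & \text{if } t_1 = \mathrm{String} \text{ and } t_2 = \mathrm{String} \end{cases} \]

A.3 Casting to specific types

In the semantics described in chapter 3 the functions $\mathrm{ToInt}$, $\mathrm{ToBool}$ and $\mathrm{ToString}$ are used. Now that we have a definition of the wrapper function $\mathrm{Cast}$, these functions can (and will) be defined as follows:
\[ \begin{aligned} \mathrm{ToBool}((t, w)) &= \mathrm{Cast}(t, \mathrm{Bool})(w) \\ \mathrm{ToInt}((t, w)) &= \mathrm{Cast}(t, \mathrm{Int})(w) \\ \mathrm{ToString}((t, w)) &= \mathrm{Cast}(t, \mathrm{String})(w) \end{aligned} \]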
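The cast machinery of this appendix can be sketched in Python as a lookup table from type pairs to atomic cast functions. This is an illustration under assumptions rather than a reference implementation: types are the strings `"Bool"`, `"Int"` and `"String"`; `bool_to_int` follows the values as printed in A.1; `int_to_string` reads the recursive definition with integer division and remainder and only for $x \geq 0$; and, because $\mathrm{IsNumeric}$ is not defined in the text, a string is assumed to be numeric when it starts with a digit.

```python
import re


def bool_to_bool(b):
    return b


def bool_to_int(b):
    return 0 if b else 1  # values as printed in A.1


def bool_to_string(b):
    return "true" if b else "false"


def int_to_bool(x):
    return x != 0


def int_to_int(x):
    return x


def int_to_string(x):
    # Assumed reading of A.1: x / 2 is integer division, the test is x mod 2.
    if x < 0:
        raise ValueError("this sketch only reads the definition for x >= 0")
    if x in (0, 1):
        return str(x)
    return str(x % 2) + int_to_string(x // 2)


def string_to_bool(w):
    return w.startswith("true")  # "false..." and everything else give ff


def get_int_from_string(w):
    """Integer value of the longest leading run of digits, or 0 if none."""
    match = re.match(r"\d+", w)
    return int(match.group()) if match else 0


def string_to_int(w):
    is_numeric = re.match(r"\d", w) is not None  # assumed meaning of IsNumeric
    return get_int_from_string(w) if is_numeric else 0


def string_to_string(w):
    return w


CAST = {
    ("Bool", "Bool"): bool_to_bool,
    ("Bool", "Int"): bool_to_int,
    ("Bool", "String"): bool_to_string,
    ("Int", "Bool"): int_to_bool,
    ("Int", "Int"): int_to_int,
    ("Int", "String"): int_to_string,
    ("String", "Bool"): string_to_bool,
    ("String", "Int"): string_to_int,
    ("String", "String"): string_to_string,
}


def cast(t1, t2):
    """The wrapper Cast: return the atomic cast function from t1 to t2."""
    return CAST[(t1, t2)]


def to_bool(tagged):
    t, w = tagged
    return cast(t, "Bool")(w)


def to_int(tagged):
    t, w = tagged
    return cast(t, "Int")(w)


def to_string(tagged):
    t, w = tagged
    return cast(t, "String")(w)


print(to_int(("String", "12 or more")))
print(to_bool(("Int", 0)))
print(to_string(("Bool", True)))
```

Returning the function from `cast` and applying it afterwards mirrors the notation $\mathrm{Cast}(t_1, t_2)(w)$ used in the proofs of chapter 5.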

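APPENDIX B String conversions

Below are the definitions of a number of functions that were used in the definition of the semantics of LooselyWhile. These functions are $\mathrm{head}$, $\mathrm{tail}$ and $\mathrm{concat}$. Here $\varepsilon$ denotes the empty string.
\[ \mathrm{head}(s) = \begin{cases} \varepsilon & \text{if } s = \varepsilon \\ c_1 & \text{if } s = c_1, \ldots, c_n \end{cases} \]
\[ \mathrm{tail}(s) = \begin{cases} \varepsilon & \text{if } s = \varepsilon \\ c_2, \ldots, c_n & \text{if } s = c_1, \ldots, c_n \end{cases} \]
\[ \mathrm{concat}(s_1, s_2) = c_1, \ldots, c_n, k_1, \ldots, k_m \quad \text{where } s_1 = c_1, \ldots, c_n \text{ and } s_2 = k_1, \ldots, k_m \]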
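These three functions map directly onto Python string slicing. The sketch below assumes that both $\mathrm{head}$ and $\mathrm{tail}$ return the empty string when applied to the empty string; the function names simply mirror those of this appendix.

```python
def head(s):
    """First character of s, or the empty string if s is empty (assumed)."""
    return s[:1]


def tail(s):
    """Everything after the first character; empty for strings of length <= 1."""
    return s[1:]


def concat(s1, s2):
    """The characters of s1 followed by the characters of s2."""
    return s1 + s2


word = "while"
print(head(word), tail(word), concat(head(word), tail(word)))
```

Under this reading, `concat(head(s), tail(s))` gives back `s` for every string, including the empty one.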


Bibliography

[1] Hanne Riis Nielson and Flemming Nielson. Semantics with Applications: A Formal Introduction. 1999.