Reasoning about programs
Chapter 9 of Thompson
Proof versus testing
A proof states some property of a program that holds for all inputs.
Testing shows only that a property holds for a particular set of inputs.
Property checking, such as with QuickCheck, improves coverage by randomly generating inputs, but is still not a proof.
Proofs in functional programming rely on treating function definitions as logical terms, amenable to manipulation via the rules of logic.
Understanding definitions
Consider:
    length [] = 0                    -- (length.1)
    length (z:zs) = 1 + length zs    -- (length.2)
Understanding definitions: by evaluation
    length [2,3,1]
    ~> 1 + length [3,1]                  by (length.2)
    ~> 1 + (1 + length [1])              by (length.2)
    ~> 1 + (1 + (1 + length []))         by (length.2)
    ~> 1 + (1 + (1 + 0))                 by (length.1)
    ~> 3
Understanding definitions: as descriptions
(length.1) says what length [] is.
(length.2) says that whatever values of x and xs we choose, length (x:xs) will be equal to 1 + length xs.
The second case is a general property of length: how it behaves on all non-empty lists.
These allow us to conclude that
    length [x] = 1    -- (length.3)
... but how?
Understanding definitions: as descriptions
    length [x]
    = length (x:[])      by definition of [x]
    = 1 + length []      by (length.2)
    = 1 + 0              by (length.1)
    = 1
We can read a definition as:
1. describing how to compute particular results, and
2. a general description of the behaviour of the function, allowing deductions to be made like (length.3) and others like
    length (xs ++ ys) = length xs + length ys
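The two deductions above can also be spot-checked by running them. This is a minimal sketch, not a proof: length' is the slides' length renamed to avoid clashing with the Prelude, and smallLists and checkLength are hypothetical helpers that enumerate a few small lists instead of using QuickCheck.

```haskell
-- length' mirrors the slides' length; the prime avoids a clash with Prelude's length.
length' :: [a] -> Int
length' []     = 0                 -- (length.1)
length' (_:zs) = 1 + length' zs    -- (length.2)

-- (length.3): a singleton list has length 1.
prop_singleton :: Int -> Bool
prop_singleton x = length' [x] == 1

-- The append property quoted above.
prop_append :: [Int] -> [Int] -> Bool
prop_append xs ys = length' (xs ++ ys) == length' xs + length' ys

-- Hypothetical helper: all lists over {0,1} with at most n elements.
smallLists :: Int -> [[Int]]
smallLists n = concatMap go [0 .. n]
  where
    go :: Int -> [[Int]]
    go 0 = [[]]
    go k = [x : xs | x <- [0, 1], xs <- go (k - 1)]

-- Hypothetical helper: True if both properties hold on every sampled input.
checkLength :: Bool
checkLength =
  all prop_singleton [-3 .. 3]
    && and [prop_append xs ys | xs <- smallLists 3, ys <- smallLists 3]
```

A passing check only covers the sampled inputs; the general statements still need the symbolic argument.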
Proof as symbolic evaluation
Instead of giving length an argument built from particular values, like 2, we replace each value with a symbolic variable, like x, but use the evaluation rules in the same way.
Combining this symbolic evaluation with other proof techniques (like induction) allows many proofs for recursive functions.
Testing
    mysterymax :: Integer -> Integer -> Integer -> Integer
    mysterymax x y z
      | x > y && x > z = x
      | y > x && y > z = y
      | otherwise      = z

    prop_mystery :: Integer -> Integer -> Integer -> Bool
    prop_mystery x y z = mysterymax x y z == (x `max` y) `max` z
Proof by cases
Consider:
    x > y && x > z
    y > x && y > z
    z > x && z > y
For each of these cases, mysterymax is correct.
Proof by cases
For all other cases at least two of the arguments are equal.
If all three are equal:
    x == y && y == z
then mysterymax is correct.
Suppose:
    y == z && z > x
then it is still correct.
But, for:
    x == y && y > z
the result z is incorrect!
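The failing case can be reproduced mechanically. The sketch below searches a small range of inputs for counterexamples to prop_mystery; counterexamples and max3 are hypothetical names, and max3 is only one possible repair (using >= in the guards), not a fix given in the source.

```haskell
mysterymax :: Integer -> Integer -> Integer -> Integer
mysterymax x y z
  | x > y && x > z = x
  | y > x && y > z = y
  | otherwise      = z

prop_mystery :: Integer -> Integer -> Integer -> Bool
prop_mystery x y z = mysterymax x y z == (x `max` y) `max` z

-- Hypothetical helper: every triple drawn from 0..2 that falsifies the property.
counterexamples :: [(Integer, Integer, Integer)]
counterexamples =
  [ (x, y, z)
  | x <- [0 .. 2], y <- [0 .. 2], z <- [0 .. 2]
  , not (prop_mystery x y z)
  ]

-- Hypothetical repair (an assumption, not from the source): allow ties with >=.
max3 :: Integer -> Integer -> Integer -> Integer
max3 x y z
  | x >= y && x >= z = x
  | y >= z           = y
  | otherwise        = z
```

Every triple found this way satisfies x == y && y > z, which is exactly the case the proof attempt singled out.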
Proof by cases
Proof by cases is a form of symbolic testing: it checks the function case by case, for classes of inputs.
However, in general, finding a proof of correctness is more difficult than in this example.
So, concrete testing is still a valuable exercise.
Definedness and termination
Evaluation can have two outcomes:
- the evaluation can halt (terminate) with an answer
- the evaluation can go on forever (the value is undefined)
Consider:
    fact :: Integer -> Integer
    fact n
      | n == 0    = 1
      | otherwise = n * fact (n-1)
Definedness and termination
fact 2 terminates.
fact (-2) is undefined:
    fact (-2)
    ~> (-2) * fact (-3)
    ~> (-2) * (-3) * fact (-4)
    ~> ...
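One way to make the undefined cases explicit is to guard the domain before recursing. The sketch below keeps fact as on the slide and adds factMaybe, a hypothetical wrapper (not from the source) that returns Nothing for negative arguments instead of looping.

```haskell
fact :: Integer -> Integer
fact n
  | n == 0    = 1
  | otherwise = n * fact (n - 1)

-- Hypothetical total wrapper: negative inputs are rejected up front,
-- so fact is only ever called where it terminates.
factMaybe :: Integer -> Maybe Integer
factMaybe n
  | n < 0     = Nothing
  | otherwise = Just (fact n)
```

With this wrapper, properties of fact can be stated for all Integer inputs without silently assuming n >= 0.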
Definedness and termination
Proofs must confine themselves to cases for defined values where expected properties hold.
    0 * e = 0, but only if e is defined
    0 * e is undefined (not 0), if e is undefined
Proofs usually hold only for all defined values.
Undefined values are only of interest if the function in question does not give a defined value when it is expected to.
Finiteness
Haskell evaluation is lazy, so arguments are evaluated only if their values are actually needed.
Lazy evaluation allows definition and use of infinite lists, like [1,2,3,...], and partially defined lists.
Our main attention will be on finite lists, which have a defined, finite length and defined elements, e.g.:
    []
    [1,2,3]
    [[4,5],[3,2,1],[]]
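A small sketch of what laziness permits, using only Prelude functions. The names naturals, firstThree, holed and holedLength are hypothetical; holed is a list with one undefined element, chosen here as an illustration of a list that is not fully defined.

```haskell
-- An infinite list: fine to define, as long as only a finite part is demanded.
naturals :: [Integer]
naturals = [1 ..]

-- take forces just the first three cells of the infinite list.
firstThree :: [Integer]
firstThree = take 3 naturals

-- A list whose spine is finite but whose middle element is undefined.
holed :: [Integer]
holed = [1, undefined, 3]

-- length walks the spine only, so the undefined element is never evaluated.
holedLength :: Int
holedLength = length holed
```

Evaluating sum holed, by contrast, would demand the undefined element and fail, which is why the proofs here are stated for finite lists with defined elements.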
Assumptions in proofs
Logical implication A ⇒ B says that if A holds then B also holds (B follows from A).
To prove an implication A ⇒ B, we can assume A while proving B; we then only need to prove A to guarantee B.
A proof of A ⇒ B is a process for turning a proof of A into a proof of B.
In proof by induction, the induction step proves one property assuming another.
Free variables and quantifiers
Equational reasoning implicitly quantifies over all possible values of free variables:
    square x = x * x
says this holds for all (defined) values of the free variable x.
More explicitly, we should actually write this with a logical quantifier:
    ∀x (square x = x * x)
Induction
Consider:
    sum :: [Integer] -> Integer
    sum [] = 0                  -- (sum.1)
    sum (x:xs) = x + sum xs     -- (sum.2)
This gives a value outright at [], and defines the value of sum (x:xs) using the value sum xs.
Principle of structural induction for lists
In order to prove that a logical property P(xs) holds for all finite lists xs we have to do two things:
- Base case: Prove P([]) outright.
- Induction step: Prove P(x:xs) on the assumption that P(xs) holds. In other words, P(xs) ⇒ P(x:xs) has to be proved.
The P(xs) is called the induction hypothesis since it is assumed in proving P(x:xs).
This is just like primitive recursion: instead of building values of a function we build up parts of a proof.
In both cases [] is a base case, and the general case goes from xs to (x:xs).
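The analogy with primitive recursion can be made concrete. In the sketch below, listRec is a hypothetical name for a primitive recursion scheme over lists: it takes a result for [] and a step that builds the result for (x:xs) from the result for xs, mirroring the base case and induction step. sum' is the slides' sum rewritten with it (primed to avoid the Prelude name).

```haskell
-- Hypothetical primitive recursion scheme for lists:
-- 'base' plays the role of the base case, 'step' the induction step.
listRec :: b -> (a -> [a] -> b -> b) -> [a] -> b
listRec base _    []     = base
listRec base step (x:xs) = step x xs (listRec base step xs)

-- sum' uses the value at xs to define the value at (x:xs), as in (sum.2).
sum' :: [Integer] -> Integer
sum' = listRec 0 (\x _ rest -> x + rest)
```

Here the argument rest stands where the induction hypothesis would stand in a proof: it is the already-available result for the shorter list.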
Justification of structural induction for lists
Just as recursion is not circular, proof by induction builds a proof for all finite lists in stages.
Given proofs of P([]) and of P(xs) ⇒ P(x:xs) for all x and xs, suppose we want to show P([1,2,3]):
1. P([]) holds;
2. P([]) ⇒ P([3]) holds, since it is a case of P(xs) ⇒ P(x:xs);
3. 1 & 2 give us that P([3]) holds;
4. P([3]) ⇒ P([2,3]) holds, as for 2;
5. 3 & 4 give us that P([2,3]) holds;
6. P([2,3]) ⇒ P([1,2,3]) holds, as for 2;
7. 5 & 6 give us that P([1,2,3]) holds.
This works for all finite lists, so we have P(xs) for all finite lists.
Example: doubleall
Consider:
    doubleall :: [Integer] -> [Integer]
    doubleall [] = []                           -- (doubleall.1)
    doubleall (z:zs) = 2*z : doubleall zs       -- (doubleall.2)
Presumably:
    sum (doubleall xs) = 2 * sum xs             -- (sum+dblall)
[QuickCheck property testing passes]
Example: doubleall
Two induction goals:
    sum (doubleall []) = 2 * sum []             -- (base)
    sum (doubleall (x:xs)) = 2 * sum (x:xs)     -- (ind)
using the induction hypothesis:
    sum (doubleall xs) = 2 * sum xs             -- (hyp)
Example: doubleall
The base case:
    sum (doubleall [])
    = sum ([])           by (doubleall.1)
    = 0                  by (sum.1)

    2 * sum ([])
    = 2 * 0              by (sum.1)
    = 0                  by *
Example: doubleall
The induction step:
    sum (doubleall (x:xs))
    = sum (2*x : doubleall xs)       by (doubleall.2)
    = 2*x + sum (doubleall xs)       by (sum.2)

    2 * sum (x:xs)
    = 2 * (x + sum xs)               by (sum.2)
    = 2*x + 2 * sum xs               by distribution of *
Now the two sides are equal:
    2*x + sum (doubleall xs)
    = 2*x + 2 * sum xs               by (hyp)
QED
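The property just proved can also be run as a sanity check. This sketch uses the slides' definitions with sum renamed to sum' (to avoid the Prelude name); prop_sumDoubleall and checkSumDoubleall are hypothetical names, and the sample lists are arbitrary choices rather than QuickCheck-generated inputs.

```haskell
sum' :: [Integer] -> Integer
sum' []     = 0                        -- (sum.1)
sum' (x:xs) = x + sum' xs              -- (sum.2)

doubleall :: [Integer] -> [Integer]
doubleall []     = []                      -- (doubleall.1)
doubleall (z:zs) = 2 * z : doubleall zs    -- (doubleall.2)

-- (sum+dblall) as an executable property.
prop_sumDoubleall :: [Integer] -> Bool
prop_sumDoubleall xs = sum' (doubleall xs) == 2 * sum' xs

-- Hypothetical helper: evaluate the property on a few hand-picked lists.
checkSumDoubleall :: Bool
checkSumDoubleall = all prop_sumDoubleall samples
  where
    samples = [[], [0], [1, 2, 3], [-4, 5], [10, -10, 7, 7]]
```

The check agrees with the proof on these samples, but only the induction argument covers every finite list.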
Finding induction proofs
- First step: define a QuickCheck property, and ensure it generates no counterexamples.
- State the goal of the induction and the two sub-goals: (base) and (hyp) ⇒ (ind).
- Change variable names as needed to avoid confusion (α-conversion).
- Use only the definitions of the functions involved and general rules of arithmetic to simplify sub-goals (for equations, do LHS and RHS separately).
- For the induction step (ind), use (hyp) in its proof.
- Label each step of the proof with its justification.