Principles of Dependable Systems
  Building Reliable Software
  School of Computer & Communication Sciences
  École Polytechnique Fédérale de Lausanne
  Winter 2006-2007
Outline
  - Class Projects: meetings next week
  - OP2 + OP3: next week
  - Building reliable software
      - Program Analysis
      - Specifications
      - Model Checking
      - Theorem Proving
Reliability
  - Definition
  - Recap of last week's lecture
  - Hardware reliability
Example: Memory Leaks (1)
  Proving absence of memory leaks
    - Code inspection
    - Testing
    - Formally
Example: Memory Leaks (2)
  - Build abstract model of system
      allocate(blk), read(blk), write(blk), free(blk), copy(blk1, blk2),...
  - State property precisely
      Exec :=... allocate(b)... [ read(b) write(b) ] free(b)...!( read(b), write(b),... )*
  - Prove model has property
      Extended regular expression generation & match...
  - Prove model is accurate
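The property on this slide can be tried out on a single recorded trace. The sketch below is only an illustration: the one-letter encoding, the function names, and the restriction to allocate/read/write/free (copy is left out) are assumptions, not part of the lecture. It checks one trace against the life-cycle pattern; it does not prove the property for every execution of the model.

```python
import re

# Hypothetical one-letter encoding of the model's operations on one block.
CODES = {"allocate": "a", "read": "r", "write": "w", "free": "f"}

# Life cycle of a block: allocate, any number of reads/writes, then free,
# and nothing afterwards (so no use after free and no missing free).
# The outer star allows the same block to be allocated again later.
LIFECYCLE = re.compile(r"(a[rw]*f)*")

def per_block_traces(trace):
    """Split a trace of (operation, block) pairs into one string per block."""
    traces = {}
    for op, blk in trace:
        traces[blk] = traces.get(blk, "") + CODES[op]
    return traces

def violations(trace):
    """Return the sorted list of blocks whose operations break the life cycle."""
    return sorted(blk for blk, ops in per_block_traces(trace).items()
                  if LIFECYCLE.fullmatch(ops) is None)

leaky = [("allocate", "b1"), ("write", "b1")]   # b1 is never freed
print(violations(leaky))                        # ['b1']
```

A trace that reads a block after freeing it is rejected in the same way, because nothing but a fresh allocate may follow a free.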
Example: Bell-LaPadula
  - Model for multi-level security (1973)
      U.S. Air Force was concerned...
  - Formal state transition model: subjects, labeled objects, and access control rules
  - 2 core properties
      - Simple Security Property (no read above)
      - *-Property (no write below)
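The two core properties reduce to comparisons between security levels. The sketch below assumes a simple linear ordering with hypothetical level names; it leaves out the categories and the discretionary access rules of the full model.

```python
# Hypothetical linear ordering of security levels, lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top_secret": 3}

def may_read(subject_level, object_level):
    """Simple Security Property: a subject may not read above its level."""
    return LEVELS[subject_level] >= LEVELS[object_level]

def may_write(subject_level, object_level):
    """*-Property: a subject may not write below its level."""
    return LEVELS[subject_level] <= LEVELS[object_level]

print(may_read("secret", "top_secret"))     # False: read above is refused
print(may_write("secret", "confidential"))  # False: write below is refused
```

Together the two checks only let information flow upward: a subject reads at or below its level and writes at or above it.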
Example: MLS System
  - Honeywell SCOMP (1983)
      - derivative of Multics
      - hardware + software formally verified
      - used for mail guards, time-phased force deployment data system
  - Provided basis for Orange Book
      - one of only 3 implementations of A1 level
Principle: Refinement
  Each model refines the one listed before it:
    Model A (e.g., ABS customer spec)
    Model B (e.g., design diagram)
    Model C (e.g., microcode)
    Model D (e.g., hardware implementation)
Concrete Realizations
  - Program Analysis
      - Static Analysis
  - Model Checking
  - Theorem Proving
Static Analysis
  - Useful questions:
      - Does program terminate?
      - Does x stay constant?
      - Can p ever be NULL?
      - Do p and q ever point to the same data structure in the heap?
  - Equally hard:
        int a=1 ;
        if ( procedure( gettime() ) ) a=2 ;
  - Sound approximations
Type Checking
  - Type = set of constraints
  - Can be static or dynamic
  - Conservative...
        int a=1 ;
        float x=2.0 ;
        if ( array[a+x] > 0 ) printf( "yes" ) ;
  - Memory safety: arbitrary bit patterns?
  - "If program compiles, it is correct"
Domain-Specific Rules
  - We know the rules:
      - check length of incoming strings
      - don't use freed memory
      - ...
  - Metacompilation: compiler checks rules
      - programmers can write them, or
      - statistically inferred
Example: Using Freed Mem
      ...
      connection->buffer = malloc(... );
      if (NULL == connection->buffer) {
          free( connection );
          printf( "Out of memory" );
          goto done;
      }
      ... /* connection establishment */ ...
      done:
          return connection;        /* Bug! */
  - Track state of connection: Allocated, Freed
  - How might we infer this rule?
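Tracking the Allocated/Freed state along one path can be sketched as a small state machine. The event mini-language ("alloc", "free", "use", "return") and the function name are hypothetical; a real metacompilation checker would extract such events from the compiler's representation of the code and follow every path, not just one.

```python
def check_path(events):
    """Walk one execution path and report uses of freed pointers.

    events is a list of (action, pointer_name) pairs, where action is one of
    "alloc", "free", "use", or "return" (a hypothetical mini-language).
    """
    state = {}      # pointer name -> "allocated" or "freed"
    errors = []
    for index, (action, ptr) in enumerate(events):
        if action == "alloc":
            state[ptr] = "allocated"
        elif action == "free":
            if state.get(ptr) == "freed":
                errors.append((index, "double free of " + ptr))
            state[ptr] = "freed"
        elif action in ("use", "return"):
            if state.get(ptr) == "freed":
                errors.append((index, action + " of freed " + ptr))
    return errors

# The out-of-memory path from the slide: connection is freed, then returned.
oom_path = [
    ("alloc", "connection"),
    ("use", "connection"),      # connection->buffer = malloc(...)
    ("free", "connection"),
    ("return", "connection"),
]
print(check_path(oom_path))     # [(3, 'return of freed connection')]
```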
Data Flow Analysis
      print STDERR "Enter file name:";
      $x = <STDIN>;                  # $x is tainted (user input)
      ... more code ...
      $z = "/tmp/safe_file.txt";     # $z is clean
      $y = "$sysdir/$x";             # $y is tainted
      system("cat $y");              # disallowed!
      system("cat $z");              # OK
  - Why might tainted data be bad?
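The propagation rule behind the comments on this slide (a value built from tainted data is itself tainted) can be sketched for straight-line code. This is an illustrative model only: the program representation, the function names, and the assumption that `$sysdir` is clean are not from the lecture, and branches and loops are ignored.

```python
import re

def variables_in(text):
    """Return the set of $-prefixed variable names mentioned in text."""
    return set(re.findall(r"\$(\w+)", text))

def tainted_variables(program, sources):
    """Forward taint propagation over straight-line assignments.

    program is a list of (target, expression) pairs in execution order;
    sources is the set of variables that hold user input.
    """
    tainted = set(sources)
    for target, expression in program:
        if variables_in(expression) & tainted:
            tainted.add(target)
        else:
            tainted.discard(target)   # overwritten with clean data
    return tainted

def unsafe_sinks(sinks, tainted):
    """Return the sink arguments that mention a tainted variable."""
    return [arg for arg in sinks if variables_in(arg) & tainted]

program = [
    ("z", '"/tmp/safe_file.txt"'),
    ("y", '"$sysdir/$x"'),
]
tainted = tainted_variables(program, sources={"x"})
print(unsafe_sinks(["cat $y", "cat $z"], tainted))   # ['cat $y']
```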
Static Analysis Pros/Cons
  Benefits
    - done at compile time
    - don't need to execute code (as testing does)
    - don't need to understand intention of the code
  Drawbacks
    - checked properties are shallow (close to the code)
    - more aggressive ⇒ more false positives
Formal Verification
  - Model Checking
  - Theorem Proving
  ... both require a formal specification of system properties of interest
Writing Specifications
  - Use computer-understandable language
      - Concise
      - Unambiguous
      - Complete
  - Say what without saying how
      - "how" is implementation...
  - Spec languages vs. prog languages
State Machine Specs
  1) Define system states
  2) Define all legal state-modifying ops
        result: TYPE = [output: BRKTYPE, state: BRKTYPE]
        ABS_apply_brakes( command: CMDTYPE, brk_state: BRKTYPE ) : result =
          (IF (command == APPLY)
             THEN return [brk_state, [~brk_state[left], ~brk_state[right]]]
             ELSE return [brk_state, brk_state])
Axiomatic Specs
  - Implicit statements about operations
  - Pre- and post-conditions
        binary_search( Array, Key ) → Index
          PRE:  ordered( Array ) ∧ ( Key in Array )
          POST: Array[Index] == Key
  - Invariants: pre/post-conditions for all ops
        INVAR: ordered(array) ∧ size(array) < 100
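The pre- and post-conditions above can be made executable as run-time assertions around one possible implementation. The spec says only what must hold, so the loop below is just one of many bodies that satisfy it; the helper `is_ordered` is a hypothetical name introduced for this sketch.

```python
def is_ordered(array):
    """True if the array is sorted in non-decreasing order."""
    return all(array[i] <= array[i + 1] for i in range(len(array) - 1))

def binary_search(array, key):
    # PRE: ordered(Array) and (Key in Array)
    assert is_ordered(array), "precondition violated: array not ordered"
    assert key in array, "precondition violated: key not in array"

    low, high = 0, len(array) - 1
    while low <= high:
        mid = (low + high) // 2
        if array[mid] < key:
            low = mid + 1
        elif array[mid] > key:
            high = mid - 1
        else:
            index = mid     # found; the precondition ensures we get here
            break

    # POST: Array[Index] == Key
    assert array[index] == key, "postcondition violated"
    return index

print(binary_search([1, 3, 5, 7, 9], 7))   # 3
```

Assertions only check the executions that actually run; establishing the postcondition for all inputs that satisfy the precondition is the job of the verification techniques on the following slides.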
Model Checking
  - Exhaustively search states of system
        BRKSTATUS: TYPE = ONEOF( applied, released )
        BRKTYPE: TYPE = [ BRKSTATUS, BRKSTATUS ]
        result: TYPE = [output: BRKTYPE, state: BRKTYPE]
        ABS_apply_brakes( command: CMDTYPE, brk_state: BRKTYPE ) : result =
          (IF (command == APPLY)
             THEN return [ brk_state, [~brk_state[left], ~brk_state[right]] ]
             ELSE return [ brk_state, brk_state ])
        INVAR: ( brake_state[left] == brake_state[right] )
  - Does invariant hold?
  - Finite model ⇒ guaranteed termination
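For a model this small, the exhaustive search can be written out directly. The sketch below is an explicit-state search under stated assumptions: `~` is read as toggling between applied and released, "NOOP" is a hypothetical stand-in for any command other than APPLY, and the invariant is taken to range over the brake state returned by the operation.

```python
STATUSES = ("applied", "released")
COMMANDS = ("APPLY", "NOOP")   # "NOOP" is a hypothetical non-APPLY command

def toggle(status):
    """Assumed meaning of '~' in the spec: flip applied/released."""
    return "released" if status == "applied" else "applied"

def abs_apply_brakes(command, brk_state):
    """Return (output, next_state) as in the slide's specification."""
    left, right = brk_state
    if command == "APPLY":
        return brk_state, (toggle(left), toggle(right))
    return brk_state, brk_state

def invariant(brk_state):
    left, right = brk_state
    return left == right

def check(initial_states):
    """Explore all reachable states; return a counterexample path or None."""
    frontier = [(state, [state]) for state in initial_states]
    seen = set(initial_states)
    while frontier:
        state, path = frontier.pop()
        if not invariant(state):
            return path                      # counterexample found
        for command in COMMANDS:
            _, successor = abs_apply_brakes(command, state)
            if successor not in seen:        # finite model: each state once
                seen.add(successor)
                frontier.append((successor, path + [successor]))
    return None

print(check([("released", "released")]))     # None: invariant holds
```

The `seen` set is what makes termination guaranteed: there are only four possible brake states, and each is expanded at most once. Starting from a state such as `("applied", "released")` the search returns that state as a counterexample.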
Model Checking: Pros/Cons
  Benefits
    - completely automatic (unlike theorem proving)
    - provides counterexamples
    - works on partial specs (good for large systems)
    - more interesting properties than static analysis
  Drawbacks
    - state space explosion (exponential)
    - writing and maintaining abstract model is hard
    - coarse models far from actual implementation
Theorem Provers
  1) Specify system using logic
  2) Provide context (axioms + inference rules)
  3) Specify desired properties (theorems)
  4) Machine generates proof of theorem
  - Theorem prover = fancy pattern matcher
  - Needs user guidance (lemmas, defs)
Example: ABS (1)
  Modified ABS system
      RULE: x div by 2 ⇒ (x+2) div by 2
      ABS_apply_brakes( command: CMDTYPE, brk_state: BRKTYPE ) : result =
        ...
        brk_state[left] += 10
        brk_state[right] += 10
        ...
      INVAR: ( for all i, brake_state[i] div by 2 )
  Need helper lemma...
Example: ABS (2)
      LEMMA: x div by 2 ⇒ (x+10) div by 2
      PROOF: x div by 2 ⇒ x+2 div by 2
             ... (inductive proof) ...
  Using Lemma, prover can do its job...
      ABS_apply_brakes( command: CMDTYPE, brk_state: BRKTYPE ) : result =
        ...
        brk_state[left] += 10
        brk_state[right] += 10
        ...
      INVAR: ( for all i, brake_state[i] div by 2 )
  Done?
Example: ABS (3)
  Tell prover about associativity
      AXIOM: (x+y)+z == x+(y+z)
  Tell prover about transitivity
      AXIOM: (P1 ⇒ P2 AND P2 ⇒ P3) ⇒ (P1 ⇒ P3)
  Now prover can do its job
      LEMMA: x div by 2 ⇒ (x+10) div by 2
      PROOF: x div by 2 ⇒ x+2 div by 2
             x+2 div by 2 ⇒ (x+4) div by 2      (associativity)
             ... (inductive proof) ...
             x div by 2 ⇒ (x+10) div by 2       (transitivity)
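For comparison, here is how the helper lemma might look in a present-day proof assistant. This is a sketch in Lean 4, which the lecture does not use; the theorem name is made up, and `Nat.dvd_add` is assumed to be available from the core library. Instead of chaining five +2 steps by transitivity, it gives the prover a witness that 2 divides 10 and lets the library lemma about sums do the rest.

```lean
-- "x div by 2" is written 2 ∣ x.
-- Assumption: Nat.dvd_add : k ∣ m → k ∣ n → k ∣ m + n (Lean 4 core).
theorem div_by_2_add_10 (x : Nat) (h : 2 ∣ x) : 2 ∣ x + 10 :=
  Nat.dvd_add h ⟨5, rfl⟩   -- ⟨5, rfl⟩ proves 2 ∣ 10, since 10 = 2 * 5
```

The point from the slides carries over: the proof goes through only because the library already supplies the right facts about addition and divisibility, and the human still has to pick them.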
Theorem Proving: Pros/Cons
  Benefits
    - handle unbounded # of states
    - solid proofs (if assumptions are correct)
  Drawbacks
    - Specs are very abstract, far from real system
    - Prover requires human guidance
        - slow, error-prone process
        - requires human to have an idea about the proof
Example of Theorem Prover
  Java bytecode verifier
    - bytecode = model of the program
    - properties to be proven
        - types are used correctly
        - types are preserved
        - no object access violations
        - no forged pointers
Myths (1)
  1. Formal methods guarantee software is perfect...
       They help find errors early on; they do not eliminate the need for testing
  2. Formal methods are all about proofs...
       They work because they make you think hard about your proposed system
  (adapted from J. Anthony Hall, Seven Myths of Formal Methods, IEEE Software 7(5), Sep. 1990)
Myths (2)
  3. They require trained mathematicians...
       Mathematical specs aren't necessarily harder to read or write than programs.
  4. They increase cost of development...
       Sometimes they actually decrease the cost; no good evidence either way.
  (adapted from J. Anthony Hall, Seven Myths of Formal Methods, IEEE Software 7(5), Sep. 1990)
Myths (3)
  5. Formal methods aren't useful to clients
       Depending on the client, they might actually help them understand what they're buying.
  6. They are not used on real, large-scale software
       Some large practical projects have used them successfully; more engineering is required
  (adapted from J. Anthony Hall, Seven Myths of Formal Methods, IEEE Software 7(5), Sep. 1990)