Software Model Checking with SLAM

Similar documents
Outline. Introduction SDV Motivation Model vs Real Implementation SLIC SDVRP SLAM-2 Comparisons Conclusions

Predicate Abstraction Daniel Kroening 1

No model may be available. Software Abstractions. Recap on Model Checking. Model Checking for SW Verif. More on the big picture. Abst -> MC -> Refine

Having a BLAST with SLAM

Having a BLAST with SLAM

Software Model Checking. From Programs to Kripke Structures

Having a BLAST with SLAM

Part II: Atomicity for Software Model Checking. Analysis of concurrent programs is difficult (1) Transaction. The theory of movers (Lipton 75)

Automatically Validating Temporal Safety Properties of Interfaces

Having a BLAST with SLAM

CS 510/13. Predicate Abstraction

Proof Pearl: The Termination Analysis of Terminator

VS 3 : SMT Solvers for Program Verification

Static Program Analysis

Software Model Checking. Xiangyu Zhang

Software Model Checking with Abstraction Refinement

appropriate abstraction by (1) obtaining an initial abstraction from the property that needs to be checked, and (2) refining this abstraction using an

Double Header. Two Lectures. Flying Boxes. Some Key Players: Model Checking Software Model Checking SLAM and BLAST

language is a quantifier-free logic, rather than the more powerful logic of [30]. ffl We have applied C2bp and Bebop to examples from Necula's work on

Mutual Summaries: Unifying Program Comparison Techniques

Proofs from Tests. Nels E. Beckman, Aditya V. Nori, Sriram K. Rajamani, Robert J. Simmons, SaiDeep Tetali, Aditya V. Thakur

Formal Methods. CITS5501 Software Testing and Quality Assurance

Counterexample Guided Abstraction Refinement in Blast

An Annotated Language

Abstract Interpretation

An Eclipse Plug-in for Model Checking

F-Soft: Software Verification Platform

Model Checking with Abstract State Matching

Automatic Software Verification

Interpolation-based Software Verification with Wolverine

Static Analysis Basics II

Program Static Analysis. Overview

Formal Methods: Model Checking and Other Applications. Orna Grumberg Technion, Israel. Marktoberdorf 2017

Alternation for Termination

Model Checking and Its Applications

Learning Summaries of Recursive Functions

SymDiff: A language-agnostic semantic diff tool for imperative programs

Over-Approximating Boolean Programs with Unbounded Thread Creation

Main Goal. Language-independent program verification framework. Derive program properties from operational semantics

In Our Last Exciting Episode

4/24/18. Overview. Program Static Analysis. Has anyone done static analysis? What is static analysis? Why static analysis?

Finding and Fixing Bugs in Liquid Haskell. Anish Tondwalkar

Bounded Model Checking Of C Programs: CBMC Tool Overview

Ufo: A Framework for Abstraction- and Interpolation-Based Software Verification

Reducing Concurrent Analysis Under a Context Bound to Sequential Analysis

CMSC 330: Organization of Programming Languages

CS 161 Computer Security

Complete Instantiation of Quantified Formulas in Satisfiability Modulo Theories. ACSys Seminar

Checks and Balances - Constraint Solving without Surprises in Object-Constraint Programming Languages: Full Formal Development

Introduction to CBMC. Software Engineering Institute Carnegie Mellon University Pittsburgh, PA Arie Gurfinkel December 5, 2011

Embedded Software Verification Challenges and Solutions. Static Program Analysis

Overview. Verification with Functions and Pointers. IMP with assertions and assumptions. Proof rules for Assert and Assume. IMP+: IMP with functions

Temporal Refinement Using SMT and Model Checking with an Application to Physical-Layer Protocols

Summaries for While Programs with Recursion

Symbolic Evaluation/Execution

Chapter 3. Describing Syntax and Semantics

More on Verification and Model Checking

BOOGIE. Presentation by Itsik Hefez A MODULAR REUSABLE VERIFIER FOR OBJECT-ORIENTED PROGRAMS MICROSOFT RESEARCH

Reasoning about programs

Last time. Reasoning about programs. Coming up. Project Final Presentations. This Thursday, Nov 30: 4 th in-class exercise

Proofs from Tests. Sriram K. Rajamani. Nels E. Beckman. Aditya V. Nori. Robert J. Simmons.

Scalable Program Analysis Using Boolean Satisfiability: The Saturn Project

Small Formulas for Large Programs: On-line Constraint Simplification In Scalable Static Analysis

Encoding functions in FOL. Where we re at. Ideally, would like an IF stmt. Another example. Solution 1: write your own IF. No built-in IF in Simplify

Introduction to CBMC: Part 1

Computing Fundamentals 2 Introduction to CafeOBJ

Abstraction Refinement for Quantified Array Assertions

Reasoning About Imperative Programs. COS 441 Slides 10

Hyperkernel: Push-Button Verification of an OS Kernel

Polymorphic Predicate Abstraction

This page intentionally left blank.

Static program checking and verification

Reading Assignment. Symbolic Evaluation/Execution. Move from Dynamic Analysis to Static Analysis. Move from Dynamic Analysis to Static Analysis

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction

Effectively-Propositional Modular Reasoning about Reachability in Linked Data Structures CAV 13, POPL 14 Shachar Itzhaky

The Pointer Assertion Logic Engine

Programming Languages Fall 2014

InterprocStack analyzer for recursive programs with finite-type and numerical variables

SYNERGY : A New Algorithm for Property Checking

Mutable References. Chapter 1

Chapter 3. Describing Syntax and Semantics ISBN

Towards a Software Model Checker for ML. Naoki Kobayashi Tohoku University

Typed Lambda Calculus. Chapter 9 Benjamin Pierce Types and Programming Languages

Simplifying Loop Invariant Generation Using Splitter Predicates. Rahul Sharma Işil Dillig, Thomas Dillig, and Alex Aiken Stanford University

Foundations: Syntax, Semantics, and Graphs

Induction and Semantics in Dafny

Computer aided verification

Predicate Refinement Heuristics in Program Verification with CEGAR

Optimizing for Bugs Fixed

CSCI-GA Scripting Languages

Verifying Concurrent Programs

Static Program Checking

Symbolic and Concolic Execution of Programs

CS 242. Fundamentals. Reading: See last slide

GNATprove a Spark2014 verifying compiler Florian Schanda, Altran UK

Introduction to Axiomatic Semantics (1/2)

ESC/Java2 vs. JMLForge. Juan Pablo Galeotti, Alessandra Gorla, Andreas Rau Saarland University, Germany

Static and Precise Detection of Concurrency Errors in Systems Code Using SMT Solvers

Abstractions from Proofs

Testing & Symbolic Execution

Transcription:

Software Model Checking with SLAM Thomas Ball Software Reliability Research MSR Redmond Sriram K. Rajamani Reliable Software Engineering MSR Bangalore http://research.microsoft.com/slam/

People behind SLAM Partners in MS Ella Bounimova, Byron Cook, Nar Ganapathy, Shuvendu Lahiri, Vladimir Levin, Jakob Lichtenberg, Con McGarvey, Bonus Ondrusek, Abdullah Ustuner Summer interns Sagar Chaki, Satyaki Das, Rahul Kumar, Shuvendu Lahiri, Jakob Lichtenberg, Rupak Majumdar, Todd Millstein, Mayur Naik, Robby, Wes Weimer, George Weissenbacher, Fei Xie Visitors Giorgio Delzanno, Albert Oliveras, Andreas Podelski, Stefan Schwoon

Outline Part I: Overview (30 min) overview of SLAM process demonstration (Static Driver Verifier) lessons learned Part II: Basic SLAM (1 hour) foundations basic algorithms (no pointers) Part III: Advanced Topics (30 min) pointers + procedures

Part I: Overview of SLAM

What is SLAM? SLAM is a software model checking project at Microsoft Research started in 2000 check C programs against safety properties first application domain: device drivers still active! Ships in the Windows Driver Kit, as the analysis engine in Static Driver Verifier

Rules Read for understanding New API rules Static Driver Verifier Precise API Usage Rules (SLIC) Drive testing tools Development Defects Software Model Checking Testing 100% path coverage Source Code

SLAM Software Model Checking Counterexample-driven abstraction of software boolean programs model creation (c2bp) model checking (bebop) model refinement (newton) Builds on OCaml programming language dataflow and pointer analysis predicate abstraction symbolic model checking CUDD, SAT solver, SMT theorem prover

SLIC Finite state language for stating rules monitors behavior of C code temporal safety properties (security automata) familiar C syntax Suitable for expressing control-dominated properties e.g. proper sequence of events can encode data values inside state

State Machine for Locking Locking Rule in SLIC Rel Acq Unlocked Rel Error Locked Acq state { enum {Locked,Unlocked s := Unlocked; KeAcquireSpinLock.entry { if (s=locked) abort; else s := Locked; KeReleaseSpinLock.entry { if (s=unlocked) abort; else s := Unlocked;

Example do { KeAcquireSpinLock(); Does this code obey the locking rule? npacketsold := npackets; if(request){ request := request->next; KeReleaseSpinLock(); npackets++; while (npackets!= npacketsold); KeReleaseSpinLock();

U L Example do { KeAcquireSpinLock(); Model checking boolean program (bebop) L L L U U if(*){ while (*); KeReleaseSpinLock(); L U U E KeReleaseSpinLock();

Example U L b : (npacketsold = npackets) do { KeAcquireSpinLock(); npacketsold := npackets; Is error path feasible in C program? (newton) L L L U U if(request){ request := request->next; KeReleaseSpinLock(); npackets++; while (npackets!= npacketsold); L U U E KeReleaseSpinLock();

Example U L b : (npacketsold = npackets) do { KeAcquireSpinLock(); Add new predicate to boolean program (c2bp) npacketsold := npackets; b := true; L L L U U if(request){ request := request->next; KeReleaseSpinLock(); npackets++; b := b? false : *; while (npackets!= npacketsold);!b L U U E KeReleaseSpinLock();

U L Example b : (npacketsold = npackets) do { KeAcquireSpinLock(); b := true; Model checking refined boolean program (bebop) b L if(*){ b L b b!b L U U KeReleaseSpinLock(); b := b? false : *; while (!b ); b L U KeReleaseSpinLock(); b U E

U L Example b : (npacketsold = npackets) do { KeAcquireSpinLock(); b := true; Model checking refined boolean program (bebop) b L if(*){ b L b b!b L U U KeReleaseSpinLock(); b := b? false : *; while (!b ); b L KeReleaseSpinLock(); b U

Inferred Invariant

Counterexample-driven refinement #include <ntddk.h> boolean program Bebop reachability check + C2BP predicate abstraction error path Harness SLIC Rule refinement predicates Newton feasibility check

Part I: Demo Static Driver Verifier src/kernel/serenum, lowerdriverreturn.slic

Part I: Lessons Learned

SLAM Boolean program model has proved itself Successful for domain of device drivers control-dominated safety properties few boolean variables needed to do proof or find real counterexamples Counterexample-driven refinement terminates in practice incompleteness of theorem prover not an issue

What Was Hard? Abstracting from a language with pointers (C) to one without pointers (boolean programs) all side effects need to be modeled by copying Developing specifications, environment model False alarms Technology transfer!

Part II: Basic SLAM

C- Types τ ::= void bool int ref τ Expressions e ::= c x e 1 op e 2 &x *x LExpression l ::= x *x Declaration d ::= τ x 1,x 2,,x n Statements s ::= skip goto L 1,L 2 L n L: s assume(e) l := e l := f (e 1,e 2,,e n ) return x s 1 ; s 2 ; ; s n Procedures p ::= τ f (x 1 : τ 1,x 2 : τ 2,,x n : τ n ) Program g ::= d 1 d 2 d n p 1 p p 2 n

C-- Types τ ::= void bool int Expressions e ::= c x e 1 op e 2 LExpression l ::= x Declaration d ::= τ x 1,x 2,,x n Statements s ::= skip goto L 1,L 2 L n L: s assume(e) l := e f (e 1,e 2,,e n ) return s 1 ; s 2 ; ; s n Procedures p ::= f (x 1 : τ 1,x 2 : τ 2,,x n : τ n ) Program g ::= d 1 d 2 d n p 1 p p 2 n

BP Types τ ::= void bool Expressions e ::= c x e 1 op e 2 LExpression l ::= x Declaration d ::= τ x 1,x 2,,x n Statements s ::= skip goto L 1,L 2 L n L: s assume(e) l := e f (e 1,e 2,,e n ) return s 1 ; s 2 ; ; s n Procedures p ::= f (x 1 : τ 1,x 2 : τ 2,,x n : τ n ) Program g ::= d 1 d 2 d n p 1 p p 2 n

Syntactic sugar if (e) { S1; else { S2; S3; goto L1, L2; L1: assume(e); S1; goto L3; L2: assume(!e); S2; goto L3; L3: S3;

Example, in C int g; main(int x, int y){ cmp(x, y); if (!g) { if (x!= y) assert(0); void cmp (int a, int b) { if (a = b) g := 0; else g := 1;

Example, in C-- int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); void cmp(int a, int b) { goto L1, L2; L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return;

SLAM Process #include <ntddk.h> boolean program Bebop reachability check + C2BP predicate abstraction error path Harness SLIC Rule refinement predicates Newton feasibility check

c2bp: Predicate Abstraction for C Programs Given P : a C program F : {e 1,...,e n each e i a pure boolean expression each e i represents set of states for which e i is true Produce a boolean program B(P,F) same control-flow structure as P boolean vars {b 1,...,b n to match {e 1,...,e n B(P,F) simulates P

c2bp Algorithm Performs modular abstraction abstracts each procedure in isolation Within each procedure, abstracts each statement in isolation no control-flow analysis no need for loop invariants

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); void cmp (int a, int b) { goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; Preds: {x=y {g=0 {a=b

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); decl {g=0 ; main( {x=y ) { Preds: {x=y {g=0 {a=b void cmp (int a, int b) { goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; void cmp ( {a=b ) {

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); decl {g=0 ; main( {x=y ) { cmp( {x=y ); assume( {g=0 ); assume(!{x=y ); assert(0); Preds: {x=y {g=0 {a=b void cmp (int a, int b) { goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; void cmp ( {a=b ) { goto L1, L2; L1: assume( {a=b ); {g=0 := T; return; L2: assume(!{a=b ); {g=0 := F; return;

C-- Types τ ::= void bool int Expressions e ::= c x e 1 op e 2 LExpression l ::= x Declaration d ::= τ x 1,x 2,,x n Statements s ::= skip goto L 1,L 2 L n L: s assume(e) l := e f (e 1,e 2,,e n ) return s 1 ; s 2 ; ; s n Procedures p ::= f (x 1 : τ 1,x 2 : τ 2,,x n : τ n ) Program g ::= d 1 d 2 d n p 1 p p 2 n

Abstracting Assigns via WP Statement y:=y+1 and F={ y<4, y<5 {y<4, {y<5 := *, {y<4 ; WP(x:=e,Q) = Q[x -> e] WP(y:=y+1, y<5) = (y<5) [y -> y+1] = (y+1<5) = (y<4)

WP Problem WP(s, e i ) not always expressible via { e 1,...,e n Example F = { x=0, x=1, x<5 WP( x:=x+1, x<5 ) = x<4 Best possible: x=0 x=1

Abstracting Expressions via F F = { e 1,...,e n Implies F (e) best boolean function over F that implies e ImpliedBy F (e) best boolean function over F implied by e ImpliedBy F (e) =!Implies F (!e)

Implies F (e) and ImpliedBy F (e) e ImpliedBy F (e) Implies F (e)

Abstracting Assignments if Implies F (WP(s, e i )) is true before s then e i is true after s if Implies F (WP(s,!e i )) is true before s then e i is false after s {e i := Implies F (WP(s, e i ))? true : Implies F (WP(s,!e i ))? false : *;

Assignment Example Statement in P: Predicates in E: y := y+1; {x=y Weakest Precondition: WP(y:=y+1, x=y) = x=y+1 Implies F ( x=y+1 ) = false Implies F ( x!=y+1 ) = x=y Abstraction of assignment: {x=y := {x=y? false : *;

Abstracting Assumes assume(e) is abstracted to: assume( ImpliedBy F (e) ) Example: F = {x=2, x<5 assume(x < 2) is abstracted to: assume( {x<5 &&!{x=2 )

Assume(e) and ImpliedBy F (e) e ImpliedBy F (e)

Abstracting Procedures Each predicate in F is annotated as being either global or local to a particular procedure Procedures abstracted in two passes: a signature is produced for each procedure in isolation procedure calls are abstracted given the callees signatures

Abstracting a procedure call Procedure call a sequence of assignments from actuals to formals see assignment abstraction Procedure return NOP for C-- with assumption that all predicates mention either only globals or only locals with pointers and with mixed predicates: most complicated part of c2bp covered in the advanced topics section

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); decl {g=0 ; main( {x=y ) { cmp( {x=y ); assume( {g=0 ); assume(!{x=y ); assert(0); {x=y {g=0 {a=b void cmp (int a, int b) { goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; void cmp ( {a=b ) { Goto L1, L2 L1: assume( {a=b ); {g=0 := T; return; L2: assume(!{a=b ); {g=0 := F; return;

Computing Implies F (e) minterm m = d 1 &&... && d n where d i = e i or d i =!e i Implies F (e) disjunction of all minterms that imply e Naïve approach generate all 2 n possible minterms for each minterm m, use decision procedure to check validity of each implication m e

Fast Predicate Abstraction Idea: compute set of minterms m that imply e directly via theorem prover data structures

Example Consider Equalities F = { a=b, b=c, c=d Implies F (a=d) Equality graph induced by F: a b c d Implies F (a=d) = (a=b) ^ (b=c) ^ (c=d)

Efficient Implementation of Implies(e) Graph representation of equalities uninterpreted function symbols (*p = *q) inequalities (x < c) Computation of good minterms via reachability query

The Rest of the Story For predicates not in theory (i.e., x<y+c), call theorem prover as before limit size of minterms for efficiency Use Das/Dill refinement to deal with approximation introduced by heuristics

c2bp Precision For program P and F = {e 1,...,e n, there exist two ideal abstractions: Boolean(P,F) : most precise abstraction Cartesian(P,F) : less precise abstraction, where each boolean variable is updated independently [See Ball-Podelski-Rajamani, TACAS 00] Theory: with an ideal theorem prover, c2bp can compute Cartesian(P,F) Practice: c2bp computes a less precise abstraction than Cartesian(P,F) we use Das/Dill s technique to incrementally improve precision the combination of c2bp + Das/Dill can compute Boolean(P,F)

SLAM Process #include <ntddk.h> boolean program Bebop reachability check + C2BP predicate abstraction error path Harness SLIC Rule refinement predicates Newton feasibility check

Bebop Symbolic reachability for boolean programs Based on CFL reachability [Sharir-Pnueli 81] [Reps-Sagiv-Horwitz 95] Uses BDDs to represent sets of paths

BDDs Summarize a Set of Paths Canonical representation of boolean functions set of (fixed-length) bitvectors binary relations over finite domains Efficient algorithms for common analysis operations transfer function join/meet subsumption test void cmp ( e2 ) { [5]Goto L1, L2 [6]L1: assume( e2 ); [7] gz := T; goto L3; [8]L2: assume(!e2 ); [9]gz := F; goto L3 [10] L3: return; BDD at line [10] of cmp: e2=e2 & gz =e2 Read: cmp leaves e2 unchanged and sets gz to have the same final value as e2

decl gz ; main( e ) { [1] cmp( e ); [2] assume( gz ); [3] assume(!e ); [4] assert(f); void cmp ( e2 ) { [5]Goto L1, L2 [6]L1: assume( e2 ); [7] gz := T; goto L3; [8]L2: assume(!e2 ); [9]gz := F; goto L3 [10] L3: return; gz=gz & e=e e=e & gz =e e=e & gz =1 & e =1 e2=e gz=gz & e2=e2 gz=gz & e2=e2 & e2 =T e2=e2 & e2 =T & gz =T gz=gz & e2=e2 & e2 =F e2=e2 & e2 =F & gz =F e2=e2 & gz =e2

Bebop: Summary Explicit representation of CFG Implicit representation of sets of paths via BDDs Generation of hierarchical error traces Complexity: O(E * 2 O(N) ) E is the size of the CFG N is the max. number of variables in scope

SLAM Process #include <ntddk.h> boolean program Bebop reachability check + C2BP predicate abstraction error path p Harness SLIC Rule refinement predicates Newton feasibility check

Newton Bebop produces potential error path p Is p a feasible path of the C program? Yes: found an error No: find predicates that explain the infeasibility Symbolic execution of a program path feasibility testing with theorem prover

Newton Execute path symbolically Check conditions for inconsistency using theorem prover (satisfiability) After detecting inconsistency: minimize inconsistent conditions traverse dependencies obtain predicates

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return;

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); Global: void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; main: (1) x: X (2) y: Y Conditions:

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); Global: void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; main: (1) x: X (2) y: Y cmp: (3) a: A (4) b: B Map: X A Y B Conditions:

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); Global: (6) g: 0 main: (1) x: X (2) y: Y cmp: (3) a: A (4) b: B Map: X A Y B void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; Conditions: (5) (A = B) [3, 4]

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); Global: (6) g: 0 main: (1) x: X (2) y: Y cmp: (3) a: A (4) b: B Map: X A Y B void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; Conditions: (5) (A = B) [3, 4] (6) (X = Y) [5]

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); Global: (6) g: 0 main: (1) x: X (2) y: Y cmp: (3) a: A (4) b: B void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; Conditions: (5) (A = B) [3, 4] (6) (X = Y) [5] (7) (X!= Y) [1, 2]

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); Global: (6) g: 0 main: (1) x: X (2) y: Y cmp: (3) a: A (4) b: B Contradictory! void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; Conditions: (5) (A = B) [3, 4] (6) (X = Y) [5] (7) (X!= Y) [1, 2]

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); Global: (6) g: 0 main: (1) x: X (2) y: Y cmp: (3) a: A (4) b: B Contradictory! void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; Conditions: (5) (A = B) [3, 4] (6) (X = Y) [5] (7) (X!= Y) [1, 2]

int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x!= y) assert(0); void cmp (int a, int b) { Goto L1, L2 L1: assume(a=b); g := 0; return; L2: assume(a!=b); g := 1; return; Predicates after simplification: { x = y, a = b

Part III: Advanced Topics

C- Types τ ::= void bool int ref τ Expressions e ::= c x e 1 op e 2 &x *x LExpression l ::= x *x Declaration d ::= τ x 1,x 2,,x n Statements s ::= skip goto L 1,L 2 L n L: s assume(e) l := e l := f (e 1,e 2,,e n ) return x s 1 ; s 2 ; ; s n Procedures p ::= τ f (x 1 : τ 1,x 2 : τ 2,,x n : τ n ) Program g ::= d 1 d 2 d n p 1 p p 2 n

Pointers and SLAM With pointers, C supports call by reference Strictly speaking, C supports only call by value With pointers and the address-of operator, one can simulate call-by-reference Boolean programs support only call-by-value-result SLAM mimics call-by-reference with call-by-value-result Extra complications: address operator (&) in C multiple levels of pointer dereference in C

What changes with pointers? C2bp abstracting assignments abstracting procedure returns Newton simulation needs to handle pointer accesses need to copy local heap across scopes to match Bebop s semantics Bebop remains unchanged!

Assignments + Pointers Statement in P: Predicates in E: *p := 3 {x=5 Weakest Precondition: WP( *p:=3, x=5 ) = x=5 What if *p and x alias? Correct Weakest Precondition: (p=&x && 3==5) (p!=&x && x=5) We use Das s pointer analysis [PLDI 2000] to prune disjuncts representing infeasible alias scenarios.

Abstracting Procedure Return Need to account for lhs of procedure call mixed predicates side-effects of procedure Boolean programs support only call-byvalue-result c2bp models all side-effects using return processing

Abstracting Procedure Returns Let a be an actual at call-site P( ) pre(a) = the value of a before transition to P Let f be a formal of procedure P pre(f) = the value of f upon entry to P

predicate call/return relation call/return assign {x=1 {x=2 Q() { int x := 1; x := R(x) f := x pre(f) = pre(x) x := r int R (int f) { int r := f+1; f := 0; return r; {f=pre(f) {r=pre(f)+1 WP(f:=x, f=pre(f) ) = x=pre(f) x=pre(f) is true at the call to R WP(x:=r, x=2) = r=2 pre(f)=pre(x) and pre(x)=1 and r=pre(f)+1 implies r=2 {x=1 s Q() { {x=1,{x=2 := T,F; s := R(T); {x=2 := s & {x=1; bool R ( {f=pre(f) ) { {r=pre(f)+1 := {f=pre(f); {f=pre(f) := *; return {r=pre(f)+1;

predicate call/return relation call/return assign {x=1 {x=2 Q() { int x := 1; x := R(x) f := x pre(f) = pre(x) x := r int R (int f) { int r := f+1; f := 0; return r; {f=pre(f) {r=pre(f)+1 WP(f:=x, f=pre(f) ) = x=pre(f) x=pre(f) is true at the call to R WP(x:=r, x=2) = r=2 pre(f)=pre(x) and pre(x)=1 and r=pre(f)+1 implies r=2 Q() { {x=1,{x=2 := T,F; s := R(T); {x=1, {x=2 := *, s & {x=1; {x=1 s bool R ( {f=pre(f) ) { {r=pre(f)+1 := {f=pre(f); {f=pre(f) := *; return {r=pre(f)+1;

Extending Pre-states Suppose formal parameter is a pointer eg. P(int*f) pre( *f ) value of *f upon entry to P can t change during P * pre( f ) value of dereference of pre( f ) can change during P

predicate call/return relation call/return assign {x=1 {x=2 Q() { int x := 1; R(&x); a := &x pre(a) = &x pre(*a)=pre(x) int R (int *a) { *a := *a+1; {a=pre(a) {*pre(a)=pre(*a) {*pre(a)=pre(*a)+1 pre(x)=1 and pre(*a)=pre(x) and *pre(a)=pre(*a)+1 and pre(a)=&x implies x=2 {x=1 s Q() { {x=1,{x=2 := T,F; s := R(T,T); {x=2 := s & {x=1; bool R ( {a=pre(a), {*pre(a)=pre(*a) ) { {*pre(a)=pre(*a)+1 := {*pre(a)=pre(*a) ; return {*pre(a)=pre(*a)+1;

Newton: what changes with pointers? Simulation needs to handle pointer accesses Need to copy local heap across scopes to match Bebop s semantics

main(int *x){ assume(*x < 5); foo(x); void foo (int *a) { assume(*a > 5); assert(0);

main(int *x){ assume(*x < 5); foo(x); void foo (int *a) { assume(*a > 5); assert(0); Predicates after simplification: main: (1) x: X (2) *X: Y [1] foo: (3) a: A (4) *A: B [3] *x < 5, *a < 5 Conditions: Contradictory! (5) (Y< 5) [1,2] (6) (B < 5) [3,4,5] (7) (B > 5) [3,4]

What worked well? Specific domain problem Safety properties Shoulders & synergies Separation of concerns Summer interns & visitors Windows partners

Further Reading See papers, slides from: http://research.microsoft.com/slam