Static Analysis in Practice

Similar documents
Static Analysis in Practice

Software Excellence via. at Microsoft

Software Excellence via Program Analysis at Microsoft. Manuvir Das

Static Analysis. Principles of Software System Construction. Jonathan Aldrich. Some slides from Ciera Jaspan

Software Security Program Analysis with PREfast & SAL. Erik Poll. Digital Security group Radboud University Nijmegen

Principles of Software Construction: Objects, Design and Concurrency. Static Analysis. toad Jonathan Aldrich Charlie Garrod.

Defect Detection at Microsoft Where the Rubber Meets the Road

Software Security: Vulnerability Analysis

Today Program Analysis for finding bugs, especially security bugs problem specification motivation approaches remaining issues

Language Security. Lecture 40

Principles of Software Construction: Objects, Design, and Concurrency (Part 2: Designing (Sub )Systems)

CS 161 Computer Security

One-Slide Summary. Lecture Outline. Language Security

Towards automated detection of buffer overrun vulnerabilities: a first step. NDSS 2000 Feb 3, 2000

Classes, interfaces, & documentation. Review of basic building blocks

Memory Safety (cont d) Software Security

Automatic Generation of Program Specifications

5) Attacker causes damage Different to gaining control. For example, the attacker might quit after gaining control.

Checking Program Properties with ESC/Java

Hoare Logic: Proving Programs Correct

System Security Class Notes 09/23/2013

Scalable Defect Detection. Manuvir Das, Zhe Yang, Daniel Wang Center for Software Excellence Microsoft Corporation

Static Program Analysis Part 1 the TIP language

Assertions, pre/postconditions

Software Engineering Testing and Debugging Testing

COS 320. Compiling Techniques

Static Vulnerability Analysis

Lecture 9 Assertions and Error Handling CS240

Introduction to Axiomatic Semantics

Simple Overflow. #include <stdio.h> int main(void){ unsigned int num = 0xffffffff;

Static program checking and verification

CS-XXX: Graduate Programming Languages. Lecture 9 Simply Typed Lambda Calculus. Dan Grossman 2012

Static Analysis in C/C++ code with Polyspace

Software Engineering

Verification & Validation of Open Source

Analysis of Software Artifacts

A Practical Approach to Programming With Assertions

School of Computer & Communication Sciences École Polytechnique Fédérale de Lausanne

JML tool-supported specification for Java Erik Poll Radboud University Nijmegen

Assertions & Design-by-Contract using JML Erik Poll University of Nijmegen

CITS5501 Software Testing and Quality Assurance Formal methods

Securing Software Applications Using Dynamic Dataflow Analysis. OWASP June 16, The OWASP Foundation

Lecture 4 September Required reading materials for this class

CSE 565 Computer Security Fall 2018

CS 370 The Pseudocode Programming Process D R. M I C H A E L J. R E A L E F A L L

CCured. One-Slide Summary. Lecture Outline. Type-Safe Retrofitting of C Programs

Intro to Proving Absence of Errors in C/C++ Code

18-642: Code Style for Compilers

SOME TYPICAL QUESTIONS, THEIR EXPECTED ANSWERS AND THINGS

Checking System Rules Using System-Specific, Programmer- Written Compiler Extensions

Goal. Overflow Checking in Firefox. Sixgill. Sixgill (cont) Verifier Design Questions. Sixgill: Properties 4/8/2010

Static Analysis methods and tools An industrial study. Pär Emanuelsson Ericsson AB and LiU Prof Ulf Nilsson LiU

1 Dynamic Memory continued: Memory Leaks

Modular and Verified Automatic Program Repairs

Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs. {livshits,

Reasoning about programs

Last time. Reasoning about programs. Coming up. Project Final Presentations. This Thursday, Nov 30: 4 th in-class exercise

ESC/Java2 extended static checking for Java Erik Poll Radboud University Nijmegen

Program Verification (6EC version only)

Unit Testing as Hypothesis Testing

Java Modeling Language (JML)

Lecture 10. Pointless Tainting? Evaluating the Practicality of Pointer Tainting. Asia Slowinska, Herbert Bos. Advanced Operating Systems

Stanford University Computer Science Department CS 295 midterm. May 14, (45 points) (30 points) total

Code verification. CSE 331 University of Washington. Michael Ernst

"Secure" Coding Practices Nicholas Weaver

MSO Lecture Design by Contract"

Advances in Programming Languages

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Type Theory meets Effects. Greg Morrisett

Automatic Software Verification

Global Optimization. Lecture Outline. Global flow analysis. Global constant propagation. Liveness analysis. Local Optimization. Global Optimization

CS61C Machine Structures. Lecture 4 C Pointers and Arrays. 1/25/2006 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

ESC/Java2 Use and Features

CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C

Information Flow Analysis and Type Systems for Secure C Language (VITC Project) Jun FURUSE. The University of Tokyo

ESC/Java2 Use and Features

Program Correctness and Efficiency. Chapter 2

Lecture 1 Contracts. 1 A Mysterious Program : Principles of Imperative Computation (Spring 2018) Frank Pfenning

A Unix process s address space appears to be three regions of memory: a read-only text region (containing executable code); a read-write region

Program Static Analysis. Overview

n Specifying what each method does q Specify it in a comment before method's header n Precondition q Caller obligation n Postcondition

Generating String Attack Inputs Using Constrained Symbolic Execution. presented by Kinga Dobolyi

Testing, Debugging, and Verification

17. Assertions. Outline. Built-in tests. Built-in tests 3/29/11. Jelle Slowack, Bart Smets, Glenn Van Loon, Tom Verheyen

17. Assertions. Jelle Slowack, Bart Smets, Glenn Van Loon, Tom Verheyen

Program Verification. Aarti Gupta

CS 31: Intro to Systems Pointers and Memory. Kevin Webb Swarthmore College October 2, 2018

CS 161 Computer Security. Security Throughout the Software Development Process

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall 2011.

Inference of Memory Bounds

CS 31: Intro to Systems Pointers and Memory. Martin Gagne Swarthmore College February 16, 2016

Contracts in OpenBSD

Information page for written examinations at Linköping University

Agenda. Peer Instruction Question 1. Peer Instruction Answer 1. Peer Instruction Question 2 6/22/2011

Lecture Notes: Hoare Logic

Backward Reasoning: Rule for Assignment. Backward Reasoning: Rule for Sequence. Simple Example. Hoare Logic, continued Reasoning About Loops

Counterexample Guided Abstraction Refinement in Blast

Lecture 1 Contracts : Principles of Imperative Computation (Fall 2018) Frank Pfenning

HW1 due Monday by 9:30am Assignment online, submission details to come

C Bounds Non-Checking. Code Red. Reasons Not to Use C. Good Reasons to Use C. So, why would anyone use C today? Lecture 10: *&!

Transcription:

in Practice 17-654/17-754: Analysis of Software Artifacts Jonathan Aldrich 1 Quick Poll Who is familiar and comfortable with design patterns? e.g. what is a Factory and why use it? 2 1

Outline: in Practice Case study: Analysis at ebay Case study: Analysis at Microsoft Analysis Results and Process Example: Standard Annotation Language Dataflow Analysis Soundness 3 Analyzing Tool Output True Positive real issue The error condition warned about can occur at run time, for some program input True Positive but don t care That program input is impossible due to the system context e.g. security: attacker cannot control the input (are you sure?) The error condition is not one we care about False Positive The error condition cannot occur at run time, no matter what the program input is Tool project: must distinguish from true positive but don t care False Negative The program can get into an error condition for an attribute covered by the tool for some input, but the tool does not warn of it 4 2

Analysis Maturity Model Caveat: not yet enough experience to make strong claims Level 1: use typed languages, ad-hoc tool use Level 2: run off-the-shelf tools as part of process pick and choose analyses which are most useful Level 3: integrate tools into process check in quality gates, milestone quality gates integrate into build process, developer environments use annotations/settings to teach tool about internal libraries Level 4: customized analyses for company domain extend analysis tools to catch observed problems Level 5: continual optimization of analysis infrastructure mine patterns in bug reports for new analyses gather data on analysis effectiveness tune analysis based on observations 5 in Engineering Practice A tool with different tradeoffs Soundness: can find all errors in a given class Focus: mechanically following design rules Major impact at Microsoft and ebay Tuned to address company-specific problems Affects every part of the engineering process 6 3

Outline: in Practice Case study: Analysis at ebay Case study: Analysis at Microsoft Analysis Results and Process Example: Standard Annotation Language Dataflow Analysis Soundness 7 Standard Annotation Language (SAL) A language for specifying contracts between functions Intended to be lightweight and practical More powerful but less practical contracts supported in systems like ESC/Java or Spec # Preconditions Conditions that hold on entry to a function What a function expects of its callers Postconditions Conditions that hold on exiting a function What a function promises to its callers Initial focus: memory usage buffer sizes, null pointers, memory allocation 8 4

SAL is checked using PREfast Lightweight analysis tool Only finds bugs within a single procedure Also checks SAL annotations for consistency with code To use it (for free!) Download and install Microsoft Visual C++ 2005 Express Edition http://msdn.microsoft.com/vstudio/express/visualc/ Download and install Microsoft Windows SDK for Vista http://www.microsoft.com/downloads/details.aspx?familyid=c2b1 e300-f358-4523-b479-f53d234cdccf Use the SDK compiler in Visual C++ In Tools Options Projects and Solutions VC++ Directories add C:\Program Files\Microsoft SDKs\Windows\v6.0\VC\Bin (or similar) In project Properties Configuration Properties C/C++ Command Line add /analyze as an additional option 9 Demonstration: PREfast 10 5

Buffer/Pointer Annotations Format Leading underscore List of components below _in _inout _out _bcount(size) _ecount(size) _opt The function reads from the buffer. The caller provides the buffer and initializes it. The function both reads from and writes to buffer. The caller provides the buffer and initializes it. The function writes to the buffer. If used on the return value, the function provides the buffer and initializes it. Otherwise, the caller provides the buffer and the function initializes it. The buffer size is in bytes. The buffer size is in elements. This parameter can be NULL. 11 PREfast: Immediate Checks Library function usage deprecated functions e.g. gets() vulnerable to buffer overruns correct use of printf e.g. does the format string match the parameter types? result types e.g. using macros to test HRESULTs Coding errors = instead of == inside an if statement Local memory errors Assuming malloc returns non-zero Array out of bounds 12 6

Other Useful Annotations checkreturn Callers must check the return value..net Annotation Format Pre/Post attribute with arguments for the pre or postcondition Surrounded in brackets Alternative for the annotations above Required for Tainted [SA_Pre(Tainted=SA_Yes)] [SA_Pre(Tainted=SA_No)] [SA_Post(Tainted=SA_No)] This argument is tainted and cannot be trusted without validation. This argument is not tainted and can be trusted Same as above, but useful as a postcondition 13 Other Supported Annotations How to test if this function succeeded How much of the buffer is initialized? Is a string null-terminated? Is an argument reserved? Is this an overriding method? Is this function a callback? Is this used as a format string? What resources might this function block on? Is this a fallthrough case in a switch? 14 7

SAL: the Benefit of Annotations Annotations express design intent How you intended to achieve a particular quality attribute e.g. never writing more than N elements to this array As you add more annotations, you find more errors Get checking of library users for free Plus, those errors are less likely to be false positives The analysis doesn t have to guess your intention Annotations also improve scalability PreFAST uses very sophisticated analysis techniques These techniques can t be run on large programs Annotations isolate functions so they can be analyzed one at a time 15 SAL: the Benefit of Annotations How to motivate developers? Especially for millions of lines of unannotated code? Microsoft approach Require annotations at checkin Reject code that has a char* with no ecount() Make annotations natural Ideally what you would put in a comment anyway But now machine checkable Avoid formality with poor match to engineering practices Incrementality Check code design consistency on every compile Rewards programmers for each increment of effort Provide benefit for annotating partial code Can focus on most important parts of the code first Avoid excuse: I ll do it after the deadline Build tools to infer annotations Inference is approximate May need to change annotations Hopefully saves work overall Unfortunately not yet available outside Microsoft 16 8

Case Study: SALinfer [Source: Manuvir Das] void work() { int elements[200]; wrap(elements, 200); } void wrap(pre elementcount(len) int *buf, int len) { int *buf2 = buf; int len2 = len; zero(buf2, len2); } Track flow of values through the code 1. Finds stack buffer 2. Adds annotation 3. Finds assignments 4. Adds annotation void zero(pre elementcount(len) int *buf, int len) { int i; for(i = 0; i <= len; i++) buf[i] = 0; } 17 Case Study: SAL verification [Source: Manuvir Das] void work() { int elements[200]; wrap(elements, 200); } void wrap(pre elementcount(len) int *buf, int len) { int *buf2 = buf; int len2 = len; zero(buf2, len2); } void zero(pre elementcount(len) int *buf, int len) { int i; for(i = 0; i <= len; i++) buf[i] = 0; } Building and solving constraints 1. Builds constraints 2. Verifies contract 3. Builds constraints len = length(buf); i len 4. Finds overrun i < length(buf)? NO! Limited tool available as part of Microsoft Visual Studio 2005 18 9

Outline: in Practice Case study: Analysis at ebay Case study: Analysis at Microsoft Analysis Results and Process Example: Standard Annotation Language Dataflow Analysis Soundness 19 Dataflow Analysis Soundness Intuition The result of dataflow analysis is a conservative approximation of all possible run time states Definition A dataflow analysis is sound if, for all programs P, for all inputs I, for all times T in P s execution on input I, If P is at statement S at time T, with memory η, and the analysis returned abstract state σ for S, then α(η) σ 20 10

Local Soundness ƒ DF (σ i, S) α DF (η i ) σ i σ o α DF (η o ) α DF α DF η i S η o Local correctness condition for dataflow analysis If applying a transfer function to statement S and input information σ i yields output information σ o, Then for all program states η i such that α(η i ) σ i and executing S in state η i yields state η o, We must have α(η o ) σ o Global soundness follows from local soundness by induction Initial dataflow facts are sound Each step in program execution maintains soundness invariant 21 Finding Errors with Local Soundness Consider the incorrect flow function: ƒ ZA (σ, [x := y op z]) = if σ[y]=z σ[z]=z then [x Z]σ else [x MZ]σ ƒ DF (σ i, S) α DF (η i ) σ i σ o α DF (η o ) Local Soundness fails! Let σ i = [], S = x := 3+0 Consider η i = []. As required, α DF (η i ) = [] σ i Now σ o = ƒ DF (σ i, S) = [x Z] And η o = S(η i ) = [x 3] So α DF (η o ) = α DF ([x 3]) = [x NZ] BUT α DF (η o ) σ o because Z NZ, so local soundness is violated η i α DF S η o α DF 22 11

Why care about Soundness? Analysis Producers Writing analyses is hard People make mistakes all the time Need to know how to think about correctness When the analysis gets tricky, it s useful to prove it correct formally Analysis Consumers Sound analysis provides guarantees Optimizations won t break the program Finds all defects of a certain sort Decision making Knowledge of soundness techniques lets you ask the right questions about a tool you are considering Soundness affects where you invest QA resources Focus testing efforts on areas where you don t have a sound analysis! 23 Questions? 24 12