Stephen McLaughlin. From Uncertainty to Belief: Inferring the Specification Within

Similar documents
From Uncertainty to Belief: Inferring the Specification Within

[0569] p 0318 garbage

Homework #3 CS2255 Fall 2012

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions?

Dynamic Allocation in C

Outline. Java Models for variables Types and type checking, type safety Interpretation vs. compilation. Reasoning about code. CSCI 2600 Spring

CSE 307: Principles of Programming Languages

Pointers, Dynamic Data, and Reference Types

Static Analysis in C/C++ code with Polyspace

Dynamic Allocation in C

CS162 - POINTERS. Lecture: Pointers and Dynamic Memory

Checking Program Properties with ESC/Java

Memory Management. CS31 Pascal Van Hentenryck CS031. Lecture 19 1

fopen() fclose() fgetc() fputc() fread() fwrite()

Secure Programming. An introduction to Splint. Informatics and Mathematical Modelling Technical University of Denmark E

Pointers. 1 Background. 1.1 Variables and Memory. 1.2 Motivating Pointers Massachusetts Institute of Technology

COMP322 - Introduction to C++ Lecture 10 - Overloading Operators and Exceptions

In Java we have the keyword null, which is the value of an uninitialized reference type

Verification and Validation

Variables, Memory and Pointers

Advances in Programming Languages: Regions

2 rd class Department of Programming. OOP with Java Programming

Pointers. Memory. void foo() { }//return

7/21/ FILE INPUT / OUTPUT. Dong-Chul Kim BioMeCIS UTA

Static program checking and verification

Splint Pre-History. Security Flaws. (A Somewhat Self-Indulgent) Splint Retrospective. (Almost) Everyone Hates Specifications.

CS164: Programming Assignment 5 Decaf Semantic Analysis and Code Generation

Introduction to file management

Introduction to Computer Science Midterm 3 Fall, Points

Content. Input Output Devices File access Function of File I/O Redirection Command-line arguments

CS159. Nathan Sprague

CS558 Programming Languages

Pointers. Addresses in Memory. Exam 1 on July 18, :00-11:40am

SYSC 2006 C Winter 2012

REFERENCES, POINTERS AND STRUCTS

CS 326 Operating Systems C Programming. Greg Benson Department of Computer Science University of San Francisco

CS558 Programming Languages

VIRTUAL FUNCTIONS Chapter 10

Implementing Abstractions

Dynamic Memory Allocation

TYPES, VALUES AND DECLARATIONS

The Environment Model

QUIZ. What is wrong with this code that uses default arguments?

Closures. Mooly Sagiv. Michael Clarkson, Cornell CS 3110 Data Structures and Functional Programming

Midterm Exam Nov 8th, COMS W3157 Advanced Programming Columbia University Fall Instructor: Jae Woo Lee.

Advances in Programming Languages

PIC 10A Pointers, Arrays, and Dynamic Memory Allocation. Ernest Ryu UCLA Mathematics

Towards automated detection of buffer overrun vulnerabilities: a first step. NDSS 2000 Feb 3, 2000

Computer System and programming in C

C Pointers. 6th April 2017 Giulio Picierro

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING

Instructions: Submit your answers to these questions to the Curator as OQ02 by the posted due date and time. No late submissions will be accepted.

Lectures 5-6: Introduction to C

Finding User/Kernel Pointer Bugs with Type Inference p.1

CS201- Introduction to Programming Current Quizzes

A Factor Graph Model for Software Bug Finding

The Environment Model. Nate Foster Spring 2018

Oracle Developer Studio Code Analyzer

Lecture 07 Debugging Programs with GDB

Arrays and Strings. Antonio Carzaniga. February 23, Faculty of Informatics Università della Svizzera italiana Antonio Carzaniga

CSC209H Lecture 3. Dan Zingaro. January 21, 2015

QUIZ How do we implement run-time constants and. compile-time constants inside classes?

CS2141 Software Development using C/C++ Debugging

Gaps in Static Analysis Tool Capabilities. Providing World-Class Services for World-Class Competitiveness

A Short Summary of Javali

A Practical Approach to Programming With Assertions

Introduction to C++ Part II. Søren Debois. Department of Theoretical Computer Science IT University of Copenhagen. September 12th, 2005

Vector and Free Store (Pointers and Memory Allocation)

Systems Programming and Computer Architecture ( )

UNIT 3 ARRAYS, RECURSION, AND COMPLEXITY CHAPTER 11 CLASSES CONTINUED

COP 3223 Final Review

Chapter 1: Object-Oriented Programming Using C++

Variation of Pointers

TEXT FILE I/O. printf("this data will be written to the screen.\n"); printf("x = %d, y = %f\n", x, y);

Emulating Goliath Storage Systems with David

DEBUGGING: STATIC ANALYSIS

Programming. Pointers, Multi-dimensional Arrays and Memory Management

Lecture 2, September 4

Class Information ANNOUCEMENTS

18-600: Recitation #3

Today s lecture. Pointers/arrays. Stack versus heap allocation CULTURE FACT: IN CODE, IT S NOT CONSIDERED RUDE TO POINT.

Structure of a Compiler. We looked at each stage in turn. Exceptions. Language Design and Implementation Issues

Compiler Construction

EL2310 Scientific Programming

Bug Hunting and Static Analysis

CS107 Handout 08 Spring 2007 April 9, 2007 The Ins and Outs of C Arrays

An Extended Behavioral Type System for Memory-Leak Freedom. Qi Tan, Kohei Suenaga, and Atsushi Igarashi Kyoto University

Programming in C. main. Level 2. Level 2 Level 2. Level 3 Level 3

Optimizing for Bugs Fixed

MEMORY MANAGEMENT TEST-CASE GENERATION OF C PROGRAMS USING BOUNDED MODEL CHECKING

Contents. I. Classes, Superclasses, and Subclasses. Topic 04 - Inheritance

CSC209: Software tools. Unix files and directories permissions utilities/commands Shell programming quoting wild cards files

CSC209: Software tools. Unix files and directories permissions utilities/commands Shell programming quoting wild cards files. Compiler vs.

Formal Methods for Program Analysis and Generation

Memory and C++ Pointers

CS 430 Spring Mike Lam, Professor. Data Types and Type Checking

Manipulating Integers

C Review. MaxMSP Developers Workshop Summer 2009 CNMAT

Pointers II. Class 31

FORM 1 (Please put your name and section number (001/10am or 002/2pm) on the scantron!!!!) CS 161 Exam II: True (A)/False(B) (2 pts each):

Transcription:

From Uncertainty to Belief: Inferring the Specification Within

Overview Area: Program analysis and error checking / program specification Problem: Tools lack adequate specification. Good specifications are hard to make. More specifically, the ownership problem Solution: Automate some or all of program specification Methodology: Quasi expert systems approach / implementation

Uses of program analysis 1. Basic syntax, convention and common error checking - lint 2. Slightly more advanced error detection, e.g. dereferencing NULL and type checking - splint 3. Formal methods What is the trend moving down this list?

Annotations splint /*@null@*/ char *c - Forces a check for NULL before every dereference /*@in@*/ int *i - Forces an actual parameter to be completely defined before being passed to a function /*@out@*/ int *o - Forces the parameter to be completely defined before the function returns Java 5.0+ Supports basic pre-defined annotations, as well as some advanced features Users can define their own annotations, which can themselves be annotated. A method s annotations can be obtained at runtime through Java reflection. @Override somemethod() - Compiler throws an error if the annotated method does not override one in a superclass

Specification in this paper Basic idea: Who is allocating & deallocating memory Generalized: Who is returning and claiming ownership of resources? Ownership: A pointer owns a resource if it is the pointer that could currently be used to de-allocate that resource. Possible annotations for a function are co - Claims ownership of a resource ro - Returns ownership ro/ co -? Assumption?

Annotation Variables Syntax: method:ret formal parameter number Example: FILE *fp = fopen("myfile.txt","r"); fread(buffer, n, 1000, fp); fclose(fp);

Annotation Variables fopen:ret { ro, ro} FILE *fp = fopen("myfile.txt","r"); fread:4 { co, co} fread(buffer, n, 1000, fp); fclose:1 { co, co} fclose(fp);

Specifications and Factors A is the set of annotation variables in a program A = a is an assignment to these variables, that is, a specification A factor f i is a mapping: Example: Product of experts f i : A i [0, ) where A i A. f FILE (fopen : ret = ro) = 0.51 and f FILE (fopen : ret = ro) = 0.49 P(A) = 1 Z f i {f i } f i(a)

More factors Prior Factors: One per annotation variable Adds a bias to each variable Check Factors: Two values that sum to 1 θ OK and θ BUG OK or BUG is decided by a FSM Normally want θok > θ BUG Where does each type of factor come from?

Annotation Factor Graph (AFG) The code: 1. FILE * fp1 = fopen("myfile.txt", "r"); 2. FILE * fp2 = fdopen(fd, "w"); 3. fread(buffer, n, 1, fp1); 4. fwrite(buffer, n, 1, fp2); 5. fclose(fp1); 6. fclose(fp2); The graph:

Using the graph

More inference techniques Multiple Behavioral Tests Adds new states to FSM used for check factors Correct States: Deallocator, Ownership, Contra-Ownership Incorrect States: Leak, Invalid Use Once again, correct states weighted more heavily than incorrect Naming conventions Most developers follow a pattern in which functions with similar behaviors or purposes have similar names. We add an extra factor to each callsite that evaluates to θ keyword,(co ro) or θ keyword,( co ro) Note that developers can influence the accuracy of this factor by choice of keywords.

Implementation Checker observes All call sites that return pointer String constants (treated as returned by ro) (Why?) Pointer dereferences (treated as co) (Why?) Note: Model is computationally infeasible, so Gibb s sampling is used. Also uses, simulated annealing and false path pruning for AFGs

Evaluation

Evaluation Three AFGs Basic AFG - as described earlier AFG NoFPP - AFG that does no false path pruning AFG Rename - Renames functions at each callsite to make all function calls unique

Take away Technique works assuming assumption about programming idioms hold This is often true except in special cases such as the Linux kernel