Advanced SmPL: Finding Missing IS ERR tests

Similar documents
Introduction to Coccinelle

Coccinelle: Practical Program Transformation for the Linux Kernel. Julia Lawall (Inria/LIP6) June 25, 2018

Coccigrep: a semantic grep for C language

Coccinelle. Julia Lawall (Inria/Irill/LIP6) August 19, 2015

Coccinelle: A Program Matching and Transformation Tool for Systems Code

Finding Error Handling Bugs in OpenSSL using Coccinelle

Inside the Mind of a Coccinelle Programmer

Coccinelle: A Program Matching and Transformation Tool for Linux

Coccinelle: Bug Finding for the Linux Community

Prequel: A Patch-Like Query Language for Commit History Search

How Double-Fetch Situations turn into Double-Fetch Vulnerabilities:

Semantic Patches. Yoann Padioleau EMN Julia L. Lawall DIKU Gilles Muller EMN

Dissecting a 17-year-old kernel bug

Code Generation II. Code generation for OO languages. Object layout Dynamic dispatch. Parameter-passing mechanisms Allocating temporaries in the AR

Introduction to C++ Introduction. Structure of a C++ Program. Structure of a C++ Program. C++ widely-used general-purpose programming language

Implementing an Embedded Compiler using Program Transformation Rules

Introduction to C++ with content from

Coccinelle Usage (version 0.1.7)

Radix Tree, IDR APIs and their test suite. Rehas Sachdeva & Sandhya Bankar

Aalborg Universitet. Published in: Science of Computer Programming. DOI (link to publication from Publisher): /j.scico

CMa simple C Abstract Machine

FORM 2 (Please put your name and form # on the scantron!!!!)

Memory management. Johan Montelius KTH

Unleashing D* on Android Kernel Drivers. Aravind Machiry

Intermediate Programming, Spring 2017*

Code Generation. Lecture 19

Lecture 5: Procedure Calls

11/6/17. Functional programming. FP Foundations, Scheme (2) LISP Data Types. LISP Data Types. LISP Data Types. Scheme. LISP: John McCarthy 1958 MIT

Coccinelle: Tool support for automated CERT C Secure Coding Standard certification

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Scripting Linux system calls with Lua. Lua Workshop 2018 Pedro Tammela CUJO AI

Towards Easing the Diagnosis of Bugs in OS Code

Diagnosys: Automatic Generation of a Debugging Interface to the Linux Kernel

CSCI 3155: Principles of Programming Languages Exam preparation #1 2007

Functional Languages. Hwansoo Han

SmPL: A Domain-Specic Language for Specifying Collateral Evolutions in Linux Device Drivers

Cross-checking Semantic Correctness: The Case of Finding File System Bugs

CS 31: Intro to Systems Pointers and Memory. Martin Gagne Swarthmore College February 16, 2016

Clang Static Analyzer Documentation

CS61C : Machine Structures

Programming in C. Lecture 5: Aliasing, Graphs, and Deallocation. Dr Neel Krishnaswami. Michaelmas Term University of Cambridge

Project 3 Due October 21, 2015, 11:59:59pm

13th ANNUAL WORKSHOP 2017 VERBS KERNEL ABI. Liran Liss, Matan Barak. Mellanox Technologies LTD. [March 27th, 2017 ]

Concepts of program design Exam January 31, 13:30 16:30

Ruby: Introduction, Basics

Documenting and Automating Collateral Evolutions in Linux Device Drivers

Lecture Notes CPSC 122 (Fall 2014) Today Quiz 7 Doubly Linked Lists (Unsorted) List ADT Assignments Program 8 and Reading 6 out S.

Coccinelle: Killing Driver Bugs Before They Hatch

We do not teach programming

CMSC 330: Organization of Programming Languages. OCaml Expressions and Functions

COMP-520 GoLite Tutorial

Lecture Outline. Topic 1: Basic Code Generation. Code Generation. Lecture 12. Topic 2: Code Generation for Objects. Simulating a Stack Machine

C++ Important Questions with Answers

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Lecture 7: Type Systems and Symbol Tables. CS 540 George Mason University

The role of semantic analysis in a compiler

Machine-Learning-Guided Selectively Unsound Static Analysis

LESSON 6 FLOW OF CONTROL

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Quiz I

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

Introduction to C: Pointers

CS240: Programming in C. Lecture 14: Errors

functional programming in Python, part 2

C The new standard

Semantic Analysis. Lecture 9. February 7, 2018

CS471 ASST2. System Calls

C++ (Non for C Programmer) (BT307) 40 Hours

Algebraic Program Analysis

FLICONV-API. Generated by Doxygen

COMPUTER SCIENCE TRIPOS

COP5621 Exam 4 - Spring 2005

POLYMORPHISM 2 PART. Shared Interface. Discussions. Abstract Base Classes. Abstract Base Classes and Pure Virtual Methods EXAMPLE

POLYMORPHISM 2 PART Abstract Classes Static and Dynamic Casting Common Programming Errors

Introduction to Computer Systems. Exam 1. February 22, Model Solution fp

Chapter 9 TRAP Routines and Subroutines

Defective Error/Pointer Interactions in the Linux Kernel

Midterm Exam #2 April 20, 2016 CS162 Operating Systems

CS 31: Intro to Systems Pointers and Memory. Kevin Webb Swarthmore College October 2, 2018

CS558 Programming Languages

CMSC330. Objects, Functional Programming, and lambda calculus

The Compiler So Far. Lexical analysis Detects inputs with illegal tokens. Overview of Semantic Analysis

Iterative Languages. Scoping

CMSC 330: Organization of Programming Languages. OCaml Imperative Programming

Final exam. Final exam will be 12 problems, drop any 2. Cumulative up to and including week 14 (emphasis on weeks 9-14: classes & pointers)

Could you make the XNA functions yourself?

8. Control statements

Principles of Programming Languages. Lecture Outline

Assignment 4: Semantics

Functions and Methods. Questions:

Code Generation. Lecture 12

TRAPs and Subroutines. University of Texas at Austin CS310H - Computer Organization Spring 2010 Don Fussell

Structured Data. CIS 15 : Spring 2007

Digging into the GAT API

ECE/CS 314 Fall 2003 Homework 2 Solutions. Question 1

Chapter 9 TRAP Routines and Subroutines

CS 164, Midterm #2, Spring O, M, C - e1 : Bool O, M, C - e2 : t2 O, M, C - e2 : T3 O, M, C - if e1 then e2 else e3 fi : T2 _ T3

Comp 411 Principles of Programming Languages Lecture 7 Meta-interpreters. Corky Cartwright January 26, 2018

CS2210: Compiler Construction. Runtime Environment

Extensions to Barrelfish Asynchronous C

Compilers CS S-08 Code Generation

Transcription:

1 Advanced SmPL: Finding Missing IS ERR tests Julia Lawall January 26, 2011

The error handling problem The C language does not provide any error handling abstractions. For pointer-typed functions, Linux commonly uses: NULL Example: ERR PTR(...)/IS ERR(...) static struct fsnotify_event *get_one_event(struct fsnotify_group *group, size_t count) { if (fsnotify_notify_queue_is_empty(group)) return NULL; if (FAN_EVENT_METADATA_LEN > count) return ERR_PTR(-EINVAL); return fsnotify_remove_notify_event(group); } Problems: The result of a function call may not be tested for an error. The result of a function call may be tested for the wrong kind of error. Our goal: Find missing tests for ERR PTR.

3 How often is ERR PTR used anyway? Strategy: Initialize a counter. Match calls to ERR PTR. Increment a counter for each reference. Print the final counter value.

4 Initialize a counter Arbitrary computations can be done using a scripting language: Python OCaml Initialize script Invoked once, when spatch starts. State maintained across the treatment of all files. Example: @initialize:python@ count = 0

Match and count calls to ERR PTR Observations: SmPL allows matching code, but not performing computation. Python/OCaml allows performing computation, but not matching code. Solution: Match using SmPL code. Communicate information about the position of each match to Python code. Example: @r@ position p; ERR_PTR@p(...) @script:python@ p << r.p; count = count + 1

How does it work? Spatch matches the first rule against each top-level code element, then the second rule against each top-level code element, etc. Each match creates an environment mapping metavariables to code fragments or positions. Some metavariables are only used in the current rule. Others are inherited by later rules. Each SmPL rule is invoked once for each pair of A toplevel code element, and An environment of inherited metavariables Each script rule is invoked once for each each environment that binds all of the inherited metavariables.

7 Example @r@ position p; ERR_PTR@p(...) @script:python@ p << r.p; count = count + 1 static struct jump_label_entry *add_jump_label_module_entry(...) {... e = kmalloc(sizeof(struct jump_label_module_entry), GFP_KERNEL); if (!e) return ERR_PTR(-ENOMEM);... } static struct jump_label_entry *add_jump_label_entry(...) {... e = get_jump_label_entry(key); if (e) return ERR_PTR(-EEXIST); e = kmalloc(sizeof(struct jump_label_entry), GFP_KERNEL); if (!e) return ERR_PTR(-ENOMEM);... } Matching r against add jump label module entry gives one environment: [p line 5] Matching r against add jump label entry gives two environments: [p line 12], [p line 16] Script rule invoked 3 times.

Print the final counter value Finalize script Invoked once, when spatch terminates. Can access the state obtained from the treatment of all files. Example: @finalize:python@ print count

9 Summary Complete semantic patch: @initialize:python@ count = 0 @r@ position p; ERR_PTR@p(...) @script:python@ p << r.p; count = count + 1 @finalize:python@ print count Result for Linux-next, 01.21.2011: 3166

10 Finding missing IS ERR tests If a function returns ERR PTR(...) its result must be tested using IS ERR. ERR PTR(...) can be returned directly: static struct fsnotify_event *get_one_event(struct fsnotify_group *group, size_t count) { if (fsnotify_notify_queue_is_empty(group)) return NULL; if (FAN_EVENT_METADATA_LEN > count) return ERR_PTR(-EINVAL); return fsnotify_remove_notify_event(group); } Or returned via a variable: struct ctl_table_header *sysctl_head_grab(struct ctl_table_header *head) { if (!head) BUG(); spin_lock(&sysctl_lock); if (!use_table(head)) head = ERR_PTR(-ENOENT); spin_unlock(&sysctl_lock); return head; }

Collecting functions that return ERR PTR(...) @r exists@ identifier f,x; expression E; f(...) {... when any ( return ERR_PTR(...); x = ERR_PTR(...)... when!= x = E return x; ) }

Finding missing IS ERR tests @e exists@ @script:python@ identifier r.f,fld; f << r.f; expression x; p1 << e.p1; position p1,p2; p2 << e.p2; statement S1, S2; cocci.print_main (f,p1) ( cocci.print_secs ("ref",p2) IS_ERR(x = f(...)) x@p1 = f(...) )... when!= IS_ERR(x) ( if (IS_ERR(x)...) S1 else S2 x@p2->fld )

Finding missing IS ERR tests @e exists@ @script:python@ identifier r.f,fld; f << r.f; expression x; p1 << e.p1; position p1,p2; p2 << e.p2; statement S1, S2; cocci.print_main (f,p1) ( cocci.print_secs ("ref",p2) IS_ERR(x = f(...)) x@p1 = f(...) )... when!= IS_ERR(x) ( if (IS_ERR(x)...) S1 else S2 x@p2->fld )

Iteration Problem: Limited to functions and function calls in the same file. Solution idea: Invoke collection rules on the entire code base. Then, reinvoke spatch on the bug finding rules for each collected function. Issues: Two phases. In each phase, use only a subset of the semantic patch rules. Need a way to give arguments to a semantic patch.

Invoking a subset of the rules of a semantic patch virtual after_start @r depends on!after_start exists@ identifier f; f(...) {... return ERR_PTR(...); } @e depends on after_start exists@...... Command line options to invoke e: spatch -sp file rule.cocci -D after start

Giving arguments to a semantic patch @e depends on after_start exists@ identifier virtual.f, fld; expression x; position p1,p2; statement S1, S2; ( IS_ERR(x = f(...)) x@p1 = f(...) )... when!= IS_ERR(x) ( if (IS_ERR(x)...) S1 else S2 x@p2->fld ) Command line options to define f: spatch -sp file rule.cocci -D f=alloc

17 Constructing the iteration (OCaml only) Goal: Use the identifiers collected by the following rule as arguments to the semantic patch: @r depends on!after_start exists@ identifier f; f(...)... return ERR_PTR(...); Relevant information: Files to consider (the current set or a smaller one?) The validity of virtual rules. The bindings of virtual identifiers

OCaml code @script:ocaml@ f << r.f; let it = new iteration() in (* it#set_files file_list *) it#add_virtual_rule After_start; it#add_virtual_identifier F f; it#register()

Summary virtual after_start @r depends on!after_start exists@ identifier f; f(...)... return ERR_PTR(...); @script:ocaml@ f << r.f; let it = new iteration() in (* it#set_files file_list *) it#add_virtual_rule After_start; it#add_virtual_identifier F f; it#register() @e depends on after_start exists@ identifier virtual.f, fld; expression x; position p1,p2; statement S1, S2; ( IS_ERR(x = f(...)) x@p1 = f(...) )... when!= IS_ERR(x) ( if (IS_ERR(x)...) S1 else S2 x@p2->fld ) @script:python@ p1 << e.p1; p2 << e.p2; f << virtual.f; cocci.print_main (f,p1) cocci.print_main ("ref",p2)

20 Results (Linux-next from 21.01.2011) 7 reports: 5 real bugs 2 false positives Issues: Not interprocedural. Can also iterate the function finding process. Not sensitive to function visibility (static). For bugs, find them directly. For function collection, limit reinvocation to the same file.

21 Conclusion Main issues: Initialize and finalize: scripts for initializing and accessing global state. Environments for managing inherited metavariables. Virtual rules. Virtual identifiers. Iteration.