First Steps to Automated Driver Verification via Model Checking

WDS'06 Proceedings of Contributed Papers, Part I, 146-150, 2006. ISBN 80-86732-84-3 MATFYZPRESS

T. Matoušek
Charles University Prague, Faculty of Mathematics and Physics, Prague, Czech Republic.

Abstract. The paper summarizes the current state of our work on the verification of Windows kernel drivers via the model checking technique. Our goal is to implement a tool that extracts verification models from driver source code and from specifications of the kernel environment written in the DeSpec language, which we introduced previously. The DeSpec language enables specifying the kernel environment as well as the rules imposed on drivers. The DeSpec Model Extractor tool builds a Zing model capturing those parts of the driver and kernel behavior that are related to a selected subset of the specification rules. Processing the resulting model in the Zing model checker can reveal errors in the driver that are difficult to discover via traditional methods of software testing because of the concurrency and complexity of the Windows kernel.

Introduction

Model Checking

The model checking technique [2] is a formal verification method based on a thorough examination of a model that emulates the software unit with respect to a verified property. This model should ideally retain those parts of the software that influence the property, so that the verification is sound and complete with respect to the property. On the other hand, the model should be much simpler than the original software, because the time and space requirements of the verification process grow exponentially with the number of operations, threads, and variables used in the program (the state explosion problem [19]). The reason is that the model checker explores all possible states of the model to check that the property holds in each of them.
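The exhaustive exploration described above can be illustrated with a minimal sketch. It is not the Zing model checker and does not come from the paper: the function name `check_counter`, the two-step "read, then write" thread body, and the state layout are assumptions chosen only to show how every interleaving is visited and how the number of states grows with the number of threads.

```python
from collections import deque


def check_counter(num_threads):
    """Exhaustively explore all interleavings of a non-atomic increment.

    Hypothetical model: each thread performs two steps,
      step 0: read the shared counter into a thread-local variable,
      step 1: write (local + 1) back to the shared counter.
    A state is the triple (shared, program counters, locals).
    Returns (number of reachable states, violating final states), where a
    final state violates the property if shared != num_threads.
    """
    initial = (0, (0,) * num_threads, (0,) * num_threads)
    seen = {initial}
    queue = deque([initial])
    violations = []
    while queue:
        shared, pcs, local = queue.popleft()
        # Property check: once every thread has finished, no update may be lost.
        if all(pc == 2 for pc in pcs) and shared != num_threads:
            violations.append((shared, pcs, local))
        # Successors: any unfinished thread may take the next step.
        for i in range(num_threads):
            if pcs[i] == 0:      # read step
                nxt = (shared,
                       pcs[:i] + (1,) + pcs[i + 1:],
                       local[:i] + (shared,) + local[i + 1:])
            elif pcs[i] == 1:    # write step
                nxt = (local[i] + 1,
                       pcs[:i] + (2,) + pcs[i + 1:],
                       local)
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen), violations
```

Calling `check_counter(n)` for increasing `n` shows the state count rising quickly with the number of threads, while a lost update is already reported for two threads. This is the reason a small model with few threads is often enough to expose a race.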
Verification of Windows Drivers

Windows kernel drivers are relatively small libraries, written mainly in the C language, that run in a privileged mode enabling them to work directly with hardware. This introduces a high risk of damaging other parts of the kernel if the driver contains an error. Hence, the correctness of drivers is crucial for the security and stability of the operating system, and drivers are a common subject of software verification. Microsoft itself has developed several tools that verify driver correctness. These include the Driver Verifier [13], which tests drivers at run-time by emulating critical conditions in tight cooperation with the kernel; PREfast [14], which statically analyzes the driver's code, searching for common erroneous code patterns; and the Static Driver Verifier (SDV) [16], which is based on static analysis and model checking.

Zing Modeling Language

The target modeling language for our model extractor is the Zing language [18] [1], which is being developed by a Microsoft Research group on top of the Microsoft .NET Framework platform [12]. We chose this language because of the rich modeling functionality it provides and the state of its development: a preview implementation of the model checker is available and works quite well. However, most ideas behind our work are independent of the target model checker and can be applied to any modeling language that provides at least a basic set of abstractions such as classes, methods, exceptions, non-deterministic choices, and threads. Another modeling language that should meet these criteria is the new version of the Bandera Intermediate Representation (BIR), the modeling language of the Bogor model checking framework [23].

Driver Environment Specification Language

In our previous work [9] [10], we introduced DeSpec, a new object-oriented specification language primarily targeting the Windows kernel driver environment. It allows writing formal specifications of the kernel API provided to drivers, modeling the kernel's behavior towards the drivers, and capturing the rules imposed on the drivers in a formal yet still comprehensible form. The language integrates the majority of the Zing modeling language features and adds means for defining parameterized abstractions of kernel functions and structures at varying levels of detail. It makes it possible to map C language constructs to the object-oriented constructs of the Zing language. In this sense, the DeSpec language bridges the gap between the C source code and the Zing model. We have demonstrated [9] the expressiveness and suitability of the DeSpec language on a significant part of the Windows kernel API and on many rules described in the Driver Development Kit [15], as well as on those verified by the Microsoft Static Driver Verifier tool.

Driven by DeSpec specifications, the Model Extractor is supposed to generate a Zing model from the driver source code and the kernel header files. An essential part of the DeSpec project is therefore the Specification Repository, whose task is to load specifications from DeSpec source files and provide them to the Model Extractor in a convenient form.

Contribution

In this paper, we summarize the current state of our work on the verification of Windows kernel drivers via model checking. The current implementation of the DeSpec Model Extractor is capable of extracting Zing models from C programs using our novel approach to modeling C pointers and arrays. In Section 4, we present this approach and show that it is feasible in practice. Section 2 introduces the Model Extractor's front-end, the part of the Model Extractor responsible for transforming C source code into the internal representation used in the rest of the tool.
Section 3 summarizes the slicing algorithms that the Model Extractor applies to the internal representation prior to Zing model generation in order to reduce the size of the model. Finally, Section 5 concludes and outlines our future work.

Model Extractor Front-end

An appropriate front-end that can parse and represent the source code of the driver needs to be chosen. The major requirement on the front-end is support for Microsoft extensions to the C language, including, for example, structured exception handling, which is commonly used by Windows drivers. The Infrastructure for C Program Analysis and Transformation [21] [22] is a suitable front-end for the extractor, as it is able to parse, merge, normalize, and transform C source code and supports both Microsoft and GCC extensions. It converts the source code to the C Intermediate Language (CIL), which is basically a subset of the C language that replaces complicated constructs with simpler, equivalent ones. CIL is much easier to analyze since it reduces the number of cases the analyzer has to distinguish.

For projects comprising multiple source files, which is usually the case, the infrastructure provides a source code merging feature. It is able to merge multiple source files into a single compilation unit and to remove superfluous type definitions. A single CIL abstract syntax tree (AST) then represents the entire program source code. Hence, the tools analyzing the code need not care about multiple source files. The system is also extensible by custom modules that operate on the internal CIL representation. A chain of modules can be executed, enriching the AST with additional information or computing other structures such as a control flow graph. The process of source code parsing, file merging, AST building, and execution of the extension modules is implemented by a tool called Cilly.
The infrastructure is written mainly in the OCaml programming language [6] and is currently available for the Windows platform using the Cygwin environment. On the .NET Framework platform, the majority of the OCaml language is implemented by Microsoft Research's F# system [17]. Unfortunately, some of the OCaml language features used by the infrastructure are not currently supported by F#, so it is not possible to run it directly on the .NET Framework. That is why a workaround is needed. To overcome the platform difference, we have implemented a CIL dump module. It is a simple Cilly extension written in OCaml that traverses the entire CIL AST and dumps it into a text file.

The file is then consumed by a C# utility that builds a representation resembling the CIL AST in the managed environment of the .NET Framework. The dump module is placed at the end of the module chain, which allows useful transformations of the CIL AST that are already implemented in OCaml to be performed before the AST is dumped. Their results can therefore be loaded by the C# representation builder. Once the F# system supports all the features used by Cilly, the intermediate text file could be dropped and the dumper could build our representation directly from Cilly's.

The DeSpec Model Extractor loads the driver's source code representation in three phases. First, it runs the driver builder (i.e., the build command) from the Windows DDK, which driver developers use for building drivers. This utility is used to provide full compatibility with the current driver building process. However, some instrumentation of the builder is necessary to obtain the preprocessed source files instead of the driver binary. One more change is needed to get all the information required for the model extraction into the preprocessed source files. Macros cause a problem when a function that the kernel specification refers to is actually a macro that either renames the function to an internal kernel name or removes the function call completely and replaces it with inline code. If the preprocessor expanded the macro before the CIL AST is built, the information about the original function call would be lost. Therefore, such macros have to be removed from the set of preprocessor symbols and replaced with function stubs. The second phase builds the CIL AST by executing Cilly on the preprocessed files and dumps it to the text file. In the final phase, the Model Extractor reads the text file and creates the internal C# representation.
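The dump-and-reload workaround described above can be illustrated with a short sketch. The paper does not specify the text format of the dump, so the one-node-per-line format with an explicit depth, the `Node` class, and the functions `dump` and `load` are all hypothetical; the sketch only shows that a tree can be passed between two tools through a plain text file and rebuilt without loss.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node: a kind, an optional payload, and children."""
    kind: str
    value: str = ""
    children: List["Node"] = field(default_factory=list)


def dump(node, depth=0):
    """Serialize a tree into lines of the form '<depth> <kind> <value>'."""
    lines = [f"{depth} {node.kind} {node.value}".rstrip()]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


def load(lines):
    """Rebuild the tree from dumped lines using a stack of open ancestors."""
    root = None
    stack = []  # stack[d] is the most recent node seen at depth d
    for line in lines:
        parts = line.split(" ", 2)
        depth, kind = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        node = Node(kind, value)
        del stack[depth:]  # drop nodes that are not ancestors of this node
        if stack:
            stack[-1].children.append(node)
        else:
            root = node
        stack.append(node)
    return root


# Illustrative tree only; the node kinds are not CIL's actual node types.
tree = Node("Function", "DriverEntry", [
    Node("Call", "IoCreateDevice", [Node("Var", "DriverObject")]),
    Node("Return"),
])
rebuilt = load(dump(tree))
```

In this sketch the writer and the reader share nothing but the line format, which mirrors the situation where the dumper runs in OCaml and the reader in C#.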
Slicing

Many operations need to be performed on the driver's code representation before the generation of the Zing model can take place. Program slicing is one of the most important prerequisites, since the resulting model should contain as little code and as few variables as possible. Otherwise, the resulting model could be too large to model check. At the beginning of the extraction process, the user is expected to choose a set of rules to be verified from the Specification Repository. The Model Extractor should then slice out the code and data that are irrelevant to the selected rules.

The complexity of program slicing ranges from relatively simple algorithms for slicing sequential code without pointers up to the undecidable problems of slicing programs with unrestricted use of pointers. Slicing methods are covered extensively by [7] and by dozens of other research works. So far, we have implemented intraprocedural pointer-less slicing based on the Program Dependence Graph (PDG) data structure [3]. The PDG captures both data and control dependencies among statements and expressions within a function body. Its control dependence subgraph can be constructed from the dominator information computed by the Lengauer-Tarjan algorithm [8], and its data dependence subgraph by a minimal fixed-point algorithm. The PDG can be further extended to the Interprocedural PDG (IPDG) or the threaded PDG (tPDG) for the purpose of interprocedural and concurrent slicing [7].

To extend slicing algorithms to programs with pointers, some kind of points-to analysis [4] is necessary. Such an analysis discovers sets of aliases for chosen variables. When modeling function pointers, we also need to discover the set of functions that could be targeted by a given function pointer variable. The points-to analysis can give us that information. Although it is not always possible to determine the points-to sets precisely, an approximation should be sufficient for the purpose of model extraction.
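The intraprocedural, pointer-less slicing on a PDG described above amounts to backward reachability over dependence edges. The following is a minimal sketch under that assumption. The function `backward_slice`, the statement labels, and the example function body are hypothetical, and the dependence edges are written by hand rather than computed from a control flow graph.

```python
def backward_slice(dependencies, criterion):
    """Return all statements the slicing criterion transitively depends on.

    `dependencies` maps a statement to the statements it depends on.
    Data and control dependence edges of the PDG are treated alike.
    """
    in_slice = set()
    worklist = [criterion]
    while worklist:
        stmt = worklist.pop()
        if stmt in in_slice:
            continue
        in_slice.add(stmt)
        worklist.extend(dependencies.get(stmt, ()))
    return in_slice


# Hypothetical function body, one label per statement:
#   s1: lock = 0
#   s2: count = 0
#   s3: if (request)
#   s4:     lock = 1          (control dependent on s3)
#   s5: count = count + 1     (data dependent on s2)
#   s6: check(lock)           (data dependent on s1 and s4)
pdg = {
    "s4": ["s3"],
    "s5": ["s2"],
    "s6": ["s1", "s4"],
}

# Slice with respect to the rule-relevant statement s6.
relevant = backward_slice(pdg, "s6")
```

Here the statements that only maintain `count` (s2 and s5) fall outside the slice for `s6`, so they would not need to appear in a model that checks a rule about `lock`.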
The goal of applying the analysis is to reduce the size of the model. Without the analysis, the extractor has to assume conservatively that pointers can point to any data and create a larger model incorporating all the possibilities. It is, however, desirable to make the model as small as possible and hence to find an acceptable trade-off between the precision (and complexity) of the analysis and the model size.

Extracting Zing Models from C Source Code

We propose a novel approach to the extraction of verification models from C source code and provide an implementation targeting the Zing model checker. Existing works either focus on Java-like languages (e.g., Bandera [23], Java Path Finder [20]), do not extract the model fully automatically (e.g., SPIN [5]), and/or are very limited in the constructs that can be used in the source code (e.g., SPIN does not support unbounded heap allocation, call stacks, or dynamic thread creation).

The major issues of C program model extraction stem from pointer and array operations. In our work [11], we distinguish four kinds of pointers depending on the kind of memory and the possible number of items they point to. Although this differentiation leads to more complicated dereferencing operations, it minimizes the state space of the model. Because the dereferencing operations are atomic, the added complexity does not influence the resulting model size. Each pointer is represented by a pair <target, offset>, where target is a reference to the Zing object representing the value the pointer points to, or to the Zing array storing multiple values if the pointer points to (or can point to) a sequence of values. In the latter case, the offset is the index into the array. If the pointer target is allocated dynamically in the C program, the target does not refer directly to the value the pointer points to. Instead, it refers to an instance of the Memory class that represents the allocated memory and holds the value the pointer points to.

We showed that our approach is feasible in practice by verifying the correctness of a C implementation of a synchronized priority queue represented by a singly linked list. The C source code has around 110 lines and the entire generated Zing model about 900 lines. All tests were performed on a 1.4 GHz machine with 1 GB of memory. Race conditions deliberately introduced into the implementation were discovered by the model checker within a few seconds.
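The <target, offset> pointer representation described in this section can be illustrated with a small sketch. It is written in Python rather than Zing and is not the authors' generated model: the classes `Cell` and `Pointer` and their methods are hypothetical, only two of the four pointer kinds are covered, and the extra indirection through a Memory object for dynamically allocated targets is omitted.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Cell:
    """Models a single C variable whose address is taken."""
    value: Any = 0


@dataclass
class Pointer:
    """A pointer as a <target, offset> pair.

    target is either a Cell (one value) or a list (a sequence of values);
    offset is meaningful only for list targets, where it is the index.
    """
    target: Any
    offset: int = 0

    def load(self):
        """Dereference for reading: *p."""
        if isinstance(self.target, list):
            return self.target[self._index()]
        return self.target.value

    def store(self, value):
        """Dereference for writing: *p = value."""
        if isinstance(self.target, list):
            self.target[self._index()] = value
        else:
            self.target.value = value

    def add(self, n):
        """Pointer arithmetic: same target, shifted offset."""
        return Pointer(self.target, self.offset + n)

    def _index(self):
        # An out-of-bounds dereference is reported instead of silently
        # touching unrelated data, as it might in the original C program.
        if not 0 <= self.offset < len(self.target):
            raise IndexError("pointer dereferenced outside its target array")
        return self.offset


x = Cell(5)
p = Pointer(x)             # p = &x
p.store(p.load() + 1)      # *p = *p + 1

buf = [10, 20, 30]
q = Pointer(buf).add(2)    # q = buf + 2
```

Keeping the target as an object reference rather than a numeric address means two pointers are equal exactly when they refer to the same modeled object and offset, which avoids encoding a flat memory layout in the state.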
The correct implementation, running 3 producers each inserting 3 items into the queue, passed the verification in about 30 minutes. We also observed that the number of threads has a much greater impact than the number of items inserted into the queue, which is positive, as race conditions are usually revealed even for a small number of threads.

Conclusion

In our previous work, we introduced DeSpec, a new specification language targeting the Windows kernel environment. The language is designed to enable writing modular, readable, and well-arranged specifications of the Windows kernel driver environment, as well as to capture formally, yet still comprehensibly, the rules imposed on drivers by the kernel and documented in plain English in the DDK. Subsequently, we started to implement the Model Extractor tool, which should eventually be used to extract a Zing model from the source code of the driver, the kernel header files, and the DeSpec specifications of the driver environment. The Model Extractor uses the CIL infrastructure for building an internal representation of the driver's source code and the DeSpec Specification Repository for managing the specifications. We have already implemented the front-end of the Repository, which parses DeSpec files and builds an appropriate representation in the form of an abstract syntax tree. Further work will include the implementation of a specification analyzer that checks the consistency of the specifications and performs the transformations required before they can be provided to the Model Extractor.

To obtain the information about the driver source code that is necessary for model generation, we are implementing various static analyses of C code. The results of these analyses also allow us to reduce the resulting model and so to address the state explosion problem.
So far, we have implemented the Lengauer-Tarjan algorithm for building the Program Dependence Graph and used this data structure for intraprocedural slicing in the absence of procedure calls and pointers. In our future work, we will enhance the slicing capabilities of the extractor with interprocedural and concurrent slicing and with points-to analysis.

We are also implementing the component of the Model Extractor tool that automatically generates a Zing model from the source code of the program. We have proposed a novel approach to modeling constructs of the C language that do not map to the Zing modeling language straightforwardly (e.g., pointers and arrays), and we have shown on several examples that the verification of the extracted model is feasible in practice. Our future work in this area will focus on improvements to the Model Extractor that make the generated models more compact.

References

[1] Andrews, T., Qadeer, S., Rajamani, S. K., Rehof, J., Xie, Y.: Zing: A model checker for concurrent software, Technical report, Microsoft Research, 2004.
[2] Clarke, E. M., Grumberg, O., Peled, D. A.: Model Checking, MIT Press, 2000.
[3] Ferrante, J., Ottenstein, K. J., Warren, J. D.: The Program Dependence Graph and Its Use in Optimization, ACM Transactions on Programming Languages and Systems, Vol. 9, No. 3, July 1987, Pages 319-349.
[4] Hind, M.: Pointer analysis: Haven't we solved this problem yet? In 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE '01), 2001.
[5] Holzmann, G. J.: The SPIN Model Checker: Primer and Reference Manual, Addison-Wesley Professional, 2003.
[6] INRIA: The OCaml Language, http://caml.inria.fr
[7] Krinke, J.: Advanced Slicing of Sequential and Concurrent Programs, PhD thesis, Fakultät für Mathematik und Informatik, Universität Passau, 2003.
[8] Lengauer, T., Tarjan, R. E.: A Fast Algorithm for Finding Dominators in a Flow Graph, ACM Transactions on Programming Languages and Systems, 1:121-141, 1979.
[9] Matousek, T.: Model of the Windows Driver Environment, Master Thesis at Department of Software Engineering, Charles University in Prague, 2005.
[10] Matousek, T., Jezek, P.: DeSpec: Modeling the Windows Driver Environment
[11] Matousek, T., Zavoral, F.: Extracting Zing Models from C Source Code
[12] Microsoft: .NET Framework, MSDN, http://msdn.microsoft.com/netframework
[13] Microsoft: Driver Verifier, http://www.microsoft.com/whdc/devtools/tools/drvverifier.mspx
[14] Microsoft: PREfast, http://www.microsoft.com/whdc/devtools/tools/prefast.mspx
[15] Microsoft: Windows Driver Development Kit, WHDC, http://www.microsoft.com/whdc/devtools/ddk/default.mspx
[16] Microsoft: Static Driver Verifier: Finding Driver Bugs at Compile-Time, WHDC, http://www.microsoft.com/whdc/devtools/tools/sdv.mspx
[17] Microsoft Research: F#, http://research.microsoft.com/projects/ilx/fsharp.aspx
[18] Microsoft Research: Zing Model Checker, http://research.microsoft.com/zing
[19] McMillan, K. L.: Symbolic model checking: an approach to the state explosion problem, PhD thesis, SCS, Carnegie Mellon University, 1992.
[20] NASA Intelligent Systems Division: Java Path Finder, http://ase.arc.nasa.gov/havelund/jpf.html
[21] Necula, G. C., McPeak, S., Rahul, S. P., Weimer, W.: CIL: Intermediate Language for Analysis and Transformation of C Programs, Proceedings of Conference on Compiler Construction, 2002.
[22] Necula, G. C., McPeak, S., Weimer, W., Liblit, B., Harren, M.: CIL: Infrastructure for C Program Analysis and Transformation, http://manju.cs.berkeley.edu/cil
[23] Robby, Dwyer, M. B., Hatcliff, J.: Bogor: An Extensible and Highly Modular Software Model Checking Framework, SIGSOFT Software Engineering Notes 28, 5, 267-276, 2003.