How Efficient Can Fully Verified Functional Programs Be - A Case Study of Graph Traversal Algorithms

How Efficient Can Fully Verified Functional Programs Be - A Case Study of Graph Traversal Algorithms Mirko Stojadinović Faculty of Mathematics, University of Belgrade Abstract. One approach in achieving fully verified software is formalizing the software within a proof assistant, proving its total correctness, and exporting executable code in a functional programming language by means of code generation. In order to be applicable for the real world applications, the generated code must be efficient. In this work we evaluate how efficient can verified programs be, by doing a case study of graph algorithms. We focus on BFS (Breadth first search) and specify it in Isabelle/HOL proof assistant in several ways: a purely functional recursive specification, and an imperative specification within the Imperative/HOL framework. The exported SML programs are compared to the unverified version implemented in C, and, although an order of magnitude slower, the Imperative/HOL version can still successfully handle very large graphs. 1 Introduction Software verification is a discipline whose goal is to assure that software fully satisfies all of the indicated requirements. Two main approaches to verification exist. Dynamic verification also known as testing or experimentation is performed during the execution of software and dynamically checks its behavior. By testing software in many possible situations, errors can be discovered. However, number of possible situations is usually so large, that only a small fraction of them can be checked. For example, testing whether a program has expected output for all double precision floating point numbers is impossible, and is not going to be possible in near future. The alternative is static verification that proves correctness of software. This assures that the program will have expected output for all inputs, or more generally, that the program will function as specified. Static verification includes different types of verifications, and in this work we are focused on formal verification. It uses formal mathematical methods to prove correctness of programs. A detailed description of the subject can be found in [5] and [6]. Two main approaches to formal verification exist. In the first approach, one is trying to prove that the already implemented program is correct. However, due to complexity of real programming languages, this task is difficult and usually only some properties of the program are shown. The second approach relies on using code generation. A program is specified in a higher order logic

How Efficient Can Fully Verified Functional Programs Be? 201 (treated as a purely functional programming language), usually as a set of recursive functions within a theorem prover and its correctness is proved (usually by induction). This is called the shallow embedding to HOL. From the specification, executable code can be obtained by means of code generation [3]. The main drawback of this approach is inefficiency of generated functional programs. Program (C, Java,...) Specification in HOL theorem prover and proof of its correctness proof code generation (Partially) verified program Fully verified program Fig. 1: Two main approaches to formal verification Isabelle is a generic proof assistant [1]. It allows mathematical statements to be expressed in a formal language and provides tools for proving them in a logical calculus. The most widespread instance of Isabelle nowadays is Isabelle/HOL, providing a higher-order logic theorem proving environment ready to use for sizable applications. Proofs can be written in Isabelle/Isar. Isar is a declarative proof language used in Isabelle/HOL. Proofs written in Isar in structure and style correspond to classic mathematical proofs. Still, they are readable for both human and machine. Imperative/HOL [2] is a new framework included in the latest version of Isabelle/HOL. This framework formalizes using imperative data structures within HOL and within functional program specifications. This can significantly improve efficiency of generated programs. The main motivation for this work is to see if efficiency of fully verified functional programs (potentially using imperative data structures) can be comparable to efficiency of unverified imperative programs. If this is the case, then we can have functional programs that are proven to work correctly and are fast enough to be used in real-world industrial applications. It further means that in some software applications demanding high levels of reliability companies could implement programs, prove their correctness within a proof assistant and then export executable code. This is the alternative to debugging software which requires a lot of time and money and even after thorough checking does not assure that the program will behave as expected. In order to make efficiency comparison, we decided to formalize, verify and evaluate different versions of graph algorithms. The current state of our formalization is available at http://www.matf.bg.ac.rs/~mirkos/graph_algs.zip. 2 Formalization of Graph Algorithms Our main goal is to create a formally verified library of graph algorithms that are as efficient as possible, and to compare efficiency of different approaches

202 Mirko Stojadinović to standard unverified implementations. Currently, we have implemented algorithms that check various graph properties (e.g., reflexivity, symmetry, transitivity) and we have created different specifications of graph traversal algorithms (BFS (Breadth First Search) and DFS (Depth First Search) [7]). In the following, we will present formalization of the BFS algorithm as a typical example of what has been done so far. Four different specifications/implementations of the BFS algorithm have been created. First three versions are written in Isabelle/HOL and executable code is exported from their specifications. The first(second) version is implemented as a purely functional program using sets(lists). The third version is a program based on the Imperative/HOL framework employing imperative data structures within a functional program. The fourth version is written in C and is used only for efficiency comparison with previous three programs. Due to limited space, we present only the third version in detail. 2.1 BFS First Two Versions (Purely Functional) The first and the second version are written in Isabelle/HOL and are using sets and lists respectively. We do not expect from this programs to be efficient. The reason is recreation of structure (set, list) in each situation in which even one of its elements changes. For example, at each level of traversal the set of visited vertices is updated and that means this set is recreated many times, which is a very expensive operation. Recreation of structures has been the main reason for not using classical functional programing in modeling real world problems. 2.2 BFS Imperative/HOL Data structures. The third specification of the BFS algorithm is also written in Isabelle/HOL, but uses imperative data structures (arrays) that are not used in classical functional programing. The framework Imperative/HOL allows usage of arrays and references. It uses a concept of monads, that were first introduced in Haskell [4]. A monad is a kind of abstract data type constructor, used to represent computations. We now show central definitions used in the third BFS specification. Adjacency list representation of graphs is used, and lists are linked together in a single array. The definition of a graph is: record graph al = nv :: nat /* number of vertices in a graph */ ne :: nat /* number of edges in a graph */ nbrs :: "vertex array" /* list of all neighbours */ inds :: "(nat nat) array" /* each pair represents the index in the array nbrs where the neighbours of a vertex start from and the number of these neighbours. */ For example, the graph given in Fig. 2 is represented as

How Efficient Can Fully Verified Functional Programs Be? 203 Fig. 2: An example of a graph nv = 8 ne = 12 nbrs = 2, 4, 5, 2, 3, 5, 2, 8, 5, 6, 1, 7 inds = (0,3), (3,0), (3,1), (4,2), (6,2), (8,1), (9,1), (10,2) The neighbours of a vertex 1 start from index 0, and as it has 3 neighbours, they are on positions 0, 1, 2 (the neighbours are 2, 4, 5). The neighbours of a vertex 2 start from index 3 and it has 0 neighbours. Within our third BFS specification, queues are used as an auxiliary data structure. Queues are also implemented with arrays record a queue = q elements :: " a array" q front :: nat q back :: nat Algorithm specification. The definition of bfs function is: function bfs aux where "bfs aux g s = (if empty (queue s) then return s else do { (v, q) dequeue (queue s); (from, n) Array.nth (inds g) v; s enqueue nonvisited nbrs (nbrs g) from n (s ( queue := q )); bfs aux g s } )" definition bfs where "bfs g v = do { vst Array.new (nv g) False; q alloc (nv g) 0; q enqueue v q; Array.upd v True vst; bfs aux g ( visited = vst, queue = q ) }" During execution of the algorithms queue and visited change and they represent the current state of the traversal. In bfs aux this ordered pair is denoted with s. The idea is: as the new vertices are being visited they are added to the end of the queue and marked as visited. The function bfs resembles the code written in the imperative languages. Function Array.new is used for allocation of boolean array that keeps information about which vertices are visited and which are not and then queue q is allocated. Vertex v is added to the beginning of the queue and function Array.upd that changes element of the array is used to mark v as visited. Then function bfs aux is called with the second argument representing the current state.

204 Mirko Stojadinović Function bfs aux removes the first element from the queue, and adds all of its neighbours to the end of the queue. Function dequeue returns two values: vertex v from the front of the queue and the queue q without this vertex. This function only changes index q front that points to the first element of the queue. Elements with indices smaller than q front are kept intact and they can later be printed if we want to see the order in which vertices were visited. Search is finished when there are no more vertices in the queue (indices q front and q back become the same). Function Array.nth is used to get the element with index v from the array inds g. Function enqueue nonvisited nbrs adds neighbours of vertex v to the end of the queue and marks them as visited. Correctness. We have proved correctness of the first version, and currently we are proving that it is equivalent to the third BFS specification, meaning that they traverse the same vertices of a graph. That way, we will show that the third version is correct (it turns out that this is much simpler then directly proving correctness of the third version). 2.3 BFS C The fourth BFS implementation is done in C. It is intended for efficiency comparison between previous versions and a program implemented in an imperative language. We do not aim to prove the correctness of this program. The code uses arrays and is really similar to the code of the third version of BFS. This provides a fair efficiency comparison. 2.4 Experimental Evaluation. From logical specification in Isabelle/HOL code can be exported to functional languages SML, OCaml, Haskell and Scala [3]. From the specifications of the first three version, codes are exported to SML and their execution times are compared to the execution time of the version implemented in C. We computed the times needed only for execution of algorithms. Times needed for input parsing and printing results are not included. Measuring the execution time for C programs is done by using Unix command time, and Isabelle allows us to export each function to SML and measure its execution time directly within the system. All programs have been ran on PC Pentium (R) Dual-Core E6500 2.93GHz with 2GB RAM, running under Linux. Although many instances of graphs in DIMACS format are available on-line, third and fourth program finished most of them in less then 0.01 seconds. So we have written our own generator of random graphs (in C), and tested programs on more then 30 generated instances. We present results only for six graphs that we find most representative. The following results are obtained:

How Efficient Can Fully Verified Functional Programs Be? 205 Vertices Edges SML (sets) SML (lists) SML (arrays) C 64 640 0.8s 8.2s 0.003s 0.000001s 193 1930 4.3s 320s 0.003s 0.000001s 70 570 705 700 * * 0.7s 0.02s 99 888 499 440 * * 0.15s 0.03s 4 500 10 143 032 * * 0.5s 0.05s 5 555 15 450 463 * * 0.7s 0.07s First two rows show that even for very small graphs, first two versions are very inefficient. Sign * denotes that time consumption is greater then 10 minutes and in that case we have not waited for a program to finish. Next two rows represent results for rare graphs where every vertex has 5-10 neighbours. Last 2 rows represent results for dense graphs. The time needed for graph traversal in SML version using arrays is in average 10-20 times greater then one needed in C (average time is calculated based on all generated instances). 3 Conclusions and Future Work. From the table above, we see that graph traversal for graphs containing thousands of vertices and millions of edges is done within a second in the verified SML program exported from Isabelle. These results are encouraging. SML program is slower than the program written in C, but this is a price that needs to be paid if we want software that is proven to work correctly (e.g., verified program is using big numbers instead of integers to ensure that overflows will not happen). All measured times are small, and very large graph instances must be used in order to make a comparison. We hope that good efficiency can be also obtained for some other graph algorithms. Hopefully, verified functional implementations will show even better results if algorithms require more computations and less data-structure manipulation (contrary to BFS). We plan to address this matter in our future work and to implement, verify and evaluate several other graph algorithms. As much bigger task, we plan to compare different approaches to verification by implementing and proving correctness of graph algorithms in other systems, e.g. Coq, ACL2, Microsoft Boogie. References 1. Nipkow T., Paulson C. L., Wenzel M. Isabelle/HOL: A Proof Assistant for Higher- Order Logic, volume 2283 of Lecture Notes. In Computer Science. Springer-Verlag, 2002. 2. Bulwahn, L., Krauss, A., Haftmann, F., Erkok, L., Matthews, J. Imperative Functional Programming with Isabelle/HOL. In: TPHOLs 08 3. Haftmann F., Nipkow T.. A code generator framework for Isabelle/HOL. In K. Schneider and J. Brandt, editors, TPHOL: Emerging Trends. Department of Computer Science, University of Kaiserslautern

206 Mirko Stojadinović 4. Learn You a Haskell for Great Good! available at http://learnyouahaskell.com 5. Harrison J., Formal proof theory and practice. In: Notices of the American Mathematical Society, 55(11):1395-1406, December 2008. 6. Woodcock J. C. P., Davies J. Using Z: Specification, Refinement, and Proof. Prentice-Hall, 1996. 7. Cormen T., Lesierson C., Rivest L., Stein C. Introduction to Algorithms. MIT Press, Cambridge, MA, 2nd edition, 2001.