Runtime Checking and Test Case Generation for Python

Anna Durrer
Master Thesis
Chair of Programming Methodology
D-INFK ETH
Supervisor: Marco Eilers, Prof. Peter Müller
24 May 2017

1 Introduction

This thesis is part of a software verification project of the Chair of Programming Methodology. The goal of program verification is to prove mathematically that a given program fulfils a given formal specification. Consequently, for every input that fulfils the precondition, the postcondition and all loop invariants hold. This is more powerful than testing because the results are guaranteed for all inputs.

Writing all the necessary conditions, also called contracts, can be tedious and error-prone. It is therefore desirable to write them incrementally and check whether they are sufficient. One problem that occurs when writing contracts is that whenever the program cannot be verified by an automatic verifier, there are three possible causes:

1. The program code contains one or more errors that prevent the correct contracts from being proven.
2. A specification is too weak. This means that a precondition or loop invariant within the method, or a postcondition of a called method, is not strong enough and has to be amended.
3. The contracts are of a form that is not provable by the verifier. For example, the SMT solver cannot prove some non-linear arithmetic expressions.

For a programmer, it is important to distinguish these errors, because they then know whether to change the code or the specification.

In this thesis, we aim to develop an automated mechanism that uses a counterexample and the symbolic heap obtained from the verifier to decide whether the detected error is more likely in the specification or in the code.

1.1 Example

    # Example for different kinds of error in Python
    def m(self, my_boolean):       # Method with one argument, a boolean
        Ensures(self.x == 3)       # Postcondition
        if my_boolean:             # Branching
            self.x = 2             # Setting x to the wrong value => CODE ERROR
        else:
            calculate_x(self)      # Calling a method => SPECIFICATION ERROR

    def calculate_x(self):
        Ensures(self.x > 0)        # Too weak postcondition
        self.x = 3                 # Correct code

Here is an example of Python code with different kinds of verification errors. In the method m there are two errors:

1. In the first branch, the program sets x to 2, but the specification states that x has to be 3 at the end of the execution. This is a program code error because the code does not comply with the required result. (Of course, the postcondition itself could be wrong, but we assume that the postcondition of a method is always correct, although it may be too weak for calling methods.)
2. The postcondition of calculate_x is not strong enough to prove that x is 3 at the end of the second branch. This is a specification error: the code of calculate_x establishes the postcondition of m, but the specification of calculate_x does not guarantee it.

1.2 SCION

As mentioned above, this thesis is part of a project of the Chair of Programming Methodology. The project is the verification of the Python implementation of the SCION internet architecture developed at ETH. The verification is done automatically by a newly developed static verifier called Nagini, which has its own specification language. One special feature of Viper, which is used by Nagini, is that every heap or field access requires an access permission. Each heap location has a total permission of 1; that is, the sum of all permissions given out for it is at most 1.
To read or write a heap location, a method needs to acquire a fraction of the total permission. To read a heap location, any positive fraction of the permission suffices (for example 0.5). To write to a heap location, the full permission of 1 is needed.
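The read and write rules above can be made concrete with a minimal Python sketch. It does not show how Viper or Nagini represent permissions internally; the class `PermissionTable`, its method names, and the `Cell` class are hypothetical and exist only for illustration. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


class PermissionTable:
    """Hypothetical record of the permission held for each heap location."""

    def __init__(self):
        # Maps (object id, field name) to the permission amount currently held.
        self._held = {}

    def grant(self, obj, field, amount):
        """Add `amount` to the held permission; the total may not exceed 1."""
        key = (id(obj), field)
        total = self._held.get(key, Fraction(0)) + Fraction(amount)
        if total > 1:
            raise ValueError("permissions to one location cannot exceed 1")
        self._held[key] = total

    def can_read(self, obj, field):
        # Reading needs any positive fraction.
        return self._held.get((id(obj), field), Fraction(0)) > 0

    def can_write(self, obj, field):
        # Writing needs the full permission of 1.
        return self._held.get((id(obj), field), Fraction(0)) == 1


class Cell:
    """Hypothetical object with a single field x."""

    def __init__(self):
        self.x = 0


table = PermissionTable()
cell = Cell()
table.grant(cell, "x", Fraction(1, 2))
print(table.can_read(cell, "x"), table.can_write(cell, "x"))
```

With half a permission the location is readable but not writable; granting the other half makes it writable as well.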

1.3 Counterexample

The verifier represents a possible code or specification error by a counterexample. The counterexample given by the symbolic execution verifier of Viper consists of two kinds of information: the counterexample of the Z3 prover (an SMT solver), which assigns values to variables, and the symbolic state at every point in the execution. The symbolic state is, slightly simplified, structured in the following way:

    store                      # Local variables
        x      Int    x_0
        y      Ref    y_0
        z      Ref    z_0
    heap                       # Known heap structure with permissions
        y_0.f  0.5    v_2
    path_condition             # Path conditions established at program point
        is_instance(x_0, list)
        v_2 > 0

Here, it is important to note that the symbolic state contains incomplete information. Only fields for which permissions are held are visible in the state; there is no information about fields without permission.

1.4 Approach

To distinguish the different errors mentioned above, we use the counterexample described in the previous section. The steps are the following:

1. Run the static verifier and obtain a counterexample.
2. Generate test inputs that fulfil the conditions of the counterexample.
3. Run the test cases on the program and check all assertions and contracts at runtime.

2 Core Goals

2.1 Runtime Checking

One part of this Master thesis deals with the problem of translating a Python method whose specifications are only checked by the verifier into a Python method that also checks these properties at runtime.

2.1.1 Determining strategy

As the first step, it is necessary to determine which Viper preconditions, postconditions and loop invariants can be translated into Python assertions, and where and how. In particular, translating the permissions described above seems challenging because each heap access has to be checked for the correct permission. Another important decision is which data structures to use for representing permissions, as permissions are concepts that exist only within the verification language.

2.1.2 Implementation

After determining how to translate the Viper contracts into Python assertions, the next step is to implement a code instrumentation that transforms any method into a method with the corresponding assertions. Methodologically, it is interesting how to transform Python programs that use methods of standard libraries. Contracts have been written for these library methods, but the contracts are unchecked. It would be desirable to check at least some of the contracts of standard library methods during runtime checking. One idea is to override the method __getattribute__, which is called whenever a field is accessed in any Python method, for every Python type, so that it checks whether the access is allowed according to the method's permissions. Whether this approach is useful has to be determined at this step.

2.2 Test Case Generation

Another important step in this thesis is to produce test cases that determine which of the error categories mentioned above the error reported by the SMT solver belongs to.

2.2.1 Parsing counterexample

The counterexample given by the SMT solver is quite difficult to read and understand. The first step for producing test cases is to parse the counterexample and map it to Python, either directly or via an intermediate step through Viper. One expected difficulty is that the symbolic heap is incomplete, as shown above.

2.2.2 Generate input

As the next step, objects that fulfil the conditions of the counterexample are to be produced automatically, several of them if possible. From those objects, different inputs for the method have to be constructed.
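As a rough illustration of this step, the following sketch searches a small pool of candidate values for assignments that satisfy a set of path conditions. The encoding of path conditions as Python predicates, the function name `generate_inputs`, and the candidate pool are all assumptions made for this example; the actual counterexample format of the verifier is different and has to be parsed first.

```python
from itertools import product


def generate_inputs(variables, path_conditions, candidates, limit=3):
    """Return up to `limit` assignments that satisfy all path conditions.

    variables:       names of the symbolic values to assign
    path_conditions: predicates that take a dict {name: value}
    candidates:      pool of values tried for every variable
    """
    found = []
    # Enumerate every combination of candidate values for the variables.
    for values in product(candidates, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(cond(assignment) for cond in path_conditions):
            found.append(assignment)
            if len(found) == limit:
                break
    return found


# Hypothetical encoding of the path condition on v_2 from Section 1.3.
conditions = [lambda a: a["v_2"] > 0]
inputs = generate_inputs(["v_2"], conditions, candidates=[-1, 0, 1, 2, 100])
print(inputs)
```

Exhaustive enumeration is only workable for very small pools; it is used here because it keeps the sketch short, not because it is the intended strategy.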
Note that some input objects might not occur in the symbolic heap and have to be guessed. Different approaches to initializing unspecified values have to be tested or decided on (choose randomly, establish corner cases, ...).

2.3 Combination

The first two parts of the thesis, runtime checking and test case generation, have to be combined in the end. The generated test cases should be run on the translated methods, and the results are to be returned in a useful way. For example, one could output all test inputs together with their results (either that no assertion was broken, or the assertion that does not hold). One part of the thesis is to apply the process to some of the examples in SCION and determine whether the output is useful for the programmer.

3 Extension Goals

There are some extensions to the thesis that could improve the project further. Some of the proposed extensions are:

1. Including information from predicates and functions in the test case generation step: Without this information, we have two options: either we disallow programs that contain predicates and functions, or we ignore the information given by them and can therefore get false positives. In the first case, this extension allows a wider application; in the second, it lets us avoid false positives.
2. Extending runtime checking by obligations: Nagini contains obligations, namely for I/O contracts, loop termination and other conditions. Including those in the runtime checking would give the project a wider application, because programs with obligations could be handled, or make it more precise, because obligations would be checked as well.
3. Determining the statement where the contracts are too weak: If the project determines that the error is due to insufficient contracts, it would be quite helpful for the programmer to know at which instruction the contracts are insufficient. To that end, the runtime checking might be adapted to run a method from a certain point onwards and check when the input produced in the earlier steps is not sufficient.

To illustrate how this could be accomplished, consider the following example:

    # Example for finding position of specification error in Python
    def m(self, my_boolean):       # Method with one argument, a boolean
        Ensures(self.x == 3)       # Postcondition
        calculate_y(self)          # Other function of no consequence
        calculate_x(self)          # Place of SPECIFICATION ERROR
        calculate_z(self)          # Other function of no consequence

In this example, the specification error is in the second function call, but we do not know that in advance.
The goal is therefore to generate test inputs from the symbolic state at the end of some instruction (for example, after the first method call) and then check whether the assertion holds or whether there are inputs allowed by the symbolic state that lead to an assertion failure. In this example, we could start after the second function call with the value x = 1 and get an assertion failure. One difficulty here is that we have to generate additional values for local variables.
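The idea of starting execution after a given call can be sketched as follows. This is only an illustration under several assumptions: `Obj`, `run_rest_of_m`, and `failing_states` are hypothetical names, calculate_z is assumed to leave x untouched, and the start values are hand-picked from those that a weak postcondition x > 0 of calculate_x (as in Section 1.1) would allow.

```python
class Obj:
    """Stand-in for the receiver object; only the field x matters here."""

    def __init__(self, x):
        self.x = x


def calculate_z(obj):
    # Other function of no consequence: assumed not to modify x.
    pass


def run_rest_of_m(obj):
    """Execute m from the point after calculate_x and check m's postcondition."""
    calculate_z(obj)
    assert obj.x == 3, "postcondition of m violated"


def failing_states(start_values):
    """Return the start values of x for which the rest of m breaks the postcondition."""
    failures = []
    for x in start_values:
        try:
            run_rest_of_m(Obj(x))
        except AssertionError:
            failures.append(x)
    return failures


# Values permitted by the assumed weak postcondition x > 0 of calculate_x.
allowed_after_calculate_x = [1, 3]
failures = failing_states(allowed_after_calculate_x)
print(failures)
```

A start state that is allowed by the callee's contract but breaks the caller's postcondition, such as x = 1 here, points to the contract of calculate_x as the place where the specification is too weak.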