A Satisfiability Modulo Theories (SMT) Solver for the Theory of Equality and Uninterpreted Functions (EUF)

THE UNIVERSITY OF MANCHESTER

A Satisfiability Modulo Theories (SMT) Solver for the Theory of Equality and Uninterpreted Functions (EUF)

Anmol Khurana
May 2016
Supervised by Dr. Renate Schmidt

A third year project report submitted for the degree of B.Sc. (Hons) Computer Science from the School of Computer Science at The University of Manchester

Abstract

A Satisfiability Modulo Theories (SMT) solver for the Theory of Equality and Uninterpreted Functions (EUF) is a combination of a Parser, a Satisfiability (SAT) Solver and a Theory Reasoner. The output of an SMT solver for the Theory of EUF is a propositional logic model that does not lead to contradictions with respect to the background Theory of EUF.

The goal of my project was to optimise the performance of an SMT solver that was developed in a previous third year project by Alexander White. The optimisation was carried out by replacing the Davis-Putnam-Logemann-Loveland (DPLL) based SAT solver with a Conflict Driven Clause Learning (CDCL) based SAT solver. The former uses naïve chronological backtracking whereas the latter uses the more efficient backjumping technique. CDCL was implemented with two learning schemes, namely [i] the First Unique Implication Point (1-UIP) scheme and [ii] the Last Unique Implication Point (Rel_Sat) scheme.

The solver was further modified to generate a minimal model. Two different methods to compute a minimal model, namely [i] the Multiple Model method and [ii] the Iterative Deepening method, were explored.

The benchmarking results indicated improved performance of the CDCL based SMT solver over White's DPLL based SMT solver. The performance of the 1-UIP scheme was marginally better than that of the Rel_Sat scheme. However, both should be considered for future implementations. The Iterative Deepening method was observed to be faster at finding the minimal model than the Multiple Model method.

Acknowledgement

I would like to thank my supervisor Dr Renate Schmidt for her continual guidance and support. She gave valuable advice and feedback throughout the course of the project.

I would like to acknowledge the work done by Alexander White, as this project builds on his project. Reusing his files gave me more time to optimise the solver and explore additional features.

Lastly, I would like to thank my family and friends for their encouragement and for always being there by my side.

Contents

Abstract
Acknowledgement
Contents
List of Figures
List of Graphs
List of Tables
1. Introduction
    1.1 Aims and Motivation
    1.2 Report Structure
2. Context
    2.1 Introduction to Satisfiability and SAT Solvers
    2.2 The Davis, Putnam, Logemann and Loveland (DPLL) algorithm
    2.3 The Conflict Driven Clause Learning (CDCL) algorithm
    2.4 Satisfiability Modulo Theories (SMT) Solver
    2.5 How the Theory Reasoner works?
    2.6 Multiple models and the Minimal model
3. Design and Implementation
    3.1 Introduction to Design
    3.2 Activity diagram for the general SMT solver and the multiple model SMT Solver
    3.3 Activity Diagram for the SMT solver for Iterative Deepening
    3.4 Activity diagram for the CDCL based SAT Solver
    3.5 Activity diagram for the CDCL based SAT Solver for Iterative Deepening method
    3.6 Implementation of the CDCL based SAT Solver
    3.7 Implementation of the SMT Solver for Theory of EUF
    3.8 Implementation Considerations and Reflections
4. Testing, Benchmarking and Evaluation
    4.1 Benchmarking for the DPLL based SMT Solver and the CDCL based SMT Solver
    4.2 Experimentation with the different CDCL learning schemes
    4.3 Multiple Model method and Iterative Deepening method
5. Reflection and Conclusion
    5.1 Project Achievement
    5.2 Further Implementation
    5.3 Knowledge Gained
    5.4 Personal Development
    5.5 Conclusion
Bibliography
Appendices
    Appendix A
    Appendix B
    Appendix C
    Appendix D

List of Figures

Figure 2.1 DPLL and CDCL Example Tree
Figure 2.2 Implication Graph for CDCL example
Figure 2.3 Implication Graph with 1-UIP Cut for CDCL example
Figure 2.4 Implication Graph with Rel_Sat Cut for CDCL example
Figure 2.5 Deriving Minimal Model by finding Multiple Models
Figure 2.6 Minimal Model by Iterative Deepening
Figure 3.1 SMT Solver Design for this Project
Figure 3.2 Activity Diagram for the SMT Solver
Figure 3.3 Activity Diagram for the SMT Solver that returns the minimal model
Figure 3.4 Activity Diagram for the CDCL based SAT Solver
Figure 3.5 Activity diagram for the SAT Solver that returns Minimal Model
Figure 3.6 Pseudocode for the CDCL algorithm in SMT solver for Theory of EUF
Figure 3.7 Pseudocode for CDCL in Iterative Deepening method
Figure 3.8 Class Diagram for the SMT solver
Figure B.1 ArrayList based Implication Graph
Figure B.2 Representation of the Implication Graph
Figure C.1 Screenshot of the SMT solver output for the twosat problem
Figure C.2 Screenshot of the SMT solver output for the eqdiamond problem
Figure C.3 Screenshot of the SMT solver output for the threeunsat problem
Figure C.4 Screenshot of the SMT solver output for the threeunsat problem
Figure C.5 Screenshot of the Z3 solver output for the threeunsat problem
Figure D.1 Screenshot of the SMT solver output for the Multiple Model method
Figure D.2 Screenshot of the SMT solver output for the Iterative Deepening method
Figure D.3 Screenshot of the SMT solver output for the Iterative Deepening method
Figure D.4 Screenshot of the SMT solver output for the Iterative Deepening method

List of Graphs

Graph 4.1 Time/s against Benchmark Number for uf20 files
Graph 4.2 Time/s against Benchmark Number for uf50 files
Graph 4.3 Time/s against Benchmark Number for uf75 files
Graph 4.4 Time/s against Benchmark Number for uf100 files
Graph 4.5 Time/s against Benchmark Number for uuf50 files
Graph 4.6 Time/s against Benchmark Number for uuf100 files
Graph 4.7 Time/s against Number of Variables for 1-UIP and Rel_Sat
Graph 4.8 Time/s against Number of Variables for Multiple Model and Iterative Deepening

List of Tables

Table 2.1 Decision Tree for CDCL example
Table 4.1 Benchmarking Results for DPLL and CDCL
Table 4.2 Benchmarking Results for 1-UIP and Rel_Sat
Table 4.3 Benchmarking Results for Multiple Model and Iterative Deepening
Table B.1 Nodes with their previous nodes list

1. Introduction

This section consists of the project goals, the motivation for choosing the project, and the structure of the report.

1.1 Aims and Motivation

The aims of the project were as follows:
- Implement the Conflict Driven Clause Learning (CDCL) algorithm in an existing Satisfiability Modulo Theories (SMT) solver,
- Experiment with different learning schemes associated with the CDCL algorithm (1-UIP, Rel_Sat),
- Equip the SMT Solver to find the multiple models and the minimal model, and
- Experiment with more efficient methods to compute the minimal model.

The project was chosen because of its field, the programming language involved, and its nature. I am interested in the field of Logic and Modelling, and this project provided me with the opportunity to exercise that interest. This was a Java based project and Java is my favoured programming language. Having prior knowledge of the programming language is beneficial when the project is complicated. This project was essentially research driven. I possess an inquisitive mind, and like to research, understand and apply advanced topics. These reasons combined made this project a suitable choice for my third year project.

1.2 Report Structure

The structure of the report is as follows:

Chapter 1: Introduction
This chapter lists the aims, the motivation behind the project choice and the report structure. These help to give an overview of the report.

Chapter 2: Context
This chapter covers the concepts and principles associated with the project to help the reader better understand the achievements of the project.

Chapter 3: Design and Implementation
This chapter documents the design and implementation that was carried out for this project. The design section looks at the activity diagrams for the SMT solver and the CDCL based SAT solver. The implementation section documents the code development process and explains the resulting class diagram of the SMT solver.

Chapter 4: Testing, Benchmarking and Evaluation
This chapter covers the testing that was done to check the correctness of the SMT solver and the benchmarking that was done to compare the performance of this project's SMT solver and White's SMT solver. The benchmarking results are analysed and evaluated.

Chapter 5: Conclusion
This chapter covers the project achievements, the avenues for further research, the lessons learned and the knowledge gained.

2. Context

This section introduces the SAT solver, the Davis-Putnam-Logemann-Loveland (DPLL) algorithm, the Conflict Driven Clause Learning (CDCL) algorithm, the Satisfiability Modulo Theories (SMT) solver, and the Theory Reasoner.

2.1 Introduction to Satisfiability and SAT Solvers

This section covers the basic concepts of satisfiability. The Satisfiability (SAT) problem is a decision problem involving a set of clauses. The answer is yes if this set is satisfiable (Voronkov, 2014, Chapter 5, Page 67). If a set of clauses A is true in an interpretation I, then I satisfies A and is said to be a model of A (Voronkov, 2014, Chapter 3, Page 29).

The SAT problem has applications in a wide array of fields, such as:
- Crosstalk noise prediction in integrated circuits (Chen & Keutzer, November 1999),
- Termination analysis in term-rewrite systems (Fuhs, et al., 2007),
- Model checking of finite-state systems (Biere, et al., March 1999),
- Design debugging (Smith, et al., 2005),
- AI planning (Selman & Kautz, 1992),
- Haplotype Inference in Bioinformatics (Lynce & Marques-Silva, July 2006),
- Software Model checking (Jackson, et al., 2000),
- Software Testing (Khurshid & Marinov, 2004),
- Circuit Delay Computation (McGeer, et al., 1991), and
- Test Pattern Generation in Digital Systems (Larrabee, 1992; Marques-Silva, 2008).

Logic formulae stemming from real-world applications have a large number of variables. Even with an efficient algorithm, such formulae cannot be solved by hand, hence the need for automated SAT solvers.

Big companies like IBM and Microsoft have numerous research publications pertaining to SAT solvers. Microsoft Research developed Z3, an SMT solver which targets software analysis and software verification problems (Moura & Bjørner, 2008). The dedicated research teams at multinational corporations are a testament to the importance of efficient SAT solving in industry and highlight the relevance of this project.

2.2 The Davis, Putnam, Logemann and Loveland (DPLL) algorithm

The Davis-Putnam-Logemann-Loveland (DPLL) algorithm is a backtracking based algorithm that decides the satisfiability of propositional logic formulae. The DPLL algorithm takes as input a set of clauses in Conjunctive Normal Form (CNF), and generates a model. DPLL works by combining unit propagation with the splitting method.

Unit propagation involves repeatedly performing the following transformation on the given set of clauses $S$: if $S$ has a unit clause, i.e. a clause with just one literal $L$, then [i] remove from $S$ every clause of the form $L \lor C$; [ii] replace in $S$ every clause of the form $\lnot L \lor C$ by the clause $C$ (Voronkov, 2014, Chapter 5).

Splitting involves assigning either true or false to a literal, thereby splitting the search space, and propagating down the branch with the chosen assignment. If that branch does not return a model, then propagation is done on the other branch.

To start the DPLL algorithm, a literal $L$ is selected from the set of clauses $S$. The DPLL algorithm is then recursively performed on the set $S$ augmented with the unit clause $L$. If this branch returns unsatisfiable, then DPLL is performed on the set $S$ augmented with $\lnot L$ (Voronkov, 2014, Chapter 5, Page 71). This goes on until satisfiable is returned or all branches return unsatisfiable.

The limitation of the DPLL algorithm is its lack of learning. Its basic functioning involves assigning a true or false value to a literal and checking whether that leads to a solution. If not, the opposite value is assigned to the literal and the solver checks whether that leads to a model. This chronological backtracking approach may be ineffective for large problems, and so more efficient algorithms with learning and intelligent backtracking are explored.

2.3 The Conflict Driven Clause Learning (CDCL) algorithm

This project revolves around the use of the Conflict Driven Clause Learning (CDCL) method to solve the SAT problem. Figure 2.1 depicts the difference between the DPLL algorithm's and the CDCL algorithm's traversal of the search space. The green node is assumed to be the model solution. Figure 2.1 (left) depicts the traditional backtracking scheme used in the DPLL algorithm. Figure 2.1 (right) shows the non-chronological backtracking technique, also known as backjumping, which is at the heart of the CDCL algorithm.
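For comparison with backjumping, the chronological DPLL procedure of Section 2.2 can be sketched in a few lines of Java. This is a minimal illustration and not the code of White's solver: the class name `DpllSketch`, the method names and the encoding of literals as signed integers (`+v` for a variable, `-v` for its negation) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not taken from the project's code base.
class DpllSketch {

    // Literals are non-zero ints: +v means variable v is true, -v means it is false.
    // Returns a list of literals satisfying all clauses, or null if there is none.
    static List<Integer> solve(List<List<Integer>> clauses, List<Integer> model) {
        // Unit propagation: while some clause has a single literal L, assert L.
        while (true) {
            Integer unit = null;
            for (List<Integer> c : clauses) {
                if (c.isEmpty()) return null;              // empty clause: branch fails
                if (c.size() == 1 && unit == null) unit = c.get(0);
            }
            if (unit == null) break;
            model.add(unit);
            clauses = assign(clauses, unit);
        }
        if (clauses.isEmpty()) return model;               // every clause is satisfied
        // Splitting: pick a literal L, try S plus L, then S plus not-L.
        int lit = clauses.get(0).get(0);
        for (int choice : new int[] {lit, -lit}) {
            List<Integer> branchModel = new ArrayList<>(model);
            branchModel.add(choice);
            List<Integer> result = solve(assign(clauses, choice), branchModel);
            if (result != null) return result;
        }
        return null;                                       // both branches unsatisfiable
    }

    // Drops every clause containing lit and deletes -lit from the remaining clauses.
    static List<List<Integer>> assign(List<List<Integer>> clauses, int lit) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> c : clauses) {
            if (c.contains(lit)) continue;
            List<Integer> reduced = new ArrayList<>(c);
            reduced.removeIf(x -> x == -lit);
            out.add(reduced);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> clauses = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(-1, 2), Arrays.asList(-2, 3));
        System.out.println(solve(clauses, new ArrayList<>()));   // prints [1, 2, 3]
    }
}
```

Note that when the first branch fails, the sketch simply tries the opposite literal at the same point; nothing is remembered about why the branch failed, which is the limitation that clause learning addresses.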

Figure 2.1 DPLL and CDCL Example Tree

CDCL is driven by conflicts and subsequent clause learning. The algorithm learns a new clause from each conflict; this clause records the reasons deduced from the conflict so that the same assignments are not repeated in the future. The algorithm also returns a backjump level that resolves the conflict, allowing the solver to skip fruitless branches and reach the solution efficiently (Zhang, et al., 2001).

The following example, taken from (Tichy & Glase, 2006), will help to understand the working of the CDCL algorithm and the associated terminology. The example consists of the following set of clauses:

$w_1 = (x_1 \lor x_2)$, $w_2 = (x_1 \lor x_3 \lor x_7)$, $w_3 = (\lnot x_2 \lor \lnot x_3 \lor x_4)$, $w_4 = (\lnot x_4 \lor x_5 \lor x_8)$, $w_5 = (\lnot x_4 \lor x_6 \lor x_9)$, $w_6 = (\lnot x_5 \lor \lnot x_6)$

A decision variable is a variable assignment made as a free decision. An implied decision is a variable assignment forced by other assignments. This happens when a clause containing the variable is not yet true and all the other variables in the clause have been assigned a value, so the implied decision must be made such that the clause becomes true.

In the CDCL algorithm, each variable assignment is given a decision level, starting from zero upwards. The decision level is assigned differently to decision variables (free decisions) and to implied decisions. For decision variables, the decision level is incremented by one and the variable is assigned this new decision level. For implied decisions, the variable is assigned the same decision level as the previous decision variable (Tichy & Glase, 2006).

The decisions made are recorded in a directed graph called the Implication Graph. In the implication graph, each vertex represents a variable assignment and the edges into that vertex represent the reason leading to that assignment. The decision variables have no incoming edges whereas the implied decisions do (Tichy & Glase, 2006).

Table 2.1 represents the decision tree for the CDCL example. The decision tree consists of the decision variables (free decisions) and their corresponding decision levels. Figure 2.2 represents the implication graph for the CDCL example corresponding to the decisions made so far. The implication graph consists of the decision variables and the implied decisions.

Table 2.1 Decision Tree for CDCL example

Figure 2.2 Implication Graph for CDCL example (Tichy & Glase, 2006)

As can be seen in Figure 2.2, the decision variables and the implied decisions have resulted in a conflict. Prior variable assignments coupled with the clause $w_4 = (\lnot x_4 \lor x_5 \lor x_8)$ require $x_5$ to be True. Similarly, prior variable assignments coupled with the clause $w_6 = (\lnot x_5 \lor \lnot x_6)$ require $x_5$ to be False.

Conflict analysis revolves around Unique Implication Points (UIPs). A UIP is any node at the current decision level such that every path from the decision variable to the conflict node must pass through it. When a conflict is encountered, the implication graph is partitioned by means of a cut into two parts: a reason side and a conflict side. The cut is made next to a UIP, and the learning scheme employed in the CDCL algorithm determines which UIP is selected for the cut.
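The bookkeeping described above (decision levels, implied decisions and the reason for each assignment) can be sketched as follows. The sketch assumes the example clauses read $w_1 = (x_1 \lor x_2)$, $w_2 = (x_1 \lor x_3 \lor x_7)$, $w_3 = (\lnot x_2 \lor \lnot x_3 \lor x_4)$, $w_4 = (\lnot x_4 \lor x_5 \lor x_8)$, $w_5 = (\lnot x_4 \lor x_6 \lor x_9)$, $w_6 = (\lnot x_5 \lor \lnot x_6)$, and that the free decisions are $x_7$, $x_8$, $x_9$ and $x_1$ set to false at levels 1 to 4. These levels are an assumption, since Table 2.1 is not reproduced here, and the class and method names are hypothetical. The graph is stored implicitly: each variable records the index of the clause that implied it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; names are not taken from the project's code base.
class ImplicationSketch {
    // Example clauses w1..w6 (indices 0..5); +v is x_v, -v is its negation.
    static final int[][] CLAUSES = {
        {1, 2}, {1, 3, 7}, {-2, -3, 4}, {-4, 5, 8}, {-4, 6, 9}, {-5, -6}
    };

    final Map<Integer, Boolean> value = new HashMap<>();   // variable -> assigned value
    final Map<Integer, Integer> level = new HashMap<>();   // variable -> decision level
    final Map<Integer, Integer> reason = new HashMap<>();  // variable -> implying clause index, -1 for decisions

    // Records a free decision: it has no reason clause (no incoming edges).
    void decide(int lit, int decisionLevel) {
        set(lit, decisionLevel, -1);
    }

    private void set(int lit, int lvl, int clauseIndex) {
        value.put(Math.abs(lit), lit > 0);
        level.put(Math.abs(lit), lvl);
        reason.put(Math.abs(lit), clauseIndex);
    }

    // Repeats unit propagation at the given level. Returns the index of a clause
    // whose literals are all false (a conflict), or -1 if none is found.
    int propagate(int lvl) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < CLAUSES.length; i++) {
                int unassigned = 0;
                int candidate = 0;
                boolean satisfied = false;
                for (int lit : CLAUSES[i]) {
                    Boolean v = value.get(Math.abs(lit));
                    if (v == null) { unassigned++; candidate = lit; }
                    else if (v == (lit > 0)) satisfied = true;
                }
                if (satisfied) continue;
                if (unassigned == 0) return i;              // conflict
                if (unassigned == 1) {                      // implied decision
                    set(candidate, lvl, i);
                    changed = true;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ImplicationSketch g = new ImplicationSketch();
        int[] decisions = {-7, -8, -9, -1};                 // assumed order of free decisions
        int conflict = -1;
        for (int i = 0; i < decisions.length && conflict < 0; i++) {
            g.decide(decisions[i], i + 1);
            conflict = g.propagate(i + 1);
        }
        System.out.println("conflicting clause index: " + conflict);
        System.out.println("levels: " + g.level + ", reasons: " + g.reason);
    }
}
```

Under these assumptions the first three decisions imply nothing, while the fourth implies $x_2$ to $x_6$ at level 4 and then falsifies $w_6$, which is the conflict described in the text.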

Figure 2.3 depicts the implication graph with the conflict side and the reason side when the First Unique Implication Point (1-UIP) scheme is used. The cut is made after the first UIP, i.e. $x_4$. This places all variables assigned after the first UIP that have paths to the conflicting variable on the conflict side, and everything else on the reason side (Zhang, et al., 2001).

Figure 2.3 Implication Graph with 1-UIP Cut for CDCL example (Tichy & Glase, 2006)

Figure 2.4 depicts the implication graph when partitioned following the Last Unique Implication Point (Rel_Sat) scheme. In the Rel_Sat scheme, the cut that partitions the implication graph is placed next to the last UIP on the path from the conflict node back to the decision variable. The last UIP in the example is the decision variable $x_1$. All the variables assigned at the current decision level, with the exception of the decision variable, are placed on the conflict side. The decision variable and the variables assigned at decision levels lower than the current decision level are on the reason side (Zhang, et al., 2001).

Figure 2.4 Implication Graph with Rel_Sat Cut for CDCL example (Tichy & Glase, 2006)
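The two cuts can also be computed without drawing the graph, by resolving the conflicting clause against the reason clauses of the implied decisions in reverse order of assignment. The sketch below uses this resolution-based formulation, which is a standard way of implementing the schemes and not necessarily how the project's solver does it. It hard-codes the example's assignment order, takes the decision levels of $x_7$, $x_8$ and $x_9$ to be 1, 2 and 3 and the conflict level to be 4 (assumptions, since Table 2.1 is not reproduced here), and uses hypothetical names throughout.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; names are not taken from the project's code base.
class UipSketch {
    // Assignments in the order they were made; -v means x_v was set to false.
    static final List<Integer> TRAIL = Arrays.asList(-7, -8, -9, -1, 2, 3, 4, 5, 6);
    // Decision level of each variable (index = variable number, index 0 unused).
    static final int[] LEVEL = {0, 4, 4, 4, 4, 4, 4, 1, 2, 3};
    // Clause that implied each variable; null for free decisions.
    static final int[][] REASON = {
        null, null, {1, 2}, {1, 3, 7}, {-2, -3, 4}, {-4, 5, 8}, {-4, 6, 9}, null, null, null
    };

    // Learns a clause from the conflicting clause. With lastUip == false the walk stops
    // once a single literal of the conflict level remains (1-UIP); with lastUip == true
    // it continues until the decision variable is reached (Rel_Sat).
    static Set<Integer> learn(int[] conflict, int conflictLevel, boolean lastUip) {
        Set<Integer> clause = new LinkedHashSet<>();
        for (int lit : conflict) clause.add(lit);
        for (int i = TRAIL.size() - 1; i >= 0; i--) {
            int lit = TRAIL.get(i);
            int var = Math.abs(lit);
            if (LEVEL[var] < conflictLevel || REASON[var] == null) break;     // reached the decision
            if (!lastUip && countAtLevel(clause, conflictLevel) <= 1) break;  // first UIP reached
            if (!clause.contains(-lit)) continue;
            clause.remove(-lit);                        // resolve on this variable
            for (int other : REASON[var]) {
                if (other != lit) clause.add(other);
            }
        }
        return clause;
    }

    static int countAtLevel(Set<Integer> clause, int lvl) {
        int count = 0;
        for (int lit : clause) {
            if (LEVEL[Math.abs(lit)] == lvl) count++;
        }
        return count;
    }

    // Highest decision level in the clause, ignoring the conflict level itself.
    static int backjumpLevel(Set<Integer> clause, int conflictLevel) {
        int max = 0;
        for (int lit : clause) {
            int lvl = LEVEL[Math.abs(lit)];
            if (lvl != conflictLevel) max = Math.max(max, lvl);
        }
        return max;
    }

    public static void main(String[] args) {
        int[] w6 = {-5, -6};                            // the falsified clause
        Set<Integer> firstUip = learn(w6, 4, false);
        Set<Integer> relSat = learn(w6, 4, true);
        System.out.println("1-UIP:   " + firstUip + " backjump " + backjumpLevel(firstUip, 4));
        System.out.println("Rel_Sat: " + relSat + " backjump " + backjumpLevel(relSat, 4));
    }
}
```

Under the stated assumptions, the 1-UIP walk stops with the literals $\lnot x_4$, $x_8$, $x_9$ and the Rel_Sat walk ends with $x_1$, $x_7$, $x_8$, $x_9$, both with backjump level 3.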

After the partition, the implication graph is analysed to learn information about the conflict. The information learned takes the form of a conflict clause and a backjump level. The conflict clause is made up of the nodes on the reason side that have edges leading into the conflict side (Tichy & Glase, 2006). The clause learned is the disjunction of the negations of the variable assignments corresponding to these nodes. Using the 1-UIP scheme, the clause learned is $w_7 = (\lnot x_4 \lor x_8 \lor x_9)$. Using the Rel_Sat scheme, the clause learned is $w_7 = (x_1 \lor x_7 \lor x_8 \lor x_9)$. The learned clause consists of the negations of previously made assignments because that combination of assignments is invalid: it forces the conflicting variable to be both true and false, and so should not be made again (Zhang, et al., 2001). The new clause is appended to the existing clause set to ensure that the same decisions are not repeated.

The last step of CDCL is to determine the backjump level from the conflict clause. The backjump level is the maximum decision level of the nodes in the conflict clause, not considering the node assigned at the conflict level (Tichy & Glase, 2006). Using either the 1-UIP scheme or the Rel_Sat scheme, the resulting backjump level is 3. So the solver rolls back to decision level three and the decision variable assignment at level three is flipped. The conflict is resolved and further decisions are made until the model solution is found.

2.4 Satisfiability Modulo Theories (SMT) Solver

Satisfiability (SAT) problems deal with propositional logic. Satisfiability Modulo Theories (SMT) problems deal with more expressive forms of logic. In SMT problems, the satisfiability of first-order formulas is decided with respect to background theories such as the theory of equality and the theory of arrays (Nieuwenhuis, et al., 2006). Put more simply, in an SMT problem the clauses may contain propositional atoms along with atoms over theories such as the theory of equality with uninterpreted functions, of arrays or of integers (Nieuwenhuis, et al., 2006). The SMT solver in this project supports the Theory of Equality and Uninterpreted Functions (EUF).

The difference between a SAT problem and an SMT problem is as follows:
- In a SAT problem, an atom represents only a boolean variable.
- In an SMT problem for the Theory of Equality and Uninterpreted Functions (EUF), an atom may represent a boolean variable, a predicate or an equation.

2.5 How the Theory Reasoner works?

The Theory Reasoner has to deal with these equalities, inequalities and predicates, and check that the propositional logic model does not lead to contradictions with respect to the Theory of EUF. This project only requires a high level understanding of the Theory Reasoner.

The algorithm for the Theory Reasoner applies rewrite rules to process the set of equations. Once an equation is processed, it is represented using constants which correspond to terms in the equation (contained in set K) and rewrite rules that denote the equality relations derived so far (contained in set R) (Bachmair, et al., 2003). The six rewrite rules used by the equality solver in this project are Extension, Simplification, Deletion, Orientation, Collapse, and Deduction. There are additional rules for the simplification of inequalities and predicates. These rules are explained in Appendix A. The Theory Reasoner uses the newly generated sets K and R to infer whether the propositional logic model leads to a contradiction with respect to the set of equalities, inequalities and predicates.

2.6 Multiple models and the Minimal model

This section introduces the multiple models and the minimal model, and covers the methods used to compute these models.

A SAT or SMT problem may have more than one model. Deriving multiple models for a SAT or SMT problem refers to deriving all the models for the given problem. The models can then be compared on the basis of some criterion to derive the minimal model. The criterion may be the number of true assignments, the total number of assignments or something else. This project uses the total number of assignments as the criterion.

Computing all the models (multiple models) for a set of clauses requires traversing all the branches which have a satisfiable model at the leaf nodes. To do so, upon reaching a model, the solver still backtracks one level and continues traversing the search space for more models. This procedure is reflected in Figure 2.5 by the arrows. The example assumes that each leaf node results in a model solution and also lists the sizes of the derived models.

The backtracking process works as follows. After traversing down the leftmost branch and obtaining a model over $a_1$, $a_2$ and $a_3$, the solver backtracks one level and obtains the model from the adjacent branch, which differs only in the most recent assignment. It then backtracks again to obtain the next model, and so on. In this manner, the different branches are explored and all the possible models are obtained. Their sizes can then be compared to obtain the minimal model.
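The Multiple Model method can be illustrated with a small exhaustive search. This is a brute-force sketch, not the CDCL based procedure of the project: it assigns variables in a fixed order, records an assignment as a model as soon as every clause is satisfied, and then backtracks to look for more. The class name, method names and the example clause set are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names are not taken from the project's code base.
class AllModelsSketch {

    // Depth-first search over variables 1..numVars in order. Every (possibly partial)
    // assignment that already satisfies all clauses is recorded, then the search backtracks.
    static void enumerate(int[][] clauses, int numVars, List<Integer> partial,
                          List<List<Integer>> models) {
        if (allSatisfied(clauses, partial)) {
            models.add(new ArrayList<>(partial));
            return;
        }
        int next = partial.size() + 1;
        if (next > numVars) return;                     // leaf reached without a model
        for (int lit : new int[] {next, -next}) {
            partial.add(lit);
            enumerate(clauses, numVars, partial, models);
            partial.remove(partial.size() - 1);         // backtrack one level
        }
    }

    // True if every clause contains at least one literal of the assignment.
    static boolean allSatisfied(int[][] clauses, List<Integer> assignment) {
        for (int[] clause : clauses) {
            boolean sat = false;
            for (int lit : clause) {
                if (assignment.contains(lit)) sat = true;
            }
            if (!sat) return false;
        }
        return true;
    }

    // The minimal model is the recorded model with the fewest assignments.
    static List<Integer> minimal(List<List<Integer>> models) {
        List<Integer> best = null;
        for (List<Integer> m : models) {
            if (best == null || m.size() < best.size()) best = m;
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] clauses = {{1, 2}, {-1, 3}};            // (a1 or a2) and (not a1 or a3)
        List<List<Integer>> models = new ArrayList<>();
        enumerate(clauses, 3, new ArrayList<>(), models);
        System.out.println(models);                     // prints [[1, 2, 3], [1, -2, 3], [-1, 2]]
        System.out.println(minimal(models));            // prints [-1, 2]
    }
}
```

The search visits every branch before the sizes can be compared, which is the cost that motivates a bounded search.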


Figure 2.5 Deriving Minimal Model by finding Multiple Models

Computing all the models by traversing the entire search space is highly inefficient as it takes exponential time. For formulae with a hundred or more variables, the cost of the extensive computation required to find the minimal model may outweigh the benefits of having it.

A more effective method, inspired by Paolo Liberatore's Algorithms and Experiments on Finding Minimal Models, is the bounded Iterative Deepening method. The SAT solver's algorithm is modified so that it only looks for models whose size is bounded by some number (Liberatore, 2000). This method is depicted in Figure 2.6. When the incomplete variable assignment reaches the bound without satisfying the clauses, propagation down that branch terminates and the solver moves to the next branch.

Figure 2.6 Minimal Model by Iterative Deepening

For instance, with bound two, the solver traversing the leftmost branch will have a partial model over $a_1$ and $a_2$. The solver would realise that with this partial model the set of clauses is not satisfied, and that making any more decisions is not possible without exceeding the bound. So the solver stops exploring this branch, backtracks and continues traversal along the adjacent branch. If all the branches have been covered with the current bound and no solution is found, the bound is incremented by one and the search is restarted.

Comparing the number of steps for the logic trees in Figure 2.5 and Figure 2.6, the former takes twenty-four steps while the latter takes twelve steps. This gives a rough indication of the efficiency of the Iterative Deepening method.
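The bounded search can be sketched in the same brute-force style. The code below is an illustration under assumptions, not the project's CDCL based implementation: variables are assigned in a fixed order, a branch is abandoned once it holds as many assignments as the bound allows, and the bound grows by one until a model is found or it exceeds the number of variables. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names are not taken from the project's code base.
class IterativeDeepeningSketch {

    // Returns a model with the fewest assignments reachable in the fixed variable
    // order 1..numVars, or null if no assignment satisfies the clauses.
    static List<Integer> minimalModel(int[][] clauses, int numVars) {
        for (int bound = 0; bound <= numVars; bound++) {
            List<Integer> model = search(clauses, new ArrayList<>(), bound);
            if (model != null) return model;            // first hit uses the smallest bound
        }
        return null;                                    // bound exceeded the number of variables
    }

    // Depth-first search that never holds more than 'bound' assignments.
    static List<Integer> search(int[][] clauses, List<Integer> partial, int bound) {
        if (allSatisfied(clauses, partial)) return new ArrayList<>(partial);
        if (partial.size() >= bound) return null;       // bound reached: abandon this branch
        int next = partial.size() + 1;
        for (int lit : new int[] {next, -next}) {
            partial.add(lit);
            List<Integer> found = search(clauses, partial, bound);
            partial.remove(partial.size() - 1);         // backtrack one level
            if (found != null) return found;
        }
        return null;
    }

    // True if every clause contains at least one literal of the assignment.
    static boolean allSatisfied(int[][] clauses, List<Integer> assignment) {
        for (int[] clause : clauses) {
            boolean sat = false;
            for (int lit : clause) {
                if (assignment.contains(lit)) sat = true;
            }
            if (!sat) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] clauses = {{1, 2}, {-1, 3}};            // (a1 or a2) and (not a1 or a3)
        System.out.println(minimalModel(clauses, 3));   // prints [-1, 2]
    }
}
```

Because the bound only increases after every branch within it has failed, the first model found is already minimal in size, so the remaining branches never need to be visited.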


3. Design and Implementation

The design section covers the activity diagrams associated with the development of the SMT solver. The implementation section covers the pseudocode associated with the development process, explains the components of the resulting class diagram of the SMT solver and briefly discusses the alternative implementations that were considered. The activity diagrams and the class diagrams were made using Violet UML Editor (Pellegrin & Horstmann).

3.1 Introduction to Design

White's project uses a lazy approach for the SMT solver, i.e. the SAT solver generates a propositional logic model and the Theory Reasoner then checks whether the model leads to any contradictions. Figure 3.1 explains the structure of the SMT solver resulting from this project:

SMT Solver = Parser + CDCL based SAT Solver + Theory Reasoner

The implication of the lazy approach for this project was that the DPLL based SAT solver could be substituted with the CDCL based SAT solver, with changes made only to the external design of the Parser and the Theory Reasoner. These external changes are addressed through the SMT solver activity diagram.

Figure 3.1 SMT Solver Design for this Project

The internal design of these reused components is abstracted, as no changes were made to their internal working. This helps to highlight the design and development of the CDCL based SAT solver, which was newly built for this project.

3.2 Activity diagram for the general SMT solver and the multiple model SMT Solver

The purpose of using the activity diagrams is twofold:
- To shed light on the external design modifications made to the SAT solver's interaction with the Parser and the Theory Reasoner, and
- To depict the information flow in the SMT solver.

The components of the general SMT solver (not following the dotted red components) are as follows:
- SMT Parser parses the input logic formula and assigns boolean variables to equations and predicates. This is the first point of contact with the problem file, where the problem file is interpreted and stored in the solver.
- CDCL based SAT Solver checks if the propositional logic formula is satisfiable or unsatisfiable. The internal design of this SAT solver is explained in detail in Section 3.4.
- Model Found? is a condition that checks if the SAT solver has come up with a model. If a model is found, it is passed to the Theory Reasoner to check whether the propositional logic model is consistent with the background Theory of EUF. If no model is found, the SMT solver returns unsatisfiable.
- Contradiction Found? is a condition that checks if the propositional logic model is consistent with the Theory of EUF. If there is a contradiction, the SMT solver invokes the SAT solver to look for another model. If there are no contradictions, the SMT solver returns satisfiable.
- All Clauses Have Single Variables? is a condition that checks if all clauses are unit clauses. If the solver reaches this point and all the clauses are unit clauses, then the only possible propositional logic model has been rejected by the Theory Reasoner. If the condition returns yes, the SMT solver returns unsatisfiable without invoking the SAT solver again. If the condition returns no, the SAT solver is invoked again to find alternate models. This is an improvement over White's project, as his integration of the SAT solver and the Theory Reasoner did not perform this check.

The components of the multiple model SMT solver (following the dotted red components) are as follows:
- Add model to the hash map stores the propositional logic model when it is consistent with the Theory of EUF. The solver then backtracks one level and continues the search for more models.
- HashMap empty? is a condition that checks whether the hash map is empty in order to determine if the set of clauses was satisfiable or unsatisfiable, since in both cases the last model returned will be null. An empty hash map indicates that no prior models were found, and the SMT solver returns unsatisfiable. A non-empty hash map indicates that one or more models were found and added to the hash map; the solver then prints the hash map and returns satisfiable.

Figure 3.2 shows the resulting activity diagram of the general SMT solver, and the dotted red lines show the modifications made to design the SMT solver that returns multiple models.

Figure 3.2 Activity Diagram for the SMT Solver
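The lazy loop of Figure 3.2 (find a propositional model, check it against the theory, and ask for another model on contradiction) can be sketched as below. Two simplifications are swapped in and should not be read as the project's method: the SAT solver is replaced by plain enumeration of assignments, and the Theory Reasoner is replaced by a union-find check over equations between constants only, with no uninterpreted functions or predicates and none of the rewrite rules of Section 2.5. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; names are not taken from the project's code base.
class LazySmtSketch {

    // atoms[i] = {left, right}: boolean variable i+1 stands for the equation left = right.
    // Returns a propositional model consistent with the equalities, or null if none exists.
    static boolean[] solve(int[][] clauses, String[][] atoms) {
        int n = atoms.length;
        for (int bits = 0; bits < (1 << n); bits++) {   // stand-in for the SAT solver
            boolean[] model = new boolean[n];
            for (int i = 0; i < n; i++) model[i] = ((bits >> i) & 1) == 1;
            if (satisfies(clauses, model) && consistent(atoms, model)) return model;
            // otherwise: no model here, or the theory check rejected it; try the next one
        }
        return null;
    }

    // True if every clause has a literal that is true under the model.
    static boolean satisfies(int[][] clauses, boolean[] model) {
        for (int[] clause : clauses) {
            boolean sat = false;
            for (int lit : clause) {
                if (model[Math.abs(lit) - 1] == (lit > 0)) sat = true;
            }
            if (!sat) return false;
        }
        return true;
    }

    // Stand-in theory check: merge the sides of every equation assigned true, then
    // reject the model if an equation assigned false joins two merged constants.
    static boolean consistent(String[][] atoms, boolean[] model) {
        Map<String, String> parent = new HashMap<>();
        for (int i = 0; i < atoms.length; i++) {
            if (model[i]) parent.put(find(parent, atoms[i][0]), find(parent, atoms[i][1]));
        }
        for (int i = 0; i < atoms.length; i++) {
            if (!model[i] && find(parent, atoms[i][0]).equals(find(parent, atoms[i][1]))) {
                return false;
            }
        }
        return true;
    }

    // Follows parent links to the representative of x, registering x if it is new.
    static String find(Map<String, String> parent, String x) {
        parent.putIfAbsent(x, x);
        while (!parent.get(x).equals(x)) x = parent.get(x);
        return x;
    }

    public static void main(String[] args) {
        String[][] atoms = {{"a", "b"}, {"b", "c"}, {"a", "c"}};
        // Clauses force a = b, b = c and not (a = c): satisfiable propositionally,
        // but every such model is rejected by the equality check.
        System.out.println(solve(new int[][] {{1}, {2}, {-3}}, atoms) == null);   // prints true
    }
}
```

The only point of contact between the two halves is the model handed from `satisfies` to `consistent`, which mirrors why the SAT solver could be swapped without touching the reasoner's internals.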

3.3 Activity Diagram for the SMT solver for Iterative Deepening

The design of the SMT solver that returns the minimal model is different, as the SMT solver has to set a bound and its functionality is amended to accommodate the bound. Figure 3.3 shows the resulting activity diagram for the minimal model SMT solver.

Figure 3.3 Activity Diagram for the SMT Solver that returns the minimal model

In addition to the components which have been previously explained, the activity diagram in Figure 3.3 has two new components:
- Setting Bound is an action that gives an initial bound to the solver. If a model is found with the current bound, the solver checks its consistency with the Theory Reasoner. If no model is found with the current bound, the bound is incremented by one and the cycle is restarted with the new bound.
- Timeout? is a condition needed to stop the solver. If the clauses are satisfiable, a model will eventually be found with a certain bound and the solver will then terminate. If the clauses are unsatisfiable, no model will ever be found, so a timeout decides when to terminate the solver. The timeout condition could depend either on the actual running time or on the number of variables. The latter is the more pragmatic option: the solver is stopped when the bound exceeds the number of variables in the clause set.

3.4 Activity diagram for the CDCL based SAT Solver

The general purpose CDCL based SAT solver presented below is contained in both the general SMT solver and the multiple model SMT solver. Figure 3.4 shows the resulting activity diagram and the associated components are explained below:
- Status of Clauses? is a condition that checks the status of the propositional logic formula at the start of each cycle. If all clauses are satisfied, the SAT solver returns satisfiable. If not all clauses are satisfied, it proceeds with the rest of the cycle.
- Unit Propagation and DecideNextBranch: the clause set is simplified with this action. The decide next branch action chooses a literal, assigns it a true or false value, and adds it to the implication graph. The unit propagation action reflects this decision on the set of clauses, infers implied decisions and adds these to the implication graph.
- Conflict? is a condition that checks for a conflict in the implication graph. If there is no conflict, the solver goes back to the Status of Clauses? condition. Otherwise it proceeds with the conflict analysis.
- Conflict Analysis: the solver analyses the implication graph to get the Unique Implication Point. Depending on the learning scheme (Rel_Sat, 1-UIP), the solver partitions the graph, derives the new clause and gets the backjump level.
- BackJumpLevel<1?: if the backjump level is less than one, the SAT solver returns unsatisfiable. Otherwise, the SAT solver performs the rollback and returns to the backjump level. The rollback involves resetting the clauses and variables which assumed their values after the backjump level.

Figure 3.4 Activity Diagram for the CDCL based SAT Solver

3.5 Activity diagram for the CDCL based SAT Solver for Iterative Deepening method

The CDCL based SAT solver contained in the SMT solver that returns the minimal model using the Iterative Deepening method has additional components. The difference arises from having to accommodate the bound, which is central to the Iterative Deepening method. Figure 3.5 presents the resulting activity diagram for the CDCL based SAT Solver that returns the minimal model. The newly added components are as follows:

- Exceed Bound?: this condition checks whether the model size is greater than or equal to the bound. If so, the SAT Solver returns unsatisfiable (with the current bound). Otherwise, the CDCL procedure continues as normal.

- The functionality of Decide Next Branch is amended to compare the bound and the model size before making any new decision. If the model size is less than the bound, a new decision is made. If not, the model is flagged as not being within the bound and no new decision is made.

Figure 3.5 Activity diagram for the SAT Solver that returns Minimal Model
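The outer loop of the Iterative Deepening method (set a bound, run the bounded SAT solver, increment the bound on failure, stop once the bound exceeds the number of variables) can be sketched as follows. This is an illustrative sketch only: `IterativeDeepeningSketch` and `findMinimalModel` are hypothetical names, the bounded solver is abstracted as a function from a bound to an optional model, and the consistency check with the Theory Reasoner is left out.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.IntFunction;

// Hypothetical sketch of the bound-incrementing loop; not the project's code.
final class IterativeDeepeningSketch {

    /**
     * Calls the bounded solver with increasing bounds and returns the first model found.
     * The search gives up once the bound exceeds the number of variables, which plays
     * the role of the variable-based timeout condition.
     */
    static Optional<List<Integer>> findMinimalModel(
            IntFunction<Optional<List<Integer>>> boundedSolver,
            int initialBound,
            int numVariables) {
        for (int bound = initialBound; bound <= numVariables; bound++) {
            Optional<List<Integer>> model = boundedSolver.apply(bound);
            if (model.isPresent()) {
                return model; // first bound at which a model exists
            }
        }
        return Optional.empty(); // no model within any admissible bound
    }
}
```

Under these assumptions, the returned model is only guaranteed to be minimal when the initial bound does not overshoot the size of the minimal model, which is why the choice of the initial bound matters.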

3.6 Implementation of the CDCL based SAT Solver

This section covers how the pseudocode for a commonplace CDCL algorithm was adapted to make it suitable for this project.

Pseudocode for the CDCL algorithm in SMT Solver for Theory of EUF

Figure 3.6 (left) shows the pseudocode for CDCL in a commonplace SAT solver, adapted from (Marques-Silva, et al., 2008). A commonplace CDCL implementation is only required to return the first model it encounters in the search space. In this project, however, CDCL is part of a larger SMT solver, and a model generated by CDCL may be rejected by the Theory Reasoner. In that case, CDCL has to resume from the point at which the rejected model was found, backtrack, and continue the search for another model. Hence the pseudocode in Figure 3.6 (left) has to be modified to add this functionality.

Figure 3.6 (left):

    procedure CDCL(S)
    input: set of clauses S
    output: satisfiable or unsatisfiable
    begin
        while (not allvariablesassigned)
            decide_next_branch()
            decisionlevel = decisionlevel + 1
            unitPropagation()
            if checkforconflict = CONFLICT then
                backjumplevel = conflictanalysis()
                if (backjumplevel <= 0) then
                    return UNSATISFIABLE
                else
                    backtrack(backjumplevel)
        return SATISFIABLE
    end

Figure 3.6 (right):

    procedure CDCL(S)
    input: set of clauses S
    output: satisfiable or unsatisfiable
    begin
        if status = CONFLICT then
            retractlevel = getretractlevel()
            if retractlevel <= 0 then
                return UNSATISFIABLE
            else
                retracttolevel(retractlevel)
        while (not allvariablesassigned)
            decide_next_branch()
            decisionlevel = decisionlevel + 1
            unitPropagation()
            if checkforconflict = CONFLICT then
                backjumplevel = conflictanalysis()
                if (backjumplevel <= 0) then
                    return UNSATISFIABLE
                else
                    backtrack(backjumplevel)
        return SATISFIABLE
    end

Figure 3.6 Pseudocode for the CDCL algorithm in SMT solver for Theory of EUF (Marques-Silva, et al., 2008)

The modified pseudocode is displayed in Figure 3.6 (right). When the Theory Reasoner rejects the model, the status of the solver is set to Conflict. The SAT Solver is then prompted to come up with another solution. The getretractlevel() method moves upwards in the decision tree to find a node that has not been flipped yet. If the level returned is a valid level, the solver backtracks to it and continues to search for a model. Otherwise, all the values have been flipped, no model exists, and the solver returns unsatisfiable.

Pseudocode for CDCL in Iterative Deepening Minimal Model method

The pseudocode is further altered for the Iterative Deepening method, and the changes are highlighted in Figure 3.7. In the decide next branch method, if the model size is not less than the bound, then "Exceed" is appended to the model. The modification made to the pseudocode in Figure 3.7 checks whether "Exceed" has been appended and, if so, returns unsatisfiable with the current bound to the SMT solver.

    procedure CDCL(S)
    input: set of clauses S
    output: satisfiable or unsatisfiable
    begin
        if status = CONFLICT then
            retractlevel = getretractlevel()
            if retractlevel <= 0 then
                return UNSATISFIABLE
            else
                retracttolevel(retractlevel)
        while (not allvariablesassigned)
            decide_next_branch()
            if model != null and model.get(0).equals("Exceed") then
                return UNSATISFIABLE
            decisionlevel = decisionlevel + 1
            unitPropagation()
            if checkforconflict = CONFLICT then
                backjumplevel = conflictanalysis()
                if (backjumplevel <= 0) then
                    return UNSATISFIABLE
                else
                    backtrack(backjumplevel)
        return SATISFIABLE
    end

Figure 3.7 Pseudocode for CDCL in Iterative Deepening method

3.7 Implementation of the SMT Solver for Theory of EUF

The resulting class diagram of the SMT solver is documented in Figure 3.8. It is referred to throughout the explanation of the overall implementation of the SMT solver.

SMTSolverCDCL Class

The class contains the main method. It sets the file name and calls the SMT Controller object, which invokes the SMT solver on the problem in that file. The elapsed time for the SMT solver to generate a model is calculated in this class.

SMTControllerCDCL Class

The class governs the interactions between the SAT Solver and the Theory Reasoner. As shown in Figure 3.8, it contains the extractclauses(), processproblem() and areselectionsvalid(clause) methods. The extractclauses() method invokes the parser, which parses the problem and extracts the set of clauses. The processproblem() method invokes the CDCL based SAT solver, which answers the Satisfiability problem. The areselectionsvalid(clause) method invokes the Theory Reasoner, which checks whether the model is consistent with the Theory of EUF.

Figure 3.8 Class Diagram for the SMT solver
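The interaction that the controller governs can be pictured with a short sketch of the lazy loop: the SAT side proposes a model, the theory side accepts or rejects it, and a rejection prompts the SAT side for another model. The names below (`LazySmtLoopSketch`, `solve`) are hypothetical, and the SAT solver and Theory Reasoner are stood in for by an `Iterator` of models and a `Predicate`, which is a simplification of the actual classes in Figure 3.8.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of the lazy SAT/theory interaction; not the project's controller.
final class LazySmtLoopSketch {

    /**
     * Asks the SAT side for models one at a time and returns the first one that the
     * theory check accepts. An exhausted iterator means that no further model exists.
     */
    static Optional<List<Integer>> solve(Iterator<List<Integer>> satModels,
                                         Predicate<List<Integer>> theoryAccepts) {
        while (satModels.hasNext()) {
            List<Integer> model = satModels.next();
            if (theoryAccepts.test(model)) {
                return Optional.of(model);
            }
            // Rejected by the theory check: resume the SAT search for another model.
        }
        return Optional.empty();
    }
}
```

In the solver itself the "next model" step corresponds to setting the status to Conflict and re-entering the CDCL procedure, rather than iterating over a precomputed collection.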

CDCLSolver Class

The class contains the SAT Solver, and its methods are indicated in Figure 3.8. enginecdcl() is the main control method. decide_next_branch(), unitpropagation() and conflictanalysis() are the methods that perform the functions associated with CDCL. isonlyactivevariable() and isnotinmodel() are helper methods that hold refactored code used in several places.

An improved implementation of the unit propagation method and the decide next branch method was employed in this class. In the CDCL algorithm, when a clause is not yet true and has only one remaining unassigned variable, that variable has to be added to the Implication Graph as an implied decision. This is essentially what causes the conflict and is the essence of the whole CDCL algorithm. Rather than adding the functionality to spot implied decisions to White's implementation of the unit propagation method, the whole unit propagation method was developed from scratch. This was done so that the resulting implementation is integrated and more efficient than it would have been had the functionality simply been bolted onto White's unit propagation implementation.

In addition, the decide next branch method in White's project always picked the first variable of the first unsatisfied clause. This variable picking was randomised in my implementation of the same method. This was initially done to direct the propagation towards a conflict so that the working of the conflict analysis method could be observed, and it was retained in the final implementation of the method.

Implication Graph Class

The ImplicationGraph object represents the graph that is central to the functioning of CDCLSolver. The attribute listofnodes is of type ArrayList<ImplicationGraphNode> and keeps a record of the nodes that are part of the Implication Graph. In addition to the attributes, there are methods to add and remove nodes, which are used by CDCLSolver.

The use of ArrayList<ImplicationGraphNode>, i.e. the ArrayList data structure storing elements of type ImplicationGraphNode, was at the heart of the implementation of both the Implication Graph and the Implication Graph Node. As ArrayList is an implementation of the List interface, its use gave access to all the methods associated with Lists and made it easier to add and retrieve information.

Implication Graph Node Class

The ImplicationGraphNode class represents the node object that makes up the ImplicationGraph. As indicated in Figure 3.8, the node attributes are the name, the decisionlevel, the variable object and the list of previous nodes. The class contains get() methods to simplify access to the node attributes. The list of previous nodes is of type ArrayList<ImplicationGraphNode>.
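To make the list-of-previous-nodes idea concrete, the sketch below shows a node type that stores its predecessors in an ArrayList and a traversal that walks backwards from a conflict to the decisions behind it. It is an assumption-laden illustration rather than the project's class: `GraphNodeSketch` and `decisionsBehind` are hypothetical names, the variable object is omitted, and a node without predecessors is treated as a decision node.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for an implication graph node; field names are illustrative.
final class GraphNodeSketch {
    final String name;
    final int decisionLevel;
    // Edges are stored backwards: each node remembers the nodes that implied it.
    final List<GraphNodeSketch> previousNodes = new ArrayList<>();

    GraphNodeSketch(String name, int decisionLevel) {
        this.name = name;
        this.decisionLevel = decisionLevel;
    }

    /**
     * Follows the previous-node lists backwards from 'start' and collects the names
     * of all nodes without predecessors, which this sketch treats as decisions.
     */
    static Set<String> decisionsBehind(GraphNodeSketch start) {
        Set<String> decisions = new TreeSet<>();
        Set<GraphNodeSketch> visited = new HashSet<>();
        Deque<GraphNodeSketch> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            GraphNodeSketch node = pending.pop();
            if (!visited.add(node)) {
                continue; // already explored via another path
            }
            if (node.previousNodes.isEmpty()) {
                decisions.add(node.name);
            } else {
                for (GraphNodeSketch previous : node.previousNodes) {
                    pending.push(previous);
                }
            }
        }
        return decisions;
    }
}
```

For a conflict implied by x3 and x2, where x3 was itself implied by the decisions x1 and x2, the traversal returns {x1, x2}. No separate edge structure is needed because every edge is recoverable from the previous-node lists.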

3.8 Implementation Considerations and Reflections

Using ArrayList<type> over Graph based implementation to model Implication Graph

The initial choice for the implementation of the Implication Graph was a Graph based data structure, as the Implication Graph is a directed acyclic graph. However, the implementation was done using ArrayList<ImplicationGraphNode>. The sole functionality desired of the Implication Graph was the ability to trace the paths from the conflict back to the decision node, since these paths are used for conflict analysis. This functionality was easily achievable with an ArrayList.

All the nodes in the ImplicationGraph are recorded in a list. Each element in that list, i.e. each node, keeps a record of its previous nodes in another list. The previous nodes list of each node can be used to trace the edges that would have existed in a graph representation, so the nodes can be pieced together to represent the Implication Graph. This representation of the Implication Graph is documented in Appendix B.

Using ArrayList provides a lightweight representation of the implication graph. More importantly, it left room for scaling to large problems. A Graph based implementation would have been more memory intensive and would have imposed problem size restrictions on the solver. In retrospect, using ArrayLists was a wise decision, as it simplified the development process and resulted in a lightweight solver.

Reusing underlying implementation from White's project

While designing the SAT solver, some of the files were reused from White's SAT Solver. In the initial stages of the project this seemed practical: had the files been rewritten, the new files would have contained mostly the same code. In the more advanced stages of the project, however, sharing files between White's solver and my solver imposed design restrictions on my solver. For instance, rather than designing the functionality in the best possible manner, compromises sometimes had to be made so that the design obeyed the structure of the shared files. The structure of a shared file could not be changed, as that would affect White's solver.

The files were initially shared in the interest of time, but this led to more time being lost in development and debugging. In retrospect, this approach did not adhere to the high cohesion, low coupling design principle (Serrat, 2014). The solver's development could have been more streamlined if better design discipline had been observed. This was an effective learning experience for future projects.

4. Testing, Benchmarking and Evaluation

The following testing was done to ensure the correctness of the SMT solver for this project.

Unit Testing and Module Testing

For the unit testing, the unit tests from White's project were reused to test some parts of the solver. As the Parser and the Theory Reasoner were reused from White's project with minimal changes, their testing was not as thorough as the testing of the SAT solver. For the module testing of the SAT solver, simple SAT problems adapted from the COMP21111 Logic and Modelling notes (Voronkov, 2014) were executed on the SAT solver and the results were cross-checked. An example run of one such problem, twosat, is detailed in Appendix C.

Integration Testing

At the topmost abstraction layer, the SMT solver can be split into three modules: Parser, SAT Solver and Theory Reasoner. Integration Testing was perhaps the most important testing for this project. The Parser and the Theory Reasoner were reused from White's project, and White's DPLL based SAT solver was replaced with the newly developed CDCL based SAT solver. Integration Testing was therefore essential to ensure that the newly inserted CDCL based SAT solver works well with the other two components.

The simple tests were problems that were satisfiable, or unsatisfiable, for both the SAT solver and the Theory Reasoner. The eqdiamond problem (Stirchman, et al., 2005) is an example of a simple test. The edge tests were problems that were satisfiable for one of the modules, say the SAT solver, and unsatisfiable for the other, the Theory Reasoner, or vice versa. For instance, threeunsat is a crafted SMT problem with three unit clauses, two of which represent contradicting equations. threeunsat has a straightforward propositional logic model but always leads to contradictions in the Theory Reasoner. The runs of eqdiamond and threeunsat are captured in Appendix C.

Every time these problems were executed, the integration with the parser was tested as well: the expected answer was generated for the given problem, implying that the parser worked accurately.

Validation Testing

The validation was two-fold. Firstly, the output from my solver was compared with the output from White's solver. Secondly, the problems were executed on the online Z3 solver (Microsoft, 2012) and the results were cross-checked with the output from my SMT solver. The example output from the online Z3 solver for threeunsat is shown in Appendix C.

4.1 Benchmarking for the DPLL based SMT Solver and the CDCL based SMT Solver

Benchmarking was done to compare the performance of White's SMT solver and the solver developed in this project. The problems were obtained from an external benchmark set. To get a wide range of results, both satisfiable and unsatisfiable problems with a varying number of variables (20, 50, 75, 100, 125) were considered. Table 4.1 summarizes the performance of the DPLL based SMT Solver and the CDCL based SMT Solver. Graph 4.1 to Graph 4.6 plot the benchmarking results.

Columns: Number of Variables / File type; SAT/UNSAT; Average DPLL Time; Average CDCL Time

uf20     SAT
uf50     SAT
uf75     SAT
uf100    SAT
uf125    SAT      Timeout
uf150    SAT      Timeout
uuf50    UNSAT
uuf100   UNSAT
uuf125   UNSAT    Timeout
uuf150   UNSAT    Timeout

Table 4.1 Benchmarking Results for DPLL and CDCL

Graph 4.1 Time/s against Benchmark Number for uf20 files, that is satisfiable problems with 20 variables (series: DPLL, CDCL)

Graph 4.2 Time/s against Benchmark Number for uf50 files, that is satisfiable problems with 50 variables (series: DPLL, CDCL)

Graph 4.3 Time/s against Benchmark Number for uf75 files, that is satisfiable problems with 75 variables (series: DPLL, CDCL)

Graph 4.4 Time/s against Benchmark Number for uf100 files, that is satisfiable problems with 100 variables (series: DPLL, CDCL)

Graph 4.5 Time/s against Benchmark Number for uuf50 files, that is unsatisfiable problems with 50 variables (series: DPLL, CDCL)

Graph 4.6 Time/s against Benchmark Number for uuf100 files, that is unsatisfiable problems with 100 variables (series: DPLL, CDCL)

The following summarizes the observations made from the graphs and offers an explanation for each.

CDCL based SMT solver is faster than DPLL based SMT solver

As can be inferred from Graph 4.1 to Graph 4.6, the CDCL based SMT solver is consistently faster than the DPLL based SMT solver. This is because CDCL's conflict driven backjumping outperforms DPLL's naïve backtracking. While DPLL walks, CDCL hops around the exploration tree, skipping branches where there is no answer. This is the main reason why the CDCL based SMT solver fares better than the DPLL based SMT solver.

In addition, as discussed in Chapter 3: Design and Implementation, this project has a more efficient implementation of the unit propagation and decide next branch methods, and the integration of the SAT solver and the Theory Reasoner was improved. The speedup from these changes is not as significant as the speedup resulting from the use of CDCL over DPLL, but they do optimise the SMT solver to some extent.

Unsatisfiable problems take longer than Satisfiable problems

This can be seen by comparing the average times of satisfiable and unsatisfiable problems of the same size (number of variables) in Table 4.1. The reason is that proving that a problem is satisfiable is arguably easier than proving that it is unsatisfiable.

To prove that a problem is satisfiable, the solver needs to find one model that satisfies the set of clauses. For the DPLL algorithm, this involves chronologically traversing the search space until the first model is found. For the CDCL algorithm, this involves non-chronologically traversing the search space until the first model is found.

To prove that a problem is unsatisfiable, the solver needs to exhaustively traverse the search tree, which takes longer. For the DPLL algorithm, this involves traversing the entire search tree and ensuring that no model exists at any of the leaf nodes. For the CDCL algorithm, this involves encountering multiple conflicts and learning a new clause from each conflict. After a certain number of conflicts, the expanded set of clauses prevents the solver from making any further decisions. This is CDCL's way of covering the entire search space and ensuring that no model exists.

To summarize, for both algorithms, proving that a problem is satisfiable requires a partial traversal of the search space, whereas proving that it is unsatisfiable requires a complete traversal. This explains why the latter takes longer.

Performance varies between benchmarks for same variable sized problems for the same method (CDCL, DPLL)

This is seen in the gradual increase in solving time as the benchmarks go from 1 to 100. The benchmarks are of differing difficulty, so the time taken rises gradually between the first problem and the hundredth. The early problems are straightforward, and it is relatively easy for the solver to make the assignments that generate the model. The later problems are more complicated, so the solver takes more time to make the assignments that generate the model.
4.2 Experimentation with the different CDCL learning schemes

The CDCL based SMT Solver was executed with two different learning schemes, namely 1-UIP and Rel_Sat. Table 4.2 summarizes the results and Graph 4.7 plots the performance of the two learning schemes.

Columns: Number of Variables / File type; Average time for 1-UIP; Average time for Rel_Sat

Table 4.2 Benchmarking Results for 1-UIP and Rel_Sat

Graph 4.7 Time/s against Number of Variables for 1-UIP and Rel_Sat (series: 1-UIP Time, Rel_Sat Time)

The two schemes represent two schools of thought: one holds that the cause of the conflict lies closer to the conflict, and the other that the cause lies closer to the root. The results indicate that the cause is probably closer to the conflict than to the root. The results are also consistent with the research of (Zhang, et al., 2001), whose results likewise indicate that the learning scheme based on the first UIP is effective in solving SAT problems in comparison with other schemes.

More backjumping and rollback involved in Rel_Sat

This is more of a personal evaluation of the performance of the Rel_Sat scheme. An implication graph starts from the left and grows towards the right as new decisions are added. The larger the problem, the wider the implication graph. When a conflict occurs, it occurs closer to the right end, as the new assignments and the implied assignments are added on the right.

As the 1-UIP scheme considers the first UIP, the cut made is closer to the right end of the implication graph. For Rel_Sat, which considers the last UIP, the cut is closer to the left end of the implication graph. The back jump level returned after analysing the cut is likely to be lower for Rel_Sat, as it will belong to one of the earliest decisions. The back jump level for 1-UIP will be higher, as it will belong to one of the latest decisions.
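The difference between the first and the last UIP can be illustrated on a small graph. The sketch below is not the conflict analysis used in the solver; it is a simple path-counting formulation chosen for clarity. A node is a UIP when every path from the decision node to the conflict passes through it, which holds exactly when the number of paths through the node equals the total number of paths. The class name `UipSketch`, the string node names and the successor map are assumptions, and the graph is assumed to be acyclic and restricted to the current decision level.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: finds UIPs by counting paths in a small acyclic graph.
final class UipSketch {

    /** Counts the paths from 'from' to 'to'; 'memo' caches counts for the fixed target 'to'. */
    static long countPaths(Map<String, List<String>> edges, String from, String to,
                           Map<String, Long> memo) {
        if (from.equals(to)) {
            return 1;
        }
        Long cached = memo.get(from);
        if (cached != null) {
            return cached;
        }
        long total = 0;
        for (String next : edges.getOrDefault(from, Collections.emptyList())) {
            total += countPaths(edges, next, to, memo);
        }
        memo.put(from, total);
        return total;
    }

    /** Returns the UIPs ordered from the decision node towards the conflict. */
    static List<String> uips(Map<String, List<String>> edges, String decision, String conflict) {
        List<String> result = new ArrayList<>();
        long all = countPaths(edges, decision, conflict, new HashMap<>());
        if (all == 0) {
            return result; // the conflict is not reachable from the decision
        }
        String current = decision;
        while (!current.equals(conflict)) {
            long through = countPaths(edges, decision, current, new HashMap<>())
                    * countPaths(edges, current, conflict, new HashMap<>());
            if (through == all) {
                result.add(current); // every decision-to-conflict path uses this node
            }
            // Step along any edge that still leads to the conflict; every UIP lies on it.
            for (String next : edges.getOrDefault(current, Collections.emptyList())) {
                if (countPaths(edges, next, conflict, new HashMap<>()) > 0) {
                    current = next;
                    break;
                }
            }
        }
        return result;
    }
}
```

In the returned list the first element is the decision node itself, which is the last UIP, and the final element is the UIP nearest the conflict, which is the first UIP. Path counts can overflow a `long` on large graphs, so this is suitable only as an illustration.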

For Rel_Sat, more decisions have to be rolled back and more clauses have to be reset, so more computation is involved. I believe that this is why Rel_Sat takes longer than 1-UIP.

So, 1-UIP or Rel_Sat?

I believe the performance of the two schemes differs because of the additional rollback involved in the Rel_Sat scheme. With a more efficient implementation of the rollback section, the two schemes may be on a par with each other. Though better performance was observed on average for the 1-UIP scheme, the performance of Rel_Sat is still better than that of White's DPLL solver.

To sum up this section, the 1-UIP scheme performs better than the Rel_Sat scheme, but only by a small margin. Hence, for future implementations of the CDCL algorithm, the Rel_Sat scheme should not be ruled out.

4.3 Multiple Model method and Iterative Deepening method

This section explores the effectiveness of the proposed Iterative Deepening method in finding the minimal model. This bounded method is compared to the Multiple Model method, which computes all the models and then compares their sizes. An example run of the Multiple Model method and the Iterative Deepening method is documented in Appendix D.

Columns: Number of Variables / Filetype; Average Multiple Model method Time; Average Iterative Deepening method Time; Average Model Size

uf
uf
uf
uf
uf

Table 4.3 Benchmarking Results for Multiple Model method and Iterative Deepening method


Graph 4.8 Time/s against Number of Variables for Multiple Model and Iterative Deepening method (categories: uf10, uf20, uf50, uf75, uf100; series: Multiple Model Time, Iterative Deepening Time)

The relative effectiveness of the iterative deepening method, as summarized by Table 4.3 and Graph 4.8, can be attributed to the exponential increase in time for the multiple model method.

Exponential increase in time for Multiple Model method

As the number of variables in the logical formula increases, the time taken by the multiple model method increases exponentially. This is because the search space grows exponentially with the number of variables, which is the limitation of the multiple model method. Where the multiple model method falters, the iterative deepening method fares well. The exponentially larger search space is pruned by the deepening method, as the bound limits the number of nodes that need to be explored each time.

The following improvements can be applied to the Iterative Deepening method to further improve its efficiency.

Better estimation of the Initial Bound using Average Model Sizes

The closer the bound is to the actual size of the minimal model, the faster the Iterative Deepening method completes. Each time the bound fails to deliver a model, the solver has to increment the bound and start again with the new bound. A better approximation of the initial bound requires fewer reruns of the solver and speeds up the method.

The model sizes were expected to be between ½ and ¾ of the number of variables in the given formula. However, the average model size column in Table 4.3 shows that the actual model size is about 9/10 of the number of variables in the formula. There is a stark difference between the expected and the actual measure. This could be overcome by training a machine learning algorithm on pairs of number of variables and average model size. This could help to determine a better initial bound and speed up the Iterative Deepening method.

Setting the bound dynamically

The bound could be set on the run, i.e. dynamically, in the following manner. Prior to finding the first model, the bound is unset. After the first model is found, the bound is set to the size of that model. From then on, the tree is only explored for as many decisions as the bound allows. If a smaller model is encountered, the bound is set to the size of the smaller model and the exploration continues. If no smaller model is found, the current model is returned.
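The dynamic bound idea can be sketched as a small branch-and-bound search. The code below is a hedged illustration, not part of the project: `DynamicBoundSketch` is a hypothetical class, clauses are arrays of signed integers, a model is taken to be a partial assignment that satisfies every clause, and its size is the number of assigned variables. The search is a plain exhaustive enumeration rather than CDCL, so it only suits very small formulas.

```java
// Hypothetical sketch of setting the bound dynamically; exhaustive search, not CDCL.
final class DynamicBoundSketch {
    private final int[][] clauses;   // literals as signed integers, variables numbered from 1
    private final int numVars;
    private int bound = Integer.MAX_VALUE; // unset until the first model is found
    private int[] best;                    // per variable: 0 unassigned, 1 true, -1 false

    DynamicBoundSketch(int[][] clauses, int numVars) {
        this.clauses = clauses;
        this.numVars = numVars;
    }

    /** Returns a smallest satisfying partial assignment, or null if none exists. */
    int[] solve() {
        search(1, new int[numVars + 1], 0);
        return best;
    }

    private void search(int var, int[] assignment, int size) {
        if (size >= bound) {
            return; // cannot be smaller than the model already found
        }
        if (allSatisfied(assignment)) {
            bound = size; // tighten the bound to the size of the smaller model
            best = assignment.clone();
            return;
        }
        if (var > numVars) {
            return;
        }
        for (int value : new int[] {1, -1}) {
            assignment[var] = value;
            search(var + 1, assignment, size + 1);
        }
        assignment[var] = 0;
        search(var + 1, assignment, size); // also try leaving this variable unassigned
    }

    private boolean allSatisfied(int[] assignment) {
        for (int[] clause : clauses) {
            boolean satisfied = false;
            for (int lit : clause) {
                if (assignment[Math.abs(lit)] == Integer.signum(lit)) {
                    satisfied = true;
                    break;
                }
            }
            if (!satisfied) {
                return false;
            }
        }
        return true;
    }

    /** Number of assigned variables in a model returned by solve(). */
    static int size(int[] model) {
        int count = 0;
        for (int value : model) {
            if (value != 0) {
                count++;
            }
        }
        return count;
    }
}
```

For the clauses (x1 ∨ x2) and (¬x1 ∨ x3), the first model this sketch encounters assigns all three variables, after which the bound of 3 prunes the remaining branches until a model of size 2 is found and returned.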

5. Reflection and Conclusion

This section presents the achievements associated with the project, ideas for further research and the important lessons learned from COMP

5.1 Project Achievement

The project aim, to optimise an SMT solver for the Theory of EUF, was fulfilled. Extensive research was carried out into advanced topics associated with logic and modelling. The CDCL based SAT solver was developed in Java and integrated with the Parser and the Theory Reasoner to build an SMT solver, and additional features were added to it. The performance of the solver was measured and the improvement in efficiency was evaluated.

5.2 Further Implementation

This project continues to use the lazy approach, which was used in White's project, for the interaction between the SAT Solver and the Theory Reasoner. In the lazy approach, the SMT solver waits for the boolean model before it checks consistency. I am very curious as to how the SMT solver would fare when implemented using the eager approach.

The following example, taken from (Dutertre & Moura, 2012), demonstrates the working of an eager approach based SMT solver that interleaves boolean propagation and theory reasoner calls. This works towards pruning the SAT solver search space.

For the given logic formula

x + y ≥ 0 ∧ (x = z ⇒ z + y = −1) ∧ z > 3t

replacing the atoms by boolean variables gives

a ≡ x + y ≥ 0, b ≡ x = z, c ≡ z + y = −1 and d ≡ z > 3t

After some initial decision making, let us assume the partial model is {a, d, c}. In such a scenario, the eager approach based SMT solver should realise that the current assignments imply

x + y ≥ 0 ∧ z + y = −1 ⇒ ¬(x = z)

So b should be assigned false in the SAT solver.

An eager SMT solver transcends the view in which an SMT solver is an extension of a SAT solver. This points towards more dedicated SMT solvers, which are not built as an extension of SAT solvers but are built from scratch for the sole purpose of solving SMT problems. This area can be explored further, and one can hope for promising results and improvements in the efficiency of SMT solvers.

5.3 Knowledge Gained

Logic and Modelling Knowledge

There was a steep learning curve in the initial stages of the project. The COMP21111 Logic and Modelling module in second year did provide an introduction to computational logic and to algorithms for solving SAT problems, such as DPLL. However, the concepts associated with this project went beyond COMP21111 and had to be researched and learned from scratch. The project involved researching CDCL from journal articles, figuring out how the algorithm worked, researching possible extensions, and implementing CDCL and the extensions in Java. It is safe to assume that the only three undergraduates who have a complete understanding of CDCL are the students who worked on this project. That speaks volumes about the difficulty level of the project.

On a positive note, the project gave me an opportunity to go above and beyond the second year Logic and Modelling module and explore this field in more depth. I have developed a good understanding of some of the more advanced research associated with SAT solvers, and this will make for some very interesting interview conversations in the near future.

Design Lessons

This experience taught me important design lessons. I had come across the high cohesion, low coupling design principle prior to the project, but it took this project for me to actually experience its value. While developing the solver, I could have obeyed the low coupling principle to a greater extent. I feel that the overall SAT solver design would have been much better if my SAT solver and White's SAT solver had not used the same files. Keeping them separate would have given me complete freedom design-wise and led to a better design of the SAT solver.

For anyone working on similar extension projects, I would highly recommend creating your own files and adapting the code piece by piece into them. Keep the interaction between your part of the system and the project you are extending to a minimum. Creating your own files and adapting the code from the existing files is better than modifying the existing files in place. Prioritise design over convenience.

5.4 Personal Development

The most important soft skills polished through this experience were my presentation and explanation skills. CDCL is not the most straightforward algorithm, and explaining it to other people is not an easy task. Having to explain it for my seminar and demonstration required careful structuring of the concepts and separating the explanation into layers. I was able to explain the workings of CDCL to audiences, and that made me realise that I was getting better at structuring my explanations. Each time, I received great feedback from my supervisor and the second marker. I believe that polishing this skill will be very valuable in the near future. In addition, the skills usually associated with a project of this magnitude, such as time management, organisation and punctuality, were also polished.

5.5 Conclusion

Overall, working on the project was a thoroughly enriching experience. I improved my Java programming skills and gained a great deal of knowledge about SAT and SMT solvers. The whole experience also gave me an insight into the challenges associated with complex projects.

Bibliography

Bachmair, L., Tiwari, A. & Vigneron, L., Abstract Congruence Closure. Journal of Automated Reasoning, pp
Biere, A., Cimatti, A., Clarke, E. & Zhu, Y., March Symbolic model checking without BDDs. Tools and Algorithms for the Construction and Analysis of Systems, pp (*)
Chen, P. & Keutzer, K., November Towards true crosstalk noise analysis. International Conference on Computer-Aided Design, pp (*)
Dutertre, B. & Moura, L. d., Satisfiability Modulo Theories Equalities + Uninterpreted Functions (EUF) Linear Arithmetic, Summer School on Formal Techniques, Menlo Park. [Online] Available at: [Accessed 19 March 2016].
Fuhs, C. et al., SAT solving for termination analysis with polynomial interpretations. International Conference on Theory and Applications of Satisfiability Testing, pp (*)
Jackson, D., Schechter, I. & Shlyakhter, I., Alcoa: the Alloy constraint analyzer. International Conference on Software Engineering, pp (*)
Khurshid, S. & Marinov, D., TestEra: Specification-based testing of Java programs using SAT. Autom. Softw. Eng., 11(4), pp
Larrabee, T., Test pattern generation using Boolean Satisfiability. IEEE Transactions on Computer-Aided Design, 11(1), pp
Larrosa, J., Lynce, I. & Marques-Silva, J., Satisfiability: Algorithms, Applications and Extensions (SAC 2010), s.l.: s.n.
Liberatore, P., Algorithms and Experiments on Finding Minimal Models, s.l.: Department of Computer and System Sciences, University of Rome "La Sapienza".
Lynce, I. & Marques-Silva, J., July Efficient haplotype inference with Boolean satisfiability. National Conference on Artificial Intelligence. (*)
Marques-Silva, J., Practical Applications of Boolean Satisfiability, s.l.: s.n.
Marques-Silva, J., Lynce, I. & Malik, S., Chapter 4: Conflict-Driven Clause Learning SAT Solvers, s.l.: s.n.
McGeer, P. C. et al., Timing analysis and delay-fault test generation using path-recursive functions. International Conference on Computer-Aided Design, pp (*)
Moura, L. d. & Bjørner, N., Z3: An Efficient SMT Solver, s.l.: s.n.
Nieuwenhuis, R., Oliveras, A. & Tinelli, C., Solving SAT and SAT Modulo Theories: From an Abstract Davis-Putnam-Logemann-Loveland Procedure to DPLL(T). Journal of the ACM (JACM), 53(6), pp

Pellegrin, A. d. & Horstmann, C., Last Updated Violet UML Editor at SourceForge.net. [Online] Available at: [Accessed December 2015].
Research, M., n.d. rise4fun from Microsoft. [Online] Available at: [Accessed February 2016].
Selman, B. & Kautz, H., Planning as satisfiability. European Conference on Artificial Intelligence, pp (*)
Serrat, J., Object-oriented design: GRASP patterns, s.l.: s.n.
Smith, A., Veneris, A. G., Ali, M. F. & Viglas, A., Fault diagnostics and logic debugging using Boolean satisfiability. IEEE Transactions on Computer-Aided Design, 24(10), pp (*)
Stirchman, O., Rozanov, M. & Moura(Translator), L. d., Generating minimum transitivity constraints in P-time for deciding Equality Logic, SMT Workshop 2005, s.l.: s.n.
Tichy, R. & Glase, T., Clause Learning in SAT, Seminar Automatic Problem Solving, WS 2005/06, s.l.: Faculty of Computer Science, University of Potsdam.
Voronkov, A., Logic and Modelling Notes. [Online] Available at: [Accessed October 2015].
Zhang, L., Madigan, C. F., Moskewicz, M. H. & Malik, S., Efficient Conflict Driven Learning in a Boolean Satisfiability Solver. Proceedings of the 2001 IEEE/ACM International Conference on Computer-aided Design (ICCAD '01), pp

* implies Secondary Reference.

Appendices

Appendix A

This appendix covers the rewrite rules used to compute the Congruence Closure. The Extension, Simplification, Orientation, Deletion, Collapse and Deduction rules are described below. The transition rules manipulate triples $(K, E, R)$, where $K$ is a set of constants, $E$ a set of equations and $R$ a set of rewrite rules. The rules are taken from (Bachmair, et al., 2003) and each is accompanied by a brief description.

Extension
$(K, E[t], R) \vdash (K \cup \{c\}, E[c], R \cup \{t \to c\})$
This rule is used when a term in an equation is replaced with a constant and the rule $t \to c$ is appended to the congruence closure.

Simplification
$(K, E[t], R \cup \{t \to c\}) \vdash (K, E[c], R \cup \{t \to c\})$
This rule makes sure that the elements of $E$, terms and constants alike, are represented by the constant that is highest in the total ordering.

Orientation
$(K \cup \{c\}, E \cup \{t \approx c\}, R) \vdash (K \cup \{c\}, E, R \cup \{t \to c\})$
This rule generates an ordering for terms and constants.

Deletion
$(K, E \cup \{t \approx t\}, R) \vdash (K, E, R)$
As the name suggests, this rule deletes unneeded equations.

Collapse
$(K, E, R \cup \{s[c] \to c', c \to d\}) \vdash (K, E, R \cup \{s[d] \to c', c \to d\})$
This rule replaces a function's argument with the highest ordered constant.

Deduction
$(K, E, R \cup \{t \to c, t \to d\}) \vdash (K, E \cup \{c \approx d\}, R \cup \{t \to d\})$
As the name suggests, this rule infers new equalities from the rewrite rules.
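To give a feel for what the closure computed by these rules decides, here is a minimal Java sketch. It does not implement the (K, E, R) rewrite system. It swaps in a plain union-find with a naive congruence pass, which reaches the same equalities on small inputs. The class and method names are hypothetical and are not taken from the project's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: naive congruence closure over unary function applications.
// This is a union-find substitute for the rewrite rules, not the project's reasoner.
final class NaiveCongruenceClosure {
    // Maps every known term to its parent in the union-find forest.
    private final Map<String, String> parent = new HashMap<>();
    // Each application is stored as {termName, functionSymbol, argumentName}.
    private final List<String[]> applications = new ArrayList<>();

    // Returns the representative of the class containing the term.
    String find(String term) {
        parent.putIfAbsent(term, term);
        String root = term;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        return root;
    }

    // Registers the term function(argument) so that congruence can be applied to it.
    void addApplication(String function, String argument) {
        String name = function + "(" + argument + ")";
        find(name);
        find(argument);
        applications.add(new String[] {name, function, argument});
    }

    // Merges two classes and then propagates congruence until nothing changes.
    void assertEqual(String left, String right) {
        parent.put(find(left), find(right));
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] first : applications) {
                for (String[] second : applications) {
                    boolean sameFunction = first[1].equals(second[1]);
                    boolean sameArgument = find(first[2]).equals(find(second[2]));
                    if (sameFunction && sameArgument
                            && !find(first[0]).equals(find(second[0]))) {
                        parent.put(find(first[0]), find(second[0]));
                        changed = true;
                    }
                }
            }
        }
    }

    boolean areEqual(String left, String right) {
        return find(left).equals(find(right));
    }

    public static void main(String[] args) {
        NaiveCongruenceClosure closure = new NaiveCongruenceClosure();
        closure.addApplication("f", "a");
        closure.addApplication("f", "b");
        closure.assertEqual("a", "b");
        System.out.println("f(a) == f(b): " + closure.areEqual("f(a)", "f(b)"));
    }
}
```

The quadratic propagation loop is chosen for readability. The rewrite-based formulation above avoids repeated scans by keeping the rules in R normalised.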


Appendix B

This appendix explains how the ArrayList<ImplicationGraphNode> is used to implement the Implication Graph. The listofnodes attribute of the implication graph is of the type ArrayList<ImplicationGraphNode> and keeps track of all the nodes. So, an implication graph is nothing but a list, as shown in Figure B.1. The only purpose of this list is to provide access to each individual node.

Implication Graph = a b c d e f
Figure B.1 ArrayList based Implication Graph

Each node keeps its own list of previous nodes. This is depicted in Table B.1. So, a's list of previous nodes is empty, signalling that it is a decision variable. c's list of previous nodes includes a and b, signalling that it is an implied variable.

Table B.1 Nodes with their previous nodes list

The implication graph corresponding to the information contained in Figure B.1 and Table B.1 is shown in Figure B.2. The implicit edges are indicated by dotted lines. This is fitting, as the edges do not exist as objects and the relationship between the nodes is preserved using lists.

Figure B.2 Representation of the Implication Graph
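The list-based representation described in this appendix can be sketched in Java as follows. Only the ArrayList<ImplicationGraphNode> type and the idea of a per-node list of previous nodes come from the text. The field names, the addNode helper and the choice to treat b as a decision variable are assumptions made for illustration, since Table B.1 is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the list-based implication graph; not the project's actual code.
final class ImplicationGraph {
    // The graph itself is only a list that gives access to every node.
    final ArrayList<ImplicationGraphNode> listOfNodes = new ArrayList<>();

    // Creates a node, records its previous nodes (the implicit edges) and stores it.
    ImplicationGraphNode addNode(String literal, ImplicationGraphNode... previous) {
        ImplicationGraphNode node = new ImplicationGraphNode(literal);
        for (ImplicationGraphNode p : previous) {
            node.previousNodes.add(p);
        }
        listOfNodes.add(node);
        return node;
    }

    public static void main(String[] args) {
        ImplicationGraph graph = new ImplicationGraph();
        ImplicationGraphNode a = graph.addNode("a");   // decision: no previous nodes
        ImplicationGraphNode b = graph.addNode("b");   // assumed to be a decision as well
        graph.addNode("c", a, b);                      // implied by a and b
        for (ImplicationGraphNode node : graph.listOfNodes) {
            String kind = node.isDecision() ? "decision" : "implied";
            System.out.println(node.literal + " is " + kind);
        }
    }
}

final class ImplicationGraphNode {
    final String literal;
    // Edges are not stored as objects; each node only remembers its predecessors.
    final List<ImplicationGraphNode> previousNodes = new ArrayList<>();

    ImplicationGraphNode(String literal) {
        this.literal = literal;
    }

    // A node with no predecessors was chosen freely rather than implied.
    boolean isDecision() {
        return previousNodes.isEmpty();
    }
}
```

Keeping predecessors on the node means conflict analysis can walk backwards from a conflicting node without a separate edge structure, at the cost of having no direct way to look up successors.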

Appendix C

This appendix documents screenshots of example runs of the SMT solver.

twosat is a simple SAT problem, adapted from the COMP21111 Logic and Modelling notes (Voronkov, 2014), used to check the working of the SAT solver. Figure C.1 shows the output of the CDCL based SAT solver for the twosat problem. The parts to look at are as follows:

A. This point shows the simple unit propagation being performed to make both the clauses true. After the first decision A, the solver spots the unit clause B and branches with B next.
B. This point shows that all clauses are set to true after the unit propagation is over.
C. This points to the decision variable (free decision) made in the solving process, which is held in the Decision Tree.
D. This point indicates the model, which consists of the decision variables and the implied decisions and is held in the Model.
E. This points to the resulting final solution. The time elapsed is also given as output.

eqdiamond is a simple SMT problem used to check that the SAT solver and the Theory Reasoner work in tandem. Figure C.2 shows the output of the SMT solver for the eqdiamond problem. The important parts of the output are listed below:

A. Shows the equations that are being passed to the Theory Reasoner.
B. This point shows the Theory Reasoner using the rewrite rules to generate the congruence closure and subsequently check for contradictions.
C. No contradictions are found. If any contradictions are found, they are listed at this point.
D. This point shows the final SMT solver output and the elapsed time.

Figure C.1 Screenshot of the SMT solver output for the twosat problem

Figure C.2 Screenshot of the SMT solver output for the eqdiamond problem

The threeunsat problem is an SMT problem in which all the clauses are unit clauses. The problem therefore has exactly one propositional logic model, and this model leads to a contradiction with respect to the Theory of EUF. Figure C.3 and Figure C.4 capture the output for the threeunsat problem. The important parts to look at are as follows:

A. This point shows the empty decision tree and the empty model at the start of the SMT solver's run.
B. This point shows the simple unit propagation that is carried out to derive the straightforward propositional logic model. This simple model consists of each of the unit clauses.
C. This point shows that, after unit propagation, all the clauses are set to true.
D. This point shows the equations being passed to the Theory Reasoner. Two of the three equations are a == b and a != b.
E. This shows the Theory Reasoner at work. Using the rewrite rules, the Theory Reasoner checks whether the generated propositional logic model leads to any contradictions.
F. This shows the contradiction, expressed in the terms that the Theory Reasoner assigns to the system of equations obtained after applying the rewrite rules to the equations passed to it.
G. This point shows the SMT solver spotting that all clauses are unit clauses and that the only propositional logic model has been rejected. So, it returns SMT problem Unsat without invoking the SAT solver to check for more models.
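The theory check at points D to G can be illustrated with a small Java sketch. It assumes only the two equations named above (a == b and a != b), because the third equation of threeunsat is not stated, and it uses a simple union-find over constants instead of the project's rewrite-rule based Theory Reasoner. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: checks a set of equalities and disequalities over constants.
final class UnitClauseEufCheck {
    // Follows parent links to the representative of x, registering x if it is new.
    private static String find(Map<String, String> parent, String x) {
        parent.putIfAbsent(x, x);
        while (!parent.get(x).equals(x)) {
            x = parent.get(x);
        }
        return x;
    }

    // Returns false when some disequality relates two constants in the same class.
    static boolean isConsistent(List<String[]> equalities, List<String[]> disequalities) {
        Map<String, String> parent = new HashMap<>();
        for (String[] eq : equalities) {
            parent.put(find(parent, eq[0]), find(parent, eq[1]));
        }
        for (String[] neq : disequalities) {
            if (find(parent, neq[0]).equals(find(parent, neq[1]))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String[]> equalities = new ArrayList<>();
        equalities.add(new String[] {"a", "b"});      // a == b
        List<String[]> disequalities = new ArrayList<>();
        disequalities.add(new String[] {"a", "b"});   // a != b
        // All clauses are unit clauses, so this is the only propositional model.
        boolean consistent = isConsistent(equalities, disequalities);
        System.out.println(consistent ? "SMT problem Sat" : "SMT problem Unsat");
    }
}
```

Because every clause is a unit clause, rejecting this single model is enough to conclude unsatisfiability, which is why no further SAT search is needed.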

Figure C.3 Screenshot of the SMT solver output for the threeunsat problem

Figure C.4 Screenshot of the SMT solver output for the threeunsat problem

Figure C.5 shows the output of the online Z3 solver (Microsoft, 2012) for the threeunsat problem. The important things to look at in this output are as follows:

A. The name of the solver, Z3, developed by Microsoft Research, is indicated at the top of the page.
B. The threeunsat problem represented in the SMT-LIB v2 format.
C. This button is pressed to start the embedded SMT solver.
D. The output from the SMT solver is displayed in the textbox. In this case the unsat output was expected.

Figure C.5 Screenshot of the Z3 solver output for the threeunsat problem

Appendix D

In the multiple model method, the different models of the problem are found and added to a hash map. At the end, their sizes are compared and the model with the smallest size is given as the output. Figure D.1 depicts an example run of ninesat (a problem crafted by me following the example listed in (Tichy & Glase, 2006)) using this method. The important points to look at are highlighted:

A. The multiple models stored in the hash map are printed at the end.
B. The sizes of the models are then compared to compute the minimal model. This minimal model is returned at this point.

Figure D.1 Screenshot of the SMT Solver output for Multiple Model method

Figure D.2 to Figure D.4 depict the run of the ninesat file using the Iterative Deepening method. The initial bound is set to two. The points to note are highlighted in the figures and are as follows:

A. This point occurs when the solver finishes traversing the entire search space with bound two. No model is found, so the solver returns unsatisfiable with the current bound.
B. The bound is incremented and the solver restarts the search with bound three.
C. This points to the fresh start of the SMT solver with the new bound.
D. This shows how and when a model is rejected. With the current bound of three, when three decisions have been made and the set of clauses is still not satisfied, rather than making a new decision, the decide next branch appends Exceed to the model. This tells the solver that this particular model is rejected.
E. The solver reacts to the Exceed by first removing it from the model.
F. Then the solver backtracks to level two and resumes the search.
G. Similar to A and B, the solver eventually recognises that no models exist with bound three, and the bound is increased to four. The first model found with bound four is accepted.
H. As the output, the original bound, the actual bound of the minimal model and the difference between the two are displayed alongside the elapsed time.

Figure D.2 Screenshot for the SMT Solver output for the Iterative Deepening method
Figure D.3 Screenshot for the SMT Solver output for the Iterative Deepening method

Figure D.4 Screenshot for the SMT Solver output for the Iterative Deepening method
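The bound-and-restart behaviour of the Iterative Deepening method can be sketched in Java as below. This is a heavily simplified stand-in, not the project's solver: it has no unit propagation, clause learning or Theory Reasoner, every assigned literal counts as a decision, and a rejected branch simply returns false instead of appending Exceed to the model. Clauses are written as arrays of non-zero integers in the DIMACS style, and the example formula is hypothetical, since the ninesat file is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: iterative deepening on the number of decisions.
final class IterativeDeepeningSketch {
    // Tries bounds initialBound, initialBound + 1, ... and returns the first model found,
    // or null if no model exists within maxBound. The result is a smallest model as long
    // as initialBound is not larger than the size of the smallest model.
    static List<Integer> minimalModel(int[][] clauses, int initialBound, int maxBound) {
        for (int bound = initialBound; bound <= maxBound; bound++) {
            List<Integer> model = new ArrayList<>();   // fresh start for every bound
            if (search(clauses, model, bound)) {
                return model;
            }
        }
        return null;
    }

    // Depth-first search that branches on the literals of the first unsatisfied clause.
    private static boolean search(int[][] clauses, List<Integer> model, int bound) {
        int[] open = firstUnsatisfied(clauses, model);
        if (open == null) {
            return true;                               // every clause is satisfied
        }
        if (model.size() == bound) {
            return false;                              // bound exceeded: reject this branch
        }
        for (int literal : open) {
            if (model.contains(-literal)) {
                continue;                              // literal is already false
            }
            model.add(literal);
            if (search(clauses, model, bound)) {
                return true;
            }
            model.remove(model.size() - 1);            // backtrack one level
        }
        return false;
    }

    // Returns the first clause with no literal in the model, or null if there is none.
    private static int[] firstUnsatisfied(int[][] clauses, List<Integer> model) {
        for (int[] clause : clauses) {
            boolean satisfied = false;
            for (int literal : clause) {
                if (model.contains(literal)) {
                    satisfied = true;
                    break;
                }
            }
            if (!satisfied) {
                return clause;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // (x1 or x2) and (not x1 or x3) and (not x2 or x3)
        int[][] clauses = {{1, 2}, {-1, 3}, {-2, 3}};
        System.out.println("Minimal model: " + minimalModel(clauses, 1, 3));
    }
}
```

Starting from a low bound is what makes the first accepted model minimal: every smaller bound has already been searched exhaustively and rejected, mirroring points A, B and G above.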
