The Barcelona Prover

ROBERT NIEUWENHUIS, JOSE MIGUEL RIVERO and MIGUEL ANGEL VALLEJO
Technical University of Catalonia, Barcelona, Spain
roberto@lsi.upc.es

Abstract. Here we describe the equational theorem prover Barcelona, in its version that participated in the CADE'96 theorem proving competition. The system was built on top of our toolkit of data structures and algorithms for automated deduction in first-order logic with equality, and was devised mainly to test the performance of this toolkit.

Key words: Automated theorem proving, competition, Barcelona, data structures and algorithms, implementation

1. Introduction

During the last decade, research on automated deduction in our group has mainly focussed on theoretical results for first-order logic with equality. New techniques for, e.g., clausal rewriting and deduction with constrained clauses have been developed and completeness results established. Many necessary underlying results on term orderings, constraint solving and answer computation have been given, with their decidability and complexity characteristics (cf. http://www-lsi.upc.es/dept/sectp.html).

In order to better understand these new techniques and learn more about their practical behaviour, during their development we have always worked with prototype implementations written in Prolog. These experimental systems converged in 1992 into our laboratory implementation Saturate [9, 3]. Although it has been applied successfully in many contexts and has led to interesting new insight and theoretical results (see also [1, 7]), the Saturate system is not a high-performance theorem prover. Therefore, in order to focus more on practice, some years ago we decided to work on efficiency-oriented systems as well.

When studying the existing provers, we found a wide range of different ad-hoc data structures for implementing very similar calculi and control mechanisms. E.g., for term representation there are the linear flatterms [2] and diverse other types of nonlinear terms, and for indexing, discrimination trees [2, 6], path indexing [10, 6] and substitution trees [4], among others. In spite of the evident qualities of all these data structures, we somewhat missed a more uniform framework, like the Warren Abstract Machine (WAM) [11] in logic programming. In such a framework structure sharing, indexing for all the main operations and perhaps even compilation of static parts of the clause sets should be possible in a standard, elegant, well-structured and reusable way.

After a large number of experiments with many techniques (most of them with rather negative results), we finally came to a kernel of data structures satisfying these requirements. We chose a WAM-like heap structure for storing terms as DAG's, where structure sharing is possible but not forced. Regarding indexing, we developed several techniques based on substitution trees, which turn out to be surprisingly well combinable with WAM terms due to conceptual similarities. Many known techniques from the WAM, like variable binding, backtracking and memory management, which are the result of many years of work in logic programming implementation, can be smoothly incorporated here. Another requirement is satisfied as well: static clause (sub)sets can be compiled in this framework into efficient abstract machine code for inference computation and redundancy proving. All this led to a toolkit called Dedam (Deduction Abstract Machine) [8] in which all basic operations (indexing, variable management, I/O, etc.) are provided, and on top of which one can build theorem provers in a simple way.

Here we describe the equational theorem prover Barcelona, in its version that participated in the CADE-13 theorem proving competition. The system was built on top of (a first version of) our kernel of data structures, and was devised mainly to test the performance of this toolkit. During the competition the relatively high throughput of the data structures seems to have compensated for the fact that the prover itself is just an unfailing Knuth-Bendix completion procedure without many heuristics or tuning for the specific class of problems.

2. Architecture

2.1. Calculus

As said, the Barcelona prover is essentially an implementation of unfailing Knuth-Bendix completion. Hence a term ordering is central.
For completeness reasons, it must be a reduction ordering that is (extendable to) a total ordering on all ground terms, so as a simple choice we have taken the lexicographic path ordering (LPO) [5]. The LPO extends a precedence ordering on the function symbols. In the Barcelona prover the only inference rule is the well-known rule of equational superposition:

\[
\frac{s' = t' \qquad s = t}{(s[t']_p = t)\sigma}
\]

where:
  $\sigma$ is the mgu of $s'$ and $s|_p$,
  $s|_p \notin Vars(s)$,
  $t\sigma \not\succeq s\sigma$ and $t'\sigma \not\succeq s'\sigma$

finjar.tex; 17/01/1997; 15:26; no v.; p.2

(here $s|_p$ denotes the subterm of $s$ at position $p$, and $s[t']_p$ is the result of replacing it in $s$ by $t'$). The conclusion $(s[t']_p = t)\sigma$ of the inference is called a critical pair.

Furthermore there are two main mechanisms for detecting redundant equations: forward and backward demodulation and forward subsumption. For efficiency reasons, in order to avoid checking with LPO at each rewrite step, we only considered demodulation with rules that can be oriented once and for all: an equation $l = r$ where $l \succ r$ (i.e., $l = r$ is an oriented rewrite rule) demodulates an equation $s[l\sigma]_p = t$ into $s[r\sigma]_p = t$. An equation $s = t$ subsumes another equation $s\sigma = t\sigma$ if $\sigma$ is some matching substitution.

2.2. Control and tuning for the competition

The following is a typical main loop of completion:

   1. new := the set of initial equations
   2. old := $\emptyset$
   3. While new $\neq \emptyset$ And no inconsistency detected Do
   4.     select one equation eq in new
   5.     remove eq from new and add it to old
   6.     For all critical pairs cp between eq and old Do
   7.         If cp is not redundant Then
   8.             orient cp (if possible) and add it to new
   9.             use cp to detect redundancies in new and old
  10.         EndIf
  11.     EndFor
  12. EndWhile

The previous scheme has many degrees of freedom, which were resolved as follows for the competition:

Line 3: What notion of inconsistency is used? The system only deals with (universally quantified) positive equations. At the competition, the input axioms were of this form, but the theorem could be an arbitrarily quantified equation. After negation and Skolemization, it can however be expressed as $thm(s, t) = false$ and, if an additional $thm(x, x) = true$ is input, an inconsistency exists iff the equation $true = false$ appears at some step.

Line 4: Which equation eq is selected? In our system for the competition we used the following measure of term and equation size: $size(x) = 1$ for a variable $x$; furthermore, $size(f(t_1, \ldots, t_n)) = 3 + size(t_1) + \cdots + size(t_n)$ and $size(s = t) = size(s) + size(t)$. The selection takes the smallest new equation according to this size, except that once in each five iterations it takes the smallest descendant of $thm(s, t) = false$ (if there is such a descendant in new).

Line 7: What is done for checking forward redundancy? As said, we use demodulation with orientable equations and subsumption with nonorientable equations, in both methods with equations from both new and old. The rewriting strategy applied for normalization is a (leftmost) innermost strategy with marking of irreducible subterms to avoid unnecessary reduction attempts.

Line 8: How is orientation done? The precedence of the LPO ordering we use is as follows: symbols with greater arity are bigger in the precedence and between symbols with the same arity we compare the natural numbers that internally represent each symbol.

Line 9: How to detect backward redundancies? Only orientable equations cp are used. First, it is checked whether cp demodulates any of the equations in new or old. If this is the case, then the reduced equation is further treated as a critical pair (with some optimizations taking advantage of its previous orientation).

3. Implementation

The system is implemented in C. We have been especially careful regarding the readability and the structuring of the data structures, since our aim is to provide a clean, reusable and standard framework for the implementation of first-order provers with equality. The prover has no further language or hardware requirements. Probably due to the fact that we always keep the data base of equations fully interreduced and simplified, and that there is a considerable amount of sharing in the indexing data structures, memory has not been a bottleneck in any of the competition problems. The user interface of the system has been kept flexible in the sense that a large number of output settings can be enabled or disabled. A proof in tree format is output by the system after each successful run.

3.1. Data Structures

There is an overall heap data structure for all terms, with their basic operations: input/output, LPO, etc. Furthermore, there are four indexing data structures: demodtree is specialized in matching, and contains all left hand sides of rewrite rules applicable in demodulation; backdemodtree is specialized in finding instances (i.e., reverse matching) for backward demodulation and contains the (shared) subterms of old and new rules; oldtree and suboldtree are specialized in unification for inferences and contain the maximal side(s) of old rules and their (shared) subterms respectively.
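The LPO comparison listed among the basic term operations can be sketched as follows. This is a minimal illustration in Python, not the prover's C code: the tuple-based term representation, the function names and the `symbol_ids` table are assumptions made for this example. Only the precedence (greater arity is bigger, ties broken by the number representing the symbol) is taken from the description under Line 8 in Section 2.2, and the ordering itself is the standard textbook definition of the LPO.

```python
# Terms: a variable is a str; an application is a tuple (symbol, arg1, ..., argn).
# symbol_ids is a hypothetical table standing in for the internal symbol numbers.

def is_var(t):
    return isinstance(t, str)

def occurs(x, t):
    """True if variable x occurs in term t."""
    if is_var(t):
        return x == t
    return any(occurs(x, arg) for arg in t[1:])

def precedence_key(t, symbol_ids):
    """Greater arity is bigger; ties are broken by the symbol number."""
    return (len(t) - 1, symbol_ids[t[0]])

def lpo_greater(s, t, symbol_ids):
    """Return True if s is strictly greater than t in the LPO."""
    if is_var(s):
        return False                      # a variable is never greater
    if is_var(t):
        return occurs(t, s)               # s > x iff x occurs in s
    s_args, t_args = s[1:], t[1:]
    # Case 1: some argument of s is equal to or greater than t.
    if any(a == t or lpo_greater(a, t, symbol_ids) for a in s_args):
        return True
    # In the remaining cases s must dominate every argument of t.
    if not all(lpo_greater(s, b, symbol_ids) for b in t_args):
        return False
    if s[0] == t[0]:
        # Case 2: same head symbol, compare arguments lexicographically.
        for a, b in zip(s_args, t_args):
            if a != b:
                return lpo_greater(a, b, symbol_ids)
        return False                      # identical terms
    # Case 3: the head of s must be bigger in the precedence.
    return precedence_key(s, symbol_ids) > precedence_key(t, symbol_ids)
```

With such a comparison, an equation s = t would be oriented as a rule from s to t when `lpo_greater(s, t, ids)` holds, as a rule from t to s when the converse holds, and kept as a nonorientable equation otherwise (commutativity is the usual example of the last case).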

Each critical pair is demodulated as soon as it is generated by (backtrackable) rewriting, and only if it is not convergent is it copied, oriented and stored as a new rule; then the demodulation steps are undone and the critical pair search goes on by backtracking in the unification indexing tree (oldtree or suboldtree). As a simple example, in the figure below we show how sharing in our data structures leads to efficiency for inference computation (for, e.g., backward demodulation there are similar mechanisms).

[Figure. Left: the tree suboldtree with two of its leaves, g(a), storing 40, [r3,r4,r5], and f(g(a)), storing 20, [r3,r4]. Right: part of the heap:]

  address   contents
  20        ref 25
  25        f
  26        ref 40
  40        ref 50
  50        g
  51        ref 52
  52        a

At the right hand side in this figure a small part of our WAM-like heap is shown. Roughly (we are omitting a number of details here), terms are represented on the heap as usual in the WAM: each function symbol of arity n is followed by n contiguous ref positions pointing to its arguments. The figure at the left represents the tree suboldtree and two of its leaves, corresponding to the terms g(a) and f(g(a)) respectively. At each leaf a heap address and a list of rule numbers is stored.

Suppose we are looking for inferences with the rule g(x) → h(x) (not shown here on the heap) using suboldtree. When we arrive at the leaf corresponding to the term g(a), the variable x will have been instantiated accordingly, and we find the heap address (40) of a ref that points to g(a) at position 50. By temporarily changing this ref at position 40, making it point instead to the term h(x), we can simply read off all the critical pairs from the given list of rules [r3,r4,r5] which share the subterm g(a) through position 40. Note that g(a) is also shared by other terms in the tree like f(g(a)), and hence the rule set for g(a) contains the one at f(g(a)) as a subset.

4. Performance

During the competition (and especially the first 15-20 problems) the relatively high throughput of the data structures seems to have compensated for the simplicity of the prover itself and its lack of many heuristics or tuning to the specific class of problems. As it was developed only very recently, there are no further experimental results outside the competition.
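The ref-redirection step from the sharing example in Section 3.1 can be illustrated with a small sketch. It is written in Python for brevity and is not the Dedam code: the dictionary-based heap, the cell tags, the helper names `read_term` and `redirected`, and the cells at addresses 60 and 61 (holding h(x) with x already bound to a) are all assumptions made for this example. The cells at addresses 20 to 52 follow the figure.

```python
from contextlib import contextmanager

# Heap cells: ("ref", target_address) or ("fun", symbol, arity).
# A function cell of arity n is followed by n contiguous ref cells.
heap = {
    20: ("ref", 25),
    25: ("fun", "f", 1),
    26: ("ref", 40),
    40: ("ref", 50),      # the shared ref that points to g(a)
    50: ("fun", "g", 1),
    51: ("ref", 52),
    52: ("fun", "a", 0),
    # Hypothetical cells: right-hand side h(x) with x bound to a.
    60: ("fun", "h", 1),
    61: ("ref", 52),
}

def read_term(heap, addr):
    """Follow refs from addr and return the term found there as a string."""
    cell = heap[addr]
    while cell[0] == "ref":
        addr = cell[1]
        cell = heap[addr]
    _, symbol, arity = cell
    if arity == 0:
        return symbol
    args = [read_term(heap, addr + i) for i in range(1, arity + 1)]
    return f"{symbol}({','.join(args)})"

@contextmanager
def redirected(heap, ref_addr, new_target):
    """Temporarily make the ref at ref_addr point to new_target."""
    saved = heap[ref_addr]
    heap[ref_addr] = ("ref", new_target)
    try:
        yield
    finally:
        heap[ref_addr] = saved    # undo the change, as in backtracking

with redirected(heap, 40, 60):
    rewritten = read_term(heap, 20)   # every term sharing address 40 changes
restored = read_term(heap, 20)
```

Because a single ref cell is changed, every term that reaches g(a) through address 40 is read in its rewritten form without being copied, and restoring the cell returns the heap to its previous state.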

5. Conclusion

The Barcelona prover in its CADE-13 competition version was built on top of a first version of our Dedam kernel of data structures, and was devised mainly to test the performance of this kernel. The strength of the prover came from the efficiency of Dedam, which we believe is now starting to be useful for enhancing the efficiency of state-of-the-art provers. The prover's main weakness was the lack of heuristics or tuning to the rather specific class of problems, aspects which we are currently working on.

Regarding further work on the Dedam kernel of data structures, it seems that there is still a lot of room for progress. For example, the performance of matching for demodulation has recently been improved considerably thanks to some new ideas on indexing data structures, and there are also some other recent improvements regarding memory management: by applying WAM-based techniques we can in fact almost completely avoid it. Both matching and memory management are well-known main bottlenecks in equational theorem proving. In the near future we will continue investigating some further ideas related to the underlying data structures. We will also build a prover for full first-order clauses with equality on top of Dedam.

References

1. David Basin and Harald Ganzinger. Complexity Analysis Based on Ordered Resolution. In 11th IEEE LICS, pages 456-465, New Brunswick, NJ, July 1996.
2. Jim Christian. Flatterms, Discrimination Nets, and Fast Term Rewriting. Journal of Automated Reasoning, 10:95-113, 1993.
3. Harald Ganzinger, Robert Nieuwenhuis, and Pilar Nivela. The Saturate System, 1995. See http://www.mpi-sb.mpg.de/saturate/saturate.html for software and documentation.
4. Peter Graf. Substitution Tree Indexing. In J. Hsiang, editor, 6th RTA, LNCS 914, pages 117-131, Kaiserslautern, Germany, April 4-7, 1995. Springer-Verlag.
5. S. Kamin and J.-J. Levy. Two generalizations of the recursive path ordering. Unpublished note, Dept. of Computer Science, Univ. of Illinois, Urbana, IL, 1980.
6. William McCune. Experiments with discrimination tree indexing and path indexing for term retrieval. Journal of Automated Reasoning, 9(2):147-167, October 1992.
7. Robert Nieuwenhuis. Basic paramodulation and decidable theories. In 11th IEEE LICS, pages 473-482, New Brunswick, NJ, USA, July 27-30, 1996.
8. Robert Nieuwenhuis, Jose Miguel Rivero, and Miguel Angel Vallejo. An implementation kernel for theorem proving with equality clauses. Technical report, Dept. LSI, Technical University of Catalonia, Barcelona, May 1996.
9. Pilar Nivela and Robert Nieuwenhuis. Practical results on the saturation of full first-order clauses: Experiments with the Saturate system (system description). In Proc. 5th RTA, LNCS 690, Montreal, June 16-18, 1993. Springer-Verlag.
10. Mark Stickel. The path-indexing method for indexing terms. Technical Report 473, Artificial Intelligence Center, SRI International, October 1989.
11. David H.D. Warren. An Abstract Prolog Instruction Set. Technical Note 309, SRI International, Menlo Park, CA, October 1983.