Query Containment in the Presence of Limited Access Patterns


Chen Li
Computer Science Department, Stanford University

Edward Chang
Electrical & Computer Engineering, University of California, Santa Barbara

Abstract

In information-integration systems, sources may have access pattern limitations, i.e., they require values for certain attributes before they return tuples. In this paper we study the following problem: given views with access pattern limitations, how do we test whether the maximal answer to a conjunctive query (CQ) is contained in that to another CQ? Since a datalog program is necessary to compute the maximal answer to a CQ, as shown in [9, 21], the containment appears undecidable. However, because these programs have a special form, their containment can be reduced to containment of monadic programs, which is known to be decidable [7]. We prove the decidability for both the source-centric approach and the query-centric approach to information integration [8]. The results can be extended to the case where the contained CQ and the containing CQ have different initial bindings. Our work complements the recent paper by Millstein, Levy, and Friedman [25]. In addition, deciding containment of monadic programs involves a complex algorithm. We develop a polynomial-time algorithm for testing boundedness of these programs [27], and show that when a program in the test is bounded, we can perform the containment test efficiently.

Keywords: information-integration systems, access pattern limitations, query containment, query equivalence, datalog programs.

1 Introduction

The goal of information-integration systems (e.g., [3, 12, 14, 15, 16, 17, 23, 24, 26]) is to support seamless access to heterogeneous data sources. In these systems, sources may have access pattern limitations (also called binding restrictions).
For instance, many Web sources such as The IMDB [13] and Cinemachine [6] return movie information only if values are specified for certain attributes, such as movie title or star name. These sources do not accept queries such as "return all the movie information you know about." Given sources with access pattern limitations, Duschka and Levy [9] showed how to compute the maximal answer to a query by translating the query and the source limitations into a datalog program [31]. In a recent paper [21], we developed an algorithm for finding the relevant sources that need to be accessed to answer a query. In this paper we address the problem of whether the maximal answer to one query is contained in that to another query. The following is a motivating example.

EXAMPLE 1.1 Assume we have four source views as shown in Figure 1(a). Their schemas can be represented as a hypergraph [31] as shown in Figure 1(b), in which each node is an attribute and each hyperedge is a view schema. For example, source view $v_1$ has the information about studios and their movies, and source view $v_2$ has the information about movie awards and stars. The access limitation of each view is described as a binding pattern [31], in which each attribute is adorned as $b$ (a value must be specified for this attribute) or $f$ (it can be free). For instance, the binding pattern $bf$ of view $v_1$ says that every query using view $v_1$ must specify a studio name.

    View schema                   Binding pattern
    $v_1(Studio, Movie)$          $bf$
    $v_2(Movie, Star, Award)$     $bbf$
    $v_3(Movie, Star)$            $bf$
    $v_4(Star, Addr)$             $bf$

Figure 1: Four movie sources. (a) View schemas with binding patterns. (b) The hypergraph representation, with nodes Studio, Movie, Star, Award, and Addr and one hyperedge per view.

Consider the following two conjunctive queries (CQs):

    $Q_1$: $ans(A) \text{ :- } v_1(disney, M),\ v_2(M, S, W),\ v_4(S, A)$
    $Q_2$: $ans(A) \text{ :- } v_1(disney, M),\ v_3(M, S),\ v_4(S, A)$

Both queries ask for the addresses of stars in Disney movies by taking the joins of different views. To answer $Q_2$, we can first send a query $v_1(disney, M)$ to $v_1$ to retrieve its Disney movies. For each of these movies $m$, we send query $v_3(m, S)$ to retrieve its stars. Then for each star $s$, we send query $v_4(s, A)$ to retrieve his or her addresses. The binding restrictions of these three views make the above plan executable.

When answering query $Q_1$, we cannot get any bindings for the Star attribute by using only the views $v_1$, $v_2$, and $v_4$ in $Q_1$, due to their binding restrictions. However, we can use view $v_3(Movie, Star)$ with the binding pattern $bf$ to obtain some Star bindings, although $v_3$ is not in $Q_1$. In addition, in order to query view $v_3$, we can access view $v_1(Studio, Movie)$ to retrieve the necessary Movie bindings using disney as a studio name.

Although the two queries use different views, surprisingly, if all the available information about studios, movies, and stars comes from the two queries and the four source views, every answer to query $Q_1$ that we can compute is contained in the answer to query $Q_2$. (We give the formal proof in Section 2.3.) Therefore, if a user submits a query $Q_1 \cup Q_2$, we can answer this query by just answering query $Q_2$, and thus save the queries to view $v_2$.
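The retrieval process described in Example 1.1 can be simulated directly. The following is a minimal sketch, not the construction used later in the paper: it repeatedly sends every source query whose bound attributes can be filled from known bindings until no new tuple arrives, and then evaluates $Q_1$ and $Q_2$ over the retrieved tuples. The source contents in `DATA` and the function names are invented for illustration; only the schemas and binding patterns come from Figure 1.

```python
from itertools import product

# Schemas and binding patterns from Figure 1 ('b' = bound, 'f' = free).
VIEWS = {
    "v1": (("Studio", "Movie"), "bf"),
    "v2": (("Movie", "Star", "Award"), "bbf"),
    "v3": (("Movie", "Star"), "bf"),
    "v4": (("Star", "Addr"), "bf"),
}

# Hypothetical source contents, invented for this illustration.
DATA = {
    "v1": {("disney", "m1"), ("disney", "m2"), ("fox", "m3")},
    "v2": {("m1", "s1", "oscar"), ("m3", "s3", "emmy")},
    "v3": {("m1", "s1"), ("m2", "s2"), ("m3", "s3")},
    "v4": {("s1", "a1"), ("s2", "a2"), ("s3", "a3")},
}

def retrievable(initial):
    """Query the sources with known bindings until nothing new is returned."""
    known = {attr: set(vals) for attr, vals in initial.items()}
    got = {name: set() for name in VIEWS}
    changed = True
    while changed:
        changed = False
        for name, (attrs, pattern) in VIEWS.items():
            bound = [i for i, c in enumerate(pattern) if c == "b"]
            # A source query needs a known value at every bound position.
            allowed = set(product(*[known.get(attrs[i], set()) for i in bound]))
            for tup in DATA[name]:
                if tup not in got[name] and tuple(tup[i] for i in bound) in allowed:
                    got[name].add(tup)
                    changed = True
                    for attr, val in zip(attrs, tup):
                        known.setdefault(attr, set()).add(val)
    return got

def answer_q1(got):
    # Q1: ans(A) :- v1(disney, M), v2(M, S, W), v4(S, A)
    return {a for (t, m) in got["v1"] if t == "disney"
            for (m2, s, _w) in got["v2"] if m2 == m
            for (s2, a) in got["v4"] if s2 == s}

def answer_q2(got):
    # Q2: ans(A) :- v1(disney, M), v3(M, S), v4(S, A)
    return {a for (t, m) in got["v1"] if t == "disney"
            for (m2, s) in got["v3"] if m2 == m
            for (s2, a) in got["v4"] if s2 == s}

got = retrievable({"Studio": {"disney"}})
q1, q2 = answer_q1(got), answer_q2(got)
print(sorted(q1), sorted(q2))
```

On this one invented database the answer to $Q_1$ is a subset of the answer to $Q_2$, and the tuple for studio fox is never retrieved because no binding leads to it. A single database only illustrates the claim; the containment for all databases is what the rest of the paper establishes.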
In this paper we study the following problem: given conjunctive queries on views with binding-pattern restrictions, is the maximal answer to one query contained in that to another query? The solution to this problem can help us avoid unnecessary source accesses, as shown in the example above. However, [9, 21] show that we may need a potentially recursive datalog program to compute the maximal answer to a CQ. Thus this containment problem seems undecidable, since containment of datalog programs is undecidable [30]. In Section 2 we prove that this containment problem is decidable by showing that the datalog program to compute the maximal answer to a CQ is monadic. That is, all its recursive predicates [31] are monadic (i.e., with arity one); its nonrecursive IDB predicates can have arbitrary arity. Therefore, our containment problem can be reduced to containment of monadic programs, which is known to be decidable [7].

The algorithm in [7] for testing containment of monadic programs is complex (it uses tree-automata theory). Therefore, we are interested in the case where a program in the test is bounded, and the containment can be tested efficiently using the algorithms in [4, 5, 29]. A datalog program is bounded if it is equivalent to a finite union of CQs [27]. [7] shows that boundedness is decidable for monadic datalog programs, although it is not decidable in general [11]. However, testing boundedness of monadic programs also involves a complex algorithm. In Section 3, we study a class of CQs, called connection queries [21]. We develop a polynomial-time algorithm for testing boundedness of datalog programs for connection queries.

There are two approaches to information integration [8]: the query-centric approach (in which user queries are in terms of views synthesized on source views), and the source-centric approach (in which user queries and source views are in terms of global views). In Sections 2 and 3 we take the query-centric approach. In Section 4 we extend our decidability result to the source-centric approach.

1.1 Related work

Recently Millstein, Levy, and Friedman [25] studied relative containment in the context of answering queries using views [18, 19]. The problem is to test query containment relative to source views with binding restrictions that are available to an information-integration system. Suppose $P_1$ (resp. $P_2$) is the program to compute the maximal answer to a (potentially recursive) query $Q_1$ (resp. $Q_2$). The authors show that, surprisingly, the program $P_1$ is contained in $P_2$ if and only if $P_1$ is contained in $Q_2$. The authors then prove that when $Q_2$ is a CQ, the relative containment is decidable using the results of [5]. Our paper complements [25] as follows:

1. [25] uses the source-centric approach to information integration. In this paper we give the decidability result for both the query-centric approach and the source-centric approach.

2. The decidability proof in [25] is based on the assumption that the set of bindings for the contained query is a subset of the bindings for the containing query.
We loosen this assumption by showing that containment is decidable even if the two CQs have different initial bindings. However, we assume that the contained query is a CQ, while in [25] the contained query can be a recursive datalog program.

Other studies on answering queries in the presence of binding restrictions include: how to derive equivalent rewritings of CQs [28], how to optimize CQs [10, 22, 35], and how to test whether the complete answer to a query can be computed [20].

2 Relative containment in the query-centric approach

In this section we take the query-centric approach to information integration. We give the formal definition of query containment in the presence of binding restrictions, and prove that this containment is decidable.

2.1 Source views and queries

Let $v_1, \ldots, v_n$ be $n$ source views. Each view has a binding pattern representing the possible queries that the view can accept. In each binding pattern, an attribute is adorned as $b$ (a value must be specified for this attribute) or $f$ (the attribute can be free). Let $V$ denote the views with their adornments ("source descriptions" for short). We consider conjunctive queries (CQs) of the form

    $ans(\bar{X}) \text{ :- } g_1(\bar{X}_1), \ldots, g_n(\bar{X}_n)$.

In each subgoal $g_i(\bar{X}_i)$, predicate $g_i$ is a view in $V$, and every argument in the subgoal is either a variable or a constant. We consider safe CQs, i.e., every variable in the head appears in the body.

2.2 The maximal answer to a query

Let $Q$ be a query on source descriptions $V$. For a database of $V$, the maximal answer to $Q$ is the answer we can compute if we retrieve as many tuples as possible from the views, using only the initial bindings in $Q$ and the bindings retrievable from $V$. It is known that a recursive datalog program may be necessary to compute the maximal answer to a query [9, 21], since we could access sources repeatedly to retrieve more bindings, use these bindings to retrieve more tuples, and then more bindings, and so on. We construct a datalog program, denoted $\Pi(Q, V)$, which can be evaluated on $V$ to compute the maximal answer to $Q$. In [21] we discuss how the program $\Pi(Q, V)$ is constructed for connection queries (see Section 3 for the definition of connection queries). Here we generalize the construction of $\Pi(Q, V)$ to any CQ $Q$.

    $r_1$: $ans(A) \text{ :- } \hat{v}_1(disney, M),\ \hat{v}_2(M, S, W),\ \hat{v}_4(S, A)$
    $r_2$: $\hat{v}_1(T, M) \text{ :- } studio(T),\ v_1(T, M)$
    $r_3$: $movie(M) \text{ :- } studio(T),\ v_1(T, M)$
    $r_4$: $\hat{v}_2(M, S, W) \text{ :- } movie(M),\ star(S),\ v_2(M, S, W)$
    $r_5$: $award(W) \text{ :- } movie(M),\ star(S),\ v_2(M, S, W)$
    $r_6$: $\hat{v}_3(M, S) \text{ :- } movie(M),\ v_3(M, S)$
    $r_7$: $star(S) \text{ :- } movie(M),\ v_3(M, S)$
    $r_8$: $\hat{v}_4(S, A) \text{ :- } star(S),\ v_4(S, A)$
    $r_9$: $addr(A) \text{ :- } star(S),\ v_4(S, A)$
    $r_{10}$: $studio(disney) \text{ :- }$

Figure 2: The program $\Pi(Q_1, V)$ for the query $Q_1$ in Example 1.1

Figure 2 shows the program $\Pi(Q_1, V)$ for the four views and the query $Q_1$ in Example 1.1. We use this program to show in general how to construct the program $\Pi(Q, V)$ for a CQ $Q$:

1. For each view $v_i \in V$, introduce an IDB predicate $\hat{v}_i$, called the $\alpha$-predicate of $v_i$, to store the tuples that can be retrieved from the view. For each domain $A$ of the attributes in $V$, introduce a domain predicate $domA$ to store the bindings for this domain that are retrievable from $Q$ and the source views. For instance, IDB predicates $\hat{v}_1, \ldots, \hat{v}_4$ in Figure 2 are the corresponding $\alpha$-predicates of the four source views; IDB predicates studio, movie, star, award, and addr are the domain predicates for the domains of the five corresponding attributes.

2. Replace each subgoal in the query $Q$ with the $\alpha$-predicate of the corresponding view. The new rule is called the connection rule of $Q$. For instance, query $Q_1$ is rewritten as rule $r_1$.

3. For each view $v_i$, write the following $\alpha$-rule and domain rules based on its binding pattern. Suppose $v_i$ has $m$ attributes, say $A_1, \ldots, A_m$, and the binding pattern of $v_i$ says that the arguments in positions $1, \ldots, p$ need to be bound, and the arguments in positions $p+1, \ldots, m$ can be free. The following rules are the $\alpha$-rule and domain rules of $v_i$:

    $\alpha$-rule: $\hat{v}_i(A_1, \ldots, A_m) \text{ :- } domA_1(A_1), \ldots, domA_p(A_p),\ v_i(A_1, \ldots, A_m)$
    domain rules: $domA_k(A_k) \text{ :- } domA_1(A_1), \ldots, domA_p(A_p),\ v_i(A_1, \ldots, A_m)$  $(k = p+1, \ldots, m)$

Here each $domA_j$ ($j = 1, \ldots, p$) is the domain predicate for attribute $A_j$. For instance, rule $r_2$ in Figure 2 is the $\alpha$-rule of $v_1$, and $r_3$ is its domain rule.

4. For each binding $a$ for a domain $A$ that can be derived from $Q$, write a fact rule $domA(a) \text{ :- }$. For instance, $r_{10}$ in Figure 2 is a fact rule representing the fact that from query $Q_1$ we know disney is a studio name.

During the construction of the program $\Pi(Q, V)$, we make the following assumptions: (1) Each binding for an attribute must be from the domain of the attribute. (2) If a source view requires a value, say, a string, as a particular argument, we do not allow the "strategy" of trying all possible strings for this argument to test the source, since this strategy will not terminate. (3) Each binding we use is obtained either from the user query or from a tuple returned by another source query. If we have more bindings, we can incorporate them into the program $\Pi(Q, V)$ by adding the corresponding fact rules. For instance, consider the program $\Pi(Q_1, V)$ in Figure 2. If we know that there is a movie titled "King Kong," we can add the following fact rule to the program $\Pi(Q_1, V)$: $movie(\text{'King Kong'}) \text{ :- }$.

For any database of $V$, by evaluating the program $\Pi(Q, V)$ on the views $V$, we can obtain all the retrievable tuples from the source views, since every possible source query is captured by an evaluation of a rule in $\Pi(Q, V)$. Therefore, the program can compute the maximal answer to the query.

2.3 Problem definition

Definition 2.1 (relative containment) Given two CQs $Q_1$ and $Q_2$ on source descriptions $V$, we say $Q_1$ is contained in $Q_2$ relative to $V$, denoted $Q_1 \sqsubseteq_V Q_2$, if for any database of $V$, the maximal answer to $Q_1$ is contained in the maximal answer to $Q_2$; that is, $\Pi(Q_1, V) \sqsubseteq \Pi(Q_2, V)$.

EXAMPLE 2.1 In Example 1.1, the programs $\Pi(Q_1, V)$ and $\Pi(Q_2, V)$ are equivalent to the following two programs $P_1$ and $P_2$, respectively.
    $P_1$: $ans(A) \text{ :- } v_1(disney, M),\ v_3(M, S),\ v_2(M, S, W),\ v_4(S, A)$
    $P_2$: $ans(A) \text{ :- } v_1(disney, M),\ v_3(M, S),\ v_4(S, A)$

For instance, we can simplify the program $\Pi(Q_1, V)$ as follows. Consider the subgoals $\hat{v}_1(disney, M)$, $\hat{v}_2(M, S, W)$, and $\hat{v}_4(S, A)$ in rule $r_1$. We substitute each of them by the body of the $\alpha$-rule ($r_2$, $r_4$, and $r_8$) of the corresponding view, with the necessary variable unification. After the substitutions, we can remove all the $\alpha$-rules. The new program is shown in Figure 3. We then substitute the domain predicates studio, movie, and star in rule $r_1'$ with the body of the corresponding domain rule. Then the rule $r_1'$ becomes query $P_1$. Similarly, program $\Pi(Q_2, V)$ can be simplified to $P_2$.

    $r_1'$: $ans(A) \text{ :- } studio(disney),\ v_1(disney, M),\ movie(M),\ star(S),\ v_2(M, S, W),\ star(S),\ v_4(S, A)$
    $r_3$: $movie(M) \text{ :- } studio(T),\ v_1(T, M)$
    $r_5$: $award(W) \text{ :- } movie(M),\ star(S),\ v_2(M, S, W)$
    $r_7$: $star(S) \text{ :- } movie(M),\ v_3(M, S)$
    $r_9$: $addr(A) \text{ :- } star(S),\ v_4(S, A)$
    $r_{10}$: $studio(disney) \text{ :- }$

Figure 3: An equivalent program for the program in Figure 2

Footnote 1 (on the equivalence of programs used in Example 2.1): Two datalog programs are equivalent if they produce the same ans facts for any database.
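Containment between two conjunctive queries such as $P_1$ and $P_2$ can be checked with the classical containment-mapping test cited as [4]: $P_1$ is contained in $P_2$ exactly when some mapping of the variables of $P_2$ sends its head to the head of $P_1$ and each of its subgoals to a subgoal of $P_1$. The sketch below is a brute-force search for such a mapping, written for illustration only. The atom encoding and the convention that variables start with an upper-case letter are assumptions of this sketch.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables start with an upper-case letter.
    return term[0].isupper()

def containment_mapping(q_from, q_to):
    """Return a variable mapping from q_from to q_to, or None if none exists.

    A query is (head, body); an atom is (predicate, tuple_of_terms).
    """
    head_f, body_f = q_from
    head_t, body_t = q_to
    variables = sorted({t for _, args in [head_f] + body_f for t in args if is_var(t)})
    targets = sorted({t for _, args in [head_t] + body_t for t in args})
    goals = {(p, tuple(args)) for p, args in body_t}

    # Try every assignment of target terms to the variables of q_from.
    for image in product(targets, repeat=len(variables)):
        h = dict(zip(variables, image))

        def apply(atom):
            return (atom[0], tuple(h.get(t, t) for t in atom[1]))

        if apply(head_f) == (head_t[0], tuple(head_t[1])) and all(
            apply(g) in goals for g in body_f
        ):
            return h
    return None

P1 = (("ans", ("A",)), [("v1", ("disney", "M")), ("v3", ("M", "S")),
                        ("v2", ("M", "S", "W")), ("v4", ("S", "A"))])
P2 = (("ans", ("A",)), [("v1", ("disney", "M")), ("v3", ("M", "S")),
                        ("v4", ("S", "A"))])

print(containment_mapping(P2, P1))  # a mapping exists, so P1 is contained in P2
print(containment_mapping(P1, P2))  # no mapping: P2 has no v2 subgoal
```

The search is exponential in the number of variables, which is acceptable for queries of this size; it is not meant as an efficient procedure.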

Since the identity mapping on variables sends every subgoal of $P_2$ to a subgoal of $P_1$, it is a containment mapping [4] from $P_2$ to $P_1$, and we have $P_1 \sqsubseteq P_2$. Therefore, $\Pi(Q_1, V) \sqsubseteq \Pi(Q_2, V)$, and $Q_1 \sqsubseteq_V Q_2$.

In general, our problem is: given two queries $Q_1$ and $Q_2$ on source views $V$ with binding restrictions, how do we test whether $Q_1 \sqsubseteq_V Q_2$?

2.4 Relative containment is decidable

Since the programs $\Pi(Q_1, V)$ and $\Pi(Q_2, V)$ can be recursive [9, 21], the relative containment seems undecidable [30]. In this section we prove that relative containment is decidable using the results on monadic programs. A datalog program is monadic if its recursive IDB predicates are monadic (the nonrecursive predicates can have arbitrary arity). An IDB predicate is nonrecursive if the predicate is not on any cycle in the dependency graph of the program [31]. Cosmadakis et al. [7] showed that containment of monadic programs is decidable (see footnote 2).

Lemma 2.1 The program $\Pi(Q, V)$ of a query $Q$ on source descriptions $V$ can be translated into an equivalent monadic datalog program.

Proof: Consider the rule in $\Pi(Q, V)$ with the ans predicate as the head. Each $\alpha$-predicate $\hat{v}_i$ in its body can be substituted by the body of the corresponding $\alpha$-rule of $v_i$, with the necessary variable unification. After the substitutions, remove the $\alpha$-rules from $\Pi(Q, V)$; the new program computes the same ans facts as $\Pi(Q, V)$ for any database. (For instance, the program $\Pi(Q_1, V)$ in Figure 2 can be rewritten to the equivalent program in Figure 3.) The IDB predicates of the new program are the ans predicate and the domain predicates. The ans predicate is not recursive, since it appears only in the head of the connection rule. All other IDB predicates, i.e., the domain predicates, are monadic. Therefore, the new program is monadic.

By Lemma 2.1 and the results in [7], we have:

Theorem 2.1 Relative containment with binding restrictions is decidable.

Notice that Theorem 2.1 is correct even if the two queries have different initial bindings.
The reason is that we can incorporate their different initial bindings into their datalog programs by adding the corresponding fact rules, which are also monadic.

Footnote 2: The notion of nonrecursive predicates in [7] is slightly different from our definition above. In that paper nonrecursive predicates cannot depend on recursive predicates. That is, a predicate is nonrecursive if it either does not depend on another predicate, or it depends only on nonrecursive predicates. Thus, the program can be unfolded so that nonrecursive predicates do not depend on any other IDB predicate. However, its decidability result can be generalized to our stronger definition of nonrecursive predicates [34].

3 Testing program boundedness

[7] uses a complex algorithm involving automata theory to test containment of monadic programs. If one of the two programs in the test is bounded, the containment can be tested more efficiently using the algorithms in [4, 5, 29]. A datalog program is bounded if it is equivalent to a finite union of CQs. For instance, the programs for the two queries in Example 1.1 are both bounded, because each of them can be rewritten to an equivalent CQ. In this section, we study the following problem: given a query $Q$ on source views $V$ with binding restrictions, how do we test the boundedness of $\Pi(Q, V)$? [7] also gives an algorithm for testing boundedness of monadic programs using automata theory, although boundedness of datalog programs is undecidable in general [11]. We develop a polynomial-time algorithm for testing boundedness of the programs of a class of CQs, called connection queries.

3.1 Connection queries

Assume we have a set of global attributes, and different attributes are from different domains. Let $V$ be a set of views with binding restrictions. Each view schema is a subset of the global attributes. A connection query $Q$ is a natural join of a set of views (denoted $\mathcal{T}(Q)$) with selections on some attributes (called the input attributes of $Q$, denoted $\mathcal{I}(Q)$) and projections on some other attributes (called the output attributes of $Q$, denoted $\mathcal{O}(Q)$). The set of views $\mathcal{T}(Q)$ is also called the connection of $Q$. The user specifies values for the input attributes $\mathcal{I}(Q)$, and is interested in the values of the output attributes $\mathcal{O}(Q)$. (See [21] for details about connection queries.)

EXAMPLE 3.1 In Example 1.1, Studio, Movie, Star, Award, and Addr are global attributes from different domains. Each of the four view schemas is a subset of these attributes. Queries $Q_1$ and $Q_2$ are two connection queries. Query $Q_1$ has a connection $\mathcal{T}(Q_1) = \{v_1, v_2, v_4\}$, a set of input attributes $\mathcal{I}(Q_1) = \{Studio\}$, and a set of output attributes $\mathcal{O}(Q_1) = \{Addr\}$. Similarly, $\mathcal{T}(Q_2) = \{v_1, v_3, v_4\}$, $\mathcal{I}(Q_2) = \{Studio\}$, and $\mathcal{O}(Q_2) = \{Addr\}$. Suppose we have two views $R(A, B, C)$ and $S(B, C, D)$.
The following query

    $ans(C) \text{ :- } R(A, A, C),\ S(X, C, C)$

is not a connection query, since it is not a natural join of the two views.

3.2 Boundedness of connection queries

Given a connection query $Q$ on views $V$ with binding restrictions, we say $Q$ is bounded if the datalog program $\Pi(Q, V)$ is bounded. For instance, the two queries in Example 1.1 are both bounded. The following connection query is unbounded.

Figure 4: The source descriptions in Example 3.2 (a hypergraph over the attributes $A$, $B$, $C$, $D$, $E$ with the views and binding patterns $v_1(A, B)$: $fb$; $v_2(B, C)$: $bf$; $v_3(B, D)$: $fb$; $v_4(B, D)$: $bf$; $v_5(D, E)$: $ff$)

EXAMPLE 3.2 Consider the five source views in Figure 4, whose schemas are represented as a hypergraph. The five attributes have five different domains. Assume a user knows that the value of $A$ is $a$, and wants to get the $C$ values by joining the views $v_1$ and $v_2$. The following is the corresponding query $Q$:

    $ans(C) \text{ :- } v_1(a, B),\ v_2(B, C)$

That is, $\mathcal{T}(Q) = \{v_1, v_2\}$, $\mathcal{I}(Q) = \{A\}$, and $\mathcal{O}(Q) = \{C\}$. Figure 5 shows the program $\Pi(Q, V)$, which is unbounded. Intuitively, since the binding pattern of $v_3(B, D)$ is $fb$, and the binding pattern of $v_4(B, D)$ is $bf$, we can visit these two source views repeatedly to retrieve more $B$ bindings. Each new $B$ binding may participate in $v_1 \bowtie v_2$ and generate more answers to $Q$. (We will give a formal proof of the unboundedness in Section 3.5.)

    $r_1$: $ans(C) \text{ :- } \hat{v}_1(a, B),\ \hat{v}_2(B, C)$
    $r_2$: $\hat{v}_1(A, B) \text{ :- } domB(B),\ v_1(A, B)$
    $r_3$: $domA(A) \text{ :- } domB(B),\ v_1(A, B)$
    $r_4$: $\hat{v}_2(B, C) \text{ :- } domB(B),\ v_2(B, C)$
    $r_5$: $domC(C) \text{ :- } domB(B),\ v_2(B, C)$
    $r_6$: $\hat{v}_3(B, D) \text{ :- } domD(D),\ v_3(B, D)$
    $r_7$: $domB(B) \text{ :- } domD(D),\ v_3(B, D)$
    $r_8$: $\hat{v}_4(B, D) \text{ :- } domB(B),\ v_4(B, D)$
    $r_9$: $domD(D) \text{ :- } domB(B),\ v_4(B, D)$
    $r_{10}$: $\hat{v}_5(D, E) \text{ :- } v_5(D, E)$
    $r_{11}$: $domD(D) \text{ :- } v_5(D, E)$
    $r_{12}$: $domE(E) \text{ :- } v_5(D, E)$
    $r_{13}$: $domA(a) \text{ :- }$

Figure 5: The program $\Pi(Q, V)$ in Example 3.2.

We want to solve the following problem: given a connection query $Q$ on views $V$ with binding restrictions, how do we test the boundedness of $Q$?

3.3 Forward-closure and independent connections

We first review some definitions in [21]. Given a source view $v_i$, let $\mathcal{B}(v_i)$ and $\mathcal{F}(v_i)$ respectively denote the bound attributes and free attributes in the binding pattern of $v_i$. Let $\mathcal{A}(v_i) = \mathcal{B}(v_i) \cup \mathcal{F}(v_i)$ be all the attributes in $v_i$. Suppose $W$ is a set of source views in $V$; let $\mathcal{A}(W)$ denote the attributes in $W$. For instance, in Example 3.2, where $v_1$ has the binding pattern $fb$ (Figures 4 and 5), $\mathcal{B}(v_1) = \{B\}$, $\mathcal{F}(v_1) = \{A\}$, $\mathcal{A}(v_1) = \{A, B\}$, and $\mathcal{A}(\{v_1, v_2\}) = \{A, B, C\}$.
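The rules of Figure 5 follow mechanically from the bound and free attributes of each view, using the construction of Section 2.2. The following sketch generates them as strings. It is an illustration under stated assumptions: the function name `build_program` is hypothetical, `v1^` is an ASCII stand-in for $\hat{v}_1$, each attribute name doubles as the variable name in the per-view rules, and $v_1$ is taken to have the pattern $fb$ as in Figures 4 and 5.

```python
def build_program(views, query_head, query_body, facts):
    """Generate the rules of the program for a query as plain strings.

    views: name -> (attributes, binding pattern)
    query_body: list of (view name, argument tuple)
    facts: attribute -> constants known from the query
    """
    rules = []
    # Connection rule: every subgoal is replaced by the hatted predicate.
    body = ", ".join(f"{v}^({', '.join(args)})" for v, args in query_body)
    rules.append(f"{query_head} :- {body}")
    for name, (attrs, pattern) in views.items():
        bound = [a for a, c in zip(attrs, pattern) if c == "b"]
        free = [a for a, c in zip(attrs, pattern) if c == "f"]
        # Shared rule body: domain predicates of the bound attributes, then the view.
        guard = [f"dom{a}({a})" for a in bound] + [f"{name}({', '.join(attrs)})"]
        rules.append(f"{name}^({', '.join(attrs)}) :- {', '.join(guard)}")
        for a in free:  # one domain rule per free attribute
            rules.append(f"dom{a}({a}) :- {', '.join(guard)}")
    for attr, constants in facts.items():  # fact rules for the initial bindings
        for c in constants:
            rules.append(f"dom{attr}({c}) :-")
    return rules

EXAMPLE_3_2 = {
    "v1": (("A", "B"), "fb"),
    "v2": (("B", "C"), "bf"),
    "v3": (("B", "D"), "fb"),
    "v4": (("B", "D"), "bf"),
    "v5": (("D", "E"), "ff"),
}

rules = build_program(
    EXAMPLE_3_2,
    "ans(C)",
    [("v1", ("a", "B")), ("v2", ("B", "C"))],
    {"A": ["a"]},
)
for r in rules:
    print(r)
```

With these inputs the sketch emits thirteen rules in the same order as $r_1, \ldots, r_{13}$ in Figure 5.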
Given a set of source views $W \subseteq V$ and a set of attributes $X \subseteq \mathcal{A}(V)$, the forward-closure of $X$ given $W$, denoted $\text{f-closure}(X, W)$, is the set of source views in $W$ such that, starting from the attributes in $X$ as the initial bindings, the binding requirements of these source views are satisfied by using only the source views in $W$. For instance, in Example 3.2, $\text{f-closure}(\{A\}, \{v_1, v_2\}) = \emptyset$, and $\text{f-closure}(\{B\}, \{v_1, v_2\}) = \{v_1, v_2\}$.

Let $V_q = \text{f-closure}(\mathcal{I}(Q), V)$ be all the queryable source views, i.e., the source views that we may eventually query, starting with the initial bindings in $\mathcal{I}(Q)$, and perhaps using several preliminary queries to other sources in order to obtain the necessary bindings for these source views. The nonqueryable source views in $V - V_q$ can be ignored without changing the maximal answer to $Q$, since we cannot retrieve any tuples from them. If there is a nonqueryable view in $\mathcal{T}(Q)$, then the maximal answer to $Q$ is empty.

A query $Q$ is independent if its connection $\mathcal{T}(Q)$ satisfies $\text{f-closure}(\mathcal{I}(Q), \mathcal{T}(Q)) = \mathcal{T}(Q)$. For instance, the query $Q_2$ in Example 1.1 is independent, since $\text{f-closure}(\mathcal{I}(Q_2), \mathcal{T}(Q_2)) = \text{f-closure}(\{Studio\}, \{v_1, v_3, v_4\}) = \mathcal{T}(Q_2)$. Query $Q_1$ is not independent, since $\text{f-closure}(\mathcal{I}(Q_1), \mathcal{T}(Q_1)) = \text{f-closure}(\{Studio\}, \{v_1, v_2, v_4\}) = \{v_1\} \neq \mathcal{T}(Q_1)$. Similarly, the connection query in Example 3.2 is not independent.
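The forward-closure and the independence test can be computed by a simple fixpoint iteration. The sketch below is one straightforward way to do it, not necessarily the procedure of [21]; the function names and the dictionary encoding of views are assumptions of this illustration, and the binding patterns are those of Figure 1 and Figure 4.

```python
def f_closure(start_attrs, views):
    """Names of the views whose bound attributes can all be obtained,
    starting from start_attrs and using only the given views."""
    known = set(start_attrs)
    reached = set()
    changed = True
    while changed:
        changed = False
        for name, (attrs, pattern) in views.items():
            bound = {a for a, c in zip(attrs, pattern) if c == "b"}
            if name not in reached and bound <= known:
                reached.add(name)
                known |= set(attrs)  # a queryable view binds all its attributes
                changed = True
    return reached

def is_independent(inputs, connection, views):
    sub = {n: views[n] for n in connection}
    return f_closure(inputs, sub) == set(connection)

MOVIES = {  # Example 1.1
    "v1": (("Studio", "Movie"), "bf"),
    "v2": (("Movie", "Star", "Award"), "bbf"),
    "v3": (("Movie", "Star"), "bf"),
    "v4": (("Star", "Addr"), "bf"),
}
ABCDE = {  # Example 3.2
    "v1": (("A", "B"), "fb"),
    "v2": (("B", "C"), "bf"),
    "v3": (("B", "D"), "fb"),
    "v4": (("B", "D"), "bf"),
    "v5": (("D", "E"), "ff"),
}

print(is_independent({"Studio"}, ["v1", "v3", "v4"], MOVIES))  # Q2
print(is_independent({"Studio"}, ["v1", "v2", "v4"], MOVIES))  # Q1
print(sorted(f_closure({"A"}, ABCDE)))                         # queryable views
```

In Example 3.2 all five views turn out to be queryable, because $v_5$ has no bound attribute and supplies the first $D$ bindings.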

Theorem 3.1 If a connection query $Q$ on source views $V$ with binding restrictions is independent, then the program $\Pi(Q, V)$ is bounded.

Proof: See the appendix.

If a connection query $Q$ is not independent, then the program $\Pi(Q, V)$ may not be bounded, as shown in Example 3.2.

3.4 BF-chain, BF-loop, backward-closure, and kernel

A sequence of views $w_1, \ldots, w_k$ forms a BF-chain (bound-free chain) if for $i = 1, \ldots, k-1$, $\mathcal{F}(w_i) \cap \mathcal{B}(w_{i+1}) \neq \emptyset$. That is, for two adjacent views $w_i$ and $w_{i+1}$ in the BF-chain, $w_i$ can contribute some bindings to $w_{i+1}$. The source views $w_1$ and $w_k$ are the head and the tail of the BF-chain, respectively. A sequence of views forms a BF-loop if it forms a BF-chain, and the bound attributes of the head overlap with the free attributes of the tail (as shown in Figure 6). In Figure 4, $(v_3, v_4)$ forms a BF-loop, because $\mathcal{F}(v_3) \cap \mathcal{B}(v_4) = \{B\}$ and $\mathcal{F}(v_4) \cap \mathcal{B}(v_3) = \{D\}$.

Figure 6: A BF-loop (views $w_1, w_2, \ldots, w_n$ arranged in a cycle, the free attributes of each view feeding the bound attributes of the next)

Suppose $A$ is an attribute in the queryable views $V_q$, i.e., $A \in \mathcal{A}(V_q)$. The backward-closure of $A$, denoted $\text{b-closure}(A)$, is the set of queryable source views that can be backtracked from $A$ by following some BF-chain in reverse order, in which $A$ is a free attribute of the tail of the BF-chain. The backward-closure of a set of attributes $X \subseteq \mathcal{A}(V_q)$, denoted $\text{b-closure}(X)$, is the union of the backward-closures of the attributes in $X$, i.e., $\text{b-closure}(X) = \bigcup_{A \in X} \text{b-closure}(A)$. For instance, in Example 3.2, $\text{b-closure}(B) = \{v_1, v_3, v_4, v_5\}$, and $\text{b-closure}(\{B, C\}) = \{v_1, v_2, v_3, v_4, v_5\}$.

Definition 3.1 (BF-graph) The BF-graph of a set of source views $W$ is a directed graph in which each vertex corresponds to a view in $W$, and there is an edge from vertex $v_i$ to vertex $v_j$ if and only if $\mathcal{F}(v_i) \cap \mathcal{B}(v_j) \neq \emptyset$.

Intuitively, there is an edge from vertex $v_i$ to vertex $v_j$ if view $v_i$ can provide some bindings for view $v_j$.
Figures 7 and 8 show the BF-graphs of the source views in Example 1.1 and Example 3.2, respectively. For instance, in Figure 7, there is an edge from vertex $v_1$ to vertex $v_2$ because $\mathcal{F}(v_1) \cap \mathcal{B}(v_2) = \{Movie\} \neq \emptyset$. Clearly there is a BF-loop among a set of source views if and only if the BF-graph of these views is cyclic.
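Building the BF-graph and testing it for cycles is straightforward. The sketch below is an illustration with hypothetical function names: it constructs the edges from the bound and free attributes of each view and uses a depth-first search with three colors to detect a cycle. The patterns are those of Figure 1 and Figure 4, with $v_1$ of Example 3.2 taken as $fb$.

```python
def bf_graph(views):
    """Adjacency sets: edge v_i -> v_j iff a free attribute of v_i
    is a bound attribute of v_j."""
    def split(name):
        attrs, pattern = views[name]
        bound = {a for a, c in zip(attrs, pattern) if c == "b"}
        return bound, set(attrs) - bound
    return {i: {j for j in views if split(i)[1] & split(j)[0]} for i in views}

def has_bf_loop(graph):
    """Depth-first search; reaching a vertex that is still on the stack
    (grey) means the graph has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}

    def visit(v):
        color[v] = GREY
        for w in graph[v]:
            if color[w] == GREY or (color[w] == WHITE and visit(w)):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in graph)

MOVIES = {  # Example 1.1
    "v1": (("Studio", "Movie"), "bf"),
    "v2": (("Movie", "Star", "Award"), "bbf"),
    "v3": (("Movie", "Star"), "bf"),
    "v4": (("Star", "Addr"), "bf"),
}
ABCDE = {  # Example 3.2
    "v1": (("A", "B"), "fb"),
    "v2": (("B", "C"), "bf"),
    "v3": (("B", "D"), "fb"),
    "v4": (("B", "D"), "bf"),
    "v5": (("D", "E"), "ff"),
}

print(has_bf_loop(bf_graph(MOVIES)))  # no BF-loop among the movie sources
print(has_bf_loop(bf_graph(ABCDE)))   # v3 and v4 feed each other
```

The recursive search is adequate for graphs of this size; an iterative version would avoid Python's recursion limit on very long chains.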

Figure 7: BF-graph for Example 1.1

Figure 8: BF-graph for Example 3.2.

Definition 3.2 (kernel) Assume $Q$ is a connection query on source descriptions $V$. A set of attributes $K$ is a kernel of $Q$ if $\text{f-closure}(K \cup \mathcal{I}(Q), \mathcal{T}(Q)) = \mathcal{T}(Q)$, and removing any attribute $A$ from $K$ gives $\text{f-closure}((K - \{A\}) \cup \mathcal{I}(Q), \mathcal{T}(Q)) \neq \mathcal{T}(Q)$.

Intuitively, a kernel $K$ of a connection query $Q$ is a minimal set of attributes in $\mathcal{A}(Q)$ such that, if the attributes in $K$ have been bound, together with the initial bindings in $\mathcal{I}(Q)$, we can bind all the attributes $\mathcal{A}(Q)$ by using only the source views in $\mathcal{T}(Q)$. For instance, in Example 3.2, $\{B\}$ is the only kernel of the connection $\{v_1, v_2\}$. It is shown in [21] that a connection query may have multiple kernels, and all its kernels have the same backward-closure. In addition, we can compute the maximal answer to $Q$ using only the views in $\text{b-closure}(K) \cup \mathcal{T}(Q)$, in which $K$ is a kernel of $Q$.

3.5 Testing boundedness of connection queries

Theorem 3.2 If $Q$ is a connection query on source descriptions $V$, and all the source views in $\mathcal{T}(Q)$ are queryable, then $\Pi(Q, V)$ is bounded if and only if there is no BF-loop among the views in $\text{b-closure}(K)$, in which $K$ is a kernel of $Q$.

Proof: See the appendix. Intuitively, for the "only if" part, if there is a BF-loop among the views in $\text{b-closure}(K)$, we can populate the views in the loop, following the loop as many times as we wish, such that a tuple in the answer to the query can be retrieved only after a certain number $k$ of source accesses, and $k$ can be arbitrarily large. Thus $\Pi(Q, V)$ is unbounded.

EXAMPLE 3.3 Consider the two connection queries in Example 1.1. Query $Q_1$ has one kernel $\{Star\}$, whose backward-closure is $\{v_1, v_3\}$. Clearly there is no BF-loop in $\{v_1, v_3\}$, thus $\Pi(Q_1, V)$ is bounded. Similarly, query $Q_2$ has one kernel, and there is no BF-loop in its backward-closure, thus $\Pi(Q_2, V)$ is also bounded.
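The kernels named in these examples can be reproduced by exhaustive search over attribute sets, smallest first. This is a brute-force sketch for illustration, exponential in the number of attributes, and not the kernel computation of [21]. The helper `f_closure` and the function name `kernels` are assumptions of this sketch; because forward-closure is monotone in its starting attributes, skipping any candidate that contains an already found kernel is enough to enforce minimality.

```python
from itertools import combinations

def f_closure(start_attrs, views):
    """Views whose bound attributes are all obtainable from start_attrs."""
    known, reached, changed = set(start_attrs), set(), True
    while changed:
        changed = False
        for name, (attrs, pattern) in views.items():
            bound = {a for a, c in zip(attrs, pattern) if c == "b"}
            if name not in reached and bound <= known:
                reached.add(name)
                known |= set(attrs)
                changed = True
    return reached

def kernels(inputs, connection, views):
    """All minimal attribute sets K with f-closure(K + inputs) = connection."""
    sub = {n: views[n] for n in connection}
    attrs = sorted({a for at, _ in sub.values() for a in at} - set(inputs))
    found = []
    for size in range(len(attrs) + 1):
        for cand in combinations(attrs, size):
            k = set(cand)
            if any(prev <= k for prev in found):
                continue  # contains a smaller kernel, so it is not minimal
            if f_closure(k | set(inputs), sub) == set(connection):
                found.append(k)
    return found

MOVIES = {  # Example 1.1
    "v1": (("Studio", "Movie"), "bf"),
    "v2": (("Movie", "Star", "Award"), "bbf"),
    "v3": (("Movie", "Star"), "bf"),
    "v4": (("Star", "Addr"), "bf"),
}
ABCDE = {  # Example 3.2, with v1 taken as fb (Figures 4 and 5)
    "v1": (("A", "B"), "fb"),
    "v2": (("B", "C"), "bf"),
}

print(kernels({"Studio"}, ["v1", "v2", "v4"], MOVIES))  # Q1
print(kernels({"Studio"}, ["v1", "v3", "v4"], MOVIES))  # Q2 (independent)
print(kernels({"A"}, ["v1", "v2"], ABCDE))
```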
In Example 3.2, $\{B\}$ is the only kernel of the connection $\{v_1, v_2\}$, and the backward-closure of $\{B\}$ is $\{v_1, v_3, v_4, v_5\}$. Since there is a BF-loop, $(v_3, v_4)$, among these four views, by Theorem 3.2, this connection query is unbounded.

If a query $Q$ is independent, it has only one kernel, the empty set. Thus the backward-closure of this kernel is empty, and there is no BF-loop in the backward-closure. By Theorem 3.2, $\Pi(Q, V)$ is bounded, which is consistent with Theorem 3.1. Based on Theorem 3.2, we give an algorithm called TestBoundedness for testing boundedness of connection queries, as shown in Figure 9.

Algorithm TestBoundedness: Test boundedness of connection queries
Input: $V$: source views with binding restrictions. $Q$: a connection query on $V$.
Output: Decision about the boundedness of $\Pi(Q, V)$.
Method:
(1) Compute the queryable views $V_q = \text{f-closure}(\mathcal{I}(Q), V)$;
(2) If there is a view $v \in \mathcal{T}(Q)$ that is not in $V_q$, then $\Pi(Q, V)$ is bounded; return;
(3) Compute a kernel $K$ of $Q$;
(4) Compute $\text{b-closure}(K)$;
(5) Build the BF-graph $G$ of $\text{b-closure}(K)$;
(6) Test the acyclicity of $G$. If $G$ is acyclic, then $\Pi(Q, V)$ is bounded; otherwise, $\Pi(Q, V)$ is unbounded.

Figure 9: Testing the boundedness of a connection

Let us analyze the complexity of the algorithm TestBoundedness. Assume $V$ has $n$ views, $\mathcal{T}(Q)$ has $m$ views and $k$ attributes, and $\text{b-closure}(K)$ has $p$ views. [21] gives the details of how steps 1 to 4 are executed in $O(kn^2)$ time. We can test the cyclicity of the BF-graph $G$ using a depth-first search algorithm for directed graphs, as described in [1]. The complexity of deciding the cyclicity of a directed graph $G(V, E)$ is $O(|E|)$, where $E$ is the set of edges in the graph. There are at most $p^2$ edges in the BF-graph, so step 5 can be done in $O(p^2)$ time, including the time to build the adjacency structure for each vertex, as described in [1]. Step 6 can be done in $O(p^2)$ time using a depth-first search algorithm. Since $\text{b-closure}(K)$ contains only views of $V$, we have $p \le n$, and the complexity of the algorithm TestBoundedness is:

    $O(kn^2) + O(p^2) + O(p^2) = O(kn^2)$

4 Extend the decidability result to the source-centric approach

In this section we extend the decidability result for the query-centric approach in Section 2 to the source-centric approach to information integration [8]. ([32] is a good survey of the differences between these two approaches.)
4.1 Notation in the source-centric approach

Let $Q$ be a CQ, and $V$ be a set of conjunctive source views with binding restrictions. Both $Q$ and $V$ are defined on some global predicates. A rewriting of a query $Q$ relative to $V$ is a datalog program $P$, such that the EDB predicates in $P$ are the views in $V$, and the expansion of $P$ is contained in the query $Q$. The expansion of $P$, denoted $P^{exp}$, is obtained from $P$ by replacing all source-view literals by their definitions. Existentially quantified variables in a source view are replaced by fresh variables in the expansion. The following example is borrowed from [9].

EXAMPLE 4.1 Assume parent, male, and female are three global predicates. The following two views $v_1$ and $v_2$ store the father and mother relation, respectively.

    $v_1(X, Y) \text{ :- } parent(X, Y),\ male(X)$
    $v_2(X, Y) \text{ :- } parent(X, Y),\ female(X)$

The following query asks for the grandparents of smith:

    $ans(X) \text{ :- } parent(X, Z),\ parent(Z, smith)$

The following is a rewriting of the query:

    $ans(X) \text{ :- } v_1(X, Z),\ v_1(Z, smith)$
    $ans(X) \text{ :- } v_1(X, Z),\ v_2(Z, smith)$
    $ans(X) \text{ :- } v_2(X, Z),\ v_1(Z, smith)$
    $ans(X) \text{ :- } v_2(X, Z),\ v_2(Z, smith)$

A rewriting of a query $Q$ relative to a set of views $V$ is the maximally-contained rewriting of $Q$ relative to $V$ if it is not contained in any other rewriting of $Q$ relative to $V$. Let $P_1$ and $P_2$ be the maximally-contained rewritings of $Q_1$ and $Q_2$ relative to $V$, respectively. Query $Q_1$ is contained in $Q_2$ relative to $V$, denoted $Q_1 \sqsubseteq_V Q_2$, if for any database of the source views $V$, the answer computed by $P_1$ is a subset of that computed by $P_2$, i.e., $P_1 \sqsubseteq P_2$ [25].

4.2 Decidability result

Given a set $V$ of views with binding restrictions and two CQs $Q_1$ and $Q_2$, our goal is to test whether $Q_1 \sqsubseteq_V Q_2$. We prove that this containment is decidable by showing that the maximally-contained rewriting of a query is also a monadic program. We give the proof in two steps:

1. If we do not consider the binding restrictions, then the maximally-contained rewriting is inherently nonrecursive, i.e., it is equivalent to a finite union of CQs.

2. Then we consider the binding restrictions by adding monadic rules, thus the final maximally-contained rewriting is monadic.

In the first step, we want to know whether we should consider recursive datalog programs to find the maximally-contained rewriting of a CQ. [19] shows how to obtain an equivalent rewriting of a CQ in the space of unions of CQs. [8] shows how to get the maximally-contained rewriting of a query in the space of datalog programs, but does not show whether the rewriting is bounded or not. The following lemma shows that we do not need to consider recursive datalog programs to find the maximally-contained rewriting of a CQ, since the maximally-contained rewriting is equivalent to a finite union of CQs.
Lemma 4.1 In the source-centric approach, if the source views are conjunctive without binding restrictions, then the maximally-contained rewriting of a CQ is a bounded datalog program. □

Proof: Suppose $Q$ is a CQ, and $\mathcal{V}$ is a set of conjunctive views without binding restrictions. Using the inverse-rule algorithm in [8], we can obtain a maximally-contained rewriting $P_{DL}$, which is a datalog program. We now prove that $P_{DL}$ is inherently bounded. $P_{DL}$ can be expanded into a (possibly infinite) union of CQs. Each CQ $C_i$ in this union uses only views in $\mathcal{V}$, and its expansion $C_i^{exp}$ is contained in $Q$. It is shown in [33] that there must be a conjunctive rewriting $C'_i$ of $Q$ such that $C'_i$ has no more subgoals than $Q$, and $C_i \sqsubseteq C'_i$. Since there are only finitely many conjunctive rewritings of $Q$ with no more subgoals than $Q$, we can find a finite union $P_{UCQ}$ of CQs as a rewriting of $Q$, with $P_{DL} \sqsubseteq P_{UCQ}$. Since $P_{DL}$ is the maximally-contained rewriting, we also have $P_{UCQ} \sqsubseteq P_{DL}$. Therefore $P_{DL} = P_{UCQ}$; that is, the maximally-contained rewriting of $Q$ in the space of datalog programs is a finite union of CQs.

Lemma 4.2 In the source-centric approach, if the source views are conjunctive with binding restrictions, then the maximally-contained rewriting of a CQ relative to the views is a monadic datalog program. □

Proof: Let $P$ be the maximally-contained rewriting of a CQ $Q$ relative to $\mathcal{V}$ without binding restrictions. By Lemma 4.1, $P$ is equivalent to a finite union of CQs, in which each EDB predicate is a view in $\mathcal{V}$. [9] shows how to construct the maximally-contained rewriting of $Q$ relative to $\mathcal{V}$ with binding restrictions in two steps: (1) Add a set of domain rules domain($\mathcal{V}$, $Q$).$^3$ (2) For each rule $r$ in $P$, insert a subgoal dom($X$) before each subgoal $g$ in $r$ that has a variable $X$ in an argument position that is required to be bound, where $X$ does not appear in the subgoals to the left of $g$ in the body. In the new program, the only recursive predicates are the domain predicates, which are monadic. Therefore, the program is monadic.

By Lemma 4.2 and the results in [7], we have:

Theorem 4.1 In the source-centric approach, query containment relative to views with binding restrictions is decidable. □

Notice that the decidability result holds even if the two queries have different initial bindings, since we can incorporate their initial bindings into their corresponding maximally-contained rewritings by adding the necessary monadic rules.

5 Conclusion

In this paper we solved the following problem: given views with access pattern limitations, how to test whether the maximal answer to a conjunctive query (CQ) is contained in that to another CQ? We proved that the problem is decidable using the results on monadic programs. We gave the decidability results for both the source-centric approach and the query-centric approach to information integration.
We also developed a polynomial-time algorithm for testing boundedness of these programs, and showed that when a program in the containment test is bounded, we can perform the test efficiently.

Acknowledgments: We thank Jeff Ullman for his valuable comments and many discussions on this material. We thank Anand Rajaraman for helpful discussions on Lemma 4.1. We also thank Rada Chirkova for her helpful comments on this material.

$^3$ The domain rules in [8] are slightly different from the domain rules in Section 2.

References

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. Data Structures and Algorithms. Addison-Wesley Publishing Company.
[2] C. Beeri and R. Ramakrishnan. On the power of magic. In Proc. of ACM Symposium on Principles of Database Systems (PODS), pages 269-283.
[3] T. Catarci. Web-based information access. IFCIS International Conference on Cooperative Information Systems (CoopIS), pages 10-19.
[4] A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in relational data bases. STOC, pages 77-90.
[5] S. Chaudhuri and M. Y. Vardi. On the equivalence of recursive and nonrecursive datalog programs. In Proc. of ACM Symposium on Principles of Database Systems (PODS), pages 55-66.
[6] Cinemachine.
[7] S. S. Cosmadakis, H. Gaifman, P. C. Kanellakis, and M. Y. Vardi. Decidable optimization problems for database logic programs. ACM Symposium on Theory of Computing (STOC), pages 477-490.
[8] O. M. Duschka. Query planning and optimization in information integration. Ph.D. Thesis, Computer Science Dept., Stanford Univ.
[9] O. M. Duschka and A. Y. Levy. Recursive plans for information gathering. Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, IJCAI-97.
[10] D. Florescu, A. Levy, I. Manolescu, and D. Suciu. Query optimization in the presence of limited access patterns. In Proc. of ACM SIGMOD, pages 311-322.
[11] H. Gaifman, H. G. Mairson, Y. Sagiv, and M. Y. Vardi. Undecidable optimization problems for database logic programs. Journal of the ACM, pages 683-713.
[12] L. M. Haas, D. Kossmann, E. L. Wimmers, and J. Yang. Optimizing queries across diverse data sources. In Proc. of VLDB, pages 276-285.
[13] IMDB. The Internet Movie Database Ltd. Search Engine.
[14] Z. Ives, D. Florescu, M. Friedman, A. Levy, and D. Weld. An adaptive query execution engine for data integration. In Proc. of ACM SIGMOD, pages 299-310.
[15] V. Josifovski and T. Risch. Integrating heterogenous overlapping databases through object-oriented transformations. In Proc. of VLDB, pages 435-446.
[16] Z. Kedad and M. Bouzeghoub. Discovering view expressions from a multi-source information system. IFCIS International Conference on Cooperative Information Systems (CoopIS), pages 57-68.
[17] S. Kerr, A. Gal, and J. Mylopoulos. Information services for the web: Building and maintaining domain models. IFCIS International Conference on Cooperative Information Systems (CoopIS), pages 4-13.
[18] A. Y. Levy. Answering queries using views: A survey. In
[19] A. Y. Levy, A. O. Mendelzon, Y. Sagiv, and D. Srivastava. Answering queries using views. In Proc. of ACM Symposium on Principles of Database Systems (PODS), pages 95-104.
[20] C. Li. Computing complete answers to queries in the presence of limited access patterns (extended version). Technical report, Computer Science Dept., Stanford Univ.
[21] C. Li and E. Chang. Query planning with limited source capabilities. International Conference on Data Engineering (ICDE), pages 401-412.
[22] C. Li, R. Yerneni, V. Vassalos, H. Garcia-Molina, Y. Papakonstantinou, J. D. Ullman, and M. Valiveti. Capability based mediation in TSIMMIS. In Proc. of ACM SIGMOD, pages 564-566.
[23] L. Liu, C. Pu, and W. Han. XWRAP: An XML-enabled wrapper construction system for web information sources. International Conference on Data Engineering (ICDE), pages 611-621.
[24] R. J. Miller. Using schematically heterogeneous structures. In Proc. of ACM SIGMOD, pages 189-200.
[25] T. Millstein, A. Levy, and M. Friedman. Query containment for data integration systems. In Proc. of ACM Symposium on Principles of Database Systems (PODS).
[26] T. Milo and S. Zohar. Using schema matching to simplify heterogeneous data translation. In Proc. of VLDB, pages 122-133.
[27] J. F. Naughton and Y. Sagiv. A decidable class of bounded recursions. In Proc. of ACM Symposium on Principles of Database Systems (PODS), pages 227-236. ACM.
[28] A. Rajaraman, Y. Sagiv, and J. D. Ullman. Answering queries using templates with binding patterns. In Proc. of ACM Symposium on Principles of Database Systems (PODS), pages 105-112.
[29] Y. Sagiv and M. Yannakakis. Equivalences among relational expressions with the union and difference operators. Journal of the ACM, 27(4):633-655.
[30] O. Shmueli. Equivalence of datalog queries is undecidable. Journal of Logic Programming, 15(3):231-241.
[31] J. D. Ullman. Principles of Database and Knowledge-base Systems, Volume II: The New Technologies. Computer Science Press, New York.
[32] J. D. Ullman. Information integration using logical views. International Conference on Database Theory (ICDT), pages 19-40.
[33] J. D. Ullman. Lecture notes on principles of database systems.
[34] M. Vardi. Personal communication.
[35] R. Yerneni, C. Li, J. D. Ullman, and H. Garcia-Molina. Optimizing large join queries in mediation systems. International Conference on Database Theory (ICDT), pages 348-364.

A Appendix

A.1 Proof of Theorem 3.1

Proof: Assume $T(Q) = \{w_1, \ldots, w_k\}$ is the connection in the query $Q$ on source descriptions $\mathcal{V}$. Since f-closure$(I(Q), T(Q)) = T(Q)$, there exists a sequence of the views in the connection $T(Q)$, say $w_{i_1}, \ldots, w_{i_k}$, that satisfies: (i) $B(w_{i_1}) \subseteq I(Q)$; (ii) for $j = 2, \ldots, k$, $B(w_{i_j}) \subseteq I(Q) \cup A(w_{i_1}) \cup \cdots \cup A(w_{i_{j-1}})$. For any database of $\mathcal{V}$, we can compute the maximal answer to $Q$ as follows. Compute the corresponding sequence of $n$ supplementary relations [2, 31] $I_1, \ldots, I_n$, where $I_i$ is the supplementary relation after the first $i$ subgoals have been processed. The supplementary relation $I_n$ is the answer to query $Q$. Therefore, we can compute the answer to the query using $n + 1$ applications of the rules in $\Pi(Q, \mathcal{V})$ (the last application is to evaluate the connection rule).

A.2 Proof of Theorem 3.2

Proof: If: Assume $T(Q) = \{w_1, \ldots, w_n\}$, and b-closure$(K) = \{v_1, \ldots, v_k\}$. Since there is no BF-loop among the views in b-closure$(K)$, there exists a BF-chain in b-closure$(K)$ with distinct views $v_{i_1}, \ldots, v_{i_k}$, such that the free attributes of each view $v_{i_j}$ do not overlap with the bound attributes of any previous source view. Starting with the initial bindings in $Q$ and following the sequence, we use the views in this sequence to send source queries and retrieve all the possible bindings for the attributes in $K$. With these bindings and the initial bindings in $I(Q)$, there exists a sequence of the views in $T(Q)$, say $w_{l_1}, \ldots, w_{l_n}$, such that the binding requirements of each view in the sequence can be satisfied by previous subgoals. We follow this sequence to send source queries, collect tuples from the sources in the connection, and evaluate the connection rule in $\Pi(Q, \mathcal{V})$ to compute the maximal answer to $Q$. Therefore, we can evaluate the rules in a finite number of steps to compute the maximal answer to $Q$, and the number is independent of the source relations. Thus $\Pi(Q, \mathcal{V})$ is bounded.

Only If: If there is a BF-loop among the views in b-closure$(K)$, we prove that $\Pi(Q, \mathcal{V})$ is unbounded by showing that for any integer $k > 0$, there exists some database such that only after $k$ applications of the rules in $\Pi(Q, \mathcal{V})$ can we compute a tuple in the maximal answer to $Q$. Since there is a BF-loop among b-closure$(K)$, there exists an attribute $A$ in $K$ such that there is a BF-loop among b-closure$(A)$. For any integer $k > 0$, there is a BF-chain $v_1, \ldots, v_k$ of length $k$ such that $A \in F(v_k)$. We can add tuples to the relations on the BF-chain such that only by following the BF-chain can we retrieve a tuple in the answer to $Q$. In other words, we populate the relations in a BF-loop of the views in b-closure$(K)$ along the loop as many times as we want. By the way the database is constructed, we can only compute a tuple in the answer to $Q$ after $k$ applications of the rules in $\Pi(Q, \mathcal{V})$. Thus $\Pi(Q, \mathcal{V})$ is unbounded.
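The ordering conditions (i) and (ii) in the proof of Theorem 3.1, which the "If" part of the proof of Theorem 3.2 relies on again for the views in the connection, can be satisfied greedily: repeatedly pick any view whose bound attributes are already covered, then add its attributes to the covered set. The Python sketch below illustrates this under the assumption that $B(w)$ is the set of attributes of view $w$ that must be bound and $A(w)$ is the set of all attributes of $w$. The function name `executable_order`, the dictionary layout, and the three sample views are hypothetical and do not come from the paper.

```python
def executable_order(initial_bound, views):
    """Return the views in an order satisfying conditions (i) and (ii),
    or None if no such order exists.

    initial_bound: iterable of attribute names, playing the role of I(Q).
    views: dict mapping a view name to (bound_attrs, all_attrs),
           playing the roles of B(w) and A(w) (assumed meaning).
    """
    known = set(initial_bound)
    remaining = dict(views)
    order = []
    progress = True
    while remaining and progress:
        progress = False
        # sorted() iterates over a copy, so deleting from remaining is safe.
        for name in sorted(remaining):
            bound, attrs = remaining[name]
            if set(bound) <= known:
                # Every binding requirement of this view is already met.
                known |= set(attrs)
                order.append(name)
                del remaining[name]
                progress = True
                break
    return order if not remaining else None

# Hypothetical views: w1(C, D) needs C bound, w2(A, B) needs A bound,
# and w3(B, C) needs B bound.
SAMPLE_VIEWS = {
    "w1": ({"C"}, {"C", "D"}),
    "w2": ({"A"}, {"A", "B"}),
    "w3": ({"B"}, {"B", "C"}),
}

if __name__ == "__main__":
    print(executable_order({"A"}, SAMPLE_VIEWS))
    print(executable_order(set(), SAMPLE_VIEWS))
```

Because the covered set only grows, a view that is executable at some point stays executable, so the greedy choice never rules out a valid order. With the initial binding on A the sample views are ordered w2, w3, w1; with no initial bindings no view can be queried and the function returns None.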

Answering Queries with Useful Bindings

Answering Queries with Useful Bindings Answering Queries with Useful Bindings CHEN LI University of California at Irvine and EDWARD CHANG University of California, Santa Barbara In information-integration systems, sources may have diverse and

More information

Access Patterns (Extended Version) Chen Li. Department of Computer Science, Stanford University, CA Abstract

Access Patterns (Extended Version) Chen Li. Department of Computer Science, Stanford University, CA Abstract Computing Complete Answers to Queries in the Presence of Limited Access Patterns (Extended Version) Chen Li Department of Computer Science, Stanford University, CA 94305 chenli@db.stanford.edu Abstract

More information

Designing Views to Answer Queries under Set, Bag,and BagSet Semantics

Designing Views to Answer Queries under Set, Bag,and BagSet Semantics Designing Views to Answer Queries under Set, Bag,and BagSet Semantics Rada Chirkova Department of Computer Science, North Carolina State University Raleigh, NC 27695-7535 chirkova@csc.ncsu.edu Foto Afrati

More information

Query Containment for Data Integration Systems

Query Containment for Data Integration Systems Query Containment for Data Integration Systems Todd Millstein University of Washington Seattle, Washington todd@cs.washington.edu Alon Levy University of Washington Seattle, Washington alon@cs.washington.edu

More information

Computing Complete Answers to Queries in the Presence of Limited Access Patterns (Revision)

Computing Complete Answers to Queries in the Presence of Limited Access Patterns (Revision) Computing Complete Answers to Queries in the Presence of Limited Access Patterns (Revision) Chen Li Department of Information and Computer Science University of California at Irvine, CA 92697-3425 chenli@ics.uci.edu

More information

(i.e., produced only a subset of the possible answers). We describe the novel class

(i.e., produced only a subset of the possible answers). We describe the novel class Recursive Query Plans for Data Integration Oliver M. Duschka Michael R. Genesereth Department of Computer Science, Stanford University, Stanford, CA 94305, USA Alon Y. Levy 1 Department of Computer Science

More information

Systems. Ramana Yerneni, Chen Li. fyerneni, chenli, ullman, Stanford University, USA

Systems. Ramana Yerneni, Chen Li. fyerneni, chenli, ullman, Stanford University, USA Optimizing Large Join Queries in Mediation Systems Ramana Yerneni, Chen Li Jerey Ullman, Hector Garcia-Molina fyerneni, chenli, ullman, hectorg@cs.stanford.edu, Stanford University, USA Abstract. In data

More information

Conjunctive queries. Many computational problems are much easier for conjunctive queries than for general first-order queries.

Conjunctive queries. Many computational problems are much easier for conjunctive queries than for general first-order queries. Conjunctive queries Relational calculus queries without negation and disjunction. Conjunctive queries have a normal form: ( y 1 ) ( y n )(p 1 (x 1,..., x m, y 1,..., y n ) p k (x 1,..., x m, y 1,..., y

More information

Finding Equivalent Rewritings in the Presence of Arithmetic Comparisons

Finding Equivalent Rewritings in the Presence of Arithmetic Comparisons Finding Equivalent Rewritings in the Presence of Arithmetic Comparisons Foto Afrati 1, Rada Chirkova 2, Manolis Gergatsoulis 3, and Vassia Pavlaki 1 1 Department of Electrical and Computing Engineering,

More information

I. Khalil Ibrahim, V. Dignum, W. Winiwarter, E. Weippl, Logic Based Approach to Semantic Query Transformation for Knowledge Management Applications,

I. Khalil Ibrahim, V. Dignum, W. Winiwarter, E. Weippl, Logic Based Approach to Semantic Query Transformation for Knowledge Management Applications, I. Khalil Ibrahim, V. Dignum, W. Winiwarter, E. Weippl, Logic Based Approach to Semantic Query Transformation for Knowledge Management Applications, Proc. of the International Conference on Knowledge Management

More information

OPTIMIZING RECURSIVE INFORMATION GATHERING PLANS. Eric M. Lambrecht

OPTIMIZING RECURSIVE INFORMATION GATHERING PLANS. Eric M. Lambrecht OPTIMIZING RECURSIVE INFORMATION GATHERING PLANS by Eric M. Lambrecht A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science ARIZONA STATE UNIVERSITY December 1998

More information

Schemas for Integration and Translation of. Structured and Semi-Structured Data?

Schemas for Integration and Translation of. Structured and Semi-Structured Data? Schemas for Integration and Translation of Structured and Semi-Structured Data? Catriel Beeri 1 and Tova Milo 2 1 Hebrew University beeri@cs.huji.ac.il 2 Tel Aviv University milo@math.tau.ac.il 1 Introduction

More information

Datalog Evaluation. Linh Anh Nguyen. Institute of Informatics University of Warsaw

Datalog Evaluation. Linh Anh Nguyen. Institute of Informatics University of Warsaw Datalog Evaluation Linh Anh Nguyen Institute of Informatics University of Warsaw Outline Simple Evaluation Methods Query-Subquery Recursive Magic-Set Technique Query-Subquery Nets [2/64] Linh Anh Nguyen

More information

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture

More information

Expressive capabilities description languages and query rewriting algorithms q

Expressive capabilities description languages and query rewriting algorithms q The Journal of Logic Programming 43 (2000) 75±122 www.elsevier.com/locate/jlpr Expressive capabilities description languages and query rewriting algorithms q Vasilis Vassalos a, *, Yannis Papakonstantinou

More information

following syntax: R ::= > n j P j $i=n : C j :R j R 1 u R 2 C ::= > 1 j A j :C j C 1 u C 2 j 9[$i]R j (» k [$i]r) where i and j denote components of r

following syntax: R ::= > n j P j $i=n : C j :R j R 1 u R 2 C ::= > 1 j A j :C j C 1 u C 2 j 9[$i]R j (» k [$i]r) where i and j denote components of r Answering Queries Using Views in Description Logics Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini Dipartimento di Informatica e Sistemistica, Universit a di Roma La Sapienza" Via Salaria 113,

More information

DATABASE THEORY. Lecture 12: Evaluation of Datalog (2) TU Dresden, 30 June Markus Krötzsch

DATABASE THEORY. Lecture 12: Evaluation of Datalog (2) TU Dresden, 30 June Markus Krötzsch DATABASE THEORY Lecture 12: Evaluation of Datalog (2) Markus Krötzsch TU Dresden, 30 June 2016 Overview 1. Introduction Relational data model 2. First-order queries 3. Complexity of query answering 4.

More information

Encyclopedia of Database Systems, Editors-in-chief: Özsu, M. Tamer; Liu, Ling, Springer, MAINTENANCE OF RECURSIVE VIEWS. Suzanne W.

Encyclopedia of Database Systems, Editors-in-chief: Özsu, M. Tamer; Liu, Ling, Springer, MAINTENANCE OF RECURSIVE VIEWS. Suzanne W. Encyclopedia of Database Systems, Editors-in-chief: Özsu, M. Tamer; Liu, Ling, Springer, 2009. MAINTENANCE OF RECURSIVE VIEWS Suzanne W. Dietrich Arizona State University http://www.public.asu.edu/~dietrich

More information

10.3 Recursive Programming in Datalog. While relational algebra can express many useful operations on relations, there

10.3 Recursive Programming in Datalog. While relational algebra can express many useful operations on relations, there 1 10.3 Recursive Programming in Datalog While relational algebra can express many useful operations on relations, there are some computations that cannot be written as an expression of relational algebra.

More information

DATABASE THEORY. Lecture 15: Datalog Evaluation (2) TU Dresden, 26th June Markus Krötzsch Knowledge-Based Systems

DATABASE THEORY. Lecture 15: Datalog Evaluation (2) TU Dresden, 26th June Markus Krötzsch Knowledge-Based Systems DATABASE THEORY Lecture 15: Datalog Evaluation (2) Markus Krötzsch Knowledge-Based Systems TU Dresden, 26th June 2018 Review: Datalog Evaluation A rule-based recursive query language father(alice, bob)

More information

Lecture 1: Conjunctive Queries

Lecture 1: Conjunctive Queries CS 784: Foundations of Data Management Spring 2017 Instructor: Paris Koutris Lecture 1: Conjunctive Queries A database schema R is a set of relations: we will typically use the symbols R, S, T,... to denote

More information

Query Rewriting Using Views in the Presence of Inclusion Dependencies

Query Rewriting Using Views in the Presence of Inclusion Dependencies Query Rewriting Using Views in the Presence of Inclusion Dependencies Qingyuan Bai Jun Hong Michael F. McTear School of Computing and Mathematics, University of Ulster at Jordanstown, Newtownabbey, Co.

More information

Integrating Information by Outerjoins. Dept. of Computer Science. Stanford University ABSTRACT

Integrating Information by Outerjoins. Dept. of Computer Science. Stanford University ABSTRACT Integrating Information by Outerjoins and Full Disjunctions 1 Anand Rajaraman Jerey D. Ullman Dept. of Computer Science Stanford University ABSTRACT Our motivation is the piecing together tidbits of information

More information

A Framework for Ontology Integration

A Framework for Ontology Integration A Framework for Ontology Integration Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini Dipartimento di Informatica e Sistemistica Università di Roma La Sapienza Via Salaria 113, 00198 Roma, Italy

More information

Lecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy

Lecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy CompSci 516 ata Intensive Computing Systems Lecture 22 Acyclic Joins and Worst Case Join Results Instructor: Sudeepa Roy uke CS, Fall 2016 CompSci 516: ata Intensive Computing Systems Announcements Final

More information

Describing and Utilizing Constraints to Answer Queries in Data-Integration Systems

Describing and Utilizing Constraints to Answer Queries in Data-Integration Systems Describing and Utilizing Constraints to Answer Queries in Data-Integration Systems Chen Li Information and Computer Science University of California, Irvine, CA 92697 chenli@ics.uci.edu Abstract In data-integration

More information

DATABASE THEORY. Lecture 11: Introduction to Datalog. TU Dresden, 12th June Markus Krötzsch Knowledge-Based Systems

DATABASE THEORY. Lecture 11: Introduction to Datalog. TU Dresden, 12th June Markus Krötzsch Knowledge-Based Systems DATABASE THEORY Lecture 11: Introduction to Datalog Markus Krötzsch Knowledge-Based Systems TU Dresden, 12th June 2018 Announcement All lectures and the exercise on 19 June 2018 will be in room APB 1004

More information

Stanford Warren Ascherman Professor of Engineering, Emeritus Computer Science

Stanford Warren Ascherman Professor of Engineering, Emeritus Computer Science Stanford Warren Ascherman Professor of Engineering, Emeritus Computer Science Bio ACADEMIC APPOINTMENTS Emeritus Faculty, Acad Council, Computer Science Teaching COURSES 2016-17 Mining Massive Data Sets:

More information

Database Theory: Beyond FO

Database Theory: Beyond FO Database Theory: Beyond FO CS 645 Feb 11, 2010 Some slide content based on materials of Dan Suciu, Ullman/Widom 1 TODAY: Coming lectures Limited expressiveness of FO Adding recursion (Datalog) Expressiveness

More information

SORT INFERENCE \coregular" signatures, they derive an algorithm for computing a most general typing for expressions e which is only slightly more comp

SORT INFERENCE \coregular signatures, they derive an algorithm for computing a most general typing for expressions e which is only slightly more comp Haskell Overloading is DEXPTIME{complete Helmut Seidl Fachbereich Informatik Universitat des Saarlandes Postfach 151150 D{66041 Saarbrucken Germany seidl@cs.uni-sb.de Febr., 1994 Keywords: Haskell type

More information

Finding a winning strategy in variations of Kayles

Finding a winning strategy in variations of Kayles Finding a winning strategy in variations of Kayles Simon Prins ICA-3582809 Utrecht University, The Netherlands July 15, 2015 Abstract Kayles is a two player game played on a graph. The game can be dened

More information

Rewriting Ontology-Mediated Queries. Carsten Lutz University of Bremen

Rewriting Ontology-Mediated Queries. Carsten Lutz University of Bremen Rewriting Ontology-Mediated Queries Carsten Lutz University of Bremen Data Access and Ontologies Today, data is often highly incomplete and very heterogeneous Examples include web data and large-scale

More information

August 1998 Answering Queries by Semantic Caches Godfrey & Gryz p. 1 of 15. Answering Queries by Semantic Caches. Abstract

August 1998 Answering Queries by Semantic Caches Godfrey & Gryz p. 1 of 15. Answering Queries by Semantic Caches. Abstract August 1998 Answering Queries by Semantic Caches Godfrey & Gryz p. 1 of 15 Answering Queries by Semantic Caches Parke Godfrey godfrey@cs.umd.edu Department of Computer Science University of Maryland College

More information

Using Views to Generate Efficient Evaluation Plans for Queries

Using Views to Generate Efficient Evaluation Plans for Queries Using Views to Generate Efficient Evaluation Plans for Queries Foto N. Afrati a,chenli b, and Jeffrey D. Ullman c a School of Electrical and Computing Engineering, National Technical University of Athens,

More information

has phone Phone Person Person degree Degree isa isa has addr has addr has phone has phone major Degree Phone Schema S1 Phone Schema S2

has phone Phone Person Person degree Degree isa isa has addr has addr has phone has phone major Degree Phone Schema S1 Phone Schema S2 Schema Equivalence in Heterogeneous Systems: Bridging Theory and Practice R. J. Miller y Y. E. Ioannidis z R. Ramakrishnan x Department of Computer Sciences University of Wisconsin-Madison frmiller, yannis,

More information

Wrapper 2 Wrapper 3. Information Source 2

Wrapper 2 Wrapper 3. Information Source 2 Integration of Semistructured Data Using Outer Joins Koichi Munakata Industrial Electronics & Systems Laboratory Mitsubishi Electric Corporation 8-1-1, Tsukaguchi Hon-machi, Amagasaki, Hyogo, 661, Japan

More information

Structural characterizations of schema mapping languages

Structural characterizations of schema mapping languages Structural characterizations of schema mapping languages Balder ten Cate INRIA and ENS Cachan (research done while visiting IBM Almaden and UC Santa Cruz) Joint work with Phokion Kolaitis (ICDT 09) Schema

More information

A Retrospective on Datalog 1.0

A Retrospective on Datalog 1.0 A Retrospective on Datalog 1.0 Phokion G. Kolaitis UC Santa Cruz and IBM Research - Almaden Datalog 2.0 Vienna, September 2012 2 / 79 A Brief History of Datalog In the beginning of time, there was E.F.

More information

. The problem: ynamic ata Warehouse esign Ws are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered

. The problem: ynamic ata Warehouse esign Ws are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered ynamic ata Warehouse esign? imitri Theodoratos Timos Sellis epartment of Electrical and Computer Engineering Computer Science ivision National Technical University of Athens Zographou 57 73, Athens, Greece

More information

Localization in Graphs. Richardson, TX Azriel Rosenfeld. Center for Automation Research. College Park, MD

Localization in Graphs. Richardson, TX Azriel Rosenfeld. Center for Automation Research. College Park, MD CAR-TR-728 CS-TR-3326 UMIACS-TR-94-92 Samir Khuller Department of Computer Science Institute for Advanced Computer Studies University of Maryland College Park, MD 20742-3255 Localization in Graphs Azriel

More information

has to choose. Important questions are: which relations should be dened intensionally,

has to choose. Important questions are: which relations should be dened intensionally, Automated Design of Deductive Databases (Extended abstract) Hendrik Blockeel and Luc De Raedt Department of Computer Science, Katholieke Universiteit Leuven Celestijnenlaan 200A B-3001 Heverlee, Belgium

More information

NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT

NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT 3SAT The 3SAT problem is the following. INSTANCE : Given a boolean expression E in conjunctive normal form (CNF) that is the conjunction of clauses, each

More information

Information integration using logical views

Information integration using logical views Theoretical Computer Science 239 (2000) 189 210 www.elsevier.com/locate/tcs Information integration using logical views Jerey D. Ullman Department of Computer Science, Stanford University, Margaret Jacks

More information

Data integration supports seamless access to autonomous, heterogeneous information

Data integration supports seamless access to autonomous, heterogeneous information Using Constraints to Describe Source Contents in Data Integration Systems Chen Li, University of California, Irvine Data integration supports seamless access to autonomous, heterogeneous information sources

More information

A MODEL FOR ADVANCED QUERY CAPABILITY DESCRIPTION IN MEDIATOR SYSTEMS

A MODEL FOR ADVANCED QUERY CAPABILITY DESCRIPTION IN MEDIATOR SYSTEMS A MODEL FOR ADVANCED QUERY CAPABILITY DESCRIPTION IN MEDIATOR SYSTEMS Alberto Pan, Paula Montoto and Anastasio Molano Denodo Technologies, Almirante Fco. Moreno 5 B, 28040 Madrid, Spain Email: apan@denodo.com,

More information

A technique for adding range restrictions to. August 30, Abstract. In a generalized searching problem, a set S of n colored geometric objects

A technique for adding range restrictions to. August 30, Abstract. In a generalized searching problem, a set S of n colored geometric objects A technique for adding range restrictions to generalized searching problems Prosenjit Gupta Ravi Janardan y Michiel Smid z August 30, 1996 Abstract In a generalized searching problem, a set S of n colored

More information

Top-k Keyword Search Over Graphs Based On Backward Search

Top-k Keyword Search Over Graphs Based On Backward Search Top-k Keyword Search Over Graphs Based On Backward Search Jia-Hui Zeng, Jiu-Ming Huang, Shu-Qiang Yang 1College of Computer National University of Defense Technology, Changsha, China 2College of Computer

More information

Optimizing Recursive Information Gathering Plans

Optimizing Recursive Information Gathering Plans Research Paper: america22 1 Optimizing Recursive Information Gathering Plans Eric Lambrecht & Subbarao Kambhampati Department of Computer Science and Engineering Arizona State University, Tempe, AZ 85287

More information
