
Vertical Fragmentation and Allocation in Distributed Deductive Database Systems

Seung-Jin Lim, Yiu-Kai Ng
Department of Computer Science, Brigham Young University, Provo, Utah 80, U.S.A.

Abstract

Although approaches for vertical fragmentation and data allocation have been proposed [1, 1], algorithms for vertical fragmentation and allocation of data and rules in distributed deductive database systems (DDDBSs) are lacking. In this paper, we present different approaches for vertical fragmentation of relations that are referenced by rules and an allocation strategy for rules and fragments in a DDDBS. The potential advantages of the proposed fragmentation and allocation scheme include maximal locality of query evaluation and minimization of communication cost in a distributed system, in addition to the desirable properties of (vertical) fragmentation and rule allocation discussed in the literature [11, 1]. We also formulate the mathematical interpretation of the proposed vertical fragmentation and allocation algorithms.

Keywords: rules, fragmentation, allocation, replication, deductive databases, distributed systems

1 Introduction

Deductive database systems enhance the expressive power of conventional relational database systems by adopting logic programming as a query language, which allows recursion. Distributed database systems offer many advantages over centralized database systems, including enhanced reliability and availability of the involved databases, improved overall system performance through parallel execution of transactions, and reduced contention for system resources [1]. The integration of these two kinds of database systems appears to provide a promising, potentially more powerful and reliable database system for information processing.

Two of the main design activities in distributed systems are fragmentation and data allocation.
Fragmentation allows parallel execution of a single query, reduces the amount of irrelevant data access and unnecessary data transfer, and increases the level of concurrency and therefore the system throughput in a distributed database system [, 1]. Vertical fragmentation further enhances the performance of database transactions by closely matching fragments to the requirements of transactions [1]. Our design goals for the fragmentation and allocation of rules and data are to maximize concurrent rule execution, reduce replication of rules and data, minimize communication cost during query evaluation, and decrease the query response time.

Different approaches for (vertical) fragmentation and rule allocation in distributed (deductive) database systems have been proposed. [1] introduces a vertical partitioning algorithm using a graphical technique that improves on the previous work on vertical partitioning [1]. [10] presents a theory of fragmentation and studies the completeness and update problems of overlapping fragments. [] develops a fragmentation technique for class objects in a distributed object-based system. [11] discusses the rule allocation problem in a distributed database system and proposes a rule partitioning method. [1, 1] construct algorithms for dynamic data allocation in distributed systems. [1] proposes a distributed data allocation algorithm that utilizes actual query processing schedules. The proposed method integrates the problems of distributed query optimization and optimal data allocation statically, by sequentially optimizing query strategies and then data allocation, and determines the (horizontal) fragments of relations to be allocated so that the total transmission cost of processing user queries and updates is minimized. The same problem has been addressed by [], which includes an iterative method for integrating query optimization and data allocation methods in distributed database design. An optimization heuristic is adopted in [] which iteratively determines the minimum-cost query strategies and minimum-cost data allocation until a local minimum for the combined problem is found.
All of these approaches, however, either address the problem of data allocation [1, 1] and optimal query processing [1, ], focus on the partitioning problem of rules [11], or deal with the fragmentation of relations or class objects in a distributed system [, 10, 1, 1]. Algorithms for vertical fragmentation of relations referenced by rules, and for allocation of rules and corresponding fragments in a distributed deductive database system (DDDBS), are lacking. In this paper, we present different approaches for vertical fragmentation of relations and allocation of rules and fragments. Data and rules are distributed across different sites in a network to meet the operational needs and to handle future information processing at each site. The proposed fragmentation and allocation strategy maximizes locality of query evaluation while minimizing communication cost and execution time during query processing.

We proceed to present our results as follows. In Section 2 we provide our basic definitions for dependencies among rule expressions and base relations. In Section 3 we propose four different algorithms: RCA for rule clustering, OVF for computing overlapping vertical fragmentation, DVF for generating disjoint vertical fragmentation, and CAA for allocating rules and corresponding fragments. In the subsequent sections we include the mathematical interpretation of the proposed algorithms, formulate the communication costs of distributing rules and fragments and of query evaluation, present the proofs of correctness and complexity analysis of the proposed algorithms, and give the concluding remarks.

2 Basic Definitions

We consider each Datalog rule r in a DDDBS of the form

\[ p(X_1, \ldots, X_n) \mathrel{:-} q_1(Y_1, \ldots, Y_m), \ldots, q_t(Z_1, \ldots, Z_s) \]

where p is the head (predicate) of r and is either a derived (intensional) or mixed predicate (relation) (see Footnote 1). Each $q_i$ ($1 \le i \le t$) is either a derived, mixed, or base (extensional) predicate (relation), and $q_1, \ldots, q_t$ form the body of r. An argument of a predicate is either a variable or a constant. A rule with an empty body is a base relation, i.e., an extensional predicate, which contains a set of facts, and a rule without the head predicate is a query. r is recursive if at least one of the predicates in the body of r is p []. Two predicates p and q are mutually recursive if p and q are (in)directly dependent on each other, i.e., in order to compute p we need to compute q, and vice versa; we call any two rules with p and q as head predicates, respectively, mutually recursive rules. r can be extended to handle complex data structures as in higher-order logic database languages [9], and our proposed solutions for the fragmentation and allocation problems are independent of the constructs of r, i.e., whether r is a Datalog rule or is extended as a rule in a higher-order logic database language.

We apply some basic principles of graph theory to the proposed fragmentation and rule and data allocation algorithms, and use matrices to capture the dependency relationships among rules and base relations.

Definition 1 A directed graph (digraph for short) G(N, E) consists of two sets, the nonempty set N (or N(G)) of nodes and the set E (or E(G)) of edges. Each node in N represents either a rule or a base relation, whereas each edge $(n_1, n_2)$ in E denotes that $n_1$ and $n_2$ are rules, and the head predicate of $n_2$ (which can be a base relation) appears in the body of $n_1$.

Definition 2 In a digraph G, node $n_j$ is reachable from node $n_i$ if there exists a path from $n_i$ to $n_j$. Node $n_j$ is directly reachable from node $n_i$ if there exists a path of length 1 from $n_i$ to $n_j$.

Definition 3 Given a digraph G, let A be the Boolean adjacency matrix of G. Then,

\[ A^{(m)}[i,j] = \begin{cases} 1 & \text{if there exists a path of length } m \text{ from node } n_i \text{ to node } n_j \\ 0 & \text{otherwise} \end{cases} \]

In particular, $A^{(1)}[i,j]$ is called a direct dependency matrix of G.

Definition 4 In a digraph G, a reachability matrix R is defined as

\[ R = A^{(1)} \lor A^{(2)} \lor \cdots \lor A^{(n)} \]

where an entry of R is computed by applying the Boolean addition ($\lor$) to the corresponding entries in $A^{(1)}, \ldots, A^{(n)}$.

Footnote 1: A predicate p is mixed if there is a set of ground facts for p and p appears as the head predicate of some rules [].
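The reachability matrix defined above can be illustrated with a short Python sketch. It is a minimal illustration rather than the authors' implementation: the function names `bool_matmul` and `reachability` and the four-node example graph are assumptions introduced here, and the matrix is a plain list of 0/1 rows.

```python
def bool_matmul(a, b):
    """Boolean product of two square 0/1 matrices."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n)))
             for j in range(n)]
            for i in range(n)]


def reachability(adj):
    """OR together A^(1), ..., A^(n), where A^(m) marks paths of length m."""
    n = len(adj)
    power = [row[:] for row in adj]   # A^(1)
    reach = [row[:] for row in adj]   # running Boolean sum
    for _ in range(n - 1):
        power = bool_matmul(power, adj)            # next power A^(m+1)
        reach = [[r | p for r, p in zip(reach_row, power_row)]
                 for reach_row, power_row in zip(reach, power)]
    return reach


# Hypothetical digraph: n0 -> n1 -> n2, with n3 isolated.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
R = reachability(adj)
```

In this example `R[0][2]` is 1 because n2 is reachable from n0 through n1, although it is not directly reachable.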

There exist direct dependency matrices that capture the rule-to-rule and rule-to-relation relationships, respectively, in DDDBSs.

Definition 5 In a digraph G with $N_r$ distinct rules, an $N_r \times N_r$ direct rule-to-rule dependency matrix, rr, is defined as

\[ rr[i,j] = \begin{cases} 1 & \text{if rule } r_j \text{ is directly reachable from rule } r_i \text{, i.e., the head predicate of } r_j \text{ appears in the body of } r_i \\ 0 & \text{otherwise} \end{cases} \]

Definition 6 In a digraph G with $N_r$ distinct rules and $N_R$ base relations, an $N_r \times N_R$ direct rule-to-relation dependency matrix, rr, is defined as

\[ rr[i,j] = \begin{cases} 1 & \text{if base relation } R_j \text{ is directly reachable from rule } r_i \text{, i.e., } R_j \text{ appears in the body of } r_i \\ 0 & \text{otherwise} \end{cases} \]

Definition 7 An $N_R \times 1$ table-size matrix TS is defined as $TS[i] = n$, where i denotes base relation $R_i$, and n denotes the size (in bytes) of $R_i$.

Definition 8 A network topology matrix T, which is symmetric relative to the principal diagonal, is defined as $T[i,j] = w$, where w denotes the total weight of the shortest path (measured by the physical distance) from site $n_i$ to site $n_j$ in a network, and $w = 0$ if $i = j$. The connection weight of a site $S_i$ in a network with p other sites is defined accordingly as

\[ \sum_{k=1}^{p} T[i,k], \]

which is the sum of the total weights of the shortest paths from $S_i$ to each of the other sites in the network.

Example 1 Consider a distributed deductive database (DDDB) D consisting of nine rules and four base relations whose relationships are captured by a direct rule-to-rule dependency matrix rr and a direct rule-to-relation dependency matrix rr. Further assume that a table-size matrix TS for the base relations in D and a network topology matrix T are given along with the two dependency matrices.

[Matrices rr, TS, T, and rr: entries not preserved in this transcription.]
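To make the network topology matrix and the connection weight concrete, the following Python sketch derives T from a weighted edge list with the Floyd-Warshall algorithm and then sums each row. The paper takes T as a given input, so the derivation step, the function names `topology_matrix` and `connection_weight`, and the three-site network are all assumptions made only for illustration.

```python
import math


def topology_matrix(n_sites, edges):
    """All-pairs shortest-path weights for an undirected weighted network."""
    dist = [[0 if i == j else math.inf for j in range(n_sites)]
            for i in range(n_sites)]
    for u, v, w in edges:
        # Links are undirected, so the resulting matrix is symmetric.
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    for k in range(n_sites):
        for i in range(n_sites):
            for j in range(n_sites):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def connection_weight(topology, i):
    """Sum of shortest-path weights from site i to every other site."""
    return sum(topology[i])


# Hypothetical network: three sites, three weighted links.
edges = [(0, 1, 3), (1, 2, 4), (0, 2, 9)]
T = topology_matrix(3, edges)
weights = [connection_weight(T, i) for i in range(3)]
```

Here the direct link between sites 0 and 2 (weight 9) is longer than the route through site 1 (weight 7), so `T[0][2]` is 7 and site 1 has the smallest connection weight.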

Figure 1: The dependency graph of a DDDB and a network. (a) A given DDDB and its digraph G. (b) A network and its topology matrix.

Figure 1(a) depicts D, in which rules are shown without arguments for simplicity of presentation, and relationships among rules and base relations are captured by a digraph G. Figure 1(b) shows a network with labeled edges, which denote weights of the edges, and its network topology matrix T.

3 Fragmentation and Allocation Algorithms

In this section, we present a fragmentation and allocation algorithm (FAA). FAA, as defined, consists of three subalgorithms: a rule clustering algorithm (RCA), a data clustering algorithm (DCA), and a rule and fragment allocation algorithm (CAA). FAA aims to maximize the locality of query evaluation and thus to minimize communication cost and search space during query processing.

In order to minimize the communication cost of transmitting data or partial answers to a query, rules with the same head predicate and, if possible, fragments of the base relations on which they depend (either directly or indirectly) are allocated to the same site. Otherwise, the computation of partial answers to a query Q involving rules with the same head predicate is performed at different sites, since none of these sites has all of these rules locally. As a result, these sites must communicate with one another in order to generate all the answers to Q, which adds to the communication cost. Furthermore, our clustering and allocation algorithms assign mutually recursive rules to the same site. Otherwise, the participating sites where these rules are stored have to communicate at each intermediate step of executing a query Q, which increases the communication cost and execution time of Q, since processors at different sites may spend most of their time waiting for one another or transmitting data across sites in the network [11].
RCA, which allows replicated rules in a DDDBS, is designed so that most, if not all, of the rules used by a query in a DDDBS are executed locally, which reduces communication overhead. Since knowledge and integrity constraints represented in a deductive database, which are captured by rule expressions, are much less time-variant than data [11], the effect of updates on replicated rules is reduced. DCA, on the other hand, provides two alternatives: either replication or partition of fragments of base relations referenced by rules. Replication is a desirable feature in a static DDDBS since it increases the locality of query processing and the availability and reliability of a DDDBS [].

Footnote: When a query Q of the form $?- q_1(V_1), \ldots, q_n(V_n)$, where $V_i$ ($1 \le i \le n$) denotes a vector of arguments of $q_i$, is submitted to a site S where (a subset of) the rules for computing the answers to Q do not reside, (subqueries in) Q must be executed remotely and the answers transmitted to S.

It is assumed that two direct dependency matrices, rr and rr, a table-size matrix TS, and a network topology matrix T are given as inputs (i.e., they are predetermined) to FAA. It is further assumed that each site has its local distributed data directory (called a knowledge directory in [8]) that contains the information of "which site has which rules and fragments of base relations," and that all the rules and fragments of base relations to be allocated are originally stored at a particular site, called the primary site, in the network.

3.1 RCA

We consider two kinds of rules in a DDDB: (i) a rule r 1 that is directly dependent on another rule r, i.e., the head predicate of r appears in the body of r 1, and (ii) a rule r 1 that is indirectly dependent on another rule r, i.e., the head predicate of r appears in the body of r 1 through a number of intermediate rules. For example, in Figure 1(a), rules r and r are directly dependent on rule r, but rules r 1 and r are indirectly dependent on rule r through r and on base relation R through rules r and r.

RCA first constructs a digraph (DG) G using rr (G represents both direct and indirect dependency relationships among rules), and then computes each distinct subgraph (subdg) of G. Distinct subgraphs of G are used by DCA to compute fragments of the base relations on which rules in each distinct subgraph depend (either directly or indirectly). Sections 3.1.1 and 3.1.2
include the steps of RCA.

3.1.1 Computing Prospective Distinct Subgraphs

We construct each prospective distinct subgraph of rules (that are not base relations) in G such that the subgraph either consists of a single rule that does not reach other rules (that are not base relations) in G, or exactly one of its rules r can directly reach the other rules in it, i.e., the head of every rule (except r) in the subgraph appears in the body of r.

Footnote: A subgraph SG in G is called a distinct subgraph if SG has no outgoing edges to other nodes, which denote rules, not base relations, in G. All the rules in SG are eventually distributed to a particular site in the network.

Footnote: A prospective distinct subgraph {r1, r} of a DG G is a subgraph of G such that r is directly reachable from r1. Each prospective distinct subgraph eventually becomes (a portion of) a distinct subgraph of G. If there exists no node which is directly reachable from r1, then there is no r in the prospective distinct subgraph.

Example 2 Given the DDDB and its dependency graph G in Figure 1(a), the sets of rules {r 1, r}, {r, r 1, r}, {r}, {r, r}, {r, r}, {r, r, r}, {r, r}, {r 8}, and {r 9}, with their corresponding edges as shown in Figure 2(b), form different prospective distinct subgraphs of G in Figure 1(a).

Figure 2: (Prospective) distinct subgraphs of DDDB. (a) The dependency graph G of a given DDDB. (b) Prospective distinct subgraphs. (c) Attach {r1, r}, {r}, and {r, r} to {r, r1, r}. (d) Attach {r}, {r, r}, and {r, r} to {r, r, r}, and discard embedded subgraphs.

To simplify the discussion, from now on we denote a subdg of a DG by its set of nodes, assuming that the edges connecting the nodes in the set are implicitly represented.

3.1.2 Generating Distinct Subgraphs

The next step of RCA expands, if possible, a prospective distinct subgraph iteratively to generate a distinct subgraph by adding indirectly dependent rules to it as follows: if every node in a prospective distinct subgraph j is directly or indirectly reachable from a node in another prospective distinct subgraph i, expand i by attaching j to i, and retain j. Repeat this process until no further changes can be made. Each subgraph of G which is embedded within another subgraph of G is then discarded. This yields a set of distinct subgraphs of G.

Example 3 Consider the set of prospective distinct subgraphs generated in Example 2. By applying the steps in this subsection, we obtain five distinct subgraphs: subdg 1 = {r 1, r, r, r}, subdg 2 = {r, r}, subdg 3 = {r, r, r, r}, subdg 4 = {r 8}, and subdg 5 = {r 9}. Figures 2(c) - (d) illustrate the process of merging the prospective distinct subgraphs shown in Figure 2(b).
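The two RCA steps can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' code: it assumes that the iterative attachment converges to the set of rules reachable from each root rule, so it computes that closure directly and then discards every set embedded in another. The names `prospective_subgraphs` and `distinct_subgraphs`, the 0-based rule indices, and the five-rule matrix are hypothetical.

```python
def prospective_subgraphs(rr):
    """One candidate per rule: the rule plus the rules it directly reaches."""
    n = len(rr)
    return [{i} | {j for j in range(n) if rr[i][j]} for i in range(n)]


def distinct_subgraphs(rr):
    """Close each rule under reachability, then drop embedded subgraphs."""
    n = len(rr)
    closures = set()
    for root in range(n):
        seen, stack = {root}, [root]
        while stack:                      # depth-first walk over rule edges
            i = stack.pop()
            for j in range(n):
                if rr[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        closures.add(frozenset(seen))
    # Keep only closures that are not proper subsets of another closure.
    maximal = [s for s in closures
               if not any(s != t and s.issubset(t) for t in closures)]
    return sorted(maximal, key=lambda s: sorted(s))


# Hypothetical rule-to-rule matrix: r0 -> r1 -> r2, r3 -> r2, r4 isolated.
rr = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
prospective = prospective_subgraphs(rr)
subdgs = distinct_subgraphs(rr)
```

In this example rule r2 ends up in two distinct subgraphs, which matches the statement that RCA allows replicated rules.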

3.2 DCA

In this section, we propose a data clustering algorithm (DCA) which generates vertical fragments of the base relations that are referenced by rules in a distinct subgraph computed by RCA. These fragments are attached to the distinct subgraphs and are allocated, along with the rules that (in)directly depend on the fragments, at chosen sites in a DDDBS according to the cluster allocation algorithm to be introduced in Section 3.3.

Conventional techniques for developing fragments are tailored to the needs of user applications (queries). Our vertical fragmentation technique, however, is strictly based on the given set of rule expressions represented in a DG, as well as on the access frequency of queries in one of the fragmentation algorithms. The uniqueness of our approach is two-fold. First, no information about the access pattern on base relations is needed by our fragmentation approach, and hence our approach eliminates the extra inputs required by conventional fragmentation methods [1, 1, 1]. Second, attributes clustered in a vertical fragment are often determined by using an attribute affinity matrix (constructed from an attribute usage matrix and transactions) in the conventional approaches [1, 1]; fragments generated by our vertical fragmentation approaches, however, are strictly based on the rule-to-attribute dependency matrices.

We are motivated to investigate the vertical fragmentation problem in DDDBSs since it is inherently more complicated than horizontal fragmentation due to the total number of alternatives that are available in the vertical case [1]. More importantly, our vertical fragmentation methods generate an "optimal" clustering scheme of database relations that are referenced by rules in a DDDB.
The resultant fragmentation scheme is optimal in the sense that only the relevant rules and essential data needed for processing a particular query in a DDDB are clustered together, which enhances efficiency and minimizes overhead (in terms of communication costs) during query processing.

Since disjoint fragmentation can be more easily handled by a distributed system than more sophisticated overlapping fragmentation [1], in this paper we present a disjoint vertical fragmentation algorithm, called DVF, as one of the two subalgorithms of DCA. DVF disallows distinct fragments of a base relation R from containing common attributes of R, and these fragments are not replicated over the network. On the other hand, since disjoint fragmentation is impractical in some real-world applications due to the constraints that it imposes on database design [10], we also consider an alternative fragmentation scheme, the overlapping vertical fragmentation, called OVF, as the other subalgorithm of DCA. OVF allows overlapped fragments of a base relation to be replicated and distributed over the network. These two vertical fragmentation approaches are based on the notion of direct and indirect rule-to-attribute dependencies.

In a DDDBS where response time and communication cost are the primary design issues and the involved databases are static, i.e., most of the data processing activities are retrievals (reads) rather than modifications (updates), OVF is preferable to DVF. On the other hand, if communication cost is not a major concern, as in DDDBSs built on local area networks, and database updates occur frequently, then DVF is preferable to OVF.

It is assumed that there exists a tuple identifier attribute TID [10] for each fragmented relation R such that TID is allocated with each fragment of R to the site where the fragment resides. (Tuple identifier attributes ensure the lossless-join decomposition of the various vertical fragments of a base relation, and this concept is well understood in the literature [10, 1].) Prior to the introduction of the two fragmentation approaches, we give a few definitions that are used in the two proposed vertical fragmentation algorithms.

Definition 9 An $N_r \times N_{A_{R_k}}$ rule-to-attribute dependency matrix $AD_k$ of base relation $R_k$, where $N_r$ is the number of distinct rules and $N_{A_{R_k}}$ is the number of attributes in $R_k$, in a DDDBS is defined as

\[ AD_k[i,j] = \begin{cases} 1 & \text{if the } j\text{th attribute of } R_k \text{ is used by rule } r_i \\ 0 & \text{otherwise} \end{cases} \]

It is assumed that, given a rule r that references a base relation p, all the attributes of p that are not used by r are replaced by the "don't care" symbol [], i.e., '_', in p. For example, given the rule r: $q(V) \mathrel{:-} \ldots, p(A_1, \_, A_3), \ldots$, where p is a base relation with attributes $A_1$, $A_2$, and $A_3$, r uses only attributes $A_1$ and $A_3$ of p since $A_2$ of p is replaced by '_' in r.

Definition 10 Given a base relation $R_k$ and a distinct subgraph $subdg_i$, the minimal set of attributes for $subdg_i$ and $R_k$ is the subset of attributes SB of $R_k$ such that each attribute in SB is referred to by at least one rule in $subdg_i$.

Definition 11 Let $A_{k,i}$ denote the ith attribute of a base relation $R_k$. A vertical fragment of a base relation $R_k$, denoted $F_{i,k} = \{A_{k_1}, \ldots, A_{k_j}\}$, is a subset of attributes of $R_k$ that are referenced by a rule in $subdg_i$.

3.2.1 Overlapping Vertical Fragmentation

Recall that in Section 3.1.2 subsets of rules are clustered into distinct subgraphs to be allocated to different sites in a DDDBS. In this subsection, we propose a strategy for clustering vertical fragments of base relations that are referenced by a set of rules $S_r$ in a distinct subgraph. These fragments are allocated along with $S_r$ to a chosen site in a DDDBS.

Algorithm OVF: Overlapping Vertical Fragmentation Algorithm

INPUT: Distinct subgraphs subdgs and the set of all rule-to-attribute dependency matrices $AD_1, \ldots, AD_n$, where n denotes the number of base relations in a DDDB.
OUTPUT: A set of overlapping vertical fragments FS. $F_{i,k}$ in FS is a fragment of base relation $R_k$ that is referenced by a rule in $subdg_i$.

For each distinct subgraph $subdg_i$, determine the minimal set of attributes of each base relation $R_k$ that are referenced by some rules in $subdg_i$.

Footnote: A rule-to-attribute dependency matrix is similar to the attribute usage matrix defined in [1, 1]; the former is based on rule expressions, while the latter is based on past query history.

Footnote: The minimal set of attributes for $subdg_i$ and $R_k$ equals $F_{i,k}$ in OVF, but the two can differ in DVF.

In OVF, two attributes A and B in a base relation are assigned to the same fragment if A and B are referenced by a rule r to be stored at the same site, regardless of their degree of affinity [1, 1]. Hence, a query that uses r can process r at a single site. The vertical fragmentation algorithms in [1, 1], on the other hand, allocate A and B to different fragments if the affinity between A and B is negligible. As a result, maximal locality of query evaluation is guaranteed in a DDDBS using our fragmentation and allocation approaches, whereas this clustering approach is not adopted by the conventional vertical fragmentation approaches.

Example 4 Consider Example 3 again and let $AD_1$, $AD_2$, $AD_3$, and $AD_4$ be the given rule-to-attribute dependency matrices of the base relations $R_1$, $R_2$, $R_3$, and $R_4$ in D of Example 1, respectively.

[Matrices AD: entries not preserved in this transcription. In the sets below, indices lost in the transcription are left blank.]

OVF yields the minimal sets of attributes for subdg 1, ..., subdg 5 as follows (one line per distinct subgraph, in order):

(1,1) = {A_{1,1}, A_{1, }, A_{1, }}; (1, ) = (1, ) = {}; (1, ) = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}
( ,1) = ( , ) = ( , ) = {}; ( , ) = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}
( ,1) = {}; ( , ) = {A_{ , }, A_{ , }}; ( , ) = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}; ( , ) = {A_{ ,1}, A_{ , }}
( ,1) = ( , ) = ( , ) = {}; ( , ) = {A_{ , }, A_{ , }, A_{ , }}
( ,1) = ( , ) = ( , ) = {}; ( , ) = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}

Hence, we obtain $F_{i,k}$ for each distinct subgraph $subdg_i$ in Example 3. Appending each $F_{i,k}$ to the corresponding subdg yields

subdg 1 = {r 1, r, r, r, F_{1,1} = {A_{1,1}, A_{1, }, A_{1, }}, F_{1, } = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}};
subdg 2 = {r, r, F_{ , } = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}};
subdg 3 = {r, r, r, r, F_{ , } = {A_{ , }, A_{ , }}, F_{ , } = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}, F_{ , } = {A_{ ,1}, A_{ , }}};
subdg 4 = {r 8, F_{ , } = {A_{ , }, A_{ , }, A_{ , }}};
subdg 5 = {r 9, F_{ , } = {A_{ ,1}, A_{ , }, A_{ , }, A_{ , }}}
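The single OVF step, collecting for each distinct subgraph every attribute of a base relation that any of its rules uses, can be sketched in Python as follows. The function name `ovf`, the dictionary layout keyed by relation name, the 0-based indices, and the small matrices are assumptions chosen for illustration and are not the data of the example above.

```python
def ovf(subdgs, ad):
    """Overlapping fragments: F[(i, k)] holds the attributes of relation k
    used by at least one rule of distinct subgraph i."""
    fragments = {}
    for i, rules in enumerate(subdgs):
        for k, matrix in ad.items():
            n_attrs = len(matrix[0])
            attrs = {j for j in range(n_attrs)
                     if any(matrix[r][j] for r in rules)}
            if attrs:                     # empty fragments are not recorded
                fragments[(i, k)] = attrs
    return fragments


# Hypothetical input: three rules, two distinct subgraphs, two relations.
subdgs = [{0, 1}, {2}]
ad = {
    "R1": [[1, 0, 0],      # rule 0 uses attribute 0 of R1
           [0, 1, 0],      # rule 1 uses attribute 1 of R1
           [0, 1, 1]],     # rule 2 uses attributes 1 and 2 of R1
    "R2": [[0, 0],
           [0, 0],
           [1, 0]],        # only rule 2 uses R2
}
fragments = ovf(subdgs, ad)
```

Attribute 1 of R1 appears in the fragments of both subgraphs, which is the overlap (and hence replication) that OVF permits.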

3.2.2 Disjoint Vertical Fragmentation

In this subsection, we present the DVF approach, which is used as an alternative to OVF in conjunction with RCA. It is common in deductive databases that different rules reference the (same set of attributes of the) same base relation. It is also likely that different distinct subgraphs, as discussed in Section 3.1, include the same subset of rules that are to be distributed over different sites in a DDDBS. Hence, different distinct subgraphs may be extended to include the (same set of attributes of the) same base relation. If we do not allow replication of a subset of attributes A of a base relation for various reasons, we must decide which distinct subgraph should include A. Therefore, one of the design issues of our disjoint vertical fragmentation approach is to determine the distinct subgraph to which A should be assigned when several different distinct subgraphs depend on A. Our primary goal in determining the allocation of A is to minimize data transfer across the network during the query evaluation process that involves A.

Given a set of attributes A in base relation R and a distinct subgraph, there are two cases to be considered for the allocation of A to the subgraph: (i) the subgraph depends on A, or (ii) the subgraph does not depend on A. Obviously, we prefer to assign A to the subgraph if it depends on A, in order to minimize data transfer during the query evaluation process involving rules in the subgraph that depend on A. We also need to consider the situation when two or more distinct subgraphs depend on A. Our clustering strategy is based on the query access frequency of rules, such that a more frequently referenced rule is given priority in clustering with the data on which it depends. Before we discuss how DVF assigns A to one of these distinct subgraphs, we give the following definitions.
Definition 12 Given a DDDBS D with $N_Q$ distinct queries, the query-access-frequency vector of D, denoted $freq_Q$, is defined as

\[ freq_Q[i] = n, \]

where i denotes query $Q_i$ ($1 \le i \le N_Q$) and n denotes the access frequency of query $Q_i$ during a given period of time in D [1].

Rule $r_j$ is said to be referenced by query $Q_i$, or $Q_i$ depends on $r_j$, if the head predicate of $r_j$ appears in $Q_i$. Furthermore, if distinct subgraph k includes $r_j$, which is referenced by $Q_i$, then we say that distinct subgraph k is referenced by $Q_i$, or $Q_i$ depends on distinct subgraph k.

Definition 13 Given a DDDBS D with $N_Q$ distinct queries and $N_r$ distinct rules, the $N_Q \times N_r$ query-access-rule matrix Qr of D is defined as

\[ Qr[i,j] = \begin{cases} 1 & \text{if rule } r_j \text{ is referenced by query } Q_i \\ 0 & \text{otherwise} \end{cases} \]

where $1 \le i \le N_Q$ and $1 \le j \le N_r$.

Footnote: A rule r depends on (a subset of attributes A of) base relation R, or (A in) R is referenced by r, if (attributes in A that appear as arguments of) R is in the body of r. Any distinct subgraph that includes r is said to depend on (A in) R, or (A in) R is said to be referenced by that subgraph.

Definition 14 Given a DDDBS D, its query-access-frequency vector $freq_Q$, and its query-access-rule matrix Qr, the rule-access-frequency vector $freq_r$ of D is defined as

\[ freq_r = freq_Q \cdot Qr, \]

where $freq_r[i]$ denotes the access frequency of rule $r_i$ according to the given set of queries during a given period of time in D.

Definition 15 Given a DDDBS D and its rule-access-frequency vector $freq_r$, the subgraph-access-frequency vector of a set of distinct subgraphs of D, denoted freq, for the given set of queries during a given period of time in D is defined as

\[ freq[i] = \sum_{k} freq_r[k], \]

where the sum ranges over every rule $r_k$ that is included in distinct subgraph i.

Given the definitions above, we now define the most frequently referenced rules and distinct subgraphs as follows:

Definition 16 Given a rule-access-frequency vector $freq_r$, rule $r_i$ is the most frequently referenced rule among the set of $N_r$ rules denoted in $freq_r$ if $\max(freq_r[1], \ldots, freq_r[N_r]) = freq_r[i]$.

With respect to the reference frequency of rules according to a given set of queries, we can define the most frequently referenced distinct subgraph among the given set of distinct subgraphs. This can be done by replacing $r_i$, $N_r$, and $freq_r$ in Definition 16 by distinct subgraph i, the number of distinct subgraphs N, and the subgraph-access-frequency vector freq, respectively.

Example 5 Consider the query-access-frequency vector $freq_Q$ and the query-access-rule matrix Qr of a given set of seven queries.

[Entries of freq_Q, Qr, and freq_r: not preserved in this transcription. The numeric values below are only partially preserved, and lost indices are left blank.]

The rule-access-frequency vector is calculated as $freq_r = freq_Q \cdot Qr$, which indicates that r is the most frequently referenced rule by the given set of seven queries. Subsequently, using the distinct subgraphs generated in Example 3, we compute the subgraph-access-frequency vector as follows:

freq[1] = freq_r[1] + freq_r[ ] + freq_r[ ] + freq_r[ ] = 8
freq[2] = freq_r[ ] + freq_r[ ] = + 10 =
freq[3] = freq_r[ ] + freq_r[ ] + freq_r[ ] + freq_r[ ] =
freq[4] = freq_r[8] = 8
freq[5] = freq_r[9] = 19
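The frequency definitions above amount to one vector-matrix product followed by per-subgraph sums, as the following Python sketch shows. The function names, the 0-based indices, and all numbers are hypothetical; they are not the values of the example in the text, which are not recoverable here.

```python
def rule_access_frequency(freq_q, qr):
    """freq_r = freq_Q . Qr: add up the frequencies of the queries that
    reference each rule."""
    n_rules = len(qr[0])
    return [sum(freq_q[i] * qr[i][j] for i in range(len(freq_q)))
            for j in range(n_rules)]


def subgraph_access_frequency(freq_r, subdgs):
    """Sum the rule frequencies over the rules of each distinct subgraph."""
    return [sum(freq_r[k] for k in rules) for rules in subdgs]


# Hypothetical data: three queries over three rules, two distinct subgraphs.
freq_q = [5, 2, 7]
qr = [
    [1, 0, 0],   # query 0 references rule 0
    [1, 1, 0],   # query 1 references rules 0 and 1
    [0, 0, 1],   # query 2 references rule 2
]
subdgs = [{0, 1}, {2}]

freq_r = rule_access_frequency(freq_q, qr)
freq_sub = subgraph_access_frequency(freq_r, subdgs)
# max() returns the first maximal index, so ties go to the earlier subgraph.
most_frequent = max(range(len(freq_sub)), key=freq_sub.__getitem__)
```

With these numbers the first subgraph is the most frequently referenced one, even though its individual rules are not all more frequent than the rule in the second subgraph.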

Hence, subdg is the most frequently referenced distinct subgraph among all the distinct subgraphs computed in Example 3, followed by subdg 1, subdg, subdg, and subdg.

With the above definitions, we now present our approach for assigning a subset of attributes A of base relation R to one of the distinct subgraphs that depend on A. Given a set of queries, if a set S of two or more distinct subgraphs depends on A, we assign A to the most frequently referenced distinct subgraph in S, so that data transfer during query processing involving A can be minimized. Hence, the following criteria are used in DVF:

[Assignment Criteria] Distinct subgraph $subdg_i$ in S is assigned a subset of attributes A of base relation R if

Criterion 1. $subdg_i$ depends on A, whereas no other distinct subgraph does, or

Criterion 2. $subdg_i$ is the most frequently referenced distinct subgraph in S. If there exists more than one most frequently referenced distinct subgraph in S, then A is assigned to $subdg_i$ if $subdg_i$ is the first to be considered for A.

Let us consider Criterion 2 for a set of attributes for which more than one distinct subgraph competes. Suppose that distinct subgraph subdg 1 depends on A, another distinct subgraph depends on another subset of attributes B of R, a third depends on subset C of R, and so forth. Furthermore, assume that the intersection $A \cap B \cap C \cap \cdots$ is nonempty. If subdg 1 is the most frequently referenced distinct subgraph among all the distinct subgraphs that depend on this intersection, then the intersection is assigned to subdg 1. We apply the same criteria to determine the assignment of the remaining attributes of A, of B, and so forth.

Algorithm DVF: Disjoint Vertical Fragmentation Algorithm

INPUT: Distinct subgraphs $subdg_1, \ldots, subdg_N$, direct rule-to-rule dependency matrix rr, rule-to-attribute dependency matrix $AD_k$ for each base relation $R_k$, query-access-frequency vector $freq_Q$, and query-access-rule matrix Qr.

OUTPUT: A set of disjoint vertical fragments FS.
$F_{i,k}$ in FS is a fragment of base relation $R_k$ that is referenced by $subdg_i$, $1 \le i \le N$.

Step 1. For each distinct subgraph $subdg_i$, identify all the attributes of each base relation $R_k$ that are referenced by (a rule in) $subdg_i$ using $AD_k$.

Step 2. For each subset of attributes A of base relation $R_k$ that is referenced by only $subdg_i$, apply the first criterion of the Assignment Criteria and let $F_{i,k} = A$.

Step 3. For each subset of attributes A of base relation $R_k$ that is referenced by more than one distinct subgraph, apply the second criterion of the Assignment Criteria to determine the assignment of A and let $F_{i,k} = A$.
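The three DVF steps can be sketched in Python under one simplifying assumption: attributes are assigned one at a time rather than as subsets, which still yields disjoint fragments. The function name `dvf`, the returned `unassigned` map, the 0-based indices, and the sample data are hypothetical, and the subgraph frequencies are taken as a precomputed input rather than derived from the query vectors.

```python
def dvf(subdgs, ad, freq_sub):
    """Disjoint fragments: each attribute goes to exactly one subgraph."""
    fragments = {}     # (subgraph index, relation) -> set of attribute indices
    unassigned = {}    # relation -> attributes that no subgraph references
    for k, matrix in ad.items():
        for j in range(len(matrix[0])):
            # Step 1: find the subgraphs with a rule that uses attribute j.
            users = [i for i, rules in enumerate(subdgs)
                     if any(matrix[r][j] for r in rules)]
            if not users:
                unassigned.setdefault(k, set()).add(j)
                continue
            # Steps 2 and 3: a sole user wins outright; otherwise the most
            # frequently referenced user wins, and max() keeps the first
            # one considered when frequencies tie.
            winner = max(users, key=lambda i: freq_sub[i])
            fragments.setdefault((winner, k), set()).add(j)
    return fragments, unassigned


# Hypothetical input: two subgraphs competing for attribute 1 of R1.
subdgs = [{0, 1}, {2}]
ad = {"R1": [[1, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 1, 1, 0]]}
freq_sub = [9, 7]
fragments, unassigned = dvf(subdgs, ad, freq_sub)
```

Attribute 1 is contested and goes to the more frequently referenced first subgraph, while attribute 3 is used by no rule and is left for a later allocation step.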

Example 6 Consider base relation R of the DDDB in Example 1 and the subgraph-access-frequency vector computed in Example. Suppose that the rule-to-attribute dependency matrices are as given in Example. AD in Example indicates that rule r depends on attributes A ;1, A ;, A ; and A ;, whereas rule r 8 depends on attributes A ;, A ; and A ;. As a result, subdg 1, subdg, and subdg, which include r, are competing for A ;1, A ;, A ; and A ;. SubDG, which includes r 8, is competing with subdg 1, subdg, and subdg for A ; and A ;. Using the subdgs and matrices in Examples and, respectively, step 1 of DVF identifies the sets of dependent attributes as shown in Table 1.

Table 1: Dependencies between subgraphs and attributes

subdg | rules | dependent attributes
subdg 1 | r 1, r, r, r | {A 1;1, A 1;, A 1;}, {A ;1, A ;, A ;, A ;}
subdg | r, r | {A ;1, A ;, A ;, A ;}
subdg | r, r, r, r | {A ;, A ;}, {A ;1, A ;}, {A ;1, A ;, A ;, A ;}
subdg | r 8 | {A ;, A ;, A ;}
subdg | r 9 | {A ;1, A ;, A ;, A ;}

Note that subdg 1, subdg, and subdg are competing for the same subset of attributes in R, i.e., {A ;1, A ;, A ;, A ;}, whereas subdg is competing for a different subset of attributes of R, i.e., {A ;, A ;, A ;}. Since no other distinct subgraph competes for attribute A ; with subdg, by step 2 of DVF, A ; is assigned to subdg.

We now consider the assignment of A ;1, A ;, A ;, and A ; to either subdg 1, subdg, subdg, or partly to subdg. SubDG 1, subdg, and subdg compete for the common attributes A ;1 and A ;, whereas subdg 1, subdg, subdg and subdg compete for A ; and A ;. According to the subgraph-access-frequency vector computed in Example and the Assignment Criteria, we assign A ;1 and A ; to subdg since subdg is the most frequently referenced subgraph among the three distinct subgraphs. Also, we assign A ; and A ; to subdg for the same reason among the four distinct subgraphs.

Hence, appending the fragmentation of R to the subdgs in Example yields

subdg 1 {r 1, r, r, r, F 1; = ∅} = {r 1, r, r, r};
subdg {r, r, F ; = ∅} = {r, r};
subdg {r, r, r, r, F ; = {A ;1, A ;, A ;, A ;}};
subdg {r 8, F ; = {A ;}};
subdg {r 9}

Note that we have yet to assign attribute A ; to any of the distinct subgraphs. This happens because no distinct subgraph depends on A ;. We discuss the strategy to allocate A ; in the Cluster Allocation Algorithm in the next section.

Cluster Allocation

Having included in each distinct subgraph a set of rules $S_r$ with the corresponding set of vertical fragments of base relations (which are necessary to evaluate the rules in $S_r$) using RCA and either OVF or DVF, FAA proceeds to choose network sites for the allocation of the clusters, each of which contains the rules and all the corresponding vertical fragments of base relations in a particular distinct subgraph, by using the Cluster Allocation Algorithm (CAA). This section describes the steps in CAA.

Our strategy for the allocation of clusters of rules and data is consistent with the strategy that we use for clustering rules and data discussed in the preceding sections, i.e., our primary concern in the allocation of the clusters over the network is to minimize data transfer across the network during the query evaluation process. For this purpose, we consider the access frequency of a query at a particular network site, which is shown in the query-access-frequency-at-site matrix.

Definition 17 Given a DDDBS with $N_S$ network sites and $N_Q$ queries, an $N_S \times N_Q$ query-access-frequency-at-site matrix $freq_{SQ}$ is defined as follows:

$$freq_{SQ}[i, j] = n,$$

where $1 \le i \le N_S$, $1 \le j \le N_Q$, and $n$ denotes the number of times the $j$th query is initiated at site $i$ during a given period of time.

Furthermore, the sum of the $j$th column of $freq_{SQ}$ denotes the access frequency of the $j$th query at different sites. Thus, the sum of the $j$th column in $freq_{SQ}$ has to be the same as $freq_Q[j]$ as given in Definition 1. Subsequently, we can derive $freq_Q$ using $freq_{SQ}$ as follows:

$$freq_Q[j] = \sum_{i=1}^{N_S} freq_{SQ}[i, j], \quad \text{where } 1 \le j \le N_Q.$$

Example 7 Suppose that we are given the following query-access-frequency-at-site matrix for the four network sites in Example 1 and the seven queries mentioned in Example:

freq SQ =

The summation of each column of $freq_{SQ}$ yields

$\sum_{i=1}^{N_S} freq_{SQ}[i, j]$ =

which is $freq_Q$ in Example.
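The column-sum relationship between $freq_{SQ}$ and $freq_Q$ can be checked with a few lines of Python. This is only a sketch: the helper name `query_access_frequency` and the 3-site, 4-query matrix are hypothetical and are not the data of the paper's example.

```python
def query_access_frequency(freq_sq):
    """Return freq_Q: the column sums of an N_S x N_Q matrix freq_SQ."""
    # zip(*freq_sq) iterates over the columns of the row-major matrix.
    return [sum(column) for column in zip(*freq_sq)]

# Hypothetical matrix: rows are sites, columns are queries.
freq_sq = [
    [2, 0, 1, 4],
    [0, 3, 1, 0],
    [5, 1, 0, 2],
]

freq_q = query_access_frequency(freq_sq)
print(freq_q)  # [7, 4, 2, 6]
```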
Using the query-access-frequency-at-site matrix and the query-access-rule matrix of a DDDBS, we determine the access frequency of a rule at a network site, represented by the rule-access-frequency-at-site matrix.

Definition 18 Given a query-access-frequency-at-site matrix $freq_{SQ}$ and a query-access-rule matrix $Qr$, an $N_S \times N_r$ rule-access-frequency-at-site matrix $freq_{Sr}$ can be computed as

$$freq_{Sr} = freq_{SQ} \times Qr,$$

where $freq_{Sr}[i, j] = n$ denotes that site $i$ accesses the $j$th rule $n$ times during a given period of time.

Furthermore, the sum of the $j$th column of $freq_{Sr}$ denotes the access frequency of rule $j$ at different sites. Thus, the sum has to be the same as $freq_r[j]$ in Definition 1. Subsequently, we can derive $freq_r$ using $freq_{Sr}$ as follows:

$$freq_r[j] = \sum_{i=1}^{N_S} freq_{Sr}[i, j], \quad \text{where } 1 \le j \le N_r.$$

Example 8 Consider $freq_{SQ}$ in Example and $Qr$ in Example again. We obtain the access frequency of a rule at a network site by computing the rule-access-frequency-at-site matrix $freq_{Sr}$ as follows:

freq Sr = freq SQ × Qr = =

The summation of each column of $freq_{Sr}$ yields

$\sum_{i=1}^{N_S} freq_{Sr}[i, j]$ =

which is $freq_r$ in Example.

Based on the notion of the access frequency of a rule at different network sites, we now define the access frequency of a distinct subgraph at a network site, represented by the subgraph-access-frequency-at-site matrix, using the rule-access-frequency-at-site matrix.

Definition 19 Given a rule-access-frequency-at-site matrix $freq_{Sr}$, an $N_S \times N$ subgraph-access-frequency-at-site matrix of the given set of sites and distinct subgraphs, denoted $freq_S$, is defined as follows:

$$freq_S[i, j] = \sum_k freq_{Sr}[i, k],$$

where rule $r_k$ is included in distinct subgraph $j$ and $j$ is referenced by a query at site $i$.
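Definitions 18 and 19 amount to one ordinary matrix product followed by a per-subgraph sum. The sketch below illustrates this on hypothetical data; `mat_mul`, `subgraph_frequency_at_site`, the 0/1 encoding of `qr`, and the 0-based rule indices in `subgraph_rules` are assumptions made for the illustration. The sketch also simplifies Definition 19 by summing over every rule in a subgraph, without separately checking that a query at the site references that subgraph.

```python
def mat_mul(a, b):
    """Ordinary (non-Boolean) product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def subgraph_frequency_at_site(freq_sr, subgraph_rules):
    """freq_S[i][j] = sum of freq_Sr[i][k] over the rules k in subgraph j."""
    return [[sum(row[k] for k in rules) for rules in subgraph_rules]
            for row in freq_sr]

# Hypothetical inputs: 2 sites, 3 queries, 4 rules, 3 distinct subgraphs.
freq_sq = [[2, 0, 1],
           [0, 3, 1]]                   # site x query frequencies
qr = [[1, 1, 0, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 1]]                     # qr[q][r] = 1 if query q accesses rule r
subgraph_rules = [[0, 1], [1, 2], [3]]  # rule indices per subgraph (0-based)

freq_sr = mat_mul(freq_sq, qr)
freq_r = [sum(col) for col in zip(*freq_sr)]
freq_s = subgraph_frequency_at_site(freq_sr, subgraph_rules)

print(freq_sr)  # [[2, 2, 0, 1], [0, 3, 3, 1]]
print(freq_r)   # [2, 5, 3, 2]
print(freq_s)   # [[4, 2, 1], [3, 6, 1]]
```

A rule shared by two subgraphs (rule 1 here) contributes to both columns of `freq_s`, which is why the entries of a row can add up to more than the row sum of `freq_sr`.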

We now define the site which accesses distinct subgraph $j$ most frequently among the given $N_S$ sites in a network as follows:

Definition 20 Given the subgraph-access-frequency-at-site matrix $freq_S$ of a DDDBS with $N_S$ sites, site $S_i$ is the site which accesses distinct subgraph $j$ ($1 \le j \le N$) most frequently if

$$\max(freq_S[1, j], \ldots, freq_S[N_S, j]) = freq_S[i, j].$$

Example 9 Consider $freq_{Sr}$ in Example 8 and the five distinct subgraphs generated in Example again. We obtain the subgraph-access-frequency-at-site matrix as follows:

freq S = =

Using $freq_S$, we can determine which site accesses subgraph $i$, $1 \le i \le 5$, most frequently among all the network sites. As computed, site is the one accessing 1 and most frequently; sites and are the most frequently sites accessing, sites for, and site for.

We now propose a strategy for allocating each cluster of rules and vertical fragments of base relations, computed by either OVF or DVF, to a network site.

Algorithm CAA: Cluster Allocation Algorithm

INPUT: A set of clusters $C_1, \ldots, C_N$, a set of network sites $S_1, \ldots, S_{N_S}$ with the associated subgraph-access-frequency-at-site matrix $freq_S$, the network topology matrix $T$, and $E$, which is a set of attributes not referenced by any distinct subgraph.

OUTPUT: Allocation of each cluster $C_i$, $1 \le i \le N$, and $E$ to a network site.

Step 1. For each cluster $C_i$, identify the site $S_{M_i}$ which accesses $C_i$ most frequently among all of the sites in the network using $freq_S$.

Step 2. If there exists only one site $S_{M_i}$ which is the most frequently accessing site of $C_i$, then allocate $C_i$ to $S_{M_i}$.

Step 3. If a number of sites $S_1, \ldots, S_n$ are the most frequently accessing sites of $C_i$, choose the one among these sites whose connection weight is the smallest using $T$. (If more than one site has the smallest connection weight, arbitrarily choose one of these sites.) Allocate $C_i$ to the chosen site.

Step 4. Allocate $E$ to the site in the network whose connection weight is the smallest. (If there exists more than one such site, arbitrarily choose one of them.)

Note that we consider the connection weight in Step 3 above. The connection weight $CW_i$ of site $S_i$ indicates the data transfer cost$^{8}$ between $S_i$ and all other sites in the network. When two or more sites access a cluster $C$ at the same frequency, it is reasonable to allocate $C$ to the site whose connection weight is less than the others in order to minimize data transfer cost during the query evaluation process in general.

Example 10 Consider the subgraph-access-frequency-at-site matrix $freq_S$ in Example 9 and the network topology matrix $T$ given in Example 1 again. We identify the site $S_{M_i}$ ($1 \le i \le 5$) that accesses cluster $C_i$ most frequently as follows using $freq_S$:

S M 1 = S S M = S S M = S ; S S M = S S M = S

Hence, we allocate C 1 and C to site, C to site, and C to site. In case of C, site and site access it at the same frequency, thus we consider the connection weight of these sites. The connection weight of each site $CW_i$, $1 \le i \le$, is computed as follows using $T$:

CW 1 = = 8 CW = = CW = = 1 CW = = 10

Hence, C is allocated to site since CW < CW. Furthermore, by Step 4 of Algorithm CAA, A ; in base relation R is allocated to site since A ; is not referenced by any rule, and hence any distinct subgraph.

Mathematical Interpretation of FAA

In this section, we present the mathematical implication of the proposed algorithms.

.1 Mathematical Implication of RCA

We construct the direct and indirect rule-to-rule dependency matrix $R_{rr}$ and the segment-to-rule dependency matrix $R'_{rr}$, which correspond to the steps of constructing (prospective) distinct subgraphs as discussed in Section.1. This can be done by first computing the reachability matrix $R_{rule}$, which captures all direct and indirect rule-to-rule dependencies among different rules, from the given direct rule-to-rule dependency matrix $rr$, and then performing a Boolean addition on $R_{rule}$ and an $N_r \times N_r$ identity matrix $I$, where $N_r$ is the number of distinct rules in a given database. ($R_{rule} \lor I$ retains all the rules that are not reachable from other rules in the database.) Hence,

$$R_{rule} = rr^{(1)} \lor rr^{(2)} \lor \cdots \lor rr^{(N_r)} \qquad (1)$$

$$R_{rr} = R_{rule} \lor I \qquad (2)$$

Hereafter, we proceed to generate distinct subgraphs as computed in Section.1. using $R_{rr}$ by extracting each row of $R_{rr}$ that is not included in any other row of $R_{rr}$$^{9}$. The resultant matrix is $R'_{rr}$, the segment-to-rule dependency matrix, where each row is called a segment, which is a vector representation of a distinct subgraph computed in Section.1..

Example 11 Consider the direct rule-to-rule dependency matrix $rr$ in Example 1. Hence,

$R_{rule} = rr^{1} \lor rr^{2} \lor \cdots \lor rr^{9}$ = = ∨ ∨

$^{8}$ It is assumed that the physical distance between two network sites is proportional to the data transfer cost.

$^{9}$ Row $i$ in $R_{rr}$ includes row $j$ of $R_{rr}$ if $\forall_{k=1}^{N_r}\,(R_{rr}[j, k] = 1 \text{ implies } R_{rr}[i, k] = 1)$, where $1 \le i, j \le N_r$ and $N_r$ denotes the number of distinct rules in a database.

R rr = R rule ∨ I = ; R' rr =

The $i$th row of $R'_{rr}$, $1 \le i \le 5$, corresponds to the distinct subgraph $subdg_i$ in Example.

.2 Mathematical Implication of OVF

Each row of the segment-to-rule matrix $R'_{rr}$ is a segment which includes rules that are referenced by the corresponding distinct subgraph. Since the given rule-to-attribute dependency matrix $AD_k$ of base relation $R_k$ captures the information of which rules reference which attributes in base relation $R_k$, using $R'_{rr}$ and $AD_k$, we can determine which segment references which attributes in $R_k$. Hence, the set of minimal overlapping vertical fragments $F_k$ of base relation $R_k$ can be computed as follows:

$$F_k = R'_{rr} \times AD_k$$

Note that the '$\times$' operation in the above formula is not a Boolean multiplication. Instead, it is a normal matrix multiplication. The $i$th row of $F_k$ ($1 \le i \le N$) with at least one non-zero entry yields a minimal vertical fragment$^{10}$ of base relation $R_k$ for the $i$th segment, which represents the $i$th distinct subgraph, i.e., $subdg_i$, computed by RCA.

Example 12 Consider the segments captured in $R'_{rr}$, where $R'_{rr}$ is computed in Example 11, and the rule-to-attribute dependency matrices $AD_1$, $AD_2$, $AD_3$, and $AD_4$ of base relations $R_1$, $R_2$, $R_3$, and $R_4$, respectively, given in Example. Then,

F 1 = R' rr × AD 1 = F = R' rr × AD = ; F = R' rr × AD = ; F = R' rr × AD =

We see that the $i$th row of $F_k$ ($1 \le i \le 5$, $1 \le k \le 4$) corresponds to the fragment with indices $(i, k)$ in Example. Using the $i$th row of each $F_k$ and the $i$th row of $R'_{rr}$, we can construct the corresponding distinct $subdg_i$ in Example.

$^{10}$ The $i$th row in $F_k$ is a minimal overlapping vertical fragment of $R_k$ since the $i$th row includes only the attributes of $R_k$ that are referenced by $subdg_i$.
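Equations (1) and (2), the row-inclusion test of footnote 9, and the OVF product $F_k = R'_{rr} \times AD_k$ can be sketched together in plain Python. The function names, the orientation `rr[i][j] = 1` meaning "rule i directly depends on rule j", the sample matrices, and the choice to keep a single copy of identical rows before the inclusion test are all assumptions of this sketch rather than details confirmed by the paper.

```python
def bool_mul(a, b):
    """Boolean product of two square 0/1 matrices."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def bool_or(a, b):
    """Element-wise Boolean addition of two 0/1 matrices."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rule_closure(rr):
    """R_rr = (rr^(1) v rr^(2) v ... v rr^(N_r)) v I."""
    n = len(rr)
    power, reach = rr, rr
    for _ in range(n - 1):
        power = bool_mul(power, rr)
        reach = bool_or(reach, power)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return bool_or(reach, identity)

def segments(r_rr):
    """Rows of R_rr that are not included in any other, different row."""
    rows = []
    for row in r_rr:            # keep one copy of identical rows (assumption)
        if row not in rows:
            rows.append(row)
    def includes(big, small):   # footnote 9: small[k] = 1 implies big[k] = 1
        return all(b >= s for b, s in zip(big, small))
    return [r for r in rows
            if not any(o != r and includes(o, r) for o in rows)]

def mat_mul(a, b):
    """Ordinary (non-Boolean) matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# Hypothetical 4-rule database: r0 -> r1 -> r2 and r3 -> r2.
rr = [[0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 0],
      [0, 0, 1, 0]]
ad = [[1, 0, 0],   # rule-to-attribute matrix AD_k: 4 rules x 3 attributes
      [0, 1, 0],
      [0, 1, 0],
      [0, 0, 1]]

r_rr = rule_closure(rr)
r_seg = segments(r_rr)        # plays the role of R'_rr
f_k = mat_mul(r_seg, ad)      # overlapping fragments, one row per segment

print(r_seg)  # [[1, 1, 1, 0], [0, 0, 1, 1]]
print(f_k)    # [[1, 2, 0], [0, 1, 1]]
```

Because the product is an ordinary one, an entry such as the 2 in the first row counts how many rules of the segment reference the attribute; any non-zero entry places the attribute in that segment's fragment, so the two fragments here overlap on the second attribute.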

.3 Mathematical Implication of DVF

As discussed in Section.., the major task of DVF is to determine the distinct subgraph to which a subset of attributes $A$ of a base relation should be assigned when more than one distinct subgraph references $A$. This task is accomplished by Algorithm DVF using the Assignment Criteria as discussed in Section... We now show that the task performed by Algorithm DVF can be accomplished by manipulating the segment-to-rule matrix $R'_{rr}$, the rule-to-attribute dependency matrix $AD_k$ for each base relation $R_k$, the query-access-frequency vector $freq_Q$, which is derived from the query-access-frequency-at-site matrix $freq_{SQ}$, and the query-access-rule matrix $Qr$ as follows, given $N$, the number of distinct subgraphs, and $N_{A_k}$, the number of attributes in $R_k$:

[Step 1 of DVF] For each distinct subgraph $subdg_i$, $1 \le i \le N$, the subset of attributes $A$ of base relation $R_k$ that is referenced by $subdg_i$ can be identified by using $R'_{rr}$ and $AD_k$ as follows:

$$F_k = R'_{rr} \times AD_k$$

where '$\times$' is the matrix multiplication operation, and for any $j$ ($1 \le j \le N_{A_k}$), $F_k[i, j] = 1$ denotes that the $j$th attribute of $R_k$ is referenced by $subdg_i$, and

$$A = \bigcup_{1 \le j \le N_{A_k}} attr(F_k[i, j]), \quad \text{where } attr(F_k[i, j]) = \begin{cases} \{A_{k,j}\} & \text{if } F_k[i, j] = 1 \\ \emptyset & \text{otherwise.} \end{cases}$$

[Step 2 of DVF] The subset of attributes $B_i$ of base relation $R_k$ that is referenced by only one distinct subgraph $subdg_i$ can be determined by using $F_k$ computed above. If there exists only one $i$ ($1 \le i \le N$) such that $F_k[i, j] = 1$, for any $j$ ($1 \le j \le N_{A_k}$), then the $j$th attribute of $R_k$ is referenced only by $subdg_i$. Hence,

$$B_i = \bigcup_{1 \le j \le N_{A_k}} attr(F_k[i, j]), \quad \text{where } attr(F_k[i, j]) = \begin{cases} \{A_{k,j}\} & \text{if } F_k[i, j] = 1 \text{ and } F_k[i', j] = 0 \text{ for all } i' \ne i,\ 1 \le i' \le N \\ \emptyset & \text{otherwise,} \end{cases}$$

which indicates that $B_i$ should be assigned to $subdg_i$ according to the first criterion of the Assignment Criteria.
[Step 3 of DVF] The $j$th attribute $C_{k,j}$ of base relation $R_k$ that is referenced by more than one distinct subgraph can also be determined by using $F_k$. If there exists more than one $i$ ($1 \le i \le N$) such that $F_k[i, j] = 1$, for any $j$ ($1 \le j \le N_{A_k}$), $C_{k,j}$ is referenced by more than one distinct subgraph. Hence,

$$C_{k,j} = attr(F_k[i, j]), \quad \text{where } attr(F_k[i, j]) = \begin{cases} \{A_{k,j}\} & \text{if } F_k[i, j] = 1,\ F_k[i', j] = 1, \text{ and } i \ne i' \text{ for some } i'\ (1 \le i' \le N) \\ \emptyset & \text{otherwise.} \end{cases}$$

To simulate the second criterion of the Assignment Criteria, we first compute the rule-access-frequency vector $freq_r = freq_Q \times Qr$ and the subgraph-access-frequency vector $freq[i] = \sum_k freq_r[k]$, for any rule $r_k$ that is included in subgraph $i$. (Note that $freq$ can be computed by the matrix multiplication of $R'_{rr}$ and $freq'_r$, i.e., $freq = R'_{rr} \times freq'_r$, where $freq'_r$ is the column vector representation of $freq_r$.) Using $freq$, we determine the most frequently referenced distinct subgraph $M$ from the set of distinct subgraphs $S_g$ in which each distinct subgraph depends on $C_{k,j}$. $M$ is the chosen distinct subgraph in $S_g$ whose corresponding entry in $freq$ is larger than (or equal to) each of the entries in $freq$ that denote the access frequency of one of the distinct subgraphs in $S_g$, i.e., $freq_M = \max(freq[i_1], freq[i_2], \ldots, freq[M], \ldots, freq[i_n])$, where the corresponding distinct subgraphs of $freq[i_1], \ldots, freq[i_n]$ are the distinct subgraphs in $S_g$$^{11}$. Then, we assign $C_{k,j}$ to $M$.

Example 13 Consider the resultant matrices $F_1$, $F_2$, $F_3$, and $F_4$ in Example 12. The value 1 in each matrix indicates that the corresponding attribute is referenced by a distinct subgraph. It is not difficult to see that the third column of the $i$th row of Table 1 includes all the attributes whose corresponding entries are set to 1 in the $i$th row of $F_1$, $F_2$, $F_3$, and $F_4$, respectively, which is step 1 of DVF. Let us consider F. Note that F [; ] denotes the th attribute of R, A ;, which is referenced only by subdg. Hence, we assign A ; to subdg, which is step 2 of DVF.
On the other hand, A ;1 and A ; are referenced by subdg 1, subdg and subdg, whereas A ; and A ; are referenced by subdg 1, subdg, subdg, and subdg. Therefore, we apply the second criterion of the Assignment Criteria to determine a distinct subgraph to which the set of attributes is to be assigned by the following matrix manipulation, where $freq_Q$ and $Qr$ are given in Example, and $R'_{rr}$ is given in Example 11:

freq r = freq Q × Qr =

$^{11}$ $S_g$ can be determined by extracting all the corresponding entries in the $j$th column of $F_k$. If $F_k[i, j] = 1$ ($1 \le i \le N$), then $subdg_i$ depends on the $j$th attribute of base relation $R_k$, as discussed in step 1 of DVF.
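The matrix form of the second Assignment Criterion can be sketched as follows: compute $freq_r = freq_Q \times Qr$, then $freq = R'_{rr} \times freq'_r$, and for each attribute column of $F_k$ pick the competing subgraph with the largest entry in $freq$. All names and numbers below are hypothetical; in particular `assign_attributes` treats any non-zero entry of $F_k$ as "referenced" and returns `None` for an attribute that no subgraph references, which is a convention of this sketch only.

```python
def vec_mat(v, m):
    """Row vector times matrix (ordinary arithmetic)."""
    return [sum(x * y for x, y in zip(v, col)) for col in zip(*m)]

def mat_vec(m, v):
    """Matrix times column vector (ordinary arithmetic)."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def assign_attributes(f_k, freq_sub):
    """For each attribute j, return the index of the winning subgraph."""
    assignment = []
    for j in range(len(f_k[0])):
        # S_g: subgraphs whose row has a non-zero entry in column j.
        s_g = [i for i in range(len(f_k)) if f_k[i][j]]
        # max() keeps the first maximal item, so ties favor the subgraph
        # considered first; an unreferenced attribute stays unassigned.
        assignment.append(max(s_g, key=lambda i: freq_sub[i]) if s_g else None)
    return assignment

# Hypothetical inputs: 3 queries, 4 rules, 2 segments, 4 attributes.
freq_q = [7, 4, 2]
qr = [[1, 1, 0, 0],
      [0, 1, 1, 0],
      [0, 0, 1, 1]]
r_seg = [[1, 1, 1, 0],
         [0, 0, 1, 1]]        # stands in for R'_rr
f_k = [[1, 2, 0, 0],
       [0, 1, 1, 0]]          # stands in for F_k = R'_rr x AD_k

freq_r = vec_mat(freq_q, qr)
freq_sub = mat_vec(r_seg, freq_r)
assignment = assign_attributes(f_k, freq_sub)

print(freq_r)      # [7, 11, 6, 2]
print(freq_sub)    # [24, 8]
print(assignment)  # [0, 0, 1, None]
```

The attribute left as `None` corresponds to the case the paper handles through the set $E$ in the Cluster Allocation Algorithm.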


More information

Chapter 17: Parallel Databases

Chapter 17: Parallel Databases Chapter 17: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of Parallel Systems Database Systems

More information

Number Theory and Graph Theory

Number Theory and Graph Theory 1 Number Theory and Graph Theory Chapter 6 Basic concepts and definitions of graph theory By A. Satyanarayana Reddy Department of Mathematics Shiv Nadar University Uttar Pradesh, India E-mail: satya8118@gmail.com

More information

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy

Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, Roma, Italy Graph Theory for Modelling a Survey Questionnaire Pierpaolo Massoli, ISTAT via Adolfo Ravà 150, 00142 Roma, Italy e-mail: pimassol@istat.it 1. Introduction Questions can be usually asked following specific

More information

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology Mobile and Heterogeneous databases Distributed Database System Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you

More information

Vertical decomposition of a lattice using clique separators

Vertical decomposition of a lattice using clique separators Vertical decomposition of a lattice using clique separators Anne Berry, Romain Pogorelcnik, Alain Sigayret LIMOS UMR CNRS 6158 Ensemble Scientifique des Cézeaux Université Blaise Pascal, F-63 173 Aubière,

More information

An Ecient Approximation Algorithm for the. File Redistribution Scheduling Problem in. Fully Connected Networks. Abstract

An Ecient Approximation Algorithm for the. File Redistribution Scheduling Problem in. Fully Connected Networks. Abstract An Ecient Approximation Algorithm for the File Redistribution Scheduling Problem in Fully Connected Networks Ravi Varadarajan Pedro I. Rivera-Vega y Abstract We consider the problem of transferring a set

More information

2. CONNECTIVITY Connectivity

2. CONNECTIVITY Connectivity 2. CONNECTIVITY 70 2. Connectivity 2.1. Connectivity. Definition 2.1.1. (1) A path in a graph G = (V, E) is a sequence of vertices v 0, v 1, v 2,..., v n such that {v i 1, v i } is an edge of G for i =

More information

perspective, logic programs do have a notion of control ow, and the in terms of the central control ow the program embodies.

perspective, logic programs do have a notion of control ow, and the in terms of the central control ow the program embodies. Projections of Logic Programs Using Symbol Mappings Ashish Jain Department of Computer Engineering and Science Case Western Reserve University Cleveland, OH 44106 USA email: jain@ces.cwru.edu Abstract

More information

Basics of Graph Theory

Basics of Graph Theory Basics of Graph Theory 1 Basic notions A simple graph G = (V, E) consists of V, a nonempty set of vertices, and E, a set of unordered pairs of distinct elements of V called edges. Simple graphs have their

More information

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm

Seminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Seminar on A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Mohammad Iftakher Uddin & Mohammad Mahfuzur Rahman Matrikel Nr: 9003357 Matrikel Nr : 9003358 Masters of

More information

Fundamental Properties of Graphs

Fundamental Properties of Graphs Chapter three In many real-life situations we need to know how robust a graph that represents a certain network is, how edges or vertices can be removed without completely destroying the overall connectivity,

More information

9.5 Equivalence Relations

9.5 Equivalence Relations 9.5 Equivalence Relations You know from your early study of fractions that each fraction has many equivalent forms. For example, 2, 2 4, 3 6, 2, 3 6, 5 30,... are all different ways to represent the same

More information

Discrete Mathematics Lecture 4. Harper Langston New York University

Discrete Mathematics Lecture 4. Harper Langston New York University Discrete Mathematics Lecture 4 Harper Langston New York University Sequences Sequence is a set of (usually infinite number of) ordered elements: a 1, a 2,, a n, Each individual element a k is called a

More information

22 Elementary Graph Algorithms. There are two standard ways to represent a

22 Elementary Graph Algorithms. There are two standard ways to represent a VI Graph Algorithms Elementary Graph Algorithms Minimum Spanning Trees Single-Source Shortest Paths All-Pairs Shortest Paths 22 Elementary Graph Algorithms There are two standard ways to represent a graph

More information

Heap-on-Top Priority Queues. March Abstract. We introduce the heap-on-top (hot) priority queue data structure that combines the

Heap-on-Top Priority Queues. March Abstract. We introduce the heap-on-top (hot) priority queue data structure that combines the Heap-on-Top Priority Queues Boris V. Cherkassky Central Economics and Mathematics Institute Krasikova St. 32 117418, Moscow, Russia cher@cemi.msk.su Andrew V. Goldberg NEC Research Institute 4 Independence

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi

Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi Project and Production Management Prof. Arun Kanda Department of Mechanical Engineering Indian Institute of Technology, Delhi Lecture - 8 Consistency and Redundancy in Project networks In today s lecture

More information

Varying Applications (examples)

Varying Applications (examples) Graph Theory Varying Applications (examples) Computer networks Distinguish between two chemical compounds with the same molecular formula but different structures Solve shortest path problems between cities

More information

Implementation Techniques

Implementation Techniques V Implementation Techniques 34 Efficient Evaluation of the Valid-Time Natural Join 35 Efficient Differential Timeslice Computation 36 R-Tree Based Indexing of Now-Relative Bitemporal Data 37 Light-Weight

More information

16 Greedy Algorithms

16 Greedy Algorithms 16 Greedy Algorithms Optimization algorithms typically go through a sequence of steps, with a set of choices at each For many optimization problems, using dynamic programming to determine the best choices

More information

Discrete Mathematics, Spring 2004 Homework 8 Sample Solutions

Discrete Mathematics, Spring 2004 Homework 8 Sample Solutions Discrete Mathematics, Spring 4 Homework 8 Sample Solutions 6.4 #. Find the length of a shortest path and a shortest path between the vertices h and d in the following graph: b c d a 7 6 7 4 f 4 6 e g 4

More information

Chapter 15 Introduction to Linear Programming

Chapter 15 Introduction to Linear Programming Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2015 Wei-Ta Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of

More information

The problem of minimizing the elimination tree height for general graphs is N P-hard. However, there exist classes of graphs for which the problem can

The problem of minimizing the elimination tree height for general graphs is N P-hard. However, there exist classes of graphs for which the problem can A Simple Cubic Algorithm for Computing Minimum Height Elimination Trees for Interval Graphs Bengt Aspvall, Pinar Heggernes, Jan Arne Telle Department of Informatics, University of Bergen N{5020 Bergen,

More information

V10 Metabolic networks - Graph connectivity

V10 Metabolic networks - Graph connectivity V10 Metabolic networks - Graph connectivity Graph connectivity is related to analyzing biological networks for - finding cliques - edge betweenness - modular decomposition that have been or will be covered

More information

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Greedy Algorithms (continued) The best known application where the greedy algorithm is optimal is surely

More information

Module 1. Preliminaries. Contents

Module 1. Preliminaries. Contents Module 1 Preliminaries Contents 1.1 Introduction: Discovery of graphs............. 2 1.2 Graphs.............................. 3 Definitions........................... 4 Pictorial representation of a graph..............

More information

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism Parallel DBMS Parallel Database Systems CS5225 Parallel DB 1 Uniprocessor technology has reached its limit Difficult to build machines powerful enough to meet the CPU and I/O demands of DBMS serving large

More information

Pebble Sets in Convex Polygons

Pebble Sets in Convex Polygons 2 1 Pebble Sets in Convex Polygons Kevin Iga, Randall Maddox June 15, 2005 Abstract Lukács and András posed the problem of showing the existence of a set of n 2 points in the interior of a convex n-gon

More information

Disjoint Support Decompositions

Disjoint Support Decompositions Chapter 4 Disjoint Support Decompositions We introduce now a new property of logic functions which will be useful to further improve the quality of parameterizations in symbolic simulation. In informal

More information

Simpler, Linear-time Transitive Orientation via Lexicographic Breadth-First Search

Simpler, Linear-time Transitive Orientation via Lexicographic Breadth-First Search Simpler, Linear-time Transitive Orientation via Lexicographic Breadth-First Search Marc Tedder University of Toronto arxiv:1503.02773v1 [cs.ds] 10 Mar 2015 Abstract Comparability graphs are the undirected

More information

Introduction Alternative ways of evaluating a given query using

Introduction Alternative ways of evaluating a given query using Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction

More information

Design of Parallel Algorithms. Models of Parallel Computation

Design of Parallel Algorithms. Models of Parallel Computation + Design of Parallel Algorithms Models of Parallel Computation + Chapter Overview: Algorithms and Concurrency n Introduction to Parallel Algorithms n Tasks and Decomposition n Processes and Mapping n Processes

More information

15.4 Longest common subsequence

15.4 Longest common subsequence 15.4 Longest common subsequence Biological applications often need to compare the DNA of two (or more) different organisms A strand of DNA consists of a string of molecules called bases, where the possible

More information

Treewidth and graph minors

Treewidth and graph minors Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under

More information

Power Set of a set and Relations

Power Set of a set and Relations Power Set of a set and Relations 1 Power Set (1) Definition: The power set of a set S, denoted P(S), is the set of all subsets of S. Examples Let A={a,b,c}, P(A)={,{a},{b},{c},{a,b},{b,c},{a,c},{a,b,c}}

More information

EXERCISES SHORTEST PATHS: APPLICATIONS, OPTIMIZATION, VARIATIONS, AND SOLVING THE CONSTRAINED SHORTEST PATH PROBLEM. 1 Applications and Modelling

EXERCISES SHORTEST PATHS: APPLICATIONS, OPTIMIZATION, VARIATIONS, AND SOLVING THE CONSTRAINED SHORTEST PATH PROBLEM. 1 Applications and Modelling SHORTEST PATHS: APPLICATIONS, OPTIMIZATION, VARIATIONS, AND SOLVING THE CONSTRAINED SHORTEST PATH PROBLEM EXERCISES Prepared by Natashia Boland 1 and Irina Dumitrescu 2 1 Applications and Modelling 1.1

More information

Searching a Sorted Set of Strings

Searching a Sorted Set of Strings Department of Mathematics and Computer Science January 24, 2017 University of Southern Denmark RF Searching a Sorted Set of Strings Assume we have a set of n strings in RAM, and know their sorted order

More information

Advanced Databases: Parallel Databases A.Poulovassilis

Advanced Databases: Parallel Databases A.Poulovassilis 1 Advanced Databases: Parallel Databases A.Poulovassilis 1 Parallel Database Architectures Parallel database systems use parallel processing techniques to achieve faster DBMS performance and handle larger

More information

PatternRank: A Software-Pattern Search System Based on Mutual Reference Importance

PatternRank: A Software-Pattern Search System Based on Mutual Reference Importance PatternRank: A Software-Pattern Search System Based on Mutual Reference Importance Atsuto Kubo, Hiroyuki Nakayama, Hironori Washizaki, Yoshiaki Fukazawa Waseda University Department of Computer Science

More information

Finding Strongly Connected Components

Finding Strongly Connected Components Yufei Tao ITEE University of Queensland We just can t get enough of the beautiful algorithm of DFS! In this lecture, we will use it to solve a problem finding strongly connected components that seems to

More information

Characterizations of Trees

Characterizations of Trees Characterizations of Trees Lemma Every tree with at least two vertices has at least two leaves. Proof. 1. A connected graph with at least two vertices has an edge. 2. In an acyclic graph, an end point

More information

Stored Relvars 18 th April 2013 (30 th March 2001) David Livingstone. Stored Relvars

Stored Relvars 18 th April 2013 (30 th March 2001) David Livingstone. Stored Relvars Stored Relvars Introduction The purpose of a Stored Relvar (= Stored Relational Variable) is to provide a mechanism by which the value of a real (or base) relvar may be partitioned into fragments and/or

More information

Let v be a vertex primed by v i (s). Then the number f(v) of neighbours of v which have

Let v be a vertex primed by v i (s). Then the number f(v) of neighbours of v which have Let v be a vertex primed by v i (s). Then the number f(v) of neighbours of v which have been red in the sequence up to and including v i (s) is deg(v)? s(v), and by the induction hypothesis this sequence

More information

Mathematics and Computer Science

Mathematics and Computer Science Technical Report TR-2006-010 Revisiting hypergraph models for sparse matrix decomposition by Cevdet Aykanat, Bora Ucar Mathematics and Computer Science EMORY UNIVERSITY REVISITING HYPERGRAPH MODELS FOR

More information

Chapter 4. Relations & Graphs. 4.1 Relations. Exercises For each of the relations specified below:

Chapter 4. Relations & Graphs. 4.1 Relations. Exercises For each of the relations specified below: Chapter 4 Relations & Graphs 4.1 Relations Definition: Let A and B be sets. A relation from A to B is a subset of A B. When we have a relation from A to A we often call it a relation on A. When we have

More information

REGULAR GRAPHS OF GIVEN GIRTH. Contents

REGULAR GRAPHS OF GIVEN GIRTH. Contents REGULAR GRAPHS OF GIVEN GIRTH BROOKE ULLERY Contents 1. Introduction This paper gives an introduction to the area of graph theory dealing with properties of regular graphs of given girth. A large portion

More information

Algorithms Exam TIN093/DIT600

Algorithms Exam TIN093/DIT600 Algorithms Exam TIN093/DIT600 Course: Algorithms Course code: TIN 093 (CTH), DIT 600 (GU) Date, time: 24th October 2015, 14:00 18:00 Building: M Responsible teacher: Peter Damaschke, Tel. 5405. Examiner:

More information

Greedy Algorithms CHAPTER 16

Greedy Algorithms CHAPTER 16 CHAPTER 16 Greedy Algorithms In dynamic programming, the optimal solution is described in a recursive manner, and then is computed ``bottom up''. Dynamic programming is a powerful technique, but it often

More information

Outline. Distributed DBMS Page 5. 1

Outline. Distributed DBMS Page 5. 1 Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Fragmentation Data Location Semantic Data Control Distributed Query Processing Distributed Transaction Management

More information

Distributed minimum spanning tree problem

Distributed minimum spanning tree problem Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with

More information

Module 7. Independent sets, coverings. and matchings. Contents

Module 7. Independent sets, coverings. and matchings. Contents Module 7 Independent sets, coverings Contents and matchings 7.1 Introduction.......................... 152 7.2 Independent sets and coverings: basic equations..... 152 7.3 Matchings in bipartite graphs................

More information

Lecture 21. Software Pipelining & Prefetching. I. Software Pipelining II. Software Prefetching (of Arrays) III. Prefetching via Software Pipelining

Lecture 21. Software Pipelining & Prefetching. I. Software Pipelining II. Software Prefetching (of Arrays) III. Prefetching via Software Pipelining Lecture 21 Software Pipelining & Prefetching I. Software Pipelining II. Software Prefetching (of Arrays) III. Prefetching via Software Pipelining [ALSU 10.5, 11.11.4] Phillip B. Gibbons 15-745: Software

More information

Antisymmetric Relations. Definition A relation R on A is said to be antisymmetric

Antisymmetric Relations. Definition A relation R on A is said to be antisymmetric Antisymmetric Relations Definition A relation R on A is said to be antisymmetric if ( a, b A)(a R b b R a a = b). The picture for this is: Except For Example The relation on R: if a b and b a then a =

More information

Chapter 11: Query Optimization

Chapter 11: Query Optimization Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming

More information

Solutions to Homework 10

Solutions to Homework 10 CS/Math 240: Intro to Discrete Math 5/3/20 Instructor: Dieter van Melkebeek Solutions to Homework 0 Problem There were five different languages in Problem 4 of Homework 9. The Language D 0 Recall that

More information

GraphBLAS Mathematics - Provisional Release 1.0 -

GraphBLAS Mathematics - Provisional Release 1.0 - GraphBLAS Mathematics - Provisional Release 1.0 - Jeremy Kepner Generated on April 26, 2017 Contents 1 Introduction: Graphs as Matrices........................... 1 1.1 Adjacency Matrix: Undirected Graphs,

More information