Attribute-Pair Range Rules


Lecture Notes in Computer Science

Jerome Robinson, Barry G. T. Lowden
Department of Computer Science, University of Essex
Colchester, Essex, CO4 3SQ, U.K.
{robij, lowdb}@essex.ac.uk

Abstract. This paper examines the properties of metadata in the form of IF-THEN rules which contain two predicates on attributes of a relational database table. For example: a(15..30) ⇒ d(243..271), which means "if the value of attribute 'a' in a tuple is in the range 15 to 30 then the value of attribute 'd' will be in the range 243 to 271." Metadata of this kind is useful in Semantic Query Optimisation and Remote Cache Management. The two predicates (antecedent and consequent) in each rule are Selection Conditions or constraints of the type found in database queries. Each condition therefore denotes a subset of a database table. Rules can be cascaded, using subrange containment as the link between successive rules. The set of rules can therefore be regarded as a set of edges in a Condition Dependency Graph, and using the rule set amounts to path discovery in the graph. The purpose of the current paper is to introduce some of the properties of attribute-pair range rules.

1 Introduction

If data servers took more interest in their data they could be more helpful in their response to client queries. Metadata, in the form of pairs of range conditions from different attributes, can be easily derived from the data, either by induction triggered by queries [4] or by systematic analysis [6]. The resulting attribute-pair rules can be used for Semantic Query Optimisation and cache management in remote clients. Each Attribute Pair (AP) Rule is a subset descriptor for a database table, comprising a set selector (the antecedent) and a set descriptor (the consequent).
Methods for choosing appropriate subsets include deriving rules by systematic analysis and set reduction as in [6], or by associating a rule with each bar of a histogram describing the table as in [7], or by recognizing subsets that are frequently used during database access.

Some properties of AP Rules are now specified. Each rule has the simple form: A ⇒ B, where antecedent A and consequent B are range constraints or conditions applied to attributes a and b respectively. Range conditions include equality conditions, such as (a = n). Range conditions have two forms:

i) single-test conditions: (a θ n) where θ ∈ { <, ≤, =, ≥, > }, 'a' is an attribute name, and n is a value of the same type as the attribute,

and ii) double-test conditions: (n ≤ a ≤ m) where n and m are values of attribute a's type. a(n..m) is an abbreviation for (n ≤ a ≤ m). We assume a(n..m) denotes a closed interval (i.e. the values n and m are included in the range).

Rules are used by matching (comparing) condition A or B with another range condition such as a query condition or a condition in another rule. Two range conditions on the same database table are comparable if they have the same attribute name. E.g. b(5..81), (b > 63), (b = 549) can all be compared because they all denote ranges for attribute b. Comparison is a test for subrange containment.

Set(A) is a subset of Set(B) in the rule A ⇒ B: the set selected by applying selection condition A is a subset of the tuples obtained by condition B. This is a necessary consequence of the assertion A ⇒ B, which can be read as "all As are Bs". Furthermore, a rule partitions its database table into three disjoint subsets, corresponding to selection conditions: A & B, ¬A & B, ¬A & ¬B. (The fourth combination, A & ¬B, is empty because A ⇒ B.) The significance of this table partition is discussed in [7].

Attribute Pair Rules are used as subset selector/descriptor pairs, and they are an exact description of the data rather than the probabilistic knowledge usually produced in KDD. A large amount of such metadata is desirable, for a full description of the data, but the rule set must be structured for fast access. Examination of the Hasse Diagram for the ordered set of attribute ranges (e.g. Figs. 2 and 3) identifies an efficient data structure, access strategy, and systematic rule discovery algorithm for arbitrary sets of range rules.

The structure of the rest of the paper is as follows. Section 2 introduces Remote Cache Management as an example of the use of attribute-pair rules. Section 3 describes the rule set as a Condition Dependency Graph (CD Graph) and mentions Semantic Query Optimisation [4, 8, 9, 10, 11] as a second application requiring these simple subset descriptors.
Section 4 provides a data structure for range rule sets which allows rapid range matching, since fast access to the rule base is important to its applications.

2 Remote Cache Management

A query-processing issue of current research interest [e.g. 1, 2, 3, 5] concerns remote access, via a wide area network, to the data server. Mediators, for example, are a local interface for clients to multiple remote data servers. They cache query results in order to reuse cached values to answer later queries. This provides faster query answering, achieves some degree of independence from network delays and breakdowns, and helps reduce internet traffic congestion. Their problem is to identify queries that can be answered with locally cached data. Query result containment by previous result sets, and overlap (partial containment), must be recognized. Existing attempts to solve this problem by query analysis suffer the same drawback as conventional query optimisers: they are syntactic rather than semantic operations. Knowledge about the data can help.

For example: A query, Q1, selects all tuples in a table where attribute b is greater than 27. The result set is cached. A new query whose selection condition is (40 ≤ b ≤ 53) can obviously be answered from set Q1. But query Q3, whose selection condition is (21 ≤ e ≤ 33), has no apparent connection with the cached set, since it refers to a different attribute. No amount of syntactic analysis will reveal any connection. But the data server may know that (21 ≤ e ≤ 33) ⇒ (b > 27), i.e. all tuples in that table with e column values in the range [21..33] have a value of the b attribute that is > 27. Therefore the result set for Q3 is contained in the cached set from Q1.

Two ways to use this knowledge of the data to assist remote cache managers are:

1. If the Mediator sent query Q3 to the data server, the server would recognize the connection with previous query Q1 and reply with a selection condition to apply to the specified cache set to obtain the new results. This short reply is a small single-packet message able to pass rapidly through the store-and-forward network, faster than the result set.

2. In the case of a static data repository (such as a Data Warehouse, Data Archive, or just stable data) the server could send relevant rules to the remote cache manager, so that it can supplement its decisions with knowledge of the data.

Each query result set dispatched to a registered cache manager can be supplemented by a set of rules: N-1 rules for an N-ary database relation, one for each attribute other than the one in the current query. For example: A cached result is for the range condition a(15..30) on attribute a. For each other attribute x_i in the table, produce a rule: range(x_i) ⇒ a(15..30). This means "all tuples with x_i value within the specified range(x_i) will have values of a in the range [15..30] and are therefore in the local cache". Such rules can be obtained from the existing rule set, as follows.
Merging Rules

Each rule to be dispatched to the remote cache manager is obtained by merging cache-relevant rules from the server's rule set. There are 2 steps:

Step 1. Rules, in general, are classified according to the pair of attributes they contain. From the rules whose consequent condition refers to attribute a, and whose antecedent attribute is d, say, extract those with consequent range within [15..30], i.e. [ ]. E.g.:

R1: d( ) ⇒ a(16..25)
R2: d( ) ⇒ a( )
R3: d( ) ⇒ a(21..30)

R1 and R2 are nested rules, so R2 can be deleted from the set. It defines characteristics of a subset of the tuples covered by R1. R1 and R3 are overlap rules. These are the sort to merge, in step 2.

Step 2. From the selected set of d(n_i..m_i) ⇒ a(p_i..q_i) rules, obtain s = min(n_i), t = max(m_i), v = min(p_i), w = max(q_i). The merged rule is: d(s..t) ⇒ a(v..w), i.e. the union of antecedent conditions implies the union of consequents. E.g. from the rules above, d( ) ⇒ a(16..30). This means all tuples selected by condition d( ) will have attribute a values in the range [16..30].
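The two merging steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Rule` class and the functions `drop_nested` and `merge` are hypothetical names, and the d ranges in the example are invented because the source does not preserve them. Only the consequents a(16..25) and a(21..30) come from the text. The sketch also checks that the antecedent ranges overlap, since min/max merging is described above only for overlap rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """d(n..m) => a(p..q), all limits inclusive."""
    n: int
    m: int
    p: int
    q: int


def drop_nested(rules):
    """Step 1: discard a rule whose antecedent range lies strictly inside another's."""
    kept = []
    for r in rules:
        nested = any(
            s.n <= r.n and r.m <= s.m and (s.n, s.m) != (r.n, r.m)
            for s in rules
        )
        if not nested:
            kept.append(r)
    return kept


def merge(rules):
    """Step 2: union of overlapping antecedents implies union of consequents."""
    ordered = sorted(rules, key=lambda r: r.n)
    reach = ordered[0].m
    for r in ordered[1:]:
        if r.n > reach:
            # Values in the gap are covered by no rule, so min/max would overclaim.
            raise ValueError("antecedent ranges do not overlap")
        reach = max(reach, r.m)
    return Rule(
        min(r.n for r in rules), max(r.m for r in rules),
        min(r.p for r in rules), max(r.q for r in rules),
    )


# Hypothetical d ranges; consequents for R1 and R3 are those given in the text.
r1 = Rule(40, 60, 16, 25)
r2 = Rule(45, 50, 18, 22)   # nested inside R1, removed in step 1
r3 = Rule(55, 80, 21, 30)
merged = merge(drop_nested([r1, r2, r3]))
print(merged)   # antecedent d(40..80), consequent a(16..30)
```

With these assumed inputs the merged consequent is a(16..30), which agrees with the range stated in the text.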

Explanation: From R1, all tuples with d( ) have a(16..25); and from R3, all tuples with d( ) have a(21..30). Unioning the set selected by d( ) with that selected by d( ) produces the set selected by d( ). This set inherits the consequents from the rules describing the smaller sets. Since no new tuples were added, all tuples selected by constraint d( ) have ( a(16..25) OR a(21..30) ), i.e. all have a(16..30). Consequent ranges must overlap if antecedents overlap [7].

Note that the consequent does not quite fill the cache range a(15..30). If more rules had been available for merging, the consequent range might have been widened, in which case the antecedent range would necessarily increase and would then be able to capture a greater number of future queries.

2.1 Partial Containment by Cache

Current research on remote cache management [e.g. 1, 5] is interested in whether some of the answers to a current query are contained in the local cache. The user may not require more than the sample of results contained in the cache. But if the full set is needed, a request for a smaller set is made to the remote server, thereby reducing network traffic and also providing immediate access to some of the data. The smaller set complements the cached set to complete the query result set.

The server can support partial containment by the cache as well as full containment. Full containment is expressed by a rule A ⇒ B, where A is a new query and B the old, cached, query condition. Partial containment can be indicated by rules in a number of ways. For example, a rule B ⇒ A means cached set B is a subset of new query result A. Partial overlap in condition ranges also represents partial containment in the cache. E.g. new query: a( ), rule: a( ) ⇒ B, where B is the cached set.

2.2 Conjunctive Conditions

The cached query may be the result of a conjunction of selection conditions on the table, e.g. (A & C), where A, C are constraints on two different database attributes.
Two rules, B ⇒ A and B ⇒ C, mean B ⇒ (A & C). In practice, two rules with exactly the same antecedent, B, may not exist, so the intersection of antecedent ranges B′ and B″ from two rules B′ ⇒ A′ and B″ ⇒ C″ is used, where range(A) contains range(A′) and range(C) contains range(C″). The intersection of B′ and B″ provides the value of B in the rule B ⇒ (A & C).

For example:

cached query: ( c(15..30) AND a(12..20) )
rules: b(3..8) ⇒ c( ), b(4..9) ⇒ a( )

The first rule states: if attribute b has a value in the range [3..8] then attribute c will have a value in [ ], which is therefore in the wider range [15..30] used in the cache descriptor. Similarly, b(4..9) implies a is in [ ] and therefore in [12..20]. The rules therefore become: b(3..8) ⇒ c(15..30) and b(4..9) ⇒ a(12..20).
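The antecedent intersection for a conjunctive cache descriptor can be sketched as follows. The function names are hypothetical, and the narrow consequent ranges c(17..28) and a(13..19) are assumed values chosen for illustration, because the source does not preserve the original ones. The antecedents b(3..8), b(4..9) and the cached ranges c(15..30), a(12..20) are those of the example.

```python
def contains(outer, inner):
    """True if closed range `inner` lies within closed range `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def intersect(r1, r2):
    """Intersection of two closed ranges, or None if they do not meet."""
    lo, hi = max(r1[0], r2[0]), min(r1[1], r2[1])
    return (lo, hi) if lo <= hi else None


def conjunctive_antecedent(rule_c, rule_a, cached_c, cached_a):
    """Each rule is (antecedent range on b, consequent range).

    Returns the b range that implies both cached conditions, or None when
    a consequent is not inside its cached range or the antecedents are disjoint.
    """
    (b1, c_range), (b2, a_range) = rule_c, rule_a
    if not (contains(cached_c, c_range) and contains(cached_a, a_range)):
        return None
    return intersect(b1, b2)


rule_c = ((3, 8), (17, 28))   # b(3..8) => c(17..28), consequent assumed
rule_a = ((4, 9), (13, 19))   # b(4..9) => a(13..19), consequent assumed
b_range = conjunctive_antecedent(rule_c, rule_a, (15, 30), (12, 20))
print(b_range)   # (4, 8)
```

The consequents are only used for the containment check, so any narrow ranges inside the cached ranges give the same antecedent.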

(A rule means that all tuples matching the antecedent constraint will also obey the consequent constraint. It does not imply that values exist at all points on the number line interval denoted by the consequent range.)

From these rules it follows that b(4..8) ⇒ ( c(15..30) AND a(12..20) ), because if both rule antecedents are true (i.e. the intersection of their ranges) then both consequents are true. This rule does not claim to provide a complete description of the cached set, just some information rapidly available from currently held rules. The antecedent range [4..8] could no doubt be increased if the data itself, in the base relation, was examined.

2.3 Query Simplification

The remote cache manager labels each query result set with the query expression which produced the set [e.g. 2, 3]. The data server can use AP rules for semantic query reformulation, e.g. to eliminate redundant terms. For example, a query expression A & B is reduced to B by rule B ⇒ A. (A & B ≡ B if B ⇒ A, because set(B) ∩ set(A) = set(B) when B ⇒ A.) Eliminating terms from cache descriptor expressions allows them to subsume more queries in syntactic analysis, so cache management benefits from knowledge-based simplification of query expressions, using AP rules.

3 The Condition Dependency Graph (CD Graph)

It is useful to regard the set of rules as edges in a Condition Dependency Graph. Each rule is an edge, and rule composition produces paths of two or more edges which denote transitive rules. The first and last nodes in any path are antecedent and consequent in a rule, and can be linked directly by a single arc. These transitive rules identified by paths are no different in character from rules produced directly from data, e.g. by induction [4] or systematic analysis [6]. They denote a relationship between data values in two attributes. So although a path is a chain of deduction, it reveals a rule which is then independent of the inference path.
Intermediate rules in the path could be deleted from the rule set without affecting the validity of the transitive rule. Paths branch because a consequent range can imply many antecedent ranges in other rules, and because one antecedent condition can appear in several rules with consequents on different attributes. Branching is useful because transitive paths from a common antecedent can be intersected to produce a new, more specific rule.

Fig. 1. Part of a Condition Dependency Graph including query conditions A, F and E.
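Rule composition along a path can be sketched as below. This is a minimal illustration and not the authors' implementation: `Rule`, `compose` and `transitive_rules` are hypothetical names, the depth limit is an explicit parameter, and the rule d(10..50) ⇒ e(3..9) is invented for the example. The rules a(15..23) ⇒ b(12..29) and b(10..31) ⇒ d(15..44) are taken from Section 3.3 of the paper.

```python
from typing import NamedTuple, Optional, Tuple

Range = Tuple[int, int]


class Rule(NamedTuple):
    ant_attr: str
    ant: Range
    con_attr: str
    con: Range


def contains(outer: Range, inner: Range) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def compose(first: Rule, second: Rule) -> Optional[Rule]:
    """Cascade two rules when first's consequent is a subrange of second's antecedent."""
    if first.con_attr == second.ant_attr and contains(second.ant, first.con):
        return Rule(first.ant_attr, first.ant, second.con_attr, second.con)
    return None


def transitive_rules(start: Rule, rules, max_edges: int):
    """Follow paths of at most `max_edges` edges beginning with `start`."""
    found, frontier = [], [start]
    for _ in range(max_edges - 1):
        next_frontier = []
        for path_rule in frontier:
            for r in rules:
                new = compose(path_rule, r)
                if new is not None and new not in found:
                    found.append(new)
                    next_frontier.append(new)
        frontier = next_frontier
    return found


start = Rule("a", (15, 23), "b", (12, 29))
rules = [
    Rule("b", (10, 31), "d", (15, 44)),
    Rule("d", (10, 50), "e", (3, 9)),    # hypothetical rule
]
derived = transitive_rules(start, rules, max_edges=3)
for r in derived:
    print(r)
```

Each derived rule links the start antecedent directly to a later consequent, so it remains valid even if the intermediate rules are later removed from the rule set.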

Fig. 1 shows a second practical application of Attribute Pair rules, namely Semantic Query Optimisation (SQO), which involves path discovery in the Condition Dependency Graph. A query is a collection of conditions which query-result tuples must satisfy. The collection of query conditions is structured as a sum-of-products expression (i.e. a disjunction of conjunctive sub-queries). SQO rewrites the sub-queries in order to produce a query that can be processed faster.

Fig. 1 shows that three conditions, A, F and E, in a conjunctive query have been matched with conditions in rules. The path from A to E means that condition E can be deleted from the query, because A ⇒ E, so any tuples which satisfy condition A will also satisfy E, without being tested. The cycle containing condition F means F can be replaced in the query expression by K, if equivalent condition K is a faster test to apply to tuples. Equivalent conditions select the same set of tuples. If any of the conditions implied by A or E contradict condition F then the query will produce no results and can be answered immediately with the empty set without consulting any data (or, if the conjunct A & E & F is a subquery, it can be deleted from the larger query). For example, if F is the condition (25 ≤ d ≤ 34) and a rule (i.e. path) A ⇒ (45 ≤ d ≤ 63) exists, then no tuple can satisfy both conditions A and F as required by the query, because all tuples meeting condition A have attribute d values outside the range required by condition F.

3.1 Condition Matching

Matching conditions in the CD graph (e.g. query conditions to rule conditions, or consequent to antecedent when cascading rules) need not involve exact match if range conditions are involved. In fact, matching is an inference process using subrange containment as the rule of inference. A range implies its super-ranges: [n..m] ⇒ [n′..m′] for any n′ ≤ n and m′ ≥ m. E.g. b(15..30) ⇒ b(12..40), meaning if b is in the interval [15..30] it is also in [12..40].
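The two tests that drive this kind of rewriting, subrange containment and range disjointness, can be sketched as follows. The function names and the dictionary layout are assumptions made for illustration, and the range given for attribute a is invented. The d and b ranges are those used in the text.

```python
def implies(sub, sup):
    """A closed range implies any super-range that contains it."""
    return sup[0] <= sub[0] and sub[1] <= sup[1]


def disjoint(r1, r2):
    """True if two closed ranges share no value."""
    return r1[1] < r2[0] or r2[1] < r1[0]


def simplify(query, implied):
    """Rewrite one conjunctive sub-query.

    query:   {attribute: range} conditions the user asked for
    implied: {attribute: range} consequents reachable from some query condition
    Returns None for a contradiction (empty result), otherwise the query
    with redundant conditions removed.
    """
    kept = {}
    for attr, rng in query.items():
        known = implied.get(attr)
        if known is None:
            kept[attr] = rng
        elif disjoint(known, rng):
            return None            # no tuple can satisfy both conditions
        elif not implies(known, rng):
            kept[attr] = rng       # rule does not guarantee the condition
        # otherwise the condition is implied and can be dropped
    return kept


print(implies((15, 30), (12, 40)))            # b(15..30) => b(12..40): True

# A => (45 <= d <= 63) contradicts F = (25 <= d <= 34); range of a is hypothetical
print(simplify({"a": (1, 5), "d": (25, 34)}, {"d": (45, 63)}))   # None

# A => e(10..20) makes a wider query condition on e redundant
print(simplify({"a": (1, 5), "e": (0, 100)}, {"e": (10, 20)}))   # {'a': (1, 5)}
```

Replacing a condition by a cheaper equivalent one, as in the cycle through F and K, is not covered by this sketch.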
Conditions denote sets of tuples, so set( b(15..30) ) is part of the set of tuples with values in the range [12..40].

Query Condition/Rule Antecedent Matching

For comparable conditions, a query condition matches a rule antecedent if:

range(rule antecedent condition) contains range(query condition)

Reason: Query condition must imply rule antecedent. Implication is by subrange containment: set(query condition) ⊆ set(rule antecedent condition).

Rule Consequent/Query Condition Matching

For comparable conditions, a rule consequent matches a query condition if:

range(query condition) contains range(rule consequent condition)

Reason: Consequent condition must imply query condition is true. (This ends a chain of inference from one query condition to another, through one or more rules.)

Semantic Query Optimisation rewrites a query into a form which a Data Server can answer more quickly. SQO is a pre-processing stage between user and server, which intercepts and rewrites queries. It must therefore be fast, to avoid delaying the query and so counteracting the benefits of the faster query it produces. Therefore it is important to pre-process the CD Graph, before a query arrives, so that transitive rules are ready to apply without having to build paths at query time. It is therefore useful to know (as discussed in the next section) whether there is a limit to the amount of pre-processing to be done, since full transitive closure is a significant workload.

3.2 Maximum Pathlength in the CD Graph

The maximum pathlength to be examined in the CD graph when deriving transitive rules is N edges, where N is the number of columns in the database table. Paths in the graph can be longer than N, but rule derivation need only examine paths to depth N.

A path is a sequence of pairs of attribute conditions, formed by cascading attribute-pair rules. The longest useful path is one in which every attribute appears exactly once. When a path reaches a condition on the attribute from which it started, the node at the end of the path must denote a superset of the tuples denoted by the start-of-path condition. E.g. a(15..20) ⇒ a(12..93). Paths such as a(15..20) ⇒ a(23..32) or a(15..20) ⇒ a(18..19) cannot be derived, since these rules are self-contradictory. A range can only imply a super-range of itself.

Consider the path: A1 ⇒ B1 ⇒ C1 ⇒ A2 ⇒ C2 ⇒ E1, where Ai denotes condition i on attribute a, Bi condition i on attribute b, etc.

Case 1: Nested Paths with Shared Consequent

Three rules linking attributes a and c are shown: A1 ⇒ C1, A2 ⇒ C2, A1 ⇒ C2. But the second subsumes the third, since A1 is a subrange of A2 (because A1 ⇒ A2). For example, if A1 = a( ) and A2 = a( ), the information in the rule a( ) ⇒ C2 is only part of the information in the rule a( ) ⇒ C2.

Case 2: Absence of Consequent Attribute from the First Cycle

When a path reaches an attribute for the second time it starts a second cycle. E.g. A1 and A2 in the path above start different cycles through the database table's attribute set.
Two rules link attributes a and e in the path above: A1 ⇒ E1 and A2 ⇒ E1. Since condition node A1 does not imply any other condition on attribute e in the path, the rule A1 ⇒ E1 is the best rule in which condition A1 implies a value range for attribute e. But it is a redundant rule, since it is subsumed by A2 ⇒ E1, because range(A2) contains range(A1), so set(A1) ⊆ set(A2).

Therefore, although paths longer than N can be generated they are not useful. A second cycle can be treated as a SEPARATE path, providing information about a different subset of the database table. This is a superset of the tuples described by the first cycle, since the start node of the second path is a super-range of the first path's start node. (The start node is the antecedent of the rules in that section of the path, and selects the set of tuples described by the rules.)

3.3 Merging Antecedent Ranges

The CD graph can therefore be seen as a collection of separate, but maybe concatenated, paths. So the maximum number of incoming edges to any node is N-1, representing each of the other columns of the database table. More than N-1 would denote redundant rules, since two rules have the same consequent, the same antecedent column, but different antecedent range. If those ranges are nested then the subrange rules are discarded. But if ranges overlap, they are merged during graph processing to produce a wider (more useful) range antecedent, as follows.

Consider two incoming edges, a(15..20) ⇒ b(12..23) and a(17..23) ⇒ b(16..29), to the antecedent node of rule b(10..31) ⇒ d(15..44). The two original rules are separate subset descriptors, with overlapping antecedent ranges. They can be combined, by unioning, to a single rule, a(15..23) ⇒ b(12..29), whose consequent assertion satisfies the antecedent constraint of b(10..31) ⇒ d(15..44) and therefore produces a new, transitive, rule: a(15..23) ⇒ d(15..44). A new antecedent node is thus added to the CD graph, for the condition a(15..23), as the start of transitive rules.

3.4 Maximum Out-degree of Graph Vertices

When producing new edges to represent transitive paths, the number of outgoing edges from any node must be limited to N-1 (for an N-ary database relation) so that a node is only connected to the narrowest available range for each consequent attribute. Sequential and parallel paths can provide consequents.

Nested consequent ranges are produced by successive cycles in sequential paths, producing rules such as: a(15..23) ⇒ e(20..27), a(15..23) ⇒ e(10..49), a(15..23) ⇒ e(3..182). The first rule subsumes the others. The redundant rules are not produced if transitive pathlength is limited as discussed in section 3.2.

Parallel paths produce overlapping consequent ranges, such as: a(15..23) ⇒ e(20..27), a(15..23) ⇒ e(24..31). Rule intersection provides an improved transitive rule, a(15..23) ⇒ e(24..27). Both original rules are true, but less informative than their intersection rule. Each consequent assertion is improved (made more specific) by information in another consequent, since only values common to all the overlapping consequent ranges can actually exist in the subset data they describe.
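Choosing the single outgoing edge per consequent attribute reduces to intersecting the consequent ranges that share an antecedent. The sketch below uses a hypothetical function name and represents each closed range as a `(low, high)` tuple. The ranges are the ones quoted above.

```python
def narrowest_consequent(ranges):
    """Intersect consequent ranges implied by one antecedent condition.

    Every input rule is assumed valid, so only values common to all the
    ranges can occur. An empty intersection returns None, which would mean
    the antecedent selects no tuples.
    """
    lo = max(r[0] for r in ranges)
    hi = min(r[1] for r in ranges)
    return (lo, hi) if lo <= hi else None


# Parallel paths: a(15..23) => e(20..27) and a(15..23) => e(24..31)
print(narrowest_consequent([(20, 27), (24, 31)]))             # (24, 27)

# Sequential cycles: nested consequents collapse to the narrowest range
print(narrowest_consequent([(20, 27), (10, 49), (3, 182)]))   # (20, 27)
```

The same function covers both cases because a nested range is simply the intersection of itself with its super-ranges.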
4 Rule or Condition Match Algorithms

The condition match problem is one that must be solved in any rule system which has a large number of rules. In a system that automatically increases the size of the rule set by discovery, there is a tendency for the rule system's performance to deteriorate as the set grows. The match phase in production systems (such as Expert Database Systems or C-A Rule support in Active Databases) is a very time-consuming component of rule use, in the continuous match-select-act cycle. It requires the use of discrimination networks such as RETE, TREAT or GATOR which store partial match results, in order to improve performance.

However, attribute-pair range-condition rules are very different. Their structure and semantics allow rapid matching, by simple lookup algorithms. The algorithms, and the storage structure, are derived from the Hasse Diagram shown in Fig. 2, which denotes the ordered set of possible range conditions on an integer attribute whose extreme values are 10 and 20. But the diagram reveals the structure of any set of range conditions on any orderable attribute type. It is the position of certain zones on the diagram (those shown in Fig. 3) which suggests a suitable data structure and search algorithm for attribute-pair rules.

Fig. 2. Hasse Diagram for Range Conditions

Fig. 3. Significant areas for range node 12-17
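The storage structure that this section goes on to derive from the diagram (lists keyed by lower range limit in ascending order, nodes within each list in descending order of upper limit) might be sketched as below. The class and method names are hypothetical and the consequent values are invented. The antecedent ranges lie within the 10..20 domain of Fig. 2 and include node 12-17. The lookup shown is the antecedent match needed for SQO: find every stored antecedent range that contains a query range.

```python
import bisect


class RangeRuleIndex:
    """Rules a(m..n) => b(p..q) for one attribute pair of one table."""

    def __init__(self):
        self._listvalues = []   # distinct lower limits m, ascending
        self._lists = {}        # m -> [(n, p, q), ...] in descending n

    def add(self, m, n, p, q):
        if m not in self._lists:
            bisect.insort(self._listvalues, m)
            self._lists[m] = []
        nodes = self._lists[m]
        nodes.append((n, p, q))
        nodes.sort(key=lambda node: node[0], reverse=True)

    def antecedents_containing(self, lo, hi):
        """Return rules whose antecedent range contains the query range [lo..hi]."""
        matches = []
        for m in self._listvalues:
            if m > lo:
                break           # every later list starts above the query's lower limit
            for n, p, q in self._lists[m]:
                if n < hi:
                    break       # remaining nodes in this list end even earlier
                matches.append(((m, n), (p, q)))
        return matches


index = RangeRuleIndex()
index.add(12, 14, 3, 7)
index.add(10, 20, 1, 9)
index.add(15, 18, 4, 6)
index.add(12, 17, 2, 8)
hits = index.antecedents_containing(13, 16)
print(hits)   # [((10, 20), (1, 9)), ((12, 17), (2, 8))]
```

Both loops stop early because of the sort order, which is the property the sorted lists are meant to provide. Missing nodes do not affect the search, since it relies only on ordering.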

Fig. 4a. A Data Structure

Fig. 4b. Possible Version in Practice

Fig. 4a shows a tree data structure derived from the Hasse diagram: a sequence of lists of nodes (range conditions). All nodes in a diagonal list have the same lower limit for their ranges (as shown in Fig. 2). This common value for a list is the listvalue. The set of lists is sorted by listvalue, so the sequence of diagonals in Fig. 4a is in ascending order of lower range limits. The nodes within each list are sorted in descending order of upper range limit. This structure corresponds to the diagonal rows in the Hasse diagram, and so allows early-terminating search algorithms to be easily specified for range match in any of the 'significant areas' in Fig. 3. In practice, nodes may be missing from the structure shown in Fig. 2, as indicated in Fig. 4b, but since the order of nodes is maintained the same search algorithm still applies, and works equally well with non-integer numeric types.

To support antecedent lookup in the rule set, as required for SQO, a Hasse Diagram whose nodes correspond to antecedent ranges is used. The structure represents the AP set associated with one pair of attributes from a specific table. The rule set can be stored as a sorted set of lists of rules, as shown in Fig. 4b. Each rule has the form a(m..n) ⇒ b(p..q). All rules in a list have the same m value, the listvalue for that list, which is used to sort the lists into ascending order. Within each list each node contains three values, <n, p, q>, where n is called the nodevalue and is used to sort nodes in the list into descending order. The attribute names, a and b, are implicit since the data structure contains only rules for one specific attribute pair.

5 Conclusions

This paper introduced some of the properties of Attribute-Pair range-condition rules, which can be derived automatically from data and constitute a description of certain features of the data.
Two practical uses for these rules were identified, namely Remote Cache Management and Semantic Query Optimisation. These simple rules avoid many of the practical difficulties associated with more elaborate rule bases, which render them impractical for strictly time-constrained applications. More elaborate rule structures are more expressive, but lack benefits such as (i) fast access through a simple regular data structure, (ii) graph representation providing a map to guide rule application, and allowing full pre-computation of transitive inferences, (iii) modular rulebase structure, from rule classification by attribute pair and database relation, allowing efficient rule set management (such as access only to currently relevant subsets), and (iv) easily derived rules which are therefore easily discarded, rather than accumulating continuously.

The CD graph represents dependencies between subsets of data in a database table. Each path represents a sequence of monotonically increasing nested sets of tuples. SQO uses forward paths. Cache management requires backward paths from a specified node. Graph pre-processing builds transitive paths before they are urgently needed. Parallel paths represent sets of transitive rules which can be merged to a single rule by intersection, or to another rule by unioning corresponding ranges in rules. Unioning is useful in cache expression implication, and in broadening the antecedent range to enclose a given assertion range. Intersection narrows ranges to produce a more specific subset descriptor. Previous work in SQO has used rule sets which are an arbitrary collection of rule structures, not amenable to the graph representation for inference closure, and with inherently lower utility per rule in the applications we consider.

References

1. Adali, S., Candan, K. S., Papakonstantinou, Y., Subrahmanian, V. S.: Query Caching and Optimization in Distributed Mediator Systems. ACM SIGMOD Conf. (1996)
2. Dar, S., Franklin, M. J., Jonsson, B. T., Srivastava, D., Tan, M.: Semantic Data Caching and Replacement. Proc. 22nd VLDB Conference (1996)
3. Keller, A. M., Basu, J.: A Predicate-based Caching Scheme for Client-Server Database Architectures. VLDB Journal 5(1) (1996)
4. Lowden, B. G. T., Robinson, J., Lim, K. Y.: A Semantic Query Optimiser using Automatic Rule Derivation. WITS 95, 5th Intl. Workshop on Information Technologies and Systems (1995)
5. Qian, X.: Query Folding. 12th IEEE Intl. Conf. on Data Engineering (1996)
6. Robinson, J., Lowden, B. G. T.: Data Analysis for Query Processing. 2nd Intl. Symposium on Intelligent Data Analysis (1997) (LNCS 1280)
7. Robinson, J., Lowden, B. G. T.: Semantic Query Optimisation and Rule Graphs. KRDB'98, 5th International Workshop on Knowledge Representation meets Data Bases (1998)
8. Shekhar, S., et al.: A Formal Trade-off between Optimization and Execution Costs in Semantic Query Optimization. Proc. 14th VLDB Conference (1988)
9. Shenoy, S. T., Ozsoyoglu, Z. M.: Design and Implementation of a Semantic Query Optimizer. IEEE Trans. Knowledge and Data Engineering, 1(3) (1989)
10. Siegel, M., et al.: A Method for Automatic Rule Derivation to Support Semantic Query Optimization. ACM Trans. Database Systems, 17(4) (1992)
11. Yu, C., Sun, W.: Automatic Knowledge Acquisition and Maintenance for Semantic Query Optimization. IEEE Trans. Knowledge and Data Engineering, 1(3) (1989)


More information

Appendix 1. Description Logic Terminology

Appendix 1. Description Logic Terminology Appendix 1 Description Logic Terminology Franz Baader Abstract The purpose of this appendix is to introduce (in a compact manner) the syntax and semantics of the most prominent DLs occurring in this handbook.

More information

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 13: Query Processing Basic Steps in Query Processing

Chapter 13: Query Processing Basic Steps in Query Processing Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Binary Decision Diagrams

Binary Decision Diagrams Logic and roof Hilary 2016 James Worrell Binary Decision Diagrams A propositional formula is determined up to logical equivalence by its truth table. If the formula has n variables then its truth table

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Algebraic Properties of CSP Model Operators? Y.C. Law and J.H.M. Lee. The Chinese University of Hong Kong.

Algebraic Properties of CSP Model Operators? Y.C. Law and J.H.M. Lee. The Chinese University of Hong Kong. Algebraic Properties of CSP Model Operators? Y.C. Law and J.H.M. Lee Department of Computer Science and Engineering The Chinese University of Hong Kong Shatin, N.T., Hong Kong SAR, China fyclaw,jleeg@cse.cuhk.edu.hk

More information

Database System Concepts

Database System Concepts Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Introduction to Sets and Logic (MATH 1190)

Introduction to Sets and Logic (MATH 1190) Introduction to Sets and Logic () Instructor: Email: shenlili@yorku.ca Department of Mathematics and Statistics York University Dec 4, 2014 Outline 1 2 3 4 Definition A relation R from a set A to a set

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

I. Khalil Ibrahim, V. Dignum, W. Winiwarter, E. Weippl, Logic Based Approach to Semantic Query Transformation for Knowledge Management Applications,

I. Khalil Ibrahim, V. Dignum, W. Winiwarter, E. Weippl, Logic Based Approach to Semantic Query Transformation for Knowledge Management Applications, I. Khalil Ibrahim, V. Dignum, W. Winiwarter, E. Weippl, Logic Based Approach to Semantic Query Transformation for Knowledge Management Applications, Proc. of the International Conference on Knowledge Management

More information

Web Service Usage Mining: Mining For Executable Sequences

Web Service Usage Mining: Mining For Executable Sequences 7th WSEAS International Conference on APPLIED COMPUTER SCIENCE, Venice, Italy, November 21-23, 2007 266 Web Service Usage Mining: Mining For Executable Sequences MOHSEN JAFARI ASBAGH, HASSAN ABOLHASSANI

More information

Fast Discovery of Sequential Patterns Using Materialized Data Mining Views

Fast Discovery of Sequential Patterns Using Materialized Data Mining Views Fast Discovery of Sequential Patterns Using Materialized Data Mining Views Tadeusz Morzy, Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science ul. Piotrowo

More information

detected inference channel is eliminated by redesigning the database schema [Lunt, 1989] or upgrading the paths that lead to the inference [Stickel, 1

detected inference channel is eliminated by redesigning the database schema [Lunt, 1989] or upgrading the paths that lead to the inference [Stickel, 1 THE DESIGN AND IMPLEMENTATION OF A DATA LEVEL DATABASE INFERENCE DETECTION SYSTEM Raymond W. Yip and Karl N. Levitt Abstract: Inference is a way tosubvert access control mechanisms of database systems.

More information

The Relationship between Slices and Module Cohesion

The Relationship between Slices and Module Cohesion The Relationship between Slices and Module Cohesion Linda M. Ott Jeffrey J. Thuss Department of Computer Science Michigan Technological University Houghton, MI 49931 Abstract High module cohesion is often

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

STABILITY AND PARADOX IN ALGORITHMIC LOGIC

STABILITY AND PARADOX IN ALGORITHMIC LOGIC STABILITY AND PARADOX IN ALGORITHMIC LOGIC WAYNE AITKEN, JEFFREY A. BARRETT Abstract. Algorithmic logic is the logic of basic statements concerning algorithms and the algorithmic rules of deduction between

More information

Data Analytics and Boolean Algebras

Data Analytics and Boolean Algebras Data Analytics and Boolean Algebras Hans van Thiel November 28, 2012 c Muitovar 2012 KvK Amsterdam 34350608 Passeerdersstraat 76 1016 XZ Amsterdam The Netherlands T: + 31 20 6247137 E: hthiel@muitovar.com

More information

15.4 Longest common subsequence

15.4 Longest common subsequence 15.4 Longest common subsequence Biological applications often need to compare the DNA of two (or more) different organisms A strand of DNA consists of a string of molecules called bases, where the possible

More information

Chapter 13: Query Optimization. Chapter 13: Query Optimization

Chapter 13: Query Optimization. Chapter 13: Query Optimization Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical

More information

LOGIC AND DISCRETE MATHEMATICS

LOGIC AND DISCRETE MATHEMATICS LOGIC AND DISCRETE MATHEMATICS A Computer Science Perspective WINFRIED KARL GRASSMANN Department of Computer Science University of Saskatchewan JEAN-PAUL TREMBLAY Department of Computer Science University

More information

ITCT Lecture 6.1: Huffman Codes

ITCT Lecture 6.1: Huffman Codes ITCT Lecture 6.1: Huffman Codes Prof. Ja-Ling Wu Department of Computer Science and Information Engineering National Taiwan University Huffman Encoding 1. Order the symbols according to their probabilities

More information

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Petr Somol 1,2, Jana Novovičová 1,2, and Pavel Pudil 2,1 1 Dept. of Pattern Recognition, Institute of Information Theory and

More information

CSE 20 DISCRETE MATH. Fall

CSE 20 DISCRETE MATH. Fall CSE 20 DISCRETE MATH Fall 2017 http://cseweb.ucsd.edu/classes/fa17/cse20-ab/ Final exam The final exam is Saturday December 16 11:30am-2:30pm. Lecture A will take the exam in Lecture B will take the exam

More information

Redundant States in Sequential Circuits

Redundant States in Sequential Circuits Redundant States in Sequential Circuits Removal of redundant states is important because Cost: the number of memory elements is directly related to the number of states Complexity: the more states the

More information

CSE 20 DISCRETE MATH. Winter

CSE 20 DISCRETE MATH. Winter CSE 20 DISCRETE MATH Winter 2017 http://cseweb.ucsd.edu/classes/wi17/cse20-ab/ Final exam The final exam is Saturday March 18 8am-11am. Lecture A will take the exam in GH 242 Lecture B will take the exam

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

Bipartite Graph Partitioning and Content-based Image Clustering

Bipartite Graph Partitioning and Content-based Image Clustering Bipartite Graph Partitioning and Content-based Image Clustering Guoping Qiu School of Computer Science The University of Nottingham qiu @ cs.nott.ac.uk Abstract This paper presents a method to model the

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

CS Bootcamp Boolean Logic Autumn 2015 A B A B T T T T F F F T F F F F T T T T F T F T T F F F

CS Bootcamp Boolean Logic Autumn 2015 A B A B T T T T F F F T F F F F T T T T F T F T T F F F 1 Logical Operations 1.1 And The and operator is a binary operator, denoted as, &,, or sometimes by just concatenating symbols, is true only if both parameters are true. A B A B F T F F F F The expression

More information

UML Class Model Abstract Syntax and Set-Based Semantics

UML Class Model Abstract Syntax and Set-Based Semantics UML Class Model Abstract Syntax and Set-Based Semantics Mira Balaban and Azzam Maraee Computer Science Department Ben-Gurion University of the Negev, ISRAEL mira,mari@cs.bgu.ac.il March 31, 2017 The class-model

More information

SEQUENTIAL PATTERN MINING FROM WEB LOG DATA

SEQUENTIAL PATTERN MINING FROM WEB LOG DATA SEQUENTIAL PATTERN MINING FROM WEB LOG DATA Rajashree Shettar 1 1 Associate Professor, Department of Computer Science, R. V College of Engineering, Karnataka, India, rajashreeshettar@rvce.edu.in Abstract

More information

Alloy: A Lightweight Object Modelling Notation

Alloy: A Lightweight Object Modelling Notation Alloy: A Lightweight Object Modelling Notation Daniel Jackson, ACM Transactions on Software Engineering, 2002 Presented By: Steven Stewart, 2012-January-23 1 Alloy: 2002 to present Software is built on

More information

Programming Languages Third Edition

Programming Languages Third Edition Programming Languages Third Edition Chapter 12 Formal Semantics Objectives Become familiar with a sample small language for the purpose of semantic specification Understand operational semantics Understand

More information

The Inverse of a Schema Mapping

The Inverse of a Schema Mapping The Inverse of a Schema Mapping Jorge Pérez Department of Computer Science, Universidad de Chile Blanco Encalada 2120, Santiago, Chile jperez@dcc.uchile.cl Abstract The inversion of schema mappings has

More information

6. Relational Algebra (Part II)

6. Relational Algebra (Part II) 6. Relational Algebra (Part II) 6.1. Introduction In the previous chapter, we introduced relational algebra as a fundamental model of relational database manipulation. In particular, we defined and discussed

More information

Relational Model: History

Relational Model: History Relational Model: History Objectives of Relational Model: 1. Promote high degree of data independence 2. Eliminate redundancy, consistency, etc. problems 3. Enable proliferation of non-procedural DML s

More information

Accelerating BI on Hadoop: Full-Scan, Cubes or Indexes?

Accelerating BI on Hadoop: Full-Scan, Cubes or Indexes? White Paper Accelerating BI on Hadoop: Full-Scan, Cubes or Indexes? How to Accelerate BI on Hadoop: Cubes or Indexes? Why not both? 1 +1(844)384-3844 INFO@JETHRO.IO Overview Organizations are storing more

More information

CHAPTER-23 MINING COMPLEX TYPES OF DATA

CHAPTER-23 MINING COMPLEX TYPES OF DATA CHAPTER-23 MINING COMPLEX TYPES OF DATA 23.1 Introduction 23.2 Multidimensional Analysis and Descriptive Mining of Complex Data Objects 23.3 Generalization of Structured Data 23.4 Aggregation and Approximation

More information

XML Filtering Technologies

XML Filtering Technologies XML Filtering Technologies Introduction Data exchange between applications: use XML Messages processed by an XML Message Broker Examples Publish/subscribe systems [Altinel 00] XML message routing [Snoeren

More information

Chapter 11: Query Optimization

Chapter 11: Query Optimization Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming

More information

A Model of Machine Learning Based on User Preference of Attributes

A Model of Machine Learning Based on User Preference of Attributes 1 A Model of Machine Learning Based on User Preference of Attributes Yiyu Yao 1, Yan Zhao 1, Jue Wang 2 and Suqing Han 2 1 Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

A Fast Method for Ensuring the Consistency of Integrity Constraints

A Fast Method for Ensuring the Consistency of Integrity Constraints A Fast Method for Ensuring the Consistency of Integrity Constraints Barry G. T. Lowden and Jerome Robinson Department of Computer Science, The University of Essex, Wivenhoe Park, Colchester CO4 3SQ, Essex,

More information

Chapter 14: Query Optimization

Chapter 14: Query Optimization Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Query processing and optimization

Query processing and optimization Query processing and optimization These slides are a modified version of the slides of the book Database System Concepts (Chapter 13 and 14), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan.

More information

Multidimensional Indexes [14]

Multidimensional Indexes [14] CMSC 661, Principles of Database Systems Multidimensional Indexes [14] Dr. Kalpakis http://www.csee.umbc.edu/~kalpakis/courses/661 Motivation Examined indexes when search keys are in 1-D space Many interesting

More information

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS SRIVANI SARIKONDA 1 PG Scholar Department of CSE P.SANDEEP REDDY 2 Associate professor Department of CSE DR.M.V.SIVA PRASAD 3 Principal Abstract:

More information

Managing Firewall Services

Managing Firewall Services CHAPTER 11 Firewall Services manages firewall-related policies in Security Manager that apply to the Adaptive Security Appliance (ASA), PIX Firewall (PIX), Catalyst Firewall Services Module (FWSM), and

More information

Notes on Binary Dumbbell Trees

Notes on Binary Dumbbell Trees Notes on Binary Dumbbell Trees Michiel Smid March 23, 2012 Abstract Dumbbell trees were introduced in [1]. A detailed description of non-binary dumbbell trees appears in Chapter 11 of [3]. These notes

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24 Databases -Normalization I (GF Royle, N Spadaccini 2006-2010) Databases - Normalization I 1 / 24 This lecture This lecture introduces normal forms, decomposition and normalization. We will explore problems

More information

Module 9: Selectivity Estimation

Module 9: Selectivity Estimation Module 9: Selectivity Estimation Module Outline 9.1 Query Cost and Selectivity Estimation 9.2 Database profiles 9.3 Sampling 9.4 Statistics maintained by commercial DBMS Web Forms Transaction Manager Lock

More information

Digital Archives: Extending the 5S model through NESTOR

Digital Archives: Extending the 5S model through NESTOR Digital Archives: Extending the 5S model through NESTOR Nicola Ferro and Gianmaria Silvello Department of Information Engineering, University of Padua, Italy {ferro, silvello}@dei.unipd.it Abstract. Archives

More information

Novel Materialized View Selection in a Multidimensional Database

Novel Materialized View Selection in a Multidimensional Database Graphic Era University From the SelectedWorks of vijay singh Winter February 10, 2009 Novel Materialized View Selection in a Multidimensional Database vijay singh Available at: https://works.bepress.com/vijaysingh/5/

More information

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery

AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery : Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery Hong Cheng Philip S. Yu Jiawei Han University of Illinois at Urbana-Champaign IBM T. J. Watson Research Center {hcheng3, hanj}@cs.uiuc.edu,

More information

Constraint Processing Offers Improved Expressiveness and Inference for Interactive Expert Systems

Constraint Processing Offers Improved Expressiveness and Inference for Interactive Expert Systems Constraint Processing Offers Improved Expressiveness and Inference for Interactive Expert Systems James Bowen Cork Constraint Computation Centre UCC, Cork, Ireland Email: j.bowen@4c.ucc.ie Abstract. Expert

More information

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization. This lecture Databases -Normalization I This lecture introduces normal forms, decomposition and normalization (GF Royle 2006-8, N Spadaccini 2008) Databases - Normalization I 1 / 23 (GF Royle 2006-8, N

More information

Computing Data Cubes Using Massively Parallel Processors

Computing Data Cubes Using Massively Parallel Processors Computing Data Cubes Using Massively Parallel Processors Hongjun Lu Xiaohui Huang Zhixian Li {luhj,huangxia,lizhixia}@iscs.nus.edu.sg Department of Information Systems and Computer Science National University

More information

Issues on Decentralized Consistency Checking of Multi-lateral Collaborations

Issues on Decentralized Consistency Checking of Multi-lateral Collaborations Issues on Decentralized Consistency Checking of Multi-lateral Collaborations Andreas Wombacher University of Twente Enschede The Netherlands a.wombacher@utwente.nl Abstract Decentralized consistency checking

More information

Automata Theory for Reasoning about Actions

Automata Theory for Reasoning about Actions Automata Theory for Reasoning about Actions Eugenia Ternovskaia Department of Computer Science, University of Toronto Toronto, ON, Canada, M5S 3G4 eugenia@cs.toronto.edu Abstract In this paper, we show

More information

Big Data Management and NoSQL Databases

Big Data Management and NoSQL Databases NDBI040 Big Data Management and NoSQL Databases Lecture 10. Graph databases Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz http://www.ksi.mff.cuni.cz/~holubova/ndbi040/ Graph Databases Basic

More information

Knowledge Discovery from Client-Server Databases

Knowledge Discovery from Client-Server Databases Knowledge Discovery from Client-Server Databases Nell Dewhurst and Simon Lavington Department of Computer Science, University of Essex, Wivenhoe Park, Colchester CO4 4SQ, UK neilqessex, ac.uk, lavingt

More information

Module 3. Requirements Analysis and Specification. Version 2 CSE IIT, Kharagpur

Module 3. Requirements Analysis and Specification. Version 2 CSE IIT, Kharagpur Module 3 Requirements Analysis and Specification Lesson 6 Formal Requirements Specification Specific Instructional Objectives At the end of this lesson the student will be able to: Explain what a formal

More information

Data Access Paths for Frequent Itemsets Discovery

Data Access Paths for Frequent Itemsets Discovery Data Access Paths for Frequent Itemsets Discovery Marek Wojciechowski, Maciej Zakrzewicz Poznan University of Technology Institute of Computing Science {marekw, mzakrz}@cs.put.poznan.pl Abstract. A number

More information

On Reduct Construction Algorithms

On Reduct Construction Algorithms 1 On Reduct Construction Algorithms Yiyu Yao 1, Yan Zhao 1 and Jue Wang 2 1 Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 {yyao, yanzhao}@cs.uregina.ca 2 Laboratory

More information

Join (SQL) - Wikipedia, the free encyclopedia

Join (SQL) - Wikipedia, the free encyclopedia 페이지 1 / 7 Sample tables All subsequent explanations on join types in this article make use of the following two tables. The rows in these tables serve to illustrate the effect of different types of joins

More information

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch Advanced Databases Lecture 4 - Query Optimization Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Lecture 22 Tuesday, April 10

Lecture 22 Tuesday, April 10 CIS 160 - Spring 2018 (instructor Val Tannen) Lecture 22 Tuesday, April 10 GRAPH THEORY Directed Graphs Directed graphs (a.k.a. digraphs) are an important mathematical modeling tool in Computer Science,

More information

CHAPTER 3 FUZZY RELATION and COMPOSITION

CHAPTER 3 FUZZY RELATION and COMPOSITION CHAPTER 3 FUZZY RELATION and COMPOSITION The concept of fuzzy set as a generalization of crisp set has been introduced in the previous chapter. Relations between elements of crisp sets can be extended

More information

ANFIS: ADAPTIVE-NETWORK-BASED FUZZY INFERENCE SYSTEMS (J.S.R. Jang 1993,1995) bell x; a, b, c = 1 a

ANFIS: ADAPTIVE-NETWORK-BASED FUZZY INFERENCE SYSTEMS (J.S.R. Jang 1993,1995) bell x; a, b, c = 1 a ANFIS: ADAPTIVE-NETWORK-ASED FUZZ INFERENCE SSTEMS (J.S.R. Jang 993,995) Membership Functions triangular triangle( ; a, a b, c c) ma min = b a, c b, 0, trapezoidal trapezoid( ; a, b, a c, d d) ma min =

More information

Nesnelerin İnternetinde Veri Analizi

Nesnelerin İnternetinde Veri Analizi Bölüm 4. Frequent Patterns in Data Streams w3.gazi.edu.tr/~suatozdemir What Is Pattern Discovery? What are patterns? Patterns: A set of items, subsequences, or substructures that occur frequently together

More information

Representing Product Designs Using a Description Graph Extension to OWL 2

Representing Product Designs Using a Description Graph Extension to OWL 2 Representing Product Designs Using a Description Graph Extension to OWL 2 Henson Graves Lockheed Martin Aeronautics Company Fort Worth Texas, USA henson.graves@lmco.com Abstract. Product development requires

More information

Slides for Faculty Oxford University Press All rights reserved.

Slides for Faculty Oxford University Press All rights reserved. Oxford University Press 2013 Slides for Faculty Assistance Preliminaries Author: Vivek Kulkarni vivek_kulkarni@yahoo.com Outline Following topics are covered in the slides: Basic concepts, namely, symbols,

More information

Oracle Database 11g: SQL Tuning Workshop

Oracle Database 11g: SQL Tuning Workshop Oracle University Contact Us: Local: 0845 777 7 711 Intl: +44 845 777 7 711 Oracle Database 11g: SQL Tuning Workshop Duration: 3 Days What you will learn This Oracle Database 11g: SQL Tuning Workshop Release

More information

UDP Packet Monitoring with Stanford Data Stream Manager

UDP Packet Monitoring with Stanford Data Stream Manager UDP Packet Monitoring with Stanford Data Stream Manager Nadeem Akhtar #1, Faridul Haque Siddiqui #2 # Department of Computer Engineering, Aligarh Muslim University Aligarh, India 1 nadeemalakhtar@gmail.com

More information

A Framework for Securing Databases from Intrusion Threats

A Framework for Securing Databases from Intrusion Threats A Framework for Securing Databases from Intrusion Threats R. Prince Jeyaseelan James Department of Computer Applications, Valliammai Engineering College Affiliated to Anna University, Chennai, India Email:

More information

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture

More information

Chapter 14. Chapter 14 - Objectives. Purpose of Normalization. Purpose of Normalization

Chapter 14. Chapter 14 - Objectives. Purpose of Normalization. Purpose of Normalization Chapter 14 - Objectives Chapter 14 Normalization The purpose of normalization. How normalization can be used when designing a relational database. The potential problems associated with redundant data

More information

Guided Tour: Intelligent Conceptual Modelling in EER and UML-like class diagrams with icom compared to ORM2

Guided Tour: Intelligent Conceptual Modelling in EER and UML-like class diagrams with icom compared to ORM2 Guided Tour: Intelligent Conceptual Modelling in EER and UML-like class diagrams with icom compared to ORM2 Abstract. In this guided tour we illustrate the advantages of intelligent conceptual modelling,

More information

SQL Data Querying and Views

SQL Data Querying and Views Course A7B36DBS: Database Systems Lecture 04: SQL Data Querying and Views Martin Svoboda Faculty of Electrical Engineering, Czech Technical University in Prague Outline SQL Data manipulation SELECT queries

More information

Concurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm

Concurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm Concurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm Marek Wojciechowski, Krzysztof Galecki, Krzysztof Gawronek Poznan University of Technology Institute of Computing Science ul.

More information

Structure of Association Rule Classifiers: a Review

Structure of Association Rule Classifiers: a Review Structure of Association Rule Classifiers: a Review Koen Vanhoof Benoît Depaire Transportation Research Institute (IMOB), University Hasselt 3590 Diepenbeek, Belgium koen.vanhoof@uhasselt.be benoit.depaire@uhasselt.be

More information

CS521 \ Notes for the Final Exam

CS521 \ Notes for the Final Exam CS521 \ Notes for final exam 1 Ariel Stolerman Asymptotic Notations: CS521 \ Notes for the Final Exam Notation Definition Limit Big-O ( ) Small-o ( ) Big- ( ) Small- ( ) Big- ( ) Notes: ( ) ( ) ( ) ( )

More information

CSCC24 Functional Programming Scheme Part 2

CSCC24 Functional Programming Scheme Part 2 CSCC24 Functional Programming Scheme Part 2 Carolyn MacLeod 1 winter 2012 1 Based on slides from Anya Tafliovich, and with many thanks to Gerald Penn and Prabhakar Ragde. 1 The Spirit of Lisp-like Languages

More information