XSelMark: A Micro-Benchmark for Selectivity Estimation Approaches of XML Queries
Sherif Sakr
National ICT Australia (NICTA), Sydney, Australia
sherif.sakr@nicta.com.au

Abstract. Estimating the sizes of query results and intermediate results is a crucial part of any effective query optimization process. For several reasons, the selectivity estimation problem in the XML domain is more complicated than in the relational domain. Several research efforts have proposed selectivity estimation approaches in the XML domain. The lack of a suitable benchmark has been one of the main reasons preventing a real assessment and comparison of these approaches. This paper is a first step towards a comprehensive assessment of the available selectivity estimation approaches of XML queries along with their strengths and weaknesses. We propose a selectivity estimation benchmark for XML queries, XSelMark. It consists of a set of 25 queries organized into seven groups and covers the main aspects of selectivity estimation of XML queries. These queries have been designed with respect to an XML document instance of a popular benchmark for XML data management, XMark. In addition, we suggest some criteria for assessing the capability and quality of selectivity estimation approaches for XML queries. Finally, we use the proposed benchmark to assess the capabilities of the state-of-the-art selectivity estimation approaches.

1 Introduction

Modern query processors rely heavily on sophisticated optimizer components to make good optimization decisions such as the choice of access paths, join orders and materialization strategies. Estimating the sizes of query results and intermediate results is a crucial part of any effective query optimization process. In fact, the selectivity estimation problem in the XML domain is more complicated than in the relational domain.
There are several reasons for this: 1) the absence of a strict schema notion in XML data; 2) the dualism between structural and value-based querying; 3) the high expressiveness of the XML query languages [8]; 4) the non-uniform distribution of tags and data; 5) the correlation and dependencies between the occurrences of elements. In the recent past, several research efforts have proposed different selectivity estimation approaches in the XML domain [9, 19, 20, 24]. However, these approaches have never been
comprehensively assessed, evaluated and compared. One of the main reasons for this situation is the lack of a suitable benchmark that makes such assessments and comparisons possible. As a result, there is no clear view of the state-of-the-art in this domain, which in turn makes it difficult to decide what steps should be taken next. Although the XML research community has proposed several benchmarks [4, 5, 10, 16, 17, 21, 23] which are very useful for their intended targets and perspectives, none of them is suited to assessing and evaluating the different selectivity estimation approaches of XML queries. The author of this paper faced this problem during his work in [20].

In general, XML benchmarks can be classified into two main categories: 1) Application (Macro) benchmarks [4, 5, 17, 21, 23], which are used to evaluate the overall performance of an XML management system. Hence, this kind of benchmark is not very useful for conducting a detailed assessment of specific aspects of an implementation that need improvement. 2) Micro-benchmarks [10, 16], which are designed to assess the performance of specific features of a system. In [16], Michiels et al. have motivated the crucial need for different micro-benchmarks in order to get a good understanding of the different aspects of implementing efficient query processors in the XML domain. Therefore, the goal of this paper is to develop an XML Micro-benchmark, XSelMark, which is mainly focused on exercising the selectivity estimation aspects of XML queries. The proposed benchmark is a first step towards an overview of the state-of-the-art of the available approaches in the domain of selectivity estimation of XML queries along with their strengths and weaknesses. It aims to be a guide for researchers and implementors in benchmarking and improving their research efforts in this domain.
XSelMark consists of 25 queries organized into seven groups, where each group is intended to address the challenges posed by a different aspect of XML query result size estimation.

The remainder of this paper is organized as follows. Section 2 gives a brief overview of the related benchmarks in the XML domain. Section 3 describes the main aspects of the selectivity estimation problem in the XML domain. Section 4 presents the set of queries of the XSelMark benchmark. An overview and an assessment of the features supported by the state-of-the-art selectivity estimation approaches of XML queries is presented in Section 5 before we conclude in Section 6.

2 Related Work

Several benchmarks for the evaluation of XML data management systems have been proposed by the XML research community [4, 5, 10, 16, 17, 21, 23]. Most of these benchmarks are application oriented [4, 5, 17, 21, 23], while a few others are designed as Micro-benchmarks [10, 16]. In this section we give a brief overview of the state-of-the-art of XML benchmarks.

XMach-1 [4] is a scalable multi-user benchmark. It is based on a web application and considers text documents and catalog data. It only defines a small
number of XML queries that cover multiple functions and update operations for which system performance is determined. The benchmark consists of 8 queries and 3 update operations. The goal of the benchmark is to test how many queries per second the query engine can execute and to stress the XML systems under a multi-user workload.

XOO7 [5] is considered to be the XML counterpart of the OO7 benchmark [6], which is geared towards object repositories. Besides mapping the database and original queries of OO7 into XML, XOO7 is enriched with document and navigational queries that are specific to XML databases. The goal of XOO7 is to evaluate the performance of XML management systems.

XBench [23] is a comprehensive XML database benchmark that covers a large number of XML database applications. These applications are characterized by whether they are data-centric or text-centric and whether they consist of a single document or multiple documents. The XBench workload covers the functionality of XQuery as captured in the Use Cases.

XMark [21] is a single-user benchmark. The database model is based on an internet auction site and consists of one big, regularly structured XML document with text and non-text data. It provides a concise and comprehensive set of queries which allows users and developers to assess the performance characteristics of the different XML engines.

The TPOX benchmark [17] is an application-level XML database benchmark based on a financial application scenario. It is used to evaluate the performance of XML database systems. It is mainly focused on exercising all aspects of XML database management systems such as storage, indexing, logging, transaction processing and concurrency control. The workload of TPOX consists of insert, update and delete operations as well as query operations.

XPathMark [10] is a Micro XPath 1.0 benchmark for XMark.
It presents a set of XPath queries which cover the major aspects of the XPath language including different axes, node tests, Boolean operators, references, and functions. The goal of XPathMark is to assess the functional completeness, correctness, efficiency and data scalability of XPath implementations.

MemBeR [16] is another Micro-benchmark. Its main focus is to benchmark XQuery engines with respect to how efficiently they implement four important XQuery constructs: XPath navigation, XPath predicates, XQuery FLWORs and XQuery node construction. These four constructs form the foundation of the language and thus their efficient implementation greatly impacts the overall query engine performance.

3 Main Aspects of Selectivity Estimation in the XML Domain

When looking for an efficient, capable and accurate selectivity estimation approach for XML queries, there are several issues that need to be addressed. From the experience of our work in [20], the major issues of this problem include:
- It should support structural and data value queries. In principle, all XML query languages can involve structural conditions in addition to value-based conditions. Therefore, any complete selectivity estimation system for XML queries requires maintaining statistical summary information about both the structure and the data values of the underlying XML documents. A recommended way of doing this is to apply the XMill approach [14] of separating the structural part of the XML document from the data part and then grouping the related data values according to their path and data types into homogeneous sets. A suitable summary structure for each set can then be easily selected. For example, the most common approaches to summarizing numerical data values are histograms or wavelets, while several tree synopses could be used to summarize the structural part.
- It must be practical. In general, one of the main uses of selectivity estimation approaches is to accelerate the query evaluation process. Thus, while theoretical guarantees are important for any proposed approach, practical considerations are much more important. The performance characteristics of the selectivity estimation process are a crucial aspect of any approach. The selectivity estimation of any query or sub-query must be much faster than the real evaluation. In other words, the cost savings in the query evaluation process obtained by using the selectivity information must be higher than the cost of performing the selectivity estimation. In addition, the summary structure(s) required for the selectivity estimation process must be efficient in terms of memory and space consumption.
- It should be strongly capable. The standard query languages for XML, namely XPath and XQuery, are very rich languages. They provide a rich set of functions and features.
These features include structure and content-based search, path expressions, element construction, join, sort, duplicate elimination and aggregation operations. Thus, a good selectivity estimation approach should be able to provide accurate estimates for a wide range of these features. In addition, it should maintain a set of special summary information about the underlying source XML documents. For example, a universal assumption of a uniform distribution of the element structure and the data values may lead to many potential estimation errors because of the irregular nature of many XML documents.
- It should be composable. The XML query languages, especially XQuery, are compositional in nature, as sub-expressions are combined with each other to form the final query. Hence, a good selectivity estimation approach should be able to estimate the selectivity of the final expression as well as of each sub-expression. This feature is crucial for any cost-based query optimizer, since it enables the selection of a cheap execution plan according to the selectivity information supplied for each sub-expression.
- It must be accurate. On the one hand, providing the query optimizer with an accurate estimate can effectively accelerate the evaluation of any query. On the other hand, providing the query optimizer with incorrect
selectivity information will lead the query optimizer to incorrect decisions and consequently to inefficient execution plans.
- It should be independent. It is recommended that the selectivity estimation process be independent of the actual evaluation process, so that it can be used with different query engines that apply different evaluation mechanisms.

4 XSelMark Benchmark Queries

XMark [21] is a well-known benchmark for XML data management. The XMark database models an internet auction web site. XMark comes with an XML generator that produces XML documents according to a numeric scaling factor proportional to the document size. We base the queries of our proposed benchmark on the structure of the XMark document auction.xml, which is described in detail in [21]. The set of queries of our proposed benchmark, XSelMark, represents a mix of XML queries which covers a wide set of the major selectivity estimation aspects in the domain of XML queries. They are designed to allow a realistic assessment of the advantages and shortcomings of the proposed XML selectivity estimation approaches and to identify their respective impact. The queries are expressed using the two standard XML query languages XPath and XQuery. They are concise, easy to read and understand, and available at the web page of the benchmark [1].

4.1 Group 1: Path Expressions

Path expressions are a fundamental building block in querying XML data. This group of queries investigates the ability of the selectivity estimation approaches to deal with structural XML queries.

Q1) Path expression with non-recursive axes: Find the names of all persons.
    /site/people/person/name/text()
where non-recursive axes are child, parent, attribute, following-sibling and preceding-sibling.

Q2) Path expression with recursive axes: Find all description nodes descendant of all item nodes.
    /site//item//description
where recursive axes are descendant, descendant-or-self, ancestor and ancestor-or-self.

Q3) Path expression with wild cards: Return the item subtrees of all regions.
    /site/regions/*//item/*

Q4) Path expression with order-based axes: Return the description nodes which follow the elements named closed_auction.
    /site//closed_auction/following::description
where order-based axes are following, following-sibling, preceding and preceding-sibling. Supporting this type of query requires the selectivity estimation approach to capture specific statistical information about the order of the elements in the XML documents.

Q5) Branching XPath expressions: Return the names of all persons who have age information in their profiles.
    /site//person[profile/age]/name

4.2 Group 2: Twig Expressions

Q6) Simple twig expression: Return the names and descriptions of all items.
    for $b in //item
    return ($b/name,$b/description)

Q7) Twig expression with element construction: Return the restructured results of the names and descriptions of all items.
    for $b in //item
    return <Result>
             <name>{$b/name}</name>
             <price>{$b/price}</price>
           </Result>

4.3 Group 3: Predicates

The estimation of predicate selectivity is a well-known problem in database theory and practice. The most common solutions to this problem rely on histograms for capturing the distribution of data values, and on the assumption of a uniform distribution when nothing is known about the data involved in the predicate. In the context of XML, predicate selectivity estimation poses new challenges such as: 1) Predicates can be structure-based as well as value-based. 2) Positional predicates represent a special form of predicates over the order information of the elements in the XML document. 3) XML elements are usually distributed in a non-uniform way, hence assuming a simple uniform distribution of the element structure may lead to many potential estimation errors, especially when the operated sequence of nodes is constructed by merging nodes from different groups of data elements.

Q8) Positional predicates: Return the third bidder of each open auction.
    /site/open_auctions/open_auction/bidder[3]
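
The histogram-based estimation of value predicates mentioned in the introduction to this group can be sketched as follows. This is only a minimal illustration in Python and not part of XSelMark: the equi-width bucket layout, the function names and the sample price values are assumptions, and values are assumed to be uniformly distributed inside each bucket.

```python
def build_histogram(values, lo, hi, buckets):
    """Equi-width histogram: counts[i] covers [lo + i*w, lo + (i+1)*w)."""
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        # Clamp so that a value equal to `hi` falls into the last bucket.
        i = min(int((v - lo) / width), buckets - 1)
        counts[i] += 1
    return counts, lo, width


def estimate_less_than(hist, c):
    """Estimated number of values below c (uniformity inside each bucket)."""
    counts, lo, width = hist
    est = 0.0
    for i, n in enumerate(counts):
        b_lo = lo + i * width
        b_hi = b_lo + width
        if c >= b_hi:          # bucket lies entirely below c
            est += n
        elif c > b_lo:         # c cuts the bucket: take the covered fraction
            est += n * (c - b_lo) / width
    return est


def estimate_range(hist, a, b):
    """Estimated number of values between a and b."""
    return estimate_less_than(hist, b) - estimate_less_than(hist, a)


# Hypothetical price values standing in for /site//closed_auction/price.
prices = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
hist = build_histogram(prices, 0, 100, 5)
print(estimate_less_than(hist, 40))    # predicate of the form price < 40
print(estimate_range(hist, 40, 100))   # price > 40 and price < 100
```

The estimate is exact only when the constant falls on a bucket boundary; inside a bucket the error depends on how far the real data deviates from the uniformity assumption, which is the weakness the text above points out for irregular XML data.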

Q9) Equality predicates: Return the closed auctions with price equal to 40.
    /site//closed_auction[price = 40]

Q10) Range predicates: Return the closed auctions with price less than 40.
    /site//closed_auction[price < 40]
where range predicates use any of the operators (<, <=, =, !=, >, >=).

Q11) Conjunctive/disjunctive predicates: Return the closed auctions with price greater than 40 and less than 100.
    /site//closed_auction[price > 40 and price < 100]
where such predicates can use any of the conjunctive/disjunctive operators (AND, OR).

Q12) Predicates with merged nodes from different paths: Return the African and Asian items with id value greater than 100.
    for $b in (/site/africa/item, /site/asia/item)
    where data($b/@id) > 100
    return $b
An accurate estimate for such a query should consider the different distributions of the data values resulting from each path expression as well as the share that each path contributes to the nodes of the operated sequence.

Q13) Predicates with merged nodes from different paths and hybrid nature: Return the price nodes and quantity nodes with value greater than 100.
    for $b in (/site//price,/site//quantity)
    where data($b) > 1 and data($b) > 100
    return $b
This query is more challenging than the previous one because the resulting nodes of the operated sequence represent completely different data items (price, quantity), which may have totally different distributions of their data values.

Q14) String predicates: Return all persons with id value greater than person200.
    /site/people/person[@id > "person200"]

4.4 Group 4: Value-Based Joins (Theta Joins)

This group of queries assesses the ability and accuracy of the selectivity estimation approaches in dealing with value-based join operations between the data values of XML nodes.
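
To make the estimation task of this group concrete, the sketch below estimates the number of pairs that satisfy a join condition of the form data($x) > data($y) from two value histograms. It is an illustration under stated assumptions and not a method prescribed by XSelMark or by any of the surveyed approaches: the (low, high, count) bucket representation, the function names and the sample histograms are hypothetical, the two value sets are assumed to be independent, and values are assumed to be uniformly distributed inside each bucket.

```python
def _area_below(x, c, d):
    """Integral of clamp(t - c, 0, d - c) for t from minus infinity to x."""
    if x <= c:
        return 0.0
    if x <= d:
        return (x - c) ** 2 / 2.0
    return (d - c) ** 2 / 2.0 + (d - c) * (x - d)


def prob_greater(a, b, c, d):
    """P(X > Y) for independent X uniform on [a, b) and Y uniform on [c, d)."""
    return (_area_below(b, c, d) - _area_below(a, c, d)) / ((b - a) * (d - c))


def estimate_theta_join(hist_x, hist_y):
    """Estimated number of (x, y) pairs with x > y.

    Each histogram is a list of (low, high, count) buckets.
    """
    return sum(nx * ny * prob_greater(a, b, c, d)
               for (a, b, nx) in hist_x
               for (c, d, ny) in hist_y)


# Hypothetical histograms standing in for /site//increase and /site//price.
increase_hist = [(0, 10, 4), (10, 20, 6)]
price_hist = [(0, 10, 5), (10, 20, 5)]
print(estimate_theta_join(increase_hist, price_hist))
```

Every pair of buckets contributes its count product weighted by the probability that a value from the first bucket exceeds a value from the second, so the cost grows with the product of the two bucket counts rather than with the number of nodes.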
Q15) Value-based join instances where the values of each operand are constructed by path expression: Return all pairs of increase value and price value where the increase value is greater than the price value.
    for $x in /site//increase, $y in /site//price
    where data($x) > data($y)
    return <pair>{$x,$y}</pair>

Q16) Value-based join instances where the values of one operand are constructed by a path expression and the values of the other operand are constructed by a path expression combined with an arithmetic expression: Return all pairs of increase value and price value where the increase value is greater than the price value multiplied by 2.
    for $x in /site//increase, $y in /site//price
    where data($x) > data($y) * 2
    return <pair>{$x,$y}</pair>

Q17) Equi-joins of data values: Return all pairs of increase value and price value where the increase value is equal to the price value.
    for $x in /site//increase, $y in /site//price
    where data($x) = data($y)
    return <pair>{$x,$y}</pair>

4.5 Group 5: Arithmetic and Comparison Operations over Data Value Statistics

This group of queries assesses the ability of the selectivity estimation approaches not only to capture summary information about the data values of the XML elements but also to apply arithmetic and comparison operations over this summary information in a consistent and accurate way that does not hurt the quality of the selectivity estimation results.

Q18) Arithmetic over Data Value Statistics 1: Return all pairs of increase value and price value where the sum of the increase value and the price value is greater than 100.
    for $x in /site//increase, $y in /site//price
    where data($x) + data($y) > 100
    return <pair>{$x,$y}</pair>

Q19) Arithmetic over Data Value Statistics 2: Return all pairs of increase value and price value where the sum of the increase value and the price value is equal to 100.
    for $x in /site//increase,$y in /site//price
    where data($x) + data($y) = 100
    return <pair>{$x,$y}</pair>

Q20) Arithmetic and Comparison operations over Data Value Statistics 3: Return all triples of increase value, price value and income where the sum of the increase value and the income value is greater than the sum of the price value and the income value.
    for $x in /site//increase, $y in /site//price, $z in
    where data($x) + data($z) > data($y) + data($z)
    return <pair>{$x,$y,$z}</pair>

4.6 Group 6: Nested Expressions

XQuery, like many other XML query languages such as SQL/XML [7], allows free nesting, where nested queries can be used for many purposes such as reshaping elements or computing aggregate values. Since the result of a nested query may be the input for navigational or filtering operations in the outer query, predicting the size of nested query results requires building on-the-fly statistics about these intermediate results.

Q21) Let - Aggregates: Return the names of persons and the number of items that they bought.
    for $p in /site/people/person
    let $a := for $t in /site//closed_auction
              where $t/buyer/@person = $p/@id
              return $t
    return <item>
             <person>{$p/name/text()}</person>
             <count>{count($a)}</count>
           </item>

Q22) Predicates with values constructed by aggregate function: Return the open auctions whose sum of bidder increases is greater than 1000.
    for $b in /site/open_auctions/open_auction
    where sum(data($b/bidder/increase)) > 1000
    return <increase>{$b}</increase>

4.7 Group 7: Data Dependent Estimations

This group of queries requires capturing additional specific forms of summary information about the data values of the underlying XML documents.

Q23) Full text search: Return the names of all items whose description contains the word gold.
    /site//item[contains(description, "gold")]

Q24) Distinct operator: Return the distinct price values.
    for $b in distinct-values(//price/text())

Q25) Existential document order: Return the open auctions where a certain person issued a bid before another person.
    for $b in /site/open_auctions/open_auction
    where some $pr1 in = "person20"], $pr2 in = "person51"]
          satisfies $pr1 << $pr2
    return <history>{$b}</history>

5 XML Selectivity Estimation Approaches: State-of-the-Art

In this section we give an overview of the state-of-the-art of the selectivity estimation approaches in the XML domain, after which we use the set of XSelMark queries to assess the capabilities and features supported by each approach.

The work of Aboulnaga et al. [2] is considered to be the first to deal with the selectivity estimation of simple path expressions. They presented two different techniques for capturing the structure of XML documents and for providing accurate selectivity estimates for simple path expressions. The first technique is a summarizing tree structure called a path tree. It is a tree containing each distinct rooted path in the database. The second technique is a statistical structure called a Markov table. This table, implemented as an ordinary hash table, contains each distinct path of length up to m and its selectivity. The presented techniques only work for simple path expressions without predicates, inline conditions, recursive axes and order-based axes.

In [15], the authors present XPathLearner, a selectivity estimation system for XPath expressions which employs the same summarization and estimation techniques presented in [2] with two main modifications. The first modification is that it gathers and refines the required statistical information in an on-line manner from query feedback. The second modification is that it supports the handling of predicates by storing statistical information for each distinct tag-value pair in the source XML document.

The work of Zhang et al. in [24] focuses mainly on the handling of XPath expressions which involve only structural conditions.
The main idea behind the paper is to provide an efficient treatment of recursive XML documents and an accurate estimation of recursive queries. The authors define a summary structure, called XSEED, which summarizes the source XML documents into a compact graph. Based on this statistical graph structure, the authors propose an algorithm for the selectivity estimation of structural XPath expressions.

In [11] Freire et al. have presented an XML Schema-based statistics collection technique called StatiX. StatiX leverages the available information in the XML Schema to capture both structural and value statistics about the source XML documents. These structural and value statistics are collected in the form of histograms. The StatiX system is employed in a cost-based XML-to-relational
storage mapping engine which tries to generate efficient relational configurations for the XML documents, LegoDB [3].

In [22] Wang et al. have proposed a special histogram structure, named Bloom Histogram, for the selectivity estimation of XPath queries in a dynamic context. The Bloom Histogram keeps a count of the statistics for paths in XML data. A bloom histogram H is constructed by sorting the frequency values of the distinct paths in the XML data and then grouping the paths with similar frequency values into buckets. Although the Bloom Histogram is designed to deal with data updates and its estimation error is theoretically bounded by its size, it is very limited as it deals only with simple forms of path expressions.

In [13], Li et al. have described a framework for estimating the selectivity of XPath expressions with a main focus on the order-based axes (following, preceding, following-sibling, preceding-sibling). They used two histogram structures, called p-histogram and o-histogram, to aggregate the path and order information of XML data. A p-histogram is built for each distinct element tag to summarize the pathid-frequency information. In this histogram, each bucket contains a set of path ids and their average frequency value. The o-histogram summarizes the path-order information of each distinct element tag name to capture the sibling-order information based on the path ids.

In [9] Fisher et al. have proposed the SLT XML tree synopsis. The main idea of this synopsis is to remove the repeated patterns in the XML tree and to replace the multiple occurrences of equal subtrees by pointers to a single occurrence of the subtree. They described an algorithm for representing the resulting DAG structures using a special form of grammar called an SLT grammar (straight line tree grammar). A tree automaton is designed to run over the generated lossy SLT grammars to estimate the selectivity of queries containing all XPath axes, including the order-sensitive ones.
The proposed synopses can deal only with structural XPath queries, with no support for any form of predicate queries or XQuery expressions.

In [19] Polyzotis et al. have proposed the XCluster synopses as a clustering-based framework that can capture the key correlations between and across structure and values of different types. XCluster is considered to be a generalized form of the XSketch tree synopses, a previous work of the authors presented in [18]. It employs the well-known histogram techniques for numeric and string values, and introduces the class of end-biased term histograms for summarizing the distribution of unique terms within textual XML content. This approach can support twig queries with predicates on numeric content, string content, and textual content. However, the authors did not mention how XCluster can be extended to deal with more complicated query situations such as value-based join operations and nested expressions.

The work of [20] has described the design and implementation of a relational algebra based framework for estimating the selectivity of XQuery expressions. In this approach, XML queries are translated into relational algebraic plans [12]. Summary information about the structure and the data values of the underlying XML documents is kept separately. Then, by exploiting the relational algebraic infrastructure, the special properties of the generated algebraic plans, the summary information and a set of inference rules, the relational estimation approach is able to provide accurate selectivity estimates in the context of XML and XQuery. The framework enjoys the flexibility of integrating any XPath or predicate selectivity estimation technique, which enables it to support the selectivity estimation of a large subset of the powerful XML query language XQuery and to provide estimates not only for the whole XQuery expression but also for each sub-expression as well as for each iteration in the context of FLWOR.

Features Assessment

One of the main goals of the XSelMark benchmark is to provide a framework for assessing the completeness of the selectivity estimation approaches of XML queries. We used the set of XSelMark benchmark queries for an initial assessment of the features supported by the state-of-the-art. Table 1 lists the set of queries supported by each approach, where the symbol X indicates that the approach is able to support the associated query and the symbol - indicates that it is not. The assessment has shown some interesting preliminary results: 1) Most of the selectivity estimation approaches [11, 13, 15, 22, 24] are limited to supporting only small subsets of the XML query languages. They are only able to deal with structural XPath queries. 2) The two synopses of [9, 13] are the only ones which are able to support the selectivity estimation of order-sensitive XPath axes. 3) The approaches of [19, 20] cover a wider range of the XML query features. The synopsis of [19] is the only one which is able to deal with the estimation of full text search queries, while [20] is the only one able to deal with many of the features of the XQuery language such as join operations and different types of predicates.
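
The Markov table technique summarized at the beginning of this section stores counts only for paths of length up to m. One common way to use such a table for longer paths is to chain the stored counts under a short-memory (Markov) assumption. The sketch below does this for m = 2. The table contents, the function name and the chaining formula as written here are assumptions made for illustration and are not quoted from [2].

```python
def estimate_path(markov, tags):
    """Estimate the count of the path tags[0]/tags[1]/.../tags[-1].

    `markov` maps tuples of one or two tag names to their counts (m = 2).
    Longer paths are estimated as
        f(t1/t2) * f(t2/t3)/f(t2) * ... * f(t(n-1)/tn)/f(t(n-1)).
    """
    if len(tags) == 1:
        return float(markov.get((tags[0],), 0))
    est = float(markov.get((tags[0], tags[1]), 0))
    for prev, cur in zip(tags[1:], tags[2:]):
        parent = markov.get((prev,), 0)
        if parent == 0:
            return 0.0  # the intermediate tag never occurs
        # Fraction of `prev` elements that are expected to have a `cur` child.
        est *= markov.get((prev, cur), 0) / parent
    return est


# Hypothetical counts for a small XMark-like document.
table = {
    ("people",): 1, ("person",): 100, ("profile",): 60, ("age",): 30,
    ("people", "person"): 100,
    ("person", "profile"): 60,
    ("profile", "age"): 30,
}
print(estimate_path(table, ["people", "person", "profile", "age"]))
```

Paths of length one or two are answered directly from the table. Anything longer is an estimate whose quality depends on how well the short-memory assumption holds for the document, which is one reason why such summaries are limited to simple path expressions.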

6 Conclusion

Several research efforts have been invested in designing Macro-benchmarks to assess the overall performance of XML data management systems. There is currently a big demand for Micro-benchmarks which assess specific aspects in the XPath, XQuery and XML management system domains. Several research efforts have proposed different selectivity estimation approaches in the XML domain. Due to the lack of a suitable benchmark, it has been difficult to assess, evaluate and compare these approaches and to get a clear view of the state-of-the-art. This paper is a first step towards a comprehensive assessment of the available selectivity estimation approaches of XML queries. We proposed XSelMark as a Micro-benchmark to assess the state-of-the-art of the selectivity estimation approaches of XML queries. An initial assessment of the features and capabilities of the current approaches has shown that most of them are limited to supporting the estimation of structural XPath queries. Hence, several avenues for further research and development are still widely open
in this domain to provide accurate, capable and complete frameworks aligned with the rich querying capabilities of the standard XML query languages. We believe that XSelMark is useful for both researchers and developers. It identifies the major aspects of selectivity estimation of XML queries, helps researchers to discover the strengths and weaknesses of the current approaches and provides researchers and developers with a clearer view for developing more enhanced mechanisms of selectivity estimation of XML queries. In addition, we believe that the selectivity estimation problem is an important research field which has many useful applications other than being a crucial piece of an effective query optimization process, such as: 1) allowing the query engines to provide the users with early feedback about the expected outcome of their queries and the associated computational effort; 2) providing the query engines with hints on the possible avenues to optimize the resource allocation of the execution process; 3) playing an effective role in efficient approximate query answering techniques. As future work, we are planning to use XSelMark to perform a more detailed assessment of the selectivity estimation approaches of XML queries in terms of their accuracy, performance and memory requirements.

Table 1. An assessment of the capabilities of the state-of-the-art of the selectivity estimation approaches using the XSelMark benchmark.

      XPath-    XSEED   StatiX   Path-Order   Bloom       SLT       XCluster   Relational
      Learner                    Histogram    Histogram   Grammar              Alg. Est.
      [15]      [24]    [11]     [13]         [22]        [9]       [19]       [20]
Q1    X         X       X        X            X           X         X          X
Q2    X         X       X        X            X           X         X          X
Q3    X         X       X        X            X           X         X          X
Q5    X         X       X        X            -           X         X          -
[Rows Q4 and Q6 to Q25: cell values incomplete in the source.]

References

1. XSelMark: A Micro-Benchmark of Selectivity Estimation of XML Queries.
2. A. Aboulnaga, A. Alameldeen, and J. Naughton.
Estimating the Selectivity of XML Path Expressions for Internet Scale Applications. In VLDB, 2001.
3. P. Bohannon, J. Freire, P. Roy, and J. Siméon. From XML Schema to Relations: A Cost-Based Approach to XML Storage. In ICDE.
4. T. Böhme and E. Rahm. XMach-1: A Benchmark for XML Data Management. In BTW.
5. S. Bressan, M. Lee, Y. Li, Z. Lacroix, and U. Nambiar. The XOO7 Benchmark. In VLDB 2002 Workshops, London, UK.
6. M. Carey, D. DeWitt, and J. Naughton. The OO7 Benchmark. SIGMOD Record (ACM Special Interest Group on Management of Data), 22.
7. A. Eisenberg and J. Melton. Advancements in SQL/XML. SIGMOD Record, 33(3):79-86.
8. M. Fernández, A. Malhotra, J. Marsh, M. Nagy, and N. Walsh. XQuery 1.0 and XPath 2.0 Data Model (XDM). World Wide Web Consortium Proposed Recommendation, November.
9. D. Fisher and S. Maneth. Structural Selectivity Estimation for XML Documents. In ICDE.
10. M. Franceschet. XPathMark: An XPath Benchmark for the XMark Generated Data. Database and XML Technologies.
11. J. Freire, J. Haritsa, M. Ramanath, P. Roy, and J. Siméon. StatiX: Making XML Count. In SIGMOD.
12. T. Grust, S. Sakr, and J. Teubner. XQuery on SQL Hosts. In VLDB.
13. H. Li, M. Lee, W. Hsu, and G. Cong. An Estimation System for XPath Expressions. In ICDE.
14. H. Liefke and D. Suciu. XMill: An Efficient Compressor for XML Data. In W. Chen, J. F. Naughton, and P. A. Bernstein, editors, SIGMOD.
15. L. Lim, M. Wang, S. Padmanabhan, J. Vitter, and R. Parr. XPathLearner: An On-line Self-Tuning Markov Histogram for XML Path Selectivity Estimation. In VLDB.
16. P. Michiels, I. Manolescu, and C. Miachon. Toward Microbenchmarking XQuery. Information Systems, 33(2).
17. M. Nicola, I. Kogan, and B. Schiefer. An XML Transaction Processing Benchmark. In SIGMOD.
18. N. Polyzotis and M. Garofalakis. Structure and Value Synopses for XML Data Graphs. In VLDB.
19. N. Polyzotis and M. Garofalakis. XCluster Synopses for Structured XML Content. In ICDE.
20. S. Sakr. Cardinality-Aware and Purely Relational Implementation of an XQuery Processor. PhD thesis, University of Konstanz.
21. A. Schmidt, F. Waas, M. Kersten, M. Carey, I. Manolescu, and R. Busse. XMark: A Benchmark for XML Data Management. In VLDB.
22. W. Wang, H. Jiang, H. Lu, and J. Xu Yu. Bloom Histogram: Path Selectivity Estimation for XML Data with Updates. In VLDB.
23. B. Yao, T. Özsu, and J. Keenleyside. XBench - A Family of Benchmarks for XML DBMSs. In VLDB Workshop.
24. N. Zhang, T. Özsu, A. Aboulnaga, and I. Ilyas. XSEED: Accurate and Fast Cardinality Estimation for XPath Queries. In ICDE, 2006.