Data Structures for Maintaining Path Statistics in Distributed XML Stores
© Yury Soldak
Department of Computer Science, Saint-Petersburg State University
University Prospekt 28, Saint-Petersburg, Russian Federation
ysoldak@acm.org

Abstract

The paper describes a distributed XML store model based on the notion of a distributed XML document. A classification of XPath expressions is defined and the notion of a distributed XML document is introduced. A DataGuide-based statistical structure for XML stores is proposed, and two possible approaches to keeping it up to date are discussed. The stability of the feedback-based approach is shown. A generalization of the structure to the distributed case is described.

1 Introduction & Related Work

Developed for data exchange on the Web, XML has become widely accepted. It is very likely that most data on the Web will be reachable in the form of XML documents in the near future. Furthermore, data is already stored in XML on many sites. The Web itself can be characterized as a fairly unpredictable network of heterogeneous data sources [9]. Developing the various aspects of XML query evaluation on the Web is a topical problem. This is particularly true for a set of remote servers that form a distributed XML store. Many open issues remain in the area of effective distributed XML query evaluation (sec. 2.2).

Two papers focused on related problems form the background of the current work. The first paper [1] proposes two techniques for estimating the selectivity of simple path expressions over large-scale XML data: path trees and Markov tables. Both techniques summarize complex, large-scale data in a small amount of memory and then use this summary for selectivity estimation. The idea of using a path tree to store statistical information is taken from that paper and heavily used in the current work.
The second paper [7] introduces XPathLearner, a technique for estimating the selectivity of simple path expressions based on feedback analysis. XPathLearner stores statistics in a Markov table. As discussed below, this is not the best solution for a distributed XML store. The primary goal of the current paper is to define a structure that is (a) suited to the distributed case and (b) a convenient basis for developing an XPathLearner-like solution.

Proceedings of the Spring Young Researcher's Colloquium on Database and Information Systems, Moscow, Russia, 2006

Both papers mentioned above study the problems of harvesting, updating and storing selectivity statistics in a global scope. In other words, they provide no way to estimate the selectivity of a path expression for a particular site of the store. The techniques therefore lack one of the most needed features for effective distributed query evaluation (sec. 2.2).

The rest of the paper is organized as follows. Section 2 describes the distributed data model used in the paper and discusses the problems related to distributed query evaluation. Section 3 discusses the query optimizer structure and the place and importance of the statistics module. Section 4 presents the classification of XPath expressions we use. The XML Tree Sibling Summarization structure and related issues are considered in section 5. Finally, section 6 contains conclusions.

2 Distributed XML Store

2.1 Model Definition

A distributed XML document (DXML document) is a document which contains at least one XInclude [10] or XLink [11] element inside its body.

Definition. We call a DXML document locally distributed if all included (or linked) XML fragments and the including document itself reside at the same server.

DXML documents as defined above are the building blocks of our distributed XML store. An example of an employee list for a multi-office company is shown in Figure 1.
Every office has its own employee list, which is managed independently of the other offices and located at a separate site. Every such list changes constantly and unpredictably with hires, dismissals and small changes in the personal data of any employee. Any HR manager can add elements to (or remove them from) their own part of the employee list, even when a common structure for describing a person has been developed. As a result we have a truly distributed semistructured XML store.

    <company xmlns:xi="…" xmlns:xl="…">
      <name>The very big company</name>
      <staff>
        <office id="main">
          <person position="ceo" office="main">
            <name>John Smith</name>
          </person>
          <xi:include xi:href="/db/rnd.xml" xi:xpointer="element(/persons/person)" />
          <xi:include xi:href="/db/qa.xml" xi:xpointer="element(/persons/person)" />
          <xl:link xl:type="simple" xl:href="/db/managers.xml#xpointer(//person)" />
        </office>
        <office id="o1">
          <xi:include xi:href="…" xi:xpointer="element(/staff/person)" />
        </office>
        <office id="o2">
          <xi:include xi:href="…" xi:xpointer="element(/staff/persons/person)" />
        </office>
      </staff>
    </company>

Figure 1: Distributed XML document

This store is the simplest example, based on a single distributed document. Of course, a distributed store can contain any number of documents (distributed or local). Furthermore, the roots of distributed documents need not belong to a single server.

Given a DXML document, we can define several separate parts, one part for each site. We assume that these sites are independently maintained, so they may well belong to different companies. Sites are black boxes to each other. The only requirement is an interface for querying XML data on each site. There are no restrictions on the type of the interface: sites can understand queries in any known XML query language. We use an XQuery-over-HTTP approach for prototyping.

2.2 Query Evaluation

It would be really useful to query DXML with the conventional XQuery language. The result we need to obtain is the same as if all parts of the DXML document were downloaded from the remote servers and merged into a temporary local XML document, on which an existing XQuery evaluator then ran our query. This is the naive implementation, and it is expected to be very slow. Queries on DXML have to be evaluated more efficiently. In other words, the query optimizer for local documents should be extended to generate optimal query plans for DXML, and the query evaluator should be able to evaluate these new query plans.

Figure 2 presents a simple example of a query on a distributed XML store. The query obtains information about persons from office 1 who possibly have relatives (i.e. persons with the same family name) working at office 2. Here company.xml is the distributed XML document, and information about the two offices is included into it with the help of two XInclude elements (see Figure 1).

    for $p1 in doc("company.xml")//office[@id="o1"]//person,
        $p2 in distinct-values(doc("company.xml")//office[@id="o2"]//person/familyname)
    where $p1/familyname = $p2
    return $p1

Figure 2: Query on distributed XML store

It is necessary to join two sequences, $p1 and $p2, during evaluation of the query listed in Figure 2. These sequences might be obtained in several different ways. For example, the query evaluator naively obtains all the person elements for each office (by sending simple queries to the corresponding servers) and then locally joins two (possibly) big sequences. Obviously, this approach is not optimal. Approaches similar to semi-joins in distributed RDBMSs are more attractive. To use them, we have to know the selectivity of the path expressions (for the person and familyname elements in our case). Moreover, the number of distinct values in the resulting node sequences (the so-called distinct selectivity) is of interest too. Finally, it is important to know the selectivity with regard to a server, not just an abstract selectivity in the global scope.

This example shows the crucial role of XPath selectivity estimation in the evaluation of queries on DXML documents. The structure of an XML query optimizer and the place of an XPath selectivity estimator in it are discussed in the next section.

3 XML Queries Optimization

3.1 Optimizer structure & general issues

Figure 3: Query optimizer

The classic query optimizer structure is shown in Figure 3. It consists of two main blocks: the logical and the physical optimization modules.
The logical module rewrites a query using the rules of a chosen XML algebra; the physical module generates various physical execution plans and selects the best of them using an execution cost estimator. The cost estimator in turn requires various statistics to estimate a cost.

Both relational and XML query optimizers are expected to implement this simple architecture. The differences between the optimizers lie in the implementation. XML optimizers are harder to implement. First of all, semistructured optimizers work with more complex data structures (trees or graphs) than their relational counterparts. Furthermore, the area of XML databases is not as well developed as the relational one. As a result, XML optimizer developers are forced to make many ad hoc decisions which are not theoretically grounded and are not proven to be the best, unlike the corresponding decisions for RDBMSs. For example, XML database developers do not have even a single widely accepted XML algebra to use.

Implementing a physical optimizer for XML databases is a real challenge today. We try to take one step forward in that direction by developing a statistics module which can be used by a cost estimator.

3.2 Cost Estimator

Another challenge is related to the distributed nature of the source data. The cost estimator should be aware of this specific feature of the store. Different cost models exist for distributed environments [6]. A very simple model would estimate the cost of evaluating a path expression as $k \cdot n$, where $n$ is the estimated selectivity of the expression and $k$ is a host-specific parameter. The parameter would be small for fast database components and large for slow ones, or for components which are only reachable over a slow communication link. The $k$ parameter can also depend on the size of the elements reachable by the path expression, because both serialization and transmission of large elements are very costly operations.

Structures developed to supply distributed cost models with the necessary statistical information are described in section 5.

4 XPath Expressions Classification

The XQuery language uses XPath expressions to define the sequences of XML elements on which operations are performed. XPath expressions are used in XQuery queries in many different ways, so the notation of these expressions varies significantly. Several expression types are defined here. These definitions will be used in the following sections.
The list below contains the four characteristics which have to be checked in order to classify an XPath expression:

- First sequence construction method (the very first step)
- Presence of predicates (which are not identically true) in step definitions
- Direction of step axes
- Presence of branchpoints

An element sequence is the input and the output of any XPath step. The very first sequence in an XPath expression may be defined either by a function call (document(), collection()) or by a variable reference. In the first case we call the expression functional, in the second case variational.

Every XPath step contains a predicate expression (omitted in the step notation when identically true). An XPath expression is predicative or simple depending on the presence of at least one predicate in the expression notation.

There are 13 axes defined in the XPath specification. In [4], four major directions (ancestor, descendant, following and preceding) are defined. As a result, some XPath axes are co-directed in terms of the major directions. For example, the axes parent and ancestor are co-directed, but parent and following-sibling are not. An XPath expression is directed if and only if all its steps are co-directed, and multidirected otherwise. In special cases the number of major directions can be stated explicitly for multidirected expressions.

Finally, an XPath expression is branched if and only if at least one of its steps has a branchpoint (a name test of the kind (a | b | ... | c)).

Examples:

- doc("foo.xml")/a/b/c - functional simple directed nonbranched
- doc("foo.xml")/a//b/c[@e = "1"] - functional predicative directed nonbranched
- doc("foo.xml")//a[@b = "3"]/following::c - functional predicative multidirected nonbranched
- $v/a/(b[@c = $w] | d) - variational predicative directed branched

5 XML Tree Sibling Summarization

5.1 Definition

The concept of DataGuides is widely known to semistructured data researchers. It was originally introduced in [3].
Since then, DataGuides have been used as a basis for indexes (for example [2]) and for structures representing statistical information [1]. All statistical structures defined in the paper are based on the DataGuide notion. This gives us several benefits. The small amount of memory required to store statistics is one major benefit, but not the only one: DataGuide-like structures can easily be extended to support the distributed case, as shown in section 5.6.

The original DataGuide was developed for a setting where only parent-child relations are used. As a result, it is impossible to extend the structure to support all kinds of XPath expressions. Only simple directed XPath expressions whose major direction is descendant (the child, descendant and descendant-or-self axes) are supported in all the structures considered below. Furthermore, only functional XPath expressions are studied for now, not variational ones. Each branched expression is split into several nonbranched ones, which are studied separately.

The XML Tree Sibling Summarization (XTSS) structure is developed for maintaining XPath selectivity statistics for any ordinary XML document. It is a DataGuide tree in which every node keeps the number of sibling XML elements with the same name that were joined to construct the node, as well as the name of these elements. Every arc defines a parent-child relationship for the source elements. Figure 4 shows an example of an XTSS on the right and its source XML document on the left.

Figure 4: XML Tree Sibling Summarization

5.2 Offline Construction

The XQuery query in Figure 5 recursively constructs the XTSS for an XML document (in our particular case, the XML document contains the text of Shakespeare's Macbeth).

    define function xtss( $seq as node, $deep as integer ) as node
    {
      let $newdeep := $deep + 1
      let $names := for $s in $seq return name($s)
      let $dnames := distinct-values($names)
      for $name in $dnames
        let $nodes := $seq[name() = $name]
        let $nextnodes := $nodes/*
        return
          element {$name} {
            attribute {"c"} {count($nodes)},
            attribute {"d"} {$deep},
            xtss($nextnodes, $newdeep)
          }
    }
    xtss(document("db/plays/macbeth.xml")/*, 0)

Figure 5: XTSS generation XQuery

This query was evaluated on the Ipedo [5] and eXist [8] native XML DBMSs in order to obtain approximations for the time required to construct a whole XTSS for a middle-sized XML document. Of course, this routine should work much faster when coded as part of a query engine; here we try to establish an upper bound on the XTSS construction time. Node constructors were removed from the query to minimize query execution time. The results of this quick experiment are shown in Table 1.

    XML DBMS    Time (secs)
    Ipedo
    eXist

Table 1: XTSS offline construction time

Obviously, construction of a whole XTSS is a costly operation. Constructing an XTSS for each document in a database with thousands of documents will take too long. Moreover, the XTSS has to be maintained, which forces us to rerun the described process from time to time. This approach can be very resource consuming and so is not good for a statistical structure. The solution is to construct a partial XTSS following the online (or feedback) approach.

5.3 Feedback Approach

The feedback approach lets us construct an XTSS branch by branch, exploiting the results of user queries. An XTSS constructor fetches XPath expressions from the user query and their respective selectivities from the query result. It then builds an XTSS branch for each XPath expression and adds it to the (partial) XTSS. If the branch already exists in the XTSS, the selectivity value of the respective node is updated. Theoretically, the whole XTSS can be built this way. In practice, however, we will always have only a part of the whole XTSS, depending on the queries evaluated. Moreover, selectivity values are expected to appear only in the leaves and rarely in the intermediate nodes.

Figure 6: Partial XTSS: (a) /a/b/c & /a/d; (b) /a//e added

The partial XTSS obtained after evaluating the path expressions /a/b/c and /a/d is shown in Figure 6(a).

Any information about processed XPath expressions is valuable when the feedback approach is used. Unfortunately, complete information is not always available. The most frequent case of such incompleteness is the evaluation of steps with the descendant axis. To fill the gap, the partial XTSS notion was enriched with ancestor-descendant (also called generalized) arcs, marked by * in the figures. Figure 6(b) presents an example of a partial XTSS with one generalized arc added.

Using generalized arcs, we can get data duplication in our structure. This is bad for both structure size and statistics accuracy. Assume two path expressions were evaluated: first /a//e and then /a/b/e. Depending on the structure of the source data, the first expression may (or may not) define a sequence of more XML elements. If we leave both branches in the XTSS, we get the data duplication problem. On the other hand, it is possible to leave only one of these branches and possibly hurt accuracy.

It was decided to keep the most concrete branch (/a/b/e in our case) each time a situation like the one described arises. Following that rule, we avoid data duplication but can hurt accuracy when the generalized expression defines a larger sequence than the concrete one. This is the price we pay for a graceful and predictable statistical structure. The decision is based on our experience in real-world application development, where the problem mentioned above rarely appears. In many cases generalized expressions are used in place of more efficient concrete ones merely to reduce the size of a query's textual representation.
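The branch-by-branch construction described in this section can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the `Node` class, the `parse_path` and `feed` functions, and the nameless virtual root are all assumptions made for the example. Paths are assumed to be simple, to start with `/`, and to use only `/` (parent-child arc) and `//` (generalized arc) steps. The rule of keeping only the most concrete branch is not implemented here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A step is (axis, name): "/" is a parent-child arc, "//" a generalized arc.
Step = Tuple[str, str]


@dataclass
class Node:
    """One node of a partial XTSS (hypothetical representation)."""
    name: str
    selectivity: Optional[int] = None  # unknown until a query reports it
    children: Dict[Step, "Node"] = field(default_factory=dict)


def parse_path(path: str) -> List[Step]:
    """Split a simple path such as '/a//e' into [('/', 'a'), ('//', 'e')].

    Assumes the path starts with '/' and contains no predicates.
    """
    steps: List[Step] = []
    i = 0
    while i < len(path):
        axis = "//" if path.startswith("//", i) else "/"
        i += len(axis)
        j = path.find("/", i)
        if j == -1:
            j = len(path)
        steps.append((axis, path[i:j]))
        i = j
    return steps


def feed(root: Node, path: str, selectivity: int) -> None:
    """Add the branch for `path`, or update it if it already exists."""
    node = root
    for step in parse_path(path):
        # Create missing nodes on the way; intermediate nodes get no value.
        node = node.children.setdefault(step, Node(step[1]))
    node.selectivity = selectivity


# Rebuild the situation of Figure 6(b); the selectivity values are made up.
root = Node("")
feed(root, "/a/b/c", 5)
feed(root, "/a/d", 3)
feed(root, "/a//e", 7)
```

Keying children by the pair (axis, name) keeps a generalized arc to `e` distinct from a parent-child arc to an element of the same name, which is the distinction the `*` marking draws in the figures.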
If the selectivity of an evaluated expression equals 0, the branch is not added to the XTSS, or is removed if it already exists there. In some cases we do not remove the whole branch, but cut it at the first branching node reached from the branch leaf.

5.4 Ambiguity During Updates

Handling generalized path expressions raises a maintenance problem: ambiguity in distributing the new selectivity of a generalized path expression among all satisfying XTSS nodes.

Given a generalized path expression, let us assume that all satisfying paths in the source data have corresponding nodes in the XTSS. Otherwise a correct distribution of the selectivity is impossible. Following the feedback approach, we always have to assume something.

The new selectivity is obtained after the expression is evaluated. The question is how to distribute this selectivity among all satisfying nodes. Clearly, the common selectivity may decrease or increase. Let us assume the latter, and let the difference be $d_n$. Let $S_n$ and $S_{n-1}$ be the current and previous common selectivities respectively, where $S_n - S_{n-1} = d_n$ and

$$S_n = \sum_{i=1}^{m} s_n^i,$$

where $m$ is the number of nodes and $s_n^i$ is the selectivity of the $i$-th node.

At least three distribution approaches exist: equal, proportional and history-based.

The equal method distributes the difference by the simplest formula:

$$s_n^i = s_{n-1}^i + \frac{d_n}{m}$$

With the proportional method, the difference is distributed in the following way:

$$s_n^i = s_{n-1}^i \cdot \frac{S_n}{S_{n-1}}$$

The third approach requires additional information to be stored in each XTSS node. This information is the value of the increment $\hat{d}_i$ made during the most recent update of the node after the evaluation of a concrete path expression. The selectivity distribution formula for this case is the following:

$$s_n^i = s_{n-1}^i + d_n \cdot \frac{\hat{d}_i}{\hat{d}}, \qquad \hat{d} = \sum_{i=1}^{m} \hat{d}_i$$

Clearly, the approaches described are only the simplest ones, and more can be developed. For example, it is possible to maintain the frequency of selectivity changes for each node and then use this information to distribute selectivity, either as a separate (fourth) method or to improve the third approach.

The store structure and the behavior of the stored data determine the choice of distribution approach. Unfortunately, no approach can guarantee an accurate distribution of the difference.
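The three distribution methods can be written down directly. The following Python sketch is illustrative only: the function names and the list-based representation of node selectivities are assumptions, and the proportional and history-based variants assume a non-zero denominator (previous common selectivity and sum of recent increments, respectively).

```python
from typing import List


def distribute_equal(prev: List[float], d_n: float) -> List[float]:
    """Equal method: every node receives the same share d_n / m."""
    m = len(prev)
    return [s + d_n / m for s in prev]


def distribute_proportional(prev: List[float], d_n: float) -> List[float]:
    """Proportional method: scale every node by S_n / S_{n-1}."""
    s_prev = sum(prev)      # S_{n-1}, assumed non-zero
    s_new = s_prev + d_n    # S_n
    return [s * s_new / s_prev for s in prev]


def distribute_history(prev: List[float], d_n: float,
                       last_increments: List[float]) -> List[float]:
    """History-based method: share d_n in proportion to each node's
    most recent increment (the values written as d-hat in the text)."""
    d_hat = sum(last_increments)  # assumed non-zero
    return [s + d_n * inc / d_hat
            for s, inc in zip(prev, last_increments)]


# Two nodes with stored selectivities 10 and 30; the generalized
# expression now returns 48 elements, so d_n = 8.
prev = [10.0, 30.0]
equal = distribute_equal(prev, 8.0)                      # [14.0, 34.0]
proportional = distribute_proportional(prev, 8.0)        # [12.0, 36.0]
history = distribute_history(prev, 8.0, [1.0, 3.0])      # [12.0, 36.0]
```

In all three cases the per-node values add up to the new common selectivity of 48; the methods differ only in how the difference is split among the nodes.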
However, we state that, regardless of the method used, the XTSS will contain accurate (or very close to accurate) selectivity values if the queries which affect the nodes of interest are evaluated several times. In other words, the XTSS is a stable structure. The next section proves the statement.

5.5 XTSS Stability

Suppose we have an XTSS part of $m$ nodes, where each node defines the concrete path expression corresponding to the branch ending in that node. These $m$ nodes, and only they, satisfy a generalized path expression $q$. The selectivity values stored in those nodes are accurate:

$$d_i = \dot{s}_i - s_i = 0, \quad i = 1, \dots, m,$$

where $\dot{s}_i$ is the actual selectivity and $s_i$ is the stored selectivity.

Let us assume the source data has changed so that the selectivity values of $k$ XTSS nodes with indexes $i \in I$, $|I| = k$, are no longer accurate and should be updated:

$$D = \sum_{i=1}^{m} d_i, \quad \text{where } d_i = 0 \text{ for } i \notin I \text{ and } d_i \neq 0 \text{ for } i \in I.$$

After evaluating the expression $q$ we know the new common selectivity of the $m$ nodes, and since we have the previous common selectivity, we know $D$. We have to distribute the difference $D$ among the $m$ nodes. We do not know which XTSS nodes actually should be updated, nor even the number $k$ of those nodes. So we distribute $D$ among all $m$ candidates following one of the approaches described in the previous section. After that, the common selectivity of the selected XTSS nodes equals the actual selectivity of the expression $q$, while the selectivity values of particular nodes can still be wrong (that is, not accurate).

Stability means that the structure tends to contain accurate values.

Definition. An XTSS is accurate if and only if each of its nodes has an accurate selectivity value.

Since an XTSS can be divided into several parts, an XTSS is also accurate if and only if each XTSS part is accurate. We use the following formula to measure the accuracy of an XTSS part:

$$A = \sum_{i=1}^{m} |d_i|,$$

where $m$ is the number of nodes in the measured part and $d_i$ is the difference between the accurate selectivity value and the one stored in the XTSS node. The part is accurate if and only if $A = 0$.
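As a small illustration of the accuracy measure, the Python sketch below computes A for an XTSS part and shows the effect of refining single nodes. It takes A as the sum of absolute differences between actual and stored selectivities, which is an assumption about the measure chosen so that A = 0 exactly when every node is accurate. The function names `accuracy` and `refine` and the numbers used are hypothetical.

```python
from typing import List


def accuracy(actual: List[int], stored: List[int]) -> int:
    """A = sum of |actual_i - stored_i| over the nodes of an XTSS part."""
    return sum(abs(a - s) for a, s in zip(actual, stored))


def refine(stored: List[int], index: int, observed: int) -> List[int]:
    """Model the evaluation of a concrete path expression: the node at
    `index` receives the selectivity just observed in the query result."""
    updated = list(stored)
    updated[index] = observed
    return updated


# Three nodes; the source data changed under nodes 0 and 2.
actual = [12, 30, 8]
stored = [10, 30, 10]

a_before = accuracy(actual, stored)        # 2 + 0 + 2 = 4
stored = refine(stored, 0, actual[0])      # concrete query hits node 0
a_after_one = accuracy(actual, stored)     # 0 + 0 + 2 = 2
stored = refine(stored, 2, actual[2])      # concrete query hits node 2
a_after_two = accuracy(actual, stored)     # 0
```

Each concrete evaluation zeroes the difference of exactly one node, so A can only shrink as long as the source data does not change in between.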
Proposition 5.1. Let $q$ be a generalized path expression that defines an XTSS part of $m$ nodes as described above. Suppose that

- all elements reachable by $q$ have corresponding nodes in the XTSS,
- select queries are more frequent than update ones,
- user queries contain both generalized and concrete path expressions.

Then $A \to 0$.

Proof. Indeed, $A$ decreases each time a concrete path expression is evaluated, because the value of the particular node becomes accurate (and therefore the corresponding $d = 0$). $A$ remains the same when a generalized expression is evaluated, because the distribution approaches do not change $A$.
In some special cases $A$ can stay unchanged for a long time even if concrete and generalized expressions are evaluated constantly. This depends on the set of concrete path expressions that are evaluated. These expressions are used frequently, and the selectivity values stored for them in the XTSS are accurate, so the XTSS is accurate for these expressions. If an XTSS node is never accessed to obtain its selectivity (and so is never refined after an expression is evaluated), it may never have an accurate selectivity. As a result, the XTSS is accurate, but only for the frequent expressions. This is natural for the feedback approach and is enough to say that the XTSS is stable.

The speed at which $A$ decreases depends on the properties of the store, and the only advice we can give is to experiment with the distribution approaches. It is a good idea to turn off the feedback evaluation of generalized path expressions when the source data changes so frequently that the XTSS has no time to stabilize.

5.6 Generalization of XTSS for Distributed XML

Statistical information is required in order to evaluate queries on distributed XML documents efficiently, just as it is for local documents. Having defined the XTSS for local documents, we extend the definition to the distributed case by introducing the notion of a Distributed XTSS (DXTSS). A DXTSS plays the same role for distributed documents as an XTSS does for local ones.

Both the parent-child and the ancestor-descendant arc types share one property: they define relations between nodes of the same local document. Arcs for cross-document references appear in the distributed case. They can be not only cross-document but also cross-server, when fragments of a distributed document reside at different servers. We call such arcs associative and mark them with a special symbol. In this way, a DXTSS is a set of XTSSs connected to each other by associative arcs. One of the XTSSs is considered the main one and contains the root of the structure. See Figure 7 for an example of a DXTSS.

Figure 7: Distributed XTSS

It is worth mentioning that only one associative arc is allowed in a DXTSS branch. This restriction is explained by the fact that we consider remote servers to be independent and atomic, so we cannot demand any private information (for example, the store schema) from them. It is possible that distributed documents form a chain (or even a cycle) by including parts of each other, but we will never know this for certain when evaluating a query on a distributed document.

It is acceptable to measure the connection and/or transmission speed (the $k$ parameter of the simple cost function considered in section 3.2) for each associative arc and to store it in our statistical structure in order to use it for cost estimation. Large values do not necessarily mean that a chain exists; they can be the result of outdated hardware or an overloaded remote server. The real situation is not so important for successful cost estimation. The only crucial information is how long the remote operation lasts. Associative arcs are well suited to storing these statistics.

6 Conclusions

The paper describes a distributed XML store model based on the notion of a distributed XML document. It is shown how the conventional XQuery language can be used to query stores of that type. Query evaluation issues are discussed and the value of path selectivity statistics is shown.

A classification of XPath expressions based on four characteristics is introduced. It can be used by researchers and developers to refer easily to classes of XPath expressions, as we do in the paper.

DataGuide-like XML Tree Sibling Summarization structures are defined in the paper. They are suited to holding statistical information about XPath expression selectivities and are used by the cost estimator module of our XML query optimizer. A generalization of that structure to the distributed case is described, using associative arcs to put local XTSSs together. The feedback approach to maintaining the partial XTSS structure is described and its stability is shown.

7 Acknowledgements

I would like to thank my scientific adviser Boris Novikov for his support and valuable comments. Many issues and application patterns of XTSS were discussed with Anton Gubanov and Maxim Lukichev; thank you, colleagues, for that.

References

[1] Ashraf Aboulnaga, Alaa R. Alameldeen, and Jeffrey F. Naughton. Estimating the selectivity of XML path expressions for internet scale applications. In The VLDB Journal.
[2] A. Fomichev. XML Storing and Processing Techniques. In SYRCoDIS. NIIMM.
[3] Roy Goldman and Jennifer Widom. DataGuides: Enabling query formulation and optimization in semistructured databases. In VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases. Morgan Kaufmann.
[4] T. Grust. Accelerating XPath location steps. In Proceedings of ACM Conference on Management of Data (SIGMOD).
[5] Ipedo XML database website. Website.
[6] Donald Kossmann. The State of the Art in Distributed Query Processing. In ACM Computing Surveys, volume 32, 2000.
[7] L. Lim, M. Wang, S. Padmanabhan, J. S. Vitter, and R. Parr. XPathLearner: An On-line Self-Tuning Markov Histogram for XML Path Selectivity Estimation. In VLDB.
[8] Wolfgang Meier. eXist: An Open Source Native XML Database. In Web, Web-Services, and Database Systems.
[9] Marko Smiljanic, Henk M. Blanken, Maurice van Keulen, and Willem Jonker. Distributed XML Database Systems. Technical Report TR-CTIT-02-46, CTIT, University of Twente, The Netherlands, October.
[10] XML Inclusions (XInclude) Version 1.0, 20 December. W3C Recommendation.
[11] XML Linking Language (XLink) Version 1.0, 27 June. W3C Recommendation.
More informationEstimating the Selectivity of XML Path Expression with predicates by Histograms
Estimating the Selectivity of XML Path Expression with predicates by Histograms Yu Wang 1, Haixun Wang 2, Xiaofeng Meng 1, and Shan Wang 1 1 Information School, Renmin University of China, Beijing 100872,
More informationSomething to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:
Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base
More informationSemi-structured Data. 8 - XPath
Semi-structured Data 8 - XPath Andreas Pieris and Wolfgang Fischl, Summer Term 2016 Outline XPath Terminology XPath at First Glance Location Paths (Axis, Node Test, Predicate) Abbreviated Syntax What is
More informationPart XVII. Staircase Join Tree-Aware Relational (X)Query Processing. Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 440
Part XVII Staircase Join Tree-Aware Relational (X)Query Processing Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 440 Outline of this part 1 XPath Accelerator Tree aware relational
More informationInformatics 1: Data & Analysis
Informatics 1: Data & Analysis Lecture 11: Navigating XML using XPath Ian Stark School of Informatics The University of Edinburgh Tuesday 23 February 2016 Semester 2 Week 6 http://blog.inf.ed.ac.uk/da16
More informationOptimising XML-Based Web Information Systems
Optimising XML-Based Web Information Systems Colm Noonan and Mark Roantree Interoperable Systems Group, Dublin City University, Ireland - {mark,cnoonan}@computing.dcu.ie Abstract. Many Web Information
More informationIndex-Driven XQuery Processing in the exist XML Database
Index-Driven XQuery Processing in the exist XML Database Wolfgang Meier wolfgang@exist-db.org The exist Project XML Prague, June 17, 2006 Outline 1 Introducing exist 2 Node Identification Schemes and Indexing
More informationH2 Spring B. We can abstract out the interactions and policy points from DoDAF operational views
1. (4 points) Of the following statements, identify all that hold about architecture. A. DoDAF specifies a number of views to capture different aspects of a system being modeled Solution: A is true: B.
More informationAn Efficient XML Node Identification and Indexing Scheme
An Efficient XML Node Identification and Indexing Scheme Jan-Marco Bremer and Michael Gertz Department of Computer Science University of California, Davis One Shields Ave., Davis, CA 95616, U.S.A. {bremer
More informationXQuery Optimization Based on Rewriting
XQuery Optimization Based on Rewriting Maxim Grinev Moscow State University Vorob evy Gory, Moscow 119992, Russia maxim@grinev.net Abstract This paper briefly describes major results of the author s dissertation
More informationIntegrating Path Index with Value Index for XML data
Integrating Path Index with Value Index for XML data Jing Wang 1, Xiaofeng Meng 2, Shan Wang 2 1 Institute of Computing Technology, Chinese Academy of Sciences, 100080 Beijing, China cuckoowj@btamail.net.cn
More informationEvaluating XPath Queries
Chapter 8 Evaluating XPath Queries Peter Wood (BBK) XML Data Management 201 / 353 Introduction When XML documents are small and can fit in memory, evaluating XPath expressions can be done efficiently But
More informationXQuery Optimization in Relational Database Systems
XQuery Optimization in Relational Database Systems Riham Abdel Kader Supervised by Maurice van Keulen Univeristy of Twente P.O. Box 217 7500 AE Enschede, The Netherlands r.abdelkader@utwente.nl ABSTRACT
More informationKnowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey
Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey G. Shivaprasad, N. V. Subbareddy and U. Dinesh Acharya
More informationAn XML-IR-DB Sandwich: Is it Better With an Algebra in Between?
An XML-IR-DB Sandwich: Is it Better With an Algebra in Between? Vojkan Mihajlović Djoerd Hiemstra Henk Ernst Blok Peter M. G. Apers CTIT, University of Twente P.O. Box 217, 7500AE Enschede, The Netherlands
More informationMain Memory and the CPU Cache
Main Memory and the CPU Cache CPU cache Unrolled linked lists B Trees Our model of main memory and the cost of CPU operations has been intentionally simplistic The major focus has been on determining
More informationXML and Databases. Lecture 9 Properties of XPath. Sebastian Maneth NICTA and UNSW
XML and Databases Lecture 9 Properties of XPath Sebastian Maneth NICTA and UNSW CSE@UNSW -- Semester 1, 2009 Outline 1. XPath Equivalence 2. No Looking Back: How to Remove Backward Axes 3. Containment
More informationXML and Databases. Lecture 10 XPath Evaluation using RDBMS. Sebastian Maneth NICTA and UNSW
XML and Databases Lecture 10 XPath Evaluation using RDBMS Sebastian Maneth NICTA and UNSW CSE@UNSW -- Semester 1, 2009 Outline 1. Recall pre / post encoding 2. XPath with //, ancestor, @, and text() 3.
More informationLecture 13 Thursday, March 18, 2010
6.851: Advanced Data Structures Spring 2010 Lecture 13 Thursday, March 18, 2010 Prof. Erik Demaine Scribe: David Charlton 1 Overview This lecture covers two methods of decomposing trees into smaller subtrees:
More informationUsing an Oracle Repository to Accelerate XPath Queries
Using an Oracle Repository to Accelerate XPath Queries Colm Noonan, Cian Durrigan, and Mark Roantree Interoperable Systems Group, Dublin City University, Dublin 9, Ireland {cnoonan, cdurrigan, mark}@computing.dcu.ie
More informationCHAPTER 3 LITERATURE REVIEW
20 CHAPTER 3 LITERATURE REVIEW This chapter presents query processing with XML documents, indexing techniques and current algorithms for generating labels. Here, each labeling algorithm and its limitations
More informationEfficient Implementation of XQuery Constructor Expressions
Efficient Implementation of XQuery Constructor Expressions c Maxim Grinev Leonid Novak Ilya Taranov Institute of System Programming Abstract Element constructor is one of most expensive operations of the
More informationArbori Starter Manual Eugene Perkov
Arbori Starter Manual Eugene Perkov What is Arbori? Arbori is a query language that takes a parse tree as an input and builds a result set 1 per specifications defined in a query. What is Parse Tree? A
More informationMining XML data: A clustering approach
Mining XML data: A clustering approach Saraee, MH and Aljibouri, J Title Authors Type URL Published Date 2005 Mining XML data: A clustering approach Saraee, MH and Aljibouri, J Conference or Workshop Item
More informationProcessing Rank-Aware Queries in P2P Systems
Processing Rank-Aware Queries in P2P Systems Katja Hose, Marcel Karnstedt, Anke Koch, Kai-Uwe Sattler, and Daniel Zinn Department of Computer Science and Automation, TU Ilmenau P.O. Box 100565, D-98684
More informationSemantic Characterizations of XPath
Semantic Characterizations of XPath Maarten Marx Informatics Institute, University of Amsterdam, The Netherlands CWI, April, 2004 1 Overview Navigational XPath is a language to specify sets and paths in
More informationMETAXPath. Utah State University. From the SelectedWorks of Curtis Dyreson. Curtis Dyreson, Utah State University Michael H. Böhen Christian S.
Utah State University From the SelectedWorks of Curtis Dyreson December, 2001 METAXPath Curtis Dyreson, Utah State University Michael H. Böhen Christian S. Jensen Available at: https://works.bepress.com/curtis_dyreson/11/
More informationInformatics 1: Data & Analysis
T O Y H Informatics 1: Data & Analysis Lecture 11: Navigating XML using XPath Ian Stark School of Informatics The University of Edinburgh Tuesday 26 February 2013 Semester 2 Week 6 E H U N I V E R S I
More informationLECTURE NOTES OF ALGORITHMS: DESIGN TECHNIQUES AND ANALYSIS
Department of Computer Science University of Babylon LECTURE NOTES OF ALGORITHMS: DESIGN TECHNIQUES AND ANALYSIS By Faculty of Science for Women( SCIW), University of Babylon, Iraq Samaher@uobabylon.edu.iq
More informationNested Intervals Tree Encoding with Continued Fractions
Nested Intervals Tree Encoding with Continued Fractions VADIM TROPASHKO Oracle Corp There is nothing like abstraction To take away your intuition Shai Simonson http://aduniorg/courses/discrete/ We introduce
More informationXML Systems & Benchmarks
XML Systems & Benchmarks Christoph Staudt Peter Chiv Saarland University, Germany July 1st, 2003 Main Goals of our talk Part I Show up how databases and XML come together Make clear the problems that arise
More informationXPath and XQuery. Introduction to Databases CompSci 316 Fall 2018
XPath and XQuery Introduction to Databases CompSci 316 Fall 2018 2 Announcements (Tue. Oct. 23) Homework #3 due in two weeks Project milestone #1 feedback : we are a bit behind, but will definitely release
More informationCompression of the Stream Array Data Structure
Compression of the Stream Array Data Structure Radim Bača and Martin Pawlas Department of Computer Science, Technical University of Ostrava Czech Republic {radim.baca,martin.pawlas}@vsb.cz Abstract. In
More informationAn Efficient XML Index Structure with Bottom-Up Query Processing
An Efficient XML Index Structure with Bottom-Up Query Processing Dong Min Seo, Jae Soo Yoo, and Ki Hyung Cho Department of Computer and Communication Engineering, Chungbuk National University, 48 Gaesin-dong,
More informationData Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi.
Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi Lecture 18 Tries Today we are going to be talking about another data
More informationXPath. by Klaus Lüthje Lauri Pitkänen
XPath by Klaus Lüthje Lauri Pitkänen Agenda Introduction History Syntax Additional example and demo Applications Xpath 2.0 Future Introduction Expression language for Addressing portions of an XML document
More informationKeywords: Binary Sort, Sorting, Efficient Algorithm, Sorting Algorithm, Sort Data.
Volume 4, Issue 6, June 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com An Efficient and
More information11. XML storage details Introduction Last Lecture Introduction Introduction. XML Databases XML storage details
XML Databases Silke Eckstein Andreas Kupfer Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 2 11.1 Last Lecture Different methods for storage of XML documents
More informationCourse: The XPath Language
1 / 30 Course: The XPath Language Pierre Genevès CNRS University of Grenoble Alpes, 2017 2018 2 / 30 Why XPath? Search, selection and extraction of information from XML documents are essential for any
More informationUNIVERSITY OF TWENTE. Querying Uncertain Data in XML
UNIVERSITY OF TWENTE. Querying Uncertain Data in XML X Daniël Knippers MSc Thesis August 214 Y 1 2 1 2 Graduation committee Dr. ir. Maurice van Keulen Dr. Mena Badieh Habib Morgan Abstract This thesis
More informationHow to speed up a database which has gotten slow
Triad Area, NC USA E-mail: info@geniusone.com Web: http://geniusone.com How to speed up a database which has gotten slow hardware OS database parameters Blob fields Indices table design / table contents
More informationQuickXDB: A Prototype of a Native XML QuickXDB: Prototype of Native XML DBMS DBMS
QuickXDB: A Prototype of a Native XML QuickXDB: Prototype of Native XML DBMS DBMS Petr Lukáš, Radim Bača, and Michal Krátký Petr Lukáš, Radim Bača, and Michal Krátký Department of Computer Science, VŠB
More informationKnowledge discovery from XML Database
Knowledge discovery from XML Database Pravin P. Chothe 1 Prof. S. V. Patil 2 Prof.S. H. Dinde 3 PG Scholar, ADCET, Professor, ADCET Ashta, Professor, SGI, Atigre, Maharashtra, India Maharashtra, India
More informationAn Extended Byte Carry Labeling Scheme for Dynamic XML Data
Available online at www.sciencedirect.com Procedia Engineering 15 (2011) 5488 5492 An Extended Byte Carry Labeling Scheme for Dynamic XML Data YU Sheng a,b WU Minghui a,b, * LIU Lin a,b a School of Computer
More informationXML Query Processing. Announcements (March 31) Overview. CPS 216 Advanced Database Systems. Course project milestone 2 due today
XML Query Processing CPS 216 Advanced Database Systems Announcements (March 31) 2 Course project milestone 2 due today Hardcopy in class or otherwise email please I will be out of town next week No class
More informationXML databases. Jan Chomicki. University at Buffalo. Jan Chomicki (University at Buffalo) XML databases 1 / 9
XML databases Jan Chomicki University at Buffalo Jan Chomicki (University at Buffalo) XML databases 1 / 9 Outline 1 XML data model 2 XPath 3 XQuery Jan Chomicki (University at Buffalo) XML databases 2
More informationAdvanced Database Systems
Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed
More informationA Methodology for Integrating XML Data into Data Warehouses
A Methodology for Integrating XML Data into Data Warehouses Boris Vrdoljak, Marko Banek, Zoran Skočir University of Zagreb Faculty of Electrical Engineering and Computing Address: Unska 3, HR-10000 Zagreb,
More informationB-Trees. Version of October 2, B-Trees Version of October 2, / 22
B-Trees Version of October 2, 2014 B-Trees Version of October 2, 2014 1 / 22 Motivation An AVL tree can be an excellent data structure for implementing dictionary search, insertion and deletion Each operation
More informationAn AVL tree with N nodes is an excellent data. The Big-Oh analysis shows that most operations finish within O(log N) time
B + -TREES MOTIVATION An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations finish within O(log N) time The theoretical conclusion
More informationXML Technologies. Doc. RNDr. Irena Holubova, Ph.D. Web pages:
XML Technologies Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz Web pages: http://www.ksi.mff.cuni.cz/~holubova/nprg036/ Outline Introduction to XML format, overview of XML technologies DTD
More informationLecture 25 Notes Spanning Trees
Lecture 25 Notes Spanning Trees 15-122: Principles of Imperative Computation (Spring 2016) Frank Pfenning 1 Introduction The following is a simple example of a connected, undirected graph with 5 vertices
More informationXML and Databases. Outline. Outline - Lectures. Outline - Assignments. from Lecture 3 : XPath. Sebastian Maneth NICTA and UNSW
Outline XML and Databases Lecture 10 XPath Evaluation using RDBMS 1. Recall / encoding 2. XPath with //,, @, and text() 3. XPath with / and -sibling: use / size / level encoding Sebastian Maneth NICTA
More informationTable : IEEE Single Format ± a a 2 a 3 :::a 8 b b 2 b 3 :::b 23 If exponent bitstring a :::a 8 is Then numerical value represented is ( ) 2 = (
Floating Point Numbers in Java by Michael L. Overton Virtually all modern computers follow the IEEE 2 floating point standard in their representation of floating point numbers. The Java programming language
More informationDominance Constraints and Dominance Graphs
Dominance Constraints and Dominance Graphs David Steurer Saarland University Abstract. Dominance constraints logically describe trees in terms of their adjacency and dominance, i.e. reachability, relation.
More informationQuerying Tree-Structured Data Using Dimension Graphs
Querying Tree-Structured Data Using Dimension Graphs Dimitri Theodoratos 1 and Theodore Dalamagas 2 1 Dept. of Computer Science New Jersey Institute of Technology Newark, NJ 07102 dth@cs.njit.edu 2 School
More informationComputer Science 210 Data Structures Siena College Fall Topic Notes: Trees
Computer Science 0 Data Structures Siena College Fall 08 Topic Notes: Trees We ve spent a lot of time looking at a variety of structures where there is a natural linear ordering of the elements in arrays,
More informationM-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height =
M-ary Search Tree B-Trees Section 4.7 in Weiss Maximum branching factor of M Complete tree has height = # disk accesses for find: Runtime of find: 2 Solution: B-Trees specialized M-ary search trees Each
More informationRank-aware XML Data Model and Algebra: Towards Unifying Exact Match and Similar Match in XML
Proceedings of the 7th WSEAS International Conference on Multimedia, Internet & Video Technologies, Beijing, China, September 15-17, 2007 253 Rank-aware XML Data Model and Algebra: Towards Unifying Exact
More informationXML Databases 11. XML storage details
XML Databases 11. XML storage details Silke Eckstein Andreas Kupfer Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 11. XML storage details 11.1 Introduction
More informationCourse: The XPath Language
1 / 27 Course: The XPath Language Pierre Genevès CNRS University of Grenoble, 2012 2013 2 / 27 Why XPath? Search, selection and extraction of information from XML documents are essential for any kind of
More informationData Analytics and Boolean Algebras
Data Analytics and Boolean Algebras Hans van Thiel November 28, 2012 c Muitovar 2012 KvK Amsterdam 34350608 Passeerdersstraat 76 1016 XZ Amsterdam The Netherlands T: + 31 20 6247137 E: hthiel@muitovar.com
More informationApplication-Tailored XML Storage
Application-Tailored XML Storage Maxim Grinev, Ivan Shcheklein Institute for System Programming of the Russian Academy of Sciences maxim@grinev.net, shcheklein@ispras.ru Abstract Several native approaches
More informationLecture 9 March 4, 2010
6.851: Advanced Data Structures Spring 010 Dr. André Schulz Lecture 9 March 4, 010 1 Overview Last lecture we defined the Least Common Ancestor (LCA) and Range Min Query (RMQ) problems. Recall that an
More informationXML Index Recommendation with Tight Optimizer Coupling
XML Index Recommendation with Tight Optimizer Coupling Technical Report CS-2007-22 July 11, 2007 Iman Elghandour University of Waterloo Andrey Balmin IBM Almaden Research Center Ashraf Aboulnaga University
More informationCOMP9319 Web Data Compression & Search. Cloud and data optimization XPath containment Distributed path expression processing
COMP9319 Web Data Compression & Search Cloud and data optimization XPath containment Distributed path expression processing DATA OPTIMIZATION ON CLOUD Cloud Virtualization Cloud layers Cloud computing
More informationXSelMark: A Micro-Benchmark for Selectivity Estimation Approaches of XML Queries
XSelMark: A Micro-Benchmark for Selectivity Estimation Approaches of XML Queries Sherif Sakr National ICT Australia (NICTA) Sydney, Australia sherif.sakr@nicta.com.au Abstract. Estimating the sizes of
More informationJoint Entity Resolution
Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute
More information4 Fractional Dimension of Posets from Trees
57 4 Fractional Dimension of Posets from Trees In this last chapter, we switch gears a little bit, and fractionalize the dimension of posets We start with a few simple definitions to develop the language
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part V Lecture 13, March 10, 2014 Mohammad Hammoud Today Welcome Back from Spring Break! Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+
More information/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/27/17
01.433/33 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/2/1.1 Introduction In this lecture we ll talk about a useful abstraction, priority queues, which are
More informationInternational Journal of Advance Engineering and Research Development. Performance Enhancement of Search System
Scientific Journal of Impact Factor(SJIF): 3.134 International Journal of Advance Engineering and Research Development Volume 2,Issue 7, July -2015 Performance Enhancement of Search System Ms. Uma P Nalawade
More informationSFilter: A Simple and Scalable Filter for XML Streams
SFilter: A Simple and Scalable Filter for XML Streams Abdul Nizar M., G. Suresh Babu, P. Sreenivasa Kumar Indian Institute of Technology Madras Chennai - 600 036 INDIA nizar@cse.iitm.ac.in, sureshbabuau@gmail.com,
More informationXPath Lecture 34. Robb T. Koether. Hampden-Sydney College. Wed, Apr 11, 2012
XPath Lecture 34 Robb T. Koether Hampden-Sydney College Wed, Apr 11, 2012 Robb T. Koether (Hampden-Sydney College) XPathLecture 34 Wed, Apr 11, 2012 1 / 20 1 XPath Functions 2 Predicates 3 Axes Robb T.
More informationImplementation of Relational Operations in Omega Parallel Database System *
Implementation of Relational Operations in Omega Parallel Database System * Abstract The paper describes the implementation of relational operations in the prototype of the Omega parallel database system
More informationarxiv: v2 [cs.ds] 9 Apr 2009
Pairing Heaps with Costless Meld arxiv:09034130v2 [csds] 9 Apr 2009 Amr Elmasry Max-Planck Institut für Informatik Saarbrücken, Germany elmasry@mpi-infmpgde Abstract Improving the structure and analysis
More informationImproving generalized inverted index lock wait times
Journal of Physics: Conference Series PAPER OPEN ACCESS Improving generalized inverted index lock wait times To cite this article: A Borodin et al 2018 J. Phys.: Conf. Ser. 944 012022 View the article
More informationStructural Consistency: Enabling XML Keyword Search to Eliminate Spurious Results Consistently
Last Modified: 22 Sept. 29 Structural Consistency: Enabling XML Keyword Search to Eliminate Spurious Results Consistently Ki-Hoon Lee, Kyu-Young Whang, Wook-Shin Han, and Min-Soo Kim Department of Computer
More informationCopyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 27-1
Slide 27-1 Chapter 27 XML: Extensible Markup Language Chapter Outline Introduction Structured, Semi structured, and Unstructured Data. XML Hierarchical (Tree) Data Model. XML Documents, DTD, and XML Schema.
More informationDirected Graphical Models (Bayes Nets) (9/4/13)
STA561: Probabilistic machine learning Directed Graphical Models (Bayes Nets) (9/4/13) Lecturer: Barbara Engelhardt Scribes: Richard (Fangjian) Guo, Yan Chen, Siyang Wang, Huayang Cui 1 Introduction For
More informationDecision trees. Decision trees are useful to a large degree because of their simplicity and interpretability
Decision trees A decision tree is a method for classification/regression that aims to ask a few relatively simple questions about an input and then predicts the associated output Decision trees are useful
More information