A Structural Numbering Scheme for XML Data
Alfred M. Martin
WS2002/2003, February/March 2003

Based on work presented at the EDBT 2002 Workshops by Dao Dinh Kha, Masatoshi Yoshikawa, and Shunsuke Uemura
Graduate School of Information Science, Nara Institute of Science and Technology, Takayama, Ikoma, Nara, Japan
Information Technology Center, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Japan

Abstract. Identifier generation is a common but crucial task in many XML applications. In addition, the structural information of XML data is essential for evaluating XML queries. To meet both requirements, several numbering schemes, including the powerful UID technique, have been proposed. We introduce a new numbering scheme based on the UID technique, called multilevel recursive UID (ruid). The proposed ruid is robust, scalable and hierarchical. ruid generates identifiers level by level and takes the XML tree topology into account. ruid not only enables the computation of the parent node's identifier from the child node's identifier, as in the original UID, but also deals effectively with XML structural updates and can be applied to arbitrarily large XML documents.

1. Introduction

The goals of a structural numbering scheme for XML data are to generate stable identifiers, to support query evaluation, and to manage large XML trees. To achieve these goals, the original UID (Unique Identifier) technique is extended by the new ruid, the recursive UID system.

Among existing numbering schemes, the technique referred to as the Unique Identifier (UID) enumerates nodes using a k-ary tree, where k is the maximal fan-out of the nodes. Every internal node is assumed to have the same fan-out k; virtual children are assigned where needed. Consecutive integers starting from 1 are assigned to the nodes, including the virtual nodes, in order from top to bottom and from left to right in each level.
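The arithmetic behind this enumeration can be sketched in a few lines. The following is a minimal illustration, not code from the paper: it uses the floor formula that also appears in rparent() (Fig. 6) and the child range stated for the 1-level UID in Section 3.2. The function names uid_parent, uid_children and uid_ancestors are chosen here for illustration only.

```python
from typing import List


def uid_parent(i: int, k: int) -> int:
    """Return the UID of the parent of node i in a k-ary enumeration rooted at 1."""
    if i <= 1:
        raise ValueError("the root (UID 1) has no parent")
    return (i - 2) // k + 1


def uid_children(p: int, k: int) -> range:
    """Return the UID slots (real or virtual) of the children of node p.

    This is the closed interval [(p - 1) * k + 2, p * k + 1].
    """
    return range((p - 1) * k + 2, p * k + 2)


def uid_ancestors(i: int, k: int) -> List[int]:
    """Return the UIDs of all ancestors of node i, nearest first."""
    ancestors = []
    while i > 1:
        i = uid_parent(i, k)
        ancestors.append(i)
    return ancestors


if __name__ == "__main__":
    k = 3  # assumed maximal fan-out for this toy example
    print(list(uid_children(2, k)))  # the three child slots of node 2
    print(uid_parent(7, k))          # parent of node 7
    print(uid_ancestors(23, k))      # ancestors of node 23, nearest first
```

Because every internal node is padded to exactly k children, both directions are pure integer arithmetic and no lookup structure is needed. The price is that a change of k invalidates every identifier.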
Other numbering schemes can determine the parent-child relationship only by comparing two identifiers, both of which must already be known. The UID technique has an interesting property: the parent node can be determined from the identifier of the child node alone. Given a node with the identifier i, we can compute the identifier of its parent using the formula:

$parent(i) = \lfloor (i - 2)/k \rfloor + 1$

The main features of the ruid technique are:
1. Parent-child determination property: given the identifier of a node, the parent node's identifier can be efficiently computed. Using small-sized global information stored in main memory, the new technique allows the ancestor-descendant relationship to be determined without any I/O.
2. Robustness for structural change: the scope of data amendment when a structural update occurs is effectively reduced.
3. Scalability: the new representation overcomes the identifier limitation of the original UID technique and can be applied to arbitrarily large XML documents.
4. Structural richness: ruid is effective in representing the structural components in XPath expressions.

2. Multilevel Recursive UID

2.1 Definitions and pictures

Definition 1. (A frame) Given an XML tree T rooted at r, a frame F is a tree (1) rooted at r, (2) whose node set is a subset of the node set of T, and (3) in which, for any two nodes u and v in the frame, an edge connects the nodes if and only if one of the nodes is
an ancestor of the other in T and there is no other node x that lies between u and v in T and belongs to the frame. A tree and one of its frames are shown in Fig. 2(a). The dotted arrows connect the c.

Fig. 2 Frame and UID-local area

Definition 2. (UID-local area) Given an XML tree T rooted at r, a frame F of T, and a node n of F, a UID-local area of n is an induced subtree of T rooted at n such that each of the subtree's node paths is terminated either by a child node of n in F or by a leaf node of T, if between the leaf node and n in T there exists no other node that belongs to F.

Definition 3. (2-level ruid) The full 2-level ruid of a node n is a triple $(g_i, l_i, r_i)$, where $g_i$, $l_i$, and $r_i$ are called the global index, local index, and root indicator, respectively. If n is a non-root node, then $g_i$ is the index of the UID-local area containing n, $l_i$ is the index of n inside the area, and $r_i$ is false. If n is the root node of a UID-local area, then $g_i$ is the index of the area, $l_i$ is the index of n as a leaf node in the upper UID-local area, and $r_i$ is true. The identifier of the root of the main XML tree is (1, 1, true).

Input: An XML tree T
Output: The 2-level ruid identifiers of nodes in T
// Global enumeration
1. Partition the XML tree into UID-local areas and build the frame F upon their roots
2. Find the maximal fan-out k of F
3. Compute the global index g_i using the k-ary tree representation of F
// Local enumerations
4. for each i-th UID-local area
5.   find the local maximal fan-out, denoted by k_i
6.   compute the local indices l_ij of nodes in the area via a k_i-ary tree
7.   if l_ij = 1 then
8.     recompute l_ij in the upper UID-local area
9.     r_ij := true
10.    update K using (g_i, l_ij, k_i)
11.  else
12.    r_ij := false
13.  end
14.  Generate the identifiers of the nodes from (g_i, l_ij, r_ij)
15. end
16. Save k and K

Fig. 3 Outline of the algorithm used to compute 2-level ruid

Fig. 4 An original UID and its corresponding 2-level ruid counterpart

Fig. 5 Global parameter table for the 2-level ruid shown in Fig. 4(b)

2.2 Parent-Child Relationship in 2-Level ruid

Lemma 1. Given an XML tree T and a node n, the identifier of the parent of n can be computed from the value k and the table K if the identifier of n is known.

Input: An XML tree T, its k and K, and the 2-level ruid (g_i, l_i, r_i) of a node
Output: The 2-level ruid (g, l, r) of the parent node
1. if (r_i == true) then
2.   g := floor((g_i - 2)/k) + 1
3. else
4.   g := g_i
5. end
6. get the local fan-out k_g of the row with the global index g in K
7. l := floor((l_i - 2)/k_g) + 1
8. if (l == 1) then
9.   set l equal to the local index of the row with the global index g in K
10.  r := true
11. else
12.  r := false
13. end
14. return (g, l, r)

Fig. 6 rparent(): the algorithm to compute the 2-level ruid of the parent of a node

Example: Suppose that k equals 4 and the table K is given in Fig. 5. Let c and p denote a node and its parent node, respectively. I illustrate how to determine the identifier of p from the identifier of c by considering several configurations of the child node:

- c is the non-root node (2, 7, false): From the second line of K we know that the local fan-out of the UID-local area containing c is 2. Therefore, the local index of the identifier of p is $\lfloor (7 - 2)/2 \rfloor + 1$, which is equal to 3. Hence, p is the non-root node (2, 3, false). In Fig. 4, the node p is depicted by a fine-lined circle containing the numbers (2, 3).
- c is the root node (10, 9, true): The upper UID-local area containing p must be determined. Because k equals 4, the upper UID-local area's index is $\lfloor (10 - 2)/4 \rfloor + 1$, or 3. The local fan-out of this UID-local area is shown in the third line of K and is equal to 3. The local index of p is $\lfloor (9 - 2)/3 \rfloor + 1$, which is equal to 3. The value is greater than 1, so p is the non-root node (3, 3, false).
- c is the non-root node (3, 3, false): From the third line of K we know that the local fan-out of the UID-local area containing c is equal to 3, so the index of p in the UID-local area is $\lfloor (3 - 2)/3 \rfloor + 1$, which is equal to 1. This means that p is the root of the considered UID-local area. Therefore, the local index of p must equal the index of the node in the upper UID-local area. From K, the value is found to be 3, and p is the area root node (3, 3, true).

Note that if the value k and the table K are known and loaded into main memory, then all of the steps in the algorithm rparent() can be performed completely inside main memory without any disk I/O.

2.3 Adjustment of the Maximal Fan-out of Frame
2.4 Description of Multilevel ruid

Definition 4. (Multilevel ruid) Given an XML tree T, the l-level ruid of a node n has the form

$\{\theta, (\alpha_{l-1}, \beta_{l-1}), \ldots, (\alpha_2, \beta_2), (\alpha_1, \beta_1)\}$

where:
- for $j = 1, \ldots, l-1$: $\alpha_j$ is the local index and $\beta_j$ is the root indicator of n in its UID-local area identified by $\{\theta, (\alpha_{l-1}, \beta_{l-1}), \ldots, (\alpha_{j+1}, \beta_{j+1})\}$ in the level j+1.
- $\theta$ is the original UID in the level l.

The symbols $\theta$, $\alpha_i$ and $\beta_i$ have meanings similar to the first, second, and third components of the 2-level ruid.

Fig. 8 A multilevel ruid example

Example: In Fig. 8, each polygon denotes a UID-local area. Suppose that, using 2-level ruid, the node n has the identifier {8, (a, true)}, where the boolean value true indicates that n is the root of a UID-local area, 8 is the index of n in the second level's frame, and the integer a is the index of n in the upper UID-local area, which has the index 2. Using 3-level ruid, the index 8 is decomposed into (2, 4, false) and the full identifier of n becomes {2, (4, false), (a, true)}.

The multilevel ruid is constructed by building the UID levels consecutively, each on top of the previous level. First the 2-level ruid of the form $\{x_l, (\alpha_l, \beta_l)\}$ is constructed. If needed, the 3-level ruid of the
form $\{x_{l-1}, (\alpha_{l-1}, \beta_{l-1}), (\alpha_l, \beta_l)\}$ is constructed, and so on. The process stops when the top level becomes small enough to be stored.

3. Properties

3.1 Robustness with Structural Update

In the original UID, if a new node is inserted into an XML tree where space is available, the insertion causes the identifiers of the sibling nodes to the right of the inserted node, as well as those of their descendant nodes, to be modified. In the worst case, when the insertion increases the tree's maximal fan-out, the entire enumeration has to be performed again. The identifiers of all nodes must be changed, which leads to an expensive reconstruction.

The ruid copes better with structural updates of XML data than the original UID does. The scope of identifier update due to a node insertion is reduced by a magnitude of two. If a node is inserted, at first only the nodes in the UID-local area where the update occurs need to be considered. If appropriate space is available for the new node, then among the descendants of the sibling nodes to the right of the inserted node, only those that belong to the same UID-local area will have their identifiers modified. The nodes in the descendant areas are not affected because the frame F is unchanged. Otherwise, if no such space exists for the newly inserted node, the fan-out of the tree used in enumerating the UID-local area must be enlarged. Rather than modifying the identifiers of every XML component, the enlargement changes only the identifiers of the nodes in this area. In both cases, since a UID-local area is much smaller than the entire data tree, the scope of the identifier update is greatly reduced.

Similarly, the new ruid deals with the other structural operation, node deletion. Note that any node deletion in an XML tree is cascading: all of the descendant nodes of the deleted node are deleted as well.
The change of the identifiers of the sibling nodes to the right of the deleted node affects only those descendant nodes that belong to the UID-local area where the deletion occurs.

3.2 XPath Axes Expressiveness

In this section, we investigate the power of ruid to express XPath expressions. This property is important for the applicability of ruid in XML query processing. We consider XPath because it has become the standard on which many newly proposed XML query languages are based. Furthermore, XPath expressions have additional concepts specific to XML data, such as axes, that do not exist in regular path expressions.

XPath is a language for addressing parts of an XML document, and was designed to be used by other languages such as XSLT and XPointer. In addition, XPath provides basic facilities for manipulating strings, numbers, and booleans, and it operates on the logical structure of an XML document. One important kind of XPath expression is the location path. A location path selects a set of nodes relative to a context node. The result
of evaluating a location path is the node-set containing the nodes selected by the location path. I will focus only on the core rules of XPath, such as the following:

[1] LocationPath ::= RelativeLocationPath | AbsoluteLocationPath
[2] AbsoluteLocationPath ::= '/' RelativeLocationPath?
[3] RelativeLocationPath ::= Step | RelativeLocationPath '/' Step

Therefore, a location path can be written in the form

$\delta\ Step_1\ \tau_1\ Step_2\ \tau_2 \ldots Step_l$

where $l \geq 0$, $\delta$ is either the empty symbol (indicating that nothing appears) or '/', $\tau_i$ ($i = 1, \ldots, l-1$) is '/', and $Step_i$ ($i = 1, \ldots, l$) is a location step.

A location step has three parts: 1) an axis, which specifies the hierarchical relationship between the nodes considered in the location step and the context node, 2) a node test, which specifies the node type and expanded-name of the nodes selected by the location step, and 3) zero or more predicates to further refine the set of nodes. An initial node-set is generated from the axis and the node test and is then filtered by each of the predicates in turn. A predicate filters a node-set with respect to an axis to produce a new node-set.

As described above, generating and filtering the axes is essential in the evaluation of location steps in XPath expressions. The general task is as follows: "Given a context node n identified by (θ, α, β), generate the set of nodes that belong to a specific axis of n and satisfy a condition C". The condition C may be "to satisfy a logical expression related to data content", "to belong to a specific element type", etc. Depending on the particular C, the processing order may be either to generate the set of nodes satisfying C and then check which of them belong to the specified axis, or to generate the specified axis and then check which nodes satisfy C. The first approach is good only when C is specific, so that the set of nodes satisfying C is small. The second approach is more generally applicable, and thus we shall focus on it.
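The two processing orders can be contrasted with a small sketch. To keep it short, it uses the ancestor axis of a plain 1-level UID tree rather than ruid, and the element-type table ELEMENT_TYPE, the fan-out K_FANOUT and all function names are hypothetical values chosen for illustration, not data from the paper.

```python
from typing import Callable, Iterable, List


def uid_parent(i: int, k: int) -> int:
    """Parent UID in a k-ary enumeration rooted at 1."""
    return (i - 2) // k + 1


def ancestor_axis(i: int, k: int) -> List[int]:
    """Ancestors of node i, nearest first, by repeated parent computation."""
    axis = []
    while i > 1:
        i = uid_parent(i, k)
        axis.append(i)
    return axis


def axis_then_filter(context: int,
                     axis: Callable[[int], Iterable[int]],
                     condition: Callable[[int], bool]) -> List[int]:
    """Second approach: generate the axis, then keep the nodes satisfying C."""
    return [n for n in axis(context) if condition(n)]


def filter_then_axis(context: int,
                     candidates: Iterable[int],
                     in_axis: Callable[[int, int], bool]) -> List[int]:
    """First approach: start from the nodes satisfying C, then test axis membership."""
    return [n for n in candidates if in_axis(n, context)]


# Hypothetical toy data: fan-out 2 and a few typed nodes identified by UID.
K_FANOUT = 2
ELEMENT_TYPE = {1: "book", 2: "chapter", 3: "chapter", 5: "section", 11: "para"}


def is_chapter(n: int) -> bool:
    return ELEMENT_TYPE.get(n) == "chapter"


second = axis_then_filter(11, lambda n: ancestor_axis(n, K_FANOUT), is_chapter)
first = filter_then_axis(
    11,
    [n for n in ELEMENT_TYPE if is_chapter(n)],
    lambda n, ctx: n in ancestor_axis(ctx, K_FANOUT),
)
```

Both orders select the same node set. They differ in which set is enumerated first: the axis of the context node, or the nodes satisfying C.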
We demonstrate the XPath axes expressiveness of ruid by proposing several routines to generate the axes. We limit the scope of discussion to the axes that specify sets of nodes in terms of the node position in XML documents. Because it is trivial, we exclude the -or-self portion of axes from consideration. Specifically, the following axes will be considered: (1) parent and ancestor, (2) attribute, child, and descendant, (3) preceding-sibling and following-sibling, and (4) preceding and following.

Parent and Ancestor axes: As shown in Section 2.2, after loading the value k and the table K, the parent's identifier for a given node can be computed using rparent() in main memory. The routine rancestor(n), used to generate the list of the ancestors of n, is a repetition of rparent(). Note that numbering schemes based on a loose hierarchical order require additional parameters to express the hierarchical level, such as grandparent or great-grandparent. This task can be accomplished much more simply using ruid. For example, let us consider
an expression in abbreviated syntax such as "element1/*/element2", which explicitly requires that exactly one element lies between "element1" and "element2". Naturally, we do not have to know the exact intermediate element. Using ruid, we can avoid scanning the entire collection of available elements to find the parent of "element2". We need only to list the grandparents of the elements of the type "element2", by applying rparent() twice, and exclude those that are not of the type "element1".

Child and Descendant axes: In the 1-level UID, if p is the parent's UID, then the identifiers of its children belong to the range [(p-1)*k + 2, p*k + 1], where k is the fan-out of the enumerating tree. In the 2-level ruid, the routine rchildren(n) to create the list L of possible children of n is as follows. First, use k and θ to compute the sorted list L_1 of children of θ in the frame of T. Let k' denote the local fan-out corresponding to θ, obtained from K. Let L_2 denote the list of integers in the interval [2, k' + 1] if β is true, or in the interval [(α-1)*k' + 2, α*k' + 1] if β is false. For each i in L_2, if there exists no θ' in L_1 such that (θ', i) is found in K as the global and local indices of a row, then add (θ, i, false) to L. Otherwise, add (θ', i, true) to L. In order to confirm the existence of such a θ', we first find in K the list of the local indices corresponding to the values in L_1 as the global indices. We then intersect this list with L_2. Note that both L_1 and K are sorted, so this process is fast.

The routine rdescendant(n) to generate the list of the descendants of n may be designed as a repetition of rchildren(). Another method is based on the following observation. Given two nodes n_1 and n_2, let r_1 and r_2 be the roots of the UID-local areas containing n_1 and n_2, respectively. Then, if r_1 is a descendant of n_2, n_1 is a descendant of n_2.
Therefore, we first need to find the descendants of n inside its own UID-local area only, using rchildren(). Among these nodes, consider the UID-local area root nodes. In F, find all the nodes that are descendant-or-self of these roots. All nodes in the areas rooted at the newly found nodes are descendants of n.

Preceding-sibling and Following-sibling axes: We explain the routine rpsibling(n), which generates the list L of the preceding siblings of n. Using k and θ, we generate the sorted list L_1 of child nodes of θ in the frame F of T. In the context UID-local area, compute the sorted list L_2 of the preceding siblings of α. For each α_i in L_2, if there exists no θ_j in L_1 such that (θ_j, α_i) is found in K as the global index and the local index of a row, then add (θ, α_i, false) to L. Otherwise, add (θ_j, α_i, true) to L. This procedure is similar to the routine for the child and descendant axes. Similarly, we can design the routine rfsibling(n) to generate the list of the following siblings of n.

In general, the multilevel ruid has the following property: for the axes 'preceding' and 'following', the relative position of two nodes can be determined from the first components of their multilevel ruids that differ and for which the preceding-following order is decidable. In the 2-level ruid, the order among nodes is reflected in the frame F. We can use this property to accelerate the axis constructions.
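As an illustration of the parent and ancestor routines discussed in this section, the following is a minimal sketch of rparent() as outlined in Fig. 6, with rancestor() as its repetition. It rests on several assumptions: the table K is modeled as a dictionary from a global index to a pair (local index of the area root in the upper area, local fan-out); only the values quoted in the example of Section 2.2 are taken from the text; the local index in row 2 is not given there and is left as None; and the fan-out in row 1 is a placeholder. The type names Ruid and KRow are also chosen here for illustration.

```python
from typing import Dict, List, Optional, Tuple

Ruid = Tuple[int, int, bool]      # (global index, local index, root indicator)
KRow = Tuple[Optional[int], int]  # (root's local index in the upper area, local fan-out)

ROOT: Ruid = (1, 1, True)         # identifier of the root of the main XML tree


def rparent(node: Ruid, k: int, K: Dict[int, KRow]) -> Ruid:
    """Compute the 2-level ruid of the parent of `node` from k and K only."""
    if node == ROOT:
        raise ValueError("the root of the main tree has no parent")
    g_i, l_i, r_i = node
    # An area root is indexed as a leaf of the upper area, so move up in the frame.
    g = (g_i - 2) // k + 1 if r_i else g_i
    root_local_index, fanout = K[g]
    l = (l_i - 2) // fanout + 1
    if l == 1:
        # The parent is the root of area g; its local index is stored in K.
        if root_local_index is None:
            raise KeyError(f"local index of the root of area {g} is not known")
        return (g, root_local_index, True)
    return (g, l, False)


def rancestor(node: Ruid, k: int, K: Dict[int, KRow]) -> List[Ruid]:
    """List the ancestors of `node`, nearest first, by repeating rparent()."""
    ancestors = []
    while node != ROOT:
        node = rparent(node, k, K)
        ancestors.append(node)
    return ancestors


# k = 4 and the fan-outs of rows 2 and 3 come from the example in Section 2.2.
# Row 1's fan-out (3) is an assumed placeholder; row 2's local index is unknown.
K_EXAMPLE: Dict[int, KRow] = {1: (1, 3), 2: (None, 2), 3: (3, 3)}

print(rparent((2, 7, False), 4, K_EXAMPLE))
print(rparent((10, 9, True), 4, K_EXAMPLE))
print(rparent((3, 3, False), 4, K_EXAMPLE))
print(rancestor((10, 9, True), 4, K_EXAMPLE))
```

The three rparent() calls reproduce the three configurations worked through in Section 2.2. The rancestor() call depends on the assumed row 1, although any fan-out of at least 2 in that row gives the same chain here. Applying rparent() twice yields the grandparent needed for a pattern such as "element1/*/element2".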
4. Conclusion and Summary

- Generating stable identifiers that are robust against structural updates
- Query evaluation
- Managing large XML documents and their large trees
A Structural Numbering Scheme for XML Data
A Structural Numbering Scheme for XML Data Dao Dinh Kha 1, Masatoshi Yoshikawa 1,2, and Shunsuke Uemura 1 1 Graduate School of Information Science Nara Institute of Science and Technology 8916-5 Takayama,
More informationSemi-structured Data. 8 - XPath
Semi-structured Data 8 - XPath Andreas Pieris and Wolfgang Fischl, Summer Term 2016 Outline XPath Terminology XPath at First Glance Location Paths (Axis, Node Test, Predicate) Abbreviated Syntax What is
More informationA System for Storing, Retrieving, Organizing and Managing Web Services Metadata Using Relational Database *
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 6, No 1 Sofia 2006 A System for Storing, Retrieving, Organizing and Managing Web Services Metadata Using Relational Database
More informationFull-Text and Structural XML Indexing on B + -Tree
Full-Text and Structural XML Indexing on B + -Tree Toshiyuki Shimizu 1 and Masatoshi Yoshikawa 2 1 Graduate School of Information Science, Nagoya University shimizu@dl.itc.nagoya-u.ac.jp 2 Information
More informationCHAPTER 3 LITERATURE REVIEW
20 CHAPTER 3 LITERATURE REVIEW This chapter presents query processing with XML documents, indexing techniques and current algorithms for generating labels. Here, each labeling algorithm and its limitations
More informationA Persistent Labelling Scheme for XML and tree Databases 1
A Persistent Labelling Scheme for XML and tree Databases 1 Alban Gabillon Majirus Fansi 2 Université de Pau et des Pays de l'adour IUT des Pays de l'adour LIUPPA/CSYSEC 40000 Mont-de-Marsan, France alban.gabillon@univ-pau.fr
More informationEvaluating XPath Queries
Chapter 8 Evaluating XPath Queries Peter Wood (BBK) XML Data Management 201 / 353 Introduction When XML documents are small and can fit in memory, evaluating XPath expressions can be done efficiently But
More informationXML databases. Jan Chomicki. University at Buffalo. Jan Chomicki (University at Buffalo) XML databases 1 / 9
XML databases Jan Chomicki University at Buffalo Jan Chomicki (University at Buffalo) XML databases 1 / 9 Outline 1 XML data model 2 XPath 3 XQuery Jan Chomicki (University at Buffalo) XML databases 2
More informationLabeling and Querying Dynamic XML Trees
Labeling and Querying Dynamic XML Trees Jiaheng Lu, Tok Wang Ling School of Computing, National University of Singapore 3 Science Drive 2, Singapore 117543 {lujiahen,lingtw}@comp.nus.edu.sg Abstract With
More informationTrees. Q: Why study trees? A: Many advance ADTs are implemented using tree-based data structures.
Trees Q: Why study trees? : Many advance DTs are implemented using tree-based data structures. Recursive Definition of (Rooted) Tree: Let T be a set with n 0 elements. (i) If n = 0, T is an empty tree,
More information2006 Martin v. Löwis. Data-centric XML. XPath
Data-centric XML XPath XPath Overview Non-XML language for identifying particular parts of XML documents First person element of a document Seventh child element of third person element ID attribute of
More informationIndex-Driven XQuery Processing in the exist XML Database
Index-Driven XQuery Processing in the exist XML Database Wolfgang Meier wolfgang@exist-db.org The exist Project XML Prague, June 17, 2006 Outline 1 Introducing exist 2 Node Identification Schemes and Indexing
More informationH2 Spring B. We can abstract out the interactions and policy points from DoDAF operational views
1. (4 points) Of the following statements, identify all that hold about architecture. A. DoDAF specifies a number of views to capture different aspects of a system being modeled Solution: A is true: B.
More informationXPath. Lecture 36. Robb T. Koether. Wed, Apr 16, Hampden-Sydney College. Robb T. Koether (Hampden-Sydney College) XPath Wed, Apr 16, / 28
XPath Lecture 36 Robb T. Koether Hampden-Sydney College Wed, Apr 16, 2014 Robb T. Koether (Hampden-Sydney College) XPath Wed, Apr 16, 2014 1 / 28 1 XPath 2 Executing XPath Expressions 3 XPath Expressions
More informationUPDATING MULTIDIMENSIONAL XML DOCUMENTS 1)
UPDATING MULTIDIMENSIONAL XML DOCUMENTS ) Nikolaos Fousteris, Manolis Gergatsoulis, Yannis Stavrakas Department of Archive and Library Science, Ionian University, Ioannou Theotoki 72, 4900 Corfu, Greece.
More informationPathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data
PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data Enhua Jiao, Tok Wang Ling, Chee-Yong Chan School of Computing, National University of Singapore {jiaoenhu,lingtw,chancy}@comp.nus.edu.sg
More informationInformatics 1: Data & Analysis
T O Y H Informatics 1: Data & Analysis Lecture 11: Navigating XML using XPath Ian Stark School of Informatics The University of Edinburgh Tuesday 26 February 2013 Semester 2 Week 6 E H U N I V E R S I
More informationThese notes present some properties of chordal graphs, a set of undirected graphs that are important for undirected graphical models.
Undirected Graphical Models: Chordal Graphs, Decomposable Graphs, Junction Trees, and Factorizations Peter Bartlett. October 2003. These notes present some properties of chordal graphs, a set of undirected
More informationReducing the Size of Routing Tables for Large-scale Network Simulation
Reducing the Size of Routing Tables for Large-scale Network Simulation Akihito Hiromori, Hirozumi Yamaguchi, Keiichi Yasumoto, Teruo Higashino and Kenichi Taniguchi Graduate School of Engineering Science,
More informationIndex-Trees for Descendant Tree Queries on XML documents
Index-Trees for Descendant Tree Queries on XML documents (long version) Jérémy arbay University of Waterloo, School of Computer Science, 200 University Ave West, Waterloo, Ontario, Canada, N2L 3G1 Phone
More informationCMSC 754 Computational Geometry 1
CMSC 754 Computational Geometry 1 David M. Mount Department of Computer Science University of Maryland Fall 2005 1 Copyright, David M. Mount, 2005, Dept. of Computer Science, University of Maryland, College
More informationGraph and Digraph Glossary
1 of 15 31.1.2004 14:45 Graph and Digraph Glossary A B C D E F G H I-J K L M N O P-Q R S T U V W-Z Acyclic Graph A graph is acyclic if it contains no cycles. Adjacency Matrix A 0-1 square matrix whose
More informationXML & Databases. Tutorial. 3. XPath Queries. Universität Konstanz. Database & Information Systems Group Prof. Marc H. Scholl
XML & Databases Tutorial Christian Grün, Database & Information Systems Group University of, Winter 2007/08 XPath Introduction navigational access to XML documents sub-language in XQuery, XSLT, or XPointer
More informationSemantic Characterizations of XPath
Semantic Characterizations of XPath Maarten Marx Informatics Institute, University of Amsterdam, The Netherlands CWI, April, 2004 1 Overview Navigational XPath is a language to specify sets and paths in
More informationAn Extended Byte Carry Labeling Scheme for Dynamic XML Data
Available online at www.sciencedirect.com Procedia Engineering 15 (2011) 5488 5492 An Extended Byte Carry Labeling Scheme for Dynamic XML Data YU Sheng a,b WU Minghui a,b, * LIU Lin a,b a School of Computer
More informationQuerying Tree-Structured Data Using Dimension Graphs
Querying Tree-Structured Data Using Dimension Graphs Dimitri Theodoratos 1 and Theodore Dalamagas 2 1 Dept. of Computer Science New Jersey Institute of Technology Newark, NJ 07102 dth@cs.njit.edu 2 School
More information18.3 Deleting a key from a B-tree
18.3 Deleting a key from a B-tree B-TREE-DELETE deletes the key from the subtree rooted at We design it to guarantee that whenever it calls itself recursively on a node, the number of keys in is at least
More informationGraph Algorithms Using Depth First Search
Graph Algorithms Using Depth First Search Analysis of Algorithms Week 8, Lecture 1 Prepared by John Reif, Ph.D. Distinguished Professor of Computer Science Duke University Graph Algorithms Using Depth
More informationOptimum Alphabetic Binary Trees T. C. Hu and J. D. Morgenthaler Department of Computer Science and Engineering, School of Engineering, University of C
Optimum Alphabetic Binary Trees T. C. Hu and J. D. Morgenthaler Department of Computer Science and Engineering, School of Engineering, University of California, San Diego CA 92093{0114, USA Abstract. We
More informationXPath Lecture 34. Robb T. Koether. Hampden-Sydney College. Wed, Apr 11, 2012
XPath Lecture 34 Robb T. Koether Hampden-Sydney College Wed, Apr 11, 2012 Robb T. Koether (Hampden-Sydney College) XPathLecture 34 Wed, Apr 11, 2012 1 / 20 1 XPath Functions 2 Predicates 3 Axes Robb T.
More informationExtending E-R for Modelling XML Keys
Extending E-R for Modelling XML Keys Martin Necasky Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic martin.necasky@mff.cuni.cz Jaroslav Pokorny Faculty of Mathematics and
More informationCSE 530A. B+ Trees. Washington University Fall 2013
CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key
More informationHEAPS ON HEAPS* Downloaded 02/04/13 to Redistribution subject to SIAM license or copyright; see
SIAM J. COMPUT. Vol. 15, No. 4, November 1986 (C) 1986 Society for Industrial and Applied Mathematics OO6 HEAPS ON HEAPS* GASTON H. GONNET" AND J. IAN MUNRO," Abstract. As part of a study of the general
More information9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology
Introduction Chapter 4 Trees for large input, even linear access time may be prohibitive we need data structures that exhibit average running times closer to O(log N) binary search tree 2 Terminology recursive