An approach to the model-based fragmentation and relational storage of XML-documents
Christian Süß
Fakultät für Mathematik und Informatik, Universität Passau, Passau, Germany

Abstract

A flexible method to store XML documents in relational or object-relational databases is presented that is based on an adaptable fragmentation. Whereas most known approaches decompose XML documents into minimal units, we propose to store fragments of variable granularity ranging from single elements to whole documents. Different fragmentation strategies, depending on the specific access and query requirements, can be applied to the same XML documents. Experiments have shown that the response times are much better than those for the complete decomposition. Furthermore, our storage model, which is based on directed acyclic graphs, facilitates the reuse of XML subdocuments and supports different views on XML documents.

1 Introduction

Today there are numerous approaches to storing and querying XML data. Besides storing XML data in the file system, which is straightforward but does not support querying, object-oriented database systems, systems based on semi-structured data models, and native XML systems are available. However, comparative studies [4, 6, 7, 8] indicate that relational or object-relational database systems are still very competitive.

In general, there are many ways to store XML data in relational or object-relational database systems. For example, in [1, 2] the user or database administrator can decide how to store XML elements in relational tables. Appropriate relational schemata can also be derived automatically from a given XML schema, e.g. a DTD [3, 8]. There are also generic approaches that store documents without any user interaction and without requiring any kind of schema. They support the storage and retrieval of different types of XML documents, e.g. XSL documents, using the same relational schema.
For example, [4] presents different strategies to completely decompose arbitrary XML documents into relational tables. They show a good overall performance, but reconstructing complete documents is rather expensive.

In relational databases, clustering and indexing are applied to compensate for the performance loss caused by the (more or less) rigorous decomposition of the relational schemata which is required for normalization purposes. However, up to now it is not clear what clustering and indexing of XML documents exactly means. In this paper we propose a flexible fragmentation of XML documents that at least avoids unnecessary joins when reconstructing frequently accessed parts of the document.

Fragmentation has also been suggested in [9]. However, [9] relies on a fragmentation specification supplied by the user and stored in the source document or the DTD, whereas our generic approach is completely independent of such hard-wired directives. Furthermore, in our approach multiple fragmentation strategies can be defined and applied to the document whenever appropriate. The fragmentation strategies presented in this paper are guided by an underlying domain model which is used to specify the fragments at a high level of abstraction. In contrast to [4], we preserve the distinction between subelements and attributes as well as between subelements and references. Furthermore, when applying our approach to object-relational databases we can benefit from specific user-defined datatypes [8] and indexing techniques [1, 2]. Finally, our storage model is based on directed acyclic graphs. In contrast to tree-based storage structures, it facilitates the reuse of XML subdocuments and supports different views on XML documents.

The rest of this paper is organized as follows: In Section 2 we formally define the fragmentation of XML documents. In Section 3 relational database schemata as well as algorithms to store and retrieve XML fragments are introduced.
Section 4 focuses on model-based fragmentation strategies and presents experimental results. The paper concludes with a short summary in Section 5.
2 Fragmentation of XML documents

Definition 1 (Notation for XML documents) Let doc be an XML document. Then elements(doc) denotes the set of elements in doc, where elements having the same tag name are distinguished by unique subindices. tree(doc) is the tree structure of the elements defined by doc. root(doc) is the root element of doc and therefore of tree(doc), too. Let $e \in elements(doc)$ be an element of doc. Then doc(e) denotes the subdocument of doc having e as its root element. tag(e) is the tag name of e. value(e, attr) is the value of the attribute attr of e. xml(e) denotes the XML representation or serialization of e in doc, including the opening and closing tags of e and all of its contents. In particular, xml(root(doc)) is the XML representation of the entire document doc. $children(e) \subseteq elements(doc)$ denotes the set of elements directly contained in e, excluding e itself. Conversely, $parent(e) \in elements(doc)$ denotes the element which contains e. Obviously, parent is not defined for the root element.

    <course title="XML Tutorial">
      <section title="introduction">
        <motivation>
          <>XML is needed for many reasons...</>
          <image src="xml.gif"/>
        </motivation>
      </section>
      <section title="basics">
        <definition>
          <>An XML document is...</>
        </definition>
        <example>
          <>Consider the following XML document...</>
        </example>
      </section>
    </course>

Figure 1: Sample XML document

Definition 2 (Fragment) Let doc be an XML document. Then $f \subseteq elements(doc)$ is a fragment of doc iff the subgraph of tree(doc) which is induced by f is connected. This subgraph forms a tree denoted by tree(f) and having the root element root(f).

Definition 3 (Fragmentation) Let doc be an XML document. A fragmentation $F = \{f_1, \ldots, f_n\}$ of doc is a partitioning of elements(doc) into fragments $f_1$ to $f_n$, i.e., the $f_i$ are pairwise disjoint and their union equals elements(doc). roots(F) denotes the set of root elements of the fragments in F. The elements of roots(F) uniquely determine F.
We assume that each fragment $f_i$ in F, $1 \le i \le n$, has a unique identifier $id(f_i)$. The XML representation xml(f) of a fragment $f \in F$ is the result of replacing, in xml(root(f)), the XML representation xml(e) of the root element e of any other fragment $g \in F$ occurring as a subtree in f by the element <tag(e) fragment-id="id(g)" />, where tag(e) is the tag name of the root element of g and id(g) is the unique identifier of g. Thus, essentially every fragment subtree is replaced by a reference to the fragment.

Let doc be an XML document and let F be a fragmentation of doc. Then the tree structure tree(doc) induces a graph graph(F) on the fragments of F which is a tree, too, because in XML each element can only be contained in exactly one other element. However, we allow directed acyclic fragmentation graphs, in which each fragment can have more than one parent fragment.

Definition 4 (Graph of a fragmentation) Let F be a fragmentation and $f, g \in F$. Then f is called a parent fragment of g and g a child fragment of f in F iff in the tree of the original document doc it holds that $parent(root(g)) \in f$. children(f) denotes the child fragments of f in F and parents(f) denotes the parent fragments of f in F.

Example 5 (Fragments) Figure 2 shows the tree structure of the XML document of figure 1 using four fragments with root elements course, motivation, definition and example, identified by the unique identifiers 1 through 4. The XML representation of fragment 1, which contains three references to the child fragments, can be found in figure 3.
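The definitions above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the function names (assign_ids, fragment_xml, fragmentize), the compact sample document, and the tag name "section" are assumptions made for illustration. Given a predicate that selects the fragment roots, the sketch numbers the fragments in document order and builds xml(f) by replacing every nested fragment root with a reference element carrying a fragment-id attribute.

```python
import xml.etree.ElementTree as ET

# Compact stand-in for the sample document; the tag name "section" is an assumption.
DOC = (
    '<course title="XML Tutorial">'
    '<section title="introduction"><motivation><image src="xml.gif"/></motivation></section>'
    '<section title="basics"><definition/><example/></section>'
    '</course>'
)

def assign_ids(root, is_fragment_root):
    """Number the fragment roots 1..n in document order; the document root always qualifies."""
    ids = {}
    for elem in root.iter():  # iter() visits elements in document (pre-)order
        if elem is root or is_fragment_root(elem):
            ids[elem] = len(ids) + 1
    return ids

def fragment_xml(elem, ids, top=True):
    """Copy elem, replacing every nested fragment root by a reference element."""
    if not top and elem in ids:
        ref = ET.Element(elem.tag, {"fragment-id": str(ids[elem])})
        ref.tail = elem.tail
        return ref
    clone = ET.Element(elem.tag, dict(elem.attrib))
    clone.text = elem.text
    clone.tail = None if top else elem.tail  # tostring() would also emit the root's tail
    for child in elem:
        clone.append(fragment_xml(child, ids, top=False))
    return clone

def fragmentize(root, is_fragment_root):
    """Return {fragment id: xml(f)} for the fragmentation determined by the chosen roots."""
    ids = assign_ids(root, is_fragment_root)
    return {fid: ET.tostring(fragment_xml(e, ids), encoding="unicode")
            for e, fid in ids.items()}

# Fragment roots as in Example 5: course (the document root), motivation, definition, example.
ROOT_TAGS = {"motivation", "definition", "example"}
FRAGMENTS = fragmentize(ET.fromstring(DOC), lambda e: e.tag in ROOT_TAGS)
```

With these assumed inputs, FRAGMENTS[1] holds the course fragment with three reference elements, and FRAGMENTS[2] holds the motivation fragment including its image element.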
Figure 2: Tree structure with four fragments (1: course, 2: motivation, 3: definition, 4: example)

    <course title="XML Tutorial">
      <section title="introduction">
        <motivation fragment-id="2" />
      </section>
      <section title="basics">
        <definition fragment-id="3" />
        <example fragment-id="4" />
      </section>
    </course>

Figure 3: XML representation of a fragment

3 Relational storage of XML fragments

Definition 6 (Relational schema) To store the graph of a fragmentation in a relational database we use a relational schema consisting of the three tables fragment(id, tag, xml), attribute(id, name, value) and child(parid, childid, pos) where underlined attributes denote primary keys. Attribute id of table attribute and attribute parid as well as attribute childid of table child are foreign keys of the table fragment. Attribute xml is of a type appropriate to store large character sequences (e.g. CLOB). Note that we show only the essential attributes.

Algorithm 7 (Storage of an XML document) Let doc be an XML document and let F be a fragmentation of doc. Then doc is stored according to F in a relational database using the schema of definition 6 by the following algorithm:
1. For each fragment $f \in F$, insert into table fragment the tuple (id(f), tag(root(f)), xml(f)).
2. For each attribute-value pair name=value of a root element root(f) of a fragment f, insert into table attribute the tuple (id(f), name, value).
3. For each pair of fragments f and g, where g is the i-th child fragment of f according to the element ordering in the original document, insert into table child the tuple (id(f), id(g), i).

Example 8 (Storage of an XML document) Figure 4 shows the extension of the tables after storing the XML document of figure 1 according to the fragmentation shown in figure 2. For the complete contents of column xml in the first row of table fragment see figure 3.

Algorithm 9 (Retrieval of a fragment) Let the XML document doc be stored as described in algorithm 7 according to a fragmentation F.
Let e = root(f) be the root element of a fragment $f \in F$. Then we obtain the subdocument xml(e), which contains e and all its XML contents, using the following algorithm:
1. Execute the SQL query SELECT xml FROM fragment WHERE id = id(f). The result of this query is the XML representation xml(f) of fragment f.
2. Replace each element <tag(e_2) fragment-id="id" /> by the XML representation xml(e_2) of the root element e_2 of the fragment with identifier id, obtained by a recursive application of this algorithm.

According to algorithm 9, tables attribute and child are not necessary for retrieving a document, because their information is also contained in the XML representation of the fragments. Nevertheless, they are important for the efficient retrieval of fragments and for navigation in the fragmentation graph.

    fragment
    id  tag         xml
    1   course      <course...
    2   motivation  <motivation>...
    3   definition  <definition>...
    4   example     <example>...

    attribute
    id  name   value
    1   title  XML Tutorial

    child
    parid  childid  pos
    1      2        1
    1      3        2
    1      4        3

Figure 4: Extension of tables for fragmented sample XML document
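Algorithms 7 and 9 can be tried out with the following sketch, which uses Python's sqlite3 module and an in-memory database. It is an illustration under several assumptions rather than the paper's implementation: SQLite's TEXT type stands in for CLOB, the choice of primary key is a guess, the fragment contents are simplified (plain text instead of nested elements, and "section" as an assumed tag name), and the reference pattern REF expects a reference element to carry only the fragment-id attribute.

```python
import re
import sqlite3

# Schema of definition 6; TEXT replaces CLOB, key choices are assumptions.
SCHEMA = """
CREATE TABLE fragment  (id INTEGER PRIMARY KEY, tag TEXT, xml TEXT);
CREATE TABLE attribute (id INTEGER REFERENCES fragment(id), name TEXT, value TEXT);
CREATE TABLE child     (parid INTEGER REFERENCES fragment(id),
                        childid INTEGER REFERENCES fragment(id), pos INTEGER);
"""

# Simplified fragments of the sample document: id -> (tag, root attributes, xml(f)).
FRAGMENTS = {
    1: ("course", {"title": "XML Tutorial"},
        '<course title="XML Tutorial">'
        '<section title="introduction"><motivation fragment-id="2" /></section>'
        '<section title="basics"><definition fragment-id="3" />'
        '<example fragment-id="4" /></section></course>'),
    2: ("motivation", {}, '<motivation><image src="xml.gif" /></motivation>'),
    3: ("definition", {}, "<definition>An XML document is...</definition>"),
    4: ("example", {}, "<example>Consider the following...</example>"),
}
CHILDREN = {1: [2, 3, 4]}  # parent fragment id -> child fragment ids in document order

def store(conn, fragments, children):
    """Algorithm 7: fill the tables fragment, attribute and child."""
    for fid, (tag, attrs, xml) in fragments.items():
        conn.execute("INSERT INTO fragment VALUES (?, ?, ?)", (fid, tag, xml))
        conn.executemany("INSERT INTO attribute VALUES (?, ?, ?)",
                         [(fid, name, value) for name, value in attrs.items()])
    for parid, kids in children.items():
        conn.executemany("INSERT INTO child VALUES (?, ?, ?)",
                         [(parid, kid, pos) for pos, kid in enumerate(kids, start=1)])

# Matches a reference element such as <motivation fragment-id="2" />.
REF = re.compile(r'<[^<>\s/]+\s+fragment-id="(\d+)"\s*/>')

def retrieve(conn, fid):
    """Algorithm 9: fetch xml(f) and recursively expand every fragment reference."""
    (xml,) = conn.execute("SELECT xml FROM fragment WHERE id = ?", (fid,)).fetchone()
    return REF.sub(lambda m: retrieve(conn, int(m.group(1))), xml)

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
store(conn, FRAGMENTS, CHILDREN)
document = retrieve(conn, 1)  # the reassembled document without any references
```

Each reference costs one additional query in this sketch, which is consistent with the observation that coarser fragments reduce the work needed to reconstruct a document.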
Figure 5: Relative retrieval time for (a) uninformed fragmentation, plotted against levels per fragment, and (b) model-based fragmentation, plotted against chapters per document. Legend entries: Single Elements, ContentModules, Cou/Sec/Ex, Cou/Sec, Course, 1 large Fragment.

4 Model-based fragmentation strategies

Definition 10 (Fragmentation strategy) Let doc be an XML document. A fragmentation strategy $S \subseteq elements(doc)$ is a subset of the set of elements of doc specifying the root elements of the fragments.

Strategies specify those elements which are stored in separate fragments. They use predicates which every element has to satisfy to qualify as a root element of a fragment, e.g. match patterns for tag names and attribute values of elements (see example 12) or more sophisticated structural conditions (see example 11).

Strategies can also be based on the nesting depth, i.e., how many levels of nested elements are stored in a fragment. Figure 5 (a) shows the experimental results of retrieving an entire XML document having 14 levels of element nesting. As expected, the rightmost strategy, where all 14 levels (i.e. the whole document) are stored in one fragment, retrieves the document about 15 times faster than the leftmost strategy, where each fragment contains only one element, i.e., the document is completely decomposed. These strategies are uninformed and therefore produce unpredictable fragmentations, resulting in the different response times shown in figure 5 (a).

To meet the specific access and query requirements of a given application, we define fragmentation strategies which are guided by the domain model of the application.
For example, figure 6 shows a simplified part of the teachware model presented in [10], which describes learning material using specialized DTDs [5] (see the sample document in figure 1).

Figure 6: Domain model for teachware (node labels: Course, Module, StructureModule, ContentModule, motivation, definition, paragraph, illustration, example, exercise, remark)

Example 11 (Sequential Leaf Access) From the teachware domain model we know that a learner mostly accesses the leaf sections of a course document doc sequentially. Those sections directly contain at least one ContentModule, while their enclosing sections do not directly contain any ContentModule. To meet this specific access requirement, we define a fragmentation strategy $S_1$ that stores such leaf sections in separate fragments:

$$S_1 = \{root(doc)\} \cup \{\, e \in elements(doc) \mid tag(e) = \text{section} \land (\exists c \in children(e) : tag(c) \in CM) \land \forall p \in ancestors(e) : \neg(\exists c \in children(p) : tag(c) \in CM) \,\}$$

Here ancestors(e) is the set of all ancestors of e in tree(doc), and CM = {motivation, definition, ...} is the set of all tag names of ContentModule elements according to the given model.
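The strategy of example 11 can be written as a predicate over the document tree. The following Python sketch is one possible reading of the formula, not the paper's code: the function names, the nested sample course, the tag name "section", and the three-element subset chosen for CM are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed subset of the ContentModule tag names (the paper's list is longer).
CM = {"motivation", "definition", "example"}

# Hypothetical course with nested sections; only the innermost ones hold ContentModules.
COURSE = ET.fromstring(
    '<course title="XML Tutorial">'
    '<section title="introduction"><motivation/></section>'
    '<section title="basics">'
    '<section title="syntax"><definition/></section>'
    '<section title="usage"><example/></section>'
    '</section>'
    '</course>'
)

def has_cm_child(elem):
    """True iff elem directly contains at least one ContentModule element."""
    return any(child.tag in CM for child in elem)

def strategy_s1(root, section_tag="section"):
    """Return the fragment roots: the document root plus all leaf sections."""
    parent = {child: p for p in root.iter() for child in p}

    def ancestors(elem):
        while elem in parent:
            elem = parent[elem]
            yield elem

    selected = [root]
    for elem in root.iter():
        if elem is root:
            continue
        if (elem.tag == section_tag
                and has_cm_child(elem)
                and not any(has_cm_child(p) for p in ancestors(elem))):
            selected.append(elem)
    return selected

S1_TITLES = [e.get("title") for e in strategy_s1(COURSE)]
```

In this sample, the section titled "basics" is not selected because it contains no ContentModule directly, whereas its two subsections are.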
Example 12 (Supporting Reuse of Modules) We know that authors usually reuse ContentModules in more than one course document. Thus, we define a corresponding strategy $S_2 = \{\, e \in elements(doc) \mid tag(e) \in CM \,\}$ which stores each ContentModule in a separate fragment. Note that our storage model, which is based on directed acyclic graphs (see definition 4), directly supports the reuse of XML subdocuments.

Figure 5 (b) shows the experimental results of retrieving complete course documents containing one to five chapters, i.e., top-level sections. The response time for $S_2$, depicted in the second line from the top, is significantly lower than the response time of the complete decomposition, depicted in the top line.

From examples 11 and 12 we can see that there can be more than one fragmentation strategy for a single XML document. Both strategies $S_1$ and $S_2$ can be applied whenever appropriate. Moreover, we can define a combined strategy $S_1 \cup S_2$ which produces a finer-grained fragmentation and which facilitates the sequential leaf access of example 11 as well as the reuse of ContentModules of example 12.

Our approach allows the granularity of fragments to be adjusted when appropriate. For example, statistical information such as the most frequently reused subdocuments can be used to dynamically determine the appropriate fragmentation strategy. Documents can thus be re-fragmented and re-organized in storage when needed without having to change the document source itself. This supports the cooperative authoring of an XML document base.

Example 13 (Dynamically Changing Strategies) To support reuse at the storage layer and to improve $S_2$ of example 12 accordingly, we define the strategy $S_3 = \{\, e \in ReusedElements \,\}$ where ReusedElements is the dynamically changing set of reused elements.

5 Conclusion

In this paper we have presented a generic, model-based approach to the relational storage of XML documents.
Arbitrary XML documents are automatically stored in a database, where they are clustered in fragments of different sizes that can be tailored to specific access and query requirements. We have specified different strategies which make use of information provided by an underlying model. Experimental results show that corresponding queries can be answered more efficiently than when using a complete decomposition. The storage model is based on directed acyclic graphs. In contrast to a tree model, it directly supports multiple hierarchies which facilitate the reuse of XML subdocuments and allow the definition of different views on the same XML document.

Future work will focus on the concept of views, which could only be touched upon in this paper. Furthermore, we will study the application of our approach to the modularization of XML documents.

References

[1] S. Banerjee et al. Oracle8i - The XML Enabled Data Management System. In Proc. ICDE 2000, San Diego, USA.
[2] J. M. Cheng and J. Xu. XML and DB2. In Proc. ICDE 2000, San Diego, USA.
[3] A. Deutsch et al. Storing Semistructured Data with STORED. In Proc. ACM SIGMOD, Philadelphia, PA, 1999.
[4] D. Florescu and D. Kossmann. A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database. Technical Report 3680, INRIA, France.
[5] C. Süß. Learning Material Markup Language.
[6] A. Schmidt et al. Efficient Relational Storage and Retrieval of XML Documents. In Proc. WebDB 2000, Dallas, USA.
[7] J. Shanmugasundaram et al. Relational Databases for Querying XML Documents: Limitations and Opportunities. In Proc. 25th VLDB Conference, Edinburgh, Scotland.
[8] T. Shimura et al. Storage and Retrieval of XML Documents Using Object-Relational Databases. In Proc. DEXA 99, Florence, Italy.
[9] B. Surjanto et al. XML Content Management based on Object-Relational Database Technology. In Proc. WISE 2000, Hong Kong.
[10] C. Süß et al. Metamodeling for Web-Based Teachware Management. In Advances in Conceptual Modeling. ER 99 Workshop on the World-Wide Web and Conceptual Modeling, Paris, France, 1999.
AN EFFECTIVE APPROACH FOR MODIFYING XML DOCUMENTS IN THE CONTEXT OF MESSAGE BROKERING R. Gururaj, Indian Institute of Technology Madras, gururaj@cs.iitm.ernet.in M. Giridhar Reddy, Indian Institute of
More informationXML RETRIEVAL. Introduction to Information Retrieval CS 150 Donald J. Patterson
Introduction to Information Retrieval CS 150 Donald J. Patterson Content adapted from Manning, Raghavan, and Schütze http://www.informationretrieval.org OVERVIEW Introduction Basic XML Concepts Challenges
More informationStriped Grid Files: An Alternative for Highdimensional
Striped Grid Files: An Alternative for Highdimensional Indexing Thanet Praneenararat 1, Vorapong Suppakitpaisarn 2, Sunchai Pitakchonlasap 1, and Jaruloj Chongstitvatana 1 Department of Mathematics 1,
More informationA Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar
A Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar ABSTRACT Management of multihierarchical XML encodings has attracted attention of a
More informationA FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS
A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS SRIVANI SARIKONDA 1 PG Scholar Department of CSE P.SANDEEP REDDY 2 Associate professor Department of CSE DR.M.V.SIVA PRASAD 3 Principal Abstract:
More informationTrees. Carlos Moreno uwaterloo.ca EIT https://ece.uwaterloo.ca/~cmoreno/ece250
Carlos Moreno cmoreno @ uwaterloo.ca EIT-4103 https://ece.uwaterloo.ca/~cmoreno/ece250 Standard reminder to set phones to silent/vibrate mode, please! Announcements Part of assignment 3 posted additional
More informationSelectively Storing XML Data in Relations
Selectively Storing XML Data in Relations Wenfei Fan 1 and Lisha Ma 2 1 University of Edinburgh and Bell Laboratories 2 Heriot-Watt University Abstract. This paper presents a new framework for users to
More informationXML Systems & Benchmarks
XML Systems & Benchmarks Christoph Staudt Peter Chiv Saarland University, Germany July 1st, 2003 Main Goals of our talk Part I Show up how databases and XML come together Make clear the problems that arise
More informationPASSWORDS TREES AND HIERARCHIES. CS121: Relational Databases Fall 2017 Lecture 24
PASSWORDS TREES AND HIERARCHIES CS121: Relational Databases Fall 2017 Lecture 24 Account Password Management 2 Mentioned a retailer with an online website Need a database to store user account details
More informationAn UML-XML-RDB Model Mapping Solution for Facilitating Information Standardization and Sharing in Construction Industry
An UML-XML-RDB Model Mapping Solution for Facilitating Information Standardization and Sharing in Construction Industry I-Chen Wu 1 and Shang-Hsien Hsieh 2 Department of Civil Engineering, National Taiwan
More informationTreewidth and graph minors
Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under
More informationQuerying and Updating XML with XML Schema constraints in an RDBMS
Querying and Updating XML with XML Schema constraints in an RDBMS H. Georgiadis I. Varlamis V. Vassalos Department of Informatics Athens University of Economics and Business Athens, Greece {harisgeo,varlamis,vassalos}@aueb.gr
More informationUNIT 3 XML DATABASES
UNIT 3 XML DATABASES XML Databases: XML Data Model DTD - XML Schema - XML Querying Web Databases JDBC Information Retrieval Data Warehousing Data Mining. 3.1. XML Databases: XML Data Model The common method
More informationComputational Optimization ISE 407. Lecture 16. Dr. Ted Ralphs
Computational Optimization ISE 407 Lecture 16 Dr. Ted Ralphs ISE 407 Lecture 16 1 References for Today s Lecture Required reading Sections 6.5-6.7 References CLRS Chapter 22 R. Sedgewick, Algorithms in
More informationXML Technologies. Doc. RNDr. Irena Holubova, Ph.D. Web page:
XML Technologies Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz Web page: http://www.ksi.mff.cuni.cz/~holubova/nprg036/ Outline Introduction to XML format, overview of XML technologies DTD XML
More informationData Centric Integrated Framework on Hotel Industry. Bridging XML to Relational Database
Data Centric Integrated Framework on Hotel Industry Bridging XML to Relational Database Introduction extensible Markup Language (XML) is a promising Internet standard for data representation and data exchange
More informationA Framework for Generic Integration of XML Sources. Wolfgang May Institut für Informatik Universität Freiburg Germany
A Framework for Generic Integration of XML Sources Wolfgang May Institut für Informatik Universität Freiburg Germany may@informatik.uni-freiburg.de KRDB Workshop, Rome 15.9.2001 OVERVIEW ffl Integration
More informationIndexing XML Data with ToXin
Indexing XML Data with ToXin Flavio Rizzolo, Alberto Mendelzon University of Toronto Department of Computer Science {flavio,mendel}@cs.toronto.edu Abstract Indexing schemes for semistructured data have
More informationQuery Optimization in Distributed Databases. Dilşat ABDULLAH
Query Optimization in Distributed Databases Dilşat ABDULLAH 1302108 Department of Computer Engineering Middle East Technical University December 2003 ABSTRACT Query optimization refers to the process of
More informationSCHEMA BASED XML SECURITY: RBAC APPROACH
SCHEMA BASED XML SECURITY: RBAC APPROACH Xinwen Zhang, Jaehong Park, and Ravi Sandhu George Mason University {xzhang6, jpark2, sandhu) } @gmu.edu Abstract Security of XML instance is a basic problem, especially
More informationFoundations of Computer Science Spring Mathematical Preliminaries
Foundations of Computer Science Spring 2017 Equivalence Relation, Recursive Definition, and Mathematical Induction Mathematical Preliminaries Mohammad Ashiqur Rahman Department of Computer Science College
More informationAlgorithms in Systems Engineering ISE 172. Lecture 16. Dr. Ted Ralphs
Algorithms in Systems Engineering ISE 172 Lecture 16 Dr. Ted Ralphs ISE 172 Lecture 16 1 References for Today s Lecture Required reading Sections 6.5-6.7 References CLRS Chapter 22 R. Sedgewick, Algorithms
More informationIndexing XML Data Stored in a Relational Database
Indexing XML Data Stored in a Relational Database Shankar Pal, Istvan Cseri, Oliver Seeliger, Gideon Schaller, Leo Giakoumakis, Vasili Zolotov VLDB 200 Presentation: Alex Bradley Discussion: Cody Brown
More informationSFilter: A Simple and Scalable Filter for XML Streams
SFilter: A Simple and Scalable Filter for XML Streams Abdul Nizar M., G. Suresh Babu, P. Sreenivasa Kumar Indian Institute of Technology Madras Chennai - 600 036 INDIA nizar@cse.iitm.ac.in, sureshbabuau@gmail.com,
More informationAmol Deshpande, UC Berkeley. Suman Nath, CMU. Srinivasan Seshan, CMU
Amol Deshpande, UC Berkeley Suman Nath, CMU Phillip Gibbons, Intel Research Pittsburgh Srinivasan Seshan, CMU IrisNet Overview of IrisNet Example application: Parking Space Finder Query processing in IrisNet
More informationThese notes present some properties of chordal graphs, a set of undirected graphs that are important for undirected graphical models.
Undirected Graphical Models: Chordal Graphs, Decomposable Graphs, Junction Trees, and Factorizations Peter Bartlett. October 2003. These notes present some properties of chordal graphs, a set of undirected
More informationRELATIONAL STORAGE FOR XML RULES
RELATIONAL STORAGE FOR XML RULES A. A. Abd El-Aziz Research Scholar Dept. of Information Science & Technology Anna University Email: abdelazizahmed@auist.net Professor A. Kannan Dept. of Information Science
More informationName: 1. (a) SQL is an example of a non-procedural query language.
Name: 1 1. (20 marks) erminology: or each of the following statements, state whether it is true or false. If it is false, correct the statement without changing the underlined text. (Note: there might
More informationOptimizing distributed XML queries through localization and pruning
Optimizing distributed XML queries through localization and pruning Patrick Kling pkling@cs.uwaterloo.ca M. Tamer Özsu tozsu@cs.uwaterloo.ca University of Waterloo David R. Cheriton School of Computer
More informationCSE 530A. B+ Trees. Washington University Fall 2013
CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key
More informationA Clustering-based Scheme for Labeling XML Trees
84 IJCSNS International Journal of Computer Science and Network Security, VOL.6 No.9A, September 2006 A Clustering-based Scheme for Labeling XML Trees Sadegh Soltan, and Masoud Rahgozar, University of
More informationDistributed DBMS. Concepts. Concepts. Distributed DBMS. Concepts. Concepts 9/8/2014
Distributed DBMS Advantages and disadvantages of distributed databases. Functions of DDBMS. Distributed database design. Distributed Database A logically interrelated collection of shared data (and a description
More informationOpen Access The Three-dimensional Coding Based on the Cone for XML Under Weaving Multi-documents
Send Orders for Reprints to reprints@benthamscience.ae 676 The Open Automation and Control Systems Journal, 2014, 6, 676-683 Open Access The Three-dimensional Coding Based on the Cone for XML Under Weaving
More informationLecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1
CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture
More informationA JAVA-BASED SYSTEM FOR XML DATA PROTECTION* E. Bertino, M. Braun, S. Castano, E. Ferrari, M. Mesiti
CHAPTER 2 Author- A JAVA-BASED SYSTEM FOR XML DATA PROTECTION* E. Bertino, M. Braun, S. Castano, E. Ferrari, M. Mesiti Abstract Author- is a Java-based system for access control to XML documents. Author-
More informationReview -Chapter 4. Review -Chapter 5
Review -Chapter 4 Entity relationship (ER) model Steps for building a formal ERD Uses ER diagrams to represent conceptual database as viewed by the end user Three main components Entities Relationships
More informationRank-aware XML Data Model and Algebra: Towards Unifying Exact Match and Similar Match in XML
Proceedings of the 7th WSEAS International Conference on Multimedia, Internet & Video Technologies, Beijing, China, September 15-17, 2007 253 Rank-aware XML Data Model and Algebra: Towards Unifying Exact
More informationModule 4. Implementation of XQuery. Part 2: Data Storage
Module 4 Implementation of XQuery Part 2: Data Storage Aspects of XQuery Implementation Compile Time + Optimizations Operator Models Query Rewrite Runtime + Query Execution XML Data Representation XML
More informationLabeling Dynamic XML Documents: An Order-Centric Approach
1 Labeling Dynamic XML Documents: An Order-Centric Approach Liang Xu, Tok Wang Ling, and Huayu Wu School of Computing National University of Singapore Abstract Dynamic XML labeling schemes have important
More informationQuery Solvers on Database Covers
Query Solvers on Database Covers Wouter Verlaek Kellogg College University of Oxford Supervised by Prof. Dan Olteanu A thesis submitted for the degree of Master of Science in Computer Science Trinity 2018
More informationAxiomatization of the Evolution of XML Database Schema
Programming and Computer Software, Vol. 9, No. 3, 003, pp. 7. Translated from Programmirovanie, Vol. 9, No. 3, 003. Original Russian Text Copyright 003 by Coox. Axiomatization of the Evolution of XML Database
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval http://informationretrieval.org IIR 10: XML Retrieval Hinrich Schütze, Christina Lioma Center for Information and Language Processing, University of Munich 2010-07-12
More informationQuerying transformed XML documents: Determining a sufficient fragment of the original document
Querying transformed XML documents: Determining a sufficient fragment of the original document Sven Groppe, Stefan Böttcher University of Paderborn Faculty 5 (Computer Science, Electrical Engineering &
More information