Schemaless Approach of Mapping XML Document into Relational Database

Ibrahim Dweib 1, Ayman Awadi 2, Seif Elduola Fath Elrhman 1, Joan Lu 1
University of Huddersfield 1, Alkhoja Group 2
ibrahim_thweib@yahoo.com, aymawadi@yahoo.com, seifelduolaf@yahoo.com, J.Lu@hud.ac.uk

Abstract

The Extensible Markup Language (XML) is used for representing and exchanging data over the Internet, but these data need a suitable storage medium. At present, three common technologies can be used to store and retrieve XML documents: native XML databases, Object Oriented Databases (OODB) and Relational Databases (RDB). This paper describes a general method for mapping XML documents to an RDB. The method does not need a DTD or an XML schema, it can be applied as a general solution for any tree data structure and not just for XML data, and it can be used for both data-centric and document-centric documents. Experiments show that the method maintains the document structure easily and at low cost, that rebuilding the original document is straightforward, and that first-level semantic search is achievable on either a single document or all documents.

1. Introduction

The World Wide Web (WWW) is nowadays the most important medium that most people use in their daily activities (e.g., e-business and e-mail). Most enterprises collaborate with other enterprises in long-running read-write workflows through XML-based data exchange technologies. A large amount of data needs to be exchanged through the web in XML format and stored somewhere as a digital copy. Storing this huge amount of web services data is an attractive research area for researchers and database vendors, but the important issue is how to retrieve and query these data efficiently.
Using XML for data exchange and representation together with a Relational Database Management System (RDBMS) for storage and querying is a hybrid approach that addresses most of these data problems. Following this track, the key challenges in previous studies that use fixed shredding are that information from the original XML documents is lost, that reconstructing the original XML documents is very difficult, and that the generated RDB is huge because XML elements are inlined into the relational tables.

Existing techniques for mapping XML to relational databases can generally be classified into two tracks. The first is the structure-centric technique, which relies on the structure of the XML document to guide the mapping process [1, 2]. The second is the schema-centric technique, which uses schema information such as a DTD or an XML schema to derive an efficient relational storage for XML documents [3, 4].

In this research we focus on a method for mapping XML documents to a relational database. To simplify the mapping process, the method does not need a DTD or an XML schema. It can be applied as a general solution for any tree data structure and not just for XML data. In this method, the description of each XML document's structure is kept in a big text field, doc_structure, that contains a coded string. Any change to the document structure must be reflected in this field, such as adding a new tag or property, deleting an existing tag or property, or relocating a tag or property to a different place in the same document.

The method aims to overcome the challenges of fixed shredding as follows:
1) No information is lost while shredding.
2) Reconstruction of the original XML documents is easier and much faster.
3) The XML document structure is maintained.
4) The ordering of XML data is preserved.
5) Semantic search can be performed on the stored data.
The rest of the paper is organized as follows: Section 2 discusses related works, Section 3 presents the theory, method and guidance, Section 4 shows experimental results, and Section 5 draws conclusions and outlines future works.

978-1-4244-2358-3/08/$20.00 2008 IEEE, CIT 2008

2. Related works

Different approaches have been proposed for labelling the XML document tree, since labelling plays a significant role in XML query processing. Global, Local and Dewey labelling were proposed in [1].

In Global labelling, each node is assigned a number that represents the node's absolute position in the document. Dynamic update is very difficult with this label, since all the nodes after an inserted node need to be relabelled, and the parent-child and ancestor-descendant relationships cannot be extracted from the labels.

In Local labelling, each node is assigned a number that represents its relative position among its siblings. A node's position combined with those of its ancestors, as a path vector, uniquely identifies the absolute position of the node within the document. Dynamic update has less overhead than in Global labelling because only the following siblings of the new node need to be renumbered, but extracting the parent-child and ancestor-descendant relationships is still very difficult.

In Dewey labelling, each node is given a label that combines its parent's label with a private integer. This gives an easy way to derive the labels of a node's ancestors from its own label. For example, if an element's label is 1.2.5.3, then its parent is 1.2.5 and the label of the next ancestor is 1.2. However, this method generates a large RDB, since it gives a private label to each node in the tree, and inserting a new node requires updating the labels of the following nodes.

ORDPATH, a hierarchical labelling scheme implemented in Microsoft SQL Server 2005, was introduced in [5]. It labels the nodes of an XML tree without requiring a schema. It supports insertion of new nodes at arbitrary positions in the XML tree without updating the labels of existing nodes, since it assigns only positive odd integers to nodes during initial loading and reserves even and negative integer values for later insertions into the existing tree.
The advantages of ORDPATH labels are that updates incur no relabelling overhead and that the structure of the XML document is preserved. However, ORDPATH does not support semantic search or path search.

A clustering-based scheme for labelling XML trees was proposed in [2]. In this scheme, a group of elements is labelled instead of a single element. Elements are separated into groups such that all sibling elements are in one group; one label is assigned to the group instead of one label to each element, and the group is stored in one relational record. This reduces the size of the database needed to store the XML tree, because the mapping process generates fewer records than labelling methods that need a label for each node. However, the method suffers from the dynamic update problem: after a new node is inserted, many nodes must be relabelled. It also does not support semantic search or path search.

Oracle XML DB [6] and IBM DB2 XML Extender [7] provide a schemaless way of storing XML data: the entire XML document is stored in a column of the CLOB data type. There is no need for XML-to-SQL query translation, since XML queries are processed much as in a native XML database.

3. Theory method and guidance

The main goal of mapping XML documents to an RDB is to combine the main advantages of the two technologies by finding an efficient method for storing, retrieving and querying the huge amount of web data exchanged through the Internet. This section presents a method for mapping XML documents to an RDB. The method does not need a DTD or an XML schema, and it can be applied as a general solution for any tree, not just for XML data.

3.1. Theory guidance

The main mathematical concepts that are used in this research are presented in this section.
Definition 1: An XML tree is composed of many sub-trees at different levels; it can be defined as follows:

$$T = \bigcup_{i=1}^{n} (E_i, A_i, X_i, r_{i-1})$$

where $i = 1, 2, \ldots, n$ represents the levels of the XML tree, 0 represents the root, and
$E_i$ is a finite set of elements in level $i$,
$A_i$ is a finite set of attributes in level $i$,
$X_i$ is a finite set of texts in level $i$,
$r_{i-1}$ is the root of the sub-tree of level $i$.

Definition 2: A dynamic fragment (shred) $df(i)$ is defined as the attributes and texts (leaf children) of sub-tree $i$ of the XML tree plus its root $r_{i-1}$, as follows:

$$df(i) = (A_i, X_i, r_{i-1})$$

where $A_i$ is a finite set of attributes in level $i$, $X_i$ is a finite set of texts in level $i$, and $r_{i-1}$ is the root of the sub-tree of level $i$.

3.2. Method employed

The method is simple: a global label approach is applied to give a label to the XML elements and attributes. The label is unique for each token, i.e., each document element, tag, or property, but labels are not required to be in sequence as in [1]. Any initial traversal of the XML document (in-order, pre-order, or post-order) can be used. No re-labelling of XML document items is needed if a new item or sub-tree is added. The relational schema consists of two tables: the "documents" table keeps the structure information of the XML documents, and the "tokens" table keeps the contents of the XML documents. The following sub-sections give more details about the approach.

3.3. Design framework

The solution is built on a simple idea based on Definitions 1 and 2:

1. A master table for documents is needed. It is called "documents" and keeps information about the documents themselves. At minimum it has the structure documents(doc_id, doc_structure); additional fields may be added to keep other information about the document, such as dates, statistics and types.
   a. doc_id is a unique id generated per document to identify documents.
   b. doc_structure is a big text field containing a coded string that describes the document's structure. Any change to the document structure must be reflected in this field, such as adding a new tag or property, deleting an existing tag or property, or relocating a tag or property to a different place in the same document (details below).
2. A second table is needed to store the actual contents of all documents. Documents are shredded into pieces of data called tokens; each document element, tag, or property is considered a token. At minimum the tokens table has the structure tokens(doc_id, token_id, token_name, token_value).
   a. token_id is the generated primary id of each token.
   b. doc_id is the foreign key linking the tokens table to the documents table.
   c. token_name is the tag name or the property name as found in the original XML document.
   d. token_value is the text value of the XML tag or property.

The rules for constructing the doc_structure field are as follows:

1. The doc_structure field is where the document structure is maintained.
2. It consists of a long series of related keys.
3. Each key starts with an alphabetic character, say the letter 'T' for an element (child) and the letter 'A' for an attribute; this is necessary to delimit keys in the sequence. The letter is followed by a number representing the token_id that the key refers to, e.g., T120 is a key referring to the token in the tokens table whose token_id = 120.
4. If the token has properties defined in the original XML document, then the key representing this token in doc_structure is followed by a set of keys defining these properties. For example, T120A12A17A2 is a valid key-string, which reads as: token number 120 has three properties defined by tokens number 12, 17 and 2, and these properties appear in the original document in this order.
5. If the token has children tags (a sub-tree) in the original XML document, then these children are represented as a key-string surrounded by angle brackets. For example, T120<T12T7<T2T1>T77> is a valid string that reads as: token 120 has three sub-tags, in order token 12, token 7 and token 77, and token 7 itself has two sub-tags, numbered 2 and 1, in the given order.

So the relational schema for this method has two tables, as shown in Figure 1.

Documents(doc_id, doc_structure)
Tokens(doc_id, token_id, token_name, token_value)
Figure 1: Relational schema

3.3.1. Mapping XML to RDB algorithm

The mapping algorithm uses the W3C Document Object Model (DOM) to represent XML documents in memory before mapping them. It also uses a stack to traverse the XML document, pushing the children of each node onto the stack in reverse order so as to preserve their order in the doc_structure field. Figure 2 shows the MapXMLtoRDB algorithm, which takes a DOM Document containing the XML document to be mapped and a DocID as input and produces RDB tables as output. Line 5 pushes the root element of the document onto the stack.
The do loop (lines 6-28) constructs the doc_structure field and inserts the XML tokens (elements and attributes) into the tokens table. In line 7, the top of the stack is popped. If the popped item is ">", all the children of the parent element have been added to the database, and the ">" symbol is appended to the "struc" string (lines 8-10). Otherwise (i.e., the popped item is a node), the element's name and value are inserted into the database and its id is appended to the "struc" string. If the element has attributes, they are all inserted into the database and their ids are appended to the "struc" string.
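The key-string rules of Section 3.3 (rules 3 to 5) can be made concrete with a small parser. The following is only a sketch under stated assumptions, not part of the published method: the names Node and parse_doc_structure are hypothetical, and the parser assumes that an element's attribute keys always come directly after its 'T' key and before its bracketed children, as in the rule examples.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory view of one element key in doc_structure."""
    token_id: int                                   # token_id of the element
    attrs: list = field(default_factory=list)       # attribute token_ids, in order
    children: list = field(default_factory=list)    # child element Nodes, in order


# One lexical unit: a 'T' or 'A' key with its token_id, or an angle bracket.
_UNIT = re.compile(r"([TA])(\d+)|([<>])")


def parse_doc_structure(struc):
    """Parse a doc_structure key-string into a nested Node tree."""
    root = None
    parents = []   # elements whose bracketed child list is currently open
    last = None    # element whose keys were read most recently
    pos = 0
    while pos < len(struc):
        m = _UNIT.match(struc, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = m.end()
        kind, number, bracket = m.groups()
        if kind == "T":
            last = Node(int(number))
            if parents:
                parents[-1].children.append(last)
            elif root is None:
                root = last
            else:
                raise ValueError("more than one root element")
        elif kind == "A":
            if last is None:
                raise ValueError("attribute key must follow an element key")
            last.attrs.append(int(number))
        elif bracket == "<":
            if last is None:
                raise ValueError("'<' must follow an element's keys")
            parents.append(last)
            last = None
        else:  # ">" closes the child list of the innermost open element
            if not parents:
                raise ValueError("unbalanced '>'")
            parents.pop()
            last = None
    if parents:
        raise ValueError("unbalanced '<'")
    return root


rule4 = parse_doc_structure("T120A12A17A2")
rule5 = parse_doc_structure("T120<T12T7<T2T1>T77>")
```

Here rule4 holds token 120 with the attribute ids 12, 17 and 2, and rule5 holds token 120 with the children 12, 7 and 77, where token 7 has the children 2 and 1. A tree of this kind is one possible starting point for rebuilding a document from the two tables.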

1  MapXMLtoRDB Algorithm
2  Input: DOM Document containing the XML document to be mapped, DocID.
3  Output: XML tokens inserted in relational database tables.
4  Begin
5    Initialize stack with document Element
6    Do loop
7      Pop top of stack Element
8      If Element = ">"
9        Append ">" to struc string
10     Else
11       Write token to database, element name, element value
12       Get token id for the added token
13       Append Id to struc string
14       If element has attributes
15         For each attribute in attributes collection do
16           Add to database as token, att. name & att. value
17           Get token id
18           Append token id to struc string
19         End for
20       End if
21       If element has child nodes
22         Append "<" to struc string
23         Push ">" to stack
24         Push all children to stack in reverse order
25       End if
26     End if
27     If stack is empty exit loop
28   End loop
29   Write struc string to database
30 End algorithm
Figure 2: Mapping XML to RDB algorithm

Lines 21-25 check whether the element has children. If so, "<" is appended to the "struc" string, ">" is pushed onto the stack, and all the children are pushed onto the stack in reverse order. Line 27 checks the status of the stack; if it is empty, the do loop terminates. After that, the "struc" string is written to the database (documents table). All of an element's children are enclosed in angle brackets. The nested brackets distinguish the document's levels, while the letters 'T' and 'A' distinguish an element's children from its attributes. The reconstruction algorithm, which builds the XML document from the relational database, is omitted for lack of space.

3.4. Theory implementation on simple case study

In this subsection, we give an example to illustrate the mapping method described in Subsection 3.3. Consider the XML document in Figure 3. Any XML document can be represented as a rooted, labelled tree; Figure 4 presents the XML tree for the document in Figure 3.
In our method, each node in the tree is given a generated label in pre-order traversal. This label is unique, since it identifies each token in the document.

<books>
  <book id="11210" category="fiction">
    <author id="a1" sex="m">M. John</author>
    <name>Computer Science 101</name>
  </book>
  <book id="11211">
    <author>A. Mark</author>
    <name>Applied Math 101</name>
    <subject>Math</subject>
  </book>
</books>
Figure 3: XML document

[Figure 4: A tree representation of the XML document in Figure 3. The nodes are labelled 99 (books) to 111 (subject) in pre-order.]
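The stack-based mapping of Section 3.3.1 can be tried on the document in Figure 3 with a short Python sketch. It is an illustration under assumptions rather than the authors' implementation (which used Visual Basic 6 and Microsoft Access): the function names and the first_token_id parameter are hypothetical, the two tables are returned as plain tuples instead of being written to a database, and an element's token_value is taken to be its own text content, or None when it has none.

```python
from xml.dom import minidom

END = ">"  # stack marker: every child of the current parent has been emitted


def direct_text(element):
    """Concatenate the element's own text nodes; None when there is no text."""
    text = "".join(n.data for n in element.childNodes
                   if n.nodeType == n.TEXT_NODE).strip()
    return text or None


def map_xml_to_rows(xml_text, doc_id, first_token_id=1):
    """Return (documents_row, token_rows) for one XML document."""
    dom = minidom.parseString(xml_text)
    tokens = []   # rows of tokens(doc_id, token_id, token_name, token_value)
    struc = []    # pieces of the doc_structure key-string
    next_id = first_token_id

    stack = [dom.documentElement]
    while stack:
        item = stack.pop()
        if item is END:
            struc.append(">")
            continue
        # Element: one 'T' key, then one 'A' key per attribute.
        tokens.append((doc_id, next_id, item.tagName, direct_text(item)))
        struc.append(f"T{next_id}")
        next_id += 1
        attrs = item.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            tokens.append((doc_id, next_id, attr.name, attr.value))
            struc.append(f"A{next_id}")
            next_id += 1
        # Child elements go on the stack in reverse so they pop in document order.
        children = [n for n in item.childNodes if n.nodeType == n.ELEMENT_NODE]
        if children:
            struc.append("<")
            stack.append(END)
            stack.extend(reversed(children))
    return (doc_id, "".join(struc)), tokens


FIGURE_3 = """<books>
  <book id="11210" category="fiction">
    <author id="a1" sex="m">M. John</author>
    <name>Computer Science 101</name>
  </book>
  <book id="11211">
    <author>A. Mark</author>
    <name>Applied Math 101</name>
    <subject>Math</subject>
  </book>
</books>"""

documents_row, token_rows = map_xml_to_rows(FIGURE_3, doc_id=10, first_token_id=99)
print(documents_row[1])
# T99<T100A101A102<T103A104A105T106>T107A108<T109T110T111>>
```

With doc_id = 10 and the first label set to 99, the sketch yields thirteen token rows labelled 99 to 111 in pre-order. In a real database the token ids would more likely come from a generated key than from a counter.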

After transformation, this document is represented by a single record in the documents table, with doc_id = 10 for example, as in Figure 5. The tokens table contains the records for the document contents, as shown in Figure 6. The doc_structure field for this document is:

T99<T100A101A102<T103A104A105T106>T107A108<T109T110T111>>

doc_id  doc_structure
10      T99<T100A101A102<T103A104A105T106>T107A108<T109T110T111>>
Figure 5: Documents table

doc_id  token_id  token_name  token_value
10      99        books       Null
10      100       book        Null
10      101       id          11210
10      102       category    fiction
10      103       author      M. John
10      104       id          a1
10      105       sex         m
10      106       name        Computer Science 101
10      107       book        Null
10      108       id          11211
10      109       author      A. Mark
10      110       name        Applied Math 101
10      111       subject     Math
Figure 6: Tokens table

Notice that the document structure is easy to maintain in this way. For example, to delete the "sex" property of the first author, knowing that this property is A105, all that is needed is a simple string operation that removes the substring A105 from the doc_structure field. Adding a new book tag between the existing ones is nothing more than inserting the proper code into the above string at the right place. For example, suppose the newly added book has the structure in Figure 7 and has been shredded into the records in the tokens table shown in Figure 8.

<book id="106">
  <author>abc</author>
  <name>Applied Geo 106</name>
</book>
Figure 7: A portion of an XML document

doc_id  token_id  token_name  token_value
10      200       book        Null
10      201       id          106
10      202       author      abc
10      203       name        Applied Geo 106
Figure 8: Equivalent tokens table

Its equivalent key-string is T200A201<T202T203>. This new substring is inserted into doc_structure at the place that reflects its order in the original document; the doc_structure field therefore becomes:

T99<T100A101A102<T103A104A105T106>T200A201<T202T203>T107A108<T109T110T111>>
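The two maintenance operations described above amount to plain string edits on doc_structure, which can be sketched as follows. The helper names remove_key and insert_before are hypothetical, and remove_key is only meant for attribute keys or elements without children; removing an element that has children would also have to remove its bracketed sub-string.

```python
import re


def remove_key(struc, key):
    """Delete one leaf key such as 'A105' from a doc_structure string."""
    # The lookahead keeps 'A10' from matching the front of 'A105'.
    new, count = re.subn(re.escape(key) + r"(?!\d)", "", struc, count=1)
    if count == 0:
        raise ValueError(f"{key} not found in doc_structure")
    return new


def insert_before(struc, fragment, anchor):
    """Insert a key-string fragment immediately before the key `anchor`."""
    m = re.search(re.escape(anchor) + r"(?!\d)", struc)
    if m is None:
        raise ValueError(f"{anchor} not found in doc_structure")
    return struc[:m.start()] + fragment + struc[m.start():]


ORIGINAL = "T99<T100A101A102<T103A104A105T106>T107A108<T109T110T111>>"

# Delete the "sex" attribute (A105) of the first author.
without_sex = remove_key(ORIGINAL, "A105")

# Add the book of Figure 7 between the two existing books, i.e. before T107.
with_new_book = insert_before(ORIGINAL, "T200A201<T202T203>", "T107")

print(without_sex)
# T99<T100A101A102<T103A104T106>T107A108<T109T110T111>>
print(with_new_book)
# T99<T100A101A102<T103A104A105T106>T200A201<T202T203>T107A108<T109T110T111>>
```

Matching whole keys rather than raw substrings matters because token ids share prefixes: a naive replace of A10 would otherwise corrupt A101 to A108.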
4. Experimental results

An Intel Core 2 Duo computer with a 2 GHz CPU, 1 GB RAM and a 256 MB shared cache, running Windows Vista, is used for the experimental test. Visual Basic 6 is used as the software development kit, with Microsoft Access 2003 as the target relational database. Five XML documents of different sizes are used in the experiment. The performance metrics are the time spent mapping XML documents to the relational database and the time spent reconstructing these documents from the relational database. The experiment is repeated five times and the mean of those times is reported, to obtain realistic and accurate results. Table 1 shows both times for the different document sizes. The data is taken from the XML data repository available at the web site of the School of Computer Science and Engineering, University of Washington [8]. The results in Table 1 show that the time for mapping an XML document to the RDB and reconstructing it from the RDB is acceptable, and that the relation between the document size and the mapping and reconstruction times is linear.

5. Conclusion and future works

By using this method, we are able to maintain the document structure easily and at low cost; rebuilding the original document is straightforward; and first-level semantic search is achievable on either a single document or all documents.

Table 1: The time spent for mapping XML documents and the time for reconstructing them.

Document size               4 KB         28 KB       64 KB      602 KB     1 MB
Mapping time (secs)         0.01988238   0.14977736  0.3551445  3.574335   5.85278136
Reconstructing time (secs)  0.018990234  0.44980958  1.926836   18.305544  32.06255104

Complex semantic search is not easily achievable with this structure; for example, we cannot write a single select statement that retrieves all records where the id of the author equals a given value. The next step of this research is to improve the method to support complex semantic search and to differentiate between XML data types (i.e., strings, dates, integers), so that less-than and greater-than queries can be applied. We will then test the method intensively and compare its performance with other methods in the literature.

References

[1] I. Tatarinov, S. Viglas, K. Beyer, J. Shanmugasundaram, E. Shekita, and C. Zhang, (2002), "Storing and Querying Ordered XML using a Relational Database System", in Proc. of SIGMOD, pp. 204-215.
[2] S. Soltan, and M. Rahgozar, (2006), "A Clustering-based Scheme for Labeling XML Trees", IJCSNS International Journal of Computer Science and Network Security, Vol. 6, No. 9A.
[3] K. Fujimoto, M. Yoshikawa, D. Kha, and T. Amagasa, (2005), "A Mapping Scheme of XML Documents into Relational Databases Using Schema-based Path Identifiers", Proceedings of the 2005 International Workshop on Challenges in Web Information and Integration (WIRI'05), 2005 IEEE.
[4] G. Xing, X. Zhonghang, and A. Douglas, (2007), "X2R: A System for Managing XML Documents and Key Constraints Using RDBMS", in Proc. of ACMSE 2007, March 23-24, 2007, Winston-Salem, North Carolina, USA.
[5] P. O'Neil, E. O'Neil, S. Pal, I. Cseri, G. Schaller, and N. Westbury, (2004), "ORDPATHs: Insert-Friendly XML Node Labels", SIGMOD 2004, June 13-18, Paris, France.
[6] Oracle, (n. a.), Oracle XML DB Developer's Guide 10g. Retrieved 1st Nov 2006, from http://www.databasebooks.us/oracle_0016.php
[7] IBM, (n. a.), DB2 XML Extender. Retrieved Oct 10, 2006, from http://www-306.ibm.com/software/data/db2/extenders/xmlext/index.html
[8] U. Washington, Computer Science & Engineering Research, (2002), XMLData Repository. Retrieved Jun 15, 2007 from http://www.cs.washington.edu/research/xmldatasets/