Some aspects of references behaviour when querying XML with XQuery


© B. Khvostichenko, boris.khv@pobox.spbu.ru
B. Novikov, borisnov@acm.org

Abstract

During XQuery query evaluation, the query output is built as a completely new structure containing copies of all elements that satisfy the query. At the same time, references contained in these data point to the original elements, hence the cross-references between extracted data are not represented in the query output. In this paper, we propose a method for preserving cross-references, thus enriching the tree produced by query evaluation with a graph structure.

1 Introduction

XML is considered the most relevant standard for data representation and exchange among various applications on the Internet. The popularity of XML is in large part due to its flexibility for representing many kinds of information. As the importance of XML has increased, a series of standards has grown up around it, defined by the World Wide Web Consortium (W3C): XML Schema, XPath, XSLT, etc.

XML data are often quite heterogeneous and distribute their meta-data throughout the document. XML documents may contain many levels of nested elements and can represent missing information simply by the absence of an element. For these reasons several special languages were designed for querying XML data. One of these languages is XQuery [10], being developed by the XML Query Working Group.

An XML document can be viewed as a tree, but the XML standard allows references between elements to be defined using ID-IDREF pairs. Thus, an XML document turns into a directed graph whose edges represent either a parent-child relationship or an IDREF-ID reference.

This paper describes a problem that arises with XML references when querying XML data with XQuery. It is organized as follows: Section 2 describes the problem in detail, Section 3 reviews other approaches to dealing with references,
Section 4 presents the suggested solution, and Section 5 summarizes the topic.

This work was supported by the Russian Foundation for Basic Research.
Proceedings of the Spring Young Researcher's Colloquium on Database and Information Systems SYRCoDIS, St.-Petersburg, Russia, 2004

2 Problem Statement

During query evaluation, XQuery builds a new structure containing copies of all elements that satisfy the query. Since XQuery uses the XPath data model described in [9], it copies IDs and IDREFs as attributes from the original document into the newly created structure. It does not treat references in any special way, considering them ordinary element attributes. This can lead to the tricky situations described below (we assume that if the original document contains an element that references another through an IDREF, then it also contains an element with the corresponding ID):

1. The ID attribute is copied into the new structure; the IDREF is not. The problem is that the same ID already exists in the original document.
2. The IDREF attribute is copied into the new structure; the ID is not (a hanging IDREF). As XML is a semistructured data model, we do not treat the absence of some elements or attributes as a problem.
3. Both the ID and the corresponding IDREF are copied into the new structure. This combines the first case with the need to distinguish the original reference edge from the new one.

Problem: Define and implement a modified query evaluation technique for XQuery that is able to handle cross-references between extracted XML elements.

2.1 Example

As an example we use a fragment of a musical encyclopaedia, shown in Figure 1. It stores bands and musicians in one catalog, together with the casts of each band and the instruments each person can play. Each cast contains references to the musicians who played in it. Thus, different casts may reference the same musician (e.g. Robert Fripp is in every King Crimson cast). Moreover, different bands can reference the same musician (e.g. Greg Lake appeared in both Emerson, Lake and Palmer and King Crimson). The queries to this encyclopaedia that correspond to the problems described above are as follows:

A. Select all guitar players (Figure 2; results are in Figure 5).
B. Choose the first cast of King Crimson (Figure 3; results are in Figure 6).
C. Find performers that contain "Lake" in their name (Figure 4; results are in Figure 7).

    <catalog>
      <band name="Emerson, Lake and Palmer" style="rock" ID="elp">
        <artist IDREF="emerson">K. Emerson</artist>
        ...
      </band>
      <band name="King Crimson" style="rock" ID="kingcrimson">
        <cast number="1" years="...">
          <artist IDREF="giles">M. Giles</artist>
          ...
        </cast>
        <cast number="2" years="...">
          <artist IDREF="collins">M. Collins</artist>
          <artist IDREF="haskell">G. Haskell</artist>
          ...
        </cast>
        ...
      </band>
      <musician name="Keith Emerson" ID="emerson">
        <instrument>piano</instrument>
        <instrument>organ</instrument>
      </musician>
      <musician name="Greg Lake" ID="lake">
        ...
      </musician>
      <musician name="Carl Palmer" ID="palmer">
        <instrument>percussion</instrument>
      </musician>
      <musician name="Robert Fripp" ID="fripp">
        ...
      </musician>
      ...
    </catalog>

Figure 1: Music encyclopaedia (fragment)

    for $i in document("catalog.xml")/catalog/musician
    where $i/instrument = "Guitar"
    return
      <guitar-player>
        { $i/@ID }
        <name>{ data($i/@name) }</name>
        { for $j in $i/instrument return $j }
      </guitar-player>

Figure 2: Query A. Select all guitar players

    for $i in document("catalog.xml")/catalog/band[@name="King Crimson"]
    return
      for $j in $i/cast[@number="1"]
      return $j

Figure 3: Query B. Choose first cast of King Crimson

3 Related Work

3.1 XQuery

As mentioned above, this problem arose with the XQuery query language [10] and the XPath data model [9]. XQuery does not offer any special tool for treating IDREFs. Moreover, in the XPath data model every element has an intrinsic identity, but this internal identifier exists independently of the ID assigned by the user.
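The gap between node identity and the user-assigned ID can be illustrated outside XQuery as well. The following is a minimal sketch in Python, assuming the standard xml.etree.ElementTree module as a stand-in for a query processor. Python object identity only approximates node identity in the XPath data model, and the variable names are illustrative.

```python
import copy
import xml.etree.ElementTree as ET

# One element of the original document (taken from Figure 1).
original = ET.fromstring('<musician name="Greg Lake" ID="lake"/>')

# Stand-in for query evaluation, which copies matching elements
# into a newly built structure.
copied = copy.deepcopy(original)

# The copy is a different node ...
is_same_node = copied is original
# ... yet it carries the same user-assigned ID value.
has_same_id = copied.get("ID") == original.get("ID")

print(is_same_node, has_same_id)  # prints: False True
```

The two nodes are distinct, but nothing in the attribute value records that, which is the situation of case 1 in Section 2.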

    for $i in document("catalog.xml")//*
    where contains(data($i/@name), "Lake")
    return $i

Figure 4: Query C. Find performers that contain "Lake" in their name

    <guitar-player ID="lake">
      <name>Greg Lake</name>
      ...
    </guitar-player>
    <guitar-player ID="fripp">
      <name>Robert Fripp</name>
      ...
    </guitar-player>

Figure 5: Query A evaluation result

    <cast number="1" years="...">
      <artist IDREF="giles">M. Giles</artist>
      ...
    </cast>

Figure 6: Query B evaluation result

    <band name="Emerson, Lake and Palmer" style="rock" ID="elp">
      <artist IDREF="emerson">K. Emerson</artist>
      ...
    </band>
    <musician name="Greg Lake" ID="lake">
      ...
    </musician>

Figure 7: Query C evaluation result

3.2 Lorel

Lorel [1] is a query language for the Lore [6] database system for semistructured data. Originally, Lore was developed over its own data model called OEM. A substantial difference between OEM and XML is that OEM represents a graph with directed edges, while XML represents a tree. Any object in OEM has a unique identifier and can be referenced by this identifier. Thus, this problem does not exist for the OEM data model. After migrating Lore from OEM to XML [5], the developers addressed this difference and introduced a new XML-based Lore data model in which the user can choose how to interpret IDREFs (as graph edges or as text attributes).

3.3 ODMG

ODMG proposed the Object Data Model and the Object Query Language in its ODMG 3.0 standard [3]. They allow objects to reference each other, but the ODMG data model is strictly typed and has integrity constraints. Internal references are treated as graph edges, so the problems described in Section 2 do not arise.

It was suggested to use XML as an exchange format between applications that comply with the ODMG standard [2]. However, this proposal does not use the IDREF-ID mechanism for referencing within an XML document; it uses a special structure to encode relationships. XML serves as an intermediate language for transferring objects from one system to another, so there is no need to query this data.

4 Proposed Solution

The proposed solution to the problems described in Section 2 consists of treating IDREF-ID references as graph edges, with respect to the XML data model and validity constraints.

4.1 Copied ID (without corresponding IDREF)

The problem here relates to the ID validity constraint of XML [8]. If the original document is used together with the created one later in query processing, ID ambiguity appears. The proposed solution is to copy the ID attribute from the original element and change its value (to comply with the ID validity constraint). To provide access to the original element, we suggest creating an IDREF attribute in the newly created element whose value equals the ID of the original element. Thus, whenever one needs to access the original element from the new document, one traverses two graph edges (to the created element and then to the original one).

4.2 Copied IDREF (without corresponding ID)

This is not a problem, because XML is a semistructured data model and can be incomplete by its nature. One can consider such a hanging IDREF in two ways:

1. As a reference to the element in the original document (if the original document is used afterwards).
2. As a true hanging reference, when the original document is not used further in query processing.

4.3 Copied both ID and corresponding IDREF

This problem also relates to the ID validity constraint, but is more involved. When the entire

edge is copied from one graph to another, both elements might not be the same as they were in the original document. Moreover, the type of both elements may change (and XQuery is sensitive to element types). We therefore suggest proceeding as follows. For the referenced element, its ID is changed and an IDREF reference to the original element is created. For the referencing element, the IDREF value is changed to the new ID value of the referenced element. This solves the ID validity problem and allows the originally referenced element to be accessed if needed.

4.4 Implementation

The proposed implementation is straightforward. First, the XQuery processor evaluates the query result as a temporary document. Second, it searches this document for IDs that were extracted from the original document and creates a dictionary of changes, where the original ID is the key and the new ID is the value. This dictionary must ensure the uniqueness of all IDs. Third, the query processor traverses the temporary document once more and does the following:

1. If an element has an ID that is to be replaced, the ID is replaced with the one taken from the dictionary, and an IDREF attribute is added with the value of the original ID (a reference to the original element).
2. If an element has an IDREF referencing an ID that is subject to change, the IDREF is changed to the appropriate value (taken from the dictionary).

    <guitar-player ID="lake-01" IDREF="lake">
      <name>Greg Lake</name>
      ...
    </guitar-player>
    <guitar-player ID="fripp-01" IDREF="fripp">
      <name>Robert Fripp</name>
      ...
    </guitar-player>

Figure 8: Query A updated evaluation result

    <cast number="1" years="...">
      <artist IDREF="giles">M. Giles</artist>
      ...
    </cast>

Figure 9: Query B updated evaluation result (no changes)

    <band name="Emerson, Lake and Palmer" style="rock" ID="elp-01" IDREF="elp">
      <artist IDREF="emerson">K. Emerson</artist>
      <artist IDREF="lake-01">G. Lake</artist>
      ...
    </band>
    <musician name="Greg Lake" ID="lake-01" IDREF="lake">
      ...
    </musician>

Figure 10: Query C updated evaluation result

4.5 Example

Let us now show how our proposal is reflected in the examples given in Section 2.
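As a concrete illustration, the two-pass procedure of Section 4.4 can be sketched in Python and run on the Query C result. This is a minimal sketch under stated assumptions, not the implementation used by the authors: it relies on the standard xml.etree.ElementTree module, the function name preserve_references, the <result> wrapper element and the two-digit suffix format are hypothetical, and each original ID is assumed to occur at most once in the query result.

```python
import xml.etree.ElementTree as ET


def preserve_references(result_root, original_ids):
    """Rewrite IDs and IDREFs of a temporary query result in place.

    Returns the dictionary of changes (original ID -> new ID).
    Assumes each original ID occurs at most once in the result.
    """
    # IDs that are already taken: those of the original document
    # and any ID present in the temporary result.
    used = set(original_ids)
    used.update(e.get("ID") for e in result_root.iter() if e.get("ID"))

    # First pass: build the dictionary of changes.
    changes = {}
    for elem in result_root.iter():
        old_id = elem.get("ID")
        if old_id in original_ids and old_id not in changes:
            n = 1
            while f"{old_id}-{n:02d}" in used:
                n += 1
            changes[old_id] = f"{old_id}-{n:02d}"
            used.add(changes[old_id])

    # Second pass: apply the two rules of Section 4.4.
    for elem in result_root.iter():
        old_id = elem.get("ID")
        ref = elem.get("IDREF")
        if ref in changes:
            # Rule 2: redirect the reference to the renamed copy.
            elem.set("IDREF", changes[ref])
        if old_id in changes:
            # Rule 1: rename the copy and point it at the original.
            # (This overwrites an IDREF the element may already have,
            # a case the text does not discuss.)
            elem.set("ID", changes[old_id])
            elem.set("IDREF", old_id)
    return changes


# IDs visible in the Figure 1 fragment.
ORIGINAL_IDS = {"elp", "kingcrimson", "emerson", "lake", "palmer", "fripp"}

# Query C result, wrapped in a hypothetical <result> root element.
QUERY_C_RESULT = """
<result>
  <band name="Emerson, Lake and Palmer" style="rock" ID="elp">
    <artist IDREF="emerson">K. Emerson</artist>
    <artist IDREF="lake">G. Lake</artist>
  </band>
  <musician name="Greg Lake" ID="lake"/>
</result>
"""

root = ET.fromstring(QUERY_C_RESULT)
changes = preserve_references(root, ORIGINAL_IDS)
band = root.find("band")
musician = root.find("musician")
print(changes)  # prints: {'elp': 'elp-01', 'lake': 'lake-01'}
```

The reference to emerson is left untouched because the emerson element itself was not copied into the result, which corresponds to the hanging IDREF of Section 4.2.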
For simplicity, we use a -N suffix for newly created IDs, where N is a number.

A. The first example shows the problem with duplicate IDs. We need to change the IDs of the lake and fripp elements and create references from them to the original elements (Figure 8).
B. The second case is not really a problem, so nothing changes there (Figure 9).
C. The third example combines two problems. We need to solve the problem with the duplicated elp ID (as in the first case) and then the problem with the cross-reference between the artist in elp and lake (Figure 10).

5 Conclusion

In this paper we investigated the behaviour of IDREF-ID references in an XML document during XQuery query evaluation. We found a problem with extracting ID and IDREF attributes from the original document into a new one and proposed a solution that can be applied to the existing XQuery data model.

While our straightforward approach can serve as a solution for the moment, there is another aspect that this article does not cover. The current version of XML allows the use of XML Namespaces [7], but there are no details on IDREF-ID references between namespaces. Moreover, it is unclear whether an XML element could have different IDs in different namespaces. If it can, one might use namespaces to propose another solution to the problem described in this article. So the question is open and might be a good subject for further investigation.

Furthermore, one can think of optimizing queries using type information (such as a DTD schema) [4]. This raises another problem, as reference types might change, and is also a good subject for investigation.

References

[1] Serge Abiteboul, Dallan Quass, Jason McHugh, Jennifer Widom, and Janet L. Wiener. The Lorel query language for semistructured data. International Journal on Digital Libraries, 1(1):68-88. Also stanford.edu/pub/papers/lorel96.ps from Stanford DB group on-line publications http://www-db.stanford.edu/pub/.
[2] G. M. Bierman. Using XML as an object interchange format.
[3] R. G. G. Cattell, Douglas K. Barry, Mark Berler, Jeff Eastman, David Jordan, Craig Russell, Olaf Schadow, Torsten Stanienda, and Fernando Velez. The Object Data Standard: ODMG 3.0. Elsevier Science and Technology Books.
[4] Chang-Won Park, Jun-Ki Min, and Chin-Wan Chung. Structural function inlining technique for structurally recursive XML queries. In Proc. VLDB 2002.
[5] R. Goldman, J. McHugh, and J. Widom. From semistructured data to XML: Migrating the Lore data model and query language. In Workshop on the Web and Databases (WebDB '99), pages 25-30.
[6] J. McHugh, S. Abiteboul, R. Goldman, D. Quass, and J. Widom. Lore: A database management system for semistructured data. SIGMOD Record (ACM Special Interest Group on Management of Data), 26(3):54-66, September.
[7] Namespaces in XML.
[8] Extensible Markup Language (XML) 1.0 (Third Edition).
[9] XQuery 1.0 and XPath 2.0 Data Model.
[10] XQuery 1.0: An XML Query Language.

Semistructured Data Store Mapping with XML and Its Reconstruction

Semistructured Data Store Mapping with XML and Its Reconstruction Semistructured Data Store Mapping with XML and Its Reconstruction Enhong CHEN 1 Gongqing WU 1 Gabriela Lindemann 2 Mirjam Minor 2 1 Department of Computer Science University of Science and Technology of

More information

XML Query Processing and Optimization

XML Query Processing and Optimization XML Query Processing and Optimization Bartley D. Richardson Department of Electrical & Computer Engineering and Computer Science University of Cincinnati December 16, 2005 Outline Background XML As A Data

More information

Indexing XML Data with ToXin

Indexing XML Data with ToXin Indexing XML Data with ToXin Flavio Rizzolo, Alberto Mendelzon University of Toronto Department of Computer Science {flavio,mendel}@cs.toronto.edu Abstract Indexing schemes for semistructured data have

More information

Path Query Reduction and Diffusion for Distributed Semi-structured Data Retrieval+

Path Query Reduction and Diffusion for Distributed Semi-structured Data Retrieval+ Path Query Reduction and Diffusion for Distributed Semi-structured Data Retrieval+ Jaehyung Lee, Yon Dohn Chung, Myoung Ho Kim Division of Computer Science, Department of EECS Korea Advanced Institute

More information

Aspects of an XML-Based Phraseology Database Application

Aspects of an XML-Based Phraseology Database Application Aspects of an XML-Based Phraseology Database Application Denis Helic 1 and Peter Ďurčo2 1 University of Technology Graz Insitute for Information Systems and Computer Media dhelic@iicm.edu 2 University

More information

METAXPath. Utah State University. From the SelectedWorks of Curtis Dyreson. Curtis Dyreson, Utah State University Michael H. Böhen Christian S.

METAXPath. Utah State University. From the SelectedWorks of Curtis Dyreson. Curtis Dyreson, Utah State University Michael H. Böhen Christian S. Utah State University From the SelectedWorks of Curtis Dyreson December, 2001 METAXPath Curtis Dyreson, Utah State University Michael H. Böhen Christian S. Jensen Available at: https://works.bepress.com/curtis_dyreson/11/

More information

Creating a Mediated Schema Based on Initial Correspondences

Creating a Mediated Schema Based on Initial Correspondences Creating a Mediated Schema Based on Initial Correspondences Rachel A. Pottinger University of Washington Seattle, WA, 98195 rap@cs.washington.edu Philip A. Bernstein Microsoft Research Redmond, WA 98052-6399

More information

XML-QE: A Query Engine for XML Data Soures

XML-QE: A Query Engine for XML Data Soures XML-QE: A Query Engine for XML Data Soures Bruce Jackson, Adiel Yoaz {brucej, adiel}@cs.wisc.edu 1 1. Introduction XML, short for extensible Markup Language, may soon be used extensively for exchanging

More information

Chapter 13 XML: Extensible Markup Language

Chapter 13 XML: Extensible Markup Language Chapter 13 XML: Extensible Markup Language - Internet applications provide Web interfaces to databases (data sources) - Three-tier architecture Client V Application Programs Webserver V Database Server

More information

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS

A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS SRIVANI SARIKONDA 1 PG Scholar Department of CSE P.SANDEEP REDDY 2 Associate professor Department of CSE DR.M.V.SIVA PRASAD 3 Principal Abstract:

More information

A Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar

A Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar A Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar ABSTRACT Management of multihierarchical XML encodings has attracted attention of a

More information

A Web-Based OO Platform for the Development of Didactic Multimedia Collaborative Applications

A Web-Based OO Platform for the Development of Didactic Multimedia Collaborative Applications A Web-Based OO Platform for the Development of Didactic Multimedia Collaborative Applications David A. Fuller, Luis A. Guerrero, Jenny Zegarra {dfuller, luguerre, jzegarra}@ing.puc.cl Computer Science

More information

XML in Databases. Albrecht Schmidt. al. Albrecht Schmidt, Aalborg University 1

XML in Databases. Albrecht Schmidt.   al. Albrecht Schmidt, Aalborg University 1 XML in Databases Albrecht Schmidt al@cs.auc.dk http://www.cs.auc.dk/ al Albrecht Schmidt, Aalborg University 1 What is XML? (1) Where is the Life we have lost in living? Where is the wisdom we have lost

More information

An Efficient XML Index Structure with Bottom-Up Query Processing

An Efficient XML Index Structure with Bottom-Up Query Processing An Efficient XML Index Structure with Bottom-Up Query Processing Dong Min Seo, Jae Soo Yoo, and Ki Hyung Cho Department of Computer and Communication Engineering, Chungbuk National University, 48 Gaesin-dong,

More information

Folder(Inbox) Message Message. Body

Folder(Inbox) Message Message. Body Rening OEM to Improve Features of Query Languages for Semistructured Data Pavel Hlousek Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic Abstract. Semistructured data can

More information

Approaches. XML Storage. Storing arbitrary XML. Mapping XML to relational. Mapping the link structure. Mapping leaf values

Approaches. XML Storage. Storing arbitrary XML. Mapping XML to relational. Mapping the link structure. Mapping leaf values XML Storage CPS 296.1 Topics in Database Systems Approaches Text files Use DOM/XSLT to parse and access XML data Specialized DBMS Lore, Strudel, exist, etc. Still a long way to go Object-oriented DBMS

More information

Pre-Discussion. XQuery: An XML Query Language. Outline. 1. The story, in brief is. Other query languages. XML vs. Relational Data

Pre-Discussion. XQuery: An XML Query Language. Outline. 1. The story, in brief is. Other query languages. XML vs. Relational Data Pre-Discussion XQuery: An XML Query Language D. Chamberlin After the presentation, we will evaluate XQuery. During the presentation, think about consequences of the design decisions on the usability of

More information

Introduction to Semistructured Data and XML

Introduction to Semistructured Data and XML Introduction to Semistructured Data and XML Chapter 27, Part D Based on slides by Dan Suciu University of Washington Database Management Systems, R. Ramakrishnan 1 How the Web is Today HTML documents often

More information

Overview of the Integration Wizard Project for Querying and Managing Semistructured Data in Heterogeneous Sources

Overview of the Integration Wizard Project for Querying and Managing Semistructured Data in Heterogeneous Sources In Proceedings of the Fifth National Computer Science and Engineering Conference (NSEC 2001), Chiang Mai University, Chiang Mai, Thailand, November 2001. Overview of the Integration Wizard Project for

More information

Interactive Query and Search in Semistructured Databases æ

Interactive Query and Search in Semistructured Databases æ Interactive Query and Search in Semistructured Databases Roy Goldman, Jennifer Widom Stanford University froyg,widomg@cs.stanford.edu www-db.stanford.edu Abstract Semistructured graph-based databases have

More information

Nested XPath Query Optimization for XML Structured Document Database

Nested XPath Query Optimization for XML Structured Document Database Nested XPath Query Optimization for XML Structured Document Database Radha Senthilkumar #, G. B. Rakesh, N.Sasikala, M.Gowrishankar, A. Kannan 3, Department of Information Technology, MIT Campus, Anna

More information

Full-Text and Structural XML Indexing on B + -Tree

Full-Text and Structural XML Indexing on B + -Tree Full-Text and Structural XML Indexing on B + -Tree Toshiyuki Shimizu 1 and Masatoshi Yoshikawa 2 1 Graduate School of Information Science, Nagoya University shimizu@dl.itc.nagoya-u.ac.jp 2 Information

More information

Lab Assignment 3 on XML

Lab Assignment 3 on XML CIS612 Dr. Sunnie S. Chung Lab Assignment 3 on XML Semi-structure Data Processing: Transforming XML data to CSV format For Lab3, You can write in your choice of any languages in any platform. The Semi-Structured

More information

Data Structures for Maintaining Path Statistics in Distributed XML Stores

Data Structures for Maintaining Path Statistics in Distributed XML Stores Data Structures for Maintaining Path Statistics in Distributed XML Stores c Yury Soldak Department of Computer Science, Saint-Petersburg State University University Prospekt 28 Saint-Petersburg Russian

More information

Using Relational Database metadata to generate enhanced XML structure and document Abstract 1. Introduction

Using Relational Database metadata to generate enhanced XML structure and document Abstract 1. Introduction Using Relational Database metadata to generate enhanced XML structure and document Sherif Sakr - Mokhtar Boshra Faculty of Computers and Information Cairo University {sakr,mboshra}@cu.edu.eg Abstract Relational

More information

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington Introduction to Semistructured Data and XML Based on slides by Dan Suciu University of Washington CS330 Lecture April 8, 2003 1 Overview From HTML to XML DTDs Querying XML: XPath Transforming XML: XSLT

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411 1 Extensible

More information

ToX The Toronto XML Engine

ToX The Toronto XML Engine ToX The Toronto XML Engine Denilson Barbosa 1 Attila Barta 1 Alberto Mendelzon 1 George Mihaila 2 Flavio Rizzolo 1 Patricia Rodriguez-Gianolli 1 1 Department of Computer Science University of Toronto {dmb,atibarta,mendel,flavio,prg}@cs.toronto.edu

More information

XML and information exchange. XML extensible Markup Language XML

XML and information exchange. XML extensible Markup Language XML COS 425: Database and Information Management Systems XML and information exchange 1 XML extensible Markup Language History 1988 SGML: Standard Generalized Markup Language Annotate text with structure 1992

More information

Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database

Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database Yuanying Mo National University of Singapore moyuanyi@comp.nus.edu.sg Tok Wang Ling National University of Singapore

More information

DSD: A Schema Language for XML

DSD: A Schema Language for XML DSD: A Schema Language for XML Nils Klarlund, AT&T Labs Research Anders Møller, BRICS, Aarhus University Michael I. Schwartzbach, BRICS, Aarhus University Connections between XML and Formal Methods XML:

More information

XML: Extensible Markup Language

XML: Extensible Markup Language XML: Extensible Markup Language CSC 375, Fall 2015 XML is a classic political compromise: it balances the needs of man and machine by being equally unreadable to both. Matthew Might Slides slightly modified

More information

The Research on Coding Scheme of Binary-Tree for XML

The Research on Coding Scheme of Binary-Tree for XML Available online at www.sciencedirect.com Procedia Engineering 24 (2011 ) 861 865 2011 International Conference on Advances in Engineering The Research on Coding Scheme of Binary-Tree for XML Xiao Ke *

More information

DATA MODELS FOR SEMISTRUCTURED DATA

DATA MODELS FOR SEMISTRUCTURED DATA Chapter 2 DATA MODELS FOR SEMISTRUCTURED DATA Traditionally, real world semantics are captured in a data model, and mapped to the database schema. The real world semantics are modeled as constraints and

More information

XSLT program. XSLT elements. XSLT example. An XSLT program is an XML document containing

XSLT program. XSLT elements. XSLT example. An XSLT program is an XML document containing XSLT CPS 216 Advanced Database Systems Announcements (March 24) 2 Homework #3 will be assigned next Tuesday Reading assignment due next Wednesday XML processing in Lore (VLDB 1999) and Niagara (VLDB 2003)

More information

Introduction to Database Systems CSE 444

Introduction to Database Systems CSE 444 Introduction to Database Systems CSE 444 Lecture 25: XML 1 XML Outline XML Syntax Semistructured data DTDs XPath Coverage of XML is much better in new edition Readings Sections 11.1 11.3 and 12.1 [Subset

More information

Querying Spatiotemporal XML Using DataFoX

Querying Spatiotemporal XML Using DataFoX Querying Spatiotemporal XML Using DataFoX Yi Chen Peter Revesz Computer Science and Engineering Department University of Nebraska-Lincoln Lincoln, NE 68588, USA {ychen,revesz}@cseunledu Abstract We describe

More information

Introduction to Database Systems CSE 414

Introduction to Database Systems CSE 414 Introduction to Database Systems CSE 414 Lecture 13: XML and XPath 1 Announcements Current assignments: Web quiz 4 due tonight, 11 pm Homework 4 due Wednesday night, 11 pm Midterm: next Monday, May 4,

More information

Introduction to Data Management CSE 344

Introduction to Data Management CSE 344 Introduction to Data Management CSE 344 Lecture 11: XML and XPath 1 XML Outline What is XML? Syntax Semistructured data DTDs XPath 2 What is XML? Stands for extensible Markup Language 1. Advanced, self-describing

More information

Content Management for the Defense Intelligence Enterprise

Content Management for the Defense Intelligence Enterprise Gilbane Beacon Guidance on Content Strategies, Practices and Technologies Content Management for the Defense Intelligence Enterprise How XML and the Digital Production Process Transform Information Sharing

More information

ADT 2009 Other Approaches to XQuery Processing

ADT 2009 Other Approaches to XQuery Processing Other Approaches to XQuery Processing Stefan Manegold Stefan.Manegold@cwi.nl http://www.cwi.nl/~manegold/ 12.11.2009: Schedule 2 RDBMS back-end support for XML/XQuery (1/2): Document Representation (XPath

More information

An Improved Prefix Labeling Scheme: A Binary String Approach for Dynamic Ordered XML

An Improved Prefix Labeling Scheme: A Binary String Approach for Dynamic Ordered XML An Improved Prefix Labeling Scheme: A Binary String Approach for Dynamic Ordered XML Changqing Li and Tok Wang Ling Department of Computer Science, National University of Singapore {lichangq, lingtw}@comp.nus.edu.sg

More information

Effective Schema-Based XML Query Optimization Techniques

Effective Schema-Based XML Query Optimization Techniques Effective Schema-Based XML Query Optimization Techniques Guoren Wang and Mengchi Liu School of Computer Science Carleton University, Canada {wanggr, mengchi}@scs.carleton.ca Bing Sun, Ge Yu, and Jianhua

More information

Design of Index Schema based on Bit-Streams for XML Documents

Design of Index Schema based on Bit-Streams for XML Documents Design of Index Schema based on Bit-Streams for XML Documents Youngrok Song 1, Kyonam Choo 3 and Sangmin Lee 2 1 Institute for Information and Electronics Research, Inha University, Incheon, Korea 2 Department

More information

An X-Ray on Web-Available XML Schemas

An X-Ray on Web-Available XML Schemas An X-Ray on Web-Available XML Schemas Alberto H. F. Laender, Mirella M. Moro, Cristiano Nascimento and Patrícia Martins Department of Computer Science Federal University of Minas Gerais Belo Horizonte,

More information

10/24/12. What We Have Learned So Far. XML Outline. Where We are Going Next. XML vs Relational. What is XML? Introduction to Data Management CSE 344
