Integrating Path Index with Value Index for XML data

Jing Wang 1, Xiaofeng Meng 2, Shan Wang 2

1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
cuckoowj@btamail.net.cn
2 Information School, Renmin University of China, Beijing, China
{xfmeng, suang}@public.bta.net.cn

Abstract. XML is becoming the de facto standard for Web applications. To facilitate path expression processing, we propose an index structure adopted in our native XML database system Orient-X. Our index is constructed by using the DTD to obtain the paths that will appear in the XML documents. It represents a structural summary of an XML data collection conforming to a certain DTD, so we can process any label path query without accessing the original data. In addition, it is integrated with value indexes. Preliminary experiments show quite promising results.

1 Introduction

As more and more data sources on the Internet switch over to expressing their content in XML [1], the volume of XML data is increasing rapidly. This trend calls for efficient XML data management solutions. In line with the tree-centric nature of XML data, path expressions play an important role in XML queries [2]. Without an index, evaluating them requires tree traversal, which is not efficient. Recent proposals focus on efficient support for sequences of / steps from the document root, but they are not efficient for partial path matching and // steps because they navigate the index exhaustively. Furthermore, their construction cost is high, and the index may be very large.

In this paper, we propose SUPEX (Schema guided Path index for XML data).¹ In contrast to traditional path indexes, SUPEX is constructed by using the DTD to obtain the paths that will appear in XML documents. With SUPEX, we can process any label path query without accessing the original data. Value-based conditions are crucial in querying any kind of data. In SUPEX, the path index and value indexes are integrated to facilitate query evaluation.

The remainder of this paper is organized as follows. In Section 2, we present background knowledge and related work. An overview of SUPEX is given in Section 3. Section 4 describes the procedure of query processing. Section 5 contains the results of our experiments. Finally, conclusions and future work are given in Section 6.

¹ The work was partly supported by the grants from 863 High Technology Foundation of China (No. 2002AA116030) and the Natural Science Foundation of China (No ).

2 Background and Related Work

Document Type Definition (DTD) is part of the XML standard [1]. It specifies the structure of an XML element by naming its sub-elements and attributes. We define an XML data set as a set of XML documents conforming to a certain DTD.

A key issue in XML query processing is how to efficiently determine the ancestor-descendant relationship between any two elements. We adopt the encoding scheme proposed in [5]. Every node in the XML document tree is associated with a tuple (DocId, Order, Size, Level). This numbering scheme is applied to both our document tree and our index graph.

Recent proposals on path indexes include DataGuides [3], 1-indexes [4], and so on. These indexes are not efficient for partial path matching because they must be navigated exhaustively. Furthermore, their construction cost is high. Cooper et al. [6] presented the Index Fabric, which encodes the label path to each XML element that has a value as a string and inserts these strings into an efficient string index structure. This index loses the relationships between elements, and only supports label path queries from the document root. XISS [5] proposed an index-based approach to path evaluation, and supports path queries through structural joins.

3 Overview of SUPEX

3.1 SUPEX: Its Structure

SUPEX consists of a structural graph (SG) and an element map (EM). SG is constructed from the DTD, and represents the structural summary of the XML data. All possible paths starting from the roots of XML documents conforming to a specific DTD therefore appear in SG. EM provides fast entry points to nodes in SG, and is useful for finding all elements with the same tag.

[Fig. 1. SUPEX structure with value indexes (components shown: Structural Graph, Element Map, Index Records, Value Indexes)]
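The (DocId, Order, Size, Level) encoding adopted in Section 2 can be illustrated with a minimal sketch. It assumes the containment test used by the XISS numbering scheme [5], where x is an ancestor of y when order(x) < order(y) <= order(x) + size(x). The names Label, is_ancestor, is_parent and label_tree are hypothetical, and the sketch assigns tight numbers (Size equals the number of descendants), whereas a real scheme may leave gaps for later insertions.

```python
from typing import NamedTuple


class Label(NamedTuple):
    doc_id: int
    order: int   # preorder position of the node
    size: int    # here: number of descendants (tight numbering, an assumption)
    level: int   # depth in the tree, root = 0


def is_ancestor(a: Label, d: Label) -> bool:
    """Containment test: d's order falls inside a's (order, order + size] range."""
    return a.doc_id == d.doc_id and a.order < d.order <= a.order + a.size


def is_parent(a: Label, d: Label) -> bool:
    """A parent is an ancestor exactly one level above."""
    return is_ancestor(a, d) and d.level == a.level + 1


def label_tree(tree, doc_id=0):
    """Label a tree given as (tag, [children]); returns (tag, Label) in document order."""
    out = []

    def visit(node, level):
        tag, children = node
        order = len(out) + 1
        out.append(None)                      # reserve the slot for this node
        for child in children:
            visit(child, level + 1)
        size = len(out) - order               # nodes added since = descendants
        out[order - 1] = (tag, Label(doc_id, order, size, level))

    visit(tree, 0)
    return out


doc = ("site", [("open_auction", [("description", [("text", [])])]),
                ("people", [])])
labels = dict(label_tree(doc))
```

With these labels, both relationship tests reduce to integer comparisons, which is what lets the same scheme be applied to the document tree and to the index graph.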

SG has one root node. Each node in SG except the root has a label defined in the DTD, called its E-Label. All nodes with the same E-Label in SG are linked through a pointer named Next-Element. Each node in SG corresponds to a set of fixed-length index records, called the extent of that SG node. The extent of an SG node holds the index records of the elements that share an identical incoming label path, sorted by DocId and Order. Each index record includes an element descriptor and other related information. SG is tree-shaped when there is no cycle in the DTD graph. When the DTD graph is cyclic, SG is still tree-like except for the reverse edges from descendant nodes to ancestor nodes.

Element Map (EM) is implemented as a B+-tree using the element name as key. Each entry in a leaf node points to the first node of the list of SG nodes with an identical E-Label. EM allows us to quickly find all SG nodes with the same E-Label.

In traditional database systems such as relational database systems, value indexes are usually created on columns of a relation. Because XML data is tree-shaped, however, it is difficult to define the granularity of value indexes. In SUPEX, value indexes are constructed with respect to the context of data elements. Each SG node may have one or several pointers to value indexes that are built on the attributes or text values of the elements in its extent. These value indexes are implemented as B+-trees, and their creation and removal are the user's decisions. Fig. 1 gives the structure of SUPEX.

3.2 Construction of SUPEX

DTDs have proved important in a variety of areas: transformation between XML and databases, XML data storage, and so on. SUPEX is generated from the DTD, and the main issues that must be addressed include:

1. Simplifying DTD. Practical DTDs can be very complex, and most of their complexity comes from the complex specification of elements. We choose a set of transformations to eliminate constraints on the number of occurrences of elements, transform to,, and group sub-elements having the same name. Such simplification loses information such as the relative order of the elements, but retains information about all possible sub-elements, which is enough to generate our structural graph.

2. Constructing structural graph. The simplified DTD can be represented as a DTD graph. Through a depth-first traversal of the DTD graph starting at the element nodes without incoming edges, we expand the DTD graph into the structural graph. The SG nodes with an identical E-Label are linked to form a list. The Element Map can be constructed on all element tags.

3. Data loading. SUPEX can be constructed before data loading. During the loading of XML documents, each element is encoded, and its index record is inserted into the extent of the corresponding node in SG. As for value indexes, users can choose to create appropriate value indexes on the attributes or text values of elements conforming to a certain context.
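The construction steps above can be sketched as follows. This is a minimal illustration, not the Orient-X implementation: the simplified DTD is assumed to be given as a dictionary from element name to its possible sub-element names, a plain dictionary stands in for the B+-tree Element Map, and the names SGNode, build_sg and nodes_with_label are hypothetical. When a sub-element already occurs on the current path, the sketch records a reverse edge to that ancestor instead of expanding it again.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class SGNode:
    label: str                                        # E-Label ("" for the SG root)
    children: dict = field(default_factory=dict)      # E-Label -> child SGNode
    back_edges: dict = field(default_factory=dict)    # E-Label -> ancestor SGNode
    extent: list = field(default_factory=list)        # index records, filled at load time
    next_element: Optional["SGNode"] = None           # Next-Element pointer


def build_sg(dtd):
    """Expand a simplified DTD graph into (SG root, element map)."""
    root = SGNode("")
    element_map = {}   # E-Label -> head of the Next-Element list (dict, not a B+-tree)
    tails = {}         # E-Label -> last node of that list, for O(1) appends

    def expand(label, parent, path):
        node = SGNode(label)
        parent.children[label] = node
        if label in tails:
            tails[label].next_element = node
        else:
            element_map[label] = node
        tails[label] = node
        path[label] = node
        for child in dtd.get(label, []):
            if child in path:                  # cycle in the DTD graph
                node.back_edges[child] = path[child]
            else:
                expand(child, node, path)
        del path[label]

    has_parent = {c for subs in dtd.values() for c in subs}
    for label in dtd:                          # start at elements without incoming edges
        if label not in has_parent:
            expand(label, root, {})
    return root, element_map


def nodes_with_label(element_map, label):
    """Follow the Next-Element list for one E-Label."""
    node = element_map.get(label)
    while node is not None:
        yield node
        node = node.next_element


dtd = {
    "site": ["open_auctions", "people"],
    "open_auctions": ["open_auction"],
    "open_auction": ["description"],
    "people": ["person"],
    "person": ["description"],
    "description": ["text", "parlist"],
    "parlist": ["listitem"],
    "listitem": ["text", "parlist"],
}
sg_root, em = build_sg(dtd)
```

In this toy DTD the element description is reachable by two label paths, so it yields two SG nodes on one Next-Element list, and the recursive parlist/listitem definition yields a reverse edge rather than an infinite expansion. A DTD in which every element has an incoming edge would need an explicitly chosen start element, which the sketch does not handle.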

4 Query Processing with SUPEX

SUPEX contains the information of both a path index and value indexes. With SUPEX, we can efficiently process path expressions with value-based condition predicates. SUPEX supports two basic queries: (1) given a tag, all elements with this tag can be obtained by a lookup in EM; (2) simple label paths from the root of the document can be matched by traversing SG from its root node. Beyond these two, SUPEX can be used to evaluate queries in the following ways.

4.1 Path Expression

A complex path expression can be decomposed into a set of basic structural relationships between nodes. These basic structural relationships are the ancestor-descendant and parent-child relationships. Path queries like E1/E2 and E1/*/E2 can be supported by the Parent-Child(E1, E2) and Ancestor-Descendant(E1, E2) algorithms, respectively.

The procedure of algorithm Ancestor-Descendant(E1, E2) is as follows. By a lookup in EM, we get the two SG nodes that head the lists with E-Label E1 and E2, respectively. Following the two lists, we determine the ancestor-descendant relationship between the current nodes of the two lists according to their numbers. If they are ancestor and descendant, the element records in their extents are sort-merged and appended to the result. Otherwise, one pointer is moved to the next node accordingly.

Algorithm Ancestor-Descendant(E1, E2)
Input: Ancestor element E1, descendant element E2
Output: Pairs of matching nodes
  Get the head node of List E1 in SG through EM;
  Get the head node of List E2 in SG through EM;
  For each node in List E1 do
    Skip over unmatchable nodes in List E2;
    For each matching node in List E2 do
      Sort-merge the extents of current nodes of List E1 and E2;
      Append the result to output;
    End for
  End for

In addition to these basic structural relationships, our index supports partial label path matching. For label paths like //E1/E2//En, we need not traverse the whole index graph to get the result. By a lookup in EM, we obtain the head node of the list with E-Label E1. For each node in this list, the sub-tree rooted at it is traversed to find nodes matching E1/E2//En. So only a part of SG is traversed to get the result, which greatly reduces the cost of partial label path matching. The detailed procedure is omitted due to space limits.
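The Ancestor-Descendant procedure can be illustrated with a small runnable sketch. Several simplifications are assumptions of this sketch rather than details from the paper: a dictionary of lists replaces EM and the Next-Element lists, the SG nodes carry (order, size) numbers that are compared with the containment test of [5], the "skip over unmatchable nodes" step is replaced by a plain nested loop, and the ancestor elements inside one extent are assumed not to nest (which holds when the DTD graph is acyclic). The names SGNode, merge_extents and ancestor_descendant are hypothetical.

```python
from typing import NamedTuple


class SGNode(NamedTuple):
    label: str
    order: int      # position of this node in SG
    size: int       # number of SG descendants
    extent: list    # (doc_id, order, size) records sorted by (doc_id, order)


def merge_extents(anc, desc):
    """Two-pointer merge of sorted extents; ancestors are assumed not to nest."""
    pairs = []
    j = 0
    for a_doc, a_order, a_size in anc:
        # Skip descendants located before the current ancestor.
        while j < len(desc) and (desc[j][0], desc[j][1]) <= (a_doc, a_order):
            j += 1
        # Emit every descendant inside the ancestor's (order, order + size] range.
        while j < len(desc) and desc[j][0] == a_doc and desc[j][1] <= a_order + a_size:
            pairs.append(((a_doc, a_order, a_size), desc[j]))
            j += 1
    return pairs


def ancestor_descendant(element_map, e1, e2):
    """Join the extents of every SG node pair (n1, n2) where n1 is an SG ancestor of n2."""
    result = []
    for n1 in element_map.get(e1, []):
        for n2 in element_map.get(e2, []):
            if n1.order < n2.order <= n1.order + n1.size:
                result.extend(merge_extents(n1.extent, n2.extent))
    return result


# Toy SG: open_auction (2) contains description (3); a second description (6)
# lies under another branch and is never merged.
oa = SGNode("open_auction", 2, 2, [(0, 2, 3), (0, 6, 3)])
d1 = SGNode("description", 3, 1, [(0, 3, 1), (0, 7, 1)])
d2 = SGNode("description", 6, 0, [(0, 11, 0)])
element_map = {"open_auction": [oa], "description": [d1, d2]}
pairs = ancestor_descendant(element_map, "open_auction", "description")
```

The point of the sketch is that the structural test runs first on the few SG nodes, so extents of unrelated SG node pairs (here, the second description node) are never read.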

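Since the detailed procedure for partial label path matching is omitted in Section 4.1, the following is only one possible reading of it, under stated assumptions: the query starts with //, a dictionary built by the hypothetical helper index_by_label stands in for EM, SG is treated as a tree (reverse edges are ignored), and each remaining step is either a child step (/) or a descendant step (//). The names SGNode, parse_path, match_from and partial_match are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass(eq=False)
class SGNode:
    label: str
    children: list = field(default_factory=list)


def index_by_label(root):
    """Stand-in for EM: E-Label -> SG nodes carrying it, in preorder."""
    element_map = {}
    stack = [root]
    while stack:
        node = stack.pop()
        element_map.setdefault(node.label, []).append(node)
        stack.extend(reversed(node.children))
    return element_map


def parse_path(path):
    """'//a/b//c' -> [('//', 'a'), ('/', 'b'), ('//', 'c')]"""
    return re.findall(r"(//|/)([^/]+)", path)


def match_from(node, steps):
    """SG nodes reached from `node` by the remaining (axis, label) steps."""
    if not steps:
        return [node]
    axis, label = steps[0]
    found = []
    for child in node.children:
        if child.label == label:
            found.extend(match_from(child, steps[1:]))
        if axis == "//":                      # keep searching deeper for this step
            found.extend(match_from(child, steps))
    return found


def partial_match(element_map, path):
    """Match a path of the form //E1/...: enter SG through E1, then walk sub-trees only."""
    steps = parse_path(path)
    if not steps or steps[0][0] != "//":
        raise ValueError("expected a path starting with //")
    seen, result = set(), []
    for head in element_map.get(steps[0][1], []):
        for node in match_from(head, steps[1:]):
            if id(node) not in seen:          # different bindings may reach one node
                seen.add(id(node))
                result.append(node)
    return result


text1, text2 = SGNode("text"), SGNode("text")
desc1, desc2 = SGNode("description", [text1]), SGNode("description", [text2])
site = SGNode("site", [SGNode("open_auction", [desc1]), SGNode("person", [desc2])])
em = index_by_label(site)
```

Only the sub-trees below the E1 nodes are visited, so a query such as //open_auction/description never touches the person branch of this toy SG.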
4.2 Query Evaluation with Value Indexes

Value-based condition predicates are important in query evaluation. In XML queries, condition predicates are often placed on elements matching a certain label path expression. In SUPEX, value indexes are created according to the requirements of users, and can be used in the evaluation of condition predicates. By traversing SG, we get the SG nodes matching a given label path expression. If there are value-based conditions on these nodes and appropriate value indexes exist, the query can be evaluated through those indexes. When a large number of data nodes match the path expression, using a value index costs less than executing the predicates on all candidate nodes.

5 Preliminary Experiment Results

We empirically evaluated the performance of SUPEX on a variety of XML documents. We report results here for a representative dataset: the XMark benchmark [7]. The experiments were performed on a Pentium IV 1.4 GHz platform with MS-Windows 2000 and 256 Mbytes of main memory. The XERCES-C++ parser was used to parse and generate XML data. We implemented our index in the C++ programming language. The data sets were stored on a local disk.

To get controllable document sizes, we used the XML generator XMLgen developed by the XMark benchmark project. For a fixed DTD modeling an Internet auction site, XMLgen produces document instances of controllable size. Table 1 lists the characteristics of the data sets used in our experiments. The columns of the table give the XMLgen scaling parameter, the size of the generated document, and the number of elements in the generated document, respectively.

Table 1. XML document size and number of elements in documents
Scaling Factor | Document Size (MB) | Element number

We implemented the element index and the element-element join algorithm of XISS [5], and compared its performance with SUPEX. Fig. 2 and Fig. 3 report the query response time for //open_auction//description and //description/text, respectively, against XMark documents of increasing size. As shown in these figures, SUPEX is faster than XISS, and its advantage over XISS grows as the document size increases. We find these preliminary experimental results quite encouraging. Further performance evaluation will be carried out in the future.

[Fig. 2. Open_auction//description: time (millisecs) against data size (Mbyte) for SUPEX and XISS]
[Fig. 3. Description/text: time (millisecs) against data size (Mbyte) for SUPEX and XISS]

6 Conclusion and Future Work

Our research group is working on a native XML data management system. We are implementing SUPEX as the index module of our system. In the future, we will test our method with large volumes of data, and compare it with existing index schemes. Furthermore, value indexes will be added into the SUPEX structure to accelerate the evaluation of predicate conditions.

References

1. T. Bray, J. Paoli, C. M. Sperberg-McQueen, and E. Maler (Eds). Extensible Markup Language (XML) 1.0 (Second Edition). W3C Recommendation, 6 October 2000.
2. D. Chamberlin, D. Florescu, J. Robie, J. Simeon, and M. Stefanescu (Eds). XQuery: A Query Language for XML. W3C Working Draft, 15 February 2001.
3. R. Goldman, J. Widom. DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. In Proceedings of the 23rd International Conference on Very Large Data Bases, Athens, Greece.
4. T. Milo and D. Suciu. Index structures for path expression. In Proceedings of the 7th International Conference on Database Theory, January.
5. Quanzhong Li, Bongki Moon. Indexing and Querying XML Data for Regular Path Expressions. In Proceedings of the 27th International Conference on Very Large Data Bases, Roma, Italy.
6. Brian F. Cooper, Neal Sample, Michael J. Franklin, Gisli R. Hjaltason, Moshe Shadmon. A Fast Index for Semistructured Data. In Proceedings of the 27th International Conference on Very Large Data Bases, Roma, Italy.
7. Albrecht R. Schmidt, Florian Waas, Martin L. Kersten, Daniela Florescu, Ioana Manolescu, Michael J. Carey, and Ralph Busse. The XML Benchmark Project. Technical Report INS-R0103, CWI, Amsterdam, the Netherlands, April 2001.


More information

Accelerating XPath Location Steps

Accelerating XPath Location Steps Accelerating XPath Location Steps Torsten Grust University of Konstanz Department of Computer and Information Science PO Box D 188, D-78457 Konstanz, Germany TorstenGrust@uni-konstanzde ABSTRACT This work

More information

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data

PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data PathStack : A Holistic Path Join Algorithm for Path Query with Not-predicates on XML Data Enhua Jiao, Tok Wang Ling, Chee-Yong Chan School of Computing, National University of Singapore {jiaoenhu,lingtw,chancy}@comp.nus.edu.sg

More information

TwigList: Make Twig Pattern Matching Fast

TwigList: Make Twig Pattern Matching Fast TwigList: Make Twig Pattern Matching Fast Lu Qin, Jeffrey Xu Yu, and Bolin Ding The Chinese University of Hong Kong, China {lqin,yu,blding}@se.cuhk.edu.hk Abstract. Twig pattern matching problem has been

More information

XML Data Stream Processing: Extensions to YFilter

XML Data Stream Processing: Extensions to YFilter XML Data Stream Processing: Extensions to YFilter Shaolei Feng and Giridhar Kumaran January 31, 2007 Abstract Running XPath queries on XML data steams is a challenge. Current approaches that store the

More information

A Novel Replication Strategy for Efficient XML Data Broadcast in Wireless Mobile Networks

A Novel Replication Strategy for Efficient XML Data Broadcast in Wireless Mobile Networks JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 32, 309-327 (2016) A Novel Replication Strategy for Efficient XML Data Broadcast in Wireless Mobile Networks ALI BORJIAN BOROUJENI 1 AND MEGHDAD MIRABI 2

More information

Aggregate Query Processing of Streaming XML Data

Aggregate Query Processing of Streaming XML Data ggregate Query Processing of Streaming XML Data Yaw-Huei Chen and Ming-Chi Ho Department of Computer Science and Information Engineering National Chiayi University {ychen, s0920206@mail.ncyu.edu.tw bstract

More information

Structural Joins, Twig Joins and Path Stack

Structural Joins, Twig Joins and Path Stack Structural Joins, Twig Joins and Path Stack Seminar: XML & Datenbanken Student: Irina ANDREI Konstanz, 11.07.2006 Outline 1. Structural Joins Tree-Merge Stack-Tree 2. Path-Join Algorithms PathStack PathMPMJ

More information

Part XII. Mapping XML to Databases. Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 321

Part XII. Mapping XML to Databases. Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 321 Part XII Mapping XML to Databases Torsten Grust (WSI) Database-Supported XML Processors Winter 2008/09 321 Outline of this part 1 Mapping XML to Databases Introduction 2 Relational Tree Encoding Dead Ends

More information

An Extended Byte Carry Labeling Scheme for Dynamic XML Data

An Extended Byte Carry Labeling Scheme for Dynamic XML Data Available online at www.sciencedirect.com Procedia Engineering 15 (2011) 5488 5492 An Extended Byte Carry Labeling Scheme for Dynamic XML Data YU Sheng a,b WU Minghui a,b, * LIU Lin a,b a School of Computer

More information

XML: Extensible Markup Language

XML: Extensible Markup Language XML: Extensible Markup Language CSC 375, Fall 2015 XML is a classic political compromise: it balances the needs of man and machine by being equally unreadable to both. Matthew Might Slides slightly modified

More information

XML in Databases. Albrecht Schmidt. al. Albrecht Schmidt, Aalborg University 1

XML in Databases. Albrecht Schmidt.   al. Albrecht Schmidt, Aalborg University 1 XML in Databases Albrecht Schmidt al@cs.auc.dk http://www.cs.auc.dk/ al Albrecht Schmidt, Aalborg University 1 What is XML? (1) Where is the Life we have lost in living? Where is the wisdom we have lost

More information

TwigX-Guide: An Efficient Twig Pattern Matching System Extending DataGuide Indexing and Region Encoding Labeling

TwigX-Guide: An Efficient Twig Pattern Matching System Extending DataGuide Indexing and Region Encoding Labeling JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 25, 603-617 (2009) Short Paper TwigX-Guide: An Efficient Twig Pattern Matching System Extending DataGuide Indexing and Region Encoding Labeling Department

More information

Querying and Updating XML with XML Schema constraints in an RDBMS

Querying and Updating XML with XML Schema constraints in an RDBMS Querying and Updating XML with XML Schema constraints in an RDBMS H. Georgiadis I. Varlamis V. Vassalos Department of Informatics Athens University of Economics and Business Athens, Greece {harisgeo,varlamis,vassalos}@aueb.gr

More information

A Distributed Query Engine for XML-QL

A Distributed Query Engine for XML-QL A Distributed Query Engine for XML-QL Paramjit Oberoi and Vishal Kathuria University of Wisconsin-Madison {param,vishal}@cs.wisc.edu Abstract: This paper describes a distributed Query Engine for executing

More information

Fast Matching of Twig Patterns

Fast Matching of Twig Patterns Fast Matching of Twig Patterns Jiang Li and Junhu Wang School of Information and Communication Technology Griffith University, Gold Coast, Australia Jiang.Li@student.griffith.edu.au, J.Wang@griffith.edu.au

More information

XML databases. Jan Chomicki. University at Buffalo. Jan Chomicki (University at Buffalo) XML databases 1 / 9

XML databases. Jan Chomicki. University at Buffalo. Jan Chomicki (University at Buffalo) XML databases 1 / 9 XML databases Jan Chomicki University at Buffalo Jan Chomicki (University at Buffalo) XML databases 1 / 9 Outline 1 XML data model 2 XPath 3 XQuery Jan Chomicki (University at Buffalo) XML databases 2

More information

Fast Structural Query with Application to Chinese Treebank Sentence Retrieval

Fast Structural Query with Application to Chinese Treebank Sentence Retrieval Fast Structural Query with Application to Chinese Treebank Sentence Retrieval Chia-Hsin Huang 1 jashing@iis.sinica.edu.tw Tyng-Ruey Chuang 2 trc@iis.sinica.edu.tw Hahn-Ming Lee 3 hmlee@mail.ntust.edu.tw

More information

Approaches. XML Storage. Storing arbitrary XML. Mapping XML to relational. Mapping the link structure. Mapping leaf values

Approaches. XML Storage. Storing arbitrary XML. Mapping XML to relational. Mapping the link structure. Mapping leaf values XML Storage CPS 296.1 Topics in Database Systems Approaches Text files Use DOM/XSLT to parse and access XML data Specialized DBMS Lore, Strudel, exist, etc. Still a long way to go Object-oriented DBMS

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK SQL EDITOR FOR XML DATABASE MISS. ANUPAMA V. ZAKARDE 1, DR. H. R. DESHMUKH 2, A.

More information

XPathMark: an XPath Benchmark for the XMark Generated Data

XPathMark: an XPath Benchmark for the XMark Generated Data XPathMark: an XPath Benchmark for the XMark Generated Data Massimo Franceschet Informatics Institute, University of Amsterdam, Kruislaan 403 1098 SJ Amsterdam, The Netherlands Dipartimento di Scienze,

More information

Performance Evaluation of XHTML encoding and compression

Performance Evaluation of XHTML encoding and compression Performance Evaluation of XHTML encoding and compression Sathiamoorthy Manoharan Department of Computer Science, University of Auckland, Auckland, New Zealand Abstract. The wireless markup language (WML),

More information

SM3+: An XML Database Solution for the Management of MPEG-7 Descriptions

SM3+: An XML Database Solution for the Management of MPEG-7 Descriptions SM3+: An XML Database Solution for the Management of MPEG-7 Descriptions Yang Chu, Liang-Tien Chia, and Sourav S. Bhowmick Center for Multimedia and Network Technology School of Computer Engineering Nanyang

More information

EE 368. Weeks 5 (Notes)

EE 368. Weeks 5 (Notes) EE 368 Weeks 5 (Notes) 1 Chapter 5: Trees Skip pages 273-281, Section 5.6 - If A is the root of a tree and B is the root of a subtree of that tree, then A is B s parent (or father or mother) and B is A

More information

XGA XML Grammar for JAVA

XGA XML Grammar for JAVA XGA XML Grammar for JAVA Reinhard CERNY Student at the Technical University of Vienna e0025952@student.tuwien.ac.at Abstract. Today s XML editors provide basic functionality such as creating, editing and

More information

XML Storage and Indexing

XML Storage and Indexing XML Storage and Indexing Web Data Management and Distribution Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook

More information

A Clustering-based Scheme for Labeling XML Trees

A Clustering-based Scheme for Labeling XML Trees 84 IJCSNS International Journal of Computer Science and Network Security, VOL.6 No.9A, September 2006 A Clustering-based Scheme for Labeling XML Trees Sadegh Soltan, and Masoud Rahgozar, University of

More information

Optimize Twig Query Pattern Based on XML Schema

Optimize Twig Query Pattern Based on XML Schema JOURNAL OF SOFTWARE, VOL. 8, NO. 6, JUNE 2013 1479 Optimize Twig Query Pattern Based on XML Schema Hui Li Beijing University of Technology, Beijing, China Email: xiaodadaxiao2000@163.com HuSheng Liao and

More information

Classifying Elements for XML Query Transformation

Classifying Elements for XML Query Transformation Classifying Elements for XML Query Transformation c Ke Geng University of Auckland, New Zealand ke@cs.auckland.ac.nz Abstract Research into XML query transformation has become important with the increased

More information

Streaming XPath Processing with Forward and Backward Axes

Streaming XPath Processing with Forward and Backward Axes Streaming XPath Processing with Forward and Backward Axes Charles Barton, Philippe Charles Deepak Goyal, Mukund Raghavachari IBM T.J. Watson Research Center Marcus Fontoura, Vanja Josifovski IBM Almaden

More information

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010

Uses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010 Uses for About Binary January 31, 2010 Uses for About Binary Uses for Uses for About Basic Idea Implementing Binary Example: Expression Binary Search Uses for Uses for About Binary Uses for Storage Binary

More information

An Efficient Index Lattice for XML Query Evaluation

An Efficient Index Lattice for XML Query Evaluation An Efficient Index Lattice for XML Query Evaluation Wilfred Ng and James Cheng Department of Computer Science and Engineering The Hong Kong University of Science and Technology, Hong Kong {csjames, wilfred}@cse.ust.hk

More information

Efficient Processing of XML Path Queries Using the Disk-based F&B Index

Efficient Processing of XML Path Queries Using the Disk-based F&B Index Efficient Processing of XML Path Queries Using the Disk-based F&B Index Wei Wang Hongzhi Wang,3 Hongjun Lu 2 Haifeng Jiang 4 Xuemin Lin Jianzhong Li 3 University of New South Wales, Australia, {weiw,lxue}@cse.unsw.edu.au

More information

A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function

A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function DEIM Forum 2018 I5-5 Abstract A Schema Extraction Algorithm for External Memory Graphs Based on Novel Utility Function Yoshiki SEKINE and Nobutaka SUZUKI Graduate School of Library, Information and Media

More information

Adding Valid Time to XPath

Adding Valid Time to XPath Adding Valid Time to XPath Shuohao Zhang and Curtis E. Dyreson School of Electrical Engineering and Computer Science Washington State University Pullman, WA, United State of America (szhang2, cdyreson)@eecs.wsu.edu

More information

Provenance Management in Databases under Schema Evolution

Provenance Management in Databases under Schema Evolution Provenance Management in Databases under Schema Evolution Shi Gao, Carlo Zaniolo Department of Computer Science University of California, Los Angeles 1 Provenance under Schema Evolution Modern information

More information

Two-Tier Air Indexing for On-Demand XML Data Broadcast

Two-Tier Air Indexing for On-Demand XML Data Broadcast 29 29th IEEE International Conference on Distributed Computing Systems Two-Tier Air Indexing for On-Demand XML Data Broadcast Weiwei Sun #, Ping Yu #, Yongrui Qing #, Zhuoyao Zhang #, Baihua Zheng * #

More information

Navigation- vs. Index-Based XML Multi-Query Processing

Navigation- vs. Index-Based XML Multi-Query Processing Navigation- vs. Index-Based XML Multi-Query Processing Nicolas Bruno, Luis Gravano Columbia University {nicolas,gravano}@cs.columbia.edu Nick Koudas, Divesh Srivastava AT&T Labs Research {koudas,divesh}@research.att.com

More information

ON VIEW PROCESSING FOR A NATIVE XML DBMS

ON VIEW PROCESSING FOR A NATIVE XML DBMS ON VIEW PROCESSING FOR A NATIVE XML DBMS CHEN TING NATIONAL UNIVERSITY OF SINGAPORE 2004 Contents 1 Introduction 1 2 Background 8 2.1 XML data model.......................... 9 2.2 ORA-SS...............................

More information

METAXPath. Utah State University. From the SelectedWorks of Curtis Dyreson. Curtis Dyreson, Utah State University Michael H. Böhen Christian S.

METAXPath. Utah State University. From the SelectedWorks of Curtis Dyreson. Curtis Dyreson, Utah State University Michael H. Böhen Christian S. Utah State University From the SelectedWorks of Curtis Dyreson December, 2001 METAXPath Curtis Dyreson, Utah State University Michael H. Böhen Christian S. Jensen Available at: https://works.bepress.com/curtis_dyreson/11/

More information

Selectively Storing XML Data in Relations

Selectively Storing XML Data in Relations Selectively Storing XML Data in Relations Wenfei Fan 1 and Lisha Ma 2 1 University of Edinburgh and Bell Laboratories 2 Heriot-Watt University Abstract. This paper presents a new framework for users to

More information

A New Encoding Scheme of Supporting Data Update Efficiently

A New Encoding Scheme of Supporting Data Update Efficiently Send Orders for Reprints to reprints@benthamscience.ae 1472 The Open Cybernetics & Systemics Journal, 2015, 9, 1472-1477 Open Access A New Encoding Scheme of Supporting Data Update Efficiently Houliang

More information

Proposed Specification of a Distributed XML-Query Network

Proposed Specification of a Distributed XML-Query Network Proposed Specification of a Distributed XML-Query Network Christian Thiemann Michael Schlenker Thomas Severiens Institute for Science Networking Oldenburg October 8th 2003 arxiv:cs/0309022v2 [cs.dc] 15

More information