The Systems Group at ETH Zurich. XML and Databases Exercise Session 12. courtesy of Ghislain Fourny. Department of Computer Science ETH Zürich


Where we are: XQuery Implementation

The problem we are solving today: square peg, round hole

The problem we are solving today: Storing XML in an RDBMS <root>some <b>xml</b> document</root> Relational Database

Good news: we will use node IDs (remember last week)
[Figure: ID card of Donald Duck, Duckburg, Calisota]

XML Documents: no nesting

XML Documents: with nesting

How to put XML in a relational DB?
With a schema: Schema-based shredding
Without schema (generic): Tree encoding, Edge approach

Schema-based Shredding without nesting

Document root: Doc

Airport
airid  name             tax
LHR    London Heathrow  100
ZRH    Zurich           150
PAR    Paris            100

Passenger
name         passportnumber  address
Santa Claus  000111          Somewhere
Santa Claus  000112          Somewhere

Reservation
flightref  passref
LX124      000111
LX183      000112

Flight
flightid  seats  date        departure  arrival   source  destination
LX183     100    2006-12-24  08:15:00   10:01:00  ZRH     PAR
LX124     100    2006-12-24  12:00:00   13:00:00  PAR     ZRH

Schema-based Shredding without nesting: the DB schema
Passenger(passportNumber, name, address)
Airport(airpId, name, tax)
Flight(flightId, date, departure, arrival, seats, source, destination)
Reservation(flightRef, passref)

Schema-based Shredding with nesting (general case)

Schema-based Shredding: The DTD
<!ELEMENT Doc (Airport+, Passenger+, Flight+, Reservation+)>
<!ELEMENT Passenger (passportNumber, name, address)>
<!ELEMENT Airport (airpId, name, tax)>
<!ELEMENT Flight (flightId, date, departure, arrival, seats, Airport, Airport)>
<!ELEMENT Reservation (resId, Flight, Passenger)>
<!ELEMENT passportnumber (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ATTLIST Airport airpid CDATA>
<!ELEMENT tax (#PCDATA)>
<!ATTLIST Flight flightid CDATA>
<!ELEMENT date (#PCDATA)>
<!ELEMENT departure (#PCDATA)>
<!ELEMENT arrival (#PCDATA)>
<!ELEMENT seats (#PCDATA)>
<!ELEMENT resid (#PCDATA)>

Schema-based Shredding: The DTD graph

Schema-based Shredding: The element (sub)graphs Since there are no cycles, these are all subgraphs.

Schema-based Shredding: DB Schema for the Doc subgraph
We cannot put the children inline because they are not unique...
So we create special tables for them.
Here, there are attributes we can put inline.
Those need a special table as well, because they can occur more than once.

Schema-based Shredding: The other subgraphs

Schema-based Shredding: Filling in the tables

Doc.Flight.Airport
Doc.Flight.AirportID  Doc.Flight.Airport.parentid  Doc.Flight.Airport.airpId  Doc.Flight.Airport.name  Doc.Flight.Airport.tax  position
1                     1                            ZRH                        Zurich                   150                     1
2                     1                            PAR                        Paris                    100                     2
3                     2                            PAR                        Paris                    100                     1
4                     2                            LHR                        London Heathrow          100                     2
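The rows above can be loaded into a small in-memory database to see how parentid and position work together. This is only a sketch: the use of SQLite, the table name Doc_Flight_Airport, the shortened column names, and the helper airport_at are assumptions made for illustration. Reading position 2 as the second Airport child of a Flight follows the order in the DTD's Flight content model.

```python
import sqlite3

def load_flight_airports():
    """Load the four Doc.Flight.Airport rows from the slide into SQLite."""
    con = sqlite3.connect(":memory:")
    # Column names are shortened (hypothetical) versions of the slide's headers.
    con.execute("""
        CREATE TABLE Doc_Flight_Airport(
            id INTEGER PRIMARY KEY,
            parentid INTEGER,   -- row id of the enclosing Flight
            airpId TEXT,
            name TEXT,
            tax INTEGER,
            position INTEGER    -- order among the Flight's Airport children
        )
    """)
    con.executemany(
        "INSERT INTO Doc_Flight_Airport VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 1, "ZRH", "Zurich", 150, 1),
            (2, 1, "PAR", "Paris", 100, 2),
            (3, 2, "PAR", "Paris", 100, 1),
            (4, 2, "LHR", "London Heathrow", 100, 2),
        ],
    )
    return con

def airport_at(con, flight_row_id, position):
    """Return the airpId of the Airport child at the given position, or None."""
    row = con.execute(
        "SELECT airpId FROM Doc_Flight_Airport "
        "WHERE parentid = ? AND position = ?",
        (flight_row_id, position),
    ).fetchone()
    return row[0] if row else None
```

With this data, airport_at(con, 1, 2) gives "PAR" and airport_at(con, 2, 2) gives "LHR". The position column is what keeps the two Airport children of one Flight distinguishable once they are stored as unordered rows.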

Edge

Ordinal  Source  Label    Target
1        1       Doc      2
1        2       Airport  3
1        3       airid    v1
2        3       name     v2
3        3       tax      v3

Id  Value
v1  ZRH
v2  Zurich

Id  Value
v3  150
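To make the Edge table concrete, here is a minimal sketch of how such rows could be produced from a document. It assumes that node 1 stands for the document itself, that leaf elements store their text in a value table, and that attributes and mixed content are ignored. The function name shred_to_edges is hypothetical, and all values are kept as strings in one dictionary rather than in separate tables per type.

```python
import xml.etree.ElementTree as ET

def shred_to_edges(xml_text):
    """Return (edges, values) for an XML string.

    edges:  list of (ordinal, source, label, target) tuples
    values: dict mapping value ids such as "v1" to leaf text
    """
    root = ET.fromstring(xml_text)
    edges, values = [], {}
    next_node = 2  # node id 1 is reserved for the document itself

    def visit(elem, source, ordinal):
        nonlocal next_node
        children = list(elem)
        if not children:
            # Leaf element: its text goes to the value table.
            vid = f"v{len(values) + 1}"
            values[vid] = (elem.text or "").strip()
            edges.append((ordinal, source, elem.tag, vid))
            return
        # Inner element: allocate a node id and recurse into the children.
        target = next_node
        next_node += 1
        edges.append((ordinal, source, elem.tag, target))
        for i, child in enumerate(children, start=1):
            visit(child, target, i)

    visit(root, 1, 1)
    return edges, values
```

Called on `<Doc><Airport><airid>ZRH</airid><name>Zurich</name><tax>150</tax></Airport></Doc>`, the sketch yields the five edge rows shown above, with v1, v2 and v3 mapped to ZRH, Zurich and 150.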

Tree encoding: one single table

Tree encoding

pre  size    level  kind       prop     frag
1    (many)  1      element    Doc      1
2    6       2      element    Airport  1
3    1       3      attribute  airid    1
4    0       4      attvalue   ZRH      1
5    1       3      element    name     1
6    0       4      text       Zurich   1

Ancestor relationship?
descendant.pre > ancestor.pre & descendant.pre <= ancestor.pre + ancestor.size

Parent relationship?
child.pre > parent.pre & child.pre <= parent.pre + parent.size & child.level = parent.level + 1
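The two predicates translate almost literally into SQL join conditions. The sketch below assumes SQLite and a table named tree. The rows for pre 7 and 8 (the tax element and its text) and the concrete size 7 for Doc are assumptions filled in so that Airport's size of 6 is consistent, since the slide only shows the first six rows and gives "(many)" for Doc. The helper names descendants and children are hypothetical.

```python
import sqlite3

ROWS = [
    # (pre, size, level, kind, prop, frag)
    (1, 7, 1, "element", "Doc", 1),      # size 7 is assumed; slide says "(many)"
    (2, 6, 2, "element", "Airport", 1),
    (3, 1, 3, "attribute", "airid", 1),
    (4, 0, 4, "attvalue", "ZRH", 1),
    (5, 1, 3, "element", "name", 1),
    (6, 0, 4, "text", "Zurich", 1),
    (7, 1, 3, "element", "tax", 1),      # assumed row, not on the slide
    (8, 0, 4, "text", "150", 1),         # assumed row, not on the slide
]

def load_tree():
    """Create the single tree-encoding table and load the sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tree(pre INTEGER PRIMARY KEY, size INTEGER, "
        "level INTEGER, kind TEXT, prop TEXT, frag INTEGER)"
    )
    con.executemany("INSERT INTO tree VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return con

def descendants(con, pre):
    """pre values of all nodes inside the subtree below the given node."""
    rows = con.execute(
        "SELECT d.pre FROM tree a JOIN tree d "
        "ON d.pre > a.pre AND d.pre <= a.pre + a.size "
        "WHERE a.pre = ? ORDER BY d.pre",
        (pre,),
    ).fetchall()
    return [r[0] for r in rows]

def children(con, pre):
    """Same range test as descendants, restricted to the next level."""
    rows = con.execute(
        "SELECT c.pre FROM tree p JOIN tree c "
        "ON c.pre > p.pre AND c.pre <= p.pre + p.size "
        "AND c.level = p.level + 1 "
        "WHERE p.pre = ? ORDER BY c.pre",
        (pre,),
    ).fetchall()
    return [r[0] for r in rows]
```

For the Airport node (pre 2), descendants returns the pre values 3 through 8 and children returns 3, 5 and 7. Both axes are answered by a range condition on pre, with no recursion over the tree.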

The problem we are solving today: Storing XML in an RDBMS <root>some <b>xml</b> document</root> XQuery SQL Relational Database

Our Query "Determine all destinations of passenger Santa Claus"

XML Documents: Query for Schema-based Shredding (no nesting)
Passenger(passportNumber, name, address)
Airport(airpId, name, tax)
Flight(flightId, date, departure, arrival, seats, source, destination)
Reservation(flightRef, passportnumber)
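One way the query "Determine all destinations of passenger Santa Claus" could be phrased over these four tables is sketched below. The choice of SQLite, the column types, and the helper names build_db and destinations_of are assumptions made for illustration; the sample rows are the ones from the earlier shredding slide. This is not the session's reference solution.

```python
import sqlite3

def build_db():
    """Create the four shredded tables and load the sample rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Passenger(passportNumber TEXT PRIMARY KEY, name TEXT, address TEXT);
        CREATE TABLE Airport(airpId TEXT PRIMARY KEY, name TEXT, tax INTEGER);
        CREATE TABLE Flight(flightId TEXT PRIMARY KEY, date TEXT, departure TEXT,
                            arrival TEXT, seats INTEGER, source TEXT, destination TEXT);
        CREATE TABLE Reservation(flightRef TEXT, passportNumber TEXT);
    """)
    con.executemany("INSERT INTO Passenger VALUES (?, ?, ?)", [
        ("000111", "Santa Claus", "Somewhere"),
        ("000112", "Santa Claus", "Somewhere"),
    ])
    con.executemany("INSERT INTO Airport VALUES (?, ?, ?)", [
        ("LHR", "London Heathrow", 100),
        ("ZRH", "Zurich", 150),
        ("PAR", "Paris", 100),
    ])
    con.executemany("INSERT INTO Flight VALUES (?, ?, ?, ?, ?, ?, ?)", [
        ("LX183", "2006-12-24", "08:15:00", "10:01:00", 100, "ZRH", "PAR"),
        ("LX124", "2006-12-24", "12:00:00", "13:00:00", 100, "PAR", "ZRH"),
    ])
    con.executemany("INSERT INTO Reservation VALUES (?, ?)", [
        ("LX124", "000111"),
        ("LX183", "000112"),
    ])
    return con

def destinations_of(con, passenger_name):
    """Names of all destination airports booked by passengers with this name."""
    rows = con.execute("""
        SELECT DISTINCT a.name
        FROM Passenger p
        JOIN Reservation r ON r.passportNumber = p.passportNumber
        JOIN Flight f      ON f.flightId = r.flightRef
        JOIN Airport a     ON a.airpId = f.destination
        WHERE p.name = ?
        ORDER BY a.name
    """, (passenger_name,)).fetchall()
    return [name for (name,) in rows]
```

On the sample data, destinations_of(build_db(), "Santa Claus") returns Paris and Zurich. Because the documents were shredded without nesting, every step of the query is a plain foreign-key join.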

XML Documents: Schema-based Shredding (nesting)

XML Documents: Query for Schema-based Shredding (nesting)

XML Documents: Query for Edge
[Figure: query pattern, built up step by step, over the labels Reservation, Passenger, Name "Santa Claus", Flight, Airport[2]]
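With the Edge approach, each step of a path becomes one self-join on the edge table. The sketch below shows this on the small Doc/Airport fragment from the Edge slide rather than on the full Santa Claus query, because the slides do not give edge rows for passengers and reservations. SQLite, the table names edge and vals, storing node ids as text, and merging both value tables into one are assumptions made for illustration; airport_field is a hypothetical helper.

```python
import sqlite3

def load_edges():
    """Load the Edge rows and values from the slide into SQLite."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE edge(ordinal INTEGER, source TEXT, label TEXT, target TEXT);
        CREATE TABLE vals(id TEXT PRIMARY KEY, val TEXT);
    """)
    con.executemany("INSERT INTO edge VALUES (?, ?, ?, ?)", [
        (1, "1", "Doc", "2"),
        (1, "2", "Airport", "3"),
        (1, "3", "airid", "v1"),
        (2, "3", "name", "v2"),
        (3, "3", "tax", "v3"),
    ])
    con.executemany("INSERT INTO vals VALUES (?, ?)", [
        ("v1", "ZRH"), ("v2", "Zurich"), ("v3", "150"),
    ])
    return con

def airport_field(con, field):
    """Evaluate the path Doc/Airport/<field> with one self-join per step."""
    rows = con.execute("""
        SELECT v.val
        FROM edge doc
        JOIN edge airport ON airport.source = doc.target
        JOIN edge leaf    ON leaf.source = airport.target
        JOIN vals v       ON v.id = leaf.target
        WHERE doc.label = 'Doc'
          AND airport.label = 'Airport'
          AND leaf.label = ?
        ORDER BY leaf.ordinal
    """, (field,)).fetchall()
    return [val for (val,) in rows]
```

Here airport_field(con, "name") returns a list containing Zurich. A longer path such as the one in the figure would add one more join per additional step, which is the main cost of this generic encoding.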

Tree encoding

pre  size    level  kind       prop             frag
1    (many)  1      element    Doc              1
2    6       2      element    Airport          1
3    1       3      attribute  airid            1
4    0       4      attvalue   LHR              1
5    1       3      element    name             1
6    0       4      text       London Heathrow  1

XML Documents: Query for Tree Encoding
[Figure: query pattern over the labels Reservation, Passenger, Name "Santa Claus", Flight, Airport[2]]

XML Documents: Query for Tree Encoding pn offspring of p

Good luck with the exam!