M359 Block5 - Lecture12 Eng/ Waleed Omar

Documents and markup languages

The term XML stands for eXtensible Markup Language. A markup language is used to label the different parts of a document. Labelling helps in:

• displaying the document in a formatted way
• querying the document based on its content

SGML (Standard Generalized Markup Language): a very general specification of how to mark up a document. It is complex and is used by specialized publishers.

HTML (HyperText Markup Language): an application of SGML. HTML conforms to the SGML specification for markup and defines the tags that can be used for web pages, but it has one significant limitation: the tags that you can use are fixed by the W3C specification.

XML (eXtensible Markup Language): a subset of SGML with user-defined tags, used for both displaying and logically structuring documents.

XML and the DBMS

An organization that receives a student transcript as an XML document is likely to receive transcripts for many students and, therefore, it needs some way of processing the XML documents to extract and store the relevant data in its own database. For example:

    <studenttranscript>
      <studentid>s05</studentid>
      <studentname>Ellis</studentname>
      <enrolments>
        <course>
          <coursecode>c2</coursecode>
          <title>Syntax</title>
          <credit>30</credit>
        </course>
      </enrolments>
    </studenttranscript>

Structure of XML documents

An XML document is a sequence of characters, which is partitioned into groups that are treated as either markup or character data.

Elements

The main structure of a document is determined by its elements, where an element is bounded by a start-tag (<element-name>) and an end-tag (</element-name>):

    <element-name ...>Element content</element-name>

A tag's element-name is a label that begins with a letter or underscore and continues with letters, digits or symbols such as a hyphen, underscore or full stop (a colon is also allowed but it has a special meaning). Whatever appears between a start-tag and its matching end-tag is the content of the element (character data or other elements).
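The split between element names and element content can be seen by parsing the transcript with a standard XML parser. The following is a minimal sketch using Python's built-in xml.etree.ElementTree module; the helper name describe is an assumption made for illustration and is not part of the lecture material.

```python
import xml.etree.ElementTree as ET

# The student transcript from the notes, with every start-tag matched by an end-tag.
TRANSCRIPT = """<studenttranscript>
  <studentid>s05</studentid>
  <studentname>Ellis</studentname>
  <enrolments>
    <course>
      <coursecode>c2</coursecode>
      <title>Syntax</title>
      <credit>30</credit>
    </course>
  </enrolments>
</studenttranscript>"""


def describe(element, depth=0):
    """Return one line per element: its name and, if present, its character data."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (f": {text}" if text else "")]
    for child in element:  # iterating an element yields its child elements
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(TRANSCRIPT)
outline = describe(root)
print("\n".join(outline))
```

Elements such as title carry character data, while studenttranscript, enrolments and course contain only other elements, so the sketch prints just their names.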

An empty element, which has no content, is written as <element-name .../>, where the ending /> signifies that there is no matching end-tag.

In the transcript example, <title>Syntax</title> is an element whose name is title and whose character data content is Syntax. A complex element begins with the start-tag <studenttranscript> and finishes with the matching end-tag </studenttranscript>. In this case, the element name is studenttranscript and its content is the sequence of elements named studentid, studentname and enrolments, each of which has its own content.

A document is said to be well-formed XML if it obeys the following rules:

• Every start-tag must have a matching end-tag.
• A document has exactly one root element.
• Every other element must be contained within some parent element, so that each pair of matching tags is nested within another pair.

Defining documents

The original means of defining what can belong in a document is given as part of the XML specification and is known as a document type definition (DTD).

A DTD can define the structure of an XML document in many ways. It allows elements to have very varied content, such as containing both character data and elements (known as mixed content) or containing elements that are optional or unordered. There are more limited variations for attributes, since they have just a single character data value; for example, they can be optional and can have a default value.

Properties of a DTD:

1- A DTD can define an entity, which is a named value that can be included within character data by giving its name between & and ;. When the document is parsed, the name is replaced by its value.
2- The data defined by a DTD for both element content and attribute values is always character data. There is no way to define a value as an integer or a date.
3- The order of declarations in a DTD does not matter: you can mix up the element and attribute definitions without affecting their meaning, since the DTD still determines the same structure.

Alternatives to DTDs: XML Schema

The main alternative, developed by the W3C, is called XML Schema. It provides the means to define the structure of an XML document in a way that is different from a DTD, and it also specifies a range of data types that can be used to define the values that are allowed within an XML document. An XML schema is itself an XML document, with its own schema to determine how it should be written. While this kind of schema meets the requirements of data-oriented applications, it is considered too complex for some uses of XML.
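The well-formedness rules and the entity mechanism described above can be tried out with a short experiment. The sketch below uses Python's built-in xml.etree.ElementTree parser, which checks well-formedness and expands entities declared in an internal DTD subset, but does not validate a document against a DTD. The function name is_well_formed, the sample documents and the entity named dept are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical documents, one per rule.
matching_tags = is_well_formed("<course><title>Syntax</title></course>")
missing_end_tag = is_well_formed("<course><title>Syntax</course>")
two_roots = is_well_formed("<course/><course/>")
bad_nesting = is_well_formed("<course><title>Syntax</course></title>")
print(matching_tags, missing_end_tag, two_roots, bad_nesting)

# An internal DTD subset declaring a hypothetical entity named 'dept'.
WITH_ENTITY = """<!DOCTYPE note [
  <!ENTITY dept "Computing Department">
]>
<note>Taught by the &dept;.</note>"""

# During parsing, the reference &dept; is replaced by the entity's value.
note = ET.fromstring(WITH_ENTITY)
print(note.text)
```

Only the first document passes the check. The entity value arrives in the parsed document as ordinary character data, which is consistent with property 2: a DTD has no notion of typed values.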

Alternatives to DTDs: Relax NG and Schematron

Relax NG is a schema language that is an evolution and generalization of a DTD. It focuses on defining the structure of an XML document, though it can also define values by using the data types specified for XML Schema. A Relax NG schema can be written either as an XML document or with a more compact syntax.

Schematron is another kind of schema language, which is based on writing assertions about the tree structure of an XML document.

Difference between database and XML schemas

While a database must have a schema to define its tables before you can enter data, an XML document does not need a DTD or schema: as long as its markup follows the rules, it is legitimate, well-formed XML.

Document prolog: XML and encoding declarations

External DTD: a DTD is kept in a file and the document type declaration references the file using a system identifier. Such a system identifier can take various forms, since the DTD can be anywhere on the internet, but the simplest option is to place the DTD in the same folder as the referencing XML document, in which case you need only the file name.

Document prolog: processing instructions and comments

Another kind of markup in a prolog is called a processing instruction (PI). The purpose of a PI is to provide a specification concerning how an XML document may be processed, and so the first part of this markup identifies the kind of processing to which the PI relates. An example of a common requirement is to display an XML document according to a stylesheet that specifies the format and appearance of the document.

Processing XML documents

Parsing XML

An XML parser takes as input the sequence of characters from an XML document and analyses it to separate its markup and character data. A parser can then make the character data available to an XML application in various ways, depending on the interface it provides to applications.

While parsing is common to processing all kinds of language, XML has some particular features that we need to examine so that you can appreciate what is happening.

How the parser can provide the data

• A parser may process the input document sequentially and extract the data requested by the XML application as it does so, without retaining a copy of the parsed XML.
• A parser may process the input document and construct an internal representation of the parsed XML, which is available for further requests from the XML application.

The Document Object Model (DOM)

When the parsed XML is retained, it is kept as a tree structure and can be accessed by an XML application via an interface defined as the Document Object Model (DOM). This interface also allows the parsed tree to be updated.
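The two ways in which a parser can provide data can be compared side by side. The sketch below is one possible illustration using Python's standard library: xml.sax processes the document sequentially without retaining it, while xml.dom.minidom builds a DOM tree that can be queried and updated afterwards. The class name TitleCollector and the second course (c4, Semantics) are assumptions added for illustration.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical document: the c2 course from the notes plus an invented c4 course.
DOCUMENT = """<enrolments>
  <course><coursecode>c2</coursecode><title>Syntax</title></course>
  <course><coursecode>c4</coursecode><title>Semantics</title></course>
</enrolments>"""


class TitleCollector(xml.sax.ContentHandler):
    """Sequential processing: keep only the title text as it streams past."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks, so it is buffered.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


# 1. Sequential parse: nothing of the document is retained after parsing.
collector = TitleCollector()
xml.sax.parseString(DOCUMENT.encode("utf-8"), collector)
sequential_titles = collector.titles

# 2. DOM parse: the whole tree is kept and can be queried repeatedly.
dom = minidom.parseString(DOCUMENT)
title_nodes = dom.getElementsByTagName("title")
dom_titles = [node.firstChild.data for node in title_nodes]

# The DOM interface also allows the parsed tree to be updated.
title_nodes[0].firstChild.data = "Syntax (revised)"
updated = dom.documentElement.toxml()

print(sequential_titles)
print(dom_titles)
print(updated)
```

The sequential approach suits large documents where only part of the data is needed; the DOM approach costs memory for the whole tree but supports repeated requests and updates.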

Selecting XML content

XPath is the simpler approach to selecting content, and it applies to the tree structure of a parsed XML document. It is basically similar to the way files and folders can be referenced by a path. An XML tree in XPath has one significant difference from the tree of elements described previously, in that its root is the whole document. This means that the XPath tree includes processing instructions and comments that are part of a document's prolog but, because the document has been parsed, it does not include things like entities. An absolute path starts from the document root (referenced by /) and then specifies the steps to the required elements of the document.

XQuery is a query language that provides querying capabilities comparable to SQL, and thus can be quite complex. However, it has one form that makes direct use of XPath. In that form, the XML document to be queried is in the file given as the argument of the XQuery function doc, followed by the XPath expression that selects the elements to be returned. The result of such an expression is the element (or elements) selected by the path. There is an alternative and more general way of writing XQuery that is directly comparable to SQL.

XQuery is not the only use of XPath. XPath is also used in XPointer (a way for one XML document to reference another) and in transforming one XML document into another, which is considered next. XPath and XQuery are also needed when we consider the use of XML in databases.

Transforming XML documents

First, an XML document can be transformed into another XML document, and the W3C specification of how to do this is known as XSLT (Extensible Stylesheet Language Transformations). Secondly, an XML document can be transformed into some other format suitable for output, such as a PDF file, and the W3C specification of how to do this is known as XSL-FO (Extensible Stylesheet Language Formatting Objects).
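Path-based selection can be sketched with Python's built-in xml.etree.ElementTree module. Note the assumptions: ElementTree supports only a limited subset of XPath, and its paths are evaluated relative to an element rather than from the document root, so an absolute path such as /studenttranscript/enrolments/course/title is written here relative to the root element. The transcript is the one used earlier in these notes.

```python
import xml.etree.ElementTree as ET

TRANSCRIPT = """<studenttranscript>
  <studentid>s05</studentid>
  <studentname>Ellis</studentname>
  <enrolments>
    <course>
      <coursecode>c2</coursecode>
      <title>Syntax</title>
      <credit>30</credit>
    </course>
  </enrolments>
</studenttranscript>"""

root = ET.fromstring(TRANSCRIPT)  # root is the studenttranscript element

# Step-by-step path, like folders in a file path.
titles = [e.text for e in root.findall("./enrolments/course/title")]

# A predicate selects only the course whose coursecode child has the text 'c2'.
credit = root.findtext("./enrolments/course[coursecode='c2']/credit")

# '//' selects matching elements at any depth below the starting element.
codes = [e.text for e in root.findall(".//coursecode")]

print(titles, credit, codes)
```

A full XPath or XQuery processor would accept the absolute form and many more functions; this sketch only shows the idea of locating elements by their position in the tree.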
Comparing XML and relational data

The main features of relational data are as follows.

• Relational data is held as atomic values in a tabular structure with a unique name.
• Columns of a table are identified by name within the table and all values in a column are of the same type; the order of columns is not significant.
• Rows of a table are distinct, distinguished by the values in each row (particularly the primary key); the order of rows is not significant.
• Access to and manipulation of data are expressed in terms of table operations that only involve value specifications (i.e. there is no concept of location of data by row number or column number).
• Relations are logical structures that do not have any storage implications.

The main features of XML data are as follows.

• XML data is held as nested elements in a tree structure with a named root element.
• Elements in a tree are named; they can have attributes with values and can contain other elements or character data, which are all represented as character strings, though an attribute or character data value can have a data type determined by a schema.
• An element is distinguished by its location in the tree structure, specified as a path in terms of named elements and sequence numbers from the root element.
• Access to and manipulation of elements and their contents are expressed in terms of operations that are based on the location of elements in the tree.
• XML is a logical structure with a specified storage representation as a sequence of characters.

Embedded SQL

Embedded SQL is SQL that is embedded in programs. There is a need to understand the relationship between SQL and the programming language that is used to write the program, as well as the means of transferring data between a compiled program and a DBMS.

Processing embedded SQL

This concerns how the source code is processed to give an executable program. There are two main factors relating to this issue.

1- An embedded SQL program is a hybrid, but the compilers that are used to produce an executable program cannot cope with such a mixture of languages: they are designed for, say, pure C or pure Fortran. There needs to be some mechanism to convert SQL statements into a form acceptable to a language compiler.
2- Embedded SQL needs to be portable, so that each SQL statement works in the same way for different DBMSs. However, each DBMS is produced with its own interface, designed by the vendor to implement the range of capabilities required to manage a database, with different ways in which these capabilities are invoked. This is called the native interface to the DBMS. Because the native interface is different for each DBMS, there is no standard way of converting SQL into a form that is understandable to a compiler.

The solution to this problem is that each DBMS vendor provides what is called a pre-compiler for each language. A pre-compiler processes a hybrid program of SQL and host language source code into pure source code for that language by replacing all SQL statements with requests to the native interface supported by that DBMS. A compiler for the host language can now process the result of the pre-compilation without having to be aware of any SQL: it is pure source code.

EXERCISE 2.1

It is a requirement of SQL that it is portable. Explain why it is only the source code for an embedded SQL program that is portable. Describe what must be done when an embedded SQL program, working with one vendor's DBMS, is required to work with another vendor's DBMS.

SOLUTION

Only the source code for an embedded SQL program is portable because the pre-compilation processing is different for each DBMS and results in object code that will only work with that DBMS; that is, the object code is not portable. Transferring an embedded SQL program to another vendor's DBMS requires the original source code to be processed using that vendor's pre-compiler, and then compiled to produce another version of the object code.
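The pre-compilation step can be illustrated with a toy sketch. It is not a real pre-compiler: it assumes a C-like host program in which each SQL statement sits on one line, starts with the conventional EXEC SQL prefix and ends at the first semicolon. The vendor names and the native calls va_execute and vb_session_run are hypothetical.

```python
import re

# Hypothetical hybrid source: C-like host code with one embedded SQL statement.
HYBRID_SOURCE = """\
int main(void) {
    EXEC SQL UPDATE course SET credit = 30 WHERE coursecode = 'c2';
    return 0;
}
"""

# Hypothetical native-interface call formats for two imaginary vendors.
NATIVE_CALLS = {
    "vendor_a": 'va_execute("{sql}");',
    "vendor_b": 'vb_session_run(session, "{sql}");',
}

# Matches 'EXEC SQL <statement>;' and captures the statement text.
EMBEDDED_SQL = re.compile(r"EXEC SQL\s+(?P<sql>.*?);")


def precompile(source, vendor):
    """Replace each embedded SQL statement with that vendor's native call."""
    template = NATIVE_CALLS[vendor]
    return EMBEDDED_SQL.sub(
        lambda match: template.format(sql=match.group("sql")), source
    )


pure_a = precompile(HYBRID_SOURCE, "vendor_a")
pure_b = precompile(HYBRID_SOURCE, "vendor_b")
print(pure_a)
print(pure_b)
```

The same hybrid source yields two different pure-source programs, one per vendor. This mirrors the solution to Exercise 2.1: the source is portable, but whatever is produced after pre-compilation is tied to one DBMS.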

ODBC (Open Database Connectivity)

ODBC is not embedded SQL, though it is used embedded in programs and it does involve SQL.

What is ODBC?

In general computing terms it is an application programming interface (API). ODBC was developed by DBMS vendors, mainly Microsoft, to respond to a problem with embedded SQL: a compiled embedded SQL program is not portable between different vendors' DBMSs. Software developers wanted to produce compiled, shrink-wrapped packages that would work with any DBMS.

The source code for the program is written with requests to the ODBC interface expressed using normal programming language invocation, so that it can be compiled without any pre-processing. The compiled object code, executing as an application process, submits requests for database access via the ODBC interface.

EXERCISE 2.6

Explain why an ODBC driver for a data source needs to be appropriate for the DBMS to be used.

SOLUTION

An ODBC driver converts database requests expressed in terms of the ODBC interface into requests expressed in terms of the native interface for the DBMS being used. The ODBC driver can do this only if it is appropriate for that DBMS, that is, if it is written so that it can interact with the native interface of that DBMS.
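The role of a driver can be sketched in a few lines. This is a conceptual model only and does not use the real ODBC API: the two native interfaces, the driver classes and the function list_courses are all hypothetical, invented to show application code written once against a common interface while each driver translates requests for its own DBMS.

```python
class VendorANative:
    """Hypothetical native interface: a single function taking SQL text."""

    def va_execute(self, sql_text):
        return f"[vendor A ran] {sql_text}"


class VendorBNative:
    """Hypothetical native interface: requests are submitted as dictionaries."""

    def submit(self, request):
        return f"[vendor B ran] {request['statement']}"


class VendorADriver:
    """Driver: converts the common execute() request into vendor A's native call."""

    def __init__(self):
        self._native = VendorANative()

    def execute(self, sql_text):
        return self._native.va_execute(sql_text)


class VendorBDriver:
    """Driver: converts the common execute() request into vendor B's native call."""

    def __init__(self):
        self._native = VendorBNative()

    def execute(self, sql_text):
        return self._native.submit({"statement": sql_text})


def list_courses(driver):
    """Application code: written once, using only the common interface."""
    return driver.execute("SELECT coursecode, title FROM course")


result_a = list_courses(VendorADriver())
result_b = list_courses(VendorBDriver())
print(result_a)
print(result_b)
```

The application function is unchanged whichever driver it is given, and no pre-processing of its source is needed. Passing a driver written for one vendor to the other vendor's native interface could not work, which is the point of Exercise 2.6.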