XML and DTD. Mario Alviano, University of Calabria, Italy. A.Y. 2017/2018


Outline
1. Introduction
2. XML syntax
3. Namespace
4. Document Type Definition (DTD)
5. Exercises


Documents versus Data
Documents:
- Human-readable
- Basically unstructured text
- Markup indicates some structure
Data:
- Human- and machine-readable
- Structured text
- Schema for structure
XML (Extensible Markup Language) unifies these paradigms.


XML
What XML is not:
- XML is not a programming language
- XML is not a protocol
- XML is not a database
XML is a W3C Recommendation:
- It is a framework for describing semi-structured data
- Applications specify their own document/data types
"XML will be the ASCII of the Web: basic, essential, unexciting" (Tim Bray, 1997)

XML versus HTML
HTML is an application of SGML:
- Around 100 fixed tags
- Used mostly for presentation and layout
- Proprietary extensions and variations
- Error-tolerant browsers
XML is a subset of SGML:
- Meta-language
- No fixed tags
- Applications specify their own document/data types
- Strict syntax

Why XML? (1)
How to represent data?
Example. Text file
    Joe Fawcett
    Danny Ayers
    Mario Alviano
Example. XML file
    <applicationusers>
      <user firstname="joe" lastname="fawcett" />
      <user firstname="danny" lastname="ayers" />
      <user firstname="mario" lastname="alviano" />
    </applicationusers>

Why XML? (2)
- Fewer ambiguities
- Easily extensible
Example. Text file (is "John" a middle name or part of the last name?)
    Joe John Fawcett
    Danny John Ayers
    Mario Alviano
Example. XML file
    <applicationusers>
      <user firstname="joe" middlename="john" lastname="fawcett" />
      <user firstname="danny" middlename="john" lastname="ayers" />
      <user firstname="mario" lastname="alviano" />
    </applicationusers>
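As a rough illustration of why the XML form is less ambiguous, the sketch below reads the user list with Python's standard `xml.etree.ElementTree` module. The variable names and the `full_name` helper are illustrative assumptions and are not part of the lecture.

```python
import xml.etree.ElementTree as ET

USERS_XML = """<applicationusers>
  <user firstname="joe" middlename="john" lastname="fawcett" />
  <user firstname="danny" middlename="john" lastname="ayers" />
  <user firstname="mario" lastname="alviano" />
</applicationusers>"""


def full_name(user):
    """Join the name parts that are present; middlename is optional."""
    parts = [user.get("firstname"), user.get("middlename"), user.get("lastname")]
    return " ".join(part for part in parts if part is not None)


root = ET.fromstring(USERS_XML)
names = [full_name(user) for user in root.findall("user")]
# Each field is addressed by name, so no guessing about word positions is needed.
last_names = [user.get("lastname") for user in root.findall("user")]

print(names)
print(last_names)
```

With the plain text file, extracting the last name would require a convention about how many words each line contains; here the attribute name carries that information.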

Why XML? (3)
Hierarchical data representation
Example. Text file
    /
    /home
    /home/malvi
    /proc
    /sys
Example. XML file
    <directory>
      <directory name="home">
        <directory name="malvi" />
      </directory>
      <directory name="proc" />
      <directory name="sys" />
    </directory>
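The nesting of `directory` elements can be walked recursively to recover the flat list of paths. The following is a minimal sketch using Python's standard library; the function name `list_paths` and the convention that the unnamed root stands for `/` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TREE_XML = """<directory>
  <directory name="home">
    <directory name="malvi" />
  </directory>
  <directory name="proc" />
  <directory name="sys" />
</directory>"""


def list_paths(node, prefix=""):
    """Return the path of `node` followed by the paths of its descendants."""
    name = node.get("name")
    # Assumption: the root <directory> has no name attribute and stands for "/".
    path = "/" if name is None else prefix.rstrip("/") + "/" + name
    paths = [path]
    for child in node.findall("directory"):
        paths.extend(list_paths(child, path))
    return paths


paths = list_paths(ET.fromstring(TREE_XML))
print("\n".join(paths))
```

The parent/child relation is explicit in the XML tree, whereas in the text file it has to be inferred from the path strings.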


XML syntax (1)
- The first line of an XML file is called the prolog (more precisely, the XML declaration, which opens the prolog)
- It must specify the XML version (1.0 or 1.1)
- It may specify a Unicode encoding (UTF-8, UTF-16, etc.)
- Comments use the same syntax as in HTML
Example. Prolog
    <?xml version="1.0" encoding="utf-8"?>
Example. Comment
    <!-- This is a comment -->


XML syntax (2)
An XML file contains a tree of elements.
Elements have the following forms:
1. Opening tag, content, closing tag:
    <myelement>content</myelement>
2. Only for elements with no content:
    <myelement />
Elements may have attributes:
    <myelement myfirstattribute="one" mysecondattribute="two" />
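The two element forms can also be produced programmatically. This is a small sketch with Python's standard `xml.etree.ElementTree`; it reuses the element and attribute names from the slide, while the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Form 1: opening tag, content, closing tag.
with_content = ET.Element("myelement")
with_content.text = "content"

# Form 2: an element with no content, here carrying two attributes.
empty = ET.Element(
    "myelement",
    {"myfirstattribute": "one", "mysecondattribute": "two"},
)

form1 = ET.tostring(with_content, encoding="unicode")
form2 = ET.tostring(empty, encoding="unicode")

print(form1)
print(form2)
```

Serializing through a library guarantees that tags are properly closed and attribute values are quoted, which hand-written string concatenation does not.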


XML syntax (3)
Not all characters can be written literally, so escape sequences are used.
Entity references:
    &amp;   &
    &lt;    <
    &gt;    >
    &quot;  "
    &apos;  '
Character references: e.g., &#x20; (hexadecimal) or &#32; (decimal) add a space.
Content containing many such characters can be enclosed in a CDATA section:
    <conversiondata>
    <![CDATA[
    1 kilometer < 1 mile
    1 pint < 1 liter
    1 pound < 1 kilogram
    ]]>
    </conversiondata>
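The three mechanisms can be compared side by side. In the sketch below (Python standard library only), the sample text in `raw` is an assumption that adds an ampersand to the conversion example so that both `<` and `&` are exercised; `xml.sax.saxutils.escape` replaces `&`, `<` and `>` with entity references.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "1 kilometer < 1 mile & 1 pint < 1 liter"

# Option 1: replace markup characters with entity references.
escaped_doc = "<conversiondata>" + escape(raw) + "</conversiondata>"

# Option 2: wrap the text in a CDATA section, where < and & are literal.
# This only works if the text does not itself contain "]]>".
cdata_doc = "<conversiondata><![CDATA[" + raw + "]]></conversiondata>"

# A character reference names a character by its code point:
# &#x20; (hexadecimal) and &#32; (decimal) are both a space.
charref_doc = "<conversiondata>a&#x20;b&#32;c</conversiondata>"

from_escaped = ET.fromstring(escaped_doc).text
from_cdata = ET.fromstring(cdata_doc).text
from_charref = ET.fromstring(charref_doc).text

print(escaped_doc)
print(from_escaped == raw, from_cdata == raw, repr(from_charref))
```

After parsing, an application sees the same text whichever notation was used in the file; the choice only affects how readable the source is.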


Namespace (1)
- XML was born for interoperation
- Multiple XML documents must coexist
- How to handle documents that use the same names for elements and attributes?
Example. Clash on element names (title is used both for the employee and inside the HTML biography)
    <employee>
      <firstname>joe</firstname>
      <lastname>fawcett</lastname>
      <title>mr</title>
      <biography>
        <html>
          <head><title>joe's Bio</title></head>
          <body>
            <p>after graduating from...</p>
          </body>
        </html>
      </biography>
    </employee>


Namespace (2)
Namespaces make it possible to avoid name clashes.
A namespace is identified by a URI (Uniform Resource Identifier); URIs comprise URLs (Uniform Resource Locators) and URNs (Uniform Resource Names).
URL: [Scheme]://[Domain]:[Port]/[Path]?[QueryString]#[FragmentId]
    http://www.wrox.com/remtitle.cgi?isbn=0470114878
URN: urn:[namespace identifier]:[namespace specific string]
    urn:isbn:9780470114872
Example. Default namespace
    <applicationusers xmlns="http://alviano.net/km/examples">
      <user firstname="joe" lastname="fawcett" />
      <user firstname="danny" lastname="ayers" />
      <user firstname="mario" lastname="alviano" />
    </applicationusers>


Namespace (3)
Namespaces identified by a prefix can be declared in addition to the default namespace:
    xmlns:km="http://alviano.net/km/examples"
Example. Namespace with prefix
    <km:applicationusers xmlns:km="http://alviano.net/km/examples">
      <km:user firstname="joe" lastname="fawcett" />
      <km:user firstname="danny" lastname="ayers" />
      <km:user firstname="mario" lastname="alviano" />
    </km:applicationusers>
Warning!
- Namespace declarations are inherited by descendant elements
- Attributes are usually associated with no namespace (a default namespace does not apply to attributes; only a prefixed attribute is in a namespace)
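The two warnings can be observed with a namespace-aware parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which reports qualified names in the form `{namespace-URI}local-name`; the shortened one-user documents and the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://alviano.net/km/examples"

default_doc = f"""<applicationusers xmlns="{NS}">
  <user firstname="joe" lastname="fawcett" />
</applicationusers>"""

prefixed_doc = f"""<km:applicationusers xmlns:km="{NS}">
  <km:user firstname="joe" lastname="fawcett" />
</km:applicationusers>"""

default_root = ET.fromstring(default_doc)
prefixed_root = ET.fromstring(prefixed_doc)

# The prefix is only a shorthand: both spellings give the same element name.
same_root_name = default_root.tag == prefixed_root.tag

# The declaration on the root is inherited by the child <user> element.
user = default_root.find(f"{{{NS}}}user")

# Unprefixed attributes are in no namespace, even under a default namespace.
attribute_names = sorted(user.attrib)

print(default_root.tag)
print(same_root_name, attribute_names)
```

Note that a lookup for plain `user` would find nothing in either document, because the element's full name includes the namespace URI.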


Document Type Definition (DTD) (1)
A DTD specifies what data are contained in an XML file (i.e., a DTD is a schema for XML).
The DTD is declared before the root element:
    <!DOCTYPE root-element optional-external-reference optional-internal-declarations>
Internal declarations are enclosed in square brackets and have the following form:
    <!ELEMENT element-name structure>
where structure can be
- EMPTY
- ANY
- #PCDATA
- the name of another element
- a combination of the previous, built with , (sequence) and | (choice) and the occurrence operators ? (optional), * (zero or more) and + (one or more)


Document Type Definition (DTD) (2)
Example
    <?xml version="1.0"?>
    <!DOCTYPE name [
      <!ELEMENT name (first, middle, last)>
      <!ELEMENT first (#PCDATA)>
      <!ELEMENT middle (#PCDATA)>
      <!ELEMENT last (#PCDATA)>
    ]>
    <name>
      <given>joseph</given>
      <middle>john</middle>
      <last>fawcett</last>
    </name>
Is the content valid? How to fix it?
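One way to reason about the question is to compare the children of `name` with the sequence required by the content model `(first, middle, last)`. Python's standard `xml.etree.ElementTree` does not validate against a DTD, so the sketch below is a hand-rolled check of that single rule, not a real validator; the helper `child_names` and the hard-coded `EXPECTED` list are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<name>
  <given>joseph</given>
  <middle>john</middle>
  <last>fawcett</last>
</name>"""

# Hand-written mirror of <!ELEMENT name (first, middle, last)>.
EXPECTED = ["first", "middle", "last"]


def child_names(element):
    """Return the names of the direct child elements, in document order."""
    return [child.tag for child in element]


original_children = child_names(ET.fromstring(DOC))
original_ok = original_children == EXPECTED

# One possible fix: rename <given> to <first> in the document.
fixed_doc = DOC.replace("given", "first")
fixed_ok = child_names(ET.fromstring(fixed_doc)) == EXPECTED

print(original_children, original_ok)
print(fixed_ok)
```

The document uses `given`, which the DTD never declares, so it is not valid. Renaming the element is one fix; the alternative is to change the DTD so that it declares `given` instead of `first`.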

Document Type Definition (DTD) (3)
Attributes of an element can be specified as follows:
    <!ATTLIST element-name attribute-name type default ... >
- The type of an attribute can be CDATA, ID, IDREF, IDREFS, ...
- In place of a default value, the declaration may indicate that the attribute is required (#REQUIRED) or optional (#IMPLIED)

Document Type Definition (DTD) (4)
Example
    <?xml version="1.0"?>
    <!DOCTYPE name [
      <!ELEMENT name EMPTY>
      <!ATTLIST name
        first CDATA #REQUIRED
        middle CDATA #IMPLIED
        last CDATA #REQUIRED>
    ]>
    <name first="joseph" middle="john" last="fawcett" />
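The effect of #REQUIRED and #IMPLIED can be mimicked with a few lines of Python. This is only a sketch of the rule, under the assumption that the ATTLIST is copied by hand into a dictionary (`NAME_ATTRIBUTES`, where True stands for #REQUIRED and False for #IMPLIED); it is not a DTD validator, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written mirror of the ATTLIST for <name>:
# True means #REQUIRED, False means #IMPLIED.
NAME_ATTRIBUTES = {"first": True, "middle": False, "last": True}


def attribute_problems(element, declared):
    """List missing required attributes and attributes that are not declared."""
    problems = []
    for attr, required in declared.items():
        if required and attr not in element.attrib:
            problems.append(f"missing required attribute: {attr}")
    for attr in element.attrib:
        if attr not in declared:
            problems.append(f"undeclared attribute: {attr}")
    return problems


complete = ET.fromstring('<name first="joseph" middle="john" last="fawcett" />')
no_middle = ET.fromstring('<name first="joseph" last="fawcett" />')
no_last = ET.fromstring('<name first="joseph" middle="john" />')

complete_problems = attribute_problems(complete, NAME_ATTRIBUTES)
no_middle_problems = attribute_problems(no_middle, NAME_ATTRIBUTES)
no_last_problems = attribute_problems(no_last, NAME_ATTRIBUTES)

print(complete_problems, no_middle_problems, no_last_problems)
```

Omitting `middle` is accepted because it is #IMPLIED, while omitting `last` is reported because it is #REQUIRED.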


Document Type Definition (DTD) (4)
The external reference allows an existing DTD to be reused.
SYSTEM is used for an external DTD stored in a local file:
    <!DOCTYPE bibliography SYSTEM "biblio.dtd">
PUBLIC is used for DTDs in the catalog of the XML parser:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
Optionally, a file may be specified to be used in case the DTD is not in the catalog:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">


Document Type Definition (DTD) (5)
New entity references may be specified:
    <!ENTITY entity-name definition>
    <!ENTITY author "Mario Alviano">
&author; can now be used in the XML document.
Entities can be external (as for DOCTYPE, SYSTEM and PUBLIC are used).
Parameter entities are similar, but can be used inside the DTD itself (for example, to split it into several files):
    <!ENTITY % entity-name definition>
    <!ENTITY % address SYSTEM "address.dtd">
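An internal entity declared in the document's own DTD is expanded by the parser before the application sees the text. The sketch below shows this with Python's standard `xml.etree.ElementTree`; the `note` element and its content are assumptions made for the example, only the `author` entity comes from the slide.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ENTITY author "Mario Alviano">
]>
<note>Written by &author;.</note>"""

note = ET.fromstring(DOC)
# The parser has already replaced &author; with its definition.
text = note.text

print(text)
```

Changing the entity definition in one place updates every occurrence of `&author;` in the document.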


How to validate an XML document against a DTD
XML validation with libxml (the first command validates against the DTD referenced in the document, the second against the given DTD file):
    xmllint --valid XMLfile --noout
    xmllint --dtdvalid DTDfile XMLfile --noout
XML validation with Eclipse EE: right-click on the file(s) to be validated, then choose Validate.

Exercises
1. Given the document order.xml, write a DTD that allows its validation
2. Given the document letter.xml, write a DTD that allows its validation
3. Given the DTD mountainranges.dtd, write a valid XML document
4. Given the DTD dealership.dtd, write a valid XML document
5. Given the description in football-matches.txt, write a DTD and a valid XML document

END OF THE LECTURE