Structured documents

1 Structured documents An overview of XML Structured documents Michael Houghton 15/11/2000

2 Unstructured documents Broadly speaking, text and multimedia document formats can be structured or unstructured. An unstructured document simply contains the instructions needed to render the content on screen, such as position information, typefaces and sizes, and colours. Such documents are typically stored in binary, and their formats are often secret, hindering document exchange. Examples include PostScript, PDF, TeX, and Word (mostly).

3 An example

4 Structured documents Structured document formats describe the function of each part of a document, for instance: titles and subtitles; citations and quotes; table of contents and index. They are often encoded in text (ASCII or Unicode), with the emphasis on document sharing. Structured documents are more friendly to automated processing (e.g. autogeneration of indices). Examples include LaTeX, HTML, and XML.

5 Structure benefits A known document structure enables uses other than reading and printing: separation of style and content (e.g. company-wide styles, structured templates); automated document production (e.g. generation of indices, tables of contents); archiving and retrieval (e.g. searching for documents with a given title); metadata applications (e.g. keywords, library indexing, annotations, interdocument linking).

6 Markup languages Markup is the annotation system used to express the structure of a document. Markup can take two forms: macros, where a code fragment or function is invoked (as in an office application); and tags, where a text sequence identifies the start or end of a part of the document. HTML and XML are derived from SGML, which stands for Standard Generalised Markup Language. SGML uses the 'tag' form.

7 Markup concepts in SGML Structured documents in SGML consist of plain text, with tag sequences identifying the document structure, for example:
  <author> John Q Normal </author>
This fragment describes two elements: an author element, and a text element, "John Q Normal". A structured document in SGML is a nested set of elements with a single parent element, called the document element.

8 A simple document (1)
  <person>
    <name> <!-- the person's name --> </name>
    <given> John </given>
    <family> Random </family>
    <initial> Q </initial>
    <prefers> Jack </prefers>
    <contact> <!-- their contact details -->
      <email> johnq@random.org </email>
      <phone> </phone>
    </contact>
  </person>

9 A simple document (2) In the previous example, the document element is <person>. Some important points: '<' and '>' are (usually) special characters; to appear in a text element, they must be 'escaped' as &lt; and &gt; respectively. Whitespace is preserved; however, it is often ignored by applications. '<!--' and '-->' enclose comments: special sequences marking a note that annotates the document but is not part of its structure.
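The escaping rule can be tried out with a short Python sketch. It uses the standard library's xml.sax.saxutils and xml.etree.ElementTree; the <note> element and the sample text are made up for illustration and are not part of the slides.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Illustrative text containing the special characters.
raw = "if a < b and b > c"

# escape() turns '<' into '&lt;', '>' into '&gt;' and '&' into '&amp;'.
escaped = escape(raw)

# Hypothetical <note> element: escaped text followed by a comment.
doc = "<note>" + escaped + "<!-- a comment, not part of the structure --></note>"

note = ET.fromstring(doc)

# The parser unescapes the text again and, by default, drops the comment.
print(escaped)
print(note.text)
print(len(note))
```

The parsed text is the original string again, and the comment does not show up as a child of the element.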

10 Tree structures SGML documents are often visualised as tree structures. Here's part of the previous document in tree form: (Note that this tree ignores whitespace nodes.)
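As a rough illustration of the tree view, the following Python sketch parses a cut-down fragment of the person document and prints one element name per line, indented by depth. The fragment and the outline function are assumptions made for this example; like the tree on the slide, the output ignores whitespace nodes.

```python
import xml.etree.ElementTree as ET

# A cut-down fragment of the earlier person document (illustrative only).
doc = """<person>
  <given>John</given>
  <family>Random</family>
  <contact>
    <phone/>
  </contact>
</person>"""


def outline(element, depth=0):
    """Return the element names of a subtree, indented two spaces per level."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(doc))
print("\n".join(tree_lines))
```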

11 Attributes Attributes are properties of an element which are not considered part of the main document structure. For example:
  <person id="458">
Attributes have a name and a value. In this example, the name is 'id' and the value is '458'. An element may have one or more attributes, but they must have different names.
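A small Python sketch of both points, using the standard library parser: reading an attribute by name, and checking that a repeated attribute name is rejected. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Read the attribute from the example element.
person = ET.fromstring('<person id="458"><given>John</given></person>')
person_id = person.get("id")

# Two attributes with the same name on one element are a parse error.
try:
    ET.fromstring('<person id="1" id="2"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False

print(person_id, duplicate_accepted)
```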

12 SGML and HTML HTML is an application of SGML. An application is a set of SGML tags and attributes that can be used to describe a particular class of document. Here's a simple HTML document:
  <html>
    <head>
      <title> John Q. Random's home page </title>
    </head>
    <body>
      <h1>hello there!</h1>
      <p> <font color="blue">welcome to my home page!</font> </p>
    </body>
  </html>

13 Example HTML Here's the previous example in a browser:

14 What is XML? An overview of XML What is XML? Michael Houghton 15/11/2000

15 What is XML? XML stands for Extensible Markup Language. It is an attempt to: introduce formal structure to web documents; separate style from content; expand the scope and usefulness of web content. It has been designed to allow the creation and combination of custom markup languages. The XML standards suite is supported by the World Wide Web Consortium (W3C).

16 What's wrong with HTML? HTML has several failings. Little separation of style and content: heading and font tags are used interchangeably, and current browsers still lack full CSS compliance. De facto hardcoded presentation: this leads to widespread incompatibility, and crash-prone browsers. Little support for automated processing through metadata: only <META>, which is of limited use for complex metadata. Lenient parsers allow poorly structured markup, e.g. <B>... <I>... </B>... </I>...

17 Why not use SGML? XML is a cut-down implementation (a 'profile') of SGML (Standard Generalised Markup Language). SGML was considered too complicated for web use; complete SGML implementations are large and complex. XML is simpler and stricter than SGML: XML requires end tags (e.g. </P> is not optional); empty elements must be identified (e.g. <BR> becomes <BR/>); attribute values must be quoted (e.g. id="johnqrandom").

18 Core XML concepts Document validation (checking for correct document structure): Document Type Definitions (DTDs); XML Schema Definition Language. Presentation and transformation (for display and document exchange): Cascading Style Sheets (CSS); Extensible Style Language (XSL). Internal document structure: XPath.

19 Some simple XML
  <?xml version="1.0" standalone="no"?>
  <!DOCTYPE book SYSTEM "…">
  <book id="nielsen01">
    <title> Designing Web Usability </title>
    <subtitle> The Practice of Simplicity </subtitle>
    <author> Jakob Nielsen </author>
    <info>
      <key> Design </key>
      <key> Internet </key>
      <isbn> X </isbn>
    </info>
  </book>

20 CDATA If you wish to 'protect' some text from being interpreted as markup, you can either encode the '<' and '>' characters, or enclose all the text in a CDATA section. CDATA sections look like this:
  Here is some text containing a <sequence/> which would be interpreted as markup
  <![CDATA[ This time, the <sequence/> won't be interpreted as markup ]]>
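The difference is easy to see with a parser. This Python sketch (standard library only; the <p> wrapper element is an assumption added so that each sample is a complete document) parses the same text with and without a CDATA section.

```python
import xml.etree.ElementTree as ET

# Without CDATA, <sequence/> is parsed as a child element.
plain = ET.fromstring("<p>some <sequence/> here</p>")

# Inside CDATA, the same characters are kept as literal text.
protected = ET.fromstring("<p><![CDATA[some <sequence/> here]]></p>")

print(len(plain), repr(plain.text))
print(len(protected), repr(protected.text))
```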

21 Checking for correctness XML parsers check documents for two kinds of correctness. Well-formedness: checks that tags nest correctly, attributes are quoted, and singleton tags are correctly closed. An XML document must be well-formed. Validity: checks the document against a DTD, to see if its structure is allowed. Validation is optional; however, a validating parser will reject an invalid document.
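A minimal well-formedness check can be sketched in Python. Note the assumption: xml.etree.ElementTree is a non-validating parser, so this only covers the first kind of correctness; checking validity against a DTD needs a validating parser, which is not shown here. The helper name is_well_formed is made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, False on any parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    "<p><br/></p>": is_well_formed("<p><br/></p>"),        # correctly nested and closed
    "<b><i>x</b></i>": is_well_formed("<b><i>x</b></i>"),  # tags overlap
    "<p id=x/>": is_well_formed("<p id=x/>"),              # attribute value not quoted
    "<br>": is_well_formed("<br>"),                        # element never closed
}
for sample, ok in checks.items():
    print(sample, ok)
```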

22 The DTD (1) A Document Type Definition (DTD) describes the possible valid structures of an XML document. A DTD can be associated with the document in two ways. As a linked document in the XML header, e.g.
  <?xml version="1.0"?>
  <!DOCTYPE book SYSTEM "…">
Or by directly embedding it into the document. The DTD appears before the document root node, but after the XML declaration.

23 The DTD (2) Here's a DTD for a document like the earlier book example:
  <!DOCTYPE book [
    <!ELEMENT book (title, subtitle?, author+, info) >
    <!ATTLIST book id CDATA #REQUIRED >
    <!ELEMENT title (#PCDATA) >
    <!ELEMENT subtitle (#PCDATA) >
    <!ELEMENT author (#PCDATA) >
    <!ELEMENT info (key*, isbn) >
    <!ELEMENT key (#PCDATA) >
    <!ELEMENT isbn (#PCDATA) >
  ]>

24 The DTD (3) These lines define the element book, and its allowed child elements and attributes:
  <!ELEMENT book (title, subtitle?, author+, info) >
  <!ATTLIST book id CDATA #REQUIRED >
The book element consists of: a mandatory title; an optional subtitle; one or more author elements; a mandatory info element. The book element has a required attribute id, which consists of a character data value.
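To make the content model concrete, here is a hand-rolled Python sketch that checks only the two declarations above. It is not a DTD validator: the content model (title, subtitle?, author+, info) is rewritten by hand as a regular expression over the sequence of child element names, and the function name matches_book_model is an assumption made for this example.

```python
import re
import xml.etree.ElementTree as ET

# (title, subtitle?, author+, info) as a regex over space-terminated child names.
BOOK_MODEL = re.compile(r"title (subtitle )?(author )+info ")


def matches_book_model(book):
    """Check the book content model and the required id attribute."""
    child_names = "".join(child.tag + " " for child in book)
    return (
        book.tag == "book"
        and "id" in book.attrib
        and BOOK_MODEL.fullmatch(child_names) is not None
    )


good = ET.fromstring(
    '<book id="nielsen01"><title/><subtitle/><author/><info/></book>'
)
no_author = ET.fromstring('<book id="b2"><title/><info/></book>')
no_id = ET.fromstring("<book><title/><author/><info/></book>")

print(matches_book_model(good), matches_book_model(no_author), matches_book_model(no_id))
```

A real validating parser would also check the declarations for the child elements, which this sketch ignores.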

25 XML Schema One drawback of XML DTDs is that they are described in a separate syntax, inherited from SGML. XML Schema offers an alternative way to describe XML document structure, in XML syntax. This provides many benefits: simplicity (XML Schema rules are often easier to understand); interrogation of data structure (XSLT transformations can know more about document structure); tool reuse (the same tools used to create and maintain XML documents can be used to maintain their structure).

26 An example schema
  <?xml version="1.0"?>
  <schema xmlns:xsd="…">
    <element name="book">
      <complexType content="elementOnly">
        <sequence>
          <element ref="title" />
          <element ref="subtitle" minOccurs="0" maxOccurs="1" />
          <element ref="author" minOccurs="1" maxOccurs="unbounded" />
        </sequence>
        <attribute name="id" use="required" type="string"/>
      </complexType>
    </element>
    <element name="title">
      <complexType content="elementOnly" />
    </element>
    ...
  </schema>

27 Namespaces Namespaces are the cornerstone of the modular design of the XML framework. XML uses namespaces to allow the combination of different markup languages. For example, this document includes an element to describe a person from another namespace:
  <slide>
    <author>
      <person:name xmlns:person="…">
        <person:given>John</person:given>
        <person:initial>Q</person:initial>
        <person:family>Random</person:family>
      </person:name>
    </author>
  </slide>
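A namespace-aware parser keeps the two vocabularies apart by attaching the namespace URI to each element name. The Python sketch below shows this with the standard library; the URI http://example.org/person is a placeholder chosen for this example, since the URI used on the slide is not recoverable.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption, not the one from the slide).
PERSON_NS = "http://example.org/person"

doc = f"""<slide>
  <author>
    <person:name xmlns:person="{PERSON_NS}">
      <person:given>John</person:given>
      <person:initial>Q</person:initial>
      <person:family>Random</person:family>
    </person:name>
  </author>
</slide>"""

slide = ET.fromstring(doc)

# Map the prefix used in the search path to the namespace URI.
ns = {"person": PERSON_NS}
given = slide.find("author/person:name/person:given", ns)

print(slide.find("author").tag)  # element with no namespace
print(given.tag)                 # element name qualified by its namespace URI
print(given.text)
```

The prefix itself is only shorthand: what identifies the element is the pair of namespace URI and local name.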

28 Stylesheets XML pages can make use of Cascading Stylesheets in the same way as HTML. However, they can also make use of more sophisticated XSL transformations. XSL is a two-part technology: XSLT is a rules-based system used to transform XML documents to other XML forms (such as WML), and to HTML; XSL-FO (Formatting Objects) is used to describe presentation objects for rendering in a browser. An XSL stylesheet will typically map XML to XSL-FO by means of XSLT rules.

29 XSLT rules An XSLT stylesheet consists of a set of XSLT rules. These rules are described in XML, with XSLT structures described using the xsl: namespace. Essentially, a rule is a chunk of output document, scripted by XSLT constructs which describe the parts of the source to which the rules are applied. The output is created by applying the closest matching rule to each part of the input document. Some XSLT processors allow extension rules to be written which talk to databases and other back-end systems.

30 Example XSLT rules A title rule:
  <xsl:template match="book">
    Book Review: <xsl:value-of select="./title"/> by <xsl:value-of select="./author"/>
  </xsl:template>
Applied to the example, this might produce:
  Book Review: Designing Web Usability by Jakob Nielsen
XSLT is a so-called push/pull stylesheet system: push rules (xsl:template) can pull information from other parts of the document.
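The push/pull idea can be imitated in plain Python. The sketch below is not XSLT and does not use an XSLT processor: it is an emulation under stated assumptions, where a dictionary maps element names to rule functions (the push side) and each rule looks up the values it needs (the pull side). The names book_rule and apply_rules are made up, and whitespace is stripped for tidy output, which xsl:value-of itself would not do.

```python
import xml.etree.ElementTree as ET

source = """<book id="nielsen01">
  <title> Designing Web Usability </title>
  <subtitle> The Practice of Simplicity </subtitle>
  <author> Jakob Nielsen </author>
</book>"""


def book_rule(book):
    """Plays the role of the template: pulls title and author from the book."""
    title = book.findtext("title", default="").strip()
    author = book.findtext("author", default="").strip()
    return f"Book Review: {title} by {author}"


def apply_rules(root, rules):
    """Walk the document and apply the rule registered for each element name."""
    output = []
    for element in root.iter():
        rule = rules.get(element.tag)
        if rule is not None:
            output.append(rule(element))
    return output


result = apply_rules(ET.fromstring(source), {"book": book_rule})
print(result[0])
```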

31 XHTML and migration As we've seen, HTML is an application of SGML. XHTML is a similar markup language, expressed in XML syntax. XHTML can be viewed in current browsers, with some limitations. Converting HTML to XHTML will allow content created today to be processed into other forms - XHTML is XML. Conversion can be done with Dave Raggett's HTML Tidy utility, available from the W3C website.

32 XHTML profiles Three profiles of XHTML are under development by the Web Consortium. Transitional: this is designed to ease the transition from HTML to XHTML. All of the 'questionable' parts of HTML (font colours, etc.) are still available. Strict: this profile strictly enforces a style/content separation. According to the W3C, it is free of any tags associated with layout, and is meant to be used with W3C's Cascading Style Sheet language (CSS). Frames: this profile has support for 'multi-framed' web designs.

33 XML and metadata An overview of XML XML and metadata Michael Houghton 15/11/2000

34 Metadata in HTML HTML's support of metadata is limited to the <META> element. This element supports a simple name/value pair, with two attributes: name (the identifier of the data element) and content (the data element's content). This scheme is basic; more structured metadata has to be expressed through complicated naming schemes.

35 Metadata in XHTML (1) The <META> element still works, in its XML form:
  <meta name="author" content="Joe Bloggs" />
With XSLT, this data can be interrogated and acted on in a stylesheet or transformation. Thus metadata can be preserved while migrating content from HTML to XHTML, then extracted and processed into a more expressive form.
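Because XHTML is XML, any XML tool can read this metadata, not only an XSLT processor. As a sketch, the Python below collects the name/content pairs from a small XHTML page using the standard library. The page itself is invented for illustration; the namespace URI is the standard XHTML one.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A made-up XHTML page carrying two <meta> elements.
page = f"""<html xmlns="{XHTML_NS}">
  <head>
    <title>Example</title>
    <meta name="author" content="Joe Bloggs" />
    <meta name="keywords" content="xml, metadata" />
  </head>
  <body/>
</html>"""

root = ET.fromstring(page)

# Element names are qualified by the XHTML namespace, so search for {uri}meta.
metadata = {
    meta.get("name"): meta.get("content")
    for meta in root.iter(f"{{{XHTML_NS}}}meta")
}
print(metadata)
```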

36 Metadata in XHTML (2) However, with XHTML you're not restricted to <META>. Using namespaces, it is possible to combine other metadata formats with XHTML. Common uses include adding RDF metadata, which can be done without breaking HTML backward compatibility. Since current browsers ignore tags they don't recognise, but include any text data in the output document, it is best to use metadata schemes that carry their content in attributes.

37 Dublin Core/RDF in XHTML
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
  <html xmlns="…">
    <head>
      <title>open.gov.uk - organisation index (a-b)</title>
      <rdf:RDF xmlns:rdf="…" xmlns:dc="…">
        <rdf:Description about="…"
          dc:creator="Neil Pawley"
          dc:title="open.gov.uk - organisation index"
          dc:subject="organisation, index, listing, directory"
          dc:description="this section contains the open.gov.uk..."
          dc:publisher="ccta"
          ... />
      </rdf:RDF>

38 More on RDF RDF (Resource Description Framework) is a W3C recommendation for general website metadata. It is already used in conjunction with Dublin Core metadata schemas. However, it is also in use: in 'intelligent' browsers such as Netscape 6 (the 'site summary' tree browser) and Metabrowser; and for content rating (RDF was inspired by PICS, and there is a PICS to RDF mapping).

39 Metadata migration Some ideas for metadata migration. Convert your HTML content to XHTML: use HTML Tidy. Change your content generation scripts: if your pages are generated dynamically, make the scripts generate XHTML with RDF. Translate <META> data: if your (X)HTML is static, consider crunching it with an XSLT processor to extract <META> data and output RDF instead.
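The last idea, translating <META> data into RDF, is sketched below in Python rather than XSLT. Everything specific here is an assumption made for illustration: the mapping from meta names to Dublin Core names, the function name meta_to_rdf, the sample page and the example.org address. The RDF and Dublin Core namespace URIs are the commonly used ones and are not taken from the slides, and the sketch writes rdf:about where the slide example uses a plain about attribute.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from <meta> names to Dublin Core element names.
META_TO_DC = {"author": "creator", "description": "description", "keywords": "subject"}


def meta_to_rdf(xhtml_text, about):
    """Build an rdf:RDF element carrying the page's <meta> data as dc: attributes."""
    root = ET.fromstring(xhtml_text)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for meta in root.iter(f"{{{XHTML_NS}}}meta"):
        dc_name = META_TO_DC.get(meta.get("name"))
        if dc_name is not None:
            # Carry the content in an attribute, as the slides recommend.
            description.set(f"{{{DC_NS}}}{dc_name}", meta.get("content", ""))
    return rdf


# Use readable prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

page = f"""<html xmlns="{XHTML_NS}">
  <head>
    <title>Example</title>
    <meta name="author" content="Joe Bloggs" />
    <meta name="generator" content="ignored by the mapping" />
  </head>
  <body/>
</html>"""

rdf = meta_to_rdf(page, "http://example.org/index.html")
print(ET.tostring(rdf, encoding="unicode"))
```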

Copyright 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 7 XML

Copyright 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 7 XML Chapter 7 XML 7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML

More information

7.1 Introduction. extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML

7.1 Introduction. extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML 7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML is a markup language,

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411 1 Extensible

More information

XML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11

XML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11 !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... 7:4 @import Directive... 9:11 A Absolute Units of Length... 9:14 Addressing the First Line... 9:6 Assigning Meaning to XML Tags...

More information

Delivery Options: Attend face-to-face in the classroom or via remote-live attendance.

Delivery Options: Attend face-to-face in the classroom or via remote-live attendance. XML Programming Duration: 5 Days US Price: $2795 UK Price: 1,995 *Prices are subject to VAT CA Price: CDN$3,275 *Prices are subject to GST/HST Delivery Options: Attend face-to-face in the classroom or

More information

Delivery Options: Attend face-to-face in the classroom or remote-live attendance.

Delivery Options: Attend face-to-face in the classroom or remote-live attendance. XML Programming Duration: 5 Days Price: $2795 *California residents and government employees call for pricing. Discounts: We offer multiple discount options. Click here for more info. Delivery Options:

More information

Markup Languages SGML, HTML, XML, XHTML. CS 431 February 13, 2006 Carl Lagoze Cornell University

Markup Languages SGML, HTML, XML, XHTML. CS 431 February 13, 2006 Carl Lagoze Cornell University Markup Languages SGML, HTML, XML, XHTML CS 431 February 13, 2006 Carl Lagoze Cornell University Problem Richness of text Elements: letters, numbers, symbols, case Structure: words, sentences, paragraphs,

More information

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial. A tutorial report for SENG 609.22 Agent Based Software Engineering Course Instructor: Dr. Behrouz H. Far XML Tutorial Yanan Zhang Department of Electrical and Computer Engineering University of Calgary

More information

Introduction to XML. XML: basic elements

Introduction to XML. XML: basic elements Introduction to XML XML: basic elements XML Trying to wrap your brain around XML is sort of like trying to put an octopus in a bottle. Every time you think you have it under control, a new tentacle shows

More information

Module 2 (III): XHTML

Module 2 (III): XHTML INTERNET & WEB APPLICATION DEVELOPMENT SWE 444 Fall Semester 2008-2009 (081) Module 2 (III): XHTML Dr. El-Sayed El-Alfy Computer Science Department King Fahd University of Petroleum and Minerals alfy@kfupm.edu.sa

More information

CHAPTER 2 MARKUP LANGUAGES: XHTML 1.0

CHAPTER 2 MARKUP LANGUAGES: XHTML 1.0 WEB TECHNOLOGIES A COMPUTER SCIENCE PERSPECTIVE CHAPTER 2 MARKUP LANGUAGES: XHTML 1.0 Modified by Ahmed Sallam Based on original slides by Jeffrey C. Jackson reserved. 0-13-185603-0 HTML HELLO WORLD! Document

More information

XML Introduction 1. XML Stands for EXtensible Mark-up Language (XML). 2. SGML Electronic Publishing challenges -1986 3. HTML Web Presentation challenges -1991 4. XML Data Representation challenges -1996

More information

Tutorial 1 Getting Started with HTML5. HTML, CSS, and Dynamic HTML 5 TH EDITION

Tutorial 1 Getting Started with HTML5. HTML, CSS, and Dynamic HTML 5 TH EDITION Tutorial 1 Getting Started with HTML5 HTML, CSS, and Dynamic HTML 5 TH EDITION Objectives Explore the history of the Internet, the Web, and HTML Compare the different versions of HTML Study the syntax

More information

IBM. XML and Related Technologies Dumps Braindumps Real Questions Practice Test dumps free

IBM. XML and Related Technologies Dumps Braindumps Real Questions Practice Test dumps free 000-141 Dumps 000-141 Braindumps 000-141 Real Questions 000-141 Practice Test 000-141 dumps free IBM 000-141 XML and Related Technologies http://killexams.com/pass4sure/exam-detail/000-141 collections

More information

Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming Fall 2013

Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming Fall 2013 Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming Fall 2013 2 Outline Introduction XML Structure Document Type Definition (DTD) XHMTL Formatting XML CSS Formatting XSLT Transformations

More information

EMERGING TECHNOLOGIES. XML Documents and Schemas for XML documents

EMERGING TECHNOLOGIES. XML Documents and Schemas for XML documents EMERGING TECHNOLOGIES XML Documents and Schemas for XML documents Outline 1. Introduction 2. Structure of XML data 3. XML Document Schema 3.1. Document Type Definition (DTD) 3.2. XMLSchema 4. Data Model

More information

Web Standards Mastering HTML5, CSS3, and XML

Web Standards Mastering HTML5, CSS3, and XML Web Standards Mastering HTML5, CSS3, and XML Leslie F. Sikos, Ph.D. orders-ny@springer-sbm.com www.springeronline.com rights@apress.com www.apress.com www.apress.com/bulk-sales www.apress.com Contents

More information

COMP9321 Web Application Engineering. Extensible Markup Language (XML)

COMP9321 Web Application Engineering. Extensible Markup Language (XML) COMP9321 Web Application Engineering Extensible Markup Language (XML) Dr. Basem Suleiman Service Oriented Computing Group, CSE, UNSW Australia Semester 1, 2016, Week 4 http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2442

More information

Chapter 10: Understanding the Standards

Chapter 10: Understanding the Standards Disclaimer: All words, pictures are adopted from Learning Web Design (3 rd eds.) by Jennifer Niederst Robbins, published by O Reilly 2007. Chapter 10: Understanding the Standards CSc2320 In this chapter

More information

Comp 336/436 - Markup Languages. Fall Semester Week 4. Dr Nick Hayward

Comp 336/436 - Markup Languages. Fall Semester Week 4. Dr Nick Hayward Comp 336/436 - Markup Languages Fall Semester 2017 - Week 4 Dr Nick Hayward XML - recap first version of XML became a W3C Recommendation in 1998 a useful format for data storage and exchange config files,

More information

Chapter 1: Getting Started. You will learn:

Chapter 1: Getting Started. You will learn: Chapter 1: Getting Started SGML and SGML document components. What XML is. XML as compared to SGML and HTML. XML format. XML specifications. XML architecture. Data structure namespaces. Data delivery,

More information

XML Motivations. Semi-structured data. Principles of Information and Database Management 198:336 Week 8 Mar 28 Matthew Stone.

XML Motivations. Semi-structured data. Principles of Information and Database Management 198:336 Week 8 Mar 28 Matthew Stone. XML Motivations Principles of Information and Database Management 198:336 Week 8 Mar 28 Matthew Stone Semi-structured data Relaxing traditional schema Storing more complex objects Standardized data Using

More information

Author: Irena Holubová Lecturer: Martin Svoboda

Author: Irena Holubová Lecturer: Martin Svoboda NPRG036 XML Technologies Lecture 1 Introduction, XML, DTD 19. 2. 2018 Author: Irena Holubová Lecturer: Martin Svoboda http://www.ksi.mff.cuni.cz/~svoboda/courses/172-nprg036/ Lecture Outline Introduction

More information

XML: Extensible Markup Language

XML: Extensible Markup Language XML: Extensible Markup Language CSC 375, Fall 2015 XML is a classic political compromise: it balances the needs of man and machine by being equally unreadable to both. Matthew Might Slides slightly modified

More information

What is XHTML? XHTML is the language used to create and organize a web page:

What is XHTML? XHTML is the language used to create and organize a web page: XHTML Basics What is XHTML? XHTML is the language used to create and organize a web page: XHTML is newer than, but built upon, the original HTML (HyperText Markup Language) platform. XHTML has stricter

More information

CSI 3140 WWW Structures, Techniques and Standards. Markup Languages: XHTML 1.0

CSI 3140 WWW Structures, Techniques and Standards. Markup Languages: XHTML 1.0 CSI 3140 WWW Structures, Techniques and Standards Markup Languages: XHTML 1.0 HTML Hello World! Document Type Declaration Document Instance Guy-Vincent Jourdan :: CSI 3140 :: based on Jeffrey C. Jackson

More information

What is XML? XML is designed to transport and store data.

What is XML? XML is designed to transport and store data. What is XML? XML stands for extensible Markup Language. XML is designed to transport and store data. HTML was designed to display data. XML is a markup language much like HTML XML was designed to carry

More information

Comp 336/436 - Markup Languages. Fall Semester Week 4. Dr Nick Hayward

Comp 336/436 - Markup Languages. Fall Semester Week 4. Dr Nick Hayward Comp 336/436 - Markup Languages Fall Semester 2018 - Week 4 Dr Nick Hayward XML - recap first version of XML became a W3C Recommendation in 1998 a useful format for data storage and exchange config files,

More information

The XML Metalanguage

The XML Metalanguage The XML Metalanguage Mika Raento mika.raento@cs.helsinki.fi University of Helsinki Department of Computer Science Mika Raento The XML Metalanguage p.1/442 2003-09-15 Preliminaries Mika Raento The XML Metalanguage

More information

Introduction Syntax and Usage XML Databases Java Tutorial XML. November 5, 2008 XML

Introduction Syntax and Usage XML Databases Java Tutorial XML. November 5, 2008 XML Introduction Syntax and Usage Databases Java Tutorial November 5, 2008 Introduction Syntax and Usage Databases Java Tutorial Outline 1 Introduction 2 Syntax and Usage Syntax Well Formed and Valid Displaying

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2017 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 4 http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid= 2465 1

More information

XHTML. XHTML stands for EXtensible HyperText Markup Language. XHTML is the next generation of HTML. XHTML is almost identical to HTML 4.

XHTML. XHTML stands for EXtensible HyperText Markup Language. XHTML is the next generation of HTML. XHTML is almost identical to HTML 4. 3 XHTML What is XHTML? XHTML stands for EXtensible HyperText Markup Language XHTML is the next generation of HTML XHTML is aimed to replace HTML XHTML is almost identical to HTML 4.01 XHTML is a stricter

More information

Announcements. Paper due this Wednesday

Announcements. Paper due this Wednesday Announcements Paper due this Wednesday 1 Client and Server Client and server are two terms frequently used Client/Server Model Client/Server model when talking about software Client/Server model when talking

More information

XML. Objectives. Duration. Audience. Pre-Requisites

XML. Objectives. Duration. Audience. Pre-Requisites XML XML - extensible Markup Language is a family of standardized data formats. XML is used for data transmission and storage. Common applications of XML include business to business transactions, web services

More information

Introduction to XML. Chapter 133

Introduction to XML. Chapter 133 Chapter 133 Introduction to XML A. Multiple choice questions: 1. Attributes in XML should be enclosed within. a. single quotes b. double quotes c. both a and b d. none of these c. both a and b 2. Which

More information

XHTML & CSS CASCADING STYLE SHEETS

XHTML & CSS CASCADING STYLE SHEETS CASCADING STYLE SHEETS What is XHTML? XHTML stands for Extensible Hypertext Markup Language XHTML is aimed to replace HTML XHTML is almost identical to HTML 4.01 XHTML is a stricter and cleaner version

More information

.. Cal Poly CPE/CSC 366: Database Modeling, Design and Implementation Alexander Dekhtyar..

.. Cal Poly CPE/CSC 366: Database Modeling, Design and Implementation Alexander Dekhtyar.. .. Cal Poly CPE/CSC 366: Database Modeling, Design and Implementation Alexander Dekhtyar.. XML in a Nutshell XML, extended Markup Language is a collection of rules for universal markup of data. Brief History

More information

extensible Markup Language (XML) Basic Concepts

extensible Markup Language (XML) Basic Concepts (XML) Basic Concepts Giuseppe Della Penna Università degli Studi di L Aquila dellapenna@univaq.it http://www.di.univaq.it/gdellape This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike

More information

CSI 3140 WWW Structures, Techniques and Standards. Representing Web Data: XML

CSI 3140 WWW Structures, Techniques and Standards. Representing Web Data: XML CSI 3140 WWW Structures, Techniques and Standards Representing Web Data: XML XML Example XML document: An XML document is one that follows certain syntax rules (most of which we followed for XHTML) Guy-Vincent

More information

Chapter 2:- Introduction to XHTML. Compiled By:- Sanjay Patel Assistant Professor, SVBIT.

Chapter 2:- Introduction to XHTML. Compiled By:- Sanjay Patel Assistant Professor, SVBIT. Chapter 2:- Introduction to XHTML Compiled By:- Assistant Professor, SVBIT. Outline Introduction to XHTML Move to XHTML Meta tags Character entities Frames and frame sets Inside Browser What is XHTML?

More information

W3C XML XML Overview

W3C XML XML Overview Overview Jaroslav Porubän 2008 References Tutorials, http://www.w3schools.com Specifications, World Wide Web Consortium, http://www.w3.org David Hunter, et al.: Beginning, 4th Edition, Wrox, 2007, 1080

More information

Overview. Introduction. Introduction XML XML. Lecture 16 Introduction to XML. Boriana Koleva Room: C54

Overview. Introduction. Introduction XML XML. Lecture 16 Introduction to XML. Boriana Koleva Room: C54 Overview Lecture 16 Introduction to XML Boriana Koleva Room: C54 Email: bnk@cs.nott.ac.uk Introduction The Syntax of XML XML Document Structure Document Type Definitions Introduction Introduction SGML

More information

11. EXTENSIBLE MARKUP LANGUAGE (XML)

11. EXTENSIBLE MARKUP LANGUAGE (XML) 11. EXTENSIBLE MARKUP LANGUAGE (XML) Introduction Extensible Markup Language is a Meta language that describes the contents of the document. So these tags can be called as self-describing data tags. XML

More information

Introduction to XML. Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University

Introduction to XML. Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University Introduction to XML Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University http://gear.kku.ac.th/~krunapon/xmlws 1 Topics p What is XML? p Why XML? p Where does XML

More information

Electronic Commerce Architecture Project LAB ONE: Introduction to XML

Electronic Commerce Architecture Project LAB ONE: Introduction to XML Electronic Commerce Architecture Project LAB ONE: Introduction to XML An XML document has two required parts. The first is the definition of what data should be in the document. The second is the document

More information

introduction to XHTML

introduction to XHTML introduction to XHTML XHTML stands for Extensible HyperText Markup Language and is based on HTML 4.0, incorporating XML. Due to this fusion the mark up language will remain compatible with existing browsers

More information

CountryData Technologies for Data Exchange. Introduction to XML

CountryData Technologies for Data Exchange. Introduction to XML CountryData Technologies for Data Exchange Introduction to XML What is XML? EXtensible Markup Language Format is similar to HTML, but XML deals with data structures, while HTML is about presentation Open

More information

SDPL : XML Basics 2. SDPL : XML Basics 1. SDPL : XML Basics 4. SDPL : XML Basics 3. SDPL : XML Basics 5

SDPL : XML Basics 2. SDPL : XML Basics 1. SDPL : XML Basics 4. SDPL : XML Basics 3. SDPL : XML Basics 5 2 Basics of XML and XML documents 2.1 XML and XML documents Survivor's Guide to XML, or XML for Computer Scientists / Dummies 2.1 XML and XML documents 2.2 Basics of XML DTDs 2.3 XML Namespaces XML 1.0

More information

Introduction to XML 3/14/12. Introduction to XML

Introduction to XML 3/14/12. Introduction to XML Introduction to XML Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University http://gear.kku.ac.th/~krunapon/xmlws 1 Topics p What is XML? p Why XML? p Where does XML

More information

Understanding the Web Design Environment. Principles of Web Design, Third Edition

Understanding the Web Design Environment. Principles of Web Design, Third Edition Understanding the Web Design Environment Principles of Web Design, Third Edition HTML: Then and Now HTML is an application of the Standard Generalized Markup Language Intended to represent simple document

More information

Implementing Web Content

Implementing Web Content Implementing Web Content Tonia M. Bartz Dr. David Robins Individual Investigation SLIS Site Redesign 6 August 2006 Appealing Web Content When writing content for a web site, it is best to think of it more

More information

XML and DTD. Mario Alviano A.Y. 2017/2018. University of Calabria, Italy

XML and DTD Mario Alviano University of Calabria, Italy A.Y. 2017/2018 Outline: 1 Introduction 2 XML syntax 3 Namespace 4 Document Type Definition (DTD) 5 Exercises

7.1 Introduction. 7.1 Introduction (continued) - Problem with using SGML: - SGML is a meta-markup language

7.1 Introduction - SGML is a meta-markup language - Developed in the early 1980s; ISO standard in 1986 - HTML was developed using SGML in the early 1990s - specifically for Web documents - Two problems with

HTML. Mohammed Alhessi M.Sc. Geomatics Engineering. Internet GIS Technologies, Faculty of Arts - Department of Geography, Geographic Information Systems

HTML Mohammed Alhessi M.Sc. Geomatics Engineering Wednesday, February 18, 2015 Eng. Mohammed Alhessi W3Schools Main Reference: http://www.w3schools.com/ What is HTML? HTML is a markup language for

Style Sheet A. Bellaachia

Style Sheet How to render the content of an XML document on a page? Two mechanisms: CSS: Cascading Style Sheets XSL (the extensible Style sheet Language) CSS Definitions: CSS: Cascading Style Sheets Simple

extensible Markup Language

XML is rapidly becoming a widespread method of creating, controlling and managing data on the Web. XML Orientation XML is a method for putting structured data in a text file.

The main Topics in this lecture are:

Lecture 15: Working with Extensible Markup Language (XML) The main Topics in this lecture are: - Brief introduction to XML - Some advantages of XML - XML Structure: elements, attributes, entities - What

Layered approach. Data

Layered approach (by T. Berners-Lee) The Semantic Web principles are implemented in the layers of Web technologies and standards semantics relational data Selfdescr. doc. Data Data Rules Ontology vocabulary

Introduction to XML Zdeněk Žabokrtský, Rudolf Rosa

NPFL092 Technology for Natural Language Processing Introduction to XML Zdeněk Žabokrtský, Rudolf Rosa November 28, 2018 Charles University in Prague Faculty of Mathematics and Physics Institute of Formal

but XML goes far beyond HTML: it describes data

The XML Meta-Language 1 Introduction to XML The father of markup languages: XML = EXtensible Markup Language is a simplified version of SGML Originally created to overcome the limitations of HTML the HTML

XML. COSC Dr. Ramon Lawrence. An attribute is a name-value pair declared inside an element. Comments. Page 3. COSC Dr.

COSC 304 Introduction to Database Systems XML Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca XML Extensible Markup Language (XML) is a markup language that allows for

Exam : Title : XML 1.1 and Related Technologies. Version : DEMO

Exam : 000-142 Title : XML 1.1 and Related Technologies Version : DEMO 1. XML data is stored and retrieved within a relational database for a data-centric application by means of mapping XML schema elements

Java EE 7: Back-end Server Application Development 4-2

XML describes data objects called XML documents that: Are composed of markup language for structuring the document data Support custom tags for data

Web Programming Paper Solution (Chapter wise)

What is valid XML document? Design an XML document for address book If in XML document All tags are properly closed All tags are properly nested They have a single root element XML document forms XML tree

XML Structures. Web Programming. Uta Priss ZELL, Ostfalia University. XML Introduction Syntax: well-formed Semantics: validity Issues

XML Structures Web Programming Uta Priss ZELL, Ostfalia University 2013 Outline XML Introduction Syntax: well-formed Semantics: validity Issues

White Paper. Welcome to Nokia's WAP 2.0 XHTML browser for small devices. Advantages of XHTML for Wireless Data

Contents: Introduction: WAP 2.0 is XHTML; XHTML Basic: Key Features and Capabilities; Well-Formed XML

- XML. - DTDs - XML Schema - XSLT. Web Services. - Well-formedness is a REQUIRED check on XML documents

Purpose of this day Introduction to XML for parliamentary documents (and all other kinds of documents, actually) Prof. Fabio Vitali University of Bologna Introduce the principal aspects of electronic management

XML. Rodrigo García Carmona Universidad San Pablo-CEU Escuela Politécnica Superior

XML INTRODUCTION THE XML LANGUAGE XML: Extensible Markup Language Standard for the presentation and transmission of information.

Birkbeck (University of London)

MSc Examination Department of Computer Science and Information Systems Internet and Web Technologies (COIY063H7) 15 Credits Date of Examination: 13 June 2017 Duration of

XML is a popular multi-language system, and XHTML depends on it. XML details languages

Many of the newer standards, including XHTML, are based on XML = Extensible Markup Language, so we will

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 27-1

Chapter 27 XML: Extensible Markup Language Chapter Outline Introduction Structured, Semistructured, and Unstructured Data. XML Hierarchical (Tree) Data Model. XML Documents, DTD, and XML Schema.

HTML and XML. XML stands for extensible Markup Language

HTML is used to mark up text so it can be displayed to users HTML describes both structure and appearance

XML Overview, part 1

Norman Gray Revision 1.4, 2002/10/30 Contents: The who, what and why; XML Syntax; Programming with XML; Other topics; The future http://www.astro.gla.ac.uk/users/norman/docs/

Question Bank XML (Solved/Unsolved) Q.1 Fill in the Blanks: (1 Mark each)

Q.1 Fill in the Blanks: (1 Mark each) 1. With XML, you can create your own elements, also called tags. 2. The beginning or first element in XML is called the root (document) element. 3. Jon Bosak is known

XML for Java Developers G22.3033-002 Session 2 - Sub-Topic 1 Beginning XML. Dr. Jean-Claude Franchitti

XML for Java Developers G22.3033-002 Session 2 - Sub-Topic 1 Beginning XML Dr. Jean-Claude Franchitti New York University Computer Science Department Courant Institute of Mathematical Sciences Objectives

Outline. XML vs. HTML and Well Formed vs. Valid. XML Overview. CSC309 Tutorial --XML 4. Edward Xia

CSC309 Tutorial XML Edward Xia November 7, 2003 Outline XML Overview XML DOCTYPE Element Declarations Attribute List Declarations Entity Declarations CDATA Stylesheet PI XML Namespaces A Complete Example

TASC Consulting Technical Writing Courseware Training

Understanding XML Aruna Panangipally TASC Consulting Technical Writing Courseware Training Session Outline Why should a technical writer know XML? The Beginning Understanding markup languages Origins of

XML Update. Royal Society of the Arts London, December 8, 1998 Jon Bosak Sun Microsystems

XML Update Royal Society of the Arts London, December 8, 1998 Jon Bosak Sun Microsystems Contents: XML Basics; The XML Concept; XML in Context; XML and Open Standards

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington

Introduction to Semistructured Data and XML Based on slides by Dan Suciu University of Washington CS330 Lecture April 8, 2003 Overview From HTML to XML DTDs Querying XML: XPath Transforming XML: XSLT

HTML vs. XML In the case of HTML, browsers have been taught how to ignore invalid HTML such as the <mymadeuptag> element and generally do their best

HTML vs. XML In the case of HTML, browsers have been taught how to ignore invalid HTML such as the <mymadeuptag> element and generally do their best when dealing with badly placed HTML elements. The

Introduction to XML. An Example XML Document. The following is a very simple XML document.

Introduction to XML Extensible Markup Language (XML) was standardized in 1998 after 2 years of work. However, it developed out of SGML (Standard Generalized Markup Language), a product of the 1970s and

PART I. Oracle and the XML Standards

CHAPTER 1 Introducing XML Oracle Database 10g XML & SQL Extensible Markup Language (XML) is a meta-markup language, meaning that the language, as specified by the

XML Applications. Prof. Andrea Omicini DEIS, Ingegneria Due Alma Mater Studiorum, Università di Bologna a Cesena

Outline XHTML XML Schema XSL & XSLT Other XML Applications XHTML HTML vs. XML HTML Presentation

~ Ian Hunneybell: DIA Revision Notes ~

XML is based on open standards, and is text-based, thereby making it accessible to all. It is extensible, thus allowing anyone to customise it for their own needs, to publish for others to use, and to

Motivation (WWW) Markup Languages (defined). 7/15/2012. CISC1600-SummerII2012-Raphael-lec2 1. Agenda

CISC 1600 Introduction to Multi-media Computing Summer Session II 2012 Instructor: J. Raphael Email Address: raphael@sci.brooklyn.cuny.edu Course Page: http://www.sci.brooklyn.cuny.edu/~raphael/cisc1600.html

PASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year

http://www.pass4test.com Exam: 000-141 Title: XML and related technologies Vendors: IBM Version: DEMO

Technology for the Rest of Us: XML. May 26, 2004 Columbus, Ohio

Ron Gilmour Science & Technology Coordinator Hodges Library, University of Tennessee at Knoxville gilmour@lib.utk.edu Presentation Materials

HTML Overview. With an emphasis on XHTML

What is HTML? Stands for HyperText Markup Language A client-side technology (i.e. runs on a user's computer) HTML has a specific set of tags that allow: the structure

CSS, Cascading Style Sheets

HTML was intended to define the content of a document This is a heading This is a paragraph This is a table element Not how they look (aka style)

Semistructured Content

On our first day Structured data: database system tagged, typed well-defined semantic interpretation Semi-structured data: tagged - (HTML?) some help with semantic interpretation

(1) I (2) S (3) P allow subscribers to connect to the (4) often provide basic services such as (5) (6)

Collection of (1) Meta-network That is, a (2) of (3) Uses a standard set of protocols Also uses standards for structuring the information transferred (1) I (2) S (3) P allow subscribers to connect

XML (Extensible Markup Language)

XML is a markup language. XML stands for extensible Markup Language. The XML standard was created by W3C to provide an easy to use and standardized way to store self describing

GRAPHIC WEB DESIGNER PROGRAM

NH128 HTML Level 1 24 Total Hours COURSE TITLE: HTML Level 1 COURSE OVERVIEW: This course introduces web designers to the nuts and bolts of HTML (HyperText Markup Language), the programming language used

Semistructured data, XML, DTDs

Introduction to Databases Manos Papagelis Thanks to Ryan Johnson, John Mylopoulos, Arnold Rosenbloom and Renee Miller for material in these slides Structured vs. unstructured

Create web pages in HTML with a text editor, following the rules of XHTML syntax and using appropriate HTML tags Create a web page that includes

CMPT 165 INTRODUCTION TO THE INTERNET AND THE WORLD WIDE WEB By Hassan S. Shavarani UNIT 2: MARKUP AND HTML IN THIS UNIT YOU WILL LEARN THE FOLLOWING Create web pages in HTML with a text editor, following

TagSoup: A SAX parser in Java for nasty, ugly HTML. John Cowan

TagSoup: A SAX parser in Java for nasty, ugly HTML John Cowan (cowan@ccil.org) Copyright This presentation is: Copyright 2002 John Cowan Licensed under the GNU General Public License ABSOLUTELY WITHOUT

Chapter 2 XML, XML Schema, XSLT, and XPath

Summary Ryan McAlister XML stands for Extensible Markup Language, meaning it uses tags to denote data much like HTML. Unlike HTML, though, it was designed to carry

Chapter 7: XML Namespaces

References: Tim Bray, Dave Hollander, Andrew Layman: Namespaces in XML. W3C Recommendation, World Wide Web Consortium, Jan 14, 1999. [http://www.w3.org/tr/1999/rec-xml-names-19990114],

Manipulating XML Trees XPath and XSLT. CS 431 February 18, 2008 Carl Lagoze Cornell University

XPath Language for addressing parts of an XML document XSLT XPointer XQuery Tree model based on DOM W3C Recommendation