XML: eXtensible Markup Language
Eshcar Hillel

Sources:
http://www.w3schools.com
http://java.sun.com/webservices/jaxp/learning/tutorial/index.html

Tutorial Outline
- What is XML?
- XML syntax rules
- XML Schema
- Document Object Model (DOM)
- Java API for XML Processing (JAXP)

An Example XML File: Note.xml

    <?xml version="1.0" encoding="iso-8859-1"?>
    <!-- Edited with XML Spy 2007 -->
    <note date="22/03/2007">
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

What is XML?
- A markup language much like HTML: content is marked up with tags
- A cross-platform, software- and hardware-independent tool for transmitting information
- XML was not designed to DO anything; it is used to structure, store and exchange data
- Uses a Document Type Definition (DTD) or an XML Schema to describe the data

The Main Differences Between XML and HTML

XML
- Describes data, focuses on what the data is
- XML tags are not predefined; you must define your own tags (this is what makes it extensible)
- Well formed:
  - Tags are properly nested
  - Every element must have a closing tag
  - Tags are case sensitive

HTML
- Displays data, focuses on how the data looks
- The tags used to mark up HTML documents are predefined
- Not necessarily well formed

XML Benefits
- A language with a simple, self-describing syntax
- Exchange data between incompatible systems
- Software- and hardware-independent
- Share data stored in plain text format
- Human-readable and computer-manipulable plain text
- Store (and retrieve) data in configuration files or in data repositories

XML Syntax Rules
- The XML declaration (first line) defines the XML version and the character encoding used in the document
- The second line is a comment: <!-- ... -->
- <note> is the root element of the document
- <to>, <from>, <heading> and <body> are the 4 child elements of the root

    <?xml version="1.0" encoding="iso-8859-1"?>
    <!-- Edited with XML Spy 2007 -->
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

XML Elements
- An XML element is everything from (and including) its start tag to (and including) its end tag
- Whatever lies between the tags is the element's content; there are several content types
- Elements are related as parents and children and so form a hierarchy. In the Note.xml example above:
  - note is the parent
  - to, from, heading and body are siblings
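
The well-formedness rules above (proper nesting, a closing tag for every element, case-sensitive tags) can be checked mechanically. The following is a minimal sketch using the JAXP DOM parser; the class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for illustration, and the input strings are shortened variants of Note.xml.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial.
class WellFormedCheck {

    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Well formed
        System.out.println(isWellFormed("<note><to>Tove</to><from>Jani</from></note>"));
        // Tags are case sensitive: <to> is not closed by </To>
        System.out.println(isWellFormed("<note><to>Tove</To></note>"));
        // Missing closing tag for <to>
        System.out.println(isWellFormed("<note><to>Tove</note>"));
        // Improper nesting of <to> and <b>
        System.out.println(isWellFormed("<note><to><b>Tove</to></b></note>"));
    }
}
```

Only the first document is accepted; the other three each violate one of the rules listed on the slide.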

XML Attributes
- Elements can have attributes in the start tag
- Attribute values must always be quoted
- Used to provide additional information about elements
- Not expandable; cannot describe structures
- Rule of thumb: use attributes for metadata and elements for the data itself
  - For example, date is data and should be an element, while an identifier such as id is metadata

    <?xml version="1.0" encoding="iso-8859-1"?>
    <!-- Edited with XML Spy 2007 -->
    <note date="22/03/2007" id="p501">
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

Empty Content
- An empty element has no content
- It may still have attributes

    <product prodid="1345"></product>

- Alternative syntax:

    <product prodid="1345" />

DTD - Document Type Definition
- Defines the legal building blocks of an XML document
- Defines the document structure with a list of legal elements
- Note.dtd defines the elements of the Note.xml document:

    <!ELEMENT note (to, from, heading, body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>

XML Schema
- Describes the structure of an XML document
- An XML-based alternative to DTDs
- An XML document is not required to have a corresponding XML Schema
- The XML Schema language is also referred to as XML Schema Definition (XSD)

XML Schema Defines:
- Elements and attributes that can appear in a document, and their data types
- The order and number of child elements
- Whether an element is empty or can include text or child elements
- Default and fixed values for elements and attributes
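
To see the Note.dtd declarations above in action, here is a minimal sketch that turns on DTD validation in the JAXP DOM parser. To keep it self-contained, the declarations are embedded as an internal DTD subset rather than loaded from a Note.dtd file; that choice, the class name DtdValidation and the method validate are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial.
class DtdValidation {

    // The Note.dtd declarations, embedded as an internal DTD subset.
    static final String DTD =
          "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, from, heading, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT from (#PCDATA)>\n"
        + "  <!ELEMENT heading (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n";

    static final String VALID = DTD
        + "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading>"
        + "<body>Don't forget me this weekend!</body></note>";

    // Same note without the required <heading> element.
    static final String MISSING_HEADING = DTD
        + "<note><to>Tove</to><from>Jani</from>"
        + "<body>Don't forget me this weekend!</body></note>";

    // Returns the validation problems found; an empty list means "valid".
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // validity errors are recoverable
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add(e.getMessage()); // fatal: not even well formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(validate(VALID));
        System.out.println(validate(MISSING_HEADING));
    }
}
```

Validity errors are reported through the error handler rather than thrown, which is why the sketch collects them in a list.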

Note.xsd

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

Each inner xs:element declaration is itself an element with empty content (it closes with "/>").

XML Schemas vs. DTDs
- XML Schemas are written in XML (XML syntax), so they are extensible
- XML Schemas support data types and namespaces
- XML Schemas are richer and more powerful:
  - Easier to describe document content
  - Easier to validate the data
  - Easier to define data formats and restrictions
- XML documents can have a reference to a DTD or to an XML Schema

XSD - The <schema> Element
- The root element of every XML Schema
- May contain some attributes
- The xmlns:xs attribute indicates the namespace of the schema elements and data types; they should be prefixed with xs:

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      ...
    </xs:schema>
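
As an illustration of how a schema such as Note.xsd above can be applied, the following sketch validates a document with the javax.xml.validation package. This package is an assumption on our part (it belongs to later JAXP versions than the tutorial linked in the sources), and the class name NoteSchemaValidation and its methods are hypothetical. The schema is kept in a string so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the tutorial.
class NoteSchemaValidation {

    static final String NOTE_XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"note\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"to\" type=\"xs:string\"/>"
        + "<xs:element name=\"from\" type=\"xs:string\"/>"
        + "<xs:element name=\"heading\" type=\"xs:string\"/>"
        + "<xs:element name=\"body\" type=\"xs:string\"/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Compiles the schema text; fails loudly if the schema itself is broken.
    static Schema compile(String xsd) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("invalid schema", e);
        }
    }

    // Returns true if the document conforms to the schema.
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the first validation error is thrown by default
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema schema = compile(NOTE_XSD);
        String inOrder = "<note><to>Tove</to><from>Jani</from>"
            + "<heading>Reminder</heading><body>Hi</body></note>";
        // <from> before <to> violates the xs:sequence order.
        String outOfOrder = "<note><from>Jani</from><to>Tove</to>"
            + "<heading>Reminder</heading><body>Hi</body></note>";
        System.out.println(isValid(schema, inOrder));
        System.out.println(isValid(schema, outOfOrder));
    }
}
```

Note that this schema declares no attributes for note, so a note carrying a date attribute would also be rejected.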

Simple Types
- A simple element contains only text
  - "Only text" is quite misleading: the text can be of many types, can carry restrictions, and can be required to match a specific pattern
- It cannot contain any other elements or attributes
- The syntax (xxx is the element name, yyy is its data type):

    <xs:element name="xxx" type="yyy"/>

Simple Elements Examples
- "Only text" covers, e.g., numbers, strings, dates, etc.
- XML elements:

    <lastname>Refsnes</lastname>
    <age>36</age>
    <dateborn>1970-03-27</dateborn>

- Simple element definitions:

    <xs:element name="lastname" type="xs:string"/>
    <xs:element name="age" type="xs:integer"/>
    <xs:element name="dateborn" type="xs:date"/>

Default and Fixed Values
- Simple elements may have a default value OR a fixed value specified
- A default value is automatically assigned to the element when no other value is specified

    <xs:element name="color" type="xs:string" default="red"/>

- A fixed value is always automatically assigned to the element; you cannot specify another value

    <xs:element name="color" type="xs:string" fixed="red"/>

Attributes
- All attributes are declared as simple types
- Simple elements cannot have attributes
- The syntax:

    <xs:attribute name="xxx" type="yyy"/>

- An XML element with an attribute:

    <lastname lang="en">Smith</lastname>

- The corresponding attribute definition:

    <xs:attribute name="lang" type="xs:string"/>

Optional and Required Attributes
- Attributes may have a default value OR a fixed value
- Attributes are optional by default
- To specify that an attribute is required, use the "use" attribute
- XML elements:

    <lastname>כהן</lastname>
    <lastname>כהן</lastname>
    <lastname lang="en">Smith</lastname>

- Attribute definitions:

    <xs:attribute name="lang" type="xs:string" use="optional" default="HB"/>
    <xs:attribute name="lang" type="xs:string" fixed="HB"/>
    <xs:attribute name="lang" type="xs:string" use="required" fixed="en"/>

Derived Simple Types: Restrictions
- Restrictions are used to define acceptable values for elements or attributes
- Numeric restriction, defined as an anonymous type:

    <xs:element name="age">
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="120"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>

- Enumeration, defined as a named type ("cartype" can be used by other elements):

    <xs:element name="car" type="cartype"/>
    <xs:simpleType name="cartype">
      <xs:restriction base="xs:string">
        <xs:enumeration value="audi"/>
        <xs:enumeration value="golf"/>
        <xs:enumeration value="bmw"/>
      </xs:restriction>
    </xs:simpleType>
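
The numeric restriction above can be exercised with a few sample values. The sketch below wraps the age definition in a complete schema and validates small one-element documents against it. It assumes the javax.xml.validation package (newer than the JAXP version covered in the sources); the class name AgeRestriction and the method isValidAge are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the tutorial.
class AgeRestriction {

    // The "age" element from the slide, wrapped in a complete schema.
    static final String AGE_XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"age\"><xs:simpleType>"
        + "<xs:restriction base=\"xs:integer\">"
        + "<xs:minInclusive value=\"0\"/>"
        + "<xs:maxInclusive value=\"120\"/>"
        + "</xs:restriction></xs:simpleType></xs:element>"
        + "</xs:schema>";

    // Returns true if <age>text</age> satisfies the restriction.
    static boolean isValidAge(String text) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(AGE_XSD)));
            String xml = "<age>" + text + "</age>";
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // value rejected by the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String value : new String[] {"36", "0", "120", "121", "-1", "abc"}) {
            System.out.println(value + " -> " + isValidAge(value));
        }
    }
}
```

The bounds are inclusive, so 0 and 120 are accepted while 121 and -1 are not; "abc" fails already at the xs:integer base type.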

Complex Types
- A complex element contains other elements and/or attributes
- There are four kinds of complex elements:
  I. Empty: no content is allowed
  II. Element only: content must include only child elements
  III. Simple: content must be of a simple type (text only)
  IV. Mixed: both element and text content are allowed
- Each of these elements may contain attributes as well

Complex Type I: Empty Elements
- An empty complex element cannot have content, only attributes
- An empty XML element:

    <product prodid="1345" />

- Its definition:

    <xs:element name="product">
      <xs:complexType>
        <xs:attribute name="prodid" type="xs:positiveInteger"/>
      </xs:complexType>
    </xs:element>

Order Indicators
- Used to define the order of the elements
- <all> specifies that the child elements can appear in any order; each child element must occur only once

    <xs:element name="person">
      <xs:complexType>
        <xs:all>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
        </xs:all>
      </xs:complexType>
    </xs:element>

- <choice> specifies that either one child element or another can occur

    <xs:element name="person">
      <xs:complexType>
        <xs:choice>
          <xs:element name="employee" type="employee"/>
          <xs:element name="member" type="member"/>
        </xs:choice>
      </xs:complexType>
    </xs:element>

Complex Type II: Elements Only
- An "elements-only" complex element contains only other elements
- A person XML element:

    <person>
      <firstname>John</firstname>
      <lastname>Smith</lastname>
    </person>

- Its definition; <xs:sequence> indicates that the children must appear in a specific order:

    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

Occurrence Indicators
- Specify the minimum and maximum number of times an element can occur
- The default value for both maxOccurs and minOccurs is 1
- To allow an unlimited number of occurrences, use maxOccurs="unbounded"
- Extending the person example:

    <xs:sequence>
      <xs:element name="full_name" type="xs:string"/>
      <xs:element name="child_name" type="xs:string" maxOccurs="10" minOccurs="0"/>
    </xs:sequence>

Complex Type III: Text-Only
- A complex text-only element can contain text and attributes
- A shoe size XML element:

    <shoesize country="france">35</shoesize>

- Its definition:

    <xs:element name="shoesize">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:integer">
            <xs:attribute name="country" type="xs:string"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>

Complex Type IV: Mixed Content
- A mixed complex type element can contain attributes, elements, and text
- The mixed attribute must be set to "true"

    <letter>
      Dear Mr.<name>John Smith</name>.
      Your order <orderid>1032</orderid>
      will be shipped on <shipdate>2001-07-13</shipdate>.
    </letter>

    <xs:element name="letter">
      <xs:complexType mixed="true">
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="orderid" type="xs:positiveInteger"/>
          <xs:element name="shipdate" type="xs:date"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

XML Document Object Model (DOM)
- Provides a standard interface for accessing and manipulating XML documents
- Presents an XML document as a tree structure
  - Elements, attributes, and text are all defined as nodes
- The DOM makes it possible to access every node in an XML document
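
Mixed content is a convenient way to see what "elements and text are nodes" means in practice. The sketch below parses the letter example (written on one line, so no extra whitespace is involved) and lists the node names of the children of letter; text nodes report the name "#text". The class name MixedContentNodes and the method childNodeNames are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the tutorial.
class MixedContentNodes {

    static final String LETTER_XML =
          "<letter>Dear Mr.<name>John Smith</name>. Your order "
        + "<orderid>1032</orderid> will be shipped on "
        + "<shipdate>2001-07-13</shipdate>.</letter>";

    // Lists the node names of the root element's direct children.
    static List<String> childNodeNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            root.normalize(); // merge adjacent text nodes, if any
            List<String> names = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                names.add(children.item(i).getNodeName());
            }
            return names;
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childNodeNames(LETTER_XML));
    }
}
```

The letter element has seven children: four text nodes interleaved with the name, orderid and shipdate elements.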

DOM Class Hierarchy

    Node
      Document
      CharacterData
        Text
        Comment
      Element
      Attr
    NodeList
    NamedNodeMap

NodeList and NamedNodeMap are collection types and do not derive from Node.

DOM Node Hierarchy Example
- A tree can be traversed without knowing its exact structure or which type of data it contains

    <bookstore>
      <book category="cooking">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
      </book>
      <book category="children">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
      </book>
      <book category="web">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
      </book>
    </bookstore>

DOM Node Tree
- Relationships between the tree nodes:
  - The top node is called the root
  - Every node except the root has exactly one parent node
  - A node can have any number of children
  - Siblings are nodes with the same parent
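
The claim that a tree can be traversed without knowing its structure can be sketched with a short recursive walk over the bookstore document. The walk uses only the generic Node interface (node type, first child, next sibling), so the same code would work for any XML document. The class name DomTreeWalk and its methods are hypothetical; the bookstore text is written without whitespace between tags to keep the output compact.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the tutorial.
class DomTreeWalk {

    static final String BOOKSTORE_XML =
          "<bookstore>"
        + "<book category=\"cooking\"><title lang=\"en\">Everyday Italian</title>"
        + "<author>Giada De Laurentiis</author><year>2005</year><price>30.00</price></book>"
        + "<book category=\"children\"><title lang=\"en\">Harry Potter</title>"
        + "<author>J K. Rowling</author><year>2005</year><price>29.99</price></book>"
        + "<book category=\"web\"><title lang=\"en\">Learning XML</title>"
        + "<author>Erik T. Ray</author><year>2003</year><price>39.95</price></book>"
        + "</bookstore>";

    // Appends one indented line per element or non-blank text node.
    static void walk(Node node, int depth, StringBuilder out) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.append(indent).append(node.getNodeName());
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.append(" @").append(attr.getNodeName())
                   .append('=').append(attr.getNodeValue());
            }
            out.append('\n');
        } else if (node.getNodeType() == Node.TEXT_NODE
                && !node.getNodeValue().trim().isEmpty()) {
            out.append(indent).append('"')
               .append(node.getNodeValue().trim()).append("\"\n");
        }
        // Visit the children: siblings share this node as their parent.
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    static String dump(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(dump(BOOKSTORE_XML));
    }
}
```

Attributes are reached through getAttributes() rather than through the child list, which reflects that attribute nodes are not children of their element in the DOM.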

Java API for XML Processing (JAXP)
- Enables applications to parse and transform XML documents
- Independent of a particular XML processing implementation
- Provides two parser types:
  - SAX parser: event driven
  - DOM document builder: constructs DOM trees by parsing XML documents

An Overview of the Packages
- javax.xml.parsers
  - Provides a common interface for different vendors' SAX and DOM parsers
  - DocumentBuilderFactory, DocumentBuilder
- org.w3c.dom
  - Defines the classes for all the DOM components
- org.xml.sax
  - Defines the basic SAX APIs
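
To contrast the event-driven SAX parser with the DOM document builder, here is a minimal sketch that records the SAX callbacks fired while a shortened note is parsed. No tree is built; the handler simply reacts to each event as it arrives. The class name SaxEcho and the method events are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial.
class SaxEcho {

    // Parses the text and returns the sequence of SAX events seen.
    static List<String> events(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver long text in several chunks;
                // the short strings used here arrive in one piece.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text " + text);
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalArgumentException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : events("<note><to>Tove</to><from>Jani</from></note>")) {
            System.out.println(event);
        }
    }
}
```

Because nothing is kept in memory beyond what the handler chooses to store, this style suits large documents that only need a single pass.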

The DOM APIs
- Provide a tree structure of nodes
- Ideal for interactive applications
- The entire object model is present in memory
- Require reading the entire document

Code Examples
- Reading XML Data into a DOM
  http://java.sun.com/webservices/jaxp/dist/1.1/docs/tutorial/dom/1_read.html
- Creating and Manipulating a DOM
  http://java.sun.com/webservices/jaxp/dist/1.1/docs/tutorial/dom/4_create.html
- Echoing an XML File with the SAX Parser
  http://java.sun.com/webservices/jaxp/dist/1.1/docs/tutorial/sax/2a_echo.html
- Handling Errors with the Nonvalidating Parser
  http://java.sun.com/webservices/jaxp/dist/1.1/docs/tutorial/sax/3_error.html
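
In the spirit of the "Creating and Manipulating a DOM" example linked above, the following sketch builds the Note.xml document in memory, edits it, and writes it back out as text. It is not the code from the linked tutorial; the class name BuildNote and its methods are hypothetical, and the javax.xml.transform identity transformer is our choice for serialization.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the tutorial.
class BuildNote {

    // Creates the Note.xml tree node by node instead of parsing text.
    static Document buildNote() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element note = doc.createElement("note");
            note.setAttribute("date", "22/03/2007");
            doc.appendChild(note);
            String[][] children = {
                {"to", "Tove"}, {"from", "Jani"}, {"heading", "Reminder"},
                {"body", "Don't forget me this weekend!"}
            };
            for (String[] child : children) {
                Element element = doc.createElement(child[0]);
                element.setTextContent(child[1]);
                note.appendChild(element);
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the tree out as XML text using an identity transformation.
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildNote();
        // Manipulate the tree: change the heading and drop the date attribute.
        Element heading = (Element) doc.getElementsByTagName("heading").item(0);
        heading.setTextContent("Urgent reminder");
        doc.getDocumentElement().removeAttribute("date");
        System.out.println(serialize(doc));
    }
}
```

Because the whole tree lives in memory, edits such as changing a text value or removing an attribute are simple method calls, which is the trade-off the DOM APIs slide describes.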