COMP60411 Semi-structured Data and the Web A bit of XPath, Namespaces, and XML schema

1 COMP60411 Semi-structured Data and the Web A bit of XPath, Namespaces, and XML schema week 2 Conny Hedeler & Uli Sattler University of Manchester 1

2 If you have arrived late... welcome: I hope you have a good start and catch up quickly! Check out for all relevant information Speak to me in the break 2

3 ...plagiarism again... Work through the online course available in the CS-PGT-Welcome Community in Blackboard, possibly a second time. If you have questions, ask! Don't risk your marks. Don't risk your degree.

4 Any questions? 4

5 XML documents... There are various standards, tools, APIs, and data models for XML:
- to describe XML documents and validate XML documents against: we have seen DTDs; today: XML Schema
- to parse and manipulate XML documents programmatically: we have seen SAX (and DOM); in next week's coursework: DOM
- to transform an XML document into another XML document or into an instance of another format, e.g., HTML, Excel, relational tables (another form of manipulation)

6 Manipulation of XML documents
- XPath: for navigating through and querying XML documents
- XQuery: more expressive than XPath; uses XPath; for querying and data manipulation; Turing complete; designed to access large amounts of data and to interface with relational systems
- XSLT: similar to XQuery in that it uses XPath; designed for styling, together with XSL-FO or CSS
- Contrast this with DOM and SAX: a collection of APIs for programmatic manipulation; includes a data model and parser; used to build your own applications

7 XPath (XML Schema: more later)
- designed to navigate to/select parts of a well-formed XML document
- no transformational capabilities (unlike XQuery and XSLT)
- is a W3C standard: XPath 1.0 is a 1999 W3C standard; XPath 2.0 is a 2007 W3C standard that extends/is a superset of XPath 1.0, with a richer set of WXS (W3C XML Schema) datatypes, sequence support (sequence vs. set?), and type information from WXS validation
- allows us to select/define parts of an XML document: a sequence of nodes
- uses path expressions to navigate in XML documents and to select node lists, similar to path expressions in a traditional computer file system, e.g., rm */*/*.pdf
- provides numerous built-in functions, e.g., for string values, numeric values, date and time comparison, node and QName manipulation, sequence manipulation, Boolean values, etc.

8 XPath: Datamodel
- remember how an XML document can be seen as a node-labelled tree with element names as labels
- XPath operates on the abstract, logical structure of an XML document rather than its surface syntax, but not on the DOM tree!
- XPath uses the XQuery/XPath Data Model; there is a translation (see the XPath processing model)
- it is similar to the DOM tree, but easier, and it handles attributes differently

9 [Figure: levels of an XML document, with columns "Level", "Data unit examples", and "Information or Property required". Levels from top to bottom: cognitive; application; tree adorned with namespace and schema information; tree (complex/simple; nothing or a schema required; choice of DOM tree, Infoset, or XPath tree); token (well-formedness), e.g., <foo:name t="8">Bob; character (which encoding, e.g., UTF-8); bit. Parsing and serializing move between the levels.]

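10 XPath processing - a simplified view [Figure: input/output and generic tools. An XPath expression goes through the XPath parser, giving an XPath tree; an XML document (and optionally a schema) goes through a (schema-aware) parser, giving a standard data model, e.g., DOM or XPath; the XPath execution engine takes both and returns a node sequence.]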


12 XPath: Datamodel
The XPath DM uses the following concepts:
- nodes: element, attribute, text, namespace, processing-instruction, comment, document (root)
- atomic values: behave like nodes without children or parents, e.g., an xsd:string
- items: atomic values or nodes
Example, containing a document (root) node, element nodes, text nodes, and an attribute node:
<?xml version="1.0" encoding="iso "?>
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>

13 XPath Data Model
<?xml version="1.0" encoding="utf-8"?>
<network>
  <description name="boston">
    This is the configuration of our network in the Boston office.
  </description>
  <host name="agatha" type="server" os="linux">
    <interface name="eth0" type="ethernet">
      <arec>agatha.example.edu</arec>
      <cname>mail.example.edu</cname>
      <addr> </addr>
    </interface>
    <service>smtp</service>
    <service>pop3</service>
    <service>imap4</service>
  </host>
  <host name="gil" type="server" os="linux">
  ...

14 Comparison XPath DM and DOM datamodel
[DOM example: Document: nodeType = DOCUMENT_NODE, nodeName = #document, nodeValue = (null). Element: nodeType = ELEMENT_NODE, nodeName = mytext, nodeValue = (null), with firstChild, lastChild, attributes.]
- XPath DM and DOM DM are similar but different, most importantly regarding names and values of nodes, but also structurally
- in XPath, only attributes, elements, processing instructions, and namespace nodes have names, of the form (local part, namespace URI), whereas DOM uses pseudo-names like #document, #comment, #text
- in XPath, the value of an element or root node is the concatenation of the values of all its text node descendants, not null as it is in DOM: e.g., the XPath value of <a>A<b>B</b></a> is AB
- XPath does not have separate nodes for CDATA sections (they are merged with their surrounding text), e.g., in <N>here is some text and <![CDATA[some CDATA < >]]> </N>
- XPath has no representation of the DTD
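The "value of an element" rule can be tried out with a short sketch. It uses Python's `xml.etree.ElementTree`, which is neither the XPath data model nor DOM, so it only approximates the rule; the helper name `string_value` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

def string_value(elem):
    """Concatenate the text of all descendant text nodes, in document order."""
    return "".join(elem.itertext())

# The element from the slide: its value is the text of <a> plus the text of <b>.
a = ET.fromstring("<a>A<b>B</b></a>")
print(string_value(a))

# The CDATA section is merged into the surrounding text; no separate node remains.
n = ET.fromstring("<N>here is some text and <![CDATA[some CDATA < >]]> </N>")
print(string_value(n))
```

Note that `<N>` ends up with no child elements at all: the parser hands over the CDATA content as ordinary character data.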

15 XPath: core terms -- relation between nodes
We know trees already:
- each node has at most one parent: each node but the root node has exactly one parent; the root node has no parent
- each node has zero or more children
- ancestor is the transitive closure of parent, i.e., a node's parent, its parent, its parent, ...
- descendant is the transitive closure of child, i.e., a node's children, their children, their children, ...
When evaluating an XPath expression p, we assume that we know which document and which context we are evaluating p over; we see later how they are chosen/given.
An XPath expression evaluates to a node sequence; a node is a document/element/attribute node or an atomic value; document order is preserved among items.

16 XPath - by example: the following slides evaluate XPath expressions over the network document from slide 13.

17 XPath - abbreviated syntax by example. XPath expression: */*[2] (evaluated relative to the context node)

18 XPath - abbreviated syntax by example. XPath expression: */*[2]/*[1]/*[3] (evaluated relative to the context node)

19 XPath - abbreviated syntax: know your context node. XPath expression: */*[2] (the same expression selects different nodes for a different context node)

20 XPath - abbreviated syntax: absolute paths. XPath expression: /*/*[1] (/* selects the document element regardless of the context node; /*/*[1] selects its first child element)

21 XPath - abbreviated syntax: locally vs. globally. XPath expression: //service (all service elements anywhere in the document)

22 XPath - abbreviated syntax: locally vs. globally. XPath expression: //* (all elements in the document)

23 XPath - abbreviated syntax: attributes. XPath expression: //*[@name='agatha'] (all elements whose name attribute has the value agatha)
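Some of these expressions can be tried with a minimal Python sketch. Two assumptions to keep in mind: the standard-library `xml.etree.ElementTree` only supports a subset of XPath, and paths are evaluated relative to an element (here the `<network>` element), so `//service` is written `.//service`. The slide's document is truncated after the second `<host>` start tag and the `<addr>` content is not shown, so the sketch simply closes `gil` as an empty element and leaves `<addr>` empty.

```python
import xml.etree.ElementTree as ET

# The network document from the slides; the truncated second host is closed
# as an empty element (an assumption made only so that the text parses).
NETWORK = """<network>
  <description name="boston">
    This is the configuration of our network in the Boston office.
  </description>
  <host name="agatha" type="server" os="linux">
    <interface name="eth0" type="ethernet">
      <arec>agatha.example.edu</arec>
      <cname>mail.example.edu</cname>
      <addr></addr>
    </interface>
    <service>smtp</service>
    <service>pop3</service>
    <service>imap4</service>
  </host>
  <host name="gil" type="server" os="linux"/>
</network>"""

root = ET.fromstring(NETWORK)  # the <network> element plays the role of /*

# //service : all service elements anywhere below the root element
services = [s.text for s in root.findall(".//service")]

# //* : all elements, including the document element itself
all_elements = [e.tag for e in root.iter()]

# //*[@name='agatha'] : all elements whose name attribute equals 'agatha'
named_agatha = [e.tag for e in root.findall(".//*[@name='agatha']")]

# /*/*[1] : first child element of the document element
first_child = root.find("*")

print(services, named_agatha, first_child.tag, len(all_elements))
```

Positional steps on the wildcard, such as `*/*[2]`, are deliberately left out: ElementTree's handling of `*[n]` does not follow the XPath semantics, so a full XPath engine is the better tool for those.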

24 Find more about XPath: read up and play with examples, e.g., in 24

25 Modelling including a discussion of M1 and M1a

26 Case study 1...as easy as 1 + 1?

27 Task: Design a simple format (not M1)
Design a format for arithmetic expressions! This is our CW1 domain.
- High-level description: addition, multiplication, and subtraction over the integers
- Examples in informal notation: 4+5*7; "Twelve minus fifty-nine plus one", that sort of thing
- Request: design a reasonable XML format for this domain and provide a DTD that describes that format, i.e., we use a schema as the medium of communication
- We have choices! The first choice is the root element name. Let's say, expression

28 Work from an example!

29 Pick a root element

30 Work from an example!

31 Picking good example(s) - how?
Different principles:
- Coverage: hit all features
- Simplicity: easy to get right or get something
- Corner cases: the hard situations
- Realism: hit an actual situation
Trade off principles, e.g., coverage vs. simplicity. More examples may be better than one!
Your example(s) should:
- give you insight into the domain
- highlight benefits or drawbacks
- force (or elicit) design decisions
Ours: 2+3*(5-4)

32 Initial design thoughts
Three kinds of design issues/choices:
- Elements vs. attributes vs. text (also the type of textual content)
- Structural relationships, e.g., nesting
- Naming
Choices on one constrain the others. Suppose we decide to represent operators with elements: we cannot then use the names +, *, and -, because they aren't legal element names.
Design 1: elements for everything. Names: plus, minus, times, open_parens, close_parens, int. Structure?

33 A flat style

34 How to evaluate our design/format?
Establish evaluation axes and measures:
- Coverage: does it capture all of my domain? E.g., is there an arithmetic expression I can't encode?
- Usability: can people write/read/otherwise use your format? Is working with the format error prone?
- Naturalness/Fit/Fidelity: is your format natural? Does your format make good use of the data model?
- Evolvability: can we extend (or change) the format with minimal disruption?
Again, there may be trade-offs. It depends on the context and interests.

35 Fit? Is there structure? No...

36 Maybe indentation will help?

37 The system can't see/show structure! What is the structure of 2+3*(5-4)?

38 Exercise: draw the parse/structure trees of 2+3*(5-4) and (3+2*3) + (5-(1+2)*(2+2)), then evaluate these expressions. Do we need structure to evaluate expressions?

39 The system can't see/show structure!
Why should we care what the system formatter does? It's a warning sign:
- We cannot enforce relevant syntax constraints, e.g., parentheses must balance: for every ( there is a )
- We cannot express that to an arbitrary depth! Wait! Maybe we can?
- Cool CS fact: regular expressions can't express nesting!

40 It really doesn't work: try to write (5 - (3-2))

41 Other aspects of Fit
Queries! E.g., over (10 + 5) * (6-4):
- Get all parenthesized expressions, i.e., (10 + 5) and (6-4)
- Get the content of the first parenthesized expression, i.e., 10 + 5
- Get the first element of the first parenthesized expression, i.e., 10
We need to parse! We need a stack!
We talked about:
- DTD constraints
- SAX processing
- DOM processing: the DOM tree looks like the SAX messages. No nesting! Shallow tree!
Which of these queries can we answer in XPath?

42 SAX Processing
SAX events for the flat document:
startDocument
startElement [null] [expression] [expression]
comment [ A simple expression which touches all features. 2+3*(5-4) ] [21] [68]
startElement [null] [int] [int]
characters [2] [102] [1]
endElement [null] [int] [int]
startElement [null] [plus] [plus]
endElement [null] [plus] [plus]
startElement [null] [int] [int]
characters [3] [131] [1]
endElement [null] [int] [int]
startElement [null] [times] [times]
endElement [null] [times] [times]
startElement [null] [open_parens] [open_parens]
endElement [null] [open_parens] [open_parens]
startElement [null] [int] [int]
characters [5] [180] [1]
endElement [null] [int] [int]
startElement [null] [minus] [minus]
endElement [null] [minus] [minus]
startElement [null] [int] [int]
characters [4] [210] [1]
endElement [null] [int] [int]
startElement [null] [close_parens] [close_parens]
endElement [null] [close_parens] [close_parens]
endElement [null] [expression] [expression]
The document:
<?xml version="1.0" encoding="utf-8"?>
<expression>
  <int>2</int> <plus/> <int>3</int> <times/>
  <open_parens/> <int>5</int> <minus/> <int>4</int> <close_parens/>
</expression>
You will have to do your own matching/checking of parentheses!
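What "doing your own matching" could look like is sketched below with Python's `xml.sax` (the coursework itself uses Java SAX, so this is only an analogy). The class `ParenChecker` and the function `parens_balanced` are names invented for this sketch; the handler keeps a depth counter because the flat format gives the parser no structure to check.

```python
import xml.sax

# Flat Design 1 encoding of 2+3*(5-4), as on the slide.
FLAT = b"""<expression>
  <int>2</int><plus/><int>3</int><times/>
  <open_parens/><int>5</int><minus/><int>4</int><close_parens/>
</expression>"""

class ParenChecker(xml.sax.ContentHandler):
    """Tracks open_parens/close_parens elements with a simple counter."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.balanced = True

    def startElement(self, name, attrs):
        if name == "open_parens":
            self.depth += 1
        elif name == "close_parens":
            self.depth -= 1
            if self.depth < 0:  # a ")" with no matching "(" before it
                self.balanced = False

    def endDocument(self):
        if self.depth != 0:  # some "(" was never closed
            self.balanced = False

def parens_balanced(xml_bytes):
    handler = ParenChecker()
    xml.sax.parseString(xml_bytes, handler)
    return handler.balanced

print(parens_balanced(FLAT))
```

All three test inputs used with this sketch are well-formed XML, which is exactly the problem: well-formedness says nothing about whether the parenthesis elements match.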

43 Done with this: New design Design 1 Elements for everything Names: plus, minus, times, open_parens, close_parens, int Structure = Flat Design 2 Elements for everything Names: plus, minus, times, parens, int Structure = Nested parenthesis/make subexpressions clear! Is this better?

44 Doesn't look too different!

45 But the system understands structure!

46 Evaluating Design 2
Design 2: elements for everything. Names: plus, minus, times, parens, int. Structure = nested parens.
- DTD constraints: we can enforce correct nesting of parentheses, even with many parentheses: 2+(3*(5-4))
- SAX/DOM? Some tree structure, but still some flat analysis
- Queries?
<!DOCTYPE expression [
<!ELEMENT expression (parens | (int, (minus | times | plus), int) | (parens, (minus | times | plus), int) | (int, (minus | times | plus), parens) | (parens, (minus | times | plus), parens))>
<!ELEMENT parens (parens | (int, (minus | times | plus), int) | (parens, (minus | times | plus), int) | (int, (minus | times | plus), parens) | (parens, (minus | times | plus), parens))>
<!ELEMENT plus EMPTY>
<!ELEMENT minus EMPTY>
<!ELEMENT times EMPTY>
<!ELEMENT int (#PCDATA)>
]>
<expression>
  <int>2</int> <plus/>
  <parens>
    <int>3</int> <times/>
    <parens> <int>5</int> <minus/> <int>4</int> </parens>
  </parens>
</expression>

47 Design 2: queries are easier! (10 + 5) * (6-4) Get all parenthesized expressions = //parens

48 Design 2: queries are easier! (10 + 5) * (6-4) Get the content of the first parenthesized expression = //parens[1]/*

49 Design 2: queries are easier! (10 + 5) * (6-4) Get the first element of the first parenthesized expression = //parens[1]/*[1]
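The three queries can be replayed with a small sketch using Python's `xml.etree.ElementTree`. The Design 2 encoding of (10 + 5) * (6-4) below is written out by hand for this illustration. One caveat worth knowing: in XPath, `//parens[1]` selects every parens element that is the first parens child of its parent, which coincides with "the first parenthesized expression" here only because both parens are siblings; the sketch therefore picks the first match with Python indexing instead.

```python
import xml.etree.ElementTree as ET

# (10 + 5) * (6 - 4) in Design 2 (nested parens, flat operators).
DOC = """<expression>
  <parens><int>10</int><plus/><int>5</int></parens>
  <times/>
  <parens><int>6</int><minus/><int>4</int></parens>
</expression>"""

root = ET.fromstring(DOC)

# //parens : all parenthesized expressions, in document order
all_parens = root.findall(".//parens")

# content of the first parenthesized expression
first = all_parens[0]
content = [child.tag for child in first]

# first element of the first parenthesized expression
first_elem = first[0]

print(len(all_parens), content, first_elem.text)
```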

50 Design 2: uses data model well! Flat vs. nested parens Both Design 1 and 2 encode parenthesization! But Design 2 does it in a more XML natural way: All XML sensitive tools and languages do more with Design 2 Key features of our information are salient to those tools! Is this the best use of the model?

51 Take it further! Design 1 Elements for everything Names: plus, minus, times, open_parens, close_parens, int Structure = Flat Design 2 Elements for everything Names: plus, minus, times, parens, int Structure = Nested parens Design 3 Elements for everything Names: plus, minus, times, int Structure = Nested operators, not parens!

52 Nesting, nesting everywhere?

53 Evaluating Design 3
Design 3: elements for everything. Names: plus, minus, times, int. Structure = nested operators (no parens!).
- DTD constraints: lots!
- SAX/DOM processing: natural tree structure
- Usability: fewer elements (and concepts!)
- Queries? All our old ones are irrelevant! Our new ones are more content oriented, e.g., get all additions of subtractions
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE expression [
<!ELEMENT expression (plus | times | minus | int)>
<!ELEMENT plus ((plus | times | minus | int), (plus | times | minus | int))>
<!ELEMENT times ((plus | times | minus | int), (plus | times | minus | int))>
<!ELEMENT minus ((plus | times | minus | int), (plus | times | minus | int))>
<!ELEMENT int (#PCDATA)>
]>
<expression>
  <plus>
    <int>2</int>
    <times>
      <int>3</int>
      <minus> <int>5</int> <int>4</int> </minus>
    </times>
  </plus>
</expression>

54 Last twiddle to get to calc1
Design 3.1: elements for everything except integer values. Names: plus, minus, times, int, value. Structure = nested operators (no parens!).
- Not <int>3</int> but <int value="3"/>... why?
- Another subtle change: allow more than 2 parameters for plus and times
<plus> <int value="3"/> <int value="5"/> <int value="7"/> </plus>
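With nested operators, evaluation becomes a plain recursion over the tree. The following is a sketch in Python with `xml.etree.ElementTree`, not the coursework's SaxCalc; the function name `evaluate` is made up here, and it assumes the input already follows Design 3.1 (no validation is attempted).

```python
import operator
import xml.etree.ElementTree as ET
from functools import reduce

def evaluate(elem):
    """Recursively evaluate a Design 3.1 expression tree."""
    if elem.tag == "expression":
        (child,) = list(elem)  # exactly one child expected
        return evaluate(child)
    if elem.tag == "int":
        return int(elem.get("value"))
    args = [evaluate(child) for child in elem]
    if elem.tag == "plus":   # two or more arguments
        return sum(args)
    if elem.tag == "times":  # two or more arguments
        return reduce(operator.mul, args, 1)
    if elem.tag == "minus":  # exactly two arguments
        left, right = args
        return left - right
    raise ValueError(f"unexpected element: {elem.tag}")

# 2 + 3 * (5 - 4)
doc = ET.fromstring(
    '<expression><plus><int value="2"/><times><int value="3"/>'
    '<minus><int value="5"/><int value="4"/></minus></times></plus></expression>'
)
# The three-argument plus from the slide: 3 + 5 + 7
nary = ET.fromstring('<plus><int value="3"/><int value="5"/><int value="7"/></plus>')

print(evaluate(doc), evaluate(nary))
```

No stack or parenthesis bookkeeping is needed: the document tree already is the parse tree.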

55 Look familiar?

56 We have our language design!
Design 3.1: elements for everything except int values. Names: plus, minus, times, int, value. Structure = nested operators (no parens!); plus and times take >= 2 parameters.
We're done, right? We need to capture it, say, in a DTD. Our DTD needs evaluation too!

57 Two versions
<!ELEMENT expression (plus | times | minus | int)>
<!ELEMENT plus ((plus | times | minus | int), (plus | times | minus | int)+)>
<!ELEMENT times ((plus | times | minus | int), (plus | times | minus | int)+)>
<!ELEMENT minus ((plus | times | minus | int), (plus | times | minus | int))>
<!ELEMENT int EMPTY>
<!ATTLIST int value NMTOKEN #REQUIRED>
or
<!ENTITY % expr "(plus | times | minus | int)">
<!ELEMENT expression %expr;>
<!ELEMENT plus (%expr;, (%expr;)+)>
<!ELEMENT times (%expr;, (%expr;)+)>
<!ELEMENT minus (%expr;, %expr;)>
<!ELEMENT int EMPTY>
<!ATTLIST int value NMTOKEN #REQUIRED>
They say exactly the same thing! The latter says it better (slightly).
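That the two versions agree can be made plausible by treating the parameter entity as a textual macro and expanding it by hand. The sketch below is a deliberately naive string substitution in Python; it is not how an XML parser processes parameter entities (real processors have extra rules, e.g., about where such references may appear), and `expand_parameter_entities` is a name invented for this illustration.

```python
import re

# Matches declarations such as: <!ENTITY % expr "(plus | times | minus | int)">
ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+(\w+)\s+"([^"]*)"\s*>')

def expand_parameter_entities(dtd_text):
    """Replace every %name; by the literal text declared for that entity."""
    entities = dict(ENTITY_DECL.findall(dtd_text))
    body = ENTITY_DECL.sub("", dtd_text)  # drop the declarations themselves
    for name, replacement in entities.items():
        body = body.replace(f"%{name};", replacement)
    return body.strip()

second_version = """<!ENTITY % expr "(plus | times | minus | int)">
<!ELEMENT minus (%expr;, %expr;)>"""

first_version = "<!ELEMENT minus ((plus | times | minus | int), (plus | times | minus | int))>"

print(expand_parameter_entities(second_version) == first_version)
```

For plus and times the expansion yields an extra pair of parentheses around the repeated group, which does not change the content model.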

58 What does calc1.dtd give us? For CW1 SaxCalc:
- It documents the format
- It could be used for authoring: autocompletion, error correction
- It can (partially) generate examples
- It could be used to check input, e.g., using a validating parser
- But it doesn't capture all our constraints: integers only as values!
<plus> <int value="3"/> <int value="five"/> <int value="7"/> </plus>
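Since NMTOKEN happily accepts `five`, the integer constraint has to be checked outside the DTD. A minimal sketch of such a check in Python follows; the function name `non_integer_values` and the choice of pattern (an optional minus sign followed by ASCII digits) are assumptions of this example, not part of calc1.

```python
import re
import xml.etree.ElementTree as ET

INTEGER = re.compile(r"-?[0-9]+")  # assumed lexical form of an integer

def non_integer_values(root):
    """Return the value attributes of <int> elements that are not integers."""
    bad = []
    for elem in root.iter("int"):
        value = elem.get("value", "")
        if not INTEGER.fullmatch(value):
            bad.append(value)
    return bad

doc = ET.fromstring('<plus><int value="3"/><int value="five"/><int value="7"/></plus>')
print(non_integer_values(doc))
```

A regular expression is used rather than Python's `int()` because `int()` also accepts forms such as `1_000`, which would blur what the format allows.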

59 Remember validity? XML 1.0 definition: [Definition: An XML document is valid if it has an associated DTD and if the document complies with the constraints expressed in it.]
Two conditions (beyond being well-formed):
- a declaration (the document type declaration) to establish the association
- satisfaction of the constraints in the definition (the document type definition, i.e., the DTD)
...also known as "internally valid" (though that's a bit of a misnomer; the spec calls this "valid"). The document says what it is.

60 Is this a good notion? SaxCalc Error handling What would a user want us to do? <expression> <plus> <times> <int value="7"/> <minus> <int value="4"/> <int value="24"/> </minus> </times> <int value="7"/> </plus> </expression> <html> <expression> <plus> <times> <int value="7"></int> <minus> <int value="4"></int> <int value="24"></int> </minus> </times> <int value="7"></int> </plus> </expression> </html> <html> <expression> <plus> Multiplication is cool! <times> <int value="7"></int> <minus> <int value="4"></int> <int value="24"></int> </minus> </times> <int value="7"></int> </plus> </expression> </html>

61 External validation
- One XML doc can be valid w.r.t. many DTDs: some tighter, some looser
- Different schemas can offer different things
- External validation breaks self-describingness
[Figure: all XML docs, a DTD, another DTD]

62 External validation (2)
Internal validation can go wrong in a lot of ways:
- Missing declaration (to tell where/which DTD): the user forgot
- Missing definition (in DTD)
- Internal subsets are verbose and hard to update
- Web-based schemas have problems: we can't depend on them (the app breaks if the server dies), and they are hard on the server
External validation with DTDs:
- Poorly handled by SAX; even redirecting may be hard
- Other schema languages do better: javax.xml.validation.Validator

63 Error handling - a taster
Do we need anything? In SaxCalc, you could get errors for free:
- parseInt() failures
- Keep track of arguments (why not hard code?)
- May need to track characters()
But throwing unhandled exceptions is cool, right? You'll get a stack trace!
Separate input checking and evaluation: allow the format to evolve more or less separately, within certain constraints.
What about non-well-formed documents?

64 DTDs as a (schema) language
- Usability: syntax isn't too bad; regular expression syntax is terse, yet readable(ish); some things get awkward
- Expressivity (what can we say?): can't constrain text to be integers
- Maintainability, readability, evolvability: parameter entities are pretty weak (textual macros!)
- Computability: how hard/costly is it to validate? ...more later
Can we do better?

65 Case study 2

66 M1: An XML Syntax for Documents For this exercise, you have to design a suitable XML format, generate an example document in this format, and write a DTD that describes it. The format should be suitable to store, share, and work with documents, i.e., structured pieces of text with titles, authors, a table of content, sections, a bibliography, etc. More precisely, your format should be able to capture documents of the following kind: a document has one or more authors; for each author, we consider their name and possibly their institution...

67 Decisions you had to make for M1 Naming elements Nesting structure but not much most was constrained by M1 How much detail? for names of authors? How to deal with ToC and Bibliography? to prevent errors, e.g., ToC entries without sections Element or Attribute for content? Title Name Text Snippets...all of this is text?

68 What can DTD say about attributes?
In a DTD, we can restrict attribute values to:
- CDATA (any text, most general)
- NMTOKEN (a single name) and NMTOKENS (a list of names)
- ENTITY and ENTITIES: predefined (&lt; &amp; &gt; &quot;), user defined, external unparsed (see the my_pic example)
- ID, IDREF and IDREFS (uniqueness constraints apply across the doc)
- enumerated values (standard)
- NOTATION (basically unused)
For our documents:
- NMTOKEN/NMTOKENS for title? No! "XML is tricky!" wouldn't be a legal title!
- NMTOKEN/NMTOKENS for name? Possible.
- ID and IDREF for ToC and Bibliography
<!ENTITY my_pic SYSTEM "pics/pretty.jpg" NDATA jpeg>
<!ELEMENT pic EMPTY>
<!ATTLIST pic file ENTITY #REQUIRED>
<pic file="my_pic"/>

69 What can DTD say about elements?
- Nesting structure: sections in sections or chapters, title in document, entries in ToC and Bibliography
- Text content: parsed character data!
So, for our text content:
- snippet's text and section/chapter/... titles cannot be NMTOKENS attributes (e.g., ! isn't allowed); they are either CDATA attributes or text elements: <!ELEMENT snippet (#PCDATA)>
- names can be attributes with NMTOKENS range (think of "Elizabeth Garrett Anderson") or text elements... a matter of taste

70 Chapters, sections and ToC entries? We can use ID/IDREF to ensure that all ToC entries relate to some chapter (or section) Can we say more in DTD? 1. eg that every chapter has a ToC entry? 2. eg that order of chapters in document is the same as in ToC? 3. eg that title of section in ToC is the same as in content? Observations about our format? It contains some redundancy: 4. is redundancy good or bad? 5. could we avoid it? 6. if yes, which?

71 Case study 3 or, my brain hurts

72 M1a: An XML Syntax for DTDs
As we have seen in the lecture, DTDs do not have an XML-based syntax, which makes their manipulation by XML tools difficult. In this assignment, you will design a way of expressing the constraints of a DTD using XML. This will allow you to translate any DTD into its corresponding XML-based version. Your task is to design and write a DTD for such an XML-based version, then translate a sample DTD file into your XML-based syntax.
- DTD is a language for describing XML
- But DTD as a language is not homoiconic, that is, DTDs are not represented in the same language they represent: the meta level and the object level are distinct
- Thus, e.g., we can't use XPath to count element declarations, and we can't use DTDs to define subsets or extensions
Is homoiconicity desirable?

73 An argument the first striking and unexpected observation is that most published DTDs are not correct, with missing elements, wrong syntax or incompatible attribute declarations. This might prove that such DTDs are being used for documentation purposes only and are not meant to be used for validation at all. The reason might be that because of the ad-hoc syntax of DTDs [..], there are no standard tools to validate them. The finding: published DTDs are broken Explanation: Ad hoc syntax inhibits tools Solution: Use XML for the syntax This issue will be addressed by proposals that use XML as the syntax to describe DTDs themselves Everything You Ever Wanted to Know About DTDs, But Were Afraid to Ask

74 A modelling challenge! [Figure: all XML docs, a schema, another schema]
First, get clear on the domain:
- What are we trying to represent? Element declarations, attribute declarations (M1a said so!), comments
- What is the structure of the information?
- E.g., an example element declaration: <!ELEMENT foo EMPTY>. Covering? Simple? Corner case? Realistic?
What's a reasonable, simple first design? I trust we don't need to consider flat designs. How about a bit of CDATA, i.e., move declarations as thus:
<dtdx> <![CDATA[<!ELEMENT foo EMPTY>]]> </dtdx>

75 That was not a good design Design 1 Move the DTD into an element Design 2 Lots of elements, some attributes comment, element, choice, seq, plus, etc. Some nesting: mirror the syntax tree of element declaration regular expression in the content model attribute declaration Doesn t really change the DTD syntax: Validation is not helpful No nice querying Still need a traditional DTD (DTD/trad) syntax parser

76 A sub-optimal example
In the sketch below, how would we express the constraint "every element name that is referred to in a content description is declared"?
<!ELEMENT expression (plus | times | minus | int)>
<element name="expression">
  <choice> <plus/> <times/> <minus/> <int/> </choice>
</element>
<!ELEMENT plus ((plus | times | minus | int), (plus | times | minus | int)+)>
<element name="plus">
  <seq>
    <choice> <plus/> <times/> <minus/> <int/> </choice>
    <choice> <plus/> <times/> <minus/> <int/> </choice>
  </seq>
</element>

77 Why do attributes help here? We know: DTDs can constrain attributes better than they can constrain elements, in two ways: (weak) type constraints, and uniqueness via ID/IDREF. Both are helpful for representing DTD syntax!

78 It's verbose - here is our calc1.xml
<!DOCTYPE dtdx SYSTEM "dtdx1.dtd">
<dtdx>
  <comment>I omit most comments for brevity in this example.</comment>
  <element name="expression">
    <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
  </element>
  <element name="plus">
    <seq>
      <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
      <plus>
        <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
      </plus>
    </seq>
  </element>
  <element name="times">
    <seq>
      <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
      <plus>
        <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
      </plus>
    </seq>
  </element>
  <element name="minus">
    <seq>
      <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
      <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
    </seq>
  </element>
  <element name="int"><empty/></element>
  <attlist on="int">
    <attdef name="value"> <tokenized type="nmtoken"/> <required/> </attdef>
  </attlist>
</dtdx>
Therefore all the attributes?

79 Examples
Key constraints:
- Every element can have at most one declaration: <!ATTLIST element name ID #REQUIRED>
- Every ref must have a corresponding def: <!ATTLIST ref to IDREF #REQUIRED>
- Every attlist ref must have a corresponding def: <!ATTLIST attlist on IDREF #REQUIRED>
Type constraints:
- ID and IDREF values must be XML names (and so are also valid NMTOKENs). Convenient!
- Tokenized attribute values have a constrained set: <!ATTLIST tokenized type (ID | IDREF | IDREFS | ENTITY | ENTITIES | NMTOKEN | NMTOKENS) #REQUIRED> (This could be done with an element.)
<element name="expression">
  <choice> <ref to="plus"/> <ref to="times"/> <ref to="minus"/> <ref to="int"/> </choice>
</element>
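What the ID/IDREF declarations buy us from a validating parser can be imitated by hand, which also shows what they check. The sketch below uses Python's `xml.etree.ElementTree` (which does not validate); `check_keys` is a name invented here, and it only looks at `element/@name` and `ref/@to`, whereas real ID uniqueness spans all ID attributes in the document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def check_keys(dtdx_root):
    """Return (names declared more than once, ref targets with no declaration)."""
    names = [e.get("name") for e in dtdx_root.iter("element")]
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    targets = {r.get("to") for r in dtdx_root.iter("ref")}
    dangling = sorted(targets - set(names))
    return duplicates, dangling

# A deliberately broken dtdx document: "int" is declared twice, "plus" never.
broken = ET.fromstring("""<dtdx>
  <element name="expression">
    <choice><ref to="plus"/><ref to="int"/></choice>
  </element>
  <element name="int"><empty/></element>
  <element name="int"><empty/></element>
</dtdx>""")

duplicates, dangling = check_keys(broken)
print(duplicates, dangling)
```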

80 Our Design Design 1 Move the DTD into an element Design 2.1 Lots of elements, attributes for type and key constraints comment, element, choice, seq, plus, etc. Some nesting Mirror the syntax tree! Use ID and IDREF as appropriate

81 Tail wagging dog One design decision is odd: Attributes for name, ref, etc. Not driven by general design considerations Driven by limitations of our schema language! Another disadvantage of DTDs! Ad hoc syntax Expressivity limitations (enough to distort modelling) Key constraints only on attribute content Type constraints only on attribute content Limited types (no integers!) Poor structured development support Parameter entities!

82 Evaluating Design 2.1 Design 2.1 Authoring a bit verbose, but good editor support DTD constraints not bad at all! SAX/DOM processing pretty good Queries? Lots of elements, attributes for type & key constraints comment, element, choice, seq, plus, etc. Some nesting Mirror the syntax tree! Use ID and IDREF as appropriate

83 Queries: assume you want to retrieve from a DTDx schema:
- Get all comments: //comment
- Get all elements with a top-level choice: //element/choice/..
- Get all elements with a choice anywhere: /*/element//choice/ancestor::element
- Get all attribute declarations on <int>: /*/attlist[@on="int"]
- How many comments? count(//comment)
Queryability: excellent!
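These queries can be approximated with Python's `xml.etree.ElementTree`. Because it supports only a subset of XPath (no `ancestor::` axis, no `count()`), two of the queries are rewritten with ordinary Python; the small DTDx document below is a cut-down, made-up variant of calc1.xml.

```python
import xml.etree.ElementTree as ET

DTDX = """<dtdx>
  <comment>A tiny schema.</comment>
  <element name="expression">
    <choice><ref to="plus"/><ref to="int"/></choice>
  </element>
  <element name="plus">
    <seq>
      <choice><ref to="plus"/><ref to="int"/></choice>
      <plus><choice><ref to="plus"/><ref to="int"/></choice></plus>
    </seq>
  </element>
  <element name="int"><empty/></element>
  <attlist on="int">
    <attdef name="value"><tokenized type="nmtoken"/><required/></attdef>
  </attlist>
</dtdx>"""

root = ET.fromstring(DTDX)

# //comment
comments = root.findall(".//comment")

# //element/choice/.. : elements whose direct child is a choice
top_level_choice = [e.get("name") for e in root.findall("./element/choice/..")]

# /*/element//choice/ancestor::element, rewritten as a filter in Python
choice_anywhere = [
    e.get("name") for e in root.findall("./element") if e.find(".//choice") is not None
]

# /*/attlist[@on="int"]
int_attlists = root.findall("./attlist[@on='int']")

# count(//comment)
comment_count = len(comments)

print(top_level_choice, choice_anywhere, len(int_attlists), comment_count)
```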

84 XML Namespaces or, making things simpler by making them much more complex

85 An observation Both calc1-bjp and dtdx-bjp have a plus element <plus> <int value="4"/> <int value="5"/> </plus> <plus> <choice>... </choice> </plus> We have an element name conflict! How do we distinguish plus[arithmetic] and plus[reg-exp]? Semantically? In a combined document?

86 Uniquing the names (1) We can add some characters <calcplus> <int value="4"/> <int value="5"/> </calcplus> <dtdxplus> <choice>... </choice> </dtdxplus> No name clash now But the meaningful part of name (plus) is hard to see calcplus isn t a real word!

87 Uniquing the names (2) We can use a separator or other convention <calc:plus> <int value="4"/> <int value="5"/> </calc:plus> <dtdx:plus> <choice>... </choice> </dtdx:plus> No name clash now The meaningful part of the name is clear The disambiguator is clear But we can get clashes! Need a registry to coordinate?

88 Uniquing the names (3) Use URIs for disambiguation < <int value="4"/> <int value="5"/> </ < <choice>... </choice> </ No name clash now The meaningful part of the name is clear The disambiguator is clear Clashes are hard to get Existing URI allocation mechanism But not well formed!

89 Uniquing the names (4) Combine (2) and (3)! <calc:plus xmlns:calc="..."> <int value="4"/> <int value="5"/> </calc:plus> <dtdx:plus xmlns:dtdx="..."> <choice>... </choice> </dtdx:plus> No name clash now The meaningful part of the name is clear The disambiguator is clear Clashes are hard to get Existing URI allocation mechanism But well formed! But the model doesn't know

90 Layered!

91 Anatomy & Terminology of Namespaces <calc:plus xmlns:calc="..."> <int value="4"/> <int value="5"/> </calc:plus> Namespace declarations, e.g., xmlns:calc="...", look like/can be treated as normal attributes Qualified names (QNames), e.g., calc:plus, consist of a Prefix, e.g., calc, and a Local name, e.g., plus Expanded names, e.g., {...}plus: they don't occur in the doc, but we can talk about them! Namespace name: the URI given in the declaration
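As a small illustration of this terminology, ElementTree reports every element under its expanded name, written {namespace name}local name. The namespace URI below is a made-up placeholder, not the one used on the slide, and split_expanded is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

CALC_NS = "http://example.org/calc"  # hypothetical namespace name, for illustration only

doc = f'<calc:plus xmlns:calc="{CALC_NS}"><int value="4"/><int value="5"/></calc:plus>'
root = ET.fromstring(doc)


def split_expanded(tag):
    """Split an ElementTree tag '{ns}local' into (namespace name, local name)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag  # element is in no namespace


print(root.tag)                     # expanded name of calc:plus
print(split_expanded(root.tag))     # (namespace name, local name)
print(split_expanded(root[0].tag))  # the unprefixed int child is in no namespace
```

Note that the prefix calc itself is not part of the expanded name: it is only a document-local abbreviation for the namespace name.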

92 We don't need a prefix <plus xmlns="..."> <int value="4"/> <int value="5"/> </plus> <calc:plus xmlns:calc="..."> <int value="4"/> <int value="5"/> </calc:plus> We can have default namespaces Terser/Less cluttered Retro-fit legacy documents Safer for non-namespace aware processors But trickiness! What's the expanded name of int in each document? {...}int vs. {}int Default namespaces and attributes interact weirdly...
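The question about int can be checked mechanically. In this sketch the namespace URI is again a hypothetical stand-in; the point is only that a default namespace is inherited by unprefixed child elements, while a prefixed declaration is not, and that unprefixed attributes never pick up the default namespace.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/calc"  # hypothetical namespace name

default_doc = f'<plus xmlns="{NS}"><int value="4"/><int value="5"/></plus>'
prefixed_doc = f'<calc:plus xmlns:calc="{NS}"><int value="4"/><int value="5"/></calc:plus>'

default_int = ET.fromstring(default_doc)[0]
prefixed_int = ET.fromstring(prefixed_doc)[0]

print(default_int.tag)     # int is in the default namespace
print(prefixed_int.tag)    # int is in no namespace
print(default_int.attrib)  # the value attribute stays unqualified
```

So the two documents are not namespace-equivalent, even though they look like mere spelling variants of each other.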

93 Multiple namespaces We can have multiple declarations Each declaration has a scope The scope of a declaration is: the element where the declaration appears together with the descendants of that element......except those descendants which have a conflicting declaration (and their descendants, etc.), i.e., a declaration with the same prefix Scopes nest and shadow Deeper nested declarations redefine/overwrite outer declarations <plus xmlns="..." xmlns:n="..."> <n:int value="4"/> <n:int value="5"/> </plus> <plus xmlns="..."> <int xmlns="..." value="4"/> <int value="5"/> </plus>

94 Some clicker tests... <a:expression xmlns="foo1" xmlns:a="foo2" xmlns:b="bah"> <b:plus xmlns:a="foobah"> <int value="3"/> <a:int value="3"/> </b:plus> </a:expression>
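One way to check an answer to this clicker question is to let a namespace-aware parser resolve the scopes. The sketch below feeds the document from the slide to ElementTree and lists the expanded name of each element in document order.

```python
import xml.etree.ElementTree as ET

DOC = """<a:expression xmlns="foo1" xmlns:a="foo2" xmlns:b="bah">
  <b:plus xmlns:a="foobah">
    <int value="3"/>
    <a:int value="3"/>
  </b:plus>
</a:expression>"""

root = ET.fromstring(DOC)

# iter() walks the tree in document order, starting with the root itself.
expanded = [elem.tag for elem in root.iter()]
for name in expanded:
    print(name)
```

The inner declaration of prefix a shadows the outer one only inside b:plus, so the two int elements end up in different namespaces.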

95 Much more about NS in our future Issues: Namespaces are increasingly controversial Modelling principles Schema language support Speaking of which...

96 DTDs and Namespaces Another expressivity limitation DTDs were designed before namespaces and were never updated DTDs only understand XML without namespaces Namespaces are syntactically layered So we can do something with them But not with full generality General schema principle If two documents are the same according to the data model, (equivalent) then: if one is valid wrt a schema so should the other Not true for namespace equivalent documents and DTDs

97 An example A simple document with a namespace <foo xmlns="..."> <bar/> </foo> We can write a DTD for it <!DOCTYPE foo [ <!ELEMENT foo (bar)> <!ATTLIST foo xmlns CDATA #FIXED "..."> <!ELEMENT bar EMPTY> ]> <foo xmlns="..."> <bar/> </foo> This document is valid!

98 An example (cont) 3 simple documents with a namespace (a) <foo xmlns="..."> <bar/> </foo> (b) <ex1:foo xmlns:ex1="..."> <ex1:bar/> </ex1:foo> (c) <ex2:foo xmlns:ex2="..."> <ex2:bar/> </ex2:foo> We can compare with another document (a)-(c) are, from a namespace point of view, the same: they simply use different prefixes! But (b) & (c) are not valid wrt our DTD (whereas (a) is): <!ELEMENT foo (bar)> <!ATTLIST foo xmlns CDATA #FIXED "..."> <!ELEMENT bar EMPTY>
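The mismatch can be made visible without a DTD validator. The sketch below (with a hypothetical namespace URI and a hypothetical helper, names) parses the three variants with xml.dom.minidom and compares what a DTD matches on, the qualified names, with what the namespace-aware data model sees, the expanded names.

```python
from xml.dom import minidom

NS = "http://example.org/ns"  # hypothetical namespace name

docs = {
    "a": f'<foo xmlns="{NS}"><bar/></foo>',
    "b": f'<ex1:foo xmlns:ex1="{NS}"><ex1:bar/></ex1:foo>',
    "c": f'<ex2:foo xmlns:ex2="{NS}"><ex2:bar/></ex2:foo>',
}


def names(xml_text):
    """Return (qualified names, expanded names) of the root and its element children."""
    root = minidom.parseString(xml_text).documentElement
    elems = [root] + [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
    qnames = [e.tagName for e in elems]
    expanded = [(e.namespaceURI, e.localName) for e in elems]
    return qnames, expanded


results = {key: names(text) for key, text in docs.items()}
for key, (qnames, expanded) in results.items():
    print(key, qnames, expanded)
```

All three documents share the same expanded names, but only (a) uses the literal names foo and bar that the DTD declares.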

99 Problems with DTDs Ad hoc, non-XML-based syntax Expressivity limitations (enough to distort modelling) Key/uniqueness constraints only on attribute content Type constraints only on attribute content and only limited types: no integers, date, string,... No namespace support... Poor structured development support Parameter entities! Poor support from tools and APIs External validation tricky in SAX Some security issues

100 Security aside Risk for consumers: Entity expansion attacks, e.g., Billion laughs: exponential entity expansion Disappearing DTDs: network failure or removal Risk for publishers: Accidental DoS: "the RSS 0.91 DTD is retrieved over 4 million times per day. That's nuts. Burdening a single third party like that for something as useless as a DTD makes no sense."
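To see why entity expansion is called exponential, here is a back-of-the-envelope calculation. It assumes the commonly described shape of the attack (a base entity with the text "lol" and nine further entities, each referencing the previous one ten times); the function name and parameters are illustrative, and nothing is actually parsed.

```python
def expanded_length(base_text: str, levels: int, fanout: int) -> int:
    """Characters produced when the top-level entity is fully expanded.

    Each of the `levels` entities refers `fanout` times to the entity below it,
    so the base text is repeated fanout ** levels times.
    """
    return len(base_text) * fanout ** levels


# Nine levels with ten references each: one billion copies of "lol".
print(expanded_length("lol", levels=9, fanout=10))
```

The declarations themselves fit in well under a kilobyte, which is what makes the attack cheap for the sender and expensive for a naive consumer.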

101 Should you use DTDs? Probably not, but The syntax is reasonable for authoring and reading, prototyping There are a fair number of DTDs out there A common denominator Namespace lack hurts a lot As we'll see They are a good starting point

102 XML Schema - another schema language for XML

103 Schema languages for XML, e.g., DTDs, provide means to define the legal structure of an XML document [Figure labels: cartoon.dtd, cartoon-uli.dtd, all XML docs]

104 Schema languages for XML provide means to define the legal structure of an XML document <?xml version="1.0" encoding="UTF-8"?> <!ELEMENT cartoon (prolog, panels)> <!ATTLIST cartoon copyright CDATA #REQUIRED> <!ATTLIST cartoon year CDATA #REQUIRED> <!ELEMENT prolog (series, author, characters)> <!ELEMENT series (#PCDATA)> <!ELEMENT author (#PCDATA)> <!ELEMENT characters (character)*>... cartoon.dtd, a DTD for cartoon descriptions... <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE cartoon SYSTEM "cartoon.dtd"> <cartoon copyright="United Feature Syndicate" year="2000"> <prolog> <series>Dilbert</series> <author>Scott Adams</author> <characters> <character>The Pointy-Haired Boss</character> <character>Dilbert</character> </characters>... <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE cartoon SYSTEM "cartoon.dtd"> <cartoon copyright="Bill Watterson" year="1994"> <prolog> <series>Calvin and Hobbes</series> <author>Bill Watterson</author> <characters> <character>Calvin</character> <character>Hobbes</character> <character>Snowman</character> </characters>

105 Schema languages for XML A variety of schema languages have been developed for XML; they vary w.r.t. their expressive power: do I have a means to express foo? how hard is it to describe foo? e.g., try to say, in a DTD, that an element must contain, in any order, an element1, an element2,..., and an element27 ease of use/understanding: how easy is it to write a schema? how easy is it to understand a schema written by somebody else? the complexity of validating a document w.r.t. a schema: how much space/time does it take to verify whether a document is valid w.r.t. a schema (in the size of document and schema)? e.g., checking this for DTDs requires only space linear in the depth of the document

106 Schema languages for XML...can I express the same constraints better in a different schema language? [Figure labels: cartoon.dtd, cartoon.xsd, all XML docs]

107 Schema languages for XML...can I express even better constraints in a different schema language? [Figure labels: cartoon.dtd, cartoon.xsd, all XML docs]

108 XML Schema XML Schema is also referred to as XML Schema Definition (XSD) is a W3C standard can be seen as the successor of DTDs: a DTD is not a well-formed XML document, whereas an XML Schema is a well-formed XML document XML Schema is mostly more expressive than DTDs (we'll talk about this at length) in contrast to DTDs, XML Schema supports namespaces, so we can combine several documents: for schema validation, universal names are used (rather than qualified names) datatypes, including simple datatypes for parsed character data and for attribute values, e.g., for dates (when was 11/10/2006?) more uniqueness/key constraints XML Schema provides more support for describing the (element and mixed) content of elements

109 XML Schema: a first example Example with DTD: <?xml version="1.0"?> <!DOCTYPE note SYSTEM "note.dtd"> <note> <to>Tove</to> <from>Jani</from> <senton> </senton> <body> Have a nice weekend! </body> </note> note.dtd: <?xml version="1.0" encoding="UTF-8"?> <!ELEMENT note (to, from, senton, body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT senton (#PCDATA)> <!ELEMENT body (#PCDATA)>

110 XML Schema: a first example <?xml version="1.0"?> <note xmlns="..." xmlns:xs="..." xmlns:xsi="..."> <to>Tove</to> <from>Jani</from> <senton> </senton> <body> Have a nice weekend! </body> </note> note.xsd: <?xml version="1.0"?> <xs:schema xmlns:xs="..." targetNamespace="..." xmlns="..." elementFormDefault="qualified"> <xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="senton" type="xs:date"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>

111 [Figure labels: Schema, XML document, Schema-aware parser, Serializer, Standard API (e.g., DOM or SAX), your application] to validate an XML document against an XML schema, we use a validating XML parser that supports XML Schema, e.g., DOM level 2, SAX2 in XML Schema, each element and type can only be declared once almost all elements can contain an element <xs:annotation>...</xs:annotation> as 1st child: useful, e.g., for <xs:simpleType name="northweststates"> <xs:annotation> <xs:documentation>States in the Pacific Northwest of US</xs:documentation> </xs:annotation> <xs:restriction base="xs:string"></xs:restriction> </xs:simpleType> XML Schema provides support for modularity & re-use through xs:import, xs:include, xs:redefine

112 XML Schema & Namespaces most XML Schemas start like this, in note.xsd: <?xml version="1.0" encoding="UTF-8"?> <xs:schema xmlns:xs="..." targetNamespace="..." xmlns="..." elementFormDefault="qualified">.. </xs:schema> xmlns:xs: XML Schema namespace, e.g., for datatypes targetNamespace: target namespace of elements defined in this schema and a document using such a schema looks like this: <?xml version="1.0"?> <note xmlns="..." xmlns:xsi="..." ...> xmlns: local (default) namespace xmlns:xsi: this document uses a schema

113 XML Schema & Namespaces in contrast to DTDs, XML Schema supports namespaces an XML Schema either has no namespace or 2 namespaces: the targetNamespace for those elements defined in the schema and the XMLSchema namespace <?xml version="1.0"?> <p:note xmlns:p="..." xmlns:xs="..." xmlns:xsi="..."> <p:to>Tove</p:to> <?xml version="1.0"?> <note xmlns="..." xmlns:xs="..." xmlns:xsi="..."> <to>Tove</to> note.xsd: <?xml version="1.0"?> <xs:schema xmlns:xs="..." targetNamespace="..." xmlns="..." elementFormDefault="qualified"> <xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/>

114 XML Schema core concepts: datatypes in the previous examples, we used 2 built-in datatypes: xs:string, xs:date many more: built-in/atomic/primitive, e.g., xs:dateTime composite/user-defined, e.g., xs:list, xs:union through restrictions/user-defined, e.g., ints <

115 XML Schema core concepts: datatypes each XML Schema datatype comes with a value space, e.g., for xs:boolean, this is {true, false}, a lexical space, e.g., for xs:boolean, this is {true, false, 1, 0}, and a lexical-to-value mapping that need be neither injective nor surjective (for boolean, it's surjective, but not injective) constraining facets that can be used in restrictions of that datatype, e.g., maxInclusive, maxExclusive, minInclusive, minExclusive for most
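The three ingredients of a datatype can be written down directly for xs:boolean. The dictionary and function names below are hypothetical; the sketch ignores whitespace handling and only shows the lexical space, the value space, and the mapping between them.

```python
# Lexical-to-value mapping for xs:boolean: keys form the lexical space,
# values are drawn from the value space.
BOOLEAN_LEXICAL_TO_VALUE = {"true": True, "false": False, "1": True, "0": False}
BOOLEAN_VALUE_SPACE = {True, False}


def parse_xs_boolean(lexical: str) -> bool:
    """Map a lexical form to its value, rejecting anything outside the lexical space."""
    if lexical not in BOOLEAN_LEXICAL_TO_VALUE:
        raise ValueError(f"not in the lexical space of xs:boolean: {lexical!r}")
    return BOOLEAN_LEXICAL_TO_VALUE[lexical]


# Surjective: every value has at least one lexical form.
is_surjective = set(BOOLEAN_LEXICAL_TO_VALUE.values()) == BOOLEAN_VALUE_SPACE
# Injective: no two lexical forms share a value (false here: "true" and "1" collide).
is_injective = len(set(BOOLEAN_LEXICAL_TO_VALUE.values())) == len(BOOLEAN_LEXICAL_TO_VALUE)

print(parse_xs_boolean("1"), is_surjective, is_injective)
```

Note that "TRUE" or "yes" are rejected: they look boolean, but they are not in the lexical space.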

116 XML Schema: types We can define types in XSD, in two ways: xs:simpleType for simple types, to be used for attribute values and elements without element child nodes and without attributes xs:complexType for complex types, to be used for elements with element content or mixed element content or text content and attributes

117 XML Schema: element type definition can be anonymous, e.g., in the definition of age or person below: <xs:element name="age"> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:minInclusive value="3"/> <xs:maxInclusive value="7"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="person"> <xs:complexType> <xs:sequence> <xs:element name="name" type="nametype"/> <xs:element name="dob" type="xs:date"/> </xs:sequence> <xs:attribute name="friend" type="xs:boolean"/> </xs:complexType> </xs:element> can be named, e.g., <xs:element name="age" type="agetype"/> <xs:simpleType name="agetype"> <xs:restriction base="xs:integer"> <xs:minInclusive value="3"/> <xs:maxInclusive value="7"/> </xs:restriction> </xs:simpleType> <xs:element name="person" type="PersonType"/> <xs:complexType name="PersonType"> <xs:sequence> <xs:element name="name" type="nametype"/> <xs:element name="dob" type="xs:date"/> </xs:sequence> <xs:attribute name="friend" type="xs:boolean"/> </xs:complexType>

118 XML Schema: atomic simple types are based on the numerous built-in datatypes that can be restricted using xs:restriction facets, e.g., enumeration, length, maxLength, minLength, maxExclusive/maxInclusive, minExclusive/minInclusive, pattern (using regular expressions close to Perl's) <xs:simpleType name="biketype"> <xs:restriction base="xs:string"> <xs:enumeration value="MTB"/> <xs:enumeration value="road"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="eightchar"> <xs:restriction base="xs:string"> <xs:length value="8"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="medstr"> <xs:restriction base="xs:string"> <xs:minLength value="5"/> <xs:maxLength value="8"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="age"> <xs:restriction base="xs:integer"> <xs:minInclusive value="0"/> <xs:maxInclusive value="120"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="simplestr"> <xs:restriction base="xs:string"> <xs:pattern value="([a-z][a-z])+"/> </xs:restriction> </xs:simpleType>
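To get a feel for what these facets accept, the restrictions can be mimicked in Python. This is an approximation rather than a schema validator: the function names are hypothetical, whitespace normalisation is ignored, and Python's re syntax is only close to XSD's. One detail worth noting is that XSD patterns must match the whole value, hence re.fullmatch.

```python
import re


def is_biketype(s: str) -> bool:
    """enumeration facet: only the listed values are allowed."""
    return s in {"MTB", "road"}


def is_medstr(s: str) -> bool:
    """minLength 5 and maxLength 8 on a string count characters."""
    return 5 <= len(s) <= 8


def is_age(s: str) -> bool:
    """xs:integer lexical form (optional sign, digits) within [0, 120]."""
    if re.fullmatch(r"[+-]?[0-9]+", s) is None:
        return False
    return 0 <= int(s) <= 120


def is_simplestr(s: str) -> bool:
    """pattern facet ([a-z][a-z])+ : an even, non-zero number of lowercase letters."""
    return re.fullmatch(r"([a-z][a-z])+", s) is not None


print(is_biketype("road"), is_medstr("hello"), is_age("121"), is_simplestr("abc"))
```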

119 XML Schema: composite simple types we can use built-in datatypes not only in restrictions, but also in compositions to construct: xs:list xs:union <xs:simpleType name="mylist"> <xs:list itemType="xs:integer"/> </xs:simpleType> <xs:simpleType name="shortlist"> <xs:restriction base="mylist"> <xs:maxLength value="8"/> </xs:restriction> </xs:simpleType> <xs:simpleType name="colourlist"> <xs:list> <xs:simpleType> <xs:restriction base="xs:string"> <xs:enumeration value="red"/> <xs:enumeration value="green"/> <xs:enumeration value="blue"/> </xs:restriction> </xs:simpleType> </xs:list> </xs:simpleType> <xs:simpleType name="colourlistordate"> <xs:union memberTypes="colourlist xs:date"/> </xs:simpleType>
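The behaviour of the list and union types above can be sketched as follows. The parser functions are hypothetical and simplified: list items are whitespace-separated, maxLength on a list counts items rather than characters, and a union tries its member types in the order they are listed. The date handling relies on date.fromisoformat and does not cover the full xs:date lexical space (e.g., time zones).

```python
import re
from datetime import date

COLOURS = {"red", "green", "blue"}


def parse_shortlist(text: str) -> list:
    """shortlist: whitespace-separated xs:integer items, at most 8 of them."""
    items = text.split()
    if len(items) > 8:
        raise ValueError("maxLength facet violated: more than 8 list items")
    if not all(re.fullmatch(r"[+-]?[0-9]+", item) for item in items):
        raise ValueError("every list item must be an integer")
    return [int(item) for item in items]


def parse_colourlist(text: str) -> list:
    """colourlist: whitespace-separated items from the enumeration."""
    items = text.split()
    if not all(item in COLOURS for item in items):
        raise ValueError("every list item must be red, green or blue")
    return items


def parse_colourlist_or_date(text: str):
    """colourlistordate: try colourlist first, then fall back to a date."""
    try:
        return parse_colourlist(text)
    except ValueError:
        return date.fromisoformat(text)


print(parse_shortlist("1 2 3"))
print(parse_colourlist_or_date("red blue"))
print(parse_colourlist_or_date("2006-10-11"))
```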

120 XML Schema: simple types can be used in element declarations, for elements without element child nodes (instead of PCDATA in DTDs), and in attribute declarations (instead of CDATA in DTDs) <xs:complexType name="persontype"> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="dob" type="xs:date"/> </xs:sequence> <xs:attribute name="friend" type="xs:boolean" default="true"/> <xs:attribute name="phone" type="xs:string"/> </xs:complexType> as in DTDs, we can specify fixed or default values

121 XML Schema: simple content Remember how easy DTDs are? xs:simpleType is for attribute values and elements without element child nodes and without attributes for elements where we cannot use xs:simpleType because of attribute declarations but that have simple (e.g., text) content only, we can use xs:simpleContent, e.g., <xs:element name="size"> <xs:complexType> <xs:simpleContent> <xs:extension base="xs:integer"> <xs:attribute name="country" type="xs:string"/> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:element>

122 XML Schema: complex types xs:complexType for elements with element content or mixed element content or text content and attributes element order enforcement constructs: sequence: order preserving all: like sequence, but not order preserving choice: choose exactly one <xs:complexType name="nametype"> <xs:sequence> <xs:element name="fname" type="xs:string"/> <xs:element name="lname" type="xs:string"/> </xs:sequence> </xs:complexType> these constructs can be combined with minOccurs and maxOccurs; by default both are 1, but they can be set to any non-negative integer (or, for maxOccurs, to unbounded), e.g., 1 fname, 0-7 mname, 1 lname: <xs:complexType name="nametype"> <xs:sequence> <xs:element name="fname" type="xs:string"/> <xs:element name="mname" type="xs:string" minOccurs="0" maxOccurs="7"/> <xs:element name="lname" type="xs:string"/> </xs:sequence> </xs:complexType>
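A sequence with occurrence bounds is essentially a regular expression over child element names. The sketch below illustrates that reading for the second nametype definition; it is not how a schema validator is implemented, the helper name is hypothetical, and it ignores namespaces and the xs:string content of the children.

```python
import re
import xml.etree.ElementTree as ET

# Content model of nametype: fname, then 0 to 7 mname, then lname.
# Each child name is followed by a comma so that names cannot run together.
NAME_MODEL = re.compile(r"fname,(mname,){0,7}lname,")


def matches_nametype(elem) -> bool:
    """Check the order and number of an element's children against the model."""
    child_names = "".join(child.tag + "," for child in elem)
    return NAME_MODEL.fullmatch(child_names) is not None


ok = ET.fromstring("<name><fname>Ada</fname><mname>K</mname><lname>Byron</lname></name>")
wrong_order = ET.fromstring("<name><lname>Byron</lname><fname>Ada</fname></name>")
too_many = ET.fromstring("<name><fname>A</fname>" + "<mname>m</mname>" * 8 + "<lname>B</lname></name>")

print(matches_nametype(ok), matches_nametype(wrong_order), matches_nametype(too_many))
```

Replacing sequence by all or choice would change the expression (any order, or exactly one alternative), not the overall idea.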

123 XML Schema: mixed content to allow for mixed content, set attribute mixed="true", e.g., <xs:complexType name="persontype" mixed="true"> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="dob" type="xs:date"/> </xs:sequence> <xs:attribute name="friend" type="xs:boolean" default="true"/> <xs:attribute name="phone" type="xs:string"/> </xs:complexType> as in DTDs, we cannot constrain where the text occurs between elements; we can only say that content can be mixed

124 XML Schema: restriction and extension we have already used xs:extension and xs:restriction both for simple types and complex types they are XML Schema's mechanisms for inheritance extension: specifying a new type X by adding declarations to those of an existing type Y; this appends X's definition to Y's, e.g., <xs:simpleType name="agetype"> <xs:restriction base="xs:integer"> <xs:minInclusive value="3"/> <xs:maxInclusive value="7"/> </xs:restriction> </xs:simpleType> <xs:complexType name="newagetype"> <xs:simpleContent> <xs:extension base="agetype"> <xs:attribute name="range" type="xs:string"/> </xs:extension> </xs:simpleContent> </xs:complexType> <xs:complexType name="persontype"> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="dob" type="xs:date"/> </xs:sequence> <xs:attribute name="friend" type="xs:boolean" default="true"/> <xs:attribute name="phone" type="xs:string"/> </xs:complexType> <xs:complexType name="longpersontype"> <xs:complexContent> <xs:extension base="persontype"> <xs:sequence> <xs:element name="address" type="xs:string"/> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType>

COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema

COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema week 2 Conny Hedeler & Uli Sattler University of Manchester 1 .plagiarism again... Work through the online course available

More information

COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema

COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema week 2 Bijan Parsia & Uli Sattler University of Manchester 1 .plagiarism again... Work through the COMP60990 Research

More information

COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema

COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema COMP60411 Semi-structured Data and the Web A bit of XPath, namespaces, and XML schema week 2 Uli Sattler University of Manchester 1 .plagiarism again... Work through the COMP609PM Plagiarism and Malpractice

More information

COMP60411 Modelling Data on the Web XPath, XML Schema, and XQuery, week 3

COMP60411 Modelling Data on the Web XPath, XML Schema, and XQuery, week 3 COMP60411 Modelling Data on the Web XPath, XML Schema, and XQuery, week 3 Bijan Parsia Uli Sattler University of Manchester 1 Week 1 coursework Mostly graded! Q1, SE1, M1 should be accessible to you CW1

More information

COMP60411 Modelling Data on the Web XPath, XML Schema, and XQuery. Week 3

COMP60411 Modelling Data on the Web XPath, XML Schema, and XQuery. Week 3 COMP60411 Modelling Data on the Web XPath, XML Schema, and XQuery Week 3 Bijan Parsia Uli Sattler University of Manchester Week 1 coursework All graded! Q1, SE1, M1, CW1 In general, Pay attention to the

More information

XML extensible Markup Language

XML extensible Markup Language extensible Markup Language Eshcar Hillel Sources: http://www.w3schools.com http://java.sun.com/webservices/jaxp/ learning/tutorial/index.html Tutorial Outline What is? syntax rules Schema Document Object

More information

Software Engineering Methods, XML extensible Markup Language. Tutorial Outline. An Example File: Note.xml XML 1

Software Engineering Methods, XML extensible Markup Language. Tutorial Outline. An Example File: Note.xml XML 1 extensible Markup Language Eshcar Hillel Sources: http://www.w3schools.com http://java.sun.com/webservices/jaxp/ learning/tutorial/index.html Tutorial Outline What is? syntax rules Schema Document Object

More information

XML (Extensible Markup Language)

XML (Extensible Markup Language) Basics of XML: What is XML? XML (Extensible Markup Language) XML stands for Extensible Markup Language XML was designed to carry data, not to display data XML tags are not predefined. You must define your

More information

Markup Languages. Lecture 4. XML Schema

Markup Languages. Lecture 4. XML Schema Markup Languages Lecture 4. XML Schema Introduction to XML Schema XML Schema is an XML-based alternative to DTD. An XML schema describes the structure of an XML document. The XML Schema language is also

More information

XML Schema. Mario Alviano A.Y. 2017/2018. University of Calabria, Italy 1 / 28

XML Schema. Mario Alviano A.Y. 2017/2018. University of Calabria, Italy 1 / 28 1 / 28 XML Schema Mario Alviano University of Calabria, Italy A.Y. 2017/2018 Outline 2 / 28 1 Introduction 2 Elements 3 Simple and complex types 4 Attributes 5 Groups and built-in 6 Import of other schemes

More information

Semantic Web. XML and XML Schema. Morteza Amini. Sharif University of Technology Fall 94-95

Semantic Web. XML and XML Schema. Morteza Amini. Sharif University of Technology Fall 94-95 ه عا ی Semantic Web XML and XML Schema Morteza Amini Sharif University of Technology Fall 94-95 Outline Markup Languages XML Building Blocks XML Applications Namespaces XML Schema 2 Outline Markup Languages

More information

Big Data 9. Data Models

Big Data 9. Data Models Ghislain Fourny Big Data 9. Data Models pinkyone / 123RF Stock Photo 1 Syntax vs. Data Models Physical view Syntax this is text. 2 Syntax vs. Data Models a Logical view

More information

COMP9321 Web Application Engineering. Extensible Markup Language (XML)

COMP9321 Web Application Engineering. Extensible Markup Language (XML) COMP9321 Web Application Engineering Extensible Markup Language (XML) Dr. Basem Suleiman Service Oriented Computing Group, CSE, UNSW Australia Semester 1, 2016, Week 4 http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2442

More information

XML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11

XML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11 !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... 7:4 @import Directive... 9:11 A Absolute Units of Length... 9:14 Addressing the First Line... 9:6 Assigning Meaning to XML Tags...

More information

Java EE 7: Back-end Server Application Development 4-2

Java EE 7: Back-end Server Application Development 4-2 Java EE 7: Back-end Server Application Development 4-2 XML describes data objects called XML documents that: Are composed of markup language for structuring the document data Support custom tags for data

More information

Last week we saw how to use the DOM parser to read an XML document. The DOM parser can also be used to create and modify nodes.

Last week we saw how to use the DOM parser to read an XML document. The DOM parser can also be used to create and modify nodes. Distributed Software Development XML Schema Chris Brooks Department of Computer Science University of San Francisco 7-2: Modifying XML programmatically Last week we saw how to use the DOM parser to read

More information

Introduction Syntax and Usage XML Databases Java Tutorial XML. November 5, 2008 XML

Introduction Syntax and Usage XML Databases Java Tutorial XML. November 5, 2008 XML Introduction Syntax and Usage Databases Java Tutorial November 5, 2008 Introduction Syntax and Usage Databases Java Tutorial Outline 1 Introduction 2 Syntax and Usage Syntax Well Formed and Valid Displaying

More information

Chapter 1: Getting Started. You will learn:

Chapter 1: Getting Started. You will learn: Chapter 1: Getting Started SGML and SGML document components. What XML is. XML as compared to SGML and HTML. XML format. XML specifications. XML architecture. Data structure namespaces. Data delivery,

More information

Marker s feedback version

Marker s feedback version Two hours Special instructions: This paper will be taken on-line and this is the paper format which will be available as a back-up UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE Semi-structured Data

More information

Modelling XML Applications (part 2)

Modelling XML Applications (part 2) Modelling XML Applications (part 2) Patryk Czarnik XML and Applications 2014/2015 Lecture 3 20.10.2014 Common design decisions Natural language Which natural language to use? It would be a nonsense not

More information

EMERGING TECHNOLOGIES. XML Documents and Schemas for XML documents

EMERGING TECHNOLOGIES. XML Documents and Schemas for XML documents EMERGING TECHNOLOGIES XML Documents and Schemas for XML documents Outline 1. Introduction 2. Structure of XML data 3. XML Document Schema 3.1. Document Type Definition (DTD) 3.2. XMLSchema 4. Data Model

More information

Session [2] Information Modeling with XSD and DTD

Session [2] Information Modeling with XSD and DTD Session [2] Information Modeling with XSD and DTD September 12, 2000 Horst Rechner Q&A from Session [1] HTML without XML See Code HDBMS vs. RDBMS What does XDR mean? XML-Data Reduced Utilized in Biztalk

More information

7.1 Introduction. extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML

7.1 Introduction. extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML 7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML is a markup language,

More information

Solution Sheet 5 XML Data Models and XQuery

Solution Sheet 5 XML Data Models and XQuery The Systems Group at ETH Zurich Big Data Fall Semester 2012 Prof. Dr. Donald Kossmann Prof. Dr. Nesime Tatbul Assistants: Martin Kaufmann Besmira Nushi 07.12.2012 Solution Sheet 5 XML Data Models and XQuery

More information

XML. Document Type Definitions XML Schema. Database Systems and Concepts, CSCI 3030U, UOIT, Course Instructor: Jarek Szlichta

XML. Document Type Definitions XML Schema. Database Systems and Concepts, CSCI 3030U, UOIT, Course Instructor: Jarek Szlichta XML Document Type Definitions XML Schema 1 XML XML stands for extensible Markup Language. XML was designed to describe data. XML has come into common use for the interchange of data over the Internet.

More information

MANAGING INFORMATION (CSCU9T4) LECTURE 2: XML STRUCTURE

MANAGING INFORMATION (CSCU9T4) LECTURE 2: XML STRUCTURE MANAGING INFORMATION (CSCU9T4) LECTURE 2: XML STRUCTURE Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE XML Elements vs. Attributes Well-formed vs. Valid XML documents Document Type Definitions (DTDs)

More information

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington Introduction to Semistructured Data and XML Based on slides by Dan Suciu University of Washington CS330 Lecture April 8, 2003 1 Overview From HTML to XML DTDs Querying XML: XPath Transforming XML: XSLT

More information

Copyright 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 7 XML

Copyright 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 7 XML Chapter 7 XML 7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML

More information

markup language carry data define your own tags self-descriptive W3C Recommendation

markup language carry data define your own tags self-descriptive W3C Recommendation XML intro What is XML? XML stands for EXtensible Markup Language XML is a markup language much like HTML XML was designed to carry data, not to display data XML tags are not predefined. You must define

More information

Bioinforma)cs Resources XML / Web Access

Bioinforma)cs Resources XML / Web Access Bioinforma)cs Resources XML / Web Access Lecture & Exercises Prof. B. Rost, Dr. L. Richter, J. Reeb Ins)tut für Informa)k I12 XML Infusion (in 10 sec) compila)on from hkp://www.w3schools.com/xml/default.asp

More information

So far, we've discussed the use of XML in creating web services. How does this work? What other things can we do with it?

So far, we've discussed the use of XML in creating web services. How does this work? What other things can we do with it? XML Page 1 XML and web services Monday, March 14, 2011 2:50 PM So far, we've discussed the use of XML in creating web services. How does this work? What other things can we do with it? XML Page 2 Where

More information

Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming Fall 2013

Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming Fall 2013 Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming Fall 2013 2 Outline Introduction XML Structure Document Type Definition (DTD) XHMTL Formatting XML CSS Formatting XSLT Transformations

More information

XML. Part II DTD (cont.) and XML Schema

XML. Part II DTD (cont.) and XML Schema XML Part II DTD (cont.) and XML Schema Attribute Declarations Declare a list of allowable attributes for each element These lists are called ATTLIST declarations Consists of 3 basic parts The ATTLIST keyword

More information

Jeff Offutt. SWE 642 Software Engineering for the World Wide Web
