XML Schema Languages. Why a DDL for XML?


Why a DDL for XML?

For old and well-known (but good!) reasons.

As a modeling tool:
  - to describe the structure of information: entities, relationships, ...
  - to share common descriptions between actors/applications
  - to guide query formulation and application development

For error detection and safety:
  - to verify that documents comply with what the application expects
  - to make sure that the application accesses valid data
  - to enforce safe operations (e.g., don't do float arithmetic on trees!)
  - to check that compositions of operations make sense

For performance:
  - to design storage (saving space, improving clustering, etc.)
  - to process queries (algebraic laws, rewriting path expressions, etc.)

But XML Deals with New Needs

XML data created from legacy repositories:
  - Need to capture schemas from heterogeneous sources
  - Relational schemas: simple but with integrity constraints
  - Object-oriented schemas: typed references, inheritance, ...
  - Document grammars: regular expressions, mixed text and structure

XML used on the Web, for data exchange:
  - Need to remain flexible
  - Web sources: from strict schemas to well-formed documents (smoothly...)
  - Many applications use the same information: we should be able to type the same document in multiple ways

Main XML DDL Desiderata

Object-oriented:
  - Richer set of datatypes
  - Can extend or restrict a type (derive new type definitions on the basis of old ones)

Database-oriented:
  - Can define elements with null content
  - Can specify element content as being unique, and the scope of uniqueness

World Wide Web oriented:
  - Namespaces
  - Distributed schemas
  - Can create equivalent elements, e.g., the "student" element is equivalent to the "φοιτητής" element

XML Schema At a Glance

1. Describing atomic values
   - integer, string, float, date, images, etc.
2. Describing structures
   - elements: tag-coupled approach vs. tag-decoupled approach
   - attributes
3. Capturing more semantics
   - identity, references, relationships within or across documents
   - isa: notion of inheritance, ...
4. Simplifying schema reuse
   - import/export abilities
   - refinement of existing descriptions

Values in XML: Easy?

DTD says it's easy:
  - Recipe: #PCDATA = string
  - I.e.: CDATA = other strings, ... Everything is a string
  - Unfortunately: strings are not a panacea...

Database research says it's easy:
  - Recipe: take a data model with atomic types; each value is in a different type
  - I.e.: don't deal with syntax but with the data model
  - Unfortunately: XML = file = syntax

Values in XML: Many Issues...

Addressing numerous needs:
  - float, string, int, date, URI, telephone number, gif, applet, etc.

Living with XML 1.0 syntax:
  - The same lexical representation can correspond to several values:
      <artifact>
        <title>Haystacks at Chailly</title>
        <artist>Monet</artist>
        <date>1865</date><price>38000</price>
      </artifact>
  - The same value can have several lexical representations:
      <in_auction>true</in_auction>
      <in_auction>1</in_auction>
  - Binary formats (images, etc.) must be serialized in a portable way

Compatible with other standards
Compatible with internationalization: World Wide Web!

XML Schema Part 2: Datatypes

Defines 18 built-in types (basic types):
  - general purpose types
  - types for compatibility with DTDs

Relies on other existing standards whenever possible:
  - IEEE for floats
  - UCS [ISO 10646] and Unicode for internationalization
  - ISO 8601 for dates
  - bin.base64: MIME-style Base64 encoded binary BLOB
  - bin.hex: hexadecimal digits representing octets
  - uuid: hexadecimal digits representing octets, with optional embedded hyphens ("333C7BC4-460F-11D0-BC C7055A83")

Gives the ability to define new types (derived types)

Single lexical representation for many values?
  - the document is interpreted with respect to a given schema
  - if there is no schema, the value is given the type string

XML Datatypes can be...

  - Atomic (a single indivisible value) vs. Complex (a decomposable value)
  - Basic (or Primitive): not defined in terms of other datatypes
  - Derived: defined in terms of other datatypes
  - Built-in: defined in the Schema spec (may be derived)
  - User-generated: defined by individual schema designers

Datatypes can be applied to both attribute values and #PCDATA element content.

An XML Datatype has

  - A value space: can be enumerated outright, defined axiomatically, etc. ("the common notion of an integer")
  - A lexical space: different physical representations of the same value (1, 1.0, 1.00, etc.)
  - Fundamental facets: cardinality, exactness, boundedness
  - Constraining facets: what are the actual boundaries?
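The distinction between value space and lexical space can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`parse_boolean`, `parse_decimal`, `same_value`) are hypothetical, and Python's `Decimal` accepts some forms (such as exponents) that a schema decimal type would not.

```python
from decimal import Decimal

# Hypothetical lexical-to-value mapping for a boolean datatype:
# four lexical forms, two values.
BOOLEAN_LEXICAL = {"true": True, "1": True, "false": False, "0": False}


def parse_boolean(lexical: str) -> bool:
    """Map a lexical representation to a value of the boolean value space."""
    try:
        return BOOLEAN_LEXICAL[lexical.strip()]
    except KeyError:
        raise ValueError(f"not a boolean lexical form: {lexical!r}") from None


def parse_decimal(lexical: str) -> Decimal:
    """Map a lexical representation to a value of the decimal value space."""
    return Decimal(lexical.strip())


def same_value(parse, a: str, b: str) -> bool:
    """Equality is decided in the value space, not on the strings."""
    return parse(a) == parse(b)


# "true" and "1" differ as strings but denote the same boolean value.
print(same_value(parse_boolean, "true", "1"))
# 1, 1.0 and 1.00 are three lexical forms of one decimal value.
print(same_value(parse_decimal, "1", "1.00"))
# The same lexical form yields different values under different types.
print(repr(str("1865")), repr(int("1865")))
```

The last line mirrors the `<date>1865</date>` case: without a schema the content is just the string, while a schema may assign it a numeric or date type.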

Datatypes: Base Types

Base types cover essential needs:
  - classic values: string, boolean, float, double, decimal
  - temporal values: timeDuration, recurringDuration
  - binary values: binary
  - Web-related types: uriReference, QName
  - DTD types: ID, IDREF, ENTITY, NOTATION

One value for several syntaxes:
  - Each base type has a set of values (value space)
  - Values may have several lexical representations (lexical space)
  - Equality and order are defined in terms of the value space

Built-in Base Types: Examples

  Datatype            Examples                  Notes
  string              Victor Hugo
  boolean             true, false, 1, 0
  float               12, 12.00, 1.2E-2, INF    m x 2^e, where m < 2^… and … <= e <= 104
  double              12, 12.00, 1.2E-2, INF    m x 2^e, where m < 2^… and … <= e <= 970
  decimal             0, -0, 1.23, ...          Arbitrary precision
  timeDuration        P29Y2M3DT1H30M1.3S        29 years, 2 months, 3 days, 1 hour, 30 minutes, 1.3 seconds
  recurringDuration   T19:05:00                 August 29th at 7.05pm every year
  uriReference

Datatypes: Facets

Each base type has facets (read: properties).

Some facets are fundamental:
  - equality, order
  - bounded, cardinality, numeric

Some facets are constraining:
  - length, minLength, maxLength: for string, binary or lists
  - maxInclusive, maxExclusive, minInclusive, minExclusive
  - precision, scale: for decimal numbers
  - encoding: hex or base64 for binary
  - enumeration, pattern: for strings
  - duration, period: for time, dates

Datatypes: Derived Types

One can derive types by restriction of facets, or by list.

XML Schema offers predefined derived types:
  - integer, non-positive-integer, int, date, year, century, time-instant, language, etc.

      <xsd:simpleType name="integer" base="xsd:decimal">
        <scale value="0"/>
      </xsd:simpleType>

      <xsd:simpleType name="int" base="xsd:integer">
        <maxInclusive value="…"/>
        <minInclusive value="…"/>
      </xsd:simpleType>

  - IDREFS, NMTOKENS, ENTITIES, etc.

      <xsd:simpleType name="IDREFS" base="xsd:IDREF" derivedBy="xsd:list"/>
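Derivation by restriction can be pictured as taking a type and tightening its constraining facets. The following Python sketch assumes a deliberately tiny model: the class `DecimalType`, its field names, and the `percentage` type are hypothetical and only mimic the `scale`, `minInclusive` and `maxInclusive` facets named above. It does not check that a derived type really narrows its base.

```python
from dataclasses import dataclass, replace
from decimal import Decimal, InvalidOperation
from typing import Optional


@dataclass(frozen=True)
class DecimalType:
    """A decimal-based simple type described only by its constraining facets."""
    min_inclusive: Optional[Decimal] = None
    max_inclusive: Optional[Decimal] = None
    scale: Optional[int] = None  # maximum number of fraction digits in the value

    def restrict(self, **facets) -> "DecimalType":
        """Derive a new type by overriding some facets of this one."""
        return replace(self, **facets)

    def accepts(self, lexical: str) -> bool:
        """Check a lexical form against the facets, via the value space."""
        try:
            value = Decimal(lexical.strip())
        except InvalidOperation:
            return False
        if not value.is_finite():
            return False
        if self.scale is not None:
            # The value must be unchanged when rounded to `scale` digits.
            step = Decimal(1).scaleb(-self.scale)
            if value != value.quantize(step):
                return False
        if self.min_inclusive is not None and value < self.min_inclusive:
            return False
        if self.max_inclusive is not None and value > self.max_inclusive:
            return False
        return True


decimal_type = DecimalType()
integer = decimal_type.restrict(scale=0)  # "decimal with 0 scale"
percentage = integer.restrict(min_inclusive=Decimal(0), max_inclusive=Decimal(100))

print(integer.accepts("12.00"), integer.accepts("1.5"))
print(percentage.accepts("100"), percentage.accepts("101"))
```

Because the check runs on the parsed value, `12.00` is accepted as an integer: the facet constrains the value space, not the spelling.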

Built-in Derived Types: Examples

  Derived type         Notes
  normalizedString     strings not containing carriage return (#xD), line feed (#xA) or tab (#x9) characters
  token                strings not containing line feed (#xA) or tab (#x9) characters, with no leading/trailing spaces (#x20) and no internal sequences of two or more spaces
  language             any valid xml:lang value, e.g., en, fr, ...
  NMTOKENS             list of NMTOKEN, e.g., "house barn yard"
  Name                 e.g., "hello-there"
  NCName               e.g., "part" (no namespace qualifier)
  IDREFS               list of IDREF
  ENTITIES             list of ENTITY
  integer              decimal with 0 scale
  int                  integer > … and < …
  nonNegativeInteger   integer from zero to infinity
  positiveInteger      integer from one to infinity
  negativeInteger      integer from negative infinity to negative one
  nonPositiveInteger   integer from negative infinity to zero
  long                 integer > … and < …
  short                int > … and < …
  byte                 short > -128 and < …

Classifying XML Datatypes

[Figure: classification of XML datatypes into primitive built-in types, derived built-in types and user-derived XML Schema types; derived types are further split into atomic, list and union.]

Now You Can Practice...

Using a range facet:
    <xsd:simpleType name="auctionprice" base="xsd:decimal">
      <minInclusive value="10"/>
    </xsd:simpleType>

Using an enumeration facet:
    <xsd:simpleType name="artifactkind" base="xsd:string">
      <xsd:enumeration value="Painting"/>
      <xsd:enumeration value="Sculpture"/>
      ...
    </xsd:simpleType>

Using a pattern facet:
    <xsd:simpleType name="isbn" base="xsd:string">
      <xsd:pattern value="ISBN \d{10}"/>
    </xsd:simpleType>

Using a list type:
    <xsd:simpleType name="auctions" base="auctionprice" derivedBy="xsd:list"/>

etc.

Regular Expressions: Pattern Examples

  Regular expression   Example strings
  Chapter \d           Chapter 1
  a*b                  b, ab, aab, aaab, ...
  [xyz]b               xb, yb, zb
  a?b                  b, ab
  a+b                  ab, aab, aaab, ...
  [a-c]x               ax, bx, cx
  [-ac]x               -x, ax, cx
  [ac-]x               ax, cx, -x
  [^0-9]x              any non-digit char followed by x
  \dx                  any digit char followed by x
  Chapter\s\d          Chapter followed by a blank followed by a digit
  (ho){2} there        hoho there
  (ho\s){2} there      ho ho there
  .abc                 any char followed by abc
  (a|b)+x              ax, bx, aax, bbx, abx, bax, ...
  a{1,3}x              ax, aax, aaax
  a{2,}x               aax, aaax, aaaax, ...
  \w\s\w               word character (alphanumeric plus dash) followed by a space followed by a word character

XML Schema Datatypes: Overview

[Figure: overview of the XML Schema datatype hierarchy.]

Describing XML Structures

Element names:
  - with the names themselves: artifact, title, etc.
  - possibly with wildcards: ~ = any tag, !a = not a, etc.

Element children:
  - using regular expressions

Element attributes:
  - unordered attribute-value pairs

Main question: the status of types
  - does the tag determine the type?
  - tag-coupled types vs. tag-decoupled types

Coupled Types

Approach taken by DTDs:
  - two elements with the same name always have the same type
  - children = regular expression over elements

    <!ELEMENT artifact (title,artist?,price,date?,dimensions?)>
    <!ELEMENT title (#PCDATA)>
    ...
    <!ELEMENT artist (name,nationality)>
    <!ELEMENT name (first, last)>
    <!ELEMENT first (#PCDATA)>
    ...

Properties:
  - equivalent to context-free grammars
  - easy to parse: no depth look-ahead (1-unambiguity)
  - no closure under union, no local names allowed
  - cannot fully capture relational and object-oriented schemas

Decoupled Types

Approach taken by YAT, XDuce, lotos, etc.:
  - types are decoupled from element names
  - children are defined by regular expressions over types

    type Artifact = artifact [ Title, Artist?, Price, Date?, Dimensions? ]
    type Title    = title [ String ]
    type Artist   = artist [ Name, Nationality ]
    type Name     = name [ first [ String ], last [ String ] ]
    ...

  - different types can have the same tag

    type PName = name [ String ]

Properties:
  - equivalent to regular tree grammars
  - closure properties (intersection, complement, union, ...)
  - more precise types for documents and queries
  - harder to parse (might require look-ahead and backtracking)

Decoupled Types

They are simple to define:
  - basic entities: datatypes, tags, type names
  - one construct: types

    schema ::= type type_name = type ...
    type   ::= String | Boolean | ...   (* datatypes *)
             | type_name                (* type name *)
             | tag [ type ]             (* element *)
             | ~ [ type ]               (* element with wildcard *)
             | type, type               (* sequence *)
             | type | type              (* union *)
             | type*                    (* Kleene star *)
             | type?                    (* optional *)

Decoupled Types

They can easily describe mixed content:

    type Section = section [ title [ String ], Body ]
    type Body    = content [ (b [ Body ] | footnote [ String ] | Section | String)* ]

They can easily describe all well-formed documents:

    type UrScalar = (String | Boolean | Float | Double | ...)
    type UrTree   = UrScalar | ~[ UrTree* ]

Decoupled Types

They support a notion of subtyping via inclusion:

    type Body2 = content [ String, (b [ String ] | footnote [ String ] | String)*, Section* ]

    Body2 <: Body <: UrTree

  - all documents of type Body2 are also of type Body and of type UrTree

But they can be ambiguous:

    type Section2 = section [ title [ String ], Body2*, Body* ]

  - deciding between Body and Body2 can be expensive

Decoupled Types & Full XML

How do you describe attributes?

    type Book = book [ @isbn [ String ], Title, Author+, Price, Publisher, Section, Conclusion? ]

  - but attributes are unordered and without duplicates
  - they do not interact with the children of the element
  - they cannot contain complex values

How do you describe references? Like in object schemas:

    type Book      = book [ title [ String ], &Author+, &Publisher ]
    type Author    = author [ name [ first [ String ], last [ String ] ] ]
    type Publisher = publisher [ name [ String ] ]

  - but it's even harder to parse because of cycles [Beeri Milo 1999]

What about XML Schema?

  - Tries to get the expressive power of decoupled types plus the ease of parsing of coupled types
  - Plus advanced features: subtyping, constraints, ...
  - Deals with all the specifics of XML
  - XML Schema syntax is in XML!

    <xsd:element name="book">
      <xsd:complexType>
        <xsd:element name="title" type="xsd:string"/>
        <xsd:element name="author" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:element name="first" type="xsd:string"/>
            <xsd:element name="last" type="xsd:string"/>
          </xsd:complexType>
        </xsd:element>
        ...
      </xsd:complexType>
    </xsd:element>

Element & Attribute Declarations

Elements ~ associate element names to types:
  - have a name and are described by a type

    <xsd:element name="title" type="xsd:string"/>
        title [ String ]
    <xsd:element name="affiliation" type="Publisher"/>
        affiliation [ Publisher ]

Attributes ~ associate attribute names to types:
  - have a name and contain an atomic value
  - can be required or optional
  - can only appear inside elements (through complex types)

    <xsd:attribute name="price" type="xsd:string" use=…
        @price [ String ]?
    <xsd:attribute name="auctionhistory" type="auctions" …
        @auctionhistory [ Auctions ]
        type Auctions = Decimal*

Model Groups

Define content models (i.e., the type for the children of an element) ~ equivalent to regular expressions over elements.

    <xsd:sequence>
      <xsd:element name="title" type="Title"/>
      <xsd:element name="price" type="Price"/>
    </xsd:sequence>
        title[Title], price[Price]

    <xsd:choice>
      <xsd:element name="publisher" type="Publisher"/>
      <xsd:element name="editor" type="Author"/>
    </xsd:choice>
        ( publisher[Publisher] | editor[Author] )

    <xsd:sequence minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="book" type="Book"/>
    </xsd:sequence>
        book[Book]*

    <xsd:all>
      <xsd:element name="title" type="Title"/>
      <xsd:element name="price" type="Price"/>
    </xsd:all>
        (title[Title], price[Price]) | (price[Price], title[Title])

Complex Type Definitions

They can contain a content model and attribute declarations. The attribute declarations always come last, after the element declarations.

    <xsd:complexType name="Book">
      <xsd:sequence>
        <xsd:element name="title" type="xsd:string"/>
        <xsd:element name="author" maxOccurs="unbounded" type="AuthorName"/>
      </xsd:sequence>
      <xsd:attribute name="isbn" type="xsd:string"/>
    </xsd:complexType>
        type Book = @isbn [String], title [String], author [AuthorName]+

They can be empty:

    <xsd:complexType name="RefBib" content="empty">
      <xsd:attribute name="refto" type="xsd:IDREF"/>
    </xsd:complexType>
        type RefBib = @refto [ &UrTree ]

Complex Type Definitions

They can be recursive, and they can be mixed (i.e., strings + sub-elements):

    <xsd:complexType name="Body" content="mixed">
      <xsd:element name="b" type="Body" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:complexType>
        type Body = (b [ Body ] | String)*

They may allow for null content:

    <xsd:element name="AuthorName">
      <xsd:complexType>
        <xsd:element name="first" type="xsd:string"/>
        <xsd:element name="middle" type="xsd:string" nullable="true"/>
        <xsd:element name="last" type="xsd:string"/>
      </xsd:complexType>
    </xsd:element>
        first[String], middle[String?], last[String]

The Any Element/Attribute Type

The any element allows any well-formed XML:

    <xsd:element name="free-form">
      <xsd:complexType>
        <xsd:any/>
      </xsd:complexType>
    </xsd:element>

    <free-form>
      <comment source="Vassilis">This is great!</comment>
    </free-form>

The anyAttribute element allows any attribute:

    <xsd:element name="free-form">
      <xsd:complexType content="empty">
        <xsd:anyAttribute/>
      </xsd:complexType>
    </xsd:element>

    <free-form comment="this is great"/>

The UrTree Type

  - The UrTree is the source for all types which do not specify a value for the source attribute
  - It is the type of all elements which do not specify a type, for example <element name="foo"/>

    <xsd:complexType name="UrTree" content="mixed">
      <xsd:any mininclusion="0" maxinclusion="*"/>
      <xsd:anyAttribute/>
    </xsd:complexType>

Some Interactions Among Features

Local element restrictions: local elements with the same name can have different types.

    <xsd:element name="author">
      <xsd:complexType>
        <xsd:element name="name" type="AuthorName"/>
      </xsd:complexType>
    </xsd:element>
        type Author = author [ name [ AuthorName ] ]

    <xsd:element name="publisher">
      <xsd:complexType>
        <xsd:element name="name" type="xsd:string"/>
        ...
      </xsd:complexType>
    </xsd:element>
        type Publisher = publisher [ name [ String ] ]

Some Interactions Among Features

But local elements with the same name must have the same type among siblings. The following is therefore not allowed:

    <xsd:complexType name="Names">
      <xsd:element name="name" type="AuthorName"/>
      <xsd:element name="name" type="xsd:string" minOccurs="0"/>
    </xsd:complexType>
        type Names = name [ AuthorName ], name [ String ]?

When do you use the complexType element and when do you use the simpleType element?
  - Use complexType when there are elements and/or attributes
  - Use simpleType when it is a primitive type (string, integer, etc.)

To be simple or not to be simple...

    <internationalprice currency="eu">423.46</internationalprice>

  - requires a complexType defined by extension over decimals

Integrity Constraints

ICs come from the relational world.

Practical viewpoint: key and foreign key constraints
  - Book(isbn, title, price, publisher): isbn is a key for the relation Book
  - Author(authorid, first, last, affiliation): authorid and (first, last) are both keys for the relation Author
  - Wrote(isbn, authorid): isbn and authorid are foreign keys to Book and Author

Theoretical viewpoint: functional and inclusion dependencies
  - studied in depth in the literature

Many useful applications of ICs:
  - used to preserve information when mapping the ER model to the relational model
  - used for safety and verification (e.g., controlling updates)
  - used for optimization (e.g., dropping useless joins)

ID/IDREF Mechanism in DTDs

DTDs provide the ID/IDREF attribute datatypes for uniqueness: very simple ICs to model identity and references.

    <!ELEMENT book (title, author+, price, publisher, section, bibliography?)>
    <!ATTLIST book isbn ID #REQUIRED>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT publisher (name, address)>
    <!ATTLIST publisher sticker ID #REQUIRED>
    <!ELEMENT bibliography EMPTY>
    <!ATTLIST bibliography refs IDREFS #IMPLIED>

ID attributes must have distinct values:
  - they identify elements uniquely in a document
  - but: they enforce uniqueness across both publishers' stickers and books' isbns!

IDREF attributes must have values taken from ID attributes:
  - they can capture references to other elements
  - but: they allow refs to point to publishers!

Uniqueness vs Key

Key: an element which is defined to be a key must
  - always be present (minOccurs must be greater than zero)
  - be non-nullable (i.e., nullable="false")
  - be unique within an element scope

Key implies unique, but unique does not imply key.

XML Schema has much enhanced uniqueness capabilities. It enables you to:
  - define element content to be unique
  - define non-ID attributes to be unique
  - define a combination of element content and attributes to be unique
  - distinguish between unique and key
  - declare the scope of the document over which something is unique

Adding Constraints to DTDs

We can replace IDs by real keys:
  - book.isbn -> book: isbn is a key for the relation book
  - publisher.sticker -> publisher: sticker is a key for the relation publisher
  - author.authorid -> author: authorid is a key for the relation author
  - wrote.isbn, wrote.authorid -> wrote: isbn and authorid are a key for the relation wrote

We can replace IDREFs by real foreign keys:
  - biblio.refs <= book.isbn: refs is a multi-valued foreign key from biblio to book
  - wrote.isbn <= book.isbn: isbn is a foreign key from wrote to book
  - wrote.authorid <= author.authorid: authorid is a foreign key from wrote to author

Constraints in XML Schema

XML Schema can define powerful constraints using XPath expressions.

One can define keys:

    <key name="Isbn">
      <selector>books/book</selector>
      <field>@isbn</field>
    </key>

    <key name="Publisher">
      <selector>books/book/publisher</selector>
      <field>@sticker</field>
    </key>

  - the selector gives the document scope on which the constraint applies
  - the field gives the unique tag or attribute content

One can define foreign keys:

    <keyref refer="Isbn">
      <selector>books/book/biblio</selector>
      <field>@refs</field>
    </keyref>
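What a validator has to do for such key and keyref declarations can be sketched with Python's standard `xml.etree.ElementTree`. This is a simplified illustration, not a schema processor: it handles only attribute fields, uses ElementTree's limited path syntax for the selector, and the function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def _select(root, selector):
    """Evaluate a selector such as "books/book" starting above the root element."""
    context = ET.Element("context")  # synthetic parent of the root element
    context.append(root)
    return context.findall(selector)


def check_key(root, selector, field):
    """Check a key on an attribute field such as "@isbn".

    Returns (set of key values, list of error messages).
    """
    attr = field.lstrip("@")
    values, errors = set(), []
    for elem in _select(root, selector):
        value = elem.get(attr)
        if value is None:
            errors.append(f"<{elem.tag}> has no {field}")   # a key must be present
        elif value in values:
            errors.append(f"duplicate {field}={value!r}")   # and unique in scope
        else:
            values.add(value)
    return values, errors


def check_keyref(root, selector, field, key_values):
    """Check that every (possibly multi-valued) reference points to a key value."""
    attr = field.lstrip("@")
    errors = []
    for elem in _select(root, selector):
        for ref in elem.get(attr, "").split():
            if ref not in key_values:
                errors.append(f"dangling {field}={ref!r}")
    return errors


DOC = """
<books>
  <book isbn="b1"><biblio refs="b2"/></book>
  <book isbn="b2"><biblio refs="b1 b3"/></book>
</books>
"""

root = ET.fromstring(DOC)
isbns, key_errors = check_key(root, "books/book", "@isbn")
ref_errors = check_keyref(root, "books/book/biblio", "@refs", isbns)
print(key_errors)
print(ref_errors)
```

Unlike ID/IDREF, the set of key values here is scoped by the selector, so a reference to `b3` is rejected even if some unrelated element elsewhere happened to carry that identifier.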

Constraints in XML Schema

One can define unique combinations of values:

    <unique name="wroteunique">
      <selector>books/book/author/wrote</selector>
    </unique>

The key/keyref/unique elements may be placed anywhere in a schema. Where you place them determines the scope of the uniqueness:
  - <selector>books</selector>: the uniqueness is with respect to the entire books instance document
  - <selector>books/book/</selector>: the uniqueness has just the book element as its scope. Thus, over the entire books instance document there may be repeats, but within any book element the value is unique

XML Schema Constraints: Research Issues

[Fan et al 2000] show that reasoning with simple ICs on DTDs is decidable in practical cases:
  - unary keys, unary (multi-valued) foreign keys, inverse constraints
  - candidate keys and foreign keys

Many open issues:
  - is XPath too powerful for reasoning (predicates, function calls)?
  - which notion of equality is used?
  - what is the interaction between ICs and structural constraints?

Reusing Schemas

Many benefits:
  - sharing existing definitions
  - faster development

Traditional techniques for schema reuse:
  - some notion of import, and the ability to resolve name conflicts

        Import Person, Company from StdClass
        class Person  tuple(name: tuple(first: string, last: string))
        class Company tuple(name: string)

  - inheritance, based on subtyping

        class Author inherit Person      tuple(affiliation: Publisher)
        class Publisher inherit Company  tuple(address: string)

        tuple(first:string, last:string, affiliation:Publisher) <: tuple(first:string, last:string)
        tuple(name:string, address:string) <: tuple(name:string)

We need means to access schemas over the Web ~ namespaces

Using Namespaces in XML Schema

  - A given XML Schema defines a set of new names
  - The names defined in a schema are said to belong to its target namespace
  - Definitions and declarations in a schema can refer to names that belong to other namespaces; we refer to those namespaces as source namespaces
  - Each schema has one target namespace and possibly many source namespaces
  - In fact, every name in a given schema belongs to some namespace
  - The names of the namespaces can be fairly long, but they can be abbreviated with xmlns declarations in the XML Schema document

Reusing XML Schemas: Including Types

Means to include types in your schema from other schemas:
  - access and import through URIs (all must have the same namespace)
  - equivalent to creating all types directly in the containing schema
  - name conflict resolution is the user's responsibility

    Biblio.xsd (includes Book.xsd and Company.xsd):

    <schema xmlns="…" targetNamespace="…" xmlns:mybib="…">
      <include schemaLocation="…"/>
      <include schemaLocation="…"/>
    </schema>

Reusing XML Schemas: Importing Types

Means to import types in your schema from other schemas:
  - access and import through URIs
  - allows you to reference types in another namespace
  - name conflict resolution is based on namespaces

    Biblio.xsd:

    <schema xmlns="…" targetNamespace="…" xmlns:html="…" xmlns:mybib="…">
      <import namespace="…" schemaLocation="…"/>
    </schema>

Reusing XML Schemas: Extending Types

Extension allows adding new fields to a complex type ~ inheritance

    <xsd:complexType name="ContactAuthor" base="Author" derivedBy="extension">
      <xsd:element name="telephone" type="xsd:string"/>
    </xsd:complexType>

Now you can use both types, but you might need to mark the data with xsi:type attributes:

    <author xsi:type="Author">
      <name><first>Serge</first><last>Abiteboul</last></name>
      <affiliation>INRIA</affiliation>
    </author>
    <author xsi:type="ContactAuthor">
      <name><first>Vassilis</first><last>Christophides</last></name>
      <affiliation>ICS-FORTH</affiliation>
      <telephone> </telephone>
    </author>

  - you cannot export the document without its type anymore

Reusing XML Schemas: Restricting Types

Restriction narrows the scope of a type definition ~ set inclusion

    <xsd:element name="book2" base="book" derivedBy="restriction">
      <xsd:complexType>
        <xsd:element name="title" type="xsd:string"/>
        <xsd:element name="author" minOccurs="2" maxOccurs="10"/>
        ...
      </xsd:complexType>
    </xsd:element>

The spirit is to allow:
  - smaller datatypes
  - a narrowed range for sequences: t{n',m'} < t{n,m} iff n' >= n and m' <= m
  - reduced alternatives: t1 < (t1 | t2)
  - propagation of restriction: t1' < t1 implies t1' < (t1 | t2)
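The occurrence-range rule for restriction can be written down directly. In this Python sketch, which is an illustration rather than the specification's algorithm, a range is a pair `(min_occurs, max_occurs)` with `None` standing for `unbounded`; the function name `narrows` is hypothetical, and the comparison is non-strict so that an unchanged range counts as a (trivial) restriction.

```python
from typing import Optional, Tuple

# (minOccurs, maxOccurs); maxOccurs=None stands for "unbounded".
Range = Tuple[int, Optional[int]]


def narrows(derived: Range, base: Range) -> bool:
    """True if every repetition count allowed by `derived` is allowed by `base`."""
    d_min, d_max = derived
    b_min, b_max = base
    if d_min < b_min:
        return False      # derived would allow fewer occurrences than base
    if b_max is None:
        return True       # base is unbounded above: any upper bound narrows it
    return d_max is not None and d_max <= b_max


# author{1,unbounded} in book, author{2,10} in book2
print(narrows((2, 10), (1, None)))
# lowering the minimum is not a restriction
print(narrows((0, 10), (1, None)))
# neither is lifting the maximum
print(narrows((1, None), (1, 10)))
```

The first call corresponds to the `book2` example: requiring between 2 and 10 authors only removes documents that `book` would have accepted.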

Reusing XML Schemas: Equivalence Classes

Allow defining elements that can be used in place of other elements:

    <element name="contact" type="ContactAuthor" equivClass="author"/>

  - allows an element named contact to be used wherever an author element is expected

    <author>
      <name><first>Serge</first><last>Abiteboul</last></name>
      <affiliation>INRIA</affiliation>
    </author>
    <contact>
      <name><first>Vassilis</first><last>Christophides</last></name>
      <affiliation>ICS-FORTH</affiliation>
      <telephone> </telephone>
    </contact>

  - the corresponding type can be a derived type
  - of course, equivalence classes are not based on equivalence!!

Reusing XML Schemas: Restricting Derivations

Sometimes we may want to create a type and disallow all derivations of it, or disallow just extensions of it, or just restrictions of it:

    <xsd:complexType name="publication" final="#all">          this type can be neither extended nor restricted
    <xsd:complexType name="publication" final="restriction">   this type cannot be restricted
    <xsd:complexType name="publication" final="extension">     this type cannot be extended

If you define a type to be exact, then other types may derive from it. However, in instances, derived types may not be used in its stead:

    <xsd:complexType name="author" exact="extension">
      <xsd:element name="name" type="xsd:string"/>
    </xsd:complexType>

    <author xsi:type="ContactAuthor">
      <name><first>Vassilis</first><last>Christophides</last></name>
      <affiliation>ICS-FORTH</affiliation>
      <telephone> </telephone>
    </author>

Some Shortcomings

Restriction is purely syntactic: the following two types are not restrictions of one another!

    <xsd:sequence>
      <xsd:element name="a" type="A"/>
      <xsd:sequence>
        <xsd:element name="b" type="B"/>
        <xsd:element name="c" type="C"/>
      </xsd:sequence>
    </xsd:sequence>
        a[A], (b[B], c[C])

    <xsd:sequence>
      <xsd:sequence>
        <xsd:element name="a" type="A"/>
        <xsd:element name="b" type="B"/>
      </xsd:sequence>
      <xsd:element name="c" type="C"/>
    </xsd:sequence>
        (a[A], b[B]), c[C]

Some Shortcomings

Restriction and extension are not possible together:

    Person1 = person [ name [ UrTree ], age [ Int ] ]
    Person2 = person [ name [ String ], age [ Int ], address [ Address ] ]

Referencing a Schema in an XML Instance Document

  - Book.xsd defines elements in the namespace given by its targetNamespace attribute
  - Book.xml uses elements from that namespace and points to the schema through its schemaLocation attribute
  - A schema defines a new vocabulary; instance documents use that new vocabulary

Multiple Levels of Checking

    Book.xml  ->  Book.xsd  ->  XMLSchema.dtd (schema-for-schemas)

  - Validate that the XML document conforms to the rules described in Book.xsd
  - Validate that Book.xsd is a valid schema document, i.e., that it conforms to the rules described in the schema-for-schemas

Schemas can be used to validate a document in two ways:
  - Content model validation: checks the order and nesting of elements (similar to DTD validation)
  - Datatype validation: checks the element content for valid type and range

Namespaces in XML Schema: Overview

[Figure: overview of namespaces in XML Schema.]

Validating an XML Schema Instance Document

Validation can apply to the entire XML instance document, or to a single element.

Validating using "two schemas":

    <Biblio xmlns:book="…" xmlns:company="…" xmlns:xsi="…" xsi:schemaLocation="…">
      <book:book>
        <book:Title>Data on the Web: From Relations to Semistructured Data and XML</book:Title>
        <book:Author>Serge Abiteboul</book:Author>
        <book:Publisher>Morgan Kaufmann</book:Publisher>
      </book:book>
      <company:affiliation>
        <person:name>INRIA</person:name>
      </company:affiliation>
    </Biblio>

(Almost) any Element/Attribute

    <any namespace="##other"/>
        allows any well-formed XML element, provided the element is in a namespace other than the one we're defining
    <any namespace="…"/>
        allows any well-formed XML element, provided it's from the specified namespace
    <any namespace="##targetNamespace"/>
        allows any well-formed XML element, provided it's from the namespace that we're defining
    <anyAttribute namespace="##other"/>
        allows any attribute, provided the attribute is in a namespace other than the one we're defining
    <anyAttribute namespace="…"/>
        allows any attribute, provided it's from the specified namespace
    <anyAttribute namespace="##targetNamespace"/>
        allows any attribute, provided it's from the namespace that we're defining

DTD vs. XML Schema

  Features                            DTD       XML Schema
  Integration
    Syntax in XML                     No        Yes
    Supporting namespaces             No        Yes
    include & import                  No        Yes
  Type & Extensibility
    No. of built-in types
    User-defined types                No        Yes
    Type domain constraints           No        Yes
    Explicit null value               No        Yes
    Type extension                    No        Yes (except simple types)
  Attribute
    Attribute default value           Yes       Yes
    Choice among attributes           No        No
    Optional/required attributes      Yes       Yes
    Attribute domain constraints      Partial   Yes
  Elements
    Element default value             No        Partial
    Element content model             Yes       Yes
    Choice among elements             Yes       Yes
    Min & max occurrence              Partial   Yes
    Unordered list                    No        Yes

DTD vs. XML Schema

  Features                            DTD       XML Schema
  Constraints
    Uniqueness for attributes         Yes       Yes
    Uniqueness for non-attributes     No        Yes
    Key for attributes                Yes       Yes
    Key for non-attributes            No        Yes
    Foreign key for attributes        Partial   Yes
    Foreign key for non-attributes    No        Yes
  Misc.
    Open model                        No        Yes
    Documentation                     No        Yes
    Embedded HTML                     No        Yes
    Self-describability               No        Yes

Context-dependent XML Typing

[Figure: two dealer trees, each with UsedCars and NewCars children containing ad elements; used-car ads have model and year, new-car ads have only model. In the second tree the ad elements are specialized as "ad used" and "ad new".]

  - DTDs cannot distinguish between used car ads and new car ads: different structure in different contexts
  - Specialized DTDs allow ad to have a different structure in different contexts

Specialized DTDs and XML Schemas

[Figure: the dealer tree with UsedCars and NewCars, shown with specialized element types ("ad used" with model and year, "ad new" with model) and with the plain ad tag after the specialization is erased.]

Formal Foundations of XML Schemas

Basic questions on XML Schemas (~ specialized DTDs):
  - Validation: how hard?
  - Expressive power: what can be defined?
  - Closure properties: union, intersection, difference
  - Complexity of manipulations

Tool: a powerful connection to tree automata!

Theorem: DTDs with specialization define precisely the regular tree languages (over unranked trees), and so are equivalent to top-down and bottom-up non-deterministic tree automata.
  - Closure properties: union, intersection, complement
  - Algorithms for: validation with respect to a DTD, decidable inclusion testing of DTDs, computing DTDs for union or intersection, etc.
  - Static analysis

Regular Tree Grammars for XML Schemas

  - Regular Tree Grammars (RTG): each function symbol has a fixed arity
  - Extended RTG (ERTG): the arguments of a function symbol are defined by a regular expression

ERTG: G = (N, T, S, P)
  - N is a set of non-terminals
  - T is a set of terminals
  - S is a set of start symbols
  - P is a set of production rules of the form A -> x (RE), where A is in N, x is in T, and RE is a regular expression over non-terminals
  - Attributes are considered equivalent to elements

Note: DTDs are local tree grammars, which impose a restriction on RTGs:
  - for every terminal symbol x, there is exactly one rule of the form A -> x (RE)
  - advantage of local tree grammars: deterministic top-down, as well as bottom-up, parsing with 1 look-ahead

Where to Place XML Schemas

[Figure: nested classes of languages: DTD, inside XML Schema, inside specialized DTDs; DTDs correspond to deterministic top-down tree automata, specialized DTDs to tree automata.]

XML Schema:
  - some bizarre restriction: inside an element, no two types with the same tag
  - closer to DTDs than to tree automata
  - efficient type validation
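Why local tree grammars are easy to validate can be seen in a short Python sketch. It assumes a simplified encoding that is not part of the slides: since there is exactly one rule per tag, the grammar is a dictionary from tag to a regular expression over the sequence of child tags (each tag followed by a space), and text content is ignored. The rules reuse the artifact DTD shown earlier; the `RULES` table, the `validate` function and the sample documents are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# One production per tag (the "local" restriction). Each regular expression
# ranges over the child tag sequence, encoded as "tag1 tag2 ... ".
RULES = {
    "artifact": r"title (artist )?price (date )?(dimensions )?",
    "artist": r"name nationality ",
    "name": r"first last ",
    # (#PCDATA) elements: no element children.
    "title": r"", "price": r"", "date": r"", "dimensions": r"",
    "nationality": r"", "first": r"", "last": r"",
}


def validate(elem: ET.Element) -> bool:
    """Top-down validation: the tag alone selects the rule, no backtracking."""
    rule = RULES.get(elem.tag)
    if rule is None:
        return False  # undeclared element
    children = "".join(child.tag + " " for child in elem)
    if re.fullmatch(rule, children) is None:
        return False  # child sequence violates the content model
    return all(validate(child) for child in elem)


VALID = ET.fromstring(
    "<artifact><title>Haystacks at Chailly</title>"
    "<artist><name><first>Claude</first><last>Monet</last></name>"
    "<nationality>French</nationality></artist>"
    "<price>38000</price><date>1865</date></artifact>"
)
INVALID = ET.fromstring(
    "<artifact><price>38000</price><title>Haystacks at Chailly</title></artifact>"
)

print(validate(VALID), validate(INVALID))
```

With decoupled types the lookup `RULES.get(elem.tag)` is no longer enough, because several types may share a tag; a validator then has to track sets of candidate types, which is where the tree automata view comes in.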

XML DDL: Summary

XML Schemas are a tremendous advancement over DTDs:
  - Enhanced datatypes
      - 37+ versus 10
      - can create your own datatypes
      - can define the lexical representation
  - Written in XML: enables the use of XML tools
  - Object-oriented flavor: can extend or restrict a type
  - Can express sets: the child elements may occur in any order
  - Can specify element content as being unique (keys on content), and uniqueness within a scope
  - Can define multiple elements with the same name but different types
  - Can define elements with null content
  - Can create equivalent elements

XML DDL: Summary

  - A complete but complex XML Schema specification...
  - Much research work with interesting and complementary properties
  - Yet no approach that reconciles all of the above

And still some difficult problems to solve:
  - a concrete integrity constraint language that is tractable
  - syntactic vs. semantic notion of subtyping?
      - type inclusion is heavily used in XDuce [Hosoya et al 2000]
      - instantiation [Cluet et al 1998] captures XML Schema mechanisms, but is less powerful than inclusion
      - graph schema subsumption [Buneman et al 1997] captures a form of subtyping, but does not work on regular expression types for ordered data
  - use of types for language typing
  - use of types for query processing
  - use of types for storage

34 Readings

For the formal semantics and expressive power of XML DTDs and Schemas, read:
  Chapter 3 of the Web Data Management book
The following material from the W3C Web page on XML:
  XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes
  XML Schema Definition Language (XSD) 1.1 Part 1: Structures
Check out the DTD of the XML Schema Specification: /XMLSchema.dtd

Acknowledgements

  XML Data: From Research to Standards. Daniela Florescu, Jerome Simeon. Tutorial, VLDB 2000
  A Web Odyssey: from Codd to XML. Victor Vianu. Invited Talk, PODS 2001
  XML Schemas. Roger L. Costello. Online tutorial, XML Technologies

CS561 Spring Mixed Content

CS561 Spring Mixed Content Mixed Content DTDs define mixed content by mixing #PCDATA into the content model DTDs always require mixed content to use the form (#PCDATA a b )* the occurrence of elements in mixed content cannot be

More information

XML DTDs and Namespaces. CS174 Chris Pollett Oct 3, 2007.

XML DTDs and Namespaces. CS174 Chris Pollett Oct 3, 2007. XML DTDs and Namespaces CS174 Chris Pollett Oct 3, 2007. Outline Internal versus External DTDs Namespaces XML Schemas Internal versus External DTDs There are two ways to associate a DTD with an XML document:

More information

B oth element and attribute declarations can use simple types

B oth element and attribute declarations can use simple types Simple types 154 Chapter 9 B oth element and attribute declarations can use simple types to describe the data content of the components. This chapter introduces simple types, and explains how to define

More information

XML - Schema. Mario Arrigoni Neri

XML - Schema. Mario Arrigoni Neri XML - Schema Mario Arrigoni Neri 1 Well formed XML and valid XML Well formation is a purely syntactic property Proper tag nesting, unique root, etc.. Validation is more semantic, because it must take into

More information

Module 3. XML Schema

Module 3. XML Schema Module 3 XML Schema 1 Recapitulation (Module 2) XML as inheriting from the Web history SGML, HTML, XHTML, XML XML key concepts Documents, elements, attributes, text Order, nested structure, textual information

More information

Session [2] Information Modeling with XSD and DTD

Session [2] Information Modeling with XSD and DTD Session [2] Information Modeling with XSD and DTD September 12, 2000 Horst Rechner Q&A from Session [1] HTML without XML See Code HDBMS vs. RDBMS What does XDR mean? XML-Data Reduced Utilized in Biztalk

More information

Information Systems. DTD and XML Schema. Nikolaj Popov

Information Systems. DTD and XML Schema. Nikolaj Popov Information Systems DTD and XML Schema Nikolaj Popov Research Institute for Symbolic Computation Johannes Kepler University of Linz, Austria popov@risc.uni-linz.ac.at Outline DTDs Document Type Declarations

More information

DTD MIGRATION TO W3C SCHEMA

DTD MIGRATION TO W3C SCHEMA Chapter 1 Schema Introduction The XML technical specification identified a standard for writing a schema (i.e., an information model) for XML called a document type definition (DTD). 1 DTDs were a carryover

More information

Querying XML Data. Querying XML has two components. Selecting data. Construct output, or transform data

Querying XML Data. Querying XML has two components. Selecting data. Construct output, or transform data Querying XML Data Querying XML has two components Selecting data pattern matching on structural & path properties typical selection conditions Construct output, or transform data construct new elements

More information

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington Introduction to Semistructured Data and XML Based on slides by Dan Suciu University of Washington CS330 Lecture April 8, 2003 1 Overview From HTML to XML DTDs Querying XML: XPath Transforming XML: XSLT

More information

Overview. Introduction to XML Schemas. Tutorial XML Europe , Berlin. 1 Introduction. 2 Concepts. 3 Schema Languages.

Overview. Introduction to XML Schemas. Tutorial XML Europe , Berlin. 1 Introduction. 2 Concepts. 3 Schema Languages. Introduction to XML Schemas Tutorial XML Europe 2001 21.5.2001, Berlin Ulrike Schäfer. www.infotakt.de. slide 1 Overview 1 Introduction q Why are Schemas? 2 Concepts q What are schemas? 3 Schema Languages

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 27-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 27-1 Slide 27-1 Chapter 27 XML: Extensible Markup Language Chapter Outline Introduction Structured, Semi structured, and Unstructured Data. XML Hierarchical (Tree) Data Model. XML Documents, DTD, and XML Schema.

More information

EMERGING TECHNOLOGIES. XML Documents and Schemas for XML documents

EMERGING TECHNOLOGIES. XML Documents and Schemas for XML documents EMERGING TECHNOLOGIES XML Documents and Schemas for XML documents Outline 1. Introduction 2. Structure of XML data 3. XML Document Schema 3.1. Document Type Definition (DTD) 3.2. XMLSchema 4. Data Model

More information

HTML vs. XML In the case of HTML, browsers have been taught how to ignore invalid HTML such as the <mymadeuptag> element and generally do their best

HTML vs. XML In the case of HTML, browsers have been taught how to ignore invalid HTML such as the <mymadeuptag> element and generally do their best 1 2 HTML vs. XML In the case of HTML, browsers have been taught how to ignore invalid HTML such as the element and generally do their best when dealing with badly placed HTML elements. The

More information

(One) Layer Model of the Semantic Web. Semantic Web - XML XML. Extensible Markup Language. Prof. Dr. Steffen Staab Dipl.-Inf. Med.

(One) Layer Model of the Semantic Web. Semantic Web - XML XML. Extensible Markup Language. Prof. Dr. Steffen Staab Dipl.-Inf. Med. (One) Layer Model of the Semantic Web Semantic Web - XML Prof. Dr. Steffen Staab Dipl.-Inf. Med. Bernhard Tausch Steffen Staab - 1 Steffen Staab - 2 Slide 2 Extensible Markup Language Purpose here: storing

More information

XML and Content Management

XML and Content Management XML and Content Management Lecture 3: Modelling XML Documents: XML Schema Maciej Ogrodniczuk, Patryk Czarnik MIMUW, Oct 18, 2010 Lecture 3: XML Schema XML and Content Management 1 DTD example (recall)

More information

XML Schema Profile Definition

XML Schema Profile Definition XML Schema Profile Definition Authors: Nicholas Routledge, Andrew Goodchild, Linda Bird, DSTC Email: andrewg@dstc.edu.au, bird@dstc.edu.au This document contains the following topics: Topic Page Introduction

More information

More XML Schemas, XSLT, Intro to PHP. CS174 Chris Pollett Oct 15, 2007.

More XML Schemas, XSLT, Intro to PHP. CS174 Chris Pollett Oct 15, 2007. More XML Schemas, XSLT, Intro to PHP CS174 Chris Pollett Oct 15, 2007. Outline XML Schemas XSLT PHP Overview of data types There are two categories of data types in XML Schemas: simple types -- which are

More information

XML databases. Jan Chomicki. University at Buffalo. Jan Chomicki (University at Buffalo) XML databases 1 / 9

XML databases. Jan Chomicki. University at Buffalo. Jan Chomicki (University at Buffalo) XML databases 1 / 9 XML databases Jan Chomicki University at Buffalo Jan Chomicki (University at Buffalo) XML databases 1 / 9 Outline 1 XML data model 2 XPath 3 XQuery Jan Chomicki (University at Buffalo) XML databases 2

More information

COMP9321 Web Application Engineering. Extensible Markup Language (XML)

COMP9321 Web Application Engineering. Extensible Markup Language (XML) COMP9321 Web Application Engineering Extensible Markup Language (XML) Dr. Basem Suleiman Service Oriented Computing Group, CSE, UNSW Australia Semester 1, 2016, Week 4 http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2442

More information

Part 2: XML and Data Management Chapter 6: Overview of XML

Part 2: XML and Data Management Chapter 6: Overview of XML Part 2: XML and Data Management Chapter 6: Overview of XML Prof. Dr. Stefan Böttcher 6. Overview of the XML standards: XML, DTD, XML Schema 7. Navigation in XML documents: XML axes, DOM, SAX, XPath, Tree

More information

Chapter 11 XML Data Modeling. Recent Development for Data Models 2016 Stefan Deßloch

Chapter 11 XML Data Modeling. Recent Development for Data Models 2016 Stefan Deßloch Chapter 11 XML Data Modeling Recent Development for Data Models 2016 Stefan Deßloch Motivation Traditional data models (e.g., relational data model) primarily support structure data separate DB schema

More information

XML Schemas A C U R A D I B E L U S S I A L B E R T O ( E S T R A T T I D A M A T E R I A L E D I S P O N I B I L E S U L S I T O W 3 C )

XML Schemas A C U R A D I B E L U S S I A L B E R T O ( E S T R A T T I D A M A T E R I A L E D I S P O N I B I L E S U L S I T O W 3 C ) XML Schemas 1 A C U R A D I B E L U S S I A L B E R T O ( E S T R A T T I D A M A T E R I A L E D I S P O N I B I L E S U L S I T O W 3 C ) H T T P : / / W W W. W 3. O R G / T R / X M L S C H E M A - 0

More information

XML Schemas Derived from

XML Schemas Derived from 1 XML Schemas Derived from http://www.w3.org/tr/xmlschema-0/ Copyright by Roger L. Costello http://www.xfront.com/ Protected by the GNU General Public License Version 2 Modified by Fabrizio Riguzzi on

More information

Tutorial 2: Validating Documents with DTDs

Tutorial 2: Validating Documents with DTDs 1. One way to create a valid document is to design a document type definition, or DTD, for the document. 2. As shown in the accompanying figure, the external subset would define some basic rules for all

More information

XML. COSC Dr. Ramon Lawrence. An attribute is a name-value pair declared inside an element. Comments. Page 3. COSC Dr.

XML. COSC Dr. Ramon Lawrence. An attribute is a name-value pair declared inside an element. Comments. Page 3. COSC Dr. COSC 304 Introduction to Database Systems XML Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca XML Extensible Markup Language (XML) is a markup language that allows for

More information

CountryData Technologies for Data Exchange. Introduction to XML

CountryData Technologies for Data Exchange. Introduction to XML CountryData Technologies for Data Exchange Introduction to XML What is XML? EXtensible Markup Language Format is similar to HTML, but XML deals with data structures, while HTML is about presentation Open

More information

웹기술및응용. XML Schema 2018 년 2 학기. Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering

웹기술및응용. XML Schema 2018 년 2 학기. Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering 웹기술및응용 XML Schema 2018 년 2 학기 Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering Outline History Comparison with DTD Syntax Definitions and declaration Simple types Namespace Complex

More information

Sistemi ICT per il Business Networking

Sistemi ICT per il Business Networking Corso di Laurea Specialistica Ingegneria Gestionale Sistemi ICT per il Business Networking XML Schema Docente: Vito Morreale (vito.morreale@eng.it) 1 Motivation People are dissatisfied with DTDs It's a

More information

Describing Document Types: The Schema Languages of XML Part 2

Describing Document Types: The Schema Languages of XML Part 2 Describing Document Types: The Schema Languages of XML Part 2 John Cowan 1 Copyright Copyright 2005 John Cowan Licensed under the GNU General Public License ABSOLUTELY NO WARRANTIES; USE AT YOUR OWN RISK

More information

Introduction to Semistructured Data and XML

Introduction to Semistructured Data and XML Introduction to Semistructured Data and XML Chapter 27, Part D Based on slides by Dan Suciu University of Washington Database Management Systems, R. Ramakrishnan 1 How the Web is Today HTML documents often

More information

XML. Part II DTD (cont.) and XML Schema

XML. Part II DTD (cont.) and XML Schema XML Part II DTD (cont.) and XML Schema Attribute Declarations Declare a list of allowable attributes for each element These lists are called ATTLIST declarations Consists of 3 basic parts The ATTLIST keyword

More information

ETSI STANDARD Methods for Testing and Specification (MTS); The Testing and Test Control Notation version 3; Part 9: Using XML schema with TTCN-3

ETSI STANDARD Methods for Testing and Specification (MTS); The Testing and Test Control Notation version 3; Part 9: Using XML schema with TTCN-3 ES 201 873-9 V4.7.1 (2016-07) STANDARD Methods for Testing and Specification (MTS); The Testing and Test Control Notation version 3; Part 9: Using XML schema with TTCN-3 2 ES 201 873-9 V4.7.1 (2016-07)

More information

ETSI ES V3.3.1 ( ) ETSI Standard

ETSI ES V3.3.1 ( ) ETSI Standard ES 201 873-9 V3.3.1 (2008-07) Standard Methods for Testing and Specification (MTS); The Testing and Test Control Notation version 3; Part 9: Using XML schema with TTCN-3 2 ES 201 873-9 V3.3.1 (2008-07)

More information

XML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11

XML: Introduction. !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... Directive... 9:11 !important Declaration... 9:11 #FIXED... 7:5 #IMPLIED... 7:5 #REQUIRED... 7:4 @import Directive... 9:11 A Absolute Units of Length... 9:14 Addressing the First Line... 9:6 Assigning Meaning to XML Tags...

More information

XML. XML Namespaces, XML Schema, XSLT

XML. XML Namespaces, XML Schema, XSLT XML XML Namespaces, XML Schema, XSLT Contents XML Namespaces... 2 Namespace Prefixes and Declaration... 3 Multiple Namespace Declarations... 4 Declaring Namespaces in the Root Element... 5 Default Namespaces...

More information

DSD: A Schema Language for XML

DSD: A Schema Language for XML DSD: A Schema Language for XML Nils Klarlund, AT&T Labs Research Anders Møller, BRICS, Aarhus University Michael I. Schwartzbach, BRICS, Aarhus University Connections between XML and Formal Methods XML:

More information

XML in Databases. Albrecht Schmidt. al. Albrecht Schmidt, Aalborg University 1

XML in Databases. Albrecht Schmidt.   al. Albrecht Schmidt, Aalborg University 1 XML in Databases Albrecht Schmidt al@cs.auc.dk http://www.cs.auc.dk/ al Albrecht Schmidt, Aalborg University 1 What is XML? (1) Where is the Life we have lost in living? Where is the wisdom we have lost

More information

Modelling XML Applications (part 2)

Modelling XML Applications (part 2) Modelling XML Applications (part 2) Patryk Czarnik XML and Applications 2014/2015 Lecture 3 20.10.2014 Common design decisions Natural language Which natural language to use? It would be a nonsense not

More information

Grammars for XML Documents XML Schema, Part 1

Grammars for XML Documents XML Schema, Part 1 Grammars for XML Documents XML Schema, Part 1 Lecture "XML in Communication Systems" Chapter 4 Dr.-Ing. Jesper Zedlitz Research Group for Communication Systems Dept. of Computer Science Christian-Albrechts-University

More information

CHAPTER 8. XML Schemas

CHAPTER 8. XML Schemas 429ch08 1/11/02 1:20 PM Page 291 CHAPTER 8 XML Schemas MOST OF US WHO ARE INVOLVED in XML development are all too familiar with using Document Type Definition (DTD) to enforce the structure of XML documents.

More information

Delivery Options: Attend face-to-face in the classroom or remote-live attendance.

Delivery Options: Attend face-to-face in the classroom or remote-live attendance. XML Programming Duration: 5 Days Price: $2795 *California residents and government employees call for pricing. Discounts: We offer multiple discount options. Click here for more info. Delivery Options:

More information

Measuring the Capacity of an XML Schema

Measuring the Capacity of an XML Schema Measuring the Capacity of an XML Schema Specifying an Information Channel with an XML Schema August 2006 Roger L. Costello 1, The MITRE Corporation Robin A. Simmons 2, The MITRE Corporation 1 Roger L.

More information

7.1 Introduction. extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML

7.1 Introduction. extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML 7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML is a markup language,

More information

!" DTDs rely on a mechanism based on the use of. !" It is intended to specify cross references" !" Reference to a figure, chapter, section, etc.!

! DTDs rely on a mechanism based on the use of. ! It is intended to specify cross references ! Reference to a figure, chapter, section, etc.! MULTIMEDIA DOCUMENTS! XML Schema (Part 2)"!" DTDs rely on a mechanism based on the use of attributes (ID et IDREF) to specify links into documents"!" It is intended to specify cross references"!" Reference

More information

Week 2: Lecture Notes. DTDs and XML Schemas

Week 2: Lecture Notes. DTDs and XML Schemas Week 2: Lecture Notes DTDs and XML Schemas In Week 1, we looked at the structure of an XML document and how to write XML. I trust you have all decided on the editor you prefer. If not, I continue to recommend

More information

WBM-RDA Integration. User Guide

WBM-RDA Integration. User Guide WBM-RDA Integration User Guide Level: Intermediate Ray W. Ellis (rayellis@us.ibm.com) Daniel T. Chang (dtchang@us.ibm.com) Mei Y. Selvage (meis@us.ibm.com) User Guide WBM and RDA Integration Page 1 of

More information

Introduction to Database Systems CSE 414

Introduction to Database Systems CSE 414 Introduction to Database Systems CSE 414 Lecture 13: XML and XPath 1 Announcements Current assignments: Web quiz 4 due tonight, 11 pm Homework 4 due Wednesday night, 11 pm Midterm: next Monday, May 4,

More information

XML Information Set. Working Draft of May 17, 1999

XML Information Set. Working Draft of May 17, 1999 XML Information Set Working Draft of May 17, 1999 This version: http://www.w3.org/tr/1999/wd-xml-infoset-19990517 Latest version: http://www.w3.org/tr/xml-infoset Editors: John Cowan David Megginson Copyright

More information

XDS An Extensible Structure for Trustworthy Document Content Verification Simon Wiseman CTO Deep- Secure 3 rd June 2013

XDS An Extensible Structure for Trustworthy Document Content Verification Simon Wiseman CTO Deep- Secure 3 rd June 2013 Assured and security Deep-Secure XDS An Extensible Structure for Trustworthy Document Content Verification Simon Wiseman CTO Deep- Secure 3 rd June 2013 This technical note describes the extensible Data

More information

Delivery Options: Attend face-to-face in the classroom or via remote-live attendance.

Delivery Options: Attend face-to-face in the classroom or via remote-live attendance. XML Programming Duration: 5 Days US Price: $2795 UK Price: 1,995 *Prices are subject to VAT CA Price: CDN$3,275 *Prices are subject to GST/HST Delivery Options: Attend face-to-face in the classroom or

More information

Relational Databases

Relational Databases Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4

More information

Chapter 1: Getting Started. You will learn:

Chapter 1: Getting Started. You will learn: Chapter 1: Getting Started SGML and SGML document components. What XML is. XML as compared to SGML and HTML. XML format. XML specifications. XML architecture. Data structure namespaces. Data delivery,

More information

7.1 Introduction. 7.1 Introduction (continued) - Problem with using SGML: - SGML is a meta-markup language

7.1 Introduction. 7.1 Introduction (continued) - Problem with using SGML: - SGML is a meta-markup language 7.1 Introduction - SGML is a meta-markup language - Developed in the early 1980s; ISO std. In 1986 - HTML was developed using SGML in the early 1990s - specifically for Web documents - Two problems with

More information

Introduction to Data Management CSE 344

Introduction to Data Management CSE 344 Introduction to Data Management CSE 344 Lecture 11: XML and XPath 1 XML Outline What is XML? Syntax Semistructured data DTDs XPath 2 What is XML? Stands for extensible Markup Language 1. Advanced, self-describing

More information

Faster XML data validation in a programming language with XML datatypes

Faster XML data validation in a programming language with XML datatypes Faster XML data validation in a programming language with XML datatypes Kurt Svensson Inobiz AB Kornhamnstorg 61, 103 12 Stockholm, Sweden kurt.svensson@inobiz.se Abstract EDI-C is a programming language

More information

Copyright 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 7 XML

Copyright 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 7 XML Chapter 7 XML 7.1 Introduction extensible Markup Language Developed from SGML A meta-markup language Deficiencies of HTML and SGML Lax syntactical rules Many complex features that are rarely used HTML

More information

10/24/12. What We Have Learned So Far. XML Outline. Where We are Going Next. XML vs Relational. What is XML? Introduction to Data Management CSE 344

10/24/12. What We Have Learned So Far. XML Outline. Where We are Going Next. XML vs Relational. What is XML? Introduction to Data Management CSE 344 What We Have Learned So Far Introduction to Data Management CSE 344 Lecture 12: XML and XPath A LOT about the relational model Hand s on experience using a relational DBMS From basic to pretty advanced

More information

XML Design Rules and Conventions (DRC) for the Exchange Network

XML Design Rules and Conventions (DRC) for the Exchange Network XML Design s and Conventions (DRC) for the Exchange Network Version: 2.0 Revision Date: 01/12/2010 01/12/2010 Page. 1 THIS PAGE INTENTIONALLY LEFT BLANK 01/12/2010 Table of Contents 1. Introduction...

More information

An Analysis of Approaches to XML Schema Inference

An Analysis of Approaches to XML Schema Inference An Analysis of Approaches to XML Schema Inference Irena Mlynkova irena.mlynkova@mff.cuni.cz Charles University Faculty of Mathematics and Physics Department of Software Engineering Prague, Czech Republic

More information

Appendix H XML Quick Reference

Appendix H XML Quick Reference HTML Appendix H XML Quick Reference What Is XML? Extensible Markup Language (XML) is a subset of the Standard Generalized Markup Language (SGML). XML allows developers to create their own document elements

More information

This book is licensed under a Creative Commons Attribution 3.0 License

This book is licensed under a Creative Commons Attribution 3.0 License 6. Syntax Learning objectives: syntax and semantics syntax diagrams and EBNF describe context-free grammars terminal and nonterminal symbols productions definition of EBNF by itself parse tree grammars

More information

Semistructured data, XML, DTDs

Semistructured data, XML, DTDs Semistructured data, XML, DTDs Introduction to Databases Manos Papagelis Thanks to Ryan Johnson, John Mylopoulos, Arnold Rosenbloom and Renee Miller for material in these slides Structured vs. unstructured

More information

Introduction to Database Systems CSE 414

Introduction to Database Systems CSE 414 Introduction to Database Systems CSE 414 Lecture 14-15: XML CSE 414 - Spring 2013 1 Announcements Homework 4 solution will be posted tomorrow Midterm: Monday in class Open books, no notes beyond one hand-written

More information

2009 Martin v. Löwis. Data-centric XML. XML Syntax

2009 Martin v. Löwis. Data-centric XML. XML Syntax Data-centric XML XML Syntax 2 What Is XML? Extensible Markup Language Derived from SGML (Standard Generalized Markup Language) Two goals: large-scale electronic publishing exchange of wide variety of data

More information

XML FOR FLEXIBILITY AND EXTENSIBILITY OF DESIGN INFORMATION MODELS

XML FOR FLEXIBILITY AND EXTENSIBILITY OF DESIGN INFORMATION MODELS XML FOR FLEXIBILITY AND EXTENSIBILITY OF DESIGN INFORMATION MODELS JOS P. VAN LEEUWEN AND A.J. JESSURUN Eindhoven University of Technology, The Netherlands Faculty of Building and Architecture, Design

More information

CSC Web Technologies, Spring Web Data Exchange Formats

CSC Web Technologies, Spring Web Data Exchange Formats CSC 342 - Web Technologies, Spring 2017 Web Data Exchange Formats Web Data Exchange Data exchange is the process of transforming structured data from one format to another to facilitate data sharing between

More information

XML Standards for Ontology Exchange

XML Standards for Ontology Exchange Marin Dimitrov OntoText Lab., Sirma AI Ltd, 38A Hristo Botev Blvd, Sofia 1000, Bulgaria marin@sirma.bg Abstract. This paper contains a brief introduction to XML and comparison of different languages for

More information

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design i About the Tutorial A compiler translates the codes written in one language to some other language without changing the meaning of the program. It is also expected that a compiler should make the target

More information

Principles of Programming Languages COMP251: Syntax and Grammars

Principles of Programming Languages COMP251: Syntax and Grammars Principles of Programming Languages COMP251: Syntax and Grammars Prof. Dekai Wu Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong, China Fall 2007

More information

Informatics 1: Data & Analysis

Informatics 1: Data & Analysis Informatics 1: Data & Analysis Lecture 3: The Relational Model Ian Stark School of Informatics The University of Edinburgh Tuesday 24 January 2017 Semester 2 Week 2 https://blog.inf.ed.ac.uk/da17 Lecture

More information

UNIT 3 XML DATABASES

UNIT 3 XML DATABASES UNIT 3 XML DATABASES XML Databases: XML Data Model DTD - XML Schema - XML Querying Web Databases JDBC Information Retrieval Data Warehousing Data Mining. 3.1. XML Databases: XML Data Model The common method

More information

XML. Semi-structured data (SSD) SSD Graphs. SSD Examples. Schemas for SSD. More flexible data model than the relational model.

XML. Semi-structured data (SSD) SSD Graphs. SSD Examples. Schemas for SSD. More flexible data model than the relational model. Semi-structured data (SSD) XML Semistructured data XML, DTD, (XMLSchema) XPath, XQuery More flexible data model than the relational model. Think of an object structure, but with the type of each object

More information

XML Schema Part 2: Datatypes

XML Schema Part 2: Datatypes XML Schema Part 2: Datatypes W3C Recommendation 02 May 2001 This version: http://www.w3.org/tr/2001/rec-xmlschema-2-20010502/ (in XML and HTML, with a schema and DTD including datatype definitions, as

More information

XML Schemas. Purpose of XML Schemas (and DTDs)

XML Schemas. Purpose of XML Schemas (and DTDs) 1 XML Schemas http://www.w3.org/tr/xmlschema-0/ (Primer) http://www.w3.org/tr/xmlschema-1/ (Structures) http://www.w3.org/tr/xmlschema-2/ (Datatypes) Roger L. Costello XML Technologies Course 2 Purpose

More information

CS/INFO 330: Applied Database Systems

CS/INFO 330: Applied Database Systems CS/INFO 330: Applied Database Systems XML Schema Johannes Gehrke October 31, 2005 Annoucements Design document due on Friday Updated slides are on the website This week: Today: XMLSchema Wednesday: Introduction

More information

XML Schema & MPEG DDL

XML Schema & MPEG DDL XML Schema & MPEG DDL 1 Outline Basic Tools: MPEG-7, XML Schema, DDL Why use MPEG-7 for MMDBMS? MPEG-7 DDL bases on XML Schema, but defines MPEG-7 specific extensions DDL is:...a language that allows the

More information

UN/CEFACT Core Components Data Type Catalogue Version September 2009

UN/CEFACT Core Components Data Type Catalogue Version September 2009 UN/CEFACT Core Components Data Type Catalogue Version 3.0 29 September 2009 UN/CEFACT Core Components Data Type Catalogue Version 3.0 Page 1 of 88 Abstract CCTS 3.0 defines the rules for developing Core

More information

The Prague Markup Language (Version 1.1)

The Prague Markup Language (Version 1.1) The Prague Markup Language (Version 1.1) Petr Pajas, Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics Revision History Revision 1.0.0 5 Dec 2005 Initial revision for UFAL

More information

Introduction to XML. Yanlei Diao UMass Amherst April 17, Slides Courtesy of Ramakrishnan & Gehrke, Dan Suciu, Zack Ives and Gerome Miklau.

Introduction to XML. Yanlei Diao UMass Amherst April 17, Slides Courtesy of Ramakrishnan & Gehrke, Dan Suciu, Zack Ives and Gerome Miklau. Introduction to XML Yanlei Diao UMass Amherst April 17, 2008 Slides Courtesy of Ramakrishnan & Gehrke, Dan Suciu, Zack Ives and Gerome Miklau. 1 Structure in Data Representation Relational data is highly

More information

- XML. - DTDs - XML Schema - XSLT. Web Services. - Well-formedness is a REQUIRED check on XML documents

- XML. - DTDs - XML Schema - XSLT. Web Services. - Well-formedness is a REQUIRED check on XML documents Purpose of this day Introduction to XML for parliamentary documents (and all other kinds of documents, actually) Prof. Fabio Vitali University of Bologna Introduce the principal aspects of electronic management

More information

Using UML To Define XML Document Types

Using UML To Define XML Document Types Using UML To Define XML Document Types W. Eliot Kimber ISOGEN International, A DataChannel Company Created On: 10 Dec 1999 Last Revised: 14 Jan 2000 Defines a convention for the use of UML to define XML

More information

Type Checking and Type Equality

Type Checking and Type Equality Type Checking and Type Equality Type systems are the biggest point of variation across programming languages. Even languages that look similar are often greatly different when it comes to their type systems.

More information

Last week we saw how to use the DOM parser to read an XML document. The DOM parser can also be used to create and modify nodes.

Last week we saw how to use the DOM parser to read an XML document. The DOM parser can also be used to create and modify nodes. Distributed Software Development XML Schema Chris Brooks Department of Computer Science University of San Francisco 7-2: Modifying XML programmatically Last week we saw how to use the DOM parser to read

More information

Modelling XML Applications

Modelling XML Applications Modelling XML Applications Patryk Czarnik XML and Applications 2013/2014 Lecture 2 14.10.2013 XML application (recall) XML application (zastosowanie XML) A concrete language with XML syntax Typically defined

More information

Who s Afraid of XML Schema?

Who s Afraid of XML Schema? Who s Afraid of XML Schema? Neil Graham IBM Canada Ltd. Page Objectives Review some of the scariest aspects of XML Schema Focus on broad concepts Heavy use of examples Prove that the basics of XML Schema

More information

XML Format Plug-in User s Guide. Version 10g Release 3 (10.3)

XML Format Plug-in User s Guide. Version 10g Release 3 (10.3) XML Format Plug-in User s Guide Version 10g Release 3 (10.3) XML... 4 TERMINOLOGY... 4 CREATING AN XML FORMAT... 5 CREATING AN XML FORMAT BASED ON AN EXISTING XML MESSAGE FORMAT... 5 CREATING AN EMPTY

More information

XML (4) Extensible Markup Language

XML (4) Extensible Markup Language XML (4) Extensible Markup Language Acknowledgements and copyrights: these slides are a result of combination of notes and slides with contributions from: Michael Kiffer, Arthur Bernstein, Philip Lewis,

More information
