M2 MIA, Grenoble Université
Example

<?xml version="1.0" encoding="iso-8859-1"?>
<note>
  <to>tove</to>
  <from>jani</from>
  <heading>reminder</heading>
  <body>dont forget me this weekend!</body>
</note>
<!-- This is a comment -->
What is it?

- XML stands for eXtensible Markup Language
- XML is a markup language much like HTML
- XML was designed to carry data, not to display data
- XML tags are not predefined: you must define your own tags
- XML is a W3C Recommendation
Examples of uses

- Store data, or exchange data between applications
- XHTML
- WSDL for describing available web services
- WAP and WML as markup languages for handheld devices
- RSS languages for news feeds
- RDF and OWL for describing resources and ontologies
- SMIL for describing multimedia for the web
- X3D for describing graphical scene graphs
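To make the first use (storing and exchanging data) concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The helper names note_to_xml and xml_to_note are hypothetical, and the flat one-child-per-key layout is an assumption chosen for brevity.

```python
import xml.etree.ElementTree as ET

def note_to_xml(note: dict) -> str:
    """Serialize a flat dict into a <note> element, one child element per key."""
    root = ET.Element("note")
    for tag, text in note.items():
        child = ET.SubElement(root, tag)
        child.text = text
    return ET.tostring(root, encoding="unicode")

def xml_to_note(xml_text: str) -> dict:
    """Rebuild the dict from the XML text produced by note_to_xml."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

# One application writes the data...
original = {"to": "tove", "from": "jani", "heading": "reminder"}
wire = note_to_xml(original)

# ...and another one reads it back.
restored = xml_to_note(wire)
print(wire)
print(restored == original)
```

The receiving side only needs to know the tag names, not how the sender stores the data internally.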
Document structure

At a lower level: a hierarchy of nodes (Document, Element, Attr, CDATASection, Comment, etc.)

At a higher level:
- a declaration
- a tree structure composed of a root element and child elements
- comments anywhere, not nested

Example

<?xml version="1.0" encoding="iso-8859-1"?>
<note>
  <to>tove</to>
  <from>jani</from>
  <heading>reminder</heading>
  <body>dont forget me this weekend!</body>
</note>
<!-- This is a comment -->
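The node hierarchy can be made visible by walking a parsed document. The sketch below uses Python's standard xml.dom.minidom; the describe helper and the shortened note document are illustrative assumptions. Whitespace-only text nodes are skipped to keep the output short, and Attr nodes do not appear because they are not part of childNodes.

```python
from xml.dom import minidom, Node

# Shortened version of the note document, given as bytes so that the
# encoding declaration is honored by the parser.
SOURCE = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<note>
  <to>tove</to>
  <from>jani</from>
</note>
<!-- This is a comment -->"""

NODE_NAMES = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
    Node.COMMENT_NODE: "Comment",
}

def describe(node, depth=0, out=None):
    """Collect (depth, node kind, node name) for every node below `node`."""
    out = [] if out is None else out
    is_blank_text = node.nodeType == Node.TEXT_NODE and not node.data.strip()
    if not is_blank_text:
        out.append((depth, NODE_NAMES.get(node.nodeType, "Other"), node.nodeName))
    for child in node.childNodes:
        describe(child, depth + 1, out)
    return out

tree = describe(minidom.parseString(SOURCE))
for depth, kind, name in tree:
    print("  " * depth + f"{kind}: {name}")
```

The Document node sits at the top, with the root element and the trailing comment as its direct children.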
Elements

Example

<Message priority="1">
  some text
  <to>tove</to>
  continues
  <from>jani</from>
</Message>

Elements are the building blocks of a document. They are composed of:
- an opening and a closing tag
- possibly nested sub-elements, defining a tree structure
- text chunks

Empty elements can be defined using a single tag:

Example

<Logo />
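As a rough illustration of how sub-elements and text chunks interleave, the following sketch reads the Message example with Python's standard xml.etree.ElementTree. The chunks helper is hypothetical. In ElementTree, the text before the first child is stored in text, and the text following a child is stored in that child's tail.

```python
import xml.etree.ElementTree as ET

SOURCE = (
    '<Message priority="1">some text<to>tove</to>'
    "continues<from>jani</from></Message>"
)
root = ET.fromstring(SOURCE)

def chunks(element):
    """Return text chunks and child tags of `element` in document order."""
    parts = []
    if element.text and element.text.strip():
        parts.append(("text", element.text.strip()))
    for child in element:
        parts.append(("element", child.tag))
        # Text that follows a child belongs to that child's `tail`.
        if child.tail and child.tail.strip():
            parts.append(("text", child.tail.strip()))
    return parts

content = chunks(root)
print(content)

# An empty element has no children and no text.
logo = ET.fromstring("<Logo />")
print(len(logo), logo.text)
```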
Attributes

Example

<MyTabWidget id="widget3" width="100px" height="100px"></MyTabWidget>

Attributes provide additional information about elements. Each attribute has:
- a name, e.g. id
- a type, typically CDATA, i.e. raw text
- a value, e.g. widget3 (convert the text to a numerical value if needed)

Special characters (<, >, ", ', &) are encoded as in HTML, e.g. &lt; for <.
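A small sketch of these points in Python, using the standard xml.sax.saxutils and xml.etree.ElementTree modules. The label and rows attributes are hypothetical additions to the MyTabWidget example, used only to show escaping and numeric conversion.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = "width < 100 & height > 50"

# escape() encodes &, < and > for use in element text.
escaped = escape(raw)
print(escaped)

# quoteattr() escapes the value and wraps it in quotes, ready for an attribute.
document = f'<MyTabWidget id="widget3" rows="3" label={quoteattr(raw)} />'
widget = ET.fromstring(document)

# The parser decodes the entities again; attribute values are always text.
label = widget.get("label")
rows = int(widget.get("rows"))  # explicit conversion to a number
print(label, rows)

# Tag names are case-sensitive: a closing tag with different case is an error.
try:
    ET.fromstring("<MyTabWidget></mytabwidget>")
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True
print(mismatch_rejected)
```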
Elements or attributes?

Data can be modeled as elements or as attributes. There is no rule.

Data in attribute

<person gender="female">
  <firstname>anna</firstname>
  <lastname>smith</lastname>
</person>

Data in element

<person>
  <gender>female</gender>
  <firstname>anna</firstname>
  <lastname>smith</lastname>
</person>
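The choice affects how the data is read back. A minimal sketch with Python's standard xml.etree.ElementTree, where gender_of is a hypothetical helper that accepts both layouts:

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = (
    '<person gender="female">'
    "<firstname>anna</firstname><lastname>smith</lastname></person>"
)
AS_ELEMENT = (
    "<person><gender>female</gender>"
    "<firstname>anna</firstname><lastname>smith</lastname></person>"
)

def gender_of(person):
    """Look for the gender as an attribute first, then as a child element."""
    value = person.get("gender")            # attribute lookup
    if value is None:
        value = person.findtext("gender")   # child element lookup
    return value

from_attribute = gender_of(ET.fromstring(AS_ATTRIBUTE))
from_element = gender_of(ET.fromstring(AS_ELEMENT))
print(from_attribute, from_element)
```

Both layouts carry the same information; only the access path differs.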
SAX/DOM models

DOM: tree-based
- the document is fully loaded and a tree is created
- read - edit - write
- difficult for large documents

SAX: event-driven
- reads the document sequentially and sends data to callback functions
- memory-efficient (streaming)
- only for reading
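The DOM read - edit - write cycle can be sketched with Python's standard xml.dom.minidom. The shortened note document and the edits applied to it are illustrative assumptions.

```python
from xml.dom import minidom

# Read: the whole document is loaded into a tree in memory.
doc = minidom.parseString("<note><to>tove</to><from>jani</from></note>")

# Edit: change the text of an existing element...
to_node = doc.getElementsByTagName("to")[0]
to_node.firstChild.data = "anna"

# ...and append a new element to the root.
heading = doc.createElement("heading")
heading.appendChild(doc.createTextNode("reminder"))
doc.documentElement.appendChild(heading)

# Write: serialize the modified tree back to text.
result = doc.documentElement.toxml()
print(result)
```

Because the full tree lives in memory, this style becomes costly for very large documents, which is where an event-driven parser helps.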
Example: SAX Parsing

The parser sends data to callback functions (event-driven).

Example: file to parse

<Equation size="10">
  <Constraint index="2" value="3.5"> </Constraint>
</Equation>

Example: parser

// Called by the Qt SAX parser each time an opening tag is read.
virtual bool startElement(const QString & namespaceURI, const QString & localName,
                          const QString & qName, const QXmlAttributes & atts)
{
  if( localName.toStdString()=="Equation" )
    // Scan the attributes of <Equation> for "size".
    for(int index = 0 ; index<atts.length(); index++)
    {
      if( atts.qName(index).toStdString()=="size" )
      {
        size = atoi( atts.value(index).toStdString().c_str() );
      }
    }
  else...
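For comparison, the same event-driven idea can be sketched with Python's standard xml.sax module. This is not the Qt API used above: EquationHandler and its constraints list are hypothetical names, and reading the Constraint attributes is an assumption about how the handler would continue.

```python
import xml.sax

SOURCE = (
    b'<Equation size="10">'
    b'<Constraint index="2" value="3.5"></Constraint>'
    b"</Equation>"
)

class EquationHandler(xml.sax.ContentHandler):
    """Collects the equation size and its constraints while the file streams by."""

    def __init__(self):
        super().__init__()
        self.size = None
        self.constraints = []

    def startElement(self, name, attrs):
        # Called by the parser for every opening tag.
        if name == "Equation":
            self.size = int(attrs["size"])
        elif name == "Constraint":
            self.constraints.append((int(attrs["index"]), float(attrs["value"])))

handler = EquationHandler()
xml.sax.parseString(SOURCE, handler)
print(handler.size, handler.constraints)
```

No tree is built: the handler keeps only the values it needs, which is what makes this approach memory-efficient.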
Validation

How to check the validity of a document?

Example: a file to validate

<?xml version="1.0" encoding="utf-8"?>
<xlaplacian version="1.0" size="10">
  <constraint index="0" value="-100"/>
  <constraint index="5" value="100"/>
  <constraint index="9" value="20"/>
</xlaplacian>

We need:
1. to define a dialect
2. validation tools
A dialect definition tool: DTD

Files can be checked against a Document Type Definition (DTD).

Example: a file including a DTD

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xlaplacian [
  <!ELEMENT xlaplacian (constraint*)>
  <!ATTLIST xlaplacian size CDATA #REQUIRED>
  <!ATTLIST xlaplacian version CDATA #REQUIRED>
  <!ELEMENT constraint EMPTY>
  <!ATTLIST constraint index CDATA #REQUIRED>
  <!ATTLIST constraint value CDATA #REQUIRED>
]>
<xlaplacian version="1.0" size="10">
  <constraint index="0" value="-100"/>
  <constraint index="5" value="100"/>
  <constraint index="9" value="20"/>
</xlaplacian>
Using an external DTD

Example: a file referring to an external DTD

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xlaplacian SYSTEM "xlaplacian.dtd">
<xlaplacian version="1.0" size="10">
  <constraint index="0" value="-100"/>
  <constraint index="5" value="100"/>
  <constraint index="9" value="20"/>
</xlaplacian>

External DTD: xlaplacian.dtd

<!ELEMENT xlaplacian (constraint*)>
<!ATTLIST xlaplacian size CDATA #REQUIRED>
<!ATTLIST xlaplacian version CDATA #REQUIRED>
<!ELEMENT constraint EMPTY>
<!ATTLIST constraint index CDATA #REQUIRED>
<!ATTLIST constraint value CDATA #REQUIRED>
Validation tools

Validation can be performed:
- by the parser
- using libraries (C++: libxml2, Xerces; Java; Python; ...)
- using xmllint in a Unix command line
- online

Example

xmllint --noout --valid myfiletocheck.xml

XML Schema is a more powerful alternative to DTD.
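To show what a validating tool checks, the sketch below hand-codes part of the xlaplacian rules in Python with the standard xml.etree.ElementTree module. This is not DTD validation: Python's standard library parsers do not validate, and the RULES table and check function are hypothetical stand-ins that only cover element names, required attributes, and allowed children.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table mirroring the DTD:
# element name -> (required attributes, allowed child elements)
RULES = {
    "xlaplacian": ({"size", "version"}, {"constraint"}),
    "constraint": ({"index", "value"}, set()),
}

def check(element, errors=None):
    """Return a list of rule violations found in `element` and its children."""
    errors = [] if errors is None else errors
    if element.tag not in RULES:
        errors.append(f"unknown element <{element.tag}>")
        return errors
    required, allowed = RULES[element.tag]
    for name in sorted(required - set(element.attrib)):
        errors.append(f"missing attribute '{name}' on <{element.tag}>")
    for name in sorted(set(element.attrib) - required):
        errors.append(f"undeclared attribute '{name}' on <{element.tag}>")
    for child in element:
        if child.tag not in allowed:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            check(child, errors)
    return errors

GOOD = (
    '<xlaplacian version="1.0" size="10">'
    '<constraint index="0" value="-100"/>'
    '<constraint index="5" value="100"/>'
    "</xlaplacian>"
)
BAD = '<xlaplacian version="1.0"><constraint index="0"/><other/></xlaplacian>'

good_errors = check(ET.fromstring(GOOD))
bad_errors = check(ET.fromstring(BAD))
print(good_errors)
print(bad_errors)
```

In practice, a DTD plus xmllint or a validating library replaces this kind of hand-written code, and the rules stay in one declarative file.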
Viewing in a browser

Default viewing looks like this:
Using a style sheet

catalog.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/css" href="catalog.css"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  ...
</CATALOG>

catalog.css

CATALOG
{
  background-color: #ffffff;
  width: 100%;
}
CD
{
  display: block;
  margin-bottom: 30pt;
  margin-left: 0;
}
TITLE
{
  color: #FF0000;
  font-size: 20pt;
}
ARTIST
{
  color: #0000FF;
  font-size: 20pt;
}
COUNTRY,PRICE,YEAR,COMPANY
{
  display: block;
  color: #000000;
  margin-left: 20pt;
}
CSS-formatted viewing

An example of formatted viewing:
Conclusion

Never create your own text file format! Use XML!

See also:
- Transforming XML documents using XSLT
- Querying data in big XML documents using XQuery
- http://www.w3schools.com/xml/