Processing XML Documents with DOM Using BSF4ooRexx

1 MIS Department
Processing XML Documents with DOM Using BSF4ooRexx
Prof. Dr. Rony G. Flatscher
Vienna University of Economics and Business Administration (Wirtschaftsuniversität Wien), Augasse 2-6, A-1090 Wien
Automatisierung von Windows Anwendungen (3) [Automating Windows Applications (3)], DOM, p.1

2 Overview
- Markup-Language Basics
- XML
- DOM Principles
- Exploiting DOM from ooRexx
- Examples
- Roundup

3 Terms, 1 (Markup Languages)
- Tag
  - Tags are used to enclose ("embrace") regular text
  - Opening tag (a.k.a. start tag): <some_tag_name>
  - Closing tag (a.k.a. end tag): </some_tag_name>
  - Allows for analyzing text by noticing which parts of a text are surrounded ("embraced") by which tags
- "Element"
  - The sequence "opening tag", text, "closing tag"

4 Terms, 2 (Markup Languages)
- Document Type Definition (DTD)
  - Defines the tags and their attributes, if any
    - Name (identifier) of the tag
    - Attributes for tags
  - "Content model"
    - Nesting of tags and the allowed sequence of tags: a hierarchical structure!
    - Determines how many times an element may occur
- "Instance" of a DTD
  - A document whose text has been marked up according to the rules defined in a DTD
  - A document that has been checked and found to apply the DTD rules correctly is called a "valid" document

5 Terms, 3 (Markup Languages)
- HTML
  - A markup language for the WWW
  - DTD
  - HTML browser
    - Parses a document marked up according to HTML
    - Formats the text, depending on the tags used
  - Version 4.01: three variants defined
  - SGML-based, hence it is possible to
    - Use any case for the tag and attribute names
    - Omit some closing tags, if the end tags can be determined by the rules set forth in the DTD
    - Define exclusions

6 Terms, 4 (Markup Languages)
- XML
  - A slightly simplified version of SGML
  - Allows the definition of DTDs for markup languages
    - "XML Schema" (a W3C Recommendation since 2001) was introduced as an alternative to DTDs
  - Tag and attribute names must be written in exact case
  - End tags must always be given
  - Attribute values can be enclosed within apostrophes/single quotes (') in addition to double quotes (")
  - Empty elements can be denoted explicitly: <some_tag_name/>

7 Terms, 5 (Markup Languages)
- XML
  - DTDs can be omitted
    - A matching DTD can always be inferred if the document is "well formed":
      - All tags must be nested
      - Tags must not overlap
      - Start tags must have matching end tags
  - Structure is always independent of the formatting!
- Cascading Style Sheets (CSS)
  - Allow defining formatting (layout) rules for elements
  - Specific formatting (layout) rules can be defined for elements with attributes that have specific values, or depending on the sequence of the elements
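The well-formedness rules above can be checked with the same JAXP parser that the ooRexx programs later in this deck drive through BSF4ooRexx. The following is a minimal Java sketch, not part of the original slides; the class name WellFormedCheck and the sample strings are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this illustration only.
class WellFormedCheck {

    /** Returns true if the JAXP DOM parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler ignores warnings and errors and rethrows fatal errors;
        // setting it also avoids the parser's default message on stderr.
        parser.setErrorHandler(new DefaultHandler());
        try {
            parser.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;               // violations of well-formedness are fatal errors
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<p>This <b>is</b> nested</p>")); // properly nested
        System.out.println(isWellFormed("<p>This <b>is</p></b>"));        // overlapping tags
        System.out.println(isWellFormed("<p>no end tag"));                // missing end tag
    }
}
```

No DTD is involved here: the parser only checks nesting, overlap and matching end tags, which is exactly what "well formed" requires.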

8 Terms (XHTML)
Text, marked up in XHTML

    <html>
      <head>
        <title>This is my HTML file</title>
      </head>
      <body>
        <h1>Important Heading</h1>
        <p>This <span class="verb">is</span> the first paragraph.</p>
        <h1>Another Important Heading</h1>
        <p id="xyz1">Another paragraph.</p>
        <p id="a9876">This <span class="verb">is</span> it.</p>
      </body>
    </html>

9 XHTML Text Rendered in Firefox
[Figure: screenshot of the XHTML text rendered in Firefox]

10 Example: Linking a Cascading Style Sheet (CSS)
Text, marked up in XHTML

    <html>
      <head>
        <title>This is my HTML file</title>
        <link rel="stylesheet" type="text/css" href="example2.css"/>
      </head>
      <body>
        <h1>Important Heading</h1>
        <p>This <span class="verb">is</span> the first paragraph.</p>
        <h1>Another Important Heading</h1>
        <p id="xyz1">Another paragraph.</p>
        <p id="a9876">This <span class="verb">is</span> it.</p>
      </body>
    </html>

11 XHTML Text Rendered in Firefox with CSS
[Figure: screenshot of the XHTML text rendered in Firefox with the style sheet applied]

12 Example of a Cascading Style Sheet (CSS)

    /* tag */
    h1    { color: blue;
            text-align: center;
            font-family: Arial,sans-serif;
            font-size: 200%; }

    /* tag */
    body  { background-color: yellow;
            font-family: Times, Avantgarde;
            font-size: small; }

    /* "class" attribute */
    .verb { background-color: white;
            color: red;
            font-weight: 900; }

    /* "id" attribute */
    #xyz1 { font-variant: small-caps;
            text-align: right; }

    /* "id" attribute */
    #a9876 { font-size: large; }

13 Document Object Model (DOM), 1
- Parsing of an HTML/XML file
  - Creates a parse tree in which the elements are nodes
  - Each HTML browser needs to do this with each HTML file!
- Application Programming Interfaces (APIs) for
  - Creating, querying, changing and deleting nodes of the tree
    - This includes the manipulation of attributes as well!
  - Intercepting events while the user is working with the parse tree
    - User-generated events as a result of using the mouse or the keyboard
    - Application-generated events like "document loaded"
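As an illustration of the node-manipulation part of the API (creating, querying, changing and deleting nodes, including attributes), here is a small Java sketch against the W3C DOM interfaces shipped with Java. It is not taken from the slides; the class name DomEditDemo and the sample markup are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, named for this illustration only.
class DomEditDemo {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Applies one create, one change and one delete operation to the tree. */
    static Document edit(Document doc) {
        // query: look up existing nodes first
        Element body = (Element) doc.getElementsByTagName("body").item(0);
        Node oldParagraph = body.getElementsByTagName("p").item(0);
        Element heading = (Element) body.getElementsByTagName("h1").item(0);

        // create: a new element with an attribute and a text child
        Element added = doc.createElement("p");
        added.setAttribute("id", "new1");
        added.appendChild(doc.createTextNode("Added paragraph."));
        body.appendChild(added);

        // change: replace the heading's text
        heading.setTextContent("Changed Heading");

        // delete: remove the original paragraph from its parent
        body.removeChild(oldParagraph);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = edit(parse("<html><body><h1>Important Heading</h1>"
                + "<p id=\"xyz1\">Another paragraph.</p></body></html>"));
        Element p = (Element) doc.getElementsByTagName("p").item(0);
        System.out.println(p.getAttribute("id") + ": " + p.getTextContent());
    }
}
```

The event-interception part of the DOM mentioned above belongs to browsers and is not covered by this sketch.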

14 Document Object Model (DOM), 2
Example

    <!-- example2.html -->
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
                          "DTD/xhtml1-transitional.dtd">
    <html>
      <head>
        <title>This is my HTML file</title>
        <link rel="stylesheet" type="text/css" href="example2.css"/>
      </head>
      <body>
        <h1>Important Heading</h1>
        <p>This <span class="verb">is</span> the first paragraph.</p>
        <h1>Another Important Heading</h1>
        <p id="xyz1">Another paragraph.</p>
        <p id="a9876">This <span class="verb">is</span> it.</p>
      </body>
    </html>

Resulting parse tree (element nodes only)

    html
      head
        title
        link
      body
        h1
        p
          span
        h1
        p
        p
          span

15 DOM (Document Object Model), 3
- Java DOM parsers
  - A Java DOM parser processes the entire document
  - "Factory" pattern, which allows different implementations of DOM parsers to be deployed via Java
  - The DOM parser creates a parse tree in which each node is of one of the following types (cf. Java documentation of org.w3c.dom.Node)
    - Attr (attribute), CDATASection (character data section), Comment, Document, DocumentFragment, DocumentType, Element, Entity, EntityReference, Notation, ProcessingInstruction, Text
  - The interface definitions for nodes (cf. org.w3c.dom.Node) also include a set of methods to manipulate the parse tree
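To make the factory pattern and the node types concrete, the following Java sketch obtains a parser from the JAXP factory and lists the type of every node in a tiny document. It is an illustration only; the class name NodeTypeLister and the sample document are assumptions, and String.repeat requires Java 11 or later.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, named for this illustration only.
class NodeTypeLister {

    /** Maps the numeric node type constants of org.w3c.dom.Node to readable names. */
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:      return "Document";
            case Node.ELEMENT_NODE:       return "Element";
            case Node.TEXT_NODE:          return "Text";
            case Node.CDATA_SECTION_NODE: return "CDATASection";
            case Node.COMMENT_NODE:       return "Comment";
            default:                      return "other(" + node.getNodeType() + ")";
        }
    }

    /** Walks the parse tree recursively, one space of indentation per level. */
    static void list(Node node, int level, List<String> out) {
        out.add(" ".repeat(level) + typeName(node));
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {   // 0-based indexes
            list(children.item(i), level + 1, out);
        }
    }

    static List<String> describe(String xml) throws Exception {
        // the factory decides which DOM parser implementation is used
        DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<String> out = new ArrayList<>();
        list(parser.parse(new InputSource(new StringReader(xml))), 0, out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        describe("<!-- c --><a x=\"1\">t<![CDATA[d]]></a>").forEach(System.out::println);
    }
}
```

Note that the attribute x does not show up in the listing: attributes are not children of their element and are reached through the element's attribute methods instead.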

16 DOM (Document Object Model), 4
- The interface org.xml.sax.ErrorHandler defines the methods a SAX/DOM error listener must implement
  - error(SAXParseException exception)
  - fatalError(SAXParseException exception)
  - warning(SAXParseException exception)
- org.xml.sax.SAXParseException has, among others, the following methods
  - getCause() returns a Throwable Java object representing the cause
  - getException() returns an embedded exception, if any
  - getMessage() returns a string with the detailed error message
  - toString() returns a string representation of the SAXParseException
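For comparison with the ooRexx error handler shown later, this is roughly what a direct Java implementation of org.xml.sax.ErrorHandler looks like. It is a sketch under stated assumptions: the class name CollectingErrorHandler and the message format are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical error listener that records each event instead of printing it.
class CollectingErrorHandler implements ErrorHandler {
    final List<String> events = new ArrayList<>();

    private void record(String kind, SAXParseException e) {
        events.add(kind + ": line=" + e.getLineNumber() + ",col=" + e.getColumnNumber()
                + ": " + e.getMessage());
    }

    @Override public void warning(SAXParseException e) { record("warning", e); }

    @Override public void error(SAXParseException e) { record("error", e); }

    @Override public void fatalError(SAXParseException e) throws SAXException {
        record("fatalError", e);
        throw e;                      // parsing cannot continue after a fatal error
    }

    /** Parses the text and returns the events the parser reported. */
    static List<String> parseAndCollect(String xml) throws Exception {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        parser.setErrorHandler(handler);
        try {
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // already recorded by fatalError()
        }
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        parseAndCollect("<a><b></a>").forEach(System.out::println);
    }
}
```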

17 DOM Parsing in ooRexx
- Create an ooRexx listener class for handling errors/warnings
- Create an ooRexx listener object from it
- Create a Java object that embeds the ooRexx listener object
  - BSFCreateRexxProxy(rexxListenerObject,[slotArg],interfaceName[,...])
    - interfaceName denotes the Java interface whose methods the Rexx listener object handles
    - It is possible to denote more than one Java interface, if the Rexx listener object is able to handle all methods defined by them!
- Let the Java DOM parser parse the document
- Process the resulting parse tree with Rexx routines node by node
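The idea of a proxy object that implements a Java interface and forwards every call to one catch-all handler can be mimicked in plain Java with java.lang.reflect.Proxy. The sketch below is an analogy only and makes no claim about how BSFCreateRexxProxy is implemented; the class name ProxyErrorHandlerDemo and the log format are assumptions made for this example.

```java
import java.io.StringReader;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, named for this illustration only.
class ProxyErrorHandlerDemo {

    /** Builds an ErrorHandler whose methods are all served by one catch-all handler. */
    static ErrorHandler proxyFor(List<String> log) {
        InvocationHandler catchAll = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {   // toString, hashCode, equals
                switch (method.getName()) {
                    case "hashCode": return System.identityHashCode(proxy);
                    case "equals":   return proxy == args[0];
                    default:         return "ProxyErrorHandler";
                }
            }
            // warning, error and fatalError all receive a SAXParseException
            SAXParseException e = (SAXParseException) args[0];
            log.add(method.getName() + ": line=" + e.getLineNumber());
            return null;                                        // the methods return void
        };
        return (ErrorHandler) Proxy.newProxyInstance(
                ProxyErrorHandlerDemo.class.getClassLoader(),
                new Class<?>[] { ErrorHandler.class },          // more interfaces could be listed
                catchAll);
    }

    static List<String> parseWithProxy(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        parser.setErrorHandler(proxyFor(log));
        try {
            parser.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // the parser itself gives up after reporting a fatal error
        }
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseWithProxy("<a><b></a>"));
    }
}
```

The catch-all lambda plays the role that the UNKNOWN method plays in the ooRexx ErrorHandler class of the following programs: it receives the name of the invoked method and its arguments.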

18 "code01.rxj": Extract Text from Any XHTML Document
The ooRexx Program (Main Program), 1

    /* purpose: demonstrate how to extract the text from an XML file using DOM */
    parse arg xmlFileName

    /* create an instance of the JAXP DocumentBuilderFactory */
    factory=bsf.loadClass("javax.xml.parsers.DocumentBuilderFactory")~newInstance
    factory~setNamespaceAware(.true)     -- set desired parser to namespace aware
    parser=factory~newDocumentBuilder    -- create the parser from the factory

    eh=.errorHandler~new                 -- create an error handler Rexx object
       -- wrap up the Rexx error handler as a Java object
    javaEH=BsfCreateRexxProxy(eh, , "org.xml.sax.ErrorHandler")
    parser~setErrorHandler(javaEH)       -- set the error handler for this parser

    rootNode=parser~parse(xmlFileName)   -- parse the file, returns root node

    /* make important constants available via .local */
    clzDomNode=bsf.loadClass("org.w3c.dom.Node")             -- load the Java interface class
    .local~cdata_section_node=clzDomNode~cdata_section_node  -- save field value
    .local~text_node         =clzDomNode~text_node           -- save field value

    /* now collect all text and CDATA nodes and display them */
    call followNode rootNode

    ::requires BSF.CLS                   /* get the Java support */
    ... cut ...

19 "code01.rxj": Extract Text from Any XHTML Document
The ooRexx Program (the ooRexx Classes), 2

    ... cut ...
    ::requires BSF.CLS                   /* get the Java support */

    ::routine followNode                 /* walks the document tree recursively */
      use arg node
      call processNode node              -- process received node
      if node~hasChildNodes then
      do
         children=node~getChildNodes     -- get NodeList
         loop i=0 to children~length-1   -- 0-based indexes!
            call followNode children~item(i)   -- recurse
         end
      end

    ::routine processNode                /* processes each node */
      use arg node
      nodeType=node~getNodeType          -- get type of node
      if nodeType=.text_node | nodeType=.cdata_section_node then
         say pp(node~nodeValue)

    ::class ErrorHandler   -- a Rexx error handler ("org.xml.sax.ErrorHandler")
    ::method unknown       /* handles "warning", "error" and "fatalError" events */
      use arg methName, argArray         -- arguments from the Java SAX parser
      exception=argArray[1]              -- retrieve SAXException argument
      .error~say(methName":" -
                 "line="exception~getLineNumber",col="exception~getColumnNumber":" -
                 pp(exception~getMessage))

20 "code01.rxj": Extract Text from Any XHTML Document
Running the ooRexx Program, 3
Input: example2.html (see slide 14). Each value is enclosed in square brackets by pp(); whitespace-only text nodes contain a line break and therefore span two lines.

    f:\>rexx code01_text.rxj example2.html
    [
    ]
    [
    ]
    [This is my HTML file]
    [
    ]
    [
    ]
    [
    ]
    [
    ]
    [Important Heading]
    [
    ]
    [This ]
    [is]
    [ the first paragraph.]
    [
    ]
    [Another Important Heading]
    [
    ]
    [Another paragraph.]
    [
    ]
    [This ]
    [is]
    [ it.]
    [
    ]
    [
    ]

21 "code01.rxj": Extract Text from Any XHTML Document
Some Remarks, 4
- Text can be encoded as
  - Plain text (node type TEXT_NODE) or
  - CDATA sections (node type CDATA_SECTION_NODE): <![CDATA[...character-data...]]>
- "Ignorable whitespace" is not ignored, but treated like any other whitespace
  - A node of type TEXT_NODE will be created for it
- The node types used in the Java DOM parser are retrievable via the Java interface class org.w3c.dom.Node
  - To make it easy to refer to these values from ooRexx, the types TEXT_NODE and CDATA_SECTION_NODE are retrieved and made available to all parts of the Rexx program by storing them in .local
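The remark about whitespace can be verified with a few lines of Java against the same parser. This sketch counts text nodes, whitespace-only text nodes and CDATA sections in a small fragment; the class name TextNodeCounter and the sample fragment are assumptions made for this example, and the parser is left at its defaults (no DTD, no coalescing).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, named for this illustration only.
class TextNodeCounter {

    /** Returns {text nodes, whitespace-only text nodes, CDATA sections}. */
    static int[] count(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        int[] counts = new int[3];
        visit(doc, counts);
        return counts;
    }

    private static void visit(Node node, int[] counts) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            counts[0]++;
            if (node.getNodeValue().trim().isEmpty()) {
                counts[1]++;            // line breaks and indentation between tags
            }
        } else if (node.getNodeType() == Node.CDATA_SECTION_NODE) {
            counts[2]++;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            visit(child, counts);       // recurse into the children
        }
    }

    public static void main(String[] args) throws Exception {
        int[] c = count("<p>\n  <b>is</b>\n  <![CDATA[<raw>]]>\n</p>");
        System.out.println("text=" + c[0] + " whitespaceOnly=" + c[1] + " cdata=" + c[2]);
    }
}
```

For the sample fragment, three of the four text nodes consist of nothing but the line breaks and indentation between the tags.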

22 "code02.rxj": List Elements in Document Order
The ooRexx Program (Main Program), 1

    /* purpose: demonstrate how to list the element names in document order */
    parse arg xmlFileName

    /* create an instance of the JAXP DocumentBuilderFactory */
    factory=bsf.loadClass("javax.xml.parsers.DocumentBuilderFactory")~newInstance
    factory~setNamespaceAware(.true)     -- set desired parser to namespace aware
    parser=factory~newDocumentBuilder    -- create the parser from the factory

    eh=.errorHandler~new                 -- create an error handler Rexx object
       -- wrap up the Rexx error handler as a Java object
    javaEH=BsfCreateRexxProxy(eh, , "org.xml.sax.ErrorHandler")
    parser~setErrorHandler(javaEH)       -- set the error handler for this parser

    rootNode=parser~parse(xmlFileName)   -- parse the file, returns root node

    /* make important constants available via .local */
    clzDomNode=bsf.loadClass("org.w3c.dom.Node")   -- load the Java interface class
    .local~element_node=clzDomNode~element_node    -- save field value

    /* now walk the tree and display the names of all element nodes */
    call followNode rootNode

    ::requires BSF.CLS                   /* get the Java support */
    ... cut ...

23 "code02.rxj": List Elements in Document Order
The ooRexx Program (the ooRexx Classes), 2

    ... cut ...
    ::requires BSF.CLS                   /* get the Java support */

    ::routine followNode                 /* walks the document tree recursively */
      use arg node
      call processNode node              -- process received node
      if node~hasChildNodes then
      do
         children=node~getChildNodes     -- get NodeList
         loop i=0 to children~length-1   -- 0-based indexes!
            call followNode children~item(i)   -- recurse
         end
      end

    ::routine processNode                /* processes each node */
      use arg node
      nodeType=node~getNodeType          -- get type of node
      if nodeType=.element_node then
         say pp(node~nodeName)

    ::class ErrorHandler   -- a Rexx error handler ("org.xml.sax.ErrorHandler")
    ::method unknown       /* handles "warning", "error" and "fatalError" events */
      use arg methName, argArray         -- arguments from the Java SAX parser
      exception=argArray[1]              -- retrieve SAXException argument
      .error~say(methName":" -
                 "line="exception~getLineNumber",col="exception~getColumnNumber":" -
                 pp(exception~getMessage))

24 "code02.rxj": List Elements in Document Order
Running the ooRexx Program, 3
Input: example2.html (see slide 14)

    f:\>rexx code02_text.rxj example2.html
    [html]
    [head]
    [title]
    [link]
    [body]
    [h1]
    [p]
    [span]
    [h1]
    [p]
    [p]
    [span]

25 "code02.rxj": List Elements in Document Order
Some Remarks, 4
- The DOM parse tree has more nodes than shown in the output!
  - This particular program processes nodes of type ELEMENT_NODE only
- The node types used in the Java DOM parser are retrievable via the Java interface class org.w3c.dom.Node
  - To make it easy to refer to these values from ooRexx, the type ELEMENT_NODE is retrieved and made available to all parts of the Rexx program by storing it in .local
- One could also use the Java infrastructure to filter only those node types one is interested in

26 "code03.rxj": List Elements in Document Order #2
The ooRexx Program (Main Program), 1

    /* purpose: demonstrate how to list the element names in document order, indented */
    parse arg xmlFileName

    /* create an instance of the JAXP DocumentBuilderFactory */
    factory=bsf.loadClass("javax.xml.parsers.DocumentBuilderFactory")~newInstance
    factory~setNamespaceAware(.true)     -- set desired parser to namespace aware
    parser=factory~newDocumentBuilder    -- create the parser from the factory

    eh=.errorHandler~new                 -- create an error handler Rexx object
       -- wrap up the Rexx error handler as a Java object
    javaEH=BsfCreateRexxProxy(eh, , "org.xml.sax.ErrorHandler")
    parser~setErrorHandler(javaEH)       -- set the error handler for this parser

    rootNode=parser~parse(xmlFileName)   -- parse the file, returns root node

    /* make important constants available via .local */
    clzDomNode=bsf.loadClass("org.w3c.dom.Node")   -- load the Java interface class
    .local~element_node=clzDomNode~element_node    -- save field value

    /* now walk the tree and display the names of all element nodes */
    call followNode rootNode, 0

    ::requires BSF.CLS                   /* get the Java support */
    ... cut ...

27 "code03.rxj": List Elements in Document Order #2
The ooRexx Program (the ooRexx Classes), 2

    ... cut ...
    ::requires BSF.CLS                   /* get the Java support */

    ::routine followNode                 /* walks the document tree recursively */
      use arg node, level
      call processNode node, level       -- process received node
      if node~hasChildNodes then
      do
         children=node~getChildNodes     -- get NodeList
         loop i=0 to children~length-1   -- 0-based indexes!
            call followNode children~item(i), level+1   -- recurse
         end
      end

    ::routine processNode                /* processes each node */
      use arg node, level
      nodeType=node~getNodeType          -- get type of node
      if nodeType=.element_node then
         say " "~copies(level) pp(node~nodeName)

    ::class ErrorHandler   -- a Rexx error handler ("org.xml.sax.ErrorHandler")
    ::method unknown       /* handles "warning", "error" and "fatalError" events */
      use arg methName, argArray         -- arguments from the Java SAX parser
      exception=argArray[1]              -- retrieve SAXException argument
      .error~say(methName":" -
                 "line="exception~getLineNumber",col="exception~getColumnNumber":" -
                 pp(exception~getMessage))

28 "code03.rxj": List Elements in Document Order #2
Running the ooRexx Program, 3
Input: example2.html (see slide 14)

    f:\>rexx code03_text.rxj example2.html
      [html]
       [head]
        [title]
        [link]
       [body]
        [h1]
        [p]
         [span]
        [h1]
        [p]
        [p]
         [span]

29 "code03.rxj": List Elements in Document Order #2
Some Remarks, 4
- The DOM parse tree has more nodes than shown in the output!
  - Note the whitespace before the first shown element
  - This particular program processes nodes of type ELEMENT_NODE only
- The node types used in the Java DOM parser are retrievable via the Java interface class org.w3c.dom.Node
  - To make it easy to refer to these values from ooRexx, the type ELEMENT_NODE is retrieved and made available to all parts of the Rexx program by storing it in .local
- One could also use the Java infrastructure to filter only those node types one is interested in

30 "code04.rxj": List Elements with Text
The ooRexx Program (Main Program), 1

    /* purpose: demonstrate how to list the elements and their text using DOM */
    parse arg xmlFileName

    /* create an instance of the JAXP DocumentBuilderFactory */
    factory=bsf.loadClass("javax.xml.parsers.DocumentBuilderFactory")~newInstance
    factory~setNamespaceAware(.true)     -- set desired parser to namespace aware
    parser=factory~newDocumentBuilder    -- create the parser from the factory

    eh=.errorHandler~new                 -- create an error handler Rexx object
       -- wrap up the Rexx error handler as a Java object
    javaEH=BsfCreateRexxProxy(eh, , "org.xml.sax.ErrorHandler")
    parser~setErrorHandler(javaEH)       -- set the error handler for this parser

    rootNode=parser~parse(xmlFileName)   -- parse the file, returns root node

    /* make important constants available via .local */
    clzDomNode=bsf.loadClass("org.w3c.dom.Node")             -- load the Java interface class
    .local~cdata_section_node=clzDomNode~cdata_section_node  -- save field value
    .local~text_node         =clzDomNode~text_node           -- save field value
    .local~element_node      =clzDomNode~element_node        -- save field value

    /* now collect all element, text and CDATA nodes and display them */
    call followNode rootNode, 0

    ::requires BSF.CLS                   /* get the Java support */
    ... cut ...

31 "code04.rxj": List Elements with Text
The ooRexx Program (the ooRexx Classes), 2

    ... cut ...
    ::requires BSF.CLS                   /* get the Java support */

    ::routine followNode                 /* walks the document tree recursively */
      use arg node, level
      call processNode node, level       -- process received node
      if node~hasChildNodes then
      do
         children=node~getChildNodes     -- get NodeList
         loop i=0 to children~length-1   -- 0-based indexes!
            call followNode children~item(i), level+1   -- recurse
         end
      end

    ::routine processNode                /* processes each node */
      use arg node, level
      nodeType=node~getNodeType          -- get type of node
      if nodeType=.text_node | nodeType=.cdata_section_node then
         say " "~copies(level) "-->" pp(node~getNodeValue)   -- instead of getData()
      else if nodeType=.element_node then
         say " "~copies(level) pp(node~getNodeName)

    ::class ErrorHandler   -- a Rexx error handler ("org.xml.sax.ErrorHandler")
    ::method unknown       /* handles "warning", "error" and "fatalError" events */
      use arg methName, argArray         -- arguments from the Java SAX parser
      exception=argArray[1]              -- retrieve SAXException argument
      .error~say(methName":" -
                 "line="exception~getLineNumber",col="exception~getColumnNumber":" -
                 pp(exception~getMessage))

32 "code04.rxj": List Elements with Text
Running the ooRexx Program, 3
Input: example2.html (see slide 14). Whitespace-only text nodes contain a line break and therefore span two lines.

    f:\>rexx code04_text.rxj example2.html
      [html]
       --> [
    ]
       [head]
        --> [
    ]
        [title]
         --> [This is my HTML file]
        --> [
    ]
        [link]
        --> [
    ]
       --> [
    ]
       [body]
        --> [
    ]
        [h1]
         --> [Important Heading]
        --> [
    ]
        [p]
         --> [This ]
         [span]
          --> [is]
         --> [ the first paragraph.]
        --> [
    ]
        [h1]
         --> [Another Important Heading]
        --> [
    ]
        [p]
         --> [Another paragraph.]
        --> [
    ]
        [p]
         --> [This ]
         [span]
          --> [is]
         --> [ it.]
        --> [
    ]
       --> [
    ]

33 "code04.rxj": List Elements with Text
Some Remarks, 4
- The DOM parse tree has more nodes than shown in the output!
  - Note the whitespace before the first shown element
  - This particular program processes nodes of type TEXT_NODE, CDATA_SECTION_NODE and ELEMENT_NODE only
- The node types used in the Java DOM parser are retrievable via the Java interface class org.w3c.dom.Node
  - To make it easy to refer to these values from ooRexx, the types TEXT_NODE, CDATA_SECTION_NODE and ELEMENT_NODE are retrieved and made available to all parts of the Rexx program by storing them in .local
- One could also use the Java infrastructure to filter only those node types one is interested in

34 More Available with Java DOM Parsing Than Documented by Java!
- The default DOM parser coming with Java (Apache's Xerces2) is capable of more than what is documented in Oracle's JavaDocs!
- The W3C interface org.w3c.dom.traversal.DocumentTraversal, cf. documentation at javadoc/org/w3c/dom/traversal/documenttraversal.html
  - Method createNodeIterator(...) filters nodes and returns the result as a list, which is traversed with the methods defined in the interface org.w3c.dom.traversal.NodeIterator
  - Method createTreeWalker(...) filters nodes and returns the result as a tree, which is traversed with the methods defined in the interface org.w3c.dom.traversal.TreeWalker

35 Using a NodeIterator to Iterate over Elements
- Get the Java constant field value for showing (filtering) elements, using the Java interface class org.w3c.dom.traversal.NodeFilter and the constant field named SHOW_ELEMENT
- Create a NodeIterator from the DOM parse tree and use its methods to iterate over the filtered nodes
- Hint: compare the following code "code05.rxj" with "code02.rxj" above
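For readers who know Java better than ooRexx, the two steps above look roughly as follows in Java. This is a sketch, not part of the slides: the class name ElementIteratorDemo is made up, and the cast to DocumentTraversal assumes the DOM implementation bundled with the JDK, which supports the traversal interfaces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical demo class, named for this illustration only.
class ElementIteratorDemo {

    /** Returns the element names of the document in document order. */
    static List<String> elementNames(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // assumption: the Document implementation also implements DocumentTraversal
        DocumentTraversal traversal = (DocumentTraversal) doc;
        NodeIterator iterator =
                traversal.createNodeIterator(doc, NodeFilter.SHOW_ELEMENT, null, true);

        List<String> names = new ArrayList<>();
        for (Node node = iterator.nextNode(); node != null; node = iterator.nextNode()) {
            names.add(node.getNodeName());   // only element nodes arrive here
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<html><head><title>t</title></head>"
                + "<body><h1>h</h1><p>This <span>is</span> it.</p></body></html>"));
    }
}
```

The iterator flattens the filtered tree into a list, so no recursion and no node-type test are needed.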

36 "code05.rxj": List Elements in Document Order
The ooRexx Program (Main Program), 1

    /* purpose: demonstrate how to list the element names in document order */
    parse arg xmlFileName

    /* create an instance of the JAXP DocumentBuilderFactory */
    factory=bsf.loadClass("javax.xml.parsers.DocumentBuilderFactory")~newInstance
    factory~setNamespaceAware(.true)     -- set desired parser to namespace aware
    parser=factory~newDocumentBuilder    -- create the parser from the factory

    eh=.errorHandler~new                 -- create an error handler Rexx object
       -- wrap up the Rexx error handler as a Java object
    javaEH=BsfCreateRexxProxy(eh, , "org.xml.sax.ErrorHandler")
    parser~setErrorHandler(javaEH)       -- set the error handler for this parser

    rootNode=parser~parse(xmlFileName)   -- parse the file, returns root node

    /* get constant value to determine node types to filter */
    whatToShow=bsf.getConstant("org.w3c.dom.traversal.NodeFilter", "SHOW_ELEMENT")

    /* create a NodeIterator with only Element nodes */
    iterator=rootNode~createNodeIterator(rootNode, whatToShow, .nil, .true)

    /* process list of Element nodes */
    node=iterator~nextNode               /* get first node */
    loop while node<>.nil
       say pp(node~getNodeName)
       node=iterator~nextNode            /* get next node  */
    end

    ::requires BSF.CLS                   /* get the Java support */

    ::class ErrorHandler   -- a Rexx error handler ("org.xml.sax.ErrorHandler")
    ... as depicted above ...

37 "code05.rxj": List Elements in Document Order
Running the ooRexx Program, 2
Input: example2.html (see slide 14)

    f:\>rexx code05_text.rxj example2.html
    [html]
    [head]
    [title]
    [link]
    [body]
    [h1]
    [p]
    [span]
    [h1]
    [p]
    [p]
    [span]

38 Using a TreeWalker to Iterate over Elements
- Get the Java constant field value for showing (filtering) elements, using the Java interface class org.w3c.dom.traversal.NodeFilter and the constant field named SHOW_ELEMENT
- Create a TreeWalker from the DOM parse tree and use its methods to iterate over the filtered nodes
- Note: the createTreeWalker() method will filter element related nodes as well (e.g. text nodes included in an element)
- Hint: compare the following code "code06.rxj" with "code03.rxj" above
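A Java counterpart of these steps is sketched below. Unlike the ooRexx program that follows, this sketch navigates exclusively with the TreeWalker's own firstChild() and nextSibling() methods, so only nodes that pass the SHOW_ELEMENT filter are ever returned. The class name ElementTreeWalkerDemo is made up, the cast to DocumentTraversal assumes the DOM implementation bundled with the JDK, and String.repeat requires Java 11 or later.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;

// Hypothetical demo class, named for this illustration only.
class ElementTreeWalkerDemo {

    /** Records the walker's current node, then recurses into its filtered children. */
    static void walk(TreeWalker walker, int level, List<String> out) {
        Node current = walker.getCurrentNode();
        out.add(" ".repeat(level) + current.getNodeName());
        // firstChild()/nextSibling() move the walker and skip filtered-out nodes
        for (Node child = walker.firstChild(); child != null; child = walker.nextSibling()) {
            walk(walker, level + 1, out);
        }
        walker.setCurrentNode(current);      // restore the position for the caller
    }

    /** Returns the element names, indented by one space per nesting level. */
    static List<String> elementTree(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // assumption: the Document implementation also implements DocumentTraversal
        TreeWalker walker = ((DocumentTraversal) doc)
                .createTreeWalker(doc, NodeFilter.SHOW_ELEMENT, null, true);
        List<String> out = new ArrayList<>();
        if (walker.firstChild() != null) {   // move from the document to the root element
            walk(walker, 0, out);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        elementTree("<html><head><title>t</title></head>"
                + "<body><h1>h</h1><p>This <span>is</span> it.</p></body></html>")
                .forEach(System.out::println);
    }
}
```

Calling firstChild() or nextSibling() on a node obtained from the walker, rather than on the walker itself, navigates the unfiltered DOM tree, which is why such code still has to test the node type.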

39 "code06.rxj": List Elements in Document Order
The ooRexx Program (Main Program), 1

    /* purpose: demonstrate how to list the element names in document order */
    parse arg xmlFileName

    /* create an instance of the JAXP DocumentBuilderFactory */
    factory=bsf.loadClass("javax.xml.parsers.DocumentBuilderFactory")~newInstance
    factory~setNamespaceAware(.true)     -- set desired parser to namespace aware
    parser=factory~newDocumentBuilder    -- create the parser from the factory

    eh=.errorHandler~new                 -- create an error handler Rexx object
       -- wrap up the Rexx error handler as a Java object
    javaEH=BsfCreateRexxProxy(eh, , "org.xml.sax.ErrorHandler")
    parser~setErrorHandler(javaEH)       -- set the error handler for this parser

    rootNode=parser~parse(xmlFileName)   -- parse the file, returns root node

    /* get constant value to determine node types to filter */
    .local~show_element=bsf.getConstant("org.w3c.dom.traversal.NodeFilter", "SHOW_ELEMENT")
    whatToShow=.show_element

    /* create a TreeWalker with only Element nodes */
    walker=rootNode~createTreeWalker(rootNode, whatToShow, .nil, .true)

    /* process list of Element nodes */
    call walkTheTree walker~firstChild, 0

    ::requires BSF.CLS                   /* get the Java support */
    ... cut here ...

40 "code06.rxj": List Elements in Document Order
The ooRexx Program, 2

    ... cut here ...
    /* process list of Element nodes */
    call walkTheTree walker~firstChild, 0

    ::requires BSF.CLS                   /* get the Java support */

    ::routine walkTheTree                /* walk the tree recursively */
      use arg node, level
      say " "~copies(level) pp(node~getNodeName)   -- show element name indented
      child=node~firstChild              -- depth first
      do while child<>.nil
            -- there may be other node types coming with elements in a TreeWalker
            -- (the test works because SHOW_ELEMENT and ELEMENT_NODE both have the value 1)
         if child~getNodeType=.show_element then   -- make sure only element nodes
            call walkTheTree child, level+1        -- recurse, increase level
         child=child~nextSibling         -- breadth next
      end

    ::class ErrorHandler   -- a Rexx error handler ("org.xml.sax.ErrorHandler")
    ... as depicted above ...

41 "code06.rxj": List Elements in Document Order
Running the ooRexx Program, 3
Input: example2.html (see slide 14)

    f:\>rexx code06_text.rxj example2.html
     [html]
      [head]
       [title]
       [link]
      [body]
       [h1]
       [p]
        [span]
       [h1]
       [p]
       [p]
        [span]

42 Roundup
- Parsing any XML-encoded document is possible
  - Using BSF4ooRexx
  - Exploiting Java's functionality for parsing XML documents
- DOM parsing
  - The DOM parser first creates the parse tree
  - One may walk the DOM parse tree directly, or use the traversal methods to filter the DOM parse tree into a NodeIterator or TreeWalker
  - Rather easy, but needs enough memory for the parse tree
- Easy to exploit from ooRexx!

43 Further Information
- World Wide Web Consortium ("W3C")
  - (HTML, XHTML2)
  - (Doctype links)
- DOM specific URLs (as of )
  - (current project home)
  - (book)
  - (Java 7 docs)
  - W3C:
  - Apache Xerces2:
- Sample files installed with BSF4ooRexx
  - bsf4oorexx/samples/dom


Delivery Options: Attend face-to-face in the classroom or via remote-live attendance. XML Programming Duration: 5 Days US Price: $2795 UK Price: 1,995 *Prices are subject to VAT CA Price: CDN$3,275 *Prices are subject to GST/HST Delivery Options: Attend face-to-face in the classroom or

More information

Delivery Options: Attend face-to-face in the classroom or remote-live attendance.

Delivery Options: Attend face-to-face in the classroom or remote-live attendance. XML Programming Duration: 5 Days Price: $2795 *California residents and government employees call for pricing. Discounts: We offer multiple discount options. Click here for more info. Delivery Options:

More information

COMP9321 Web Application Engineering. Extensible Markup Language (XML)

COMP9321 Web Application Engineering. Extensible Markup Language (XML) COMP9321 Web Application Engineering Extensible Markup Language (XML) Dr. Basem Suleiman Service Oriented Computing Group, CSE, UNSW Australia Semester 1, 2016, Week 4 http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2442

More information

Chapter 1: Getting Started. You will learn:

Chapter 1: Getting Started. You will learn: Chapter 1: Getting Started SGML and SGML document components. What XML is. XML as compared to SGML and HTML. XML format. XML specifications. XML architecture. Data structure namespaces. Data delivery,

More information

CSI 3140 WWW Structures, Techniques and Standards. Representing Web Data: XML

CSI 3140 WWW Structures, Techniques and Standards. Representing Web Data: XML CSI 3140 WWW Structures, Techniques and Standards Representing Web Data: XML XML Example XML document: An XML document is one that follows certain syntax rules (most of which we followed for XHTML) Guy-Vincent

More information

XML. Jonathan Geisler. April 18, 2008

XML. Jonathan Geisler. April 18, 2008 April 18, 2008 What is? IS... What is? IS... Text (portable) What is? IS... Text (portable) Markup (human readable) What is? IS... Text (portable) Markup (human readable) Extensible (valuable for future)

More information

XML Introduction 1. XML Stands for EXtensible Mark-up Language (XML). 2. SGML Electronic Publishing challenges -1986 3. HTML Web Presentation challenges -1991 4. XML Data Representation challenges -1996

More information

Chapter 1 Getting Started with HTML 5 1. Chapter 2 Introduction to New Elements in HTML 5 21

Chapter 1 Getting Started with HTML 5 1. Chapter 2 Introduction to New Elements in HTML 5 21 Table of Contents Chapter 1 Getting Started with HTML 5 1 Introduction to HTML 5... 2 New API... 2 New Structure... 3 New Markup Elements and Attributes... 3 New Form Elements and Attributes... 4 Geolocation...

More information

Parsing XML documents. DOM, SAX, StAX

Parsing XML documents. DOM, SAX, StAX Parsing XML documents DOM, SAX, StAX XML-parsers XML-parsers are such programs, that are able to read XML documents, and provide access to the contents and structure of the document XML-parsers are controlled

More information

Document Object Model (DOM) Java API for XML Parsing (JAXP) DOM Advantages & Disadvantage &6&7XWRULDO (GZDUG;LD

Document Object Model (DOM) Java API for XML Parsing (JAXP) DOM Advantages & Disadvantage &6&7XWRULDO (GZDUG;LD &6&7XWRULDO '20 (GZDUG;LD Document Object Model (DOM) DOM Supports navigating and modifying XML documents Hierarchical tree representation of documents DOM is a language-neutral specification -- Bindings

More information

XML. Technical Talk. by Svetlana Slavova. CMPT 842, Feb

XML. Technical Talk. by Svetlana Slavova. CMPT 842, Feb XML Technical Talk by Svetlana Slavova 1 Outline Introduction to XML XML vs. Serialization Curious facts, advantages & weaknesses XML syntax Parsing XML Example References 2 Introduction to XML (I) XML

More information

Introduction p. 1 An XML Primer p. 5 History of XML p. 6 Benefits of XML p. 11 Components of XML p. 12 BNF Grammar p. 14 Prolog p. 15 Elements p.

Introduction p. 1 An XML Primer p. 5 History of XML p. 6 Benefits of XML p. 11 Components of XML p. 12 BNF Grammar p. 14 Prolog p. 15 Elements p. Introduction p. 1 An XML Primer p. 5 History of XML p. 6 Benefits of XML p. 11 Components of XML p. 12 BNF Grammar p. 14 Prolog p. 15 Elements p. 16 Attributes p. 17 Comments p. 18 Document Type Definition

More information

Web architectures Laurea Specialistica in Informatica Università di Trento. DOM architecture

Web architectures Laurea Specialistica in Informatica Università di Trento. DOM architecture DOM architecture DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); dbf.setvalidating(true); // optional default is non-validating DocumentBuilder db = dbf.newdocumentbuilder(); Document

More information

Introduction to XML. M2 MIA, Grenoble Université. François Faure

Introduction to XML. M2 MIA, Grenoble Université. François Faure M2 MIA, Grenoble Université Example tove jani reminder dont forget me this weekend!

More information
