Relaxed on the Way Towards True Validation of Compound Documents

Similar documents
Relaxed on the Way Towards True Validation of Compound Documents

Relaxed on the Way Towards True Validation of Compound Documents

Validator.nu Validation 2.0. Henri Sivonen

Grammar vs. Rules. Diagnostics in XML Document Validation. Petr Nálevka

Information Technology Document Schema Definition Languages (DSDL) Part 1: Overview

ISO/IEC INTERNATIONAL STANDARD. Information technology Document Schema Definition Languages (DSDL) Part 3: Rule-based validation Schematron

Author: Irena Holubová Lecturer: Martin Svoboda

Basic Web Pages with XHTML (and a bit of CSS) CSE 190 M (Web Programming), Spring 2008 University of Washington Reading: Chapter 1, sections

Extensible Markup Language (XML) Hamid Zarrabi-Zadeh Web Programming Fall 2013

Chapter 3 Style Sheets: CSS

Web Standards Mastering HTML5, CSS3, and XML

ISO/IEC INTERNATIONAL STANDARD. Information technology Document Schema Definition Languages (DSDL) Part 11: Schema association

Some more XML applications and XML-related standards (XLink, XPointer, XForms)

XML Applications. Prof. Andrea Omicini DEIS, Ingegneria Due Alma Mater Studiorum, Università di Bologna a Cesena

ADDING CSS TO YOUR HTML DOCUMENT. A FEW CSS VALUES (colour, size and the box model)

2010 Martin v. Löwis. Data-centric XML. Other Schema Languages

Session 23 XML. XML Reading and Reference. Reading. Reference: Session 23 XML. Robert Kelly, 2018

Markup Languages SGML, HTML, XML, XHTML. CS 431 February 13, 2006 Carl Lagoze Cornell University

XML Technologies. Doc. RNDr. Irena Holubova, Ph.D. Web pages:

Fundamentals: Client/Server

What is XHTML? XHTML is the language used to create and organize a web page:

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

Publishing Technology 101 A Journal Publishing Primer. Mike Hepp Director, Technology Strategy Dartmouth Journal Services

CHAPTER 2 MARKUP LANGUAGES: XHTML 1.0

Foreword... v Introduction... vi. 1 Scope Normative references Terms and definitions Extensible Datatypes schema overview...

XML (Extensible Markup Language

CSI 3140 WWW Structures, Techniques and Standards. Markup Languages: XHTML 1.0

Delivery Options: Attend face-to-face in the classroom or via remote-live attendance.

CSS. Lecture 16 COMPSCI 111/111G SS 2018

References differences between SVG 1.1 Full and SVG 1.2 Tiny

XML for Java Developers G Session 8 - Main Theme XML Information Rendering (Part II) Dr. Jean-Claude Franchitti

ISO/IEC TR TECHNICAL REPORT

An OASIS White Paper. Open by Design. The Advantages of the OpenDocument Format (ODF) ##### D R A F T ##### By the OASIS ODF Adoption TC For OASIS

Contents ISO/IEC FDIS Page

Module 2 (III): XHTML

Foreword... v Introduction... vi. 1 Scope Normative references Terms and definitions DTLL schema overview...

Delivery Options: Attend face-to-face in the classroom or remote-live attendance.

Agenda. Summary of Previous Session. XML for Java Developers G Session 7 - Main Theme XML Information Rendering (Part II)

Rich Web Application Backplane

CSS Lecture 16 COMPSCI 111/111G SS 2018

TRAINING GUIDE. Rebranding Lucity Web

Controlling Appearance the Old Way

Appendix D CSS Properties and Values

Introduction to Web Design CSS Reference

Contents ISO/IEC Page

Introduction to Web Design CSS Reference

COPYRIGHTED MATERIAL. Contents. Part I: Introduction 1. Chapter 1: What Is XML? 3. Chapter 2: Well-Formed XML 23. Acknowledgments

Creating A Web Page. Computer Concepts I and II. Sue Norris

Multimodality with XHTML+Voice

Reading 2.2 Cascading Style Sheets

Schemachine. (C) 2002 Rick Jelliffe. A framework for modular validation of XML documents

Contents. 1. Using Cherry 1.1 Getting started 1.2 Logging in

CMPT 165 INTRODUCTION TO THE INTERNET AND THE WORLD WIDE WEB

XML Metadata Standards and Topic Maps

Extreme Java G Session 3 - Sub-Topic 5 XML Information Rendering. Dr. Jean-Claude Franchitti

Comp 336/436 - Markup Languages. Fall Semester Week 4. Dr Nick Hayward

XML/XHTML ESA 2008/2009. Eelco Schatborn 11 september 2008

Implementing a chat button on TECHNICAL PAPER

User Interaction: XML and JSON

XHTML. XHTML stands for EXtensible HyperText Markup Language. XHTML is the next generation of HTML. XHTML is almost identical to HTML 4.

Internet publishing HTML (XHTML) language. Petr Zámostný room: A-72a phone.:

XML Extensible Markup Language

CIS 408 Internet Computing. Dr. Sunnie Chung Dept. of Electrical Engineering and Computer Science Cleveland State University

Why HTML5? Why not XHTML2? Learning from history how to drive the future of the Web

HTML Overview. With an emphasis on XHTML

extensible Markup Language (XML) Basic Concepts

Web Technologies Present and Future of XML

Outline. Link HTML With Style Sheets &6&7XWRULDO &66 ;+70/ (GZDUG;LD

User Interaction: XML and JSON

SDPL : XML Basics 2. SDPL : XML Basics 1. SDPL : XML Basics 4. SDPL : XML Basics 3. SDPL : XML Basics 5

Part A Short Answer (50 marks)

Chapter 10: Understanding the Standards

Contents. Markup Language and the need of XML. Using environment XML and growth direction. To understand dxml standard.

Create a cool image gallery using CSS visibility and positioning property

CMPT 165 Unit 3 CSS Part 1. Sept. 24 th, 2015

XHTML Modularization for RelaxNG

Standards in Flux Norman Walsh MarkLogic Corporation 14 Feb 2011

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF INFORMATION TECHNOLOGY. (An NBA Accredited Programme) ACADEMIC YEAR / EVEN SEMESTER

but XML goes far beyond HTML: it describes data

<body bgcolor=" " fgcolor=" " link=" " vlink=" " alink=" "> These body attributes have now been deprecated, and should not be used in XHTML.

Report From 'xml Schema' Is 'the Root Element Of A W3c Xml Schema Should Be Schema And Its Namespace

B4M36DS2, BE4M36DS2: Database Systems 2

Document Schema Definition Languages (DSDL) Part 7: Character Repertoire Validation Language

Creating Accessible DotNetNuke Skins Using CSS

Client-side Web Engineering 2 From XML to Client-side Mashups. SWE 642, Spring 2008 Nick Duan. February 6, What is XML?

CSS for Styling CS380

The XML Metalanguage

HTML. HTML: logical structure

Admin. Midterm 1 on. Oct. 13 th (2 weeks from today) Coursework:

Introduction to HTML5

This document is a preview generated by EVS

HTML: The Basics & Block Elements

ASSESSMENT SUMMARY XHTML 1.1 (W3C) Date: 27/03/ / 6 Doc.Version: 0.90

XHTML & CSS CASCADING STYLE SHEETS

COMP9321 Web Application Engineering

Position Paper for Ubiquitous WEB

Computer Science E-75 Building Dynamic Websites

Web Programming Step by Step

Cascading Style Sheet Quick Reference

Information Model Architecture. Version 1.0

Transcription:

Relaxed on the Way Towards True Validation of Compound Documents Petr Nálevka University of Economics, Prague Dept. of Information and Knowledge Engineering petr@nalevka.com Jirka Kosek University of Economics, Prague Dept. of Information and Knowledge Engineering jirka@kosek.cz

Relaxed on the Way Towards True Validation of Compound Documents Agenda Benefits of validation What is Relaxed Limitations of current validation approaches RELAX NG + Schematron Comparison of Relaxed with W3C validator Support for compound documents NVDL, JNVDL and compound documents Who is using Relaxed

Benefits of validation Ideal world All browsers are implementing Web standards All authors create pages according to Web standards All pages work in all browsers, interoperability is reached How to reach ideal world Web standards promotion Conformance testing Many aspects of standard compliance can be automatically tested validated

What is Relaxed HTML and XHTML validation service Web-based user interface for people Web service interface for machines Set of XHTML schemas Schemas can validate more then DTDs which are provided as a part of corresponding W3C recommendation Powerful schema languages RELAX NG and Schematron are used to overcome DTD limitations

Weaknesses of DTD validation Weak data types support Cannot express HTML datatypes e. g. colors, lenghts, multi-lenghts, integers, date & time, URIs,... No namespace support Unable to validate compound documents Unable to express complex structural relationships No rule-based validation W3C Markup Validation Service is DTD based and thus it suffers from all problems mentioned above

Advatages The power of RELAX NG and Schematron Ability to validate compound documents Optional restriction level thanks to straightforward modularity support Full expressive power of XPath and regular expressions Complex structural relationship constraints (Schematron rules) Standardized technology (ISO and OASIS standards) Disadvatages SGML/HTML 4.01 must be converted to well-formed XML before validation

<?xml version="1.0" encoding="utf-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/tr/xhtml1/dtd/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head><title>w3c validator limitations demo</title></head> <body> <h1>datatypes</h1> <table border="10%"> <tbody> <tr> <td>a</td> <td><font color="ivory">b</font></td> </tr> </tbody> </table> <h1>nested forms</h1> <form name="form2" action="process.form"> <div> <form action="process.subform"> <p>something is wrong</p> </form> </div> </form> <h1>name and ID inconsistency</h1> <form name="form1" id="form2" action="process.form"> <p>something is wrong</p> </form> <a name="form2">something is wrong</a> </body> </html> Relaxed beats W3C validator

Relaxed beats W3C validator

<?xml version="1.0" encoding="utf-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/tr/xhtml1/dtd/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head><title>w3c validator weaknesses demo</title></head> <body> <h1>datatypes</h1> <table border="10%"> <tbody> <tr> <td>a</td> <td><font color="ivory">b</font></td> </tr> </tbody> </table> <h1>nested forms</h1> <form action="process.form"> <div> <form action="process.subform"> <p>somethinkg's wrong</p> </form> </div> </form> <h1>name and ID consistency</h1> <form name="form1" id="form2" action="process.form"> <p>somethinkg's wrong</p> </form> <a name="form2">something is wrong</a> </body> </html> Specification violations Relaxed beats W3C validator

Relaxed beats W3C validator

RELAX NG Example of datatype modelling <!-- Color: Black, Green, Silver, Lime, Gray, Olive, White, Yellow, Maroon, Navy, Red, Blue, Purple, Teal, Fuchsia, Aqua, #custom --> <define name="color.datatype"> <data type="string"> <param name="pattern">[bb][ll][aa][cc][kk] [gg][rr][ee][ee][nn]...... [aa][qq][uu][aa] #[0-9A-Fa-f]{3} #[0-9A-Fa-f]{6}</param> </data> </define> <!-- Pixels: a pixel is restricted to a non-negative integer. --> <define name="pixels.datatype"> <data type="nonnegativeinteger"> <param name="pattern">[0-9]+</param> </data> </define>

Modelling complex relationships using Schematron <sch:rule context="html:*"> <sch:report test="string-length(@id) > 0 and ((preceding::html:*/@name = @id) or (following::html:*/@name = @id))"> The id and name attributes share the same namespace, they shall not collide. </sch:report> </sch:rule> <sch:rule context="html:form"> <sch:report test="descendant::html:form"> Forms cannot have any nested forms. </sch:report> </sch:rule>

What does Relaxed validate W3C Recommendations HTML 4.01, XHTML 1.0 Strict/Transitional/Frameset Widely used in real world WCAG 1.0 (partial) Compound documents Arbitrary foreign elements and attributes are allowed/disallowed XHTML1.0 + SVG1.1 XHTML1.0 + MathML2.0 XHTML1.0 + MathML2.0 + SVG1.1

W3C validator does not support compound documents Validation results for XHTML page with embedded SVG

Compound documents Documents combining more XML grammars together There are many XML languages whose combination can bring a real value-added (rich-client, web-design, semantic queries...) Already supported in some browsers e. g. SVG+XHTML in Firefox and Opera SVG SMIL MathML EGIX presentation VoiceXML XForms XHTML XLink rich-client RDF RSS XTM vcard in XML metadata

NVDL (ISO/IEC 19757-4) NVDL = Namespace-based Validation and Dispatching Language International standard for compound document validation Advantages Validator transparent NVDL engine distributes XML fragments from particular namespace to appropriate validators Schema language neutral Different schema languages can be combined (W3C XML Schema, RELAX NG,...) Real life schemas are writen in many different languages Standardized and flexible way for expressing which grammars may be used in particular context

Relaxed and compound documents Today Compound document validation using RELAX NG Special compound document schema must be created manually for each combination of mark-up vocabularies Future All schemas must be converted to RELAX NG prior combining JNVDL (http://sourceforge.net/projects/jnvdl) Java-based NVDL implementation Straightforward compound document validation

Relaxed in use On-line service for HTML document authors Available at http://relaxed.cz Better outputs, compound document support... European Internet Accessibility Observatory Uses Relaxed validation engine The EIAO project will establish the technical basis for a European Internet Accessibility Observatory. Frequently updated assessment data will be available online from a data warehouse providing a basis for benchmarking, policymaking, research and actions to develop accessibility to Internet. Henri Sivonen's validation service Available at http://hsivonen.iki.fi/validator/ Uses Relaxed XHTML and WCAG schemas

Thank you for your attention Try http://relaxed.cz yourself!