XML-Based Representation. Robert L. Kelsey

Similar documents
PERFORMANCE OF PENTIUM III XEON PROCESSORS IN A BUSINESS APPLICATION AT LOS ALAMOS NATIONAL LABORATORY GORE, JAMES E.

LA-UR Approved for public release; distribution is unlimited.

LA-UR Approved for public release; distribution is unlimited.

Forrest B. Brown, Yasunobu Nagaya. American Nuclear Society 2002 Winter Meeting November 17-21, 2002 Washington, DC

Clusters Using Nonlinear Magnification

MCNP Monte Carlo & Advanced Reactor Simulations. Forrest Brown. NEAMS Reactor Simulation Workshop ANL, 19 May Title: Author(s): Intended for:

Standard Business Rules Language: why and how? ICAI 06

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

Introduction to XML. Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University

Web site Image database. Web site Video database. Web server. Meta-server Meta-search Agent. Meta-DB. Video query. Text query. Web client.

Introduction to XML 3/14/12. Introduction to XML

APPLYING KNOWLEDGE BASED AI TO MODERN DATA MANAGEMENT. Mani Keeran, CFA Gi Kim, CFA Preeti Sharma

Java Based Open Architecture Controller

Agenda. Summary of Previous Session. XML for Java Developers G Session 6 - Main Theme XML Information Processing (Part II)

IBM Research Report. Model-Driven Business Transformation and Semantic Web

XML APIs Testing Using Advance Data Driven Techniques (ADDT) Shakil Ahmad August 15, 2003

Labelling & Classification using emerging protocols

Google indexed 3,3 billion of pages. Google s index contains 8,1 billion of websites

SXML: an XML document as an S-expression

Metadata in the Driver's Seat: The Nokia Metia Framework

NISO STS (Standards Tag Suite) Differences Between ISO STS 1.1 and NISO STS 1.0. Version 1 October 2017

Analysis Exchange Framework Terms of Reference December 2016

Indexing into Controlled Vocabularies with XML

Christopher Sewell Katrin Heitmann Li-ta Lo Salman Habib James Ahrens

Infrastructure for Multilayer Interoperability to Encourage Use of Heterogeneous Data and Information Sharing between Government Systems

Khoral Research, Inc. Khoros is a powerful, integrated system which allows users to perform a variety

Constructing distributed applications using Xbeans

2. PRELIMINARIES MANICURE is specically designed to prepare text collections from printed materials for information retrieval applications. In this ca

Use of XML Schema and XML Query for ENVISAT product data handling

ISO/IEC INTERNATIONAL STANDARD. Information technology Document Schema Definition Languages (DSDL) Part 3: Rule-based validation Schematron

Automated Classification. Lars Marius Garshol Topic Maps

Creating Enterprise and WorkGroup Applications with 4D ODBC

and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

Semantic Web Domain Knowledge Representation Using Software Engineering Modeling Technique

ISO INTERNATIONAL STANDARD. Language resource management Feature structures Part 1: Feature structure representation

The Specifications Exchange Service of an RM-ODP Framework

WHY WE NEED AN XML STANDARD FOR REPRESENTING BUSINESS RULES. Introduction. Production rules. Christian de Sainte Marie ILOG

Event Metamodel and Profile (EMP) Proposed RFP Updated Sept, 2007

Netml A Language and Website for Collaborative Work on Networks and their Algorithms

Inter-Project Dependencies in Java Software Ecosystems

Hospitality Industry Technology Integration Standards Glossary of Terminology

[MS-PICSL]: Internet Explorer PICS Label Distribution and Syntax Standards Support Document

Recommendations of the ad-hoc XML Working Group To the CIO Council s EIEIT Committee May 18, 2000

Semantics for and from Information Models Mapping EXPRESS and use of OWL with a UML profile for EXPRESS

The Xlint Project * 1 Motivation. 2 XML Parsing Techniques

The Wonderful World of XML. Presented by Laurie K. Brooks AML Consulting, Inc.

Beginning To Define ebxml Initial Draft

Oracle Enterprise Data Quality for Product Data

XML for Java Developers G Session 3 - Main Theme XML Information Modeling (Part I) Dr. Jean-Claude Franchitti

XML for Java Developers G Session 2 - Sub-Topic 1 Beginning XML. Dr. Jean-Claude Franchitti

JENA: A Java API for Ontology Management

LA-UR- Title: Author(s): Intended for: Approved for public release; distribution is unlimited.

extensible Markup Language

Software Component Relationships. Stephen H. Edwards. Department of Computer Science. Virginia Polytechnic Institute and State University

Progress report on INSTAT/XML

1 Introduction The history of information retrieval may go back as far as According to Maron[7], 1948 signies three important events. The rst is

Benchmarking the CGNS I/O performance

Contents. Markup Language and the need of XML. Using environment XML and growth direction. To understand dxml standard.

Semantic Analysis. Lecture 9. February 7, 2018

Developing Web-Based Applications Using Model Driven Architecture and Domain Specific Languages

SCR*: A Toolset for Specifying and. Analyzing Software Requirements? Constance Heitmeyer, James Kirby, Bruce Labaw and Ramesh Bharadwaj

mapping IFC versions R.W. Amor & C.W. Ge Department of Computer Science, University of Auckland, Auckland, New Zealand

Hospital System Lowers IT Costs After Epic Migration Flatirons Digital Innovations, Inc. All rights reserved.

xml:tm Using XML technology to reduce the cost of authoring and translation

Beyond DTDs: Constraining Data Content. Language Processing and Specification Group Computer Science Department University of Minho Portugal

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful?

ISO/IEC TR TECHNICAL REPORT

Automatic Metadata Extraction for Archival Description and Access

From Object Composition to Model Transformation with the MDA

Introduction to Programming

Two Image-Template Operations for Binary Image Processing. Hongchi Shi. Department of Computer Engineering and Computer Science

Jumpstarting the Semantic Web

An UML-XML-RDB Model Mapping Solution for Facilitating Information Standardization and Sharing in Construction Industry

Combining Different Business Rules Technologies:A Rationalization

Xyleme Studio Data Sheet

For our sample application we have realized a wrapper WWWSEARCH which is able to retrieve HTML-pages from a web server and extract pieces of informati

Minsoo Ryu. College of Information and Communications Hanyang University.

M359 Block5 - Lecture12 Eng/ Waleed Omar

IoT Standards Ecosystem, What s new?

Information management - Topic Maps visualization

ActiveVOS Technologies

Model driven Engineering & Model driven Architecture

second_language research_teaching sla vivian_cook language_department idl

Evaluation of Commercial Web Engineering Processes

ISO/IEC/ IEEE INTERNATIONAL STANDARD. Systems and software engineering Architecture description

InfoSphere Master Data Management Reference Data Management Hub Version 10 Release 0. User s Guide GI

The Information Technology Program (ITS) Contents What is Information Technology?... 2

A Knowledge-Based System for the Specification of Variables in Clinical Trials

FIBO Operational Ontologies Briefing for the Object Management Group

Using the Identify Database for Self-Affiliation: A Guide

DON XML Achieving Enterprise Interoperability

Introduction to XML. XML: basic elements

Transforming Military Command and Control Information Exchange

DITA for Enterprise Business Documents Sub-committee Proposal Background Why an Enterprise Business Documents Sub committee

SCOS-2000 Technical Note


METADATA REGISTRY, ISO/IEC 11179

Enhancement of CAD model interoperability based on feature ontology

Open XML Requirements Specifications, a Xylia based application

M2 Glossary of Terms and Abbreviations

Transcription:

LA-UR-01-1036 Approved for public release; distribution is unlimited. Title: XML-Based Representation Author(s): Robert L. Kelsey Submitted to: http://lib-www.lanl.gov/la-pubs/00357118.pdf Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the University of California for the U.S. Department of Energy under contract W-7405-ENG-36. By acceptance of this article, the publisher recognizes that the U.S. Government retains a nonexclusive, royaltyfree license to publish or reproduce the published form of this contribution, or to allow others to do so, for U.S. Government purposes. Los Alamos National Laboratory requests that the publisher identify this article as work performed under the auspices of the U.S. Department of Energy. Los Alamos National Laboratory strongly supports academic freedom and a researcher's right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness. FORM 836 (10/96)

XML-Based Representation Robert L. Kelsey Los Alamos National Laboratory Mail Stop F645 Los Alamos, NM 87545 U.S.A. Abstract For focused applications with limited user and use application communities, XML can be the right choice for representation. It is easy to use, maintain, and extend and enjoys wide support in commercial and research sectors. When the knowledge and information to be represented is object-based and use of that knowledge and information is a high priority, then XML-based representation should be considered. This paper discusses some of the issues involved in using XML-based representation and presents an example application that successfully uses an XML-based representation. Keywords: knowledge representation, XML, object-based representation 1 Representation Issues With XML The extensible Markup Language (XML) [13] and its parent the Standard Generalized Markup Language (SGML) [7] are languages used to create markup languages. They can also be used for creating information and knowledge representations for specic types of applications. Applications with knowledge and information that can be decomposed with objects are particularly well suited for XMLbased representations [9]. Additionally, applications that must repeatedly manipulate the information and knowledge represented can be well served with an XML-based representation. XML contains three constructs that are used to create a document type denition (DTD) or grammar for a markup language. The three constructs are element, attribute, and entity. Attribute constructs are associated with element constructs. An entity construct is like a macro that can be expanded. Any duplicated denitions can be specied as an entity and called to save time and lengthy text repetition. In an object-based context an element isan object and an attribute is an object attribute. Attributes are the properties and characteristicsofanobject. An element construct can contain other elements. In this way, ataxonomyofobjects can be represented. An XML-based representation can contain aggregation and inheritance relationships, although these relationships are not as explicit as what may be found in ontological representations. Other relationships and notation which maycontribute to a logical reasoning mechanism are also missing from an XML-based representation. Some of these relationships and notation could be explicitly added to an XML-based representation. It has been argued that XML lacks the formal semantics necessary for representing complex objects and relations [4]. This is agreed, but many languages and representation schemes suer from the same problem. XML is useful for representing declarative knowledge and information. Procedural knowledge and information is another issue. Procedural knowledge is knowledge about how to perform a task or activity [1]. This type of knowledge is somewhat dicult to dene and much harder to represent. Production rules are often used to represent procedural knowledge, but rules are not that distinct from other forms of

representation for declarative knowledge [11]. Data interchange and knowledge and information sharing contribute to adding semantics to XML, especially due to e-commerce applications [4, 12]. Interchange and sharing are application uses which require a certain level of procedural knowledge or \application-level processing semantics" [4]. In the example application discussion that follows the XML-based representation is for declarative type knowledge and information. The knowledge and information necessary to achieve application use and processing is embodied in the programs, scripts, and routines that manipulate the knowledge and information in the XML-based representation. An XML-based representation is not the best path to take for all applications. The tradeos depend on the application, its size, the number of users, time to deployment, type of knowledge/information, and other considerations. If the application makes heavy use of the represented knowledge and information and is focused in its use and users, then an XML-based representation may be the right choice. The following sections describe an example application that uses an XML-based representation that is showing success with its users. 2 Application Example 2.1 Introduction An application example that not only illustrates the use of XML for representation but how XML can help tie together separate pieces of knowledge and information is an enterprise system for simulation of physical systems. At the foundation of the system is an object-based representation of physics and simulation constructs and concepts implemented in XML. Additional XML tools and software facilitate the storage, sharing, and use of the represented knowledge and information. These tools include an XML-specic database, conversion programs written in Java, and Java/XML standards and APIs for XML content manipulation. In this application users have anumber of in-house physical simulation codes at their disposal. These codes have been created and modied over the course of many years and are written in several dierent languages including FORTRAN, C, and C++. Each code has its own unique input le format and execution nuances. There are many expert users in the use of one or two of these codes, but there are few generalists in the use of many of these codes. The enterprise system facilitates the use of more of these codes by an individual user. It also provides the means for migrating users and problem sets to more modern codes. 2.2 How XML Is Used An XML DTD was created that serves as the representation of physical system simulators. The representation is a taxonomy ofphysics and simulation objects and their associated properties and characteristics. The DTD is also the grammar of the representation language. Creating an XML instance of the DTD creates a description of a particular physical simulation problem. The representation at the highest level includes objects for a computational mesh, simulation startup information, a material set, a geometric body set, and simulation output information. These highest-level objects are further dened with sub-levels of objects as much as four levels deep. Figure 1 shows an abstract view of a portion of the material set object in the representation. The attributes associated with these objects are not shown. A user may create a problem description (in XML) from scratch using a graphic user interface for this purpose. The interface is called XEENA [6] and is an XML editor from IBM Alphaworks. However, itismoretypical for a user to convert an input le (for a particular simulation code) to the XML-based representation and then to another simulation code input le format. An XML database provides for a repository

Material Set Material EOS Model Ideal Gas JWL USUP Sesame Modified Osborne HE Model CJBurn Programmed Burn Forest Fire Multishock Forest Fire Figure 1: Abstract view of a portion of the representation. of known and common problem descriptions. These are example problems which can be used for testing simulations for verication and validation purposes and for training new users. The XML database is also used as a database of materials and their associated properties. In this use, rather than a complete problem description, only instances of the material portion of the representation are stored in the database. The material database contains a set of standard materials and their properties. The database can be queried by material name and by material description. Materials can be accessed from the database and put directly into an XML problem description. This ability is especially useful for creating problem descriptions from scratch or substituting materials in an existing problem description. Another part of the enterprise system is a language [8] for specifying a suite or series of problem runs. This language is also implemented in XML. A user can specify an example problem and make substitutions (such asma- terials) in that problem to create a series of problems. This is useful for parameter testing and making comparisons between simulation codes. The example problem (in XML form) as well as the problem portions being substituted can originate in les and the database. Some types of substitutions can be specied in the language explicitly or with functions or equations. Each individual run species a target simulation code and as many substitutions as desired. This allows the creation of a problem description from distributed and disparate sources. Once substitutions for a given run are completed, the XML problem description is converted to the target simulation code input format and the problem is submitted for execution. The language also allows the user to specify what output parameters are to be compared between runs. The purpose of the language is not only job (execution) control, but to facilitate problem creation from disparate sources and to specify basic output analysis. Figure 2 shows the enterprise system and its components. Along the left are sources for physical system problems or pieces to build those problems. These sources may be in XML format or in an application code specic format. The database sources may be queried and put directly in the XML problem description. The interchange format is XML and the target format is application code specic. The XEENA interface can be used to create problem descriptions from scratch and from these

Source Format Interchange Format Target Format Code Specific Format Example Database Conversion Query Description (XML) Conversion Code Specific Format Creation Materials Database Query XEENA Run Specification (XML) Simulation Driver Execution Results Result Analysis Documentation Execution Figure 2: The components of the enterprise system for physical system simulation. sources. The problem execution portion of the enterprise system takes the XML problem description and an XML run specication and executes the problem. A simulation driver exists for each target application code. The driver contains application specic details needed to submit and execute the problem for the particular application code. Execution results are generated and a basic analysis is performed on the results. The documentation can be tailored for brief descriptions, formal reports, and slide presentations. The XML databases are an open-source version of dbxml [5]. The conversion routines make use of the Xerces Java parser [3] for XML and the SAXON XSLT processor [10]. The documentation tools make use of Xalan Java [2]. 3 Results And Conclusions Users regularly use the conversion routines for input le format conversion. The routines are especially used for migrating old and legacy application code input les to newer application code input les. There is some interest in newer application codes adopting the XML problem description as their input le format. Whenever possible, open source tools and code were used to help build the enterprise system application. There is an increasing number of XML tools available, both from open source and commercial vendors. This availability contributes to the rapid creation of prototypes. Open source use helps avoid reinvention and can help ensure that the most recent advances in XML are being used. This work has shown that XML is useful for quickly creating application-specic and application-oriented representations. The representations are relatively easy to create, maintain, and extend. They are easy to use and easy to build applications around. Part of the ease of use is due to the availability and increasing availability of standards, APIs, and tools for XML. However, XML-based representations are not without problems. The biggest problem is that the representations lack enough semantic content for some applications. If the terminology and/or vocabulary expressed in an XML-based representation is not known and understood by the users and use applications, then the representation will likely be worthless. More semantic content isnec- essary for representations that will be shared far and wide among users and use applications without complete understanding of the knowl-

edge and information being represented. These are some of the tradeos which must be considered. When an application is focused and intends to use represented knowledge and information, then an XML-based representation should be considered and attempted. It is faster to create an XML-based representation than a full-scale ontology. If an XML-based representation does not work for the chosen application, then it can be cast aside and more traditional methods can be applied. A test of XML does not need a lot of time and eort. It is importantto remember that in applications, using what works or what is good enough can be the best path to take. References [1] John R. Anderson. Cognitive Psychology and Its Implications. W.H. Freeman and Company, New York, NY, fourth edition, 1995. [2] The Apache XML Project. Xalan Java, 2000. http://xml.apache.org. [3] The Apache XML Project. Xerces Java Parser, 2000. http://xml.apache.org. [4] Robin Cover. XML and semantic transparency. Technical report, OASIS, November 1998. http://xml.coverpages.org. [5] The dbxml Group, L.L.C. dbxml XML Database Application Server, 2000. http://www.dbxml.org. [6] IBM Corporation, IBM Haifa Research Lab, Israel. XEENA, 1999. http://www.alphaworks.ibm.com/tech/ xeena. [7] International Organization for Standardization, Geneva. ISO 8879:1986 Information processing - Text and oce systems - Standard Generalized Markup Language (SGML), October 1986. [8] Robert L. Kelsey, Keith R. Bisset, and Robert B. Webster. Automated parametric execution and documentation for largescale simulations. In Enabling Technology for Simulation Science V,volume 4367, Bellingham, WA, 2001. Society of Photo- Optical Instrumentation Engineers, Society of Photo-Optical Instrumentation Engineers. [9] Robert L. Kelsey and Robert B. Webster. Analyzing use cases for knowledge acquisition. In Applications and Science of Computational Intelligence III, volume 4055, pages 384{391, Bellingham, WA, 2000. Society of Photo-Optical Instrumentation Engineers, Society of Photo-Optical Instrumentation Engineers. [10] Michael Kay of ICL. The SAXON XSLT Processor, 2000. http://users.iclway.co.uk/mhkay/saxon/. [11] Elaine Rich and Kevin Knight. Articial Intelligence. McGraw-Hill, Inc., New York, NY, second edition, 1991. [12] Howard Smith and Kevin Poulter. Share the ontology in xml-based trading architectures. Communications of the ACM, 42(3):110{111, March 1999. [13] W.W.W. Consortium. Extensible Markup Language, W3C Recommendation, October 2000. http://www.w3.org/tr/recxml.