Modeling XML Vocabularies with UML: Part I


David Carlson, CTO
Ontogenics Corp.
dcarlson@ontogenics.com
http://xmlmodeling.com

Copyright 2001 David Carlson, August 22, 2001

The arrival of the W3C's XML Schema specification has evoked a variety of responses from software developers, system integrators, XML document analysts and authors, and designers of B2B vocabularies. Some like the richer structure and semantics that these new schemas can capture when compared to DTDs, while others complain about excessive complexity. Many find that the resulting schemas are difficult to share with wider audiences of users and business partners.

I overlook many of these differences of opinion and simply view XML Schema as an implementation syntax for models of business vocabularies. Other forms of model representation and presentation are more effective than W3C XML Schema when specifying new vocabularies or sharing definitions with users. In particular, I favor the Unified Modeling Language (UML) as a widely adopted standard for system specification and design. My goal in this article and the following two in this series is to share some thoughts about how these two standards are complementary and to work through a simple example that makes the ideas concrete.

Although this discussion is focused on the W3C XML Schema specification, the same concepts are easily transferred to other XML schema languages. Indeed, I have already applied the same techniques to creating and reverse engineering DTDs and SOX schemas, as well as RELAX, TREX, and their new integrated offspring RELAX NG. In general, I will use the term "schema" with a lowercase "s" when referring to this entire family of XML schema languages.

The Role of Models in XML Applications

Few people can fully comprehend all aspects of a large inter-enterprise system at one time; they must divide and conquer the problem as a set of alternate models and views. Each of these models deliberately ignores aspects of the system that are not relevant to its purpose. Building these kinds of models is fundamental to the way we cope with the complexity of everyday life: by ignoring unnecessary details, we can focus on the task at hand. Different stakeholder groups have different needs with respect to abstraction and focus.

In the context of B2B system integration, all business partners must agree on the information models that define the vocabulary for task-oriented communication. The models include both the data structure of the XML documents that are exchanged and the process models of the extended dialogs that are required to complete complex business transactions.

Historically in system analysis and design, there have been a variety of techniques, tools, and methodologies for guiding and supporting these alternative models of the system structure and behavior. When no formal methods or tools are applied, models are still created using PowerPoint, Visio, or paper and pencil to help communicate a system's purpose and function. Even when you don't write them down, you create models in your mind as a way to comprehend the myriad of details. An XML schema is also a vocabulary model, written in the syntax of that specification language.

A high-level process for developing XML vocabularies is shown in Figure 1 below. It includes three decision points that determine the final vocabulary definition, regardless of which schema language is used. Data-oriented and text-oriented applications may have different usage requirements. For example, a data-oriented vocabulary can be optimized for serialization of objects or database query results, and its constraints should be carefully aligned with the datatypes and referential integrity constraints of its sources. These data-oriented documents may never be viewed by humans, other than by developers testing the application.

A text-oriented vocabulary often has human users who need to edit the XML documents, with or without the assistance of GUI editing tools. Its structure must be easily understood by webmasters who write stylesheets that transform and present the documents' content. An XML vocabulary design that works perfectly for data interchange might cause human users to call for the lynching of its developers. Don't forget the needs of your users when creating the XML schema!

[Figure 1: UML activity diagram for schema development process. Activities: Define Vocabulary Terms; Define Relationships and Constraints; Analyze Human Factors of Vocabulary; Define EDI Mapping; Assess Presentation Requirements. Decision points: Primary use? [text-oriented | data-oriented]; Human authors? [Yes | No]; EDI Integration? [Yes | No].]
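The three decision points of Figure 1 can be sketched as a small planning function. This is only one plausible reading of the diagram, whose layout did not survive in this copy: the assumption here is that the "Human authors?" question belongs to the text-oriented branch and the "EDI Integration?" question to the data-oriented branch. The names VocabularyProfile and plan_schema_development are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class VocabularyProfile:
    """Answers to the three decision points (hypothetical field names)."""
    text_oriented: bool     # "Primary use?": True = text-oriented, False = data-oriented
    human_authors: bool     # "Human authors?"
    edi_integration: bool   # "EDI Integration?"


def plan_schema_development(profile: VocabularyProfile) -> list[str]:
    """Return the activities to perform, in order.

    ASSUMPTION: the branch structure below is inferred from the activity
    and decision labels of Figure 1, not taken from the diagram itself.
    """
    # Both branches start with the same two activities.
    steps = ["Define Vocabulary Terms", "Define Relationships and Constraints"]
    if profile.text_oriented:
        if profile.human_authors:
            steps.append("Analyze Human Factors of Vocabulary")
        steps.append("Assess Presentation Requirements")
    elif profile.edi_integration:
        steps.append("Define EDI Mapping")
    return steps


if __name__ == "__main__":
    for step in plan_schema_development(VocabularyProfile(True, True, False)):
        print(step)
```

Whatever the exact branch layout, the point of the sketch is that the same two opening activities feed every vocabulary, and the later work depends on who or what consumes the documents.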

The process diagram in Figure 1 is a UML activity diagram, which is one of nine diagram types defined by that standard. This diagram was created using Rational Rose, one of the most widely used UML modeling tools. Most of our discussion, however, is focused on the UML class diagram, which is used to specify the static information structure of a system: an XML vocabulary in our application context.

What is UML?

The Unified Modeling Language (UML) defines a standard language and graphical notation for creating models of business and technical systems. Contrary to popular opinion, UML is not limited to use as a tool for programmers. The UML defines model types that span a range from functional requirements and activity workflow models to class structure design and component diagrams. These models, and a development process that uses them, improve and simplify communication among an application's many diverse stakeholders.

A UML class diagram can be constructed to visually represent the elements, relationships, and constraints of an XML vocabulary. With a little initial coaching, class diagrams allow complex vocabularies to be shared with non-technical business stakeholders. A very simple subset of a product catalog vocabulary is shown as a class diagram in Figure 2 [1].

[Figure 2: A simple UML class diagram. Classes: CatalogItem (description : string, listPrice : double, sku : string, globalIdentifier : string) and Organization (addressLine [0..3] : string). Association end: +supplier, multiplicity 1..*. Operation: computeTax() : float.]

The primary elements of a UML class diagram are:

Class: this example defines two classes, CatalogItem and Organization. A class represents an aggregation of structural features and defines a namespace for those feature names. Thus, both classes can contain an attribute named name, but their class namespace scope makes the two attributes distinct.
Attribute: each class may optionally define a set of attributes. Each attribute has a type; in this example string, double, and float refer to the built-in datatypes as defined by the XML Schema specification. For those of you thinking ahead to XML schema design, specifying a UML attribute does not limit the schema to an XML attribute; the mapping to schema syntax allows either an XML attribute or a child element.
Operation: the computeTax() operation of CatalogItem specifies part of the behavior for this class. In other words, what does the class do, in addition to defining the structure of its data? In object-oriented parlance, if you send a computeTax message to a CatalogItem object, it will return a floating-point data value. This operation does not expect any parameters, but they could be specified between the parentheses. We will not use class operations in the specification of XML vocabulary, but their definition would be critical to Web Services, especially WSDL specification of SOAP messages.
Association: an association relates two or more classes in a model. If an association has an arrow on one end, it means that the association is usually navigated in one direction, which provides a hint to the design and implementation of this vocabulary.
Role & Multiplicity: the end of an association may specify the role of the class; the Organization plays a supplier role for a CatalogItem in this model. In addition, the 1..* multiplicity means that there must be one or more suppliers for each catalog item.
Generalization: although Figure 2 does not include class inheritance (one class as a generalization of another), this structure is fundamental to object-oriented models and is included in the next expanded example.

Conceptual Models of XML Vocabulary

Now that you understand the basics of UML class diagrams, let's apply them to a larger XML vocabulary design. We'll work with the purchase order vocabulary that is used in the XML Schema Part 0: Primer document. That example is first introduced in section 2.1 and then elaborated throughout the W3C specification. The model defined in this article adds international addresses and multi-schema support as explained in section 4.1 of the W3C specification. If you are new to XML Schema, I suggest that you review the Primer after reading this article, then compare our UML design process in these three articles with the same purchase order vocabulary in the schema specification.

The purchase order vocabulary is defined in two modules, corresponding to the core PurchaseOrder type and a separate reusable Address module specification. In UML, these modules are called packages. The first package specification is shown as a UML class diagram in Figure 3. The PurchaseOrder class has two attributes and three associations that define its structure. Several of the attributes in this diagram include a multiplicity specification of [0..1], which means that those attribute values are optional: either 0 or 1 occurrences. The Address class plays both a shipTo and a billTo role in association with a PurchaseOrder. (Hint: these might become shipTo and billTo child elements in the schema.) The multiplicity of 1 means that a PurchaseOrder must have exactly one of each address role.

On the Item class, notice that a quantity is of type QuantityType. This type is defined as another class in the UML model. In the same diagram, QuantityType is defined as a subclass of positiveInteger, which is annotated as coming from the XSD_Datatypes package in this UML model. Thus, a quantity is a specialized kind of positive integer. Both QuantityType and SKU are user-defined datatypes, and both include an attribute that further restricts their intended usage. The pattern and maxExclusive attributes are assigned values that are used at later stages of the design process to guide XML Schema generation.

Finally, the class name of Address is shown in italics, which means that it is an abstract class that is not intended to be used directly. As we'll see next, Address is further specified in another UML class diagram.
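To make the role of the user-defined datatypes concrete, here is a minimal sketch of the Item class with its two restricted types. It is not part of the UML model or of any tool described in this series; the Python names (check_sku, check_quantity, part_num, and so on) are hypothetical. The facet values are the ones printed in this copy of Figure 3. Note that the SKU pattern appears there with lowercase letters, whereas the W3C Primer's SKU type uses uppercase letters, so the pattern below should be treated as an assumption.

```python
import re
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional

# Facet values as printed in this copy of Figure 3 (assumed to be accurate).
SKU_PATTERN = re.compile(r"\d{3}-[a-z]{2}")  # pattern facet of SKU
QUANTITY_MAX_EXCLUSIVE = 100                 # maxExclusive facet of QuantityType


def check_sku(value: str) -> str:
    """SKU restricts string; the pattern must match the whole value."""
    if SKU_PATTERN.fullmatch(value) is None:
        raise ValueError(f"not a valid SKU: {value!r}")
    return value


def check_quantity(value: int) -> int:
    """QuantityType specializes positiveInteger, so 1 <= value < 100."""
    if not 1 <= value < QUANTITY_MAX_EXCLUSIVE:
        raise ValueError(f"quantity out of range: {value}")
    return value


@dataclass
class Item:
    part_num: str                     # partNum : SKU
    product_name: str                 # productName : string
    quantity: int                     # quantity : QuantityType
    us_price: Decimal                 # USPrice : decimal
    comment: Optional[str] = None     # comment [0..1] : string
    ship_date: Optional[date] = None  # shipDate [0..1] : date

    def __post_init__(self) -> None:
        # Enforce the facets of the two user-defined datatypes.
        check_sku(self.part_num)
        check_quantity(self.quantity)


if __name__ == "__main__":
    print(Item("872-aa", "Lawnmower", 1, Decimal("148.95")))
```

The [0..1] multiplicities become optional fields with a default of None, while the facets become checks that run when an Item is created. A schema generator would instead carry the same facet values through to the generated schema.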

[Figure 3: Conceptual model of purchase order vocabulary. Classes: PurchaseOrder (orderDate [0..1] : date, comment [0..1] : string); Item (partNum : SKU, productName : string, quantity : QuantityType, USPrice : decimal, comment [0..1] : string, shipDate [0..1] : date); Address (street : string, city : string). Association ends: +shipTo, multiplicity 1; +billTo, multiplicity 1; +items, multiplicity 0..*. Datatypes: string and positiveInteger (from XSD_Datatypes); SKU with pattern = \d{3}-[a-z]{2}; QuantityType with maxExclusive = 100.]

The Address package specification, shown in Figure 4, follows a similar logic. In this diagram, both USAddress and UKAddress are specialized subtypes of Address. In common object-oriented interpretation, this means that both of these subtypes inherit the three attributes defined in their superclass. The exportCode attribute of UKAddress is assigned an initial value of 1.

[Figure 4: Modularized Address schema component. Classes: Address (street : string, city : string); USAddress (state : USState, zip : positiveInteger); UKAddress (exportCode : positiveInteger = 1, postcode : UKPostcode). Datatypes: string (from XSD_Datatypes); <<enumeration>> USState with values AK, AL, AR, PA; Postcode with length = 7; UKPostcode with pattern = [A-Z]{2}\d\s\d[A-Z]{2}.]
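The generalization in Figure 4 can be sketched in code as an abstract base class with two subtypes. This is an illustration under stated assumptions, not generated output: the text mentions three inherited attributes, but only street and city survive in this copy of the figure, so the name field below is an assumption taken from the W3C Primer's address type. The USState values are limited to the four visible in Figure 4, and the Python identifiers are hypothetical.

```python
import re
from dataclasses import dataclass
from enum import Enum


class USState(Enum):
    # Only the enumeration values visible in Figure 4.
    AK = "AK"
    AL = "AL"
    AR = "AR"
    PA = "PA"


UK_POSTCODE = re.compile(r"[A-Z]{2}\d\s\d[A-Z]{2}")  # UKPostcode pattern facet
POSTCODE_LENGTH = 7                                  # Postcode length facet


@dataclass
class Address:
    name: str    # ASSUMPTION: third inherited attribute, per the W3C Primer
    street: str
    city: str

    def __post_init__(self) -> None:
        # Address is abstract in the UML model: only subtypes may be created.
        if type(self) is Address:
            raise TypeError("Address is abstract; use USAddress or UKAddress")


@dataclass
class USAddress(Address):
    state: USState
    zip: int     # zip : positiveInteger

    def __post_init__(self) -> None:
        super().__post_init__()
        if self.zip < 1:
            raise ValueError("zip must be a positive integer")


@dataclass
class UKAddress(Address):
    postcode: str          # postcode : UKPostcode
    export_code: int = 1   # exportCode : positiveInteger = 1 (initial value)

    def __post_init__(self) -> None:
        super().__post_init__()
        if len(self.postcode) != POSTCODE_LENGTH or not UK_POSTCODE.fullmatch(self.postcode):
            raise ValueError(f"not a valid UK postcode: {self.postcode!r}")


@dataclass
class PurchaseOrder:
    ship_to: Address   # role +shipTo, multiplicity 1
    bill_to: Address   # role +billTo, multiplicity 1
    # The +items association and the optional attributes are omitted for brevity.


if __name__ == "__main__":
    us = USAddress("Alice Smith", "123 Maple Street", "Mill Valley", USState.PA, 90952)
    uk = UKAddress("Helen Zoe", "47 Eden Street", "Cambridge", "CB1 1JA")
    print(PurchaseOrder(ship_to=us, bill_to=uk))
```

Because both roles are typed as Address, a purchase order accepts either subtype in either role, which is the practical effect of the generalization in the model.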

Design Models of XML Schemas

Now that we've created a conceptual model of our XML vocabulary's content and gained approval of all business and technical stakeholders, what next? As hinted in previous sections, there are numerous alternatives available when mapping this model to XML schema constructs. Are the UML attributes and association ends mapped to XML attributes or elements? How is UML generalization of classes and datatypes mapped into schema definitions? How does this mapping differ when the target schema language is changed from W3C XML Schema to RELAX NG? What about DTDs?

If you refer back to the schema development process illustrated in Figure 1, the next design task also depends on whether this vocabulary is data-oriented or text-oriented. Because the purchase order vocabulary is data-oriented, most of the remaining design decisions relate to deployment issues: developer conventions for using XML attributes or child elements, datatype alignment with other sources and destinations of the data to be exchanged using this vocabulary, and anticipated future requirements for extending this vocabulary or combining it with other XML namespaces.

If this were a text-oriented application, then content managers and authors would have further input on design choices. For example, most human authors prefer XML document structures that avoid excessive use of container elements to group related content elements, whereas such containers are common practice in data-oriented applications. Also, the order of elements in a document is often more important to human authors and readers than it is to data parsing.

This article is the first of three articles on modeling XML vocabularies. Its focus on capturing the conceptual model of a vocabulary is the logical first step in the development process. The second article presents a list of design choices and alternative approaches for mapping UML to W3C XML Schema.
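As a preview of the kind of mapping choice involved, the following minimal sketch turns the same UML attributes into either XML attributes or child elements of a W3C XML Schema complex type. It is not the generator used in this series; the function complex_type and its parameters are hypothetical, and it relies only on Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def complex_type(name: str,
                 properties: list[tuple[str, str, bool]],
                 as_attribute: set[str]) -> ET.Element:
    """Build an xs:complexType from (name, xsd_type, optional) triples.

    Properties named in as_attribute become xs:attribute declarations;
    all others become child xs:element declarations in an xs:sequence.
    """
    ctype = ET.Element(f"{{{XS}}}complexType", {"name": name})
    # The sequence is created first so that attribute declarations,
    # appended later, follow the content model as XML Schema requires.
    sequence = ET.SubElement(ctype, f"{{{XS}}}sequence")
    for prop, xsd_type, optional in properties:
        if prop in as_attribute:
            ET.SubElement(ctype, f"{{{XS}}}attribute", {
                "name": prop,
                "type": xsd_type,
                "use": "optional" if optional else "required",
            })
        else:
            element = ET.SubElement(sequence, f"{{{XS}}}element",
                                    {"name": prop, "type": xsd_type})
            if optional:
                element.set("minOccurs", "0")  # UML multiplicity [0..1]
    return ctype


if __name__ == "__main__":
    item_properties = [
        ("partNum", "xs:string", False),
        ("productName", "xs:string", False),
        ("comment", "xs:string", True),
    ]
    # One design choice: partNum as an XML attribute, the rest as elements.
    item_type = complex_type("Item", item_properties, {"partNum"})
    print(ET.tostring(item_type, encoding="unicode"))
```

Changing the as_attribute set is all it takes to produce a different but equally valid schema from the same conceptual model, which is why these choices are worth recording as explicit design decisions.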
The UML model presented in this first article will be refined to reflect the design choices made by the authors of the W3C's XML Schema Primer, where this example originated. For our purposes, these authors are the stakeholders of system requirements. The third article introduces a UML profile for XML schemas that allows all detailed design choices to be added to the model definition and then used to automatically generate a complete schema.

Our end result is a UML model that is used to generate a W3C XML Schema, which can successfully validate XML document instances copied from the Schema Primer specification. Along the way, I'll introduce a web-based tool that we have developed to generate schemas from UML and reverse engineer schemas into UML.

Tips for Success

In order to help you when applying these ideas to your own e-business projects, I offer the following tips for success:

1. Your e-business vocabulary defines an agreement or contract with all related business parties. Plan its specification accordingly. Get input on requirements of all key stakeholders using the visual models of UML to improve communication.
2. Define all known terms, associations, and constraints and document their purpose, source, and usage. Do not restrict your specifications to the limited expressiveness of DTDs, or even to the expanded W3C XML Schema language. Using UML, you can capture a complete specification and then transform it to one or more XML schema languages. Documentation notes added to the UML model can be automatically transformed to annotations in the XML schema.
3. Create a common UML model that drives both the XML schema definition and other non-XML system components. Many systems use XML in a subset of their components, but the analysis must be done holistically.

References

[1] David Carlson. Modeling XML Applications with UML: Practical E-Business Applications. Boston: Addison-Wesley, 2001. This book follows a full system development life-cycle based on a product catalog application design.
[2] Martin Fowler, Kendall Scott. UML Distilled, Second Edition. Boston: Addison-Wesley, 2000.
[3] Object Management Group (OMG) UML resources, http://www.omg.org/technology/uml