MDR-based Framework for Sharing Metadata in Ubiquitous Computing Environment

Similar documents
Collaboration System using Agent based on MRA in Cloud

ISO CTS2 and Value Set Binding. Harold Solbrig Mayo Clinic

DCMI Abstract Model - DRAFT Update

Procedure for the Specification of Web Ontology

References differences between SVG 1.1 Full and SVG 1.2 Tiny

Semantic Web Knowledge Representation in the Web Context. CS 431 March 24, 2008 Carl Lagoze Cornell University

An UML-XML-RDB Model Mapping Solution for Facilitating Information Standardization and Sharing in Construction Industry

Proposed Draft Technical Report ISO/IEC PDTR

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF INFORMATION TECHNOLOGY. (An NBA Accredited Programme) ACADEMIC YEAR / EVEN SEMESTER

A GML SCHEMA MAPPING APPROACH TO OVERCOME SEMANTIC HETEROGENEITY IN GIS

ISO/IEC INTERNATIONAL STANDARD. Information technology Metamodel framework for interoperability (MFI) Part 1: Reference model

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 5: Multimedia description schemes

Lecture Telecooperation. D. Fensel Leopold-Franzens- Universität Innsbruck

ONAR: AN ONTOLOGIES-BASED SERVICE ORIENTED APPLICATION INTEGRATION FRAMEWORK

The XML Metalanguage

XML Metadata Standards and Topic Maps

Ontology Extraction from Heterogeneous Documents

11. Documents and Document Models

ISO/IEC JTC 1/SC 32 N 0722

Methodology and Technology Services

ISO/IEC Information technology Multimedia content description interface Part 7: Conformance testing

Labelling & Classification using emerging protocols

CEN/ISSS WS/eCAT. Terminology for ecatalogues and Product Description and Classification

Information technology Procedures for achieving metadata registry content consistency Part 5: Metadata mapping procedure

Semantic Web Search Model for Information Retrieval of the Semantic Data *

Building Ontology Repositories for E-Commerce Systems

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

Cross-domain Metadata Interoperability for Integrated Information Services

3. Technical and administrative metadata standards. Metadata Standards and Applications

Metadata Workshop 3 March 2006 Part 1

XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI

Information technology Metamodel framework for interoperability (MFI) Part 1: Framework

Editor s Draft. Outcome of Berlin Meeting ISO/IEC JTC 1/SC32 WG2 N1669 ISO/IEC CD :ED2

Metadata Standards and Applications. 4. Metadata Syntaxes and Containers

ISO/IEC Information technology Multimedia framework (MPEG-21) Part 3: Digital Item Identification

Chapter 13 XML: Extensible Markup Language

Information Technology Metadata registries (MDR) Part 5: Naming and identification principles

Introduction to the Semantic Web

Automatic Service Discovery and Integration using Semantic Descriptions in the Web Services Management Layer

ISO/IEC CD :200x(E) Title: Information technology - Framework for Metamodel interoperability Part 2: Reference model Project:

SDMX self-learning package No. 3 Student book. SDMX-ML Messages

Network protocols and. network systems INTRODUCTION CHAPTER

Introduction to Web Services & SOA

RDF and Digital Libraries

XML Support for Annotated Language Resources

ISO/IEC JTC 1/SC 32 N 2262

Digital Library Curriculum Development Module 4-b: Metadata Draft: 6 May 2008

Practical E&P Data Mapping using XML

Information technology - Metadata registries (MDR) - Part 5: Naming principles

ebxml Technical Architecture San Jose, CA USA Wednesday, 9 August 2000

A Bottom-up Strategy for Query Decomposition

An introduction to metadata. Metadata registries for improved data management

Web Technologies Present and Future of XML

Knowledge Representation, Ontologies, and the Semantic Web

XML-based Event Notification System for Large Scale. Distributed Virtual Environment

Information Technology Document Schema Definition Languages (DSDL) Part 1: Overview

Metadata and Encoding Standards for Digital Initiatives: An Introduction

Metadata: The Theory Behind the Practice

PAPER by Transentric. TranXML : The Common Vocabulary for Transportation Data Exchange

Enterprise Information Systems Chapter 1 - Motivation

Digitisation Standards

Agent Framework For Intelligent Data Processing

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 2: Description definition language

Alphabet Soup: Choosing Among DC, QDC, MARC, MARCXML, and MODS. Jenn Riley IU Metadata Librarian DLP Brown Bag Series February 25, 2005

Lesson 5 Web Service Interface Definition (Part II)

Enhancement of CAD model interoperability based on feature ontology

METADATA INTERCHANGE IN SERVICE BASED ARCHITECTURE

Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey.

Bridging the Gap between Semantic Web and Networked Sensors: A Position Paper

Proposed Revisions to ebxml Technical. Architecture Specification v1.04

ISO/IEC INTERNATIONAL STANDARD. Information technology Metadata registries (MDR) Part 1: Framework

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 1: Systems

METADATA AT DIGITAL-INFORMATIVE ERA

Contents. G52IWS: The Semantic Web. The Semantic Web. Semantic web elements. Semantic Web technologies. Semantic Web Services

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia framework (MPEG-21) Part 21: Media Contract Ontology

TBX in ODD: Schema-agnostic specification and documentation for TermBase exchange

Adaptable and Adaptive Web Information Systems. Lecture 1: Introduction

technical memo Physical Mark-Up Language Update abstract Christian Floerkemeier & Robin Koh

Proposed Revisions to ebxml Technical Architecture Specification v ebxml Business Process Project Team

An Approach to Resolve Data Model Heterogeneities in Multiple Data Sources

Realizing the Army Net-Centric Data Strategy (ANCDS) in a Service Oriented Architecture (SOA)

Internal Structure of Information Packages in Digital Preservation

ISO/IEC INTERNATIONAL STANDARD. Information technology JPEG 2000 image coding system Part 14: XML representation and reference

Integration of Product Ontologies for B2B Marketplaces: A Preview

Research on Metadata Schema Registry System Oriented Knowledge Service

Opus: University of Bath Online Publication Store

Lecture 17 MaRC as Metadata

Towards Ontology-based harmonization of Web content standards

ISO/IEC INTERNATIONAL STANDARD

An Architecture for Semantic Enterprise Application Integration Standards

STANDARD ST.66 DECEMBER 2007 CHANGES

From Open Data to Data- Intensive Science through CERIF

Some more XML applications and XML-related standards (XLink, XPointer, XForms)

ICD Wiki Framework for Enabling Semantic Web Service Definition and Orchestration

Semantic Web: vision and reality

Device Independent Principles for Adapted Content Delivery

RELATIONAL STORAGE FOR XML RULES

ISAN: the Global ID for AV Content

Semantic Technologies and CDISC Standards. Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent

A Dublin Core Application Profile in the Agricultural Domain

Transcription:

MDR-based Framework for Sharing Metadata in Ubiquitous Computing Environment O-Hoon Choi 1, Jung-Eun Lim 1, and Doo-Kwon Baik 1 1 Department of Computer Science and Engineering, KOREA University, SungBok Gu, Anam Dong, 135-701, Korea {pens,jelim,baik}@software.korea.ac.kr Abstract. : To resolve the syntax, structure and semantic heterogeneity for sharing information resources, the representative technologies are XML and Metadata. XML is used to represent the syntax and structure of information resources but the various XML schema definitions that have been developed by independent organizations without any standards or guidelines, make it difficult to share the semantic meaning of XML encoded information resources. In this paper, we propose a mechanism, named MSDL that represents the exact meaning of XML tags by describing the structural and semantic differences with standard metadata in metadata registries. MSDL overcomes the limitations of other approaches with respect to exactness, flexibility and standardization, and provides an environment for business partners using different metadata to share their XML encoded information resources. 1 Introduction XML is a standardized text format designed specifically for transmitting structured data to web applications. With XML, application designers can create sets of data element tags and structures that define and describe the information contained in a document, database, object, catalog or general application, all in the name of facilitating data interchange [1]. XML documents, the unit of data exchange, are based on schemas (XML schema definitions). XML schema is a specification defining the structure of tags and plays a role of metadata on XML documents. Metadata is a data about data. And it is essential to information sharing. It describes attribute, property, value type and context on data. One important point is that XML documents based on same schema definition have same structure and tags. Therefore, it is possible to share data between applications if they use standardized schema definition. Such standardized schema definitions include CML DTD [2] used in chemistry domain, Math DTD [3] in mathematics, ONIX [4] in digital library domain, and so on. So, common schema definition makes it possible to share tag elements and attributes defined in schema definition, and to understand semantics of the XML documents.

However, in many cases, each system (organization) has developed their own schema definition and use XML documents based on the schema. There is no interoperability with other systems without manual converting processes. For example, if we represent author of a book named Database Design, Dublin Core [5] which is a metadata standard for describing general properties of information resources, uses a term creator, but ONIX [4] which is a metadata standard for digital contents, uses a term author In this paper, we first categorize heterogeneity between XML documents syntactic, structural and semantic heterogeneity, and second design a framework (FSMI, Framework for Sharable Metadata Interoperability) to increase the interoperability of XML encoded information resources using different metadata sets. In this framework we propose a mechanism, named MSDL (Metadata Semantic Description Language) that represents the exact meaning of XML tags by describing the structural and semantic differences with standard metadata in metadata registries. With MSDL, users can describe the syntactic, structural and semantic difference of their XML schema to other s schema, and based on this description, can convert other system s XML document to a document with their own schema. 2. Related Works In this section, we introduce some other approaches that enable the interoperability of XML documents having different schema definitions. The most straightforward approach for XML document interoperability is 1:1 mapping. In 1:1 mapping, each system tries to map their XML schema to other s schema. Microsoft s BizTalk [6] is the representative software. Users of BizTalk create a mapping table for each other application, and interchange their XML documents. After receiving, it converts the documents according to receiver s understandable format. Maintaining integrated global schema is another method. EAI solutions analyze each application s data and processes, create global mapping tables and rules, and enable each system to exchange XML encoded data. Also, many organizations develop standard metadata (XML schema) and member of the domain use it as a data interchange format. The representative examples are ebxml [7] and Rosettanet [8] in electric commerce domain, and ONIX in digital contents (e-book) domain. This approach is very simple and powerful when all members use it as their data standard. These three approaches have advantages and disadvantages in exactness, flexibility, initial cost and maintenance cost. In the last section, we will compare our proposed approach to them. 3. Metadata Model We use data element model of ISO 11179 as a standard metadata model for comparison. Data element is a basic unit of identification, description and value representation unit of a data.

Definition 1 (Data Element) Data Element is a basic unit for definition, identification and representation of data. It consists of data element concept and permissible value domain. DE = {(dec, r) dec DEC, r R} DEC = {(oc, p) oc OC, p P} The combination of an object class and a property is a data element concept (DEC). A DEC is a concept that can be represented in the form of a data element, described independently of any particular representation. In the examples above, annual household income actually names a DEC, which has two possible representations associated with it. Therefore, a data element can also be seen to be composed of two parts: a data element concept and a representation. Data element is composed of three parts as Figure 1 [9]. (1:N) Object Class Car Person Book Employee Etc Property Color (1:1) Model (1:1) Age Title pages Etc Representation Char 10 bytes {"A", "B", "C"} Integer (0:200) Integer $ Etc Fig.1. Structure of Data Element Object class (OC) is a set of ideas, abstractions, or things in the real world that can be identified with explicit boundaries and meaning and whose properties and behavior follow the same rules. Examples of object classes are cars, persons, books, employees, orders, etc. Property (P) is what humans use to distinguish or describe objects. Examples of properties are color, model, sex, age, income, address, price, etc. The most important aspect of the representation (R) part of a data element is the value domain. A value domain is a set of permissible (or valid) values for a data element. For example, the data element representing annual household income may have the set of nonnegative integers (with unit of dollars) as a set of valid values. Figure 2 shows some examples related on Definition 1. OC={person, car, employee, book, product,...} P={name, age, title, cost,...} R={char 30byte,{blue, red, green},{0..100},...} DEC={personname, carname, booktitle, productcost,...} DE={(personname, char 30byte), (carname, char 30byte), (productcost,dollar),...} Fig.2. Examples on Data Element A metadata registry (MDR) supports registration, authentication and service of such data elements. Each data elements have a unique identifier, all users and programs

using the metadata registry identify a data element by the identifier. A data element also contains it definition, name, context, representation format, and so on. 4. MSDL 4.1. Information Heterogeneity Table 1 shows the type of heterogeneity between standard metadata based on ISO 11179 MDR and XML schema definitions. Semantic heterogeneity means that the two xml tags has same meaning but different names. Structural heterogeneity means that two sets of xml tag have same meaning but different structures including composition, decomposition and rearrangement. Syntactic heterogeneity means the difference of data values in XML tags. Conflicts in this heterogeneity include measurement unit, data type and data code sets. Table 1. Heterogeneity of XML Documents Description and Examples Semantic Heterogeneity Structural Heterogeneity Syntactic Heterogeneity Different terminologies have same semantics in context. ex) title book name, price - cost Structurally different with same semantics ex1) name - first name, last name ex2) first name, last name name ex3) last name - first name, last name Semantically, structurally same, but have different data type, measurement unit and code sets ex1) {kr, jp, cn} - {korea, japan, china} ex2) mile - kilometer ex3) integer - float 4.2. MSDL The key idea of this paper is that users of each system describe the difference between their XML schema and standard metadata registered in metadata registry. Then they interchange their XML documents with other system as resolving the difference of other s document using the descriptions. We propose MSDL (Metadata Semantic Description Language) as a tool for describing the difference between local XML schema and standard metadata in MDR. MSDL consists of namespace on the location of MDR and local schema, correspondent information (MAP) and transformation rules for converting local document to/from standard document.

Definition 2 (MSDL) A language for describing the difference between local schema and standard metadata. It uses XML syntax. MSDL= (mdrnamespace, xmlnamespace, MAP, CODESET) Where, mdrnamespace and localnamespace must appear once in a MSDL document, but MAP appears as much as the number of tags to be converted. Definition 2.1 (MAP) MAP defines semantic, structural and syntactic conflicts between tags in XML documents and metadata in MDR. MAP={(mdrE, locale, maptype) mdre DE, locale XS, maptype MapT} Where, mdre is a subset of standard metadata in MDR, and has one or more elements. locale is a subset of metadata defined in local XML schema, and also has one or more elements. Definition 2.2 (Mapping Type) MapT defines structural conflicts and syntactic conflicts between metadata with same semantics. MapT={(sct, vct sct SCT, vct VCT) SCT={substitution, composition, decomposition, rearrangement} VCT={(CodeSetmdr, CodeSetlocal), (MUnit, cvalue),(typecasting, cfunction)} Where, number(codesetmdr) = number(codesetlocal), Munit = measurement unit (m, km, kg, etc.), Cvalue = constant for measurement unit change, cfunction = name of system support function for type casting SCT (Structural Change Type) has one of the four elements substitution, composition, decomposition, and rearrangement. Where, substitution is for resolving semantic conflict and other three elements are for structural conflict. For example, book title and book name have same meaning but different name, in this case, substitution is applied. On the other hand, a name tag may correspond to two tags first name and last name In this case, composition or decomposition is applied. VCT (Value Change Type) has one of the three elements code set conflict, measurement unit conflict and data type conflict. In code set conflict, we consider only the case of two code sets having same number of elements. A MSDL document must have one or more mapping type to describe the difference between standard metadata and local schema. Figure 3 shows an example of MDSL document mapping local element Book price to standard metadata element Price In this example, two elements have semantic conflict and syntactic (measurement unit) conflict. <map> <mdrelement> <DataelementName> <element>price</element>

</DataelementName> <elementid>de024041</elementid> <datatype> <noncodesetdatatype>string</noncodesetdatatype> </datatype> <measurementunitid>mu000000</measurementunitid> </mdrelement> <localelement> <elementname>price</elementname> <elementpath>/book/price</elementpath> <datatype> <noncodesetdatatype>string</noncodesetdatatype> </datatype> <measurementunitid>mu000001</measurementunitid> </localelement> <mappingrule> <mappingtype>substitution</mappingtype> </mappingrule> </map> Fig.3. An Example of MSDL Document 4.3. XML Conversion Process using MSDL MSDL just specifies information about conflicts of local schema and standard metadata. Thus, it is necessary to develop a system which transform imported XML document to self-understandable one using MSDL. We call this system as FSMI (Framework for Sharable Metadata Interoperability). FSMI is a system for transforming XML encoded source data to fit other system s XML document using each system s predefined MDSL documents. Figure 4 shows the transforming process. S o u rc e M essag e S c h em a ( X M L ) Targ et M essag e S c h em a ( XM L ) M essage Converter S ou rc e M e ssage ( X M L ) M e ssage C onverter Step1 S ta nd ard M essag e M es sag e C onverter S tep2 Target M essag e ( X M L ) M S D L b etw een S ou rc e & sta nd ard M S D L betw een S tandard & T arg et S tep 1 S tep 2 Fig.4. XML document transform process XML document converting process consists of two steps. The first step is to generate a standard document with standard data element in MDR from source document, local

XML schema and MSDL of the schema. Then, in the second step, standard document is converted to target document using target document s XML schema and MSDL. In the step 1 and step 2, FSMI applies mapping rules in MSDL to resolve structural and syntactical heterogeneities (conflicts). Generally, these two types of heterogeneity co-exist in a document, it is necessary to apply mapping rules with combined manner. We will apply two steps mapping rule to solve discordance in document translation processing. In the first step, we translate a source document to a standard document that is structured by standard data elements using source document, its XML schema, and it s MSDL. In the second step, we translate a standard document to a target document that is used by other system using target document s XML schema and it s MSDL. In each step, we apply translation rules that are described in MSDL. It is need to apply lots of mapping rules to solve discordance of representation and structure. With two steps document translation process, we get a compound mapping rules. Figure 5 shows an example of translating a source document to standard document using MSDL. Fig 5. Translation using MSDL 5. Comparison 1:1 mapping approach is relatively exact because experts of each system directly map to other system s schema. But, the mapping cost increases exponentially as the number of related systems. Especially, in case of schema change, the maintenance cost will be high. Global schema can cover large parts of each system s characteristics of schema through analysis process, and the maintaining cost is very low in case of

schema change because systems only have to change related parts of global schema. But the initial building cost of global schema may be high. Table 2. Comparison to Other Approaches. 1:1 Mapping Global Schema Standard Schema MSDL Conversion accuracy Very High High Middle High Initial cost Low Middle Very high High Maintenance cost Very High Middle Low Low Extensibility Low Middle Middle High Global schema Application Domain Not exist Not limited Exist (Bottom-up) Enterprise Integration Exist (Top-down) Specific Domain Exist (Hybrid) Inter Domain If all systems use standardized XML schema, then there are no syntactic, structural and semantic conflicts. It is very ideal status on data sharing and integration. But It is impossible to make a standard system (schema) that support all the case of system, and necessary to consider another processes for data that standard does not cover. Also, a domain may have more than one metadata standards. For example, in digital content domain have metadata standard including ONIX, MARC, DC, and MPEG7 and in electric commerce domain, they have standard including cxml, ebxml, Rosettanet, eco, and so on. Table 2 shows the characteristics of each approaches including MSDL. With MSDL, conversion accuracy is relatively high, because all the needed part of schema for sharing can be covered in MDR using registering process. Figure 6 shows cost evaluation graph. The cost evaluation consists of maintenance cost, center complexity, user complexity and construction cost. It shows FSMI is suitable for maintain and construct than other methods. And FSMI based on MDR is a hybrid structure system. It is easy to apply other domains system and framework. [10,11,12] Fig 6. Cost Graph of Complexity Most of all, maintenance cost in case of changing schema, is very low and extensibility in case of adding new metadata is relatively high. However, the building cost of

MDR is very high and time consumed, and there may be some data losses in two steps transforming process. 6. Conclusion and Future Works XML is a de facto standard for representing information resources in distributed environment. But it is necessary to share metadata of XML document defining syntax, structure and semantic of the document. In this paper, we propose MSDL as a tool for describing difference between standard metadata and local XML schemas to support interoperability of XML documents. With MSDL, users can describe the syntactic, structural and semantic difference of their XML schema to other s schema, and based on this description, can convert other system s XML document to a document with their own schema. Users, who wish to exchange their document, just describe the difference of their XML schema with standard metadata in MDR, and make a MSDL document on the schema, and then other system can convert the document to understandable ones. MSDL overcomes limitations of 1:1 mapping, global schema and standard schema approaches, providing low maintenance cost on schema changes, high extensibility in increase of systems, and domain independent document sharing. In the future, we have plans to develop the converting system using MSDL, and extend it to support legacy solutions including EAI, B2Bi, Web Service, and so on. References 1. Rik Drummond, Kay Spearman, XML Set to Change the Face of E-Commerce, Network Computing, V.9, N.8, pp.140-144, May 1998. 2. Chemical Markup Language, Version 1.0, Jan.1997. available at http://www.venus.co. uk/omf/cml/. 3. Mathematical Markup Language (MathML) 1.0 Specification, W3C, July 1998. available at http://www.w3c.org /TR/1998/REC-MathML-19980407 4. ONline Information exchange, http://editeur.org/ 5. http://dublincore.org/ 6. http://www.microsoft.com/biztalk/ 7. http://www.ebxml.org 8. http://www.rosettanet.org/rosettanet/ 9. "Information Technology - Metadata Registries", ISO/IEC 11179 Part 1-6, ISO, 2003. 10. N. Guarino, C. Masolo and G. Vetere, Ontoseek: Content-based Access to the Web, IEEE Intelligent Systems, Vol. 14, No. 3, pp. 70-80, May/June 1999 11. B. Amann and I. Fundulaki, Integrating Ontologies and Thesauri to Build RDF Schemas In ECDL-99: Research and Advanced Technologies for Digital Libraries, Lecture Notes in Computer Science, pages 234--253, Paris, France, September 1999. Springer-Verlag. 12. Zhan Cui, Dean Jones and Paul O'brien,"Semantic B2B Integration: Issues in Ontologybased Approaches", SIGMOD Record, Vol. 31, No. 1, March 2002.