XFDU packaging contribution to an implementation of the OAIS reference model

Similar documents
XML information Packaging Standards for Archives

STANDARD ARCHIVE FORMAT FOR EUROPE

OAIS: What is it and Where is it Going?

CCSDS STANDARDS A Reference Model for an Open Archival Information System (OAIS)

Use of XML Schema and XML Query for ENVISAT product data handling

The OAIS Reference Model: current implementations

Trusted Digital Repositories. A systems approach to determining trustworthiness using DRAMBORA

IRVLA The Irish Virtual Research Library and Archive project.

DAITSS Demo Virtual Machine Quick Start Guide

An overview of the OAIS and Representation Information

XML FORMATTED DATA UNIT (XFDU) STRUCTURE AND CONSTRUCTION RULES

Session Two: OAIS Model & Digital Curation Lifecycle Model

Using XFDU for CASPAR information packaging Matthew Dunckley Science & Technology Facilities Council, Oxford, UK

Earth Observation Payload Data Ground Systems Infrastructure Evolution LTDP SAFE. EO Collection and EO Product metadata separation Trade-Off

Persistent identifiers, long-term access and the DiVA preservation strategy

Copyright 2008, Paul Conway.

INSTALLATION MANUAL CNES CCSDS PAIS PROTOTYPE INSTALLATION AND USER MANUAL

Implementing OAIS Reference Model Concepts: Information Packages and Producer-Archive Interface Standards

Chapter 5: The DAITSS Archiving Process

XML Structure and Construction Rules Concept Paper. 31Aug 2003

Digital Preservation DMFUG 2017

Assessment of product against OAIS compliance requirements

FDA Affiliate s Guide to the FDA User Interface

1 st Int. Workshop on Standards and Technologies in Multimedia Archives and Records (STAR), Lausanne, /27

D WSMO Data Grounding Component

Its All About The Metadata

Comparing Open Source Digital Library Software

Assessment of product against OAIS compliance requirements

(1) CNES 18 av E. Belin, Toulouse Cedex 9, France. ABSTRACT

User Stories : Digital Archiving of UNHCR EDRMS Content. Prepared for UNHCR Open Preservation Foundation, May 2017 Version 0.5

Lessons from the implementation

The e-depot in practice. Barbara Sierman Digital Preservation Officer Madrid,

Draft Digital Preservation Policy for IGNCA. Dr. Aditya Tripathi Banaras Hindu University Varanasi

Florida Digital Archive (FDA) SIP Specification

- What we actually mean by documents (the FRBR hierarchy) - What are the components of documents

Digital Preservation with Special Reference to the Open Archival Information System (OAIS) Reference Model: An Overview

Preservation Standards (& Specifications) (&& Best Practices)

An Integrated Framework to Enhance the Web Content Mining and Knowledge Discovery

LiXuid Manuscript. Sean MacRae, Business Systems Analyst

Metadata. for the Long Term Preservation of Electronic Publications. Catherine Lupovici Julien Masanès. Bibliothèque nationale de France

ISO ARCHIVE STANDARDS: STATUS REPORT

What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011

University of British Columbia Library. Persistent Digital Collections Implementation Plan. Final project report Summary version

SMART CONNECTOR TECHNOLOGY FOR FEDERATED SEARCH

Susan Thomas, Project Manager. An overview of the project. Wellcome Library, 10 October

Transfers and Preservation of E-archives at the National Archives of Sweden

Archival Information Package (AIP) E-ARK AIP version 1.0

DIGITAL ARCHIVES & PRESERVATION SYSTEMS

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

Building for the Future

TABLE OF CONTENTS. ICANN Pre- Delegation Testing System. A User s Guide. Version

Draft Report Concerning Space Data System Standards PRODUCER-ARCHIVE INTERFACE SPECIFICATION (PAIS) INTEROPERABILITY TESTING REPORT

The VITO Earth Observation LTDA Facility

Digital Preservation Standards Using ISO for assessment

1 There is no single, authorized definition of digital preservation. 2 Definition is from the Framework for Applying OAIS to

The Community Data Portal and the WMO WIS

Protection of the National Cultural Heritage in Austria

Response to the CCSDS s DAI Working Group s call for corrections to the OAIS Draft for Public Examination

Common Specification (FGS-PUBL) for deposit of single electronic publications to the National Library of Sweden - Kungl.

The Value of Metadata

INTRO INTO WORKING WITH MINT

MuseKnowledge Hybrid Search

Linking datasets with user commentary, annotations and publications: the CHARMe project

ACDH AUSTRIAN CENTRE FOR DIGITAL HUMANITIES

Conducting a Self-Assessment of a Long-Term Archive for Interdisciplinary Scientific Data as a Trustworthy Digital Repository

A Repository of Metadata Crosswalks. Jean Godby, Devon Smith, Eric Childress, Jeffrey A. Young OCLC Online Computer Library Center Office of Research

Repository Interoperability and Preservation: The Hub and Spoke Framework

EMC InfoArchive SharePoint Connector

The CASPAR Finding Aids

XML-based production of Eurostat publications

Software Requirements Specification for the Names project prototype

Different Aspects of Digital Preservation

Establishing Trust in Data Curation: OAIS and TRAC applied to a Data Staging Repository (DataStaR)

A Collaboration Model between Archival Systems to Enhance the Reliability of Preservation by an Enclose-and-Deposit Method

US NODC Archival Management Practices and the OAIS Reference Model Donald W. Collins

A Model for Managing Digital Pictures of the National Archives of Iran Based on the Open Archival Information System Reference Model

Geospatial Records and Your Archives

UC Office of the President ipres 2009: the Sixth International Conference on Preservation of Digital Objects

Long-term preservation for INSPIRE: a metadata framework and geo-portal implementation

XML. Jonathan Geisler. April 18, 2008

SDMX self-learning package XML based technologies used in SDMX-IT TEST

Selecting the Right Method

Mapping from Flat or Hierarchical Metadata Schemas to a Semantic Web Ontology. Justyna Walkowska, Marcin Werla

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

Using METS, PREMIS and MODS for Archiving ejournals

RavenDB & document stores

OAI (Open Archives Initiative) Suite Version 3.0. Introductory Guide for New Users

Wendy Thomas Minnesota Population Center NADDI 2014

Nuno Freire National Library of Portugal Lisbon, Portugal

BHL-EUROPE: Biodiversity Heritage Library for Europe. Jana Hoffmann, Henning Scholz

SERAD CNES Service for Data Referencing and Archiving

adore: a modular, standards-based Digital Object Repository

WEB-BASED COLLECTION MANAGEMENT FOR ARCHIVES

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

NEDLIB LB5648 Mapping Functionality of Off-line Archiving and Provision Systems to OAIS

S-100 schemas and other files

Version 5.7. Published: November 5th, Copyright 2018 Prologic. All rights reserved.

DuraArK Durable Architectural Knowledge

PREMIS Implementations at the British Library & PREMIS and the Planets Project. Angela Dappert The British Library PREMIS roundtable, February 2009

Earth Observation Payload Data Ground Systems Infrastructure Evolution LTDP SAFE. SAFE Software System Specification

Transcription:

XFDU packaging contribution to an implementation of the OAIS reference model Arnaud Lucas, Centre National d Etudes Spatiales 18, avenue Edouard Belin 31401 Toulouse Cedex 9 FRANCE Arnaud.lucas@cnes.fr Abstract. The ISO Reference Model for an Open Archival Information (ISO 14721) defines various concepts for archiving digital data, in particular models of information packages to be stored in an Archive (Archival Information Package or AIP), or to be exchanged between an information Producer and the Archive (Submission information Package or SIP), or to be distributed to Archive consumers (Dissemination Information Package or DIP), as indicated in the following OAIS functional model figure. Descriptive Info Data Management Descriptive Info 4-1.1 P R O D U C E R SIP Ingest AIP Archival Storage Administration AIP Access queries result sets orders DIP C O N S U M E R MANAGEMENT The XFDU (XML Formatted Data Unit) is an emerging CCSDS recommendation to package data and metadata, including software into a single package to facilitate information transfer and archiving. The purpose of this paper is to demonstrate what the XFDU packaging brings to possible implementations of the OAIS SIP, AIP and DIP models.

2 Arnaud Lucas, Centre National d Etudes Spatiales XFDU description Fig. 1. XFDU logical view An XFDU package is made of a physical container such as a ZIP file that contains a manifest file (an XML file). This manifest contains all the valuable information about the data inside the container and/or outside the container. At the higher level, the manifest is split into in different parts: The Information Package Map contains the logical view of the package. It s a hierarchical xml tree representing the content of the package. Each leaf of the tree is a Content Unit and can be referred to from the other parts of the package (i.e. information is linked by the Content Unit reference ID). The Data Object Section contains all the physical information needed to get the file objects out. The Metadata Section records all of the metadata for all items in the package. The Behavior Section associates executable code with the content of the package. The Content Unit provides the primary view into the package as it refers to each of the data objects and it associates appropriate metadata with each data object. The Content Unit reference to the metadata is via one or more metadata Category pointers. For each such pointer, there is a set of metadata classes that may be chosen to further classify the metadata object. The actual Metadata Object may be included in the manifest file or referenced by URI. A Content Unit may also contain other Content Units reference external XFDUs.

XFDU packaging contribution to an implementation of the OAIS reference model 3 Fig. 2. file logical view An XML Formatted Data Unit (XFDU) is the complete contents as specified by the Information Package Map (i.e., the. highest level Content Unit) component of the XML. This includes the XML document, files contained in the XML, files referenced in the XFDU including those contained within the XFDU Package, and resources (i.e., files and XFDU Packages) external to the XFDU Package. The XFDU is a logical entity and may never exist as a physical entity. This structure allows using the XFDU packages in a lot of data transfer processes such as described in the OAIS reference model. XFDU as SIP (Submission information Package) First example : Linking data & metadata In the case of packages being ingested by an archive center, XFDUs can provide obviously the link between data and metadata (embedded or available online). If a syntactic description metadata is provided, an automatic check of the packaged data can be performed. This means that the data file will be checked against an xml schema provided by the metadata layer.

4 Arnaud Lucas, Centre National d Etudes Spatiales For this example, let s take an XFDU containing a XML data file (data.xml) and its associated metadata as a word document (data.doc). This data file is described by a w3c schema file located by a given URL (http://www.ccsds.org/data.xsd). When the archive receives this package, the informationpackagemap is read and the content unit data (the file data.xml) is given back with its associated metadata (DOC pointer) after being checked using the XSD pointer. A logical view of the XFDU package should be : InformationPackageMap ContentUnit id = data MetadataObjectPointer = XSD DOC ObjectPointer = dataptr dataobjectsection dataobject id=dataptr FLocat = data.xml metadatasection metadataobject id=doc FLocat = data.doc metadata Object id=xsd FLocat = http://www.ccsds.org/data.xsd Web Data.doc Data.xml XFDU Package Fig. 3. file linking data & metadata 2 nd example : Dealing with large files Managing large file is not a challenge for the XFDU, because a multi-part file access is coded into the Data Object Section. In this example, a single ContentUnit named data is split into three different parts at the DataObject level (using the FLocat identifier). When receiving this package, all the parts are automatically concatenated.

XFDU packaging contribution to an implementation of the OAIS reference model 5 InformationPackageMap ContentUnit id = data ObjectPointer = dataptr dataobjectsection dataobject id=dataptr FLocat = data1.xml FLocat = data2.xml FLocat = data3.xml concat Fig. 4. Multipart files 3 rd example : ordering content units Dealing with multiple-part files highlights XFDU's order attribute, which allows conditional processing. For example, it's possible to make a logical and hierarchical view of the incoming packages, check up the integrity of the whole deposit and launch a specific action when receiving the package. In this example, the archive receives a book with an introduction and two chapters : InformationPackageMap ContentUnit id = thebook ContentUnit id = Chapter2 ObjectPointer = chap2ptr Order=3 ContentUnit id = Introduction ObjectPointer = introptr Order=1 ContentUnit id = Chapter1 ObjectPointer = chap1ptr Order=2 Fig. 5. Order and hierarchical view

6 Arnaud Lucas, Centre National d Etudes Spatiales 4 th example : Specializing information The XFDU manifest file is a xml file driven by a W3C schema. This schema is extensible, this means that you can add your own information in the ContentUnit element. For example if you want to have a date and a comment in the manifest file, you just have to extend the XFDU schema to add these elements into a new MyContentUnit. Since MyContentUnit is derived from a ContentUnit, we have a valid manifest file. This one should look like this : InformationPackageMap MyContentUnit id = Data MyComment= a modified Content Unit ObjectPointer = dataptr MyDate=2005/12/25 Fig. 6. Specialization of a ContentUnit XFDU as AIP (Archival Information Package) 5 th example : automatic generation of catalogues The manifest file is a xml file. It can be processed by an XSLT stylesheet in order to produce administrative information to help managing the archive. For example, the information inside the XFDU manifest file can be processed in order to produce a catalogue in the format of the archive management system.

XFDU packaging contribution to an implementation of the OAIS reference model 7 ContentUnit id=data1 ContentUnit id=data2 XSLT processing Catalog Entry Entry Name=data1 Name=data2 Fig. 7. Automatic transformation of the XFDU content 6 th example : Optimizing an archive Duplicated metadata can be simply grouped and accessed globally from the whole system using the XFDU external links mechanism as seen in the first example. In this example, different contentunits in multiple XFDU packages share the same metadata.... ContentUnit id=dataa ContentUnit id=datab Metadata 1 ContentUnit id=xxx ContentUnit id=yyy Metadata 2 Fig. 8. Archive optimization with metadata share

8 Arnaud Lucas, Centre National d Etudes Spatiales 7 th example : Specialization for multi-mission archive. In a multi-mission archive, XFDUs can be specialized for each mission and stay compatible at a higher level. The schema validating the manifest file can be modified as seen in example 4. This can be a great saving, using the same software to access all the data. ESA has a specialized XFDU format named SAFE. The SAFE packages have been specialized in order to manage multi-mission data (from ENVISAT instruments). XFDU as DIP (Dissemination Information Package) In this last part of my paper, the aim is to deliver data to the final user in a friendly way. In fact, we can take advantage of all the preceding examples and gather all the different XFDU techniques already shown. To deliver data to the final consumer using the power of xml would be very easy (data transformation to the requested data format using XSLT). As for the SIP, dealing with large files is easy with XFDUs (automatic split can be performed). Once again, the hierarchical architecture allows the packaging of several requested products into one logical package. The metadata part brings information about the processing of the extracted data and even points to extra valuable services that could be of interest for the requester, like for example a time based extraction. Conclusion This paper has shown some possible use of the XFDU packages. Sure, there are still a lot to discover. The technology behind the XFDU is universal (XML and schemas) and the design of this CCSDS recommendation is made to cover a very large panel of needs, originally for the archive community but not especially. References CCSDS : http://www.ccsds.org XFDU development site : http://sindbad.gsfc.nasa.gov/xfdu