Preservation DataStores: An architecture for Preservation Aware Storage


1 Preservation DataStores: An architecture for Preservation Aware Storage
Dalit Naor, IBM Research, Haifa
Simona Cohen, Michael Factor, Dalit Naor, Leeat Ramati, Petra Reshef, Shahar Ronen, Julian Satran
IBM Labs in Haifa

2 Outline of Talk
- Long term digital preservation
  - Motivation
  - Approaches and standards
- Why is it a storage problem
  - Storage characteristics
  - Why is preservation aware storage needed?
- Preservation DataStores
  - Proposed architecture
  - Migration and media formats
- Summary

3 The challenge of data preservation
- This document was created about 2000 years ago. One can read the letters with one's bare eyes. Dead Sea Scroll, ~70 AD. Media: copper. Language: Hebrew.
- This information was created this year. Will the media last for 20 years, let alone 2000? Will it be possible to access, interpret and present the data in 20 years? 50? 100?

4 Digital Preservation: Bit vs. Logical Preservation
- Challenge: preserve large amounts of heterogeneous data for long periods of time (tens if not hundreds of years)
- Preserve information and not only bits
  - Preservation of information implies continuing understandability and usability
  - Preservation of information is hard and requires vigilance: technologies and users (the Designated Community) change
- Storage considerations:
  - Bit preservation: maintaining physical readability, i.e. being able to get the bits off a physical medium. For example, who today can read a 5¼-inch floppy disk?
  - Logical preservation: maintaining logical readability. Assuming one can get the bits, how does one read, understand and interpret them?
    - Store with the data the metadata (representation information) that describes how to interpret the data
    - Ensure the metadata can be interpreted in tens or hundreds of years: a recursive problem

5 Industries Need
- Finance: Rule 17a-4 requires broker-dealers to retain account record information for six years. The six-year period begins either at the time the account is closed or when the information is replaced or updated.
- Aerospace: aircraft design records have to be retained for the lifetime of each aircraft (30+ years).
- Life insurance: policies have to be kept for the life of the policy plus 6-10 years.
- Petroleum: oil-field data is used over the life of the field (50+ years).
- Healthcare: medical records should be preserved for the life of the individual and beyond. X-rays are often stored for periods of 75 years.
- Pharma: needs offline electronic data storage for 50 to 100 years or longer.
- OSHA requires employers to keep medical and other records of employees who are exposed to toxic substances and harmful agents. Employers must maintain these records for 30 years.
- The retention requirement for the [medical] records of minors varied from 20 to 43 years of age.
- Scientific and cultural: satellite data is kept forever. We would like to keep library and art data forever.

6 Example Data Volumes: Earth Observation (EO) Data
[Figure: Development of ESA's EO Historical Data Archive. Total archive in terabytes (TB) per year. This does not consider multiple copies and high-level products.]
Missions in the figure legend:
- AQUA Modis (April 03-today)
- ENVISAT LR (March 02-today)
- ENVISAT HR (March 02-today)
- TERRA Modis (June 01-today)
- QUICK SCATT (01-today)
- /PROBA (May 02-today)
- LANDSAT 7 ETM (April 99-Dec 03)
- SEA STAR SeaWifs (Apr 98-today)
- ERS 2 HR (May 95-today)
- ERS 2 LBR (May 95-today)
- JERS SAR/OPS VNIR (92-Sep 98)
- ERS 1 HR (Jul 91-Mar 00)
- ERS 1 LBR (Jul 91-Mar 00)
- SPOT 1-4 HRV (87-today)
- MOS 1, 1b MESSR (87-Oct 93)
- NOAA 9-17 AVHRR (86-today)
- LANDSAT 5 TM (April 84-today)
- NIMBUS 7 (Nov 78-May 86), SEASAT (Jun-Oct 78)
- LANDSAT 2-4 MSS (75-Dec 93)
At present (2006) the total volume of archived data and products (from ESA acquired missions) is ~3 PBytes and increases by >> 500 TB/y.
Taken from European Space Agency (ESA), presentation by Luigi Fusco

7 Characteristics of Preservation Data
- Access pattern: content (raw) data is read only. Most content data is cold, i.e. rarely accessed during its lifetime. New metadata is added later on, e.g. new RepInfo.
- Heterogeneity: type, size and value; some data is more important than other data.
- Quantity: too large for on-line storage, so kept offline.
- Quality: lossless format if the data needs to be transformed over time.

8 OAIS: Open Archival Information System Reference Model (ISO 14721:2002)
- Information Model: recursive Representation Information (RepInfo)
- Functional Model

9 RepInfo Example
[Figure: a RepInfo network. Node labels: FITS FILE, FITS STANDARD, PDF STANDARD, PDF s/w, FITS JAVA s/w, JAVA VM, FITS DICTIONARY, DICTIONARY SPECIFICATION, XML SPECIFICATION, UNICODE SPECIFICATION.]
Taken from CASPAR presentation by David Giaretta

10 OAIS AIP Logical Structure
- Content Data Object: the raw data that is the focus of the preservation.
- Representation Information: the information required to interpret the raw data for its designated community.
- Reference: globally unique and persistent identifiers for the content information.
- Provenance: the history and the origin of the content information, any changes that may have taken place since it was originated, and who has had custody of it since it was originated.
- Context: documents the reason for creation of the content information and its relationship to its environment.
- Fixity: a demonstration that the particular content information has not been altered in an undocumented manner.
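The six parts of the AIP listed above could be modelled roughly as follows. This is only an illustrative sketch: the class and field names (`AIP`, `ProvenanceEvent`, `representation_info`, and so on) are assumptions made for this example and are not taken from OAIS or from PDS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceEvent:
    """One entry in the history of the content information (hypothetical shape)."""
    timestamp: str      # e.g. an ISO 8601 string
    description: str    # what happened, or who took custody


@dataclass
class AIP:
    """Hypothetical in-memory view of an OAIS Archival Information Package."""
    content_data_object: bytes          # the raw data being preserved
    representation_info: List[str]      # RepInfo items, embedded or as links
    reference: str                      # globally unique, persistent identifier
    provenance: List[ProvenanceEvent] = field(default_factory=list)
    context: List[str] = field(default_factory=list)   # links to related objects
    fixity: str = ""                    # e.g. a hex digest of the content


aip = AIP(
    content_data_object=b"raw bits",
    representation_info=["FITS STANDARD"],
    reference="aip-0001",
)
aip.provenance.append(ProvenanceEvent("2007-01-01T00:00:00Z", "ingested"))
```

Keeping all six parts in one object mirrors the idea that the metadata travels with the raw data rather than living in a separate system.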

11 Preservation Processes Characteristics
- Packaging: package metadata together with the data; referential data points to either internal or external objects.
- External Data: need referral to data residing outside the system. External data may evolve over time.
- Migration and Transformation: supporting obsolescence of formats, hardware or software.
- Ingest - Digest/Access: the system ingesting data may be different from the system accessing data.
- Loss / Corruption: data is likely to be lost or corrupted, especially during migration and transformation.

12 Preservation Aware Storage: A New Storage Paradigm
Functionality
- Physically co-locate the Information Object (AIP)
- Execute data intensive functions at the storage component: fixity computations and validation, data transformation
- Handle provenance events internally
- Support the loading and execution of external transformations
Rationale
- Ensure metadata is never lost when raw data survives
- Utilize the data locality property
- E.g. migration and copy occur at the storage
- Ideally performed during bit-migration, close to the data

13 Preservation Aware Storage: A New Storage Paradigm (Cont.)
Functionality
- Maintain referential integrity: update links during migration
- Ensure readability of the data by a different system in the future: support global self-described formats
- Support media migration
- Support a graceful loss of data
Rationale
- Ideally done during migration
- Interaction with backend storage
- Interaction with backend storage

14 Migration: Short term to long term archive via media migration
[Diagram labels: AIP; Archiving System (Near Line and Off Line); Archiving System (Off Line Phase 2); Archiving System (Off Line Phase 3); Media Migration between systems.]
- Encapsulation of data and metadata is done within the tape/disk
- Migration is simple: just move the tape to the new system
- If a tape is damaged or lost, the effect is contained: the information on the other tapes is still valid!

15 Migration: System Migration
Constant migrations into the future
[Diagram labels: Generate AIPs; Archiving System (Near Line); Archiving System (Off Line Phase 1); Archiving System (Off Line Phase 2); Archiving System (Off Line Phase 3); Media Migration; System Migration.]
- The encapsulation of data and metadata is done by the system
- Migration is more time-consuming and complex because the whole system is involved
- Some damaged or lost media may make other data useless

16 PDS Architecture: A Stacked Architecture for OAIS-based preservation-aware storage
[Diagram labels: AIP; Preservation Web Services; Preservation DataStore; Preservation Engine Layer; XAM Layer; Object Layer backend; Ingest, Access, Administration, Applications.]
- Layered approach
- Based on open standards: OAIS, XAM, OSD
- Generic mappings: logical object to physical object

17 PDS Architecture in detail
[Diagram labels: AIP; Preservation Web Services (web service, Preservation WSDL); Preservation DataStore; Preservation Engine with RepInfo Mgr, PDI Mgr, Migration Mgr, Placement Mgr; XAM API; XAM Library; VIM API; XAM to FS; File System (posix I/O); XAM to OSD (sockets); HL OSD + Object Store; WAS CE; backend; Security; Admin; Ingest, Access, Administration, Applications.]

18 Scalability Model

19 Storage Interface (High Level)
- OAIS defines the Archival Information Package (AIP) to be the basic unit stored in the archival storage
- The high level API to PDS includes:
  - ingestaip
  - accessaip
  - migrateaip
  - transformaip
  - getpreservationpolic
  - setpreservationpolicy
  - loadstorelet
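To make the shape of such an interface concrete, here is a toy in-memory stand-in. It is a sketch under assumptions: the slide gives only operation names, so the Python method names, parameters, return types and the `InMemoryPDS` class itself are hypothetical and say nothing about the real PDS signatures.

```python
import uuid
from typing import Dict


class InMemoryPDS:
    """Toy stand-in for the high level API; all signatures are assumed."""

    def __init__(self) -> None:
        self._aips: Dict[str, bytes] = {}
        self._policy: Dict[str, str] = {}

    def ingest_aip(self, aip: bytes) -> str:
        """Store an AIP and return a newly assigned identifier."""
        aip_id = str(uuid.uuid4())
        self._aips[aip_id] = aip
        return aip_id

    def access_aip(self, aip_id: str) -> bytes:
        """Return the stored AIP; raises KeyError for an unknown id."""
        return self._aips[aip_id]

    def set_preservation_policy(self, policy: Dict[str, str]) -> None:
        self._policy = dict(policy)

    def get_preservation_policy(self) -> Dict[str, str]:
        return dict(self._policy)

    def migrate_aip(self, aip_id: str, target: str) -> None:
        raise NotImplementedError("migration is outside this sketch")

    def transform_aip(self, aip_id: str, storelet_name: str) -> None:
        raise NotImplementedError("transformation is outside this sketch")

    def load_storelet(self, name: str, code: bytes) -> None:
        raise NotImplementedError("storelet loading is outside this sketch")


store = InMemoryPDS()
new_id = store.ingest_aip(b"packaged AIP bytes")
store.set_preservation_policy({"fixity": "sha256"})
```

The unimplemented methods are listed only so that every operation named on the slide has a visible counterpart.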

20 Towards Self-Describing Self-Contained Data Format (SD-SCDF)
- Ideas at an early stage. Proposed by Simona Cohen.
- Context: SNIA's 100 Year Archive Task Force
Digital Information Layers
- Application Layer: get the interpretation of the data. Example: render a Word 3.0 document. Diverse applications and data types.
- Data Layer: get the basic units (records/containers/objects). Example: understand OAIS AIPs. Can it be standardized?
- Bit Layer: get the bits out from the media. Example: Linear Tape-Open (LTO) for tapes. Maybe depends on media type.

21 Requirements for the Data Layer Format (1/2)
- Media agnostic: tape, disk, future media
- Vendor and platform agnostic
- Self-describing
- Support self-contained data: include means to represent internal links and cross references
- Performance: need to have good performance even for large data that includes text and binaries; enable parallel reads and writes

22 Requirements for the Data Layer Format (2/2)
- Interoperability: need to be able to migrate data between different systems without loss of data; can be interpreted in the future
- Extensible: additional information which may be added in the future; vendor specific extensions
- Cost: free parsers
- Readable by both humans and machines: ability to do offline inspection
- Support additional functions on the data: compression, encryption, cryptography

23 Proposal
- SD-SCDF is a proposal for an open logical format standard based on marrying the OAIS AIP with XAM.
- The data format will include the OAIS concepts: RepInfo, reference, provenance, fixity, context
- Use XOP for a cluster of AIPs
  - Each AIP is a set of XSets
  - Does a media unit (e.g. volume) contain one XOP package or many XOP packages?
- Add inter-link and external-link mechanism
- Include a TOC that points to the various AIPs on the media
- VTL could be a natural translation point and search staging area for off line data
(Taken from SNIA SDDF presentation)

24 Summary
- Motivated the problem
- Analyzed special characteristics of the problem
- Proposed architecture and problems
References
- "Towards OAIS-Based Preservation Aware Storage - A White Paper".
- "The Need for Preservation Aware Storage - A Position Paper". ACM SIGOPS Operating Systems Review, Special Issue on File and Storage Systems, Volume 41, Issue 1 (Jan 2007), pp
- Architecture paper submitted to MSST 2007
Sites

25 Backup IBM Labs in Haifa

26 Collaborations and Partners
CASPAR: Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval
Objectives:
- Build an OAIS-based framework and architecture
- Demonstrate its validity with heterogeneous data
Status
- Provided User Requirements and Scenarios (D4101) document to EU
- Working on overall conceptual model and architecture
- Each partner defines its component in more detail and specifies its interface
- Built a user community web site
- Presented preservation and CASPAR seminars
- Received some initial data
Contacts with US partners
- Reagan Moore, SDSC

27 Preservation Aware Storage Proposed Functionalities (1/3)
- Encapsulate and physically co-locate in the storage the raw data and its complex interrelated metadata objects, such as the representation information network, provenance, and fixity. This ensures that the metadata needed for interpretation is not separated from the raw data and thus never lost (if the raw data survives).
- Utilize the locality property and execute data intensive functions such as fixity computations, data validation and data transformation within the storage component.
- Include the representation information of metadata, such as the representation information of fixity and provenance, so that the metadata can be validated and interpreted when migrating to newer systems.

28 Preservation Aware Storage Proposed Functionalities (2/3)
- Handle the provenance events internally. The applications on top of the preservation aware storage should be freed from managing events that can be handled internally in the storage. Moreover, the types of provenance events are richer and also include events related to migration and transformation.
- Support the loading and execution of external transformations during the migration process. Additionally, it should facilitate on demand triggering of those transformations.
- Support media migration, as opposed to system migration, in which migration from one system to another can be done by physically detaching the media from one system and attaching it to the new system.

29 Preservation Aware Storage Proposed Functionalities (3/3)
- Maintain referential integrity, including updating all the links during the migration process such that they remain valid in the new system. This requires an awareness of certain metadata fields that represent links, both internally to the system and externally.
- Ensure readability of the data by a different system in the future. This is done by developing and supporting global self-described formats for disks and tapes.
- Support a graceful loss of data. Some portions of the data are likely to be lost or become corrupted over time. If some data is lost, a good preservation system must prevent cases where data in the system that is still intact cannot be read or interpreted. This is done by utilizing a self-contained format.
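The link updating mentioned in the first item can be pictured with a small sketch. It assumes, purely for illustration, that links are stored as a mapping from an AIP identifier to the identifiers it points at, and that migration yields an old-to-new identifier map; the function names `remap_links` and `dangling_links` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List


def remap_links(links_by_aip: Dict[str, List[str]],
                id_map: Dict[str, str]) -> Dict[str, List[str]]:
    """Rewrite both AIP ids and link targets using the migration id map.

    Identifiers missing from id_map (e.g. external links) are kept unchanged.
    """
    migrated: Dict[str, List[str]] = {}
    for old_id, links in links_by_aip.items():
        new_id = id_map.get(old_id, old_id)
        migrated[new_id] = [id_map.get(link, link) for link in links]
    return migrated


def dangling_links(links_by_aip: Dict[str, List[str]],
                   known_external: Iterable[str] = frozenset()) -> List[str]:
    """Return link targets that are neither stored AIPs nor known external ids."""
    internal = set(links_by_aip)
    external: FrozenSet[str] = frozenset(known_external)
    missing = {
        link
        for links in links_by_aip.values()
        for link in links
        if link not in internal and link not in external
    }
    return sorted(missing)


old_links = {"a": ["b", "ext:1"], "b": []}
new_links = remap_links(old_links, {"a": "A", "b": "B"})
```

Running `dangling_links` after `remap_links` is one way to check that every internal link still resolves in the new system.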

30 Ingest AIP Data Flow Within PDS (1/3)
- The OAIS Ingest entity sends to Preservation DataStore (Archival Storage) a storage request with an AIP. The transfer request may need to indicate the anticipated frequency of utilization.
- The Reference Manager either assigns or validates the given Persistent Globally Unique ID for the AIP. The AIP ID may be based on XAM XUID. The AIP ID also resides in the OAIS Data Management entity or CASPAR Directory Service.
- The RepInfo Manager validates and fetches some of the RepInfo network of the content data object. The given AIP may sometimes embed the RepInfo and sometimes just have links to the RepInfo, which resides in some registry. Since the Preservation DataStore is required to have self-contained objects, the RepInfo Manager copies the RepInfo from the registry. The RepInfo Manager will utilize a configurable policy to decide to what extent the RepInfo network is copied.
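One possible reading of "a configurable policy to decide to what extent the RepInfo network is copied" is a depth limit on a graph traversal. The sketch below assumes exactly that: the registry is modelled as a plain dictionary from a RepInfo name to the names it depends on, and `max_depth` stands in for the policy. The edges in the example registry are invented for illustration and are not taken from the CASPAR figure.

```python
from collections import deque
from typing import Dict, List, Set


def collect_repinfo(registry: Dict[str, List[str]],
                    roots: List[str],
                    max_depth: int) -> Set[str]:
    """Breadth-first walk of the RepInfo network, at most max_depth hops from the roots."""
    seen: Set[str] = set()
    queue = deque((root, 0) for root in roots)
    while queue:
        node, depth = queue.popleft()
        if node in seen or depth > max_depth:
            continue
        seen.add(node)
        for dependency in registry.get(node, []):
            queue.append((dependency, depth + 1))
    return seen


# Hypothetical dependency edges between RepInfo items.
registry = {
    "FITS FILE": ["FITS STANDARD", "FITS DICTIONARY"],
    "FITS STANDARD": ["PDF STANDARD"],
    "FITS DICTIONARY": ["DICTIONARY SPECIFICATION"],
    "DICTIONARY SPECIFICATION": ["XML SPECIFICATION"],
    "XML SPECIFICATION": ["UNICODE SPECIFICATION"],
}
shallow = collect_repinfo(registry, ["FITS FILE"], max_depth=1)
```

A larger `max_depth` makes the stored object more self-contained at the cost of copying more of the registry.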

31 Ingest AIP Data Flow Within PDS (2/3)
- The Fixity Manager computes the fixity using a hierarchical hash computation tree. The AIP may include hints as to how strong the fixity algorithm should be. The computed fixity will be exposed.
- The Provenance Manager generates the provenance structure and adds the initial events.
- The Context Manager checks the context referential integrity. The Context Manager will check the existence of the hard links but not the soft links.
- The RepInfo Manager associates the PDI RepInfo with the given AIP. The PDI RepInfo is known to PDS but it needs to be associated with the AIP to satisfy the self-containment requirement.
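A hierarchical hash computation can be sketched as a binary hash tree over fixed-size blocks. The slides do not describe the actual tree layout or algorithm used by the Fixity Manager, so the block size, the pairing rule and the `algorithm` parameter (standing in for the "strength" hint) are all assumptions of this example.

```python
import hashlib
from typing import List


def leaf_hashes(data: bytes, block_size: int, algorithm: str) -> List[bytes]:
    """Hash each fixed-size block of the data; empty data yields one leaf."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    return [hashlib.new(algorithm, block).digest() for block in blocks]


def tree_root(hashes: List[bytes], algorithm: str) -> bytes:
    """Combine hashes pairwise, level by level, until a single root remains."""
    level = list(hashes)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            # An unpaired last node is hashed on its own.
            pair = level[i] + (level[i + 1] if i + 1 < len(level) else b"")
            next_level.append(hashlib.new(algorithm, pair).digest())
        level = next_level
    return level[0]


def fixity(data: bytes, block_size: int = 4096, algorithm: str = "sha256") -> str:
    """Return the hex root of the hash tree over the data."""
    return tree_root(leaf_hashes(data, block_size, algorithm), algorithm).hex()


digest = fixity(b"content data object", block_size=4)
```

A tree of this kind lets a later check recompute only the branch covering a changed or damaged block, which fits the idea of running fixity computations close to the data.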

32 Ingest AIP Data Flow Within PDS (3/3)
- The Placement Manager computes to which cluster to assign the given AIP. Each cluster is self-contained and all its artifacts will reside physically together on the same media unit, e.g. a tape volume.
- The Placement Manager has a configuration file that includes input from the user, the media unit sizes, the AIP context, the Content Information RepInfo, etc. This configuration file is used to compute the cluster assignment.
- The Placement Manager updates the cluster object and optionally compresses it. It also updates the AIP-to-MediaUnit table.
- The Placement Manager optionally exports (unmounts) the cluster object to the media unit using a self-describing self-contained format.
- Preservation DataStore sends to the OAIS Ingest entity a storage confirmation event indicating (or verifying) the storage identification information for the AIP.
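The slides do not say how the cluster assignment is computed. As one simple possibility, the sketch below uses a first-fit rule: AIPs sharing a context key are packed into the same cluster until the media unit is full. The function name `assign_clusters`, the tuple layout and the first-fit rule are all assumptions for illustration, not the Placement Manager's algorithm.

```python
from typing import Dict, List, Tuple


def assign_clusters(aips: List[Tuple[str, str, int]],
                    media_unit_size: int) -> Dict[str, int]:
    """Map each AIP id to a cluster index.

    aips holds (aip_id, context_key, size) tuples. An AIP joins the first
    cluster with the same context key that still has room; otherwise a new
    cluster is opened.
    """
    clusters: List[Tuple[str, int]] = []   # (context_key, used_capacity)
    table: Dict[str, int] = {}             # stands in for the AIP-to-MediaUnit table
    for aip_id, context_key, size in aips:
        if size > media_unit_size:
            raise ValueError(f"{aip_id} does not fit on one media unit")
        for index, (cluster_context, used) in enumerate(clusters):
            if cluster_context == context_key and used + size <= media_unit_size:
                clusters[index] = (cluster_context, used + size)
                table[aip_id] = index
                break
        else:
            clusters.append((context_key, size))
            table[aip_id] = len(clusters) - 1
    return table


placement = assign_clusters(
    [("a", "satellite", 60), ("b", "satellite", 30),
     ("c", "satellite", 20), ("d", "art", 10)],
    media_unit_size=100,
)
```

Grouping by context keeps related AIPs on the same media unit, so the loss of one unit affects one self-contained cluster.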


Draft Digital Preservation Policy for IGNCA. Dr. Aditya Tripathi Banaras Hindu University Varanasi Draft Digital Preservation Policy for IGNCA Dr. Aditya Tripathi Banaras Hindu University Varanasi aditya@bhu.ac.in adityatripathi@hotmail.com Digital Preservation Born Digital Object Regardless of U S

More information

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation Trends in Data Protection and Restoration Technologies Mike Fishman, EMC 2 Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member

More information

The e-depot in practice. Barbara Sierman Digital Preservation Officer Madrid,

The e-depot in practice. Barbara Sierman Digital Preservation Officer Madrid, Barbara Sierman Digital Preservation Officer Madrid, 16-03-2006 e-depot in practice Short introduction of the e-depot 4 Cases with different aspects Characteristics of the supplier Specialities, problems

More information

The digital preservation technological context

The digital preservation technological context The digital preservation technological context Michael Day, Digital Curation Centre UKOLN, University of Bath m.day@ukoln.ac.uk Preservation of Digital Heritage: Basic Concepts and Main Initiatives, Madrid,

More information

Copyright 2008, Paul Conway.

Copyright 2008, Paul Conway. Unless otherwise noted, the content of this course material is licensed under a Creative Commons Attribution - Non-Commercial - Share Alike 3.0 License.. http://creativecommons.org/licenses/by-nc-sa/3.0/

More information

Technology Insight Series

Technology Insight Series IBM ProtecTIER Deduplication for z/os John Webster March 04, 2010 Technology Insight Series Evaluator Group Copyright 2010 Evaluator Group, Inc. All rights reserved. Announcement Summary The many data

More information

An overview of the OAIS and Representation Information

An overview of the OAIS and Representation Information An overview of the OAIS and Representation Information JORUM, DCC and JISC Forum Long-term Curation and Preservation of Learning Objects February 9 th 2006 University of Glasgow Manjula Patel UKOLN and

More information

The Need for a Terminology Bridge. May 2009

The Need for a Terminology Bridge. May 2009 May 2009 Principal Author: Michael Peterson Supporting Authors: Bob Rogers Chief Strategy Advocate for the SNIA s Data Management Forum, CEO, Strategic Research Corporation and TechNexxus Chair of the

More information

Archive II. The archive. 26/May/15

Archive II. The archive. 26/May/15 Archive II The archive 26/May/15 What is an archive? Is a service that provides long-term storage and access of data. Long-term usually means ~5years or more. Archive is strictly not the same as a backup.

More information

DRS Policy Guide. Management of DRS operations is the responsibility of staff in Library Technology Services (LTS).

DRS Policy Guide. Management of DRS operations is the responsibility of staff in Library Technology Services (LTS). Harvard University Library Office for Information Systems DRS Policy Guide This Guide defines the policies associated with the Harvard Library Digital Repository Service (DRS) and is intended for Harvard

More information

Effizientes Speichern von Cold-Data

Effizientes Speichern von Cold-Data Effizientes Speichern von Cold-Data Dr. Dirk Gebh Storage Sales Consultant Oracle Deutschland Program Agenda 1 2 3 4 5 Cold-Data OHSM Introduction Use Case Removing Cold Data from Primary Storage OHSM

More information

Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam. Copyright 2014 SEP

Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam.  Copyright 2014 SEP Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam info@sepusa.com www.sepusa.com Table of Contents INTRODUCTION AND OVERVIEW... 3 SOLUTION COMPONENTS... 4-5 SAP HANA... 6 SEP

More information

DRS Update. HL Digital Preservation Services & Library Technology Services Created 2/2017, Updated 4/2017

DRS Update. HL Digital Preservation Services & Library Technology Services Created 2/2017, Updated 4/2017 Update HL Digital Preservation Services & Library Technology Services Created 2/2017, Updated 4/2017 1 AGENDA DRS DRS DRS Architecture DRS DRS DRS Work 2 COLLABORATIVELY MANAGED DRS Business Owner Digital

More information

Agenda. Bibliography

Agenda. Bibliography Humor 2 1 Agenda 3 Trusted Digital Repositories (TDR) definition Open Archival Information System (OAIS) its relevance to TDRs Requirements for a TDR Trustworthy Repositories Audit & Certification: Criteria

More information

DataFinder A Scientific Data Management Solution ABSTRACT

DataFinder A Scientific Data Management Solution ABSTRACT DataFinder A Scientific Data Management Solution Tobias Schlauch (1), Andreas Schreiber (2) (1) German Aerospace Center (DLR) Simulation and Software Technology Lilienthalplatz 7, D-38108 Braunschweig,

More information

Preserving Electronic Mailing Lists as Scholarly Resources: The H-Net Archives

Preserving Electronic Mailing Lists as Scholarly Resources: The H-Net Archives Preserving Electronic Mailing Lists as Scholarly Resources: The H-Net Archives Lisa M. Schmidt lisa.schmidt@matrix.msu.edu http://www.h-net.org/archive/ MATRIX: The Center for Humane Arts, Letters & Social

More information

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments *

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments * Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments * Joesph JaJa joseph@ Mike Smorul toaster@ Fritz McCall fmccall@ Yang Wang wpwy@ Institute

More information

The Web Hierarchical Ordering Mechanism (WHOM) a tool for ordering HDF and HDF-EOS Data

The Web Hierarchical Ordering Mechanism (WHOM) a tool for ordering HDF and HDF-EOS Data The Goddard Earth Sciences Distributed Active Archive Center http://daac.gsfc.nasa.gov The Web Hierarchical Ordering Mechanism (WHOM) a tool for ordering HDF and HDF-EOS Data Presented by: James E. Johnson

More information

Transitioning NCAR MSS to HPSS

Transitioning NCAR MSS to HPSS Transitioning NCAR MSS to HPSS Oct 29, 2009 Erich Thanhardt Overview Transitioning to HPSS Explain rationale behind the move Introduce current HPSS system in house Present transition plans with timelines

More information

Importance of cultural heritage:

Importance of cultural heritage: Cultural heritage: Consists of tangible and intangible, natural and cultural, movable and immovable assets inherited from the past. Extremely valuable for the present and the future of communities. Access,

More information

University of British Columbia Library. Persistent Digital Collections Implementation Plan. Final project report Summary version

University of British Columbia Library. Persistent Digital Collections Implementation Plan. Final project report Summary version University of British Columbia Library Persistent Digital Collections Implementation Plan Final project report Summary version May 16, 2012 Prepared by 1. Introduction In 2011 Artefactual Systems Inc.

More information

Freedom of Access in Maine

Freedom of Access in Maine Freedom of Access in Maine Managing Digital Reality: State Initiatives Presented by: Karla Black, Esq. The Reality of Digital Records If William Shakespeare had written Hamlet on a word processor, or If

More information

Hybrid Backup & Disaster Recovery. Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam

Hybrid Backup & Disaster Recovery. Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam Hybrid Backup & Disaster Recovery Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam 1 Table of Contents 1. Introduction and Overview... 3 2. Solution Components... 3 3. SAP HANA: Data Protection...

More information

Solution Brief: Archiving with Harmonic Media Application Server and ProXplore

Solution Brief: Archiving with Harmonic Media Application Server and ProXplore Solution Brief: Archiving with Harmonic Media Application Server and ProXplore Summary Harmonic Media Application Server (MAS) provides management of content across the Harmonic server and storage infrastructure.

More information

HOW DATA DEDUPLICATION WORKS A WHITE PAPER

HOW DATA DEDUPLICATION WORKS A WHITE PAPER HOW DATA DEDUPLICATION WORKS A WHITE PAPER HOW DATA DEDUPLICATION WORKS ABSTRACT IT departments face explosive data growth, driving up costs of storage for backup and disaster recovery (DR). For this reason,

More information

Its All About The Metadata

Its All About The Metadata Best Practices Exchange 2013 Its All About The Metadata Mark Evans - Digital Archiving Practice Manager 11/13/2013 Agenda Why Metadata is important Metadata landscape A flexible approach Case study - KDLA

More information

Tom Sas HP. Author: SNIA - Data Protection & Capacity Optimization (DPCO) Committee

Tom Sas HP. Author: SNIA - Data Protection & Capacity Optimization (DPCO) Committee Advanced PRESENTATION Data Reduction TITLE GOES HERE Concepts Tom Sas HP Author: SNIA - Data Protection & Capacity Optimization (DPCO) Committee SNIA Legal Notice The material contained in this tutorial

More information

The ESA Earth Observation Payload Data Long Term Storage Activities ABSTRACT

The ESA Earth Observation Payload Data Long Term Storage Activities ABSTRACT The ESA Earth Observation Payload Data Long Term Storage Activities Gian Maria Pinna (1), Francesco Ferrante (2) (1) ESA-ESRIN Via G. Galilei, CP 64,00044 Frascati, Italy EMail: GianMaria.Pinna@esa.int

More information

OAIS: What is it and Where is it Going?

OAIS: What is it and Where is it Going? OAIS: What is it and Where is it Going? Presentation on the Reference Model for an Open Archival System (OAIS) Don Sawyer/NASA/GSFC Lou Reich/NASA/CSC FAFLRT/ALA FAFLRT/ALA 1 Organizational Background

More information

Introduction to Digital Preservation. Danielle Mericle University of Oregon

Introduction to Digital Preservation. Danielle Mericle University of Oregon Introduction to Digital Preservation Danielle Mericle dmericle@uoregon.edu University of Oregon What is Digital Preservation? the series of management policies and activities necessary to ensure the enduring

More information

Integration of Agilent OpenLAB CDS EZChrom Edition with OpenLAB ECM Compliance with 21 CFR Part 11

Integration of Agilent OpenLAB CDS EZChrom Edition with OpenLAB ECM Compliance with 21 CFR Part 11 OpenLAB CDS Integration of Agilent OpenLAB CDS EZChrom Edition with OpenLAB ECM Compliance with 21 CFR Part 11 Technical Note Introduction Part 11 in Title 21 of the Code of Federal Regulations includes

More information

Susan Thomas, Project Manager. An overview of the project. Wellcome Library, 10 October

Susan Thomas, Project Manager. An overview of the project. Wellcome Library, 10 October Susan Thomas, Project Manager An overview of the project Wellcome Library, 10 October 2006 Outline What is Paradigm? Lessons so far Some future challenges Next steps What is Paradigm? Funded for 2 years

More information

Storage Technology Requirements of the NCAR Mass Storage System

Storage Technology Requirements of the NCAR Mass Storage System Storage Technology Requirements of the NCAR Mass Storage System Gene Harano National Center for Atmospheric Research (NCAR) 1850 Table Mesa Dr. Boulder, CO 80303 Phone: +1-303-497-1203; FAX: +1-303-497-1848

More information

Interoperability & Archives in the European Commission

Interoperability & Archives in the European Commission Interoperability & Archives in the European Commission By Natalia ARISTIMUÑO PEREZ Head of Interoperability Unit at Directorate- General for Informatics (DG DIGIT) European Commission High value added

More information

Deep Storage for Exponential Data. Nathan Thompson CEO, Spectra Logic

Deep Storage for Exponential Data. Nathan Thompson CEO, Spectra Logic Deep Storage for Exponential Data Nathan Thompson CEO, Spectra Logic HISTORY Partnered with Fujifilm on a variety of projects HQ in Boulder, 35 years of business Customers in 54 countries Spectra builds

More information

Digital Preservation at NARA

Digital Preservation at NARA Digital Preservation at NARA Policy, Records, Technology Leslie Johnston Director of Digital Preservation US National Archives and Records Administration (NARA) ARMA, April 18, 2018 Policy Managing Government

More information

XML information Packaging Standards for Archives

XML information Packaging Standards for Archives XML information Packaging Standards for Archives Lou Reich/CSC Long Term Knowledge Retention Workshop March15,2006 15 March 2006 1 XML Packaging Standards Growing interest in XML-based representation of

More information

archiving with the IBM CommonStore solution

archiving with the IBM CommonStore solution IBM Software Group E-mail archiving with the IBM CommonStore solution Comprehensive flexible reliable Borut Obran Genis d.o.o. 2006 IBM Corporation Agenda Overview Mailbox management Discovery Compliance

More information

IBM TS4300 Tape Library

IBM TS4300 Tape Library IBM TS4300 Tape Library Supports secure, long-term data storage in a highly scalable tape library Highlights Highly scalable solution with one 3U base module and up to six expansions Help meet compliance

More information

A Simple Mass Storage System for the SRB Data Grid

A Simple Mass Storage System for the SRB Data Grid A Simple Mass Storage System for the SRB Data Grid Michael Wan, Arcot Rajasekar, Reagan Moore, Phil Andrews San Diego Supercomputer Center SDSC/UCSD/NPACI Outline Motivations for implementing a Mass Storage

More information

Building for the Future

Building for the Future Building for the Future The National Digital Newspaper Program Deborah Thomas US Library of Congress DigCCurr 2007 Chapel Hill, NC April 19, 2007 1 What is NDNP? Provide access to historic newspapers Select

More information

HPE Security Data Security. HPE SecureData. Product Lifecycle Status. End of Support Dates. Date: April 20, 2017 Version:

HPE Security Data Security. HPE SecureData. Product Lifecycle Status. End of Support Dates. Date: April 20, 2017 Version: HPE Security Data Security HPE SecureData Product Lifecycle Status End of Support Dates Date: April 20, 2017 Version: 1704-1 Table of Contents Table of Contents... 2 Introduction... 3 HPE SecureData Appliance...

More information

verapdf: definitive, open source PDF/A validation for digital preservationists

verapdf: definitive, open source PDF/A validation for digital preservationists verapdf: definitive, open source PDF/A validation for digital preservationists Open Preservation Foundation PREFORMA Open Source Workshop 2016, Stockholm Presenters Joachim Jung, Open Preservation Foundation

More information

Ta kontroll över er data! Christofer Jensen Client Technical Specialist. Stockholm

Ta kontroll över er data! Christofer Jensen Client Technical Specialist. Stockholm Ta kontroll över er data! Christofer Jensen Client Technical Specialist Stockholm IBM Storage: Named a Leader in 13 Gartner and IDC Reports in,, and 2016 #1 in Mainframe Storage, Enterprise Data Protection,

More information

XenData MX64 Edition. Product Brief:

XenData MX64 Edition. Product Brief: XenData MX64 Edition Product Brief: The MX64 Edition of XenData Archive Series software runs on multiple 64 bit servers and creates a very high performance digital video archive. The software manages one

More information

Technology Special Interest Group Thursday, December 4, Tony Hanson Webmaster Technology Special Interest Group Leader

Technology Special Interest Group Thursday, December 4, Tony Hanson Webmaster Technology Special Interest Group Leader Technology Special Interest Group Thursday, December 4, 2014 Tony Hanson Webmaster Technology Special Interest Group Leader 2014 2015 Schedule Sept: Backing Up Your Data Oct: Archiving 1: Organizing Your

More information

Digital Preservation DMFUG 2017

Digital Preservation DMFUG 2017 Digital Preservation DMFUG 2017 1 The need, the goal, a tutorial In 2000, the University of California, Berkeley estimated that 93% of the world's yearly intellectual output is produced in digital form

More information

XenData Product Brief: XenData6 Server Software

XenData Product Brief: XenData6 Server Software XenData Product Brief: XenData6 Server Software XenData6 Server is the software that runs the XenData SX-10 Archive Appliance and the range of SX-520 Archive Servers, creating powerful solutions for archiving

More information

A Collaboration Model between Archival Systems to Enhance the Reliability of Preservation by an Enclose-and-Deposit Method

A Collaboration Model between Archival Systems to Enhance the Reliability of Preservation by an Enclose-and-Deposit Method A Collaboration Model between Archival Systems to Enhance the Reliability of Preservation by an Enclose-and-Deposit Method Koichi Tabata, Takeshi Okada, Mitsuharu Nagamori, Tetsuo Sakaguchi, and Shigeo

More information

EMC Centera CentraStar/SDK Compatibility with Centera ISV Applications

EMC Centera CentraStar/SDK Compatibility with Centera ISV Applications EMC Centera CentraStar/SDK Compatibility with Centera ISV Applications A Detailed Review Abstract This white paper provides an overview on the compatibility between EMC Centera CentraStar and SDK releases,

More information

What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011

What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011 What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011 The FCLA, the FDA, and DAITSS FDA: a service of the Florida Center for Library Automation

More information

New LTO Tape File System Helps Make Video Storage Easier Introducing

New LTO Tape File System Helps Make Video Storage Easier Introducing New LTO Tape File System Helps Make Video Storage Easier Introducing LTO-5 Technology and Linear Tape File System TM Learn more at: www.trustlto.com Linear Tape-Open, LTO, LTO Logo, Ultrium and Ultrium

More information

AUTOMATING IBM SPECTRUM SCALE CLUSTER BUILDS IN AWS PROOF OF CONCEPT

AUTOMATING IBM SPECTRUM SCALE CLUSTER BUILDS IN AWS PROOF OF CONCEPT AUTOMATING IBM SPECTRUM SCALE CLUSTER BUILDS IN AWS PROOF OF CONCEPT By Joshua Kwedar Sr. Systems Engineer By Steve Horan Cloud Architect ATS Innovation Center, Malvern, PA Dates: Oct December 2017 INTRODUCTION

More information