What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011


The FCLA, the FDA, and DAITSS
  FDA: a service of the Florida Center for Library Automation (FCLA) in Gainesville, Florida
  DAITSS (Dark Archive In The Sunshine State) is the repository software developed by FCLA for the Florida Digital Archive (FDA) as a preservation solution for the State University Libraries of Florida
  The FDA was the first fully OAIS-conformant repository (Open Archival Information System, ISO 14721:2003) in production in the United States (2005)
  The FDA is one of a handful of repositories in the United States to use format migration as a long-term preservation strategy
  The FDA repository is managed centrally at FCLA, with FDA Affiliates depositing materials to a central repository
  The FDA automatically archives ETDs submitted to FCLA's ETD Service
  In April 2011 the FDA went into production with version 2 of its DAITSS software, and is preparing for its Open Source Software release for possible use by other repositories

FDA staff:
  2 full-time developers, currently working on DAITSS 2 software enhancements
  1 Formats Specialist/developer: works with all of the major file format-related tools and resources (DROID, JHOVE, UDFR)
  1 Archive Manager: duties include troubleshooting, training and documentation for FDA Affiliates, special conversion projects, developing new tools
  1 Operations Technician: runs production
  Services of FCLA IT staff (one primary contact) for OS patches, backups, storage disk management, etc.

FDA: a centralized digital archive serving 10 State University Libraries in Florida
  UWF, FAMU, FAU, USF, UNF, FGCU, FIU, UF, UCF, FSU

FCLA's ETD Service
  [Diagram: UWF, FAMU, FAU, USF, UNF, FGCU, FIU, UF, UCF and FSU around the FDA, with deposits labeled "Via ETD Service"]

FCLA's ETD Service
  Catalog record creation
  Online storage and access
  Access control for restricted ETDs
  Optional UMI submission
  Automatic long-term preservation in the Florida Digital Archive

The ETD Service submission process
  ETD is FTP-ed to FCLA
  A copy is processed for display (ETD Service)
  A copy is sent to the FDA workspace for archiving (FDA)

FDA direct Web Submission via DAITSS GUI

Information Package
  [Diagram labels: Intellectual Entity, Data File, Bitstream; relationship marked m to 1]

Information Package Contents
  [Diagram: one METS descriptor (manifest) and four content files]

University of Florida's METS Editor

The DAITSS 2 archiving process
1. The ETD SIP (Submission Information Package) is checked for validity:
   Completeness: are all described content files included?
   Correctness: have all content files been correctly transmitted?
2. If the SIP is valid, its individual content files are processed:
   Content file formats are identified and validated against format standards.
   Any file format inhibitors and anomalies are noted.
   File format transformation is performed according to an Action Plan for that format (e.g., PDF transformation to PDF/A-1b).
3. An AIP (Archival Information Package) is created for the ETD and two master copies are stored, one on disk on a remote server in Tallahassee and one on tape in Gainesville. The FDA's preservation database is populated with extensive metadata about each archived package and file.
4. A report is emailed or FTP-ed to the submitting institution detailing any file format inhibitors and anomalies encountered during archiving.
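The validity check in step 1 can be sketched as follows. This is a minimal illustration, not DAITSS code: the manifest is assumed to be a plain mapping from file name to an expected SHA-1 digest (a simplified stand-in for the METS descriptor), and `check_sip` and `sha1_of` are hypothetical names.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_sip(sip_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the SIP is valid.

    manifest maps each described content file name to its expected
    SHA-1 digest (an assumed, simplified stand-in for the METS descriptor).
    """
    problems = []
    for name, expected in manifest.items():
        path = sip_dir / name
        if not path.is_file():
            # Completeness: a described content file is not in the package.
            problems.append(f"missing: {name}")
        elif sha1_of(path) != expected:
            # Correctness: the file changed or was corrupted in transmission.
            problems.append(f"checksum mismatch: {name}")
    return problems
```

A package would move on to per-file processing only when `check_sip` returns an empty list; otherwise the problems could be reported back to the submitter.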

In addition to keeping multiple Archival Information Package masters, the FDA:
  Extracts and retains extensive technical metadata about each file and its component bitstreams:
    File format information
    Significant properties for file transformation
  Performs file format transformation to ensure long-term preservation
  Builds provenance history of submitted, normalized and migrated versions of each entity and its components
  Performs ongoing integrity and fixity checking:
    Are both master copies still in the repository? (Integrity)
    Has either copy of the package changed since it was stored? (Fixity)
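The two audit questions above map onto a simple check per master copy. The sketch below is an assumption-laden illustration, not the FDA's implementation: it assumes each copy is a single file whose SHA-256 digest was recorded at storage time, and the names `audit_copy` and `audit_package` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_copy(copy_path: Path, stored_digest: str) -> str:
    """Classify one master copy as 'ok', 'missing', or 'changed'."""
    if not copy_path.is_file():
        return "missing"  # integrity failure: the copy is gone
    if file_digest(copy_path) != stored_digest:
        return "changed"  # fixity failure: the bytes differ from what was stored
    return "ok"


def audit_package(copies: dict[str, Path], stored_digest: str) -> dict[str, str]:
    """Audit every master copy of one package, keyed by storage site."""
    return {site: audit_copy(path, stored_digest) for site, path in copies.items()}
```

With two sites, a result such as `{"Tallahassee": "ok", "Gainesville": "changed"}` would indicate that one copy can be used to repair the other.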

FDA repository structure
  [Diagram labels: Submission, Request, DAITSS workspace, Ingest, database, Storage silo Tallahassee, Storage silo Gainesville]

DAITSS 2 architecture

Per file: Validation, Description, Action Plan, Transformation
  Identification: what format is it?
    1,200 recognized formats
    Approx. 30 supported formats with Action Plans
  Validation and characterization:
    Is the file well-formed for the identified format?
    Is it valid for the identified format?
    What are its significant properties?
  Is there an Action Plan for this format? If so, perform file transformation
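The per-file sequence (identify, validate, look up an Action Plan) can be illustrated with a toy example. Everything here is a simplified assumption: identification uses two well-known file signatures rather than a tool such as DROID, the validity test is a single trailer check rather than full validation, and `ACTION_PLANS`, `identify`, `describe` and `FileReport` are hypothetical names. The two plan entries mirror the examples given in the slides.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical Action Plan table: identified format -> normalization target.
ACTION_PLANS = {
    "PDF": "PDF/A-1b",
    "WAV": "PCM-encoded WAV",
}


def identify(data: bytes) -> Optional[str]:
    """Toy signature-based identification of two formats."""
    if data.startswith(b"%PDF-"):
        return "PDF"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "WAV"
    return None


@dataclass
class FileReport:
    name: str
    format: Optional[str]
    normalize_to: Optional[str]
    anomalies: list[str] = field(default_factory=list)


def describe(name: str, data: bytes) -> FileReport:
    """Identify a file, note anomalies, and look up its Action Plan."""
    fmt = identify(data)
    report = FileReport(name=name, format=fmt, normalize_to=None)
    if fmt is None:
        report.anomalies.append("format not identified")
        return report
    # Toy validity check: a PDF should end with an %%EOF marker.
    if fmt == "PDF" and b"%%EOF" not in data[-1024:]:
        report.anomalies.append("missing %%EOF trailer")
    # No entry in the table means no transformation is planned.
    report.normalize_to = ACTION_PLANS.get(fmt)
    return report
```

The `anomalies` list corresponds to what would be reported back to the submitting institution, and `normalize_to` to the transformation the Action Plan calls for.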

Background Reports and Action Plans
  Background report:
    Format Description
    Pointer to Specification
    How to Recognize
    History and Duration
    Maintenance Body
    Platform Support
    Legal Issues
    Perceived Popularity
    Limitations
    Related Specifications
  Action Plan:
    Normalization format(s)
    Preservation plans:
      Original format
      Normalized format
    Revised on a regular schedule

DAITSS 2 file transformation services
  Normalization. If a file is in a format considered to be less than optimal for digital preservation, a version of the file may be created in a more preservation-worthy format. In general, preferred formats are non-proprietary, well documented, and well understood by FDA staff. (Example: WAV files are normalized to PCM-encoded WAV; all PDFs might be normalized to PDF/A-1b.)
  Migration. If a file is in a format considered at risk of obsolescence, a version may be created in a format considered to be a reasonable successor to the original format. All effort will be made to retain the appearance and behaviors of the original version, although this cannot always be guaranteed.
  Dissemination of archived packages always returns the last, best version of files as well as the original version of all files.
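The dissemination rule can be expressed as a small selection over a file's version history. This sketch assumes that "last, best" means the most recently created derived version; the `Version` record, its `created` sequence number and the `disseminate` function are hypothetical and are not taken from DAITSS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    path: str
    kind: str     # "original", "normalized", or "migrated"
    created: int  # hypothetical sequence number; higher means more recent


def disseminate(versions: list[Version]) -> list[Version]:
    """Return the original plus the most recent derived version, if any."""
    original = next(v for v in versions if v.kind == "original")
    derived = [v for v in versions if v.kind != "original"]
    if not derived:
        return [original]
    return [original, max(derived, key=lambda v: v.created)]
```

Keeping the original alongside the latest derivative means a later, better transformation can always start again from the submitted bytes.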

Reports of file format anomalies are sent to FDA Affiliates immediately after archiving, via FTP or email

Pre-submission use of FDA Description Service

Benefits of file format transformation include:
  Short-term:
    Feedback on inhibitors and anomalies allows for file correction
  Long-term:
    Files without inhibitors and anomalies are more suitable for long-term preservation
    Delivery of latest, best copy of file formats
    Ability to migrate to successor format should the original format be at risk of obsolescence
  The DAITSS Formats Specialist is constantly working to add new Action Plans and update current ones
  The file format metadata in the FDA's preservation database allows quick identification of files in the repository by format for possible transformation ("refresh")
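Identifying refresh candidates by format amounts to a query over per-file format metadata. The sketch below uses an in-memory SQLite table as a stand-in; the table name, columns and sample rows are invented for illustration and do not reflect the actual schema of the FDA's preservation database.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table standing in for the preservation database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (package_id TEXT, path TEXT, format TEXT)")
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?)",
        [
            ("E001", "thesis.pdf", "PDF 1.4"),
            ("E001", "audio.wav", "WAV"),
            ("E002", "thesis.pdf", "PDF 1.4"),
            ("E003", "thesis.pdf", "PDF/A-1b"),
        ],
    )
    return conn


def packages_with_format(conn: sqlite3.Connection, fmt: str) -> list[str]:
    """List the packages holding at least one file in the given format."""
    rows = conn.execute(
        "SELECT DISTINCT package_id FROM files WHERE format = ? ORDER BY package_id",
        (fmt,),
    )
    return [row[0] for row in rows]
```

When an Action Plan for a format is added or revised, a query of this kind would yield the packages to queue for transformation.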

DAITSS 2 refresh process

DAITSS 2: Open Source Software
  DAITSS 2 in production at the FDA April 22, 2011
  Works well in a consortial environment
  Currently preparing for OSS release via GitHub
  Summer intern 2011 project: testing a new installation, simulating an independent archive using DAITSS 2 software
  For more information about the possibility of using DAITSS 2 software to create your own digital archive, please contact Manny Rodriguez, avatar38@ufl.edu