Can a Consortium Build a Viable Preservation Repository?

Presentation at CNI, March 31, 2014
Bradley Daigle (APTrust, University of Virginia)
Stephen Davis (Columbia University)
Linda Newman (University of Cincinnati)
Suzanne Thorin (APTrust, University of Virginia)
Scott Turnbull (APTrust, University of Virginia)
www.aptrust.org

Academic Preservation Trust
Academic Preservation Trust (APTrust), a consortium of 17 institutions, is taking a community approach to building and managing a repository infrastructure that will provide long-term preservation of the scholarly record. APTrust will also be a first node of the Digital Preservation Network (DPN).

APTrust Institutions
Columbia University
Johns Hopkins University
Indiana University
North Carolina State University
Penn State University
Stanford University
Syracuse University
University of Chicago
University of Cincinnati
University of Connecticut
University of Maryland
University of Miami
University of Michigan
University of North Carolina
University of Notre Dame
University of Virginia
Virginia Tech

APTrust is hosted by the University of Virginia, which fully supports 5½ staff, including space and equipment.
Program Director
Lead Engineer
Junior Engineer
Systems Engineer
Content Lead (1/2 time)

Membership Dues
Member dues: $20,000 annually
Supports partner meetings, conference travel, contract and cloud services, marketing, and the web site

What is the problem we are trying to solve?
Columbia University
University of Cincinnati
University of Virginia

Columbia University Use Case 1
Columbia University Libraries / Information Services has made commitments:
to granting agencies, to provide long-term digital archiving for digital content created with grant funds
to third-party content creators, to provide permanent access to born-digital content acquired from them
to continue collecting and preserving archival collections, which are now partly or wholly born-digital
to permanently preserve University-generated archival and research content

Columbia University Use Case 2
We must preserve the content of:
Local Digitization Projects
Preservation-Related Digitization
Institutional Repository / Data Sets
Born Digital Archival Content
Archived Web Sites
Super Dark Archives (highly secure)

Columbia University Questions
Why create our own single-institution long-term preservation repository?
Why divert scarce existing CUL/IS internal equipment funds to storage on a permanent basis?
Why divert scarce existing CUL/IS staff time to creation, enhancement and maintenance of our own local preservation repository, permanently?
Why undergo the costs and staff investment in obtaining local TRAC certification?

University of Cincinnati Use Case
Question: Why is digital preservation important to us?
Answer: We have digital collections where the original source material has deteriorated or is about to be intentionally destroyed (magnetic tapes; nitrate negatives, considered flammable). The digital object is THE ONLY object.
Magnetic tape image by Daniel P. B. Smith. Released under the GNU Free Documentation License. http://en.wikipedia.org/wiki/file:magtape1.jpg
Nitrate negative from Cincinnati Subway and Street Improvements (digital collection). http://drc.libraries.uc.edu/handle/2374.uc/702759

University of Cincinnati Use Case
Question: Why is digital preservation important to us?
Answer: We just moved a repository system from Columbus, Ohio to our Cincinnati campus.
10 TB of data, in 16 different VMDKs (virtual machine disk images), was transferred over the internet.
Checksums were created for each VMDK and verified upon receipt, some taking 24 hours to calculate.
Checksums were also created for more than one million files, compared with information in the repository database, and compared again after the storage format was changed (from VMDK to NFS).
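The verify-on-receipt step described on this slide can be sketched roughly as follows. This is a minimal illustration, not the procedure the University of Cincinnati actually used: the function names, the choice of SHA-256, and the chunk size are all assumptions. Reading in chunks matters here because a multi-hundred-gigabyte disk image cannot be loaded into memory at once.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    The algorithm and chunk size are illustrative defaults (assumptions),
    not values taken from the presentation.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(received_path, expected_checksum, algorithm="sha256"):
    """Return True if the received file matches the checksum recorded before transfer."""
    return file_checksum(received_path, algorithm) == expected_checksum.lower()
```

In use, the sender would record `file_checksum(...)` for each disk image before transfer and the receiver would call `verify_transfer(...)` with that recorded value once the copy arrives.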

University of Cincinnati Use Case
Question: Why is digital preservation important to us?
Answer: (continued) We decided to test a full backup and restore. This took over a week, and we discovered that 16 of our digital assets were corrupt. We diagnosed the cause, adjusted, and repeated the test without error, but if we had not been comparing before-and-after checksums of all files, we would not have known about the corruption. This process took 1.5 months and offered a striking example of the care that must be taken to avoid losing data when moving large amounts of it.
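The before-and-after comparison that exposed the corrupt assets might look something like the sketch below. It assumes both copies are reachable as directory trees; the function names and the report categories ("missing", "unexpected", "corrupt") are hypothetical, chosen only for illustration.

```python
import hashlib
from pathlib import Path


def build_manifest(root, algorithm="sha256"):
    """Map each file path (relative to root) to its hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.new(algorithm)
            with open(path, "rb") as handle:
                # Stream each file in 1 MiB chunks to bound memory use.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare_manifests(before, after):
    """Report files that disappeared, appeared, or changed content."""
    shared = set(before) & set(after)
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "corrupt": sorted(name for name in shared if before[name] != after[name]),
    }
```

A restore test would call `build_manifest` on the source before backup and on the restored copy afterwards, then treat any non-empty list from `compare_manifests` as a failure to investigate.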

University of Cincinnati Use Case
Question: Why is digital preservation important to us?
Answer: Our credibility is at stake. We want to be believed.
Photograph; President Nixon with Elvis Presley; 20 Dec 1970; Richard Nixon Presidential Library and Museum, Yorba Linda, California. http://www.nixonlibrary.gov/forresearchers/find/av/photo/images/12_20_70_3.gif

University of Cincinnati Use Case
Question: Why is digital preservation important to us?
Answer: (continued) We are promoting a new digital repository to our faculty.
Its raison d'être (why researchers should deposit their digital assets in this repository rather than, or in addition to, several short-term delivery systems on our campus) is long-term persistence.
We have promised that their assets will also be preserved in a dark archive such as the Academic Preservation Trust.
We have stated that preservation means bit-level integrity and format migration.
We have asserted that the Libraries' traditional mission of preservation of the cultural record now applies to the digital scholarly record.

University of Virginia Use Case
Integral part of our preservation and curatorial landscape
Soup to nuts process for analogue materials
Selection
Digitization
Management
Stewardship

Born Digital UVa - continued
It is all about transfer
Disk images awaiting arrangement
Need and I/O space
Digital Scholarship
Wish we had this years ago

UVa Landscape
Local disk: (please only temporary) / scratch disk
Spinning disk: still only backup
Local HSM: local tape backup
APTrust: more robust preservation actions
DPN: dark archive

Basic Technology Goals
Simple submission packaging - BagIt
Strong Chain of Custody - Logging
Format agnostic basic preservation - Fixity
Strong auditing and reporting - PREMIS
Easily reference items between systems - Identifiers
Simple distribution package for restoration - BagIt
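To make the BagIt and fixity goals above concrete, here is a minimal sketch of building and re-checking a bag using only the Python standard library. It is not APTrust's packaging code. The helper names are hypothetical, and the `BagIt-Version` value and the SHA-256 manifest algorithm are assumptions chosen for illustration. The layout (a `bagit.txt` declaration, a `data/` payload directory, and a `manifest-<algorithm>.txt` listing one checksum per payload file) follows the BagIt specification.

```python
import hashlib
import shutil
from pathlib import Path

MANIFEST_NAME = "manifest-sha256.txt"  # assumption: SHA-256 payload manifest


def _sha256(path):
    # Small-file shortcut; a real payload should be hashed in chunks.
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def make_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write the bag declaration and manifest."""
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # creates bag_dir as a parent directory
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    lines = [
        f"{_sha256(path)}  {path.relative_to(bag_dir).as_posix()}\n"
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    ]
    (bag_dir / MANIFEST_NAME).write_text("".join(lines), encoding="utf-8")
    return bag_dir


def validate_bag(bag_dir):
    """Return payload paths that are missing or no longer match the manifest."""
    bag_dir = Path(bag_dir)
    failures = []
    manifest = (bag_dir / MANIFEST_NAME).read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, relative_path = line.split(None, 1)
        target = bag_dir / relative_path
        if not target.is_file() or _sha256(target) != expected:
            failures.append(relative_path)
    return failures
```

Because the same package format serves both submission and restoration, a depositor can run the same `validate_bag` check on a restored bag that the repository ran on ingest.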

Flow of Content in APTrust
[Diagram]
Submission Bag: Metadata (TagFiles); Preservation Files (data/file1, data/file2, data/file3)
Ingest: Break apart bag and manage as separate Fedora objects
Related Fedora Objects: Intellectual Object; Generic File1, Generic File2, Generic File3
DPN: Bagged separately in DPN to support versioning (DPN Bags)
Restore: Repackage to same bag format
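The split-and-repackage flow in this diagram can be modeled with a small data structure. The sketch below is purely illustrative: the class names echo the labels on the slide, but the fields, the identifier scheme (bag name plus payload path), and both functions are assumptions, and nothing here talks to Fedora or DPN.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class GenericFile:
    identifier: str  # hypothetical scheme: "<bag name>/<path inside bag>"
    bag_path: str    # original location inside the submission bag


@dataclass
class IntellectualObject:
    identifier: str
    tag_files: list = field(default_factory=list)
    files: list = field(default_factory=list)


def split_bag(bag_name, entries):
    """Break a submission bag listing into one object plus a record per payload file."""
    obj = IntellectualObject(identifier=bag_name)
    for entry in entries:
        if PurePosixPath(entry).parts[0] == "data":
            # Payload files become separately managed records.
            obj.files.append(GenericFile(f"{bag_name}/{entry}", entry))
        else:
            # Everything outside data/ is treated as a tag (metadata) file.
            obj.tag_files.append(entry)
    return obj


def restore_listing(obj):
    """Rebuild the original bag listing so content can be repackaged in the same format."""
    return sorted(obj.tag_files + [f.bag_path for f in obj.files])
```

Keeping each file's original bag path on its record is what allows the restore step to repackage content into the same bag layout it arrived in.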

Challenges
Abstracting away from specific repository software
Identifying content across distributed systems
Scaling solutions are still a mixed bag
Managing dependencies in a consortium
Deleting content requires some more work

Sustainability of Service
Common development frameworks - Hydra
Use available cloud services - AWS
Align with evolving preservation ecosystem - OAIS & DDP, Fedora 4
Standards like OAIS and DDP

APTrust and TRAC Certification
APTrust is committed to working toward TRAC certification. APTrust is the first ever repository to be built from the ground up taking TRAC into account.
A Certification Working Group has been established and will be advising and consulting with the APTrust staff and partners on TRAC objectives.
Initial development work is proceeding at the level of Digital Object Management and Infrastructure.

Examples of TRAC Requirements
The repository shall have an appropriate succession plan, contingency plans, and/or escrow arrangements in place in case the repository ceases to operate or the governing or funding institution substantially changes its scope.
The repository shall have short- and long-term business planning processes in place to sustain the repository over time.
The repository shall have contracts or deposit agreements which specify and transfer all necessary preservation rights, and those rights transferred shall be documented.
The repository shall have the appropriate number of staff to support all functions and services.
The repository shall have and use a convention that generates persistent, unique identifiers.

Academic Preservation Trust: part of the evolving national digital preservation infrastructure
"The Task Force envisions the development of a national system of digital archives, which it defines as repositories of digital information that are collectively responsible for the long-term accessibility of the nation's social, economic, cultural and intellectual heritage instantiated in digital form."
Preserving Digital Information: Report of the Task Force on Archiving of Digital Information, commissioned by The Commission on Preservation and Access and the Research Libraries Group. May 1, 1996. Executive Summary, iii.