Digital Preservation in Theory and in Practice

Digital Preservation in Theory and in Practice. PASIG Dublin Preservation Bootcamp, 17 October 2012. Tom Cramer, Chief Technology Strategist, Stanford University Libraries, tcramer@stanford.edu

Slide courtesy of Keith Johnson

What Is Digital Information? Data that is encoded as a series of 0s and 1s, typically stored on magnetic (disk, tape) or optical (CDs, DVDs) media, that requires specialized hardware to read (e.g., disk or tape drive, CD player), specialized software to operate (e.g., firmware, operating systems), still more software (applications) to interpret and render it into usable form (e.g., PowerPoint), and contextual knowledge about how to operate that software in order to use it (e.g., double-click to open).

What Is the Problem? In short, access to digital information requires hardware, software and people, but technology and people change, creating potential barriers to re-use.

What Is Digital Preservation? Digital preservation is the series of strategies and actions taken to promote the availability and usability of digital information over time.

(In)Famous Examples of Loss: the 1976 Viking Mars landings. Raw data still available, but some not processed, some not documented, and the original software defunct. 1988: Processing 3,000 images from the original tapes took 2 years of reverse engineering to build modern software. c. 2000: Researchers looking for biological data from the landing couldn't process the raw data in unknown formats; they tracked down printouts and hired students to rekey it all.

(In)Famous Examples of Loss: 1986 BBC Domesday Project. Maps, images, statistical data, video and virtual reality tours. More than 1,000,000 participants in the project commemorating the 900th anniversary of William the Conqueror's census of England. Data stored on two laser discs (300 MB per side!), which required a Philips VP415 Domesday Player to read. 1999 to 2003: Two separate data recovery efforts involving both emulation and migration. 2090: Estimated date by which the project will be copyright free.

Less Famous Examples of Loss: Parker on the Web. 559 Anglo-Saxon manuscripts: ~200,000 high-resolution images. Used Internet3 to transfer images from Cambridge to Stanford: hard drives shipped via DHL. Cambridge retained one full insurance copy on 1 TB external hard drives. Within 5 years of the start of the project, 7 of 32 of these drives had failed. (Full copy of all images preserved at Stanford, though!)
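
As a rough worked example of what "7 of 32 drives failed within 5 years" implies, the sketch below turns the slide's figures into a cumulative and an annualized failure rate. The annualized figure assumes a constant, independent yearly failure rate, which is an assumption made here for illustration and not a claim from the talk.

```python
# Figures from the slide: 7 of 32 external drives failed within 5 years.
failed_drives = 7
total_drives = 32
years = 5

# Fraction of the insurance-copy drives that failed over the whole period.
cumulative_failure = failed_drives / total_drives

# Assumption (not from the talk): a constant, independent annual failure rate r,
# so that survival over the period is (1 - r) ** years.
annual_failure = 1 - (1 - cumulative_failure) ** (1 / years)

print(f"Cumulative failure over {years} years: {cumulative_failure:.1%}")
print(f"Implied annual failure rate: {annual_failure:.1%}")
```

Under that assumption, roughly one drive in five is lost over the period, which corresponds to a little under 5% per year.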

Less Famous Examples of Loss: Monterey Jazz Festival. Festival founded in 1958: longest running jazz festival in the world. Rich collection of recordings from inception, spanning over 50 years, in varying states of condition and decay. ~800 audio recordings, 1.6 TB. First 35 years were recorded on analog tape: 1 out of 100s was critically damaged. In 1994, the festival converted to DAT: of the first 40 DAT tapes reformatted, 5 were critically damaged.

Risks to Digital Information: media decay (bit rot); obsolescence (file format, software, hardware, media); technology failure (software, hardware, media); communication errors; lack of context (data but no codebook); ambiguous IP state (copyright, licensing); natural disasters; information attack; economic failure; organizational failure; human error.

Strategies: replication; migration (format, media, technology); emulation (hardware, software); encapsulation; redundancy and heterogeneity (technology, location, organization); succession planning. Digital Preservation (aka Long Term Access) is realized through a series of relays over time.
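
To illustrate why replication is paired with heterogeneity, here is a minimal sketch comparing two idealized extremes: copies that fail independently versus copies that all fail together. The function names and the choice of three copies are hypothetical; the per-copy loss probability reuses the 7-of-32 drive failure figure quoted earlier in the talk purely as an example value.

```python
def loss_probability_independent(p_single: float, copies: int) -> float:
    """Probability of losing every copy when failures are fully independent."""
    return p_single ** copies


def loss_probability_fully_correlated(p_single: float, copies: int) -> float:
    """Probability of losing every copy when all copies share one failure cause.

    Extra copies do not help in this extreme case, so the count is ignored.
    """
    return p_single


# Example value only: 7 of 32 drives failed within 5 years.
p_single = 7 / 32
copies = 3

print(f"Independent copies:      {loss_probability_independent(p_single, copies):.2%}")
print(f"Fully correlated copies: {loss_probability_fully_correlated(p_single, copies):.2%}")
```

Real systems sit somewhere between the two extremes. Varying technology, location and organization is a way of pushing copies toward the independent case.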

SDR Preservation Core: The Stanford Digital Repository (SDR) provides services to make scholarly resources available over the long term by helping ensure their integrity, authenticity, and reusability. To fulfill its mission, the SDR must be secure, sustainable and trustworthy.

Availability: Retrievable by the original depositor, or his/her designates (what went in can come out). Discoverable. Deliverable to new environments, in new contexts. Over time.

Integrity: Fixity of the original object: no data loss, no data corruption, no data augmentation. Wholeness: contains all of its essential bits, even if in modified form (appropriately documented, of course).
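
Fixity is commonly checked by recording a cryptographic digest when an object is deposited and recomputing it later. The following is a minimal sketch of that idea using Python's standard `hashlib`; the function names and the choice of SHA-256 are assumptions made for illustration, not a description of how the SDR implements fixity.

```python
import hashlib
from pathlib import Path


def compute_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large objects need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, expected_digest: str) -> bool:
    """Return True if the file still matches the digest recorded at deposit."""
    return compute_digest(path) == expected_digest
```

A typical use would be to store the result of `compute_digest` at ingest and run `verify_fixity` on a schedule. Any loss, corruption or augmentation of the bytes changes the digest and makes the check fail.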

Authenticity: That an object is what it purports to be. Encompasses integrity. And context: a description of the object in its original state. And (most importantly) provenance: where an object came from, and the chain of custody and processes from its point of origin.

Reusability: May be the object in its original form / format. May be a derived form, suitable for a specific purpose. Case study: what's more useful, an image of a newspaper page, or the full text of a newspaper page? Requires clear understanding of business purpose and what OAIS calls the designated community.

Security: Secure against leakage. Secure against vandalism. Secure against enhancement. A primary design consideration. A vital ingredient to trust.

Sustainability: Technically feasible and maintainable. Economically viable and maintainable. Organizational alignment and commitment. Able to adapt: technically (changes in technology, scale); economically (changes in costs, funding, recessions); organizationally (layoffs, staff changes, mergers, strategy shifts).

Trustworthiness: Perception of competence, security, long-term commitment. Prerequisite for confidence by depositors, funders and content consumers.

Preservation Mindset & Strategies: Resist the temptation to think of preserved objects as static. Migrations, versions, audits and disseminations all require constant attention. Can be quite liberating: preserve over time, not just at outset. Acknowledge, and love, your bit bucket. Remember that preservation is a journey, not a destination.

Technologically: Minimize dependencies. Encapsulate your metadata with your objects. Draw your burnline at the storage layer. Minimize correlated errors. Embrace redundancy. Embrace diversity.
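
One way to read "encapsulate your metadata with your objects" is to store each object next to a small, self-describing manifest, so that the package stays interpretable even if an external catalog or database is lost. The sketch below is a hypothetical illustration of that reading: the `write_package` function, the `manifest.json` name and the manifest fields are all assumptions, not a format described in the talk.

```python
import hashlib
import json
from pathlib import Path


def write_package(object_path: Path, package_dir: Path, metadata: dict) -> Path:
    """Copy an object into package_dir and write a manifest alongside it.

    Returns the path of the manifest file (hypothetical name: manifest.json).
    """
    package_dir.mkdir(parents=True, exist_ok=True)

    # Store the object's bytes inside the package directory.
    payload = object_path.read_bytes()
    (package_dir / object_path.name).write_bytes(payload)

    # Record a checksum and descriptive metadata next to the object itself,
    # in a plain-text format that needs no special software to read.
    manifest = {
        "filename": object_path.name,
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size_bytes": len(payload),
        "metadata": metadata,
    }
    manifest_path = package_dir / "manifest.json"
    manifest_path.write_text(
        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
    )
    return manifest_path
```

Keeping the manifest as plain JSON on the same storage as the object also fits the "minimize dependencies" advice, since reading it requires nothing beyond a text viewer.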

Design: Monolithic systems tend to serve poorly: complex, expensive, inflexible. Migration costs can capsize you. Keep it simple; have an exit plan for every component. Don't overspec; don't overbuild. We're very bad at predicting (especially bad without baseline experience).

Know Your Designated Community: Who will be using the content? How will they be using it? (Latency, delivery formats, security.) Offer (appropriate) access from the start.