Practical Experiences with Ingesting Materials for Long-Term Preservation

Similar documents
Mass Digitisation Enabling Access, Use and Reuse

CCS Content Conversion Specialists. METS / ALTO introduction

PREMIS, METS and preservation metadata:

Repository Interoperability and Preservation: The Hub and Spoke Framework

Introduction to. Digital Curation Workshop. March 14, 2013 SFU Wosk Centre for Dialogue Vancouver, BC

The National Digital Library Finna Among Digital Research Infrastructures in Finland

Optical Layout Recognition (OLR)

An introduction to the Metadata Encoding and Transmission Standard METS)

Paper Presented at 20th Business meeting

Building for the Future

The Case of the 35 Gigabyte Digital Record: OCR and Digital Workflows

DRS 2 Glossary. access flag An object access flag records the least restrictive access flag recorded for one of the object s files: ο ο

The OAIS Reference Model: current implementations

Writing a Data Management Plan A guide for the perplexed

Chapter 5: The DAITSS Archiving Process

Erkki Tolonen

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment

Long-term digital preservation of UNSWorks

Its All About The Metadata

PROCESSING AND CATALOGUING DATA AND DOCUMENTATION: QUALITATIVE

The Promise of PREMIS: background, scope and purpose of the Data Dictionary for Preservation Metadata

DIGITIZATION OF HISTORICAL INFORMATION AT THE NATIONAL ARCHIVES OF ZAMBIA: CRITICAL STRATEGIC REVIEW

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

Post Digitization: Challenges in Managing a Dynamic Dataset. Jasper Faase, 12 April 2012

UKOLN involvement in the ARCO Project. Manjula Patel UKOLN, University of Bath

METS as a Container and Metadata Approach for Sample Course Content

Metadata and Encoding Standards for Digital Initiatives: An Introduction

Recent developments in Finland Open Science, IT infrastructures

PROCESSING AND CATALOGUING DATA AND DOCUMENTATION - QUALITATIVE

LIMB Processing Release Notes

Alphabet Soup: A Metadata Overview Melanie Schlosser Metadata Librarian

This presentation is on issues that span most every digitization project.

EUROPEANA METADATA INGESTION , Helsinki, Finland

RUtgers COmmunity REpository (RUcore)

Aligning METS with the OAI-ORE Data Model

Digitisation of historic newspapers and voluntary digital deposit of newspaper pre-print files in the the National Library of Estonia

The e-depot in practice. Barbara Sierman Digital Preservation Officer Madrid,

Building Collections Using Greenstone

What is Islandora? Islandora is an open source digital repository that preserves, manages, and showcases your institution s unique material.

BUILDING A NEW DIGITAL LIBRARY FOR THE NATIONAL LIBRARY OF AUSTRALIA

Metadata Overview: digital repositories

Digital preservation activities at the German National Library nestor / kopal

Managing Learning Objects in Large Scale Courseware Authoring Studio 1

METS: Implementing a metadata standard in the digital library

How to Build a Digital Library

Text below in italics is directly from the LAMP Digitization Project Principles (

Protection of the National Cultural Heritage in Austria

Web-based workflow software to support book digitization and dissemination. The Mounting Books project

Collection Policy. Policy Number: PP1 April 2015

METS For The Cultural Heritage Community: A Literature Review

Protecting Future Access Now Models for Preserving Locally Created Content

DRI: Preservation Planning Case Study Getting Started in Digital Preservation Digital Preservation Coalition November 2013 Dublin, Ireland

Enhanced Processing of METS/MODS Library Metadata in CouchDB

DRI: Dr Aileen O Carroll Policy Manager Digital Repository of Ireland Royal Irish Academy

Safe Havens in a Choppy Sea:

Importance of cultural heritage:

Metadata Workshop 3 March 2006 Part 1

Archivists Toolkit: Description Functional Area

Linking Between and Within METS and PREMIS Documents

Digisam - Swedish National Coordination of Digitisation, Digital Preservation and Digital Access to Cultural Heritage

Digital Preservation DMFUG 2017

Florida Digital Archive (FDA) SIP Specification

Getting Started with the Digital Commonwealth. Robin L. Dale Director of Digital & Preservation Services LYRASIS

Building Consensus: An Overview of Metadata Standards Development

PAWN: A Novel Ingestion Workflow Technology for Digital Preservation. Mike Smorul, Joseph JaJa, Yang Wang, and Fritz McCall

Lessons Learned. Implementing Rosetta in the Harold B. Lee Library

Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector

Transfers and Preservation of E-archives at the National Archives of Sweden

COMMON SPECIFICATION FOR INFORMATION PACKAGES

Records Management and Interoperability Initiatives and Experiences in the EU and Estonia

Assessment of product against OAIS compliance requirements

Europeana Newspapers - A Gateway to European Newspapers Online

DAITSS Demo Virtual Machine Quick Start Guide

3. Technical and administrative metadata standards. Metadata Standards and Applications

The Swedish National Archives digital preservation. Mats Berggren, IT-department,

Creating Compound Objects (Documents, Monographs Postcards, and Picture Cubes)

The Reporter A Newspaper Digitization Project. Jenni Salamon Coordinator, Ohio Digital Newspaper PRogram

BHL-EUROPE: Biodiversity Heritage Library for Europe. Jana Hoffmann, Henning Scholz

Digitisation Standards

Using DSpace for Digitized Collections. Lisa Spiro, Marie Wise, Sidney Byrd & Geneva Henry Rice University. Open Repositories 2007 January 23, 2007

Wendy Thomas Minnesota Population Center NADDI 2014

Mitigating Preservation Threats: Standards and Practices in the National Digital Newspaper Program

Metadata Requirements for Digital Museum Environments

Managing Image Metadata

NDL Search -future development- Yasunao Kobayashi Assistant Director Library Support Division, Kansai-kan of the National Diet Library

An Overview and Case Study. Scott Yeadon

DAITSS Workflow Interface

Assessment of product against OAIS compliance requirements

For those of you who may not have heard of the BHL let me give you some background. The Biodiversity Heritage Library (BHL) is a consortium of

Draft Digital Preservation Policy for IGNCA. Dr. Aditya Tripathi Banaras Hindu University Varanasi

PID System for eresearch

Ex Libris Rosetta A Complete Digital Asset Management and Preservation System

Digital The Harold B. Lee Library

Different Aspects of Digital Preservation

From Individual Solutions to Generic Tools Digitization at the Max Planck Society. Digitization Day 2012, Geneva Andrea Kulas

15/06/2018 In Out, In Out, And Shake It All About. A Moving Story of Data

Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey.

Archival Information Package (AIP) E-ARK AIP version 1.0

NDNP: The Kentucky Edition

EUDAT Data Services & Tools for Researchers and Communities. Dr. Per Öster Director, Research Infrastructures CSC IT Center for Science Ltd

Transcription:

Practical Experiences with Ingesting Materials for Long-Term Preservation Esa-Pekka Keskitalo <firstname.lastname@helsinki.fi> 20.10.2011 Digital Preservation Summit 2011, Hamburg

Overview About the National Library of Finland Digitzation at the NLF Logistics Post-processing Metadata Preservation National Digital Library Initiative

National Library of Finland The National Library of Finland ensures the availability of the published national heritage. The Library develops library and information services together with the Finnish library network and other actors of the modern information society Circa 230 employees, budget 29 million

Organization and its Challenges NLF Admin Digitization process 230 km Preservation and Digitisation Centre Mikkeli Digitisation Centre of digitization and conservation (in Mikkeli) Research Library (Collections, services to customers) National Library Network Services (digital preservation, inofrmation systems) National Library of Finland Collections

Digitization at the NLF Based on a Strategy, And Policies Preservation Policy Collections Policy Digitization Policy Cataloguing Policy (under revision) Aligned with the National Digital Library Inititative NLF Startegy (will be revised in early 2012): http://www.nationallibrary.fi/infoe/organization/ nationallibrarystrategy_20062015_summary.html Preservation Policy & Digitization Policy: http://www.nationallibrary.fi/libraries/dimiko.html Collection Policy: http://www.nationallibrary.fi/services/kokoelmat.html Internationally acknowledged best practices, EU guidelines, standards

Criteria for Selection Critical masses; coherent collections For the NLF, digitization of single items is a non-strategic service Preservation: digitization for protection Demand Contents: national or international value

Challenges Long distance between locations: reliable tracking needed for large batches Gaps in cataloguing à integration of cataloguing and digitization Complex system of collections Cooperation over organizational boundaries Preservation vs Dissemination

Improving Logistics Old Process Decentralized workflow Material assembled at the collections Scanning at a remote location Conversion at another location Material returned to the collections New Process Seamless overview and control Batch created at the collections Gapless Status & Location Tracking at a glance All have access to the same web interface (project managers, scan operators, QA operators etc.) Finally, batch disassembled at collections

Need to track the items on every stage BATCH CREATION DELIVERY CHECK- IN DISBAND HELSINKI CHECK- IN SCANNING QA MIKKELI 1 POST- PROCESSING QA DELIVERY MIKKELI 2

Project View Select Project Remote Item Tracking Select Function

Item Tracking: Entities 1) Project Location in collections Genre (monograph, newspaper, etc.) Can be tracking only (no digitization or conversion involved) 2) Batch Collection of items (books, newspapers, manuscripts, etc.) 3) Physical Unit Single item (e.g. book, microfilm reel, box of parchments) Unique ID / Barcode 4) Logical Unit Smallest unit in item tracking can be book chapter, newspaper issue, part of a series, Assembled for transportation to digitization center

Post-Processing DocWorks (CCS) Identifying language and font type Cropping, deskewing, file format validation etc. Identification of structural elements Labour intensive how long can you go? OCR (when applicable); not proofread, but you can help: http://www.digitalkoot.fi/en/splash Identification of structural elements with DocWorks (CCS)

Crowdsourcing

METS wrapping for preservation XML Schema for creating a document that explains and structures a digital object and its metadata http://www.loc.gov/standards/mets/ Information may be embedded or be referenced to Uses: Transmit objects to others Rendering <metshdr> <dmdsec> <amdsec> <filesec> <structmap> <structlink> <behaviorsec> mets Header descriptive metadata Section administrative metadata Section file Section structural Map section structural Link section behavior Section

METS profiles Different content types require different profiles Monographs Ephemera Periodicals Parchment Fragments Etc.

METS Package Contents METS with MARC, MODS, PREMIS and MIX For each page: master image in JPEG2000 (lossless) access image in JPEG thumbnail OCRd text in ALTO XML Single PDF with hidden text layer Whole SIP compressed to a zip file

PREMIS in METS PREMIS distributed within different METS sections Common amdsec for the intellectual entity Events and Agents in digiprovmds Rights in rightsmd amdsec for each file PREMIS Object in techmd objectcharacteristicsextension containing MIX 2.0 PREMIS Events in digiprovmds Agents referenced from the common amdsec

National Digital Library (NDL) Initiative www.kdk.fi Improves access to digital materials and preservation of digital materials Comprises libraries, archives, and museums Until end of 2013

NDL User Interface Common User Interface Infrastructure Customizable by organization, region, community Integrated contents and services Ex Libris Primo Interoperability of content systems need a lot of work Work for NDL benefits Into production in 2012

NDL and Long-Term Preservation Work in progress Aim: a shared / centralized long-term preservation system Technology run by CSC IT Centre for Science (non-profit company owned by the Ministry od Education and Culture) Different models of service for different organizations

NDL Specifications for Preservation Accepted file formats and their profiles Accepted METS profiles Specifications for metadata, e.g. Minimum requirements Expression of access rights and restrictions Identifiers Transfer procedures between systems

Links http://digi.kansalliskirjasto.fi https://www.doria.fi/handle/10024/69173