It's All About the Metadata


Best Practices Exchange 2013
It's All About the Metadata
Mark Evans, Digital Archiving Practice Manager
11/13/2013

Agenda
- Why metadata is important
- Metadata landscape
- A flexible approach
- Case study: KDLA
- Conclusions
- Demonstration using Preservica

Why Is Metadata Important?
[Slide graphic: rows of binary digits]
A binary file is meaningless on its own.

The Big Question
How much metadata is necessary to preserve digital objects?

We Have Some Guidance
Open Archival Information System (OAIS) Reference Model
- ISO standard
- Well adopted
- Explains what, not how
- Describes functions and an information model
Concept of Information Packages
- Three types: Submission (SIP), Archival (AIP), Dissemination (DIP)
- Contain aggregates of information objects
[Diagram: a Data Object (the bits), interpreted using its Representation Information, yields an Information Object]

OAIS Information Package
An Information Package can contain four types of Information Object:
- Content Information
- Preservation Description Information: Provenance, Context, Reference, Fixity, Rights
- Packaging Information
- Descriptive Information
Each Information Object has associated Representation Information.
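
The package structure on this slide can be sketched as a small data model. This is only an illustration under stated assumptions: the class and field names below are hypothetical and are not taken from the OAIS standard's own notation or from any product schema.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ContentInformation:
    data_object: bytes        # the raw bits
    representation_info: str  # what is needed to interpret the bits


@dataclass
class PreservationDescriptionInformation:
    provenance: str = ""
    context: str = ""
    reference: str = ""
    fixity: str = ""
    rights: str = ""


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging: dict = field(default_factory=dict)
    descriptive: dict = field(default_factory=dict)

    def missing_pdi(self) -> list:
        """Return the names of PDI categories that are still empty."""
        return [f.name for f in fields(self.pdi) if not getattr(self.pdi, f.name)]


# Illustrative values only
package = InformationPackage(
    content=ContentInformation(b"%PDF-1.4 ...", "PDF v1.4"),
    pdi=PreservationDescriptionInformation(
        reference="HF2653-001-abc", fixity="SHA1=2323A563DF4329"
    ),
)
print(package.missing_pdi())  # ['provenance', 'context', 'rights']
```

A check like `missing_pdi()` is one way to see how much of the "big question" a given package already answers before it is accepted as an AIP.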

Example: Corresponding Metadata
Descriptive metadata
  ID: HF2653-001-abc
  Author: Loeffler, Dean And Lamning
  Summary: A bill for an act relating to improvements to the capital area.
Descriptive metadata
  Collection: HF2653-001-abc
  Representation: file1.jpg, file2.jpg
  Parent record: AAABBB123
Technical metadata
  HF2653.pdf, 25654 bytes, created 10/5/2011, valid and well formed
  SHA1=2323A563DF4329
Application information
Data format
  PDF v1.4, Portable Document Format [fmt/41]
Binary sequence
  11111111 11011000 11111111 11100000 00000000 00010000 01001010 01000110 01001001 01000110
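
The technical layer of this example (size, checksum, format) is the kind of metadata a system can derive from the bits alone. The following is a minimal sketch using only the Python standard library. The signature table is a hypothetical two-entry stand-in for a real format registry, and the sample bytes are the ten bytes shown in the slide's binary sequence.

```python
import hashlib

# Hypothetical, tiny signature table; real tools consult a format registry.
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\xff\xd8\xff": "JPEG",
}


def technical_metadata(name: str, data: bytes) -> dict:
    """Derive size, SHA-1 checksum, and a format guess from the raw bytes."""
    fmt = next(
        (label for magic, label in SIGNATURES.items() if data.startswith(magic)),
        "unknown",
    )
    return {
        "name": name,
        "size_bytes": len(data),
        "sha1": hashlib.sha1(data).hexdigest().upper(),
        "format": fmt,
    }


def fixity_ok(data: bytes, recorded_sha1: str) -> bool:
    """Recompute the checksum and compare it with the recorded value."""
    return hashlib.sha1(data).hexdigest().upper() == recorded_sha1.upper()


# The ten bytes from the slide's binary sequence
sample = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b"JFIF"
meta = technical_metadata("file1.jpg", sample)
print(meta["size_bytes"], meta["format"])
print(fixity_ok(sample, meta["sha1"]))
```

Storing the checksum at ingest and recomputing it later is what makes a fixity check possible; any change to the bytes makes `fixity_ok` return `False`.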

Metadata Landscape
What metadata does a digital preservation system need?
Understand structural information:
- Hierarchy of records
- Relationships between records
- Hierarchy of files
- Relationships between records and files
Understand technical information:
- Technology-dependent information: determine whether actions are needed (e.g., obsolete format)
- Technology-independent information: verify preservation actions

Metadata Landscape II
Standards, standards, standards!
Dublin Core, PREMIS, METS, EAD, MODS, PBCore, MIX, FGDC, etc.

Metadata Landscape III
Let's not forget about descriptive metadata. The system needs to:
- Hold metadata with the appropriate entity in the hierarchy
- Allow users to view metadata
- Allow users to add and edit metadata
- Allow users to search on metadata
- Still convert it if needed (e.g., for export)

Metadata Sources
Have to deal with lots of ingest sources
- Each ingest source potentially contains metadata
- Unlikely to be a consistent scheme across sources: could be content specific, standards based, or custom
Traditional approach is to create archival metadata
- Can be manually intensive
- Potential for lots of repetition
- Realistic only at high levels?
- Can delay accessioning
- Which solutions support it?

Metadata Sources
Could convert existing metadata to a normalized form:
[Diagram: Source 1, Source 2, Source 3 -> Convert -> Schema A -> OAIS Digital Archive]
Or force a standard on the creators

Metadata Sources
This would reduce the problem. However:
- The combined schema may change over time
  - Potential for subsequent conversions
  - Or cope with multiple versions
  - May require software changes
- Each conversion is a potential point of loss
- The adopted archival schema may not provide full coverage for the source, e.g., EAD arrives and DC is the adopted schema
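
The "point of loss" can be made concrete with a small sketch. It assumes a hypothetical and deliberately incomplete crosswalk from a few EAD-style element names to Dublin Core-style fields. The mapping and the sample record are illustrative only, not an official crosswalk.

```python
# Hypothetical crosswalk: source field name -> adopted-schema field name
CROSSWALK = {
    "unittitle": "title",
    "origination": "creator",
    "unitdate": "date",
}


def convert(record: dict) -> tuple:
    """Split a source record into (converted fields, fields with no mapping)."""
    converted, lost = {}, {}
    for name, value in record.items():
        if name in CROSSWALK:
            converted[CROSSWALK[name]] = value
        else:
            lost[name] = value  # no home in the adopted schema
    return converted, lost


# Illustrative source record
source = {
    "unittitle": "Capital area improvements",
    "unitdate": "2011",
    "physdesc": "2 photographs",
    "arrangement": "By accession",
}
converted, lost = convert(source)
print(converted)
print(sorted(lost))
```

Anything that ends up in `lost` is exactly the coverage gap the slide warns about, which is the motivation for keeping the original metadata rather than only its converted form.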

Metadata Sources
Desirable to reuse existing metadata
[Diagram: Source 1 (Schema 1), Source 2 (Schema 2), Source 3 (Schema 3) -> OAIS Digital Archive]
Can we cope with heterogeneous schemas? Need to examine types of metadata.

A Flexible Approach
Fixed schema + embedding
Define a schema that:
- Understands structural information
- Understands technical information, e.g., PREMIS
- Embeds any descriptive metadata
- Embeds any additional technical metadata, e.g., MIX
- Can embed multiple metadata schemas for each entity
The schema supports standard OAIS functions:
- Ingest, Access, Data Management, Storage
- Controls preservation
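
A minimal sketch of the "fixed schema + embedding" idea, using Python's standard `xml.etree.ElementTree`. The wrapper element names (`Record`, `Title`, `Metadata`) and the `schemaURI` attribute are hypothetical and chosen for illustration; they are not the actual XIP schema. The `example.org` URI is likewise a placeholder.

```python
import xml.etree.ElementTree as ET


def build_record(ref: str, parent_ref: str, title: str, fragments: list) -> ET.Element:
    """Wrap structural info in fixed elements and embed source XML untouched."""
    record = ET.Element("Record", {"ref": ref, "parent": parent_ref})
    ET.SubElement(record, "Title").text = title
    for schema_uri, xml_text in fragments:
        holder = ET.SubElement(record, "Metadata", {"schemaURI": schema_uri})
        holder.append(ET.fromstring(xml_text))  # embedded as is, not converted
    return record


DC_NS = "http://purl.org/dc/elements/1.1/"
dc_fragment = (
    f'<metadata xmlns:dc="{DC_NS}">'
    "<dc:title>Capital area bill</dc:title></metadata>"
)
local_fragment = "<accession><number>2013-042</number></accession>"

record = build_record(
    "HF2653-001-abc",
    "AAABBB123",
    "HF2653",
    [(DC_NS, dc_fragment), ("http://example.org/accession", local_fragment)],
)
print(len(record.findall("Metadata")))
print(ET.tostring(record, encoding="unicode"))
```

The fixed part (reference, parent, title) is what the system itself needs to understand. Each embedded fragment keeps its native schema, so several schemas can sit side by side on the same entity.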

Registering Schema
Any metadata schema can be registered with SDB.

Meeting Ingest Needs
Users can embed and validate XML using the SIP Creator tool
- Either from a file
- Or by cutting and pasting into the tool
Ingest workflow steps can be written or modified to embed metadata in XIP to support automated ingest
- Source metadata could be from files or other systems

Meeting Descriptive Needs
Descriptive functions can use XSLT:
- View: transform XML to static HTML
- Edit: transform XML to dynamic HTML
- Transform: transform XML to an alternative XML schema
Search: use Solr

Example Viewing Metadata

Example Editing Metadata

Example: Editing Metadata
Authorised users can:
- Add descriptions
- Add new metadata schemes
Can have multiple schemas in parallel:
- Can keep original metadata from source systems
- Allows additional archival information to be kept
- Some potential for overlap

Schema Administration
To add a new source (or when source metadata changes):
- Upload the schema
- Embed metadata in the correct structural entities
- Add a view transform
- Add an edit transform
- Add a transform for schema conversion (if wanted)
- Configure Solr for fielded search

Advantages
Easy to add a new source:
- Very low process overhead
No waiting required:
- Metadata can be consumed as is
- No collection of things needing to be archived!
- Can do appraisal, catalogue updates, etc. later as needed
No loss of existing metadata:
- Even if the original metadata is transformed, it is kept in the archival audit trail
Resilient to change
Reduces barriers to starting:
- Don't have to get it just right up front

Disadvantages
Potential lack of consistency across descriptions:
- Within a particular scheme
- Across differing schemes
Overlap between schemas:
- Could have multi-schema edit transforms
- Fielded searches become harder (have to pick the schema to pick the field)
These can be eliminated by re-cataloguing after ingest if required.
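
Why fielded search gets harder with several schemas in parallel can be sketched as follows. This is an in-memory stand-in written with the Python standard library, not a Solr configuration. The field-to-path table and the sample records are hypothetical.

```python
import xml.etree.ElementTree as ET

# One logical search field needs a separate path for every embedded schema.
FIELD_PATHS = {
    "title": {
        "dc": ".//{http://purl.org/dc/elements/1.1/}title",
        "ead": ".//unittitle",
    },
}


def search(records: list, field_name: str, term: str) -> list:
    """Return refs of records whose logical field contains the term."""
    hits = []
    for ref, schema, xml_text in records:
        path = FIELD_PATHS.get(field_name, {}).get(schema)
        if path is None:
            continue  # this schema has no mapping for the field
        root = ET.fromstring(xml_text)
        if any(term.lower() in (el.text or "").lower() for el in root.iterfind(path)):
            hits.append(ref)
    return hits


DC = 'xmlns:dc="http://purl.org/dc/elements/1.1/"'
records = [
    ("rec1", "dc", f"<metadata {DC}><dc:title>Capital area improvements</dc:title></metadata>"),
    ("rec2", "ead", "<did><unittitle>Photographs of the capital area</unittitle></did>"),
    ("rec3", "ead", "<did><unittitle>Minutes</unittitle></did>"),
]
print(search(records, "title", "capital"))  # ['rec1', 'rec2']
```

Every new schema adds another row to the mapping table, which is the maintenance cost the slide describes; re-cataloguing into one schema after ingest removes that cost.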

KDLA Case Study
Existing digital repository: DSpace
- Provides public access
- Limited preservation
Recent subscription to Preservica
- Fills a preservation gap
- Not intended for public access
What about metadata?
- Preservation
- Description
- Accessioning

KDLA Electronic Collections
Web sites, publications, minutes, geospatial datasets (maps), databases, digital images, video, and audio recordings.

File/Folder vs Item
Case study of photographs: merging
- A file/folder-based description system (in-house file server and Preservica)
  - Accession based
  - Preserves the groupings and folder labels and attaches accession metadata to every object
- With an item-level system (DSpace or CONTENTdm)
[Diagram: DSpace package; Preservica SIP + accessioning metadata]

PREMIS (PREservation Metadata: Implementation Strategies)
Preservica's metadata schema contains PREMIS data elements, which provide information about the provenance of the AIP.
PREMIS documents relationships between:
- Intellectual Entities
- Objects
- Events (normalization, audits, migration)
- Agents (based on logon)
Preservation events:
- Ingest
- Virus scan
- Sensitive data scan
- File format conversion
- Checksum calculation and integrity checks
- Normalization
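
The event side of this can be sketched as a simple append-only audit trail. The field names below are loosely modelled on the PREMIS Event entity (type, date/time, outcome, linked agent, linked object), but the classes themselves are hypothetical and are not Preservica's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    event_type: str   # e.g. "ingest", "virus scan"
    object_id: str    # the object the event happened to
    agent: str        # who or what performed it (e.g. the logged-on user)
    outcome: str
    when: datetime


class AuditTrail:
    """Append-only list of preservation events."""

    def __init__(self) -> None:
        self._events = []

    def record(self, event_type: str, object_id: str, agent: str,
               outcome: str = "success") -> Event:
        event = Event(event_type, object_id, agent, outcome,
                      datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, object_id: str) -> list:
        """All events for one object, in the order they were recorded."""
        return [e for e in self._events if e.object_id == object_id]


trail = AuditTrail()
trail.record("ingest", "HF2653.pdf", "archivist1")
trail.record("virus scan", "HF2653.pdf", "system")
trail.record("checksum calculation", "HF2653.pdf", "system")
trail.record("ingest", "file1.jpg", "archivist1")
print([e.event_type for e in trail.history("HF2653.pdf")])
# ['ingest', 'virus scan', 'checksum calculation']
```

Read back per object, the trail is the provenance record: it shows what was done to the object, by which agent, and with what outcome.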

Preservica SIP Creator Ingest

Accession Information Example

Conclusions
Need to use a consistent schema for:
- Structural metadata
- Technical metadata
Can support multiple descriptive / technical schemas:
- Still allow fielded view / edit / search
Hence can support:
- Multiple sources and ingest of native schemas
- Multiple versions of schemas
Consolidation is good but not vital:
- Could occur post-ingest
Lowers ingest barriers, provides flexibility, minimizes loss

Questions
mark.evans@tessella.com
http://www.digital-preservation.com