Mass Digitisation Enabling Access, Use and Reuse

National Digitisation Centre, Mikkeli, National Library of Finland
Triangelipäivät, 30.10.2008
Tiina Ison, Senior Analyst, Project Manager

Organisation of Speech

THE CONTEXT - National Infrastructure Development
1. National Projects: DL and Mass Digitisation
2. Access, Use and Re-Use
3. Memory Organisations

IN PRACTICE - Digitisation Production Unit
1. Mass Digitisation Project 2008-2009 (Ministry of Education (OPM) project)
2. National Digitisation Centre, NLF
3. Scaling up towards Mass Digitisation Processes
4. Reviewing structural analysis of content - Level of granularity
5. Providing access to resources
6. Metadata creation and capture
7. How about Use and Re-use?

WRAP UP - two national projects, one wider process

THE CONTEXT - National Infrastructure Development
Persona: Tiina as Senior Analyst

1. Education Ministry Funding and National Research Infrastructure Development Projects:
- National Digital Library Project 2008-2011 (OPM)
- Mass Digitisation Project 2008-2009 (OPM)
- Hub and Spoke Relationship (NRIDP)
- Closer integration needed (NRIDP)

Ministry of Education slide (diagram labels):
- New Infrastructure
- Digitisation
- Existing Infrastructure
- Uniform Access Interface (reference data and full text)
- Production Environments and Systems: Libraries, Archives, Museums, AV, Born Digital
- Long Term Preservation System (metadata and documents)

Source: Survey and Roadmap for Research Infrastructure in Finland, Social Sciences and Humanities (SSH) Panel. 7.10.2008 http://www.tsv.fi7tik/ssh report_071008_kokousmateriaali.pdf

2. Users and Communities want Access, Use and Re-use

Enabling Use and Re-Use
- USER context (build for mobility, user knowledge state, user knowledge construction, etc.)
- COMMUNITY context: Communities of Practice (authoring, contribution, building knowledge commons)
- Diagram labels: the two contexts "interact"; "DEFINE"; "governs FUNCTIONALITY"

Enabling Access
- Presentation Layers: User Interface, Search, Navigation (Navigate, Browse, Contribute)
- Resource Discovery Tools (Search and Discovery Services)
- Metadata: Taxonomy, Ontology, Tagging
- Digital Resource: Item, Collection, Collaboration, Other

3. Memory Organisations - Museums, Archives and Libraries
- National Digital Library: front end
- Production Units: back end
- CRITICAL MASS OF CONTENT

Production Units
1. Logistics for movement of source material
2. Logistics for workflow and storage of digital objects
3. Logistics for metadata creation and capture throughout the production chain

Production dilemma (message to National DL Workshop, 16.10.): the chain starts at creation of the digital object, not at ingest to a DL.
PAS (long-term preservation)? Ingest: ACCESS and PRESERVATION

IN PRACTICE - Digitisation as Production Unit
Persona: Tiina as Project Manager

1. Mass Digitisation Project (OPM Digitisation Project, "Digitointi Hanke")
- Mass digitisation project funding from the Ministry of Education, 2008-2009
- Project aim: mass digitisation production logistics, management of the whole production chain, and maximising access as widely as possible (enabling access)
- Library-wide workshops held in summer and autumn 2008
- Metadata Working Group and Care & Handling Working Group
- CCS consultancy on production logistics and ingest
- Tools: Item Tracking, Scan Client, docworks (site license)

2. National Digitisation Centre

Now: Digitisation Centre of NLF at Mikkeli
- Project-based digitisation (excluding historical newspapers)
- Digitisation processes and workflows internal to Mikkeli (scanning, conversion, web)
- Production output has used interoperable, XML-based metadata standards (MODS, METS) since 2004
- Access is provided via a web site maintained by Mikkeli: http://digi.lib.helsinki.fi
- Storage locally and at Helsinki

Scaling up towards: National Digitisation Centre as a Production Unit at Mikkeli
- Mass digitisation of cultural heritage (text, audio)
- Library-wide digitisation process
- Cross-organisational production lines (i.e. ephemera with Archives ERS project)
- Production output using interoperable, XML-based metadata standards in MODS, METS, PREMIS and MARC XML, heading towards a METS Profile (KDK?)?
- Ingestion to National DL functionality and information architecture, single federated search, Europeana, EROM?
- PAS ingestion into a National Trusted Repository?

3. Establishing Mass Digitisation Processes

Helsinki: care and handling of physical item
Mikkeli: Scan Client Module, Item Tracking Module, docworks (since 2004)

1. Selection
2. Verify catalogue, create holding record (catalogued and non-catalogued items)
3. Assign bar code IDs (1:1)
4. Prepare trolleys in batches
5. Transport (230 km)
6. Inspection: enter into Item Tracking (bar code in, scanner selection)
7. Scanning (automatic, manual)
8. Conversion: level of structural analysis; output as JPEG2000, TXT, PDF; OCR, METS
9. Remote QA (Helsinki)
10. Document delivery and ingestion to presentation system
11. Item returns

Metadata and catalogue notes
- Voyager holding record entry; update status in library catalogue using field (583)
- Ingest IN: descriptive metadata into docworks, automatically
- Capture technical metadata at scanning
- Create structural metadata with docworks
- Item returns: update status in library catalogue
- Ingest Out: page number? How about authority control?
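The eleven steps above form a strict sequence that an item tracking module has to enforce per bar code. The following is a minimal sketch of that idea only; it is not the Item Tracking Module itself, and the class name, step names and sample bar code are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Workflow steps in the order listed on the slide (names are illustrative).
STEPS = [
    "selection",
    "verify_catalogue",
    "assign_barcode",
    "prepare_trolley",
    "transport",
    "inspection",
    "scanning",
    "conversion",
    "remote_qa",
    "delivery_and_ingestion",
    "item_return",
]


@dataclass
class TrackedItem:
    barcode: str                                   # 1:1 with the physical item
    completed: list = field(default_factory=list)  # steps finished so far

    def next_step(self):
        """Return the next pending step, or None once the item has been returned."""
        if len(self.completed) < len(STEPS):
            return STEPS[len(self.completed)]
        return None

    def advance(self, step):
        """Record a finished step and refuse steps taken out of order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.completed.append(step)


# Hypothetical item that has arrived in Mikkeli and passed inspection.
item = TrackedItem(barcode="HYPOTHETICAL-0001")
for step in STEPS[:6]:
    item.advance(step)
print(item.next_step())  # scanning
```

Keeping the step order in one list means the Helsinki and Mikkeli ends of the chain share a single definition of where an item is.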

4. Level of Structural Analysis

Determine the level of structural analysis:
- Monographs: chapter level?
- Newspapers: article level?
- Ephemera: none?

Logical structure: chapters, articles, etc.
Physical structure: pages, columns, captions, footnotes, etc.

Turku Dissertation Papers
- Metadata working group
- Catalogue: Pasi Koste, Sirkka Havu
- Cataloguer: Pasi Koste
- Research community: submission to requirements
- Prioritisation of elements -> researchers really want to enrich the text because of the quality of the OCR
- Remote QA for researchers?
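One way to make the granularity decision explicit in a conversion workflow is a small lookup per material type. This is a sketch only: the values mirror the open questions on the slide rather than any agreed decision, and the names are assumptions.

```python
# Tentative defaults taken from the slide's open questions; none is a decision.
STRUCTURAL_LEVEL = {
    "monograph": "chapter",
    "newspaper": "article",
    "ephemera": None,  # no logical structure captured
}


def logical_level(material_type):
    """Return the logical unit to mark up; unknown material types fail loudly."""
    try:
        return STRUCTURAL_LEVEL[material_type]
    except KeyError:
        raise ValueError(f"no structural level agreed for {material_type!r}") from None


print(logical_level("newspaper"))  # article
```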

5. Provision of Access (currently)
- Simple search
- Advanced search
- Full text search (OCR) with text highlighting
- Browsing by newspaper name
- Article categorisation according to the 1800s (Eero Hyönen: potential for ontology!)
- Fuzzy searching (spelling)

For historical newspapers; functionality development for monographs, journals and ephemera
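Fuzzy searching matters for OCR text because recognition errors produce near-miss spellings. The sketch below illustrates the idea with Python's standard `difflib`. It is not how the presentation system implements its search, and the function name, cutoff and sample words are assumptions.

```python
import difflib


def fuzzy_find(query, ocr_words, cutoff=0.75):
    """Return OCR tokens that approximately match the query, best match first."""
    vocabulary = sorted({w.lower() for w in ocr_words})
    return difflib.get_close_matches(query.lower(), vocabulary, n=5, cutoff=cutoff)


# "Helsinkl" stands in for a typical OCR misreading of "Helsinki".
words = ["Helsinki", "Helsinkl", "Mikkeli", "sanomalehti"]
matches = fuzzy_find("Helsinki", words)
print(matches)
```

A production index would normally use a search engine's own fuzzy query support rather than comparing against the whole vocabulary.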

6. Creation and Capture of Metadata
- Persistent identifiers of source material and digital object (citation)
- Unique resource identifiers (URI)
- Administrative metadata
- Descriptive metadata (ingestion from library catalogue, MARC to MODS)
- Technical metadata (MIX): image quality, scanner, digital object provenance
- Structural metadata (METS): granularity of search (URN)
- Preservation metadata (PREMIS)
- Rights management metadata (PREMIS): copyright, access restrictions, licensing, i.e. Creative Commons permission rights

docworks: automatic output of administrative, technical and structural metadata wrap
CAN INGEST TO DLs (National, Europeana etc.)
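To show how the descriptive and structural layers listed above fit together, here is a skeletal METS document with an embedded MODS title, built with the standard library. It is a hedged sketch, not docworks output: the function name, IDs and sample values are assumptions, and the result is not validated against the METS schema or any profile.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS)
ET.register_namespace("mods", MODS)


def build_mets(title, page_ids):
    """Wrap a MODS title and a physical page list in a skeletal METS document."""
    mets = ET.Element(f"{{{METS}}}mets")

    # Descriptive metadata section: MODS embedded via mdWrap/xmlData.
    dmd = ET.SubElement(mets, f"{{{METS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    mods = ET.SubElement(xml_data, f"{{{MODS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS}}}title").text = title

    # Structural metadata: one physical structMap with a div per page.
    struct = ET.SubElement(mets, f"{{{METS}}}structMap", TYPE="PHYSICAL")
    volume = ET.SubElement(struct, f"{{{METS}}}div", TYPE="volume", DMDID="DMD1")
    for order, page_id in enumerate(page_ids, start=1):
        ET.SubElement(volume, f"{{{METS}}}div", TYPE="page", ID=page_id, ORDER=str(order))
    return mets


# Hypothetical two-page item.
doc = build_mets("Example title", ["PAGE1", "PAGE2"])
print(ET.tostring(doc, encoding="unicode"))
```

Technical (MIX) and preservation or rights (PREMIS) metadata would sit in an `amdSec`, which this sketch leaves out for brevity.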

7. How about Use and Re-Use?

Digitisation Centre produces output:
- digital objects with interoperable metadata
- URIs that link resources semantically
- Currently ensuring that citation is possible with Persistent Identifiers (physical item, digital object)
- URNs for identifying down to structural level

IPR? Copyright? Access restrictions? Licensing (Creative Commons)? Learning objects? Community contribution?
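Identifying "down to structural level" can be pictured as deriving page and article identifiers from an object-level identifier. The sketch below uses the reserved `urn:example` namespace and an invented part syntax. It is purely illustrative and does not reflect the URN scheme the library actually assigns.

```python
import re

# Characters allowed in one path part of this illustrative scheme.
_PART = re.compile(r"^[a-z0-9]+$")


def structural_id(base, *parts):
    """Derive an identifier for a structural unit below an object-level identifier."""
    for part in parts:
        if not _PART.match(part):
            raise ValueError(f"invalid identifier part: {part!r}")
    return ":".join([base, *parts])


object_id = "urn:example:digi:0001"                 # hypothetical object-level identifier
page_id = structural_id(object_id, "p12")           # page 12 of that object
article_id = structural_id(object_id, "p12", "a3")  # third article on page 12
print(article_id)  # urn:example:digi:0001:p12:a3
```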

Wrap Up

One organisation-wide process for digital content
- Unique identification of physical source material
- Automatic ingestion of descriptive metadata from library catalogue
- Remote QA processes for non-catalogued items for post processing
- Ingestion back to library catalogue on discussion table
- Potential for researcher contribution for metadata, OCR text enrichment
- Output: interoperable metadata in MODS, METS, heading for PREMIS

Memory Organisations
- Struggling with concept of production unit
- Pressure to build critical mass of content online
- Pressure for provision of access, use, re-use
- As production units, differing levels of maturity
- Metadata chain

One national process - for long term preservation
National Digital Library (KDK): great time for co-operation