International Implementation of Digital Library Software/Platforms 2009 ASIS&T Annual Meeting Vancouver, November 2009

Similar documents
GETTING STARTED WITH DIGITAL COMMONWEALTH

Building for the Future

Maennerchor Project Digital Collection

Greenstone Publications

The Reporter A Newspaper Digitization Project. Jenni Salamon Coordinator, Ohio Digital Newspaper PRogram

GUIDELINES FOR CREATION AND PRESERVATION OF DIGITAL FILES

Getting Started with the Digital Commonwealth. Robin L. Dale Director of Digital & Preservation Services LYRASIS

Mary Manning Eric Schnell

Text below in italics is directly from the LAMP Digitization Project Principles (

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

CONTENTdm & The Digital Collection Gateway New Looks for Discovery and Delivery

The Argus Digital Collection: A Digital Archive for Access and Preservation

The Case of the 35 Gigabyte Digital Record: OCR and Digital Workflows

University of British Columbia Library. Persistent Digital Collections Implementation Plan. Final project report Summary version

Bringing the Voices of Communities Together:

Best practices for producing high quality PDF files

How to Build a Digital Library

Building Collections Using Greenstone

Protecting Future Access Now Models for Preserving Locally Created Content

NEW YORK PUBLIC LIBRARY

Grey Literature and Digital Preservation: Standards in Practice GL17 Pre-Conference Workshop 30 November 2015, Amsterdam.

Post Digitization: Challenges in Managing a Dynamic Dataset. Jasper Faase, 12 April 2012

DIGITIZATION OF HISTORICAL INFORMATION AT THE NATIONAL ARCHIVES OF ZAMBIA: CRITICAL STRATEGIC REVIEW

Comparing Open Source Digital Library Software

Staff Sourcing for Digitization

Edward M. Corrado and Sandy Card: ELUNA 2011

From The European Library to The European Digital Library. Jill Cousins Inforum, Prague, May 2007

RLG Model Request for Information (RFI) for Digital Imaging Services

PDF/A - The Basics. From the Understanding PDF White Papers PDF Tools AG

Preserving & Digitizing the Kent Tribune Newspaper. Sandy Halem, Kent Historical Society Jenni Salamon, Ohio History Connection

CONTENTdm 4.3. Russ Hunt Product Specialist Barcelona October 30th 2007

Mitigating Preservation Threats: Standards and Practices in the National Digital Newspaper Program

The Open Archives Initiative and the Sheet Music Consortium

Preservation and Access of Digital Audiovisual Assets at the Guggenheim

If you build it, will they come? Issues in Institutional Repository Implementation, Promotion and Maintenance

Data publication and discovery with Globus

Metadata and Encoding Standards for Digital Initiatives: An Introduction

Compound or complex object: a set of files with a hierarchical relationship, associated with a single descriptive metadata record.

MALAYSIA THESES ONLINE (MYTO): AN APPROACH FOR MANAGING UNIVERSITIES ELECTRONIC THESES AND DISSERTATIONS

ABBYY FineReader 14 YOUR DOCUMENTS IN ACTION

Elba Project. Procedures and general norms used in the edition of the electronic book and in its storage in the digital library

Nuno Freire National Library of Portugal Lisbon, Portugal

Equipment TraderTM. equipment.treetrader.com

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment

BHL-EUROPE: Biodiversity Heritage Library for Europe. Jana Hoffmann, Henning Scholz

Working with Islandora

Striving for efficiency

ACDH AUSTRIAN CENTRE FOR DIGITAL HUMANITIES

NHS Fife. 2015/16 Audit Computer Service Review Follow Up

RUtgers COmmunity REpository (RUcore)

Software for Digital Library

Delivering your special collections online: Digitization and CONTENTdm

Multilingual Information Access for Digital Libraries The Metadata Records Translation Project

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

Qualification details

Next Generation Library Catalogs: opportunities. September 26, 2008

NDNP: The Kentucky Edition

Importance of cultural heritage:

Registry Interchange Format: Collections and Services (RIF-CS) explained

Assigns a persistent identifier that will always point to the object and/or its metadata.

access to reformatted and born digital content regardless of the challenges of media failure and technological

Metadata Workshop 3 March 2006 Part 1

Paper Presented at 20th Business meeting

University of Bath. Publication date: Document Version Publisher's PDF, also known as Version of record. Link to publication

PDF Specification for IEEE Xplore (Part A-Core Requirements)

Alphabet Soup: A Metadata Overview Melanie Schlosser Metadata Librarian

Digital Preservation Efforts at UNLV Libraries

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

COAR Interoperability Roadmap. Uppsala, May 21, 2012 COAR General Assembly

Smarter Document Capture

Readiris Pro 9 for Mac, the New Release of I.R.I.S. Flagship Product Sets a New Standard for OCR on the Mac Platform

Strategic Action Plan. for Web Accessibility at Brown University

Hello, I m Melanie Feltner-Reichert, director of Digital Library Initiatives at the University of Tennessee. My colleague. Linda Phillips, is going

Digibess: thanks Islandora! Arcidosso Italy March, 20-22, Giancarlo Birello, Anna Perin IT office and Library CNR-Ceris

NSDL Technical Systems Transition. Overview of technical systems transition September 8, 2011

Digitizing for Access Promoting the Use of Our Collections

Using DSpace for Digitized Collections. Lisa Spiro, Marie Wise, Sidney Byrd & Geneva Henry Rice University. Open Repositories 2007 January 23, 2007

UWDCC Case Study. Supplementing DPOE Workshop 6 June Steven Dast Digital Asset Librarian UW Digital Collections Center

Creating a System for the Online Delivery of Oral History Content

The Ohio State University's Knowledge Bank: An Institutional Repository in Practice

Interim Report Technical Support for Integrated Library Systems Comparison of Open Source and Proprietary Software

How do Small Archives Steward their Moving Image and Sound Collections? A Qualitative Study

DOWNLOAD PDF EDITING TEXT IN A SCANNED FILE

Creating Compound Objects (Documents, Monographs Postcards, and Picture Cubes)

Creating descriptive metadata for patron browsing and selection on the Bryant & Stratton College Virtual Library

Introduction

Creating a System for the Online Delivery of Oral History Content

Issues and Opportunities Associated with Federated Searching

Search Interoperability, OAI, and Metadata

Refinement of digitized documents through recognition of mathematical formulae

Evaluation of Islandora & SobekCM

Council of Chief Librarians Electronic Access & Resources Committee Review. CountryWatch. The Arm Chair Travelers Database. Review Date: 19 April 2016

Deposit-Only Service Needs Report (last edited 9/8/2016, Tricia Patterson)

How to contribute information to AGRIS

State Government Digital Preservation Profiles

Non-text theses as an integrated part of the University Repository

Open Source Software Packages for E-Resource Management

Addressing the E-Journal Preservation Conundrum: Understanding Portico

Anne J. Gilliland-Swetland in the document Introduction to Metadata, Pathways to Digital Information, setting the Stage States:

Distributed Services Architecture in dlibra Digital Library Framework

Transcription:

Newspaper Digitization Project at the Press Institute of Mongolia International Implementation of Digital Library Software/Platforms 2009 ASIS&T Annual Meeting Vancouver, November 2009 Krystyna K. Matusiak Digital Collections Librarian University of Wisconsin Milwaukee Milwaukee, USA

Background 1924 1990 Mongolian People s Republic Single party system Mongolian People s Revolutionary Party Allied with the Soviet Union Media and the press controlled by the communist government Daily Unen (Truth), the equivalent of the Russian Pravda 1990 1995 Transition Period December 1989 March 1990 Pro democracy protests Multi party system Transition to market economy Explosion of media Mongoliin Zaluuchuud, April 4, 1990 600 newspapers by the mid 1990s

Digitization Project First large scale digitization project in Mongolia Two year project: 2006 2008 Initiated by the Press Institute of Mongolia The goal was to digitize Mongolian newspapers in the holdings of the Press Institute Supported by a grant from the Endangered Archive Programme of the British Library http://www.bl.uk/about/policies/endangeredarch/homepage.html

The Press Institute of Mongolia A non profit organization Located in Ulaan Baatar, Mongolia Established in 1995 under the Free Press project Promotes the development of independent and pluralistic media in Mongolia Publishes a comprehensive annual report on the state of the media in Mongolia Houses a research department and a special library with a collection of newspapers and magazines from 1923 to 1996 http://pressinst.org.mn/

Newspaper Collection A collection of rare Mongolian serials published from 1923 to 1996 80 newspaper and magazine titles acquired from private individuals and several Mongolian media organizations Particularly rare newspapers from the transition period 1990 1995 Unavailable at other cultural institutions in Mongolia Document the transformation period in Mongolia after the fall of Communism in the early 1990s Used extensively by journalism students and researchers and face the risk of rapid deterioration

Project Goals Preservation Lack of preservation program Microfilm preservation strategies not available Digitization as a means of creating digital surrogate copies of deteriorating newspapers for preservation purpose Preservation as the project s primary goal Create digital archival copies of rare Mongolian serial publications Safeguard the collection housed at the Press Institute from the risk of physical deterioration and destruction

Project Goals Access secondary goal Providing wide and remote access through an online collection Reducing the handling of the fragile print materials Providing optional print version in the PDF format Enhancing indexing and searchability of the newspapers Providing full text keyword search and multiple browse capabilities

Selection Coverage The titles selected for the project cover the transition period 1990 1995 Document the transition to democracy and the development of the independent press in Mongolia Language Mongolian language using the Cyrillic alphabet Format Magazines A4 Newspapers A3 and large A2 newspaper format 59 titles 6189 issues 39029 pages

Image Capture Original paper copies as a source for digital images during the scanning process Scanning conducted in house using oversize flatbed scanners Each scanned newspaper page treated as a separate image and saved as an archival master in the TIFF format Large newspaper pages in A2 format scanned in two sections and merged in Photoshop Resolution at 400 dpi Determined after preliminary testing Provided more accurate OCR

Digitization Guidelines Project guidelines based on digital library standards and best practices Use neutral approach with the notion of digital master files and derivatives The standards established for the scanning process: Resolution: 400 dpi File Format: Tagged Image File Format (TIFF) Compression: None Bit depth: 8 bit greyscale for black and white newspapers 24 bit RGB for pages with color images Consistent file naming convention File names consist of a unique newspaper code (two letters) followed by year (two digits), month (two digits), and day (two digits) mt950101_01 page 1 of the Mongolian Times, January 1, 1995

Digitization Guidelines Digital masters Images created as a direct result of the scanning process Preservation quality digital copies A source for creating multiple derivative copies Two copies of each digital master in the TIFF format The first set of archival TIFF files stored at the Press Institute of Mongolia The second set deposited at the British Library Extensive documentation for preservation purpose Recorded on multiple levels Collection Individual publications Publishers Deposited at the British Library

Selection of Digital Library Software Factors considered in the selection process Affordability Language support Required staff skills and technical expertise Availability of software support and training Search and browse capabilities Support of digital library standards Interoperability Usability Evaluated DL software packages CONTENTdm Olive software Greenstone Newspaper Management Program Software developed by a local Mongolian company Offered support for the Mongolian language Under development at the time of evaluation

Language Support Multilanguage support Capability of displaying non Latin characters Unicode support Mongolian language Traditional Mongolian script Cyrillic alphabet adopted in the 1940s Publications selected for the digitization Mongolian language using the Cyrillic alphabet Russian Novosti Mongolii

Adherence to DL Standards Support metadata schema Dublin Core Support controlled vocabulary tools Provide access to digital objects in multiple formats Searchable text, image, and PDF for documents Provide efficient and effective info access and retrieval Simple and advanced search options Multiple browse pathways Usable and customizable interface Ensure interoperability Metadata harvesting Open Archives Protocol for Metadata Harvesting (OAI PMH) Accommodate future migration of objects and metadata

Open Source Vs. Proprietary DL Software Open source software Source code and rights in public domain Developed by dedicated developers and a community of users Distributed under Open Source License No initial cost Right to improve and change the software Associated (hidden?) cost Technical expertise, support, and training Maintenance and sustainability Further development Proprietary DL Software Source code protected and locked Developed by commercial vendors for profit High cost license and annual maintenance fee Technological competition and innovation Hybrid products

Greenstone Greenstone http://www.greenstone.org/ Open source international digital library software Developed at the University of Waikato in New Zealand Supports digital library standards Dublin Core Open Archives Protocol for Metadata Harvesting (OAI PMH) Offers multilingual support Unicode Standard Capable of processing and displaying non Latin characters, including Cyrillic alphabet Capable of building indexes and searching in the Mongolian language Requires training and technical expertise Low score on usability

Collection Building Several stages of collection building process: Creating derivative files using ABBYY FineReader software Searchable HTML files through the OCR (Optical Character Recognition) process PDF files Adding descriptive metadata Dublin Core schema Assembling an online collection with Greenstone software Uploading the files to the Web server Building indexes Customizing the interface

Online Collection Digital Archive of the Mongolian Newspapers 1990 1995 http://www.pressinst.org.mn/cgi bin/library Online collection: Built to support remote and immediate access to copies of the newspapers on the Web Presents customized Greenstone interface in the Mongolian language Lists titles in Cyrillic and Latin alphabet Offers full text searchability and browse options by year and title Optional PDF versions

Collection Sustainability Challenges Grant funded project Grant expired in June 2008 Unstable IT infrastructure Limited expertise in Greenstone Staff turnover No access to Greenstone training sessions Efforts to ensure sustainability Institutional commitment Detailed project documentation Adherence to DL standards and best practices Integration with other DL developments in Mongolia Mongolian DL portal Library at the American Center for Mongolian Studies

Questions? Krystyna K. Matusiak Digital Collections Librarian University of Wisconsin Milwaukee Milwaukee, USA kkm@uwm.edu Myagmar Munkhmandakh Director Press Institute of Mongolia Ulaanbaatar, Mongolia munkhmandakh@pressinst.org.mn Preservation of rare periodical publications in Mongolia. (2008). Project report available from the Endangered Archive Programme website at: http://www.bl.uk/about/policies/endangeredarch/bayarmaaoutcome.html