Convert it to preserve it: Digitization and file conversion

Goals
1. Participants will have a better understanding of the inherent fragility of digital objects.
2. Participants will acquire information to help them select preservation formats, metadata, and backup systems for digital objects.
3. Participants will be able to identify one or more actions that can be taken to improve their institution's digital preservation efforts.

Sessions: Session Outline
- Overview of digital preservation. Lauren Goodley, Tues., April 2, 2013
- Convert it to preserve it: Digitization and file conversion. Jacob Nadal, Thurs., April 4, 2013 (we are here)
- Describe it so you can find it: Metadata, finding aids, and asset management. Danielle Plumer, Tues., April 9, 2013
- Practice safe archiving: Backups, copies, and what can go wrong. Jefferson Bailey, Weds., April 10, 2013
- Partner to preserve: Digital preservation networks and collaboration. Liz Bishoff and Tom Clareson, Mon., April 15, 2013

Review major formats
- Text
- Images
- Audio
- Brief discussion of Video, Data, and Interactive Systems. These lack a preservation consensus. There are many technical questions and fewer guarantees.

For each, we'll examine
- How the format is designed
- What risks and advantages it entails for preservation
- Key specifications for creating (or working with vendors to create) files in these formats

Defining Digital Preservation
Medium Definition of Digital Preservation: Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.
Photo: John Keogh http://www.flickr.com/people/jvk/

04/04/13

Text
- UTF-8, a way of representing Unicode, is standard
- Digital text is purely character data
- No font or layout information is stored in a pure text file
- Critical for searching and manipulation
- XML is a text format that uses UTF-8 by default
Photo by Tarale: http://flickr.com/photos/tarale/

Text: UTF-8
- Unicode is a universal standard for encoding characters
- The Unicode Transformation Format, 8-bit (UTF-8), is the most common way to encounter Unicode
- UTF-8 encodes each character using 1 to 4 octets (8-bit bytes)
- The first 128 characters are US-ASCII, one octet each; everything else takes 2 to 4 octets

Text: UTF-8
- The UTF-8 Character Set
- Easy to identify: given an unknown text string, a simple search pattern identifies UTF-8 over 99.5% of the time
- Default, native encoding for XML
- Multi-language support
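The two UTF-8 points above (1 to 4 octets per character, and easy identification) can be sketched in a few lines of Python. This is a minimal illustration, not the search pattern the slide refers to: it uses Python's strict UTF-8 decoder as the identification test, and the function names are hypothetical.

```python
def utf8_length_table(text):
    """Return (character, number of UTF-8 octets) for each character."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


def looks_like_utf8(data: bytes) -> bool:
    """Assumed identification test: the bytes decode cleanly as UTF-8.

    A strict decode stands in for the 'simple search pattern' on the
    slide; it is not that pattern itself.
    """
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # A (ASCII), e-acute, euro sign, and an emoji: 1, 2, 3 and 4 octets.
    sample = "A\u00e9\u20ac\U0001F600"
    for ch, n in utf8_length_table(sample):
        print(f"U+{ord(ch):04X} -> {n} octet(s)")

    # The same word saved as Latin-1 is not valid UTF-8.
    print(looks_like_utf8("caf\u00e9".encode("utf-8")))
    print(looks_like_utf8("caf\u00e9".encode("latin-1")))
```

Pure ASCII text passes this test too, which is consistent with the first 128 characters being shared between US-ASCII and UTF-8.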

Images and Text
- The portions of the Unicode character set that we just saw were, of course, an image.
- Computers don't read; they encode and decode.
- So, digitized books are page images plus text transcriptions plus the metadata that holds all of that together.
- To get text from images, you have to re-key it or use Optical Character Recognition (OCR).
- OCR accuracy is reported as character-level accuracy from ideal sources.
- Actual outcomes for accurately transcribed words from less-than-perfect sources are usually lower!

Two sides of text
- Formats for building digital library systems (XML, HTML/CSS, UTF-8, PHP, etc.)
- Documents in a digital library
  - Microsoft Word: a de facto standard, especially with the move to Office Open XML (.docx) in recent versions
  - PDF: a format with an open license that can contain text, images, audio, video, forms, etc.
  - PDF/A (PDF/A-1) is based on PDF 1.4 and contains a limited set of PDF features that are considered preservable

Key Specifications
- UTF-8 encoded Unicode text
- XML-based formats for markup
- Clarity about how text capture was performed (OCR or re-keying)
- Metadata that carries the details of this!
- For documents, know your versions: PDF/A when possible, .docx when possible

Images
Photo: Book Cell, by Matej Krén, 02. Photo by Ferran Moreno Lanza. http://flickr.com/photos/unquepassava/

Images
- Two types of images: raster (which we'll talk about today) and vector (which we won't)
- TIFF is the standard preservation format
- JPEG2000 is emerging as a new alternative
- File should:
  - Contain uncompressed image data (TIFF and JPEG2000 can both store compressed data)
  - Be at least 300 pixels per inch (ppi/dpi), 24-bit color. Higher pixel count effectively allows more zoom-in without pixelation
  - Be color calibrated and profiled with an ICC color profile

Capturing Good Images
- Set up and profile equipment, then leave it alone!
- Master should be an unaltered capture: color profiled but not color corrected
- Editing, retouching, and color correction should be done on a secondary copy for a particular use case, web or print, for example.

Key Specs
- Uncompressed, 24-bit RGB
- Color managed (ICC profile)
- Usually TIFF format
- 300+ DPI
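These specs translate directly into storage requirements. The sketch below estimates the size of the raw pixel data for an uncompressed 24-bit scan; the function name and the letter-size page are assumptions chosen for illustration, and the figure ignores TIFF headers, embedded ICC profiles, and other metadata.

```python
def uncompressed_image_bytes(width_in, height_in, ppi=300, bits_per_pixel=24):
    """Estimate raw pixel data size for an uncompressed scan.

    width_in / height_in: physical size of the original in inches.
    ppi: capture resolution in pixels per inch.
    bits_per_pixel: 24 for 8-bit-per-channel RGB.
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    # Hypothetical example: an 8.5 x 11 inch page.
    for ppi in (300, 600):
        size = uncompressed_image_bytes(8.5, 11, ppi=ppi)
        print(f"{ppi} ppi: {size:,} bytes (~{size / 1_000_000:.1f} MB)")
```

Doubling the resolution quadruples the pixel count, so a 600 ppi master takes four times the space of a 300 ppi master of the same page.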

Audio
- Uncompressed Pulse Code Modulation
- Broadcast WAV (BWAV): Wave file with a metadata header
- Resolution (sample rate) of at least 44.1 kHz (CD quality), preferably 96 kHz
- Bit depth of at least 16 bit (CD quality), pref. 24 bit

Why Frequency and Depth Matter
Resolution
- Waveforms, one per channel. Mono = 1, Stereo = 2, 5.1 = 6 channels
- Perhaps some metadata, in BWF especially
- CD audio is 44.1 kHz (44,100 samples per second)
- Most digital preservation engineers favor 96 kHz
- Extra sampling capacity helps avoid errors, provides finer reproduction of sound
Bit Depth
- CD audio is 16 bit, which allows up to 65,536 levels of amplitude, a range of 0 to 96 dB
- 24-bit audio has a theoretical maximum of 16.7 million levels, a range of 0 to 144 dB
- Current digital audio converters are limited to ~120 dB because of practical limits on integrated circuit design
- 96 kHz/24 bit surpasses the limits of human hearing
- Some signals encode data not meant for humans
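The level counts and decibel ranges quoted above follow from the bit depth: an n-bit sample has 2^n possible values, and the theoretical dynamic range is 20 * log10(2^n), roughly 6 dB per bit. The sketch below works through that arithmetic and adds the resulting PCM data rate; the function names are hypothetical, and the data rate covers audio samples only, not the WAV or BWF header.

```python
import math


def amplitude_levels(bit_depth):
    """Number of distinct amplitude values an n-bit sample can hold."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range in dB: 20 * log10(2 ** bit_depth)."""
    return 20 * math.log10(2 ** bit_depth)


def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM data rate, sample data only."""
    return sample_rate_hz * bit_depth * channels // 8


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits} bit: {amplitude_levels(bits):,} levels, "
              f"{dynamic_range_db(bits):.1f} dB")

    # Stereo CD quality versus the preferred 96 kHz / 24 bit.
    for rate, bits in ((44_100, 16), (96_000, 24)):
        per_hour = pcm_bytes_per_second(rate, bits, channels=2) * 3600
        print(f"{rate} Hz / {bits} bit stereo: ~{per_hour / 1e9:.2f} GB per hour")
```

At 96 kHz/24 bit, a stereo recording needs a little over three times the storage of the same recording at CD quality.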

Capturing Audio
- Source: Tape, LP, Microphone, etc. Compare: Photo, document, etc.
- Digital audio converter (for capture, an analog-to-digital converter, ADC): This will determine the bit depth, resolution, and basic quality of your capture. Compare: Scanner, Digital Camera
- Audio Mastering/Editing Software. Compare: Image editing software

Key Specs
- Broadcast WAV (BWAV): Wave file with a metadata header
- WAV audio is Pulse Code Modulation (PCM), the universal format for uncompressed audio
- Resolution (sample rate) of at least 44.1 kHz (CD quality), preferably 96 kHz
- Bit depth of at least 16 bit (CD quality), pref. 24 bit

Video & Moving Image: Two different sources
- Motion picture is a series of optical image frames with a sound track (usually optical)
- Video is a series of magnetically recorded signals as waveforms for image and sounds
- Video has a specific resolution derived from a fixed number of scan lines
- 720x480i60 from 486 scan lines is the NTSC standard
  - 720x480 are picture dimensions
  - i60 indicates interlacing (60 fields per second)
  - 6 lines for (graphical, not textual) metadata
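To give a sense of scale for uncompressed standard-definition video, the sketch below estimates the picture data rate for the 720x480 frame dimensions above. The 24 bits per pixel and 30 frames per second are simplifying assumptions made for this illustration, not figures from the slides: real uncompressed video files often use chroma subsampling and a frame rate of about 29.97, and audio is not counted.

```python
def video_frame_bytes(width_px, height_px, bits_per_pixel=24):
    """Raw size of a single uncompressed frame (picture data only)."""
    return width_px * height_px * bits_per_pixel // 8


def video_bytes_per_second(width_px, height_px, frames_per_second=30,
                           bits_per_pixel=24):
    """Uncompressed picture data rate under the assumed frame rate."""
    return video_frame_bytes(width_px, height_px, bits_per_pixel) * frames_per_second


if __name__ == "__main__":
    frame = video_frame_bytes(720, 480)
    per_second = video_bytes_per_second(720, 480)
    per_hour = per_second * 3600
    print(f"One frame:  {frame:,} bytes")
    print(f"One second: {per_second:,} bytes")
    print(f"One hour:   ~{per_hour / 1e9:.0f} GB")
```

Under these assumptions one hour of picture data runs to roughly 112 GB.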

Video & Moving Image
- Standards and practices developing
- Uncompressed desirable, but high storage costs
- Compression is normal in video, but may cause preservation problems
- Uncompressed .AVI is the current safe bet
- Motion JPEG 2000 & MPEG-21 may be options
- H.264 becoming the standard for service copies
- Pick one, but plan on a migration

Digital Video (Delivery)
- H.264 is standard
- Often delivered via Flash Video (FLV)
- Several more or less proprietary options (QuickTime, Real, Windows Media)
- HTML5 is the emerging video delivery platform
- Pick one, but plan on a migration

Data and Interactivity
- Need to decide if fixed points in time are required: Are you storing an instance of data?
- Need to decide if an active system is required: Are you maintaining an experience or immersive environment?
- Or, are you doing both?
- Examples of how this affects you: Email and Social Media.
  - Email is a known but loosely defined set of standards, the use of which is tightly coupled to the client application
  - Social media is closed but free, with no cost to use, but no provision to move data to other systems, either.
- Where to learn more:
  - ICPSR: www.icpsr.umich.edu/icpsrweb
  - California Digital Library University of California Curation Center: www.cdlib.org/services/uc3/datamanagement
  - Variable Media Network: variablemedia.net

Where to go for more
- Sources of standards and specifications:
  - PARS Minimum Capture Guidelines (this year): http://www.ala.org/alcts/mgrps/pars
  - Federal Agencies Digitization Guidelines Initiative (FADGI) (right now): http://www.digitizationguidelines.gov/
- Discussion Lists:
  - Google Digital Curation group: http://groups.google.com/group/digital-curation
  - DigiPres listserv (via ALA): http://lists.ala.org/wws/arc/digipres
- Professional Groups:
  - Association of Moving Image Archivists (AMIA): http://www.amianet.org/
  - Association for Recorded Sound Collections (ARSC): http://www.arsc-audio.org/
  - Society for Imaging Science and Technology (IS&T): http://www.imaging.org/