Practical Digital Preservation: Solutions from Small and Medium-sized Institutions


Anthony Cocciolo
Pratt Institute School of Information

ABSTRACT

This paper offers digital preservation solutions for small and medium-sized institutions based on experiences developing solutions for the Solomon R. Guggenheim Museum ( ) and Bard Graduate Center (2016). First, a brief background on digital preservation is provided, followed by context on the two institutions' projects as well as the methodology used to create the findings. The author finds that medium and small institutions are best served by adopting low-cost solutions that do not lock the institution into a specific technology that could be difficult to maintain over time. While descriptive collection-level metadata is necessary, item-level technical and descriptive metadata should be avoided as it is too time-consuming to create, even when using automation tools. Rather than taking an overly cautious or meticulous approach, a "good enough" approach to digital preservation is advocated here, where actions should be taken only when serious threats to information intelligibility are encountered.

Introduction

As creative and intellectual work is increasingly manifested as digital records, often with no analog equivalent such as a printout, the need for digital preservation heightens. Digital preservation is defined here as the activities and planning that help ensure that digital information of enduring value remains accessible and intellectually faithful to its original form over time. Fortunately, there has been steady effort in developing resources and standards for digital preservation. Despite the best intentions of such efforts, select standards and resources can prove overwhelming to small and medium institutions. The discrepancy between digital preservation standards and the ability of small and medium-sized cultural heritage institutions to meet those standards is best highlighted by the U.S.
IMLS-funded POWRR project (Preserving [Digital] Objects with Restricted Resources), which is studying strategies for medium and small institutions to provide long-term preservation of digital assets.[1] Such institutions may struggle with deploying basic building blocks for digital preservation, such as using enterprise storage technology rather than removable media. This project looks to follow the path of the POWRR project and offer practical advice on digital preservation services based on experience and reflections in helping small and medium institutions set up digital repositories. The two notable projects that will be drawn from are an NHPRC grant-funded project at the Solomon R. Guggenheim Museum ( ) and a project at the Bard Graduate Center (2016) in New York. Note that I do not represent these institutions, but rather offer lessons learned from experiences working on digital archives and preservation projects at these institutions. However, before what has been learned from these projects is discussed, some background on digital preservation is necessary.

Relevant Literature

Particularly salient methods and models used for digital preservation include the Open Archival Information System (OAIS) model and the Trusted Digital Repositories (TDR) framework. The OAIS model was developed by a consortium of national space agencies to address an inability to access digital information from earlier space programs.[2] Since then, it has been adopted as a best practice for preserving digital information by a wide variety of institutions, including libraries and archives within colleges, universities and governments. The model necessitates the creation of SIPs (Submission Information Packages), AIPs (Archival Information Packages), and DIPs (Dissemination Information Packages), among other elements.[3] The SIP is the version of the information package that is transferred from the creator to the OAIS; the AIP is the version of the information package that is stored and preserved by the OAIS; the DIP is the version of the information package delivered to the researcher in response to an access request.[4]

An example helps to illustrate: suppose a creator transfers to the archives files that may not be well suited for long-term preservation. These can include proprietary binary files that are obsolete and/or not well documented, such as files created in WordStar, a word-processing program often used in the 1980s. This bundle of files would form the SIP. The AIP may well contain the SIP files, but would also contain versions of the files normalized for long-term preservation, such as the WordStar files converted to PDF. The DIP would contain versions of the files well suited for user access, such as the PDF files created for the AIP.

The TDR framework builds upon the OAIS model by describing the elements needed to make a repository trustworthy.
[5] This includes aspects related to technology, resources and host organization.[6] The criteria for trustworthiness are further articulated in the Trustworthy Repositories Audit & Certification (TRAC) report, which makes the elements needed to ensure trustworthiness even more explicit.[7]

Other initiatives have looked to provide digital preservation guidance. The National Digital Stewardship Alliance's Levels of Digital Preservation provide simplified criteria for ensuring the trustworthiness of a repository.[8] Further, several software systems have been developed to simplify and automate digital preservation workflows, such as Artefactual Systems' Archivematica, Tessella's Preservica, and OCLC's CONTENTdm.[9]

Despite the advances in digital preservation, several individuals have noted that the demands created by standards such as TRAC can be overwhelming for small and medium-sized institutions, including manuscript and special collection repositories. For example, Goldman contends that while OAIS and TDR may be worthy long-term goals for manuscript and special collection repositories, the requirements may be too great and such organizations may be better served by starting small with something as simple as network file storage.[10] In an effort to provide solutions that are both simple and trustworthy, experiences from small and medium-sized institutions will be drawn from. More background on these sites is provided in the next section.

Background on Sites

At a basic level, the Guggenheim Museum and the Bard Graduate Center electronic records projects looked to establish repositories that addressed the following:

- Technical infrastructure - Establishing a hardware and software infrastructure for storing the electronic records. Also, establishing a relationship with the IT department to purchase, deploy and maintain components.
- File monitoring - The means to monitor files for bit-rot, bit-corruption, or tampering.
- File format inventories - Maintaining an index of file formats that are accessioned into the digital archives, as well as the conversions needed to prepare them for long-term preservation and access.
- Internal Workflows - Workflows for archives staff to accession and process electronic records into the digital archives, and make them available to on-site researchers or institutional staff via online finding aids. Includes specifics on requisite metadata.
- External Workflows - Informational handouts and forms for institutional staff to transfer records to the repository.

Each of these facets was addressed differently based on the institution's size, preexisting archives programs and what was known at the time. More detail on the specific programs is discussed below.

Guggenheim Museum

The Guggenheim is a medium-sized museum with a paper archives program that goes back to the 1970s. The records, which document the formation and activities of the institution, are used by researchers in documentary films and written works on the contemporary art world, among other uses. As the museum has a long-standing paper archives, it was necessary to develop an electronic archives program that related to the existing paper-based processes.
With funding from the NHPRC, the Guggenheim's Electronic Records Start-up Project looked to plan for a digital records repository using a best practices approach.[11] The end goal of the project was to offer three options or tiers that could be adopted by the institution based on investment. The resulting whitepaper and other resources on the project can be found on the museum's webpage.[12] In this project, I acted as the Electronic Records Consultant with primary responsibility for coordinating the plan for the electronic records repository.

A major component of the project was establishing a redundant storage system by working with the IT department. The system developed used a storage area network (SAN) made by Synology Inc. that implemented RAID 6 and was kept in the server room at the museum offices in Soho, Manhattan. RAID 6 allows two hard drives in the array to fail without any information being lost, thus making it a great solution for archives that are more interested in not losing data than they are in array speed. In addition to the primary copy, the data was copied over the wide area network (WAN) via the Linux program rsync to identical Synology hardware stored in a museum storage facility in another neighborhood of Manhattan. This was a significant achievement because it allowed the project to reach Level 1 of the NDSA Levels of Digital Preservation, which is achieved in part by having two copies of the data that are not co-located.

Although a variety of digital preservation options were explored and tested in this project, the one that was examined most closely was Archivematica. Archivematica is open source software developed by Artefactual Systems, designed in part to enact OAIS requirements and thus support digital preservation activities and workflows. One of the notable functions of Archivematica is the preparation of AIPs, which is accomplished first by transferring files into the system that form the SIP. Archivematica assembles the AIP by including the original file and folder structure as well as new files that may be created based on the preservation and access policies put in place. For example, a PST file (a Microsoft Outlook mail format) may be included in the AIP as well as a normalized version of the PST for preservation and access, such as an MBOX version (a text-based format widely regarded as amenable to long-term preservation).
[13] Files that are not automatically normalized according to the policies must be manually normalized before the AIP can be created; files that cannot be normalized to fit the policies will prevent the AIP from being created. This helps ensure that files don't fall through the cracks, that the archives has good control over what is being archived, and that the archives can indeed serve up all files being stewarded to end-users.

A major component of the AIP created by Archivematica, other than the original files as well as the normalized files, is the PREMIS metadata that is wrapped in a METS file. The PREMIS metadata includes information like details on file formats and preservation events, such as information on the transformation of files from their original format to the formats designed for preservation. The METS file provides additional information on the structure of the AIP, such as the relationship between files and directories. Although an extensive discussion of PREMIS and METS metadata is beyond the scope of this paper, suffice it to say that PREMIS metadata is useful for holding technical attributes of files and their related preservation events, whereas METS documents describe the structure of the files and house the metadata, which can also include descriptive metadata such as Dublin Core metadata.[14]

Despite the impressive functionality that Archivematica provides in automating the creation of these standards-based metadata records, there are a number of challenges with this approach. Since every file gets inspected by Archivematica, any one file in a collection can hold up the creation of the AIP. For example, if the system cannot determine all of the information destined for the PREMIS file, such as the exact version of a file format, this could hold up the creation of the entire AIP. Although there are workarounds (e.g., relying on file extension rather than inspecting the contents of the file in more depth), working through all the issues can be very time consuming. Further, not being able to produce the preservation and access files that adhere to the policy developed can hold up the creation of the AIP. Although this may well be the point of Archivematica, the work to go through all digital collections in this way starts to feel somewhat daunting if not outright impossible. For those with experience in the paper archives, it may feel somewhat like a pre-MPLP (More Product, Less Process) world, where all files were refoldered, and every fastener removed, every time, per policy.[15]

After the effort to create an AIP, one can look through the documentation created by Archivematica, including the PREMIS metadata wrapped in METS. Unfortunately, the sheer volume of the metadata can be somewhat overwhelming. For example, in creating an AIP for a single PDF file, which was normalized to PDF/A for preservation, a METS file with 1,322 lines was created. Although this is somewhat impressive, especially compared to how long it would take a person to hand-code this much information, it does feel somewhat overly cautious and meticulous. Perhaps this metadata will prove hugely valuable to future archivists and researchers, but given the labor needed to produce it, even where much of it is automated, I cannot help but think it is not terribly enlightening or useful. The useful metadata, the Dublin Core metadata used to describe the AIP, is buried deep in this large METS-encoded XML file.

Archivematica does not have a way to normalize many kinds of frequently encountered records to formats that are suitable for preservation and access, such as Microsoft Office files to PDF. For this reason, a script was developed that converts Microsoft Office files to PDF.
[16] As the Guggenheim project progressed and more tests were conducted, a number of other issues were encountered that prompted the project to move in part away from Archivematica. Besides the extensive time it took to successfully create AIPs, both in terms of computer time and human time, an issue that was identified was the finality of the AIP. With the emphasis in the digital preservation community on creating packages fixed in time, how could one accommodate an accretion, or the addition of new records into a collection? In the paper archives, new records were often encountered after a collection was processed. In these cases, the records were simply filed into existing boxes. Unfortunately, at the time the project was running, there was no way to simply slip a few newly discovered records into a collection.

Because of these limitations, the project began to put more focus on what we (the archives staff and I) were calling the "SIP storage," which was a network share on the same storage device as the Archivematica AIPs. The idea behind the SIP storage was that it would act as a storage spot until collections were complete and could be ingested into Archivematica. As the start-up project reached its end in 2014, my feeling at the time was that given the challenges of creating AIPs in Archivematica, the recommended course was to manually normalize any files with preservation and access concerns, and use the SIP storage as the digital archives.
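Deciding which files on a share such as the SIP storage raise preservation and access concerns starts with knowing what formats are there. The following is a minimal Python sketch of such an inventory, not a tool used in either project. It identifies files by extension only, which is the weaker workaround mentioned above rather than an inspection of file contents, and the watch list `AT_RISK_EXTENSIONS` and both function names are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Hypothetical watch list for illustration only; an institution would
# substitute the formats recorded in its own file format inventory.
AT_RISK_EXTENSIONS = {".wpd", ".pst"}


def inventory_formats(root):
    """Count files under root by lowercased file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def files_needing_review(root, at_risk=AT_RISK_EXTENSIONS):
    """Return sorted relative paths of files whose extension is on the watch list."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in at_risk
    )
```

The counts give a collection-level picture of what has been accessioned, while the review list points an archivist to the specific files that may need manual normalization.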

At the time the project was running, Archivematica did not have a way to monitor file fixity. For this reason, the project rolled out AceAudit, a web-based file fixity checker, to ensure that files in Archivematica were not subject to bit corruption.[17] AceAudit was run regularly across both the SIP storage and the Archivematica AIP storage, with a report sent to the archives.

As trustworthiness is not simply something that happens once but is generated through ongoing activity, a trustworthiness checklist was developed that could be re-run by institutional staff to ensure that the repository maintains trustworthiness. A copy of this checklist, which was developed by consulting TRAC and the NDSA Levels of Digital Preservation, is included in Appendix 1.

Staff at the museum would transfer electronic records to the archives by using network shares and by emailing an Excel form to the Archives with descriptive metadata. A Windows script was developed that would read from the Excel form and produce an accession record in ArchivesSpace, which the museum was beginning to use to manage the archives.

Because of the very large file sizes of videos, specific guidance was necessary on how to handle them. In addition to the issue of very large file sizes, video files can be created using a wide variety of CODECs that may not be readily evident from the file extension or wrapper (e.g., MOV, AVI, etc.). Rather than attempt to convert all video files to a single format, the recommendation developed was to engage in conversion only if the CODEC or wrapper was in danger of not being supported in the near future. For this reason, collections with videos accessioned into the archives needed to be added to a video registry, which records which CODECs and wrappers are contained in each collection. Should a CODEC or wrapper become obsolete, a preservation action can be undertaken (e.g., conversion to a more sustainable format).
This method was deployed because it was thought to be too time consuming to convert all videos to a single format and too disk-space intensive to maintain both the original and converted versions when it could not be demonstrated that a particular CODEC or wrapper was in any serious danger of no longer being supported.

Many of the lessons learned from this start-up project were applied to the Bard Graduate Center's project in 2016. Details on this project are discussed below.

Bard Graduate Center

The Bard Graduate Center is a specialized graduate school that was interested in establishing a digital archives to maintain records documenting significant school activity. The school was established in the 1990s and had not established a paper archives program because most of the school's activity occurred in a post- digital world. As the school itself is small (it has a staff of 60 persons), creating a solution in keeping with its size was necessary.

Working with the IT department, the technical infrastructure developed was to deploy a SAN in each of its two buildings on West 86th Street on the Upper West Side of Manhattan, thus providing for two copies of the data that were not co-located. So that a problem with the primary copy would not immediately propagate, the data would not be mirrored but rather copied using enterprise backup software. A third copy of the data would be replicated to Amazon's Glacier. Glacier is a service provided by Amazon that offers low-cost storage of data that does not need to be frequently accessed.[18] Unlike most cloud storage, it can take several hours to pull information from Glacier, thus making it inappropriate for most time-sensitive tasks but well suited for emergency copies of data. A modified version of the trustworthiness checklist was also deployed for use in this project.

For this project, rather than use Archivematica, a Windows network share was employed for the digital archives. Security for the share would be moderated by Windows' basic access control system. For full-text searching of records, Windows' built-in full-text searching capability was employed. Unfortunately, those full-text search capabilities are only available to Windows clients. Macintosh clients would have to rely on the far slower file-by-file searching made possible through Samba networking. For this reason, the access workstation in the archives needed to be Windows-based, although institutional staff can mount the archives as a read-only share on their desktops regardless of platform. As this arrangement indicates, the digital archives are only available to users and institutional staff on-site.

Collections would be accessioned and processed according to manuals that were developed. Files in formats not well suited for long-term preservation and access could be manually normalized, or normalized using an assortment of batch tools.
A table in Appendix 2 illustrates the formats that are accessioned into the Bard Graduate Center Archives, as well as the actions that need to be taken on specific formats. As there were large quantities of WordPerfect files, an open source script was developed that uses Microsoft Word to batch convert WordPerfect files to PDF.[19] According to the processing manual that was developed, normalization should follow a simple file naming convention. For example: LetterToDirector.wpd becomes LetterToDirector_normalized.pdf. According to the processing manual, the original source files are included in the archives, as well as the normalized versions.

The last processing step involves using Bagger (the GUI version of BagIt created by the Library of Congress) to create a bag for the collection.[20] The bag will fail to validate should any file become corrupt through bit-rot or bit corruption, or should any file be added to or removed from the bag. An open-source Python script was developed to check that all bags in a given directory are still valid and send an update to the archivist, with the script scheduled to run weekly.[21] This arrangement is far easier to manage than the AceAudit solution because it involves managing a relatively small number of bags (a bag usually being equivalent to a single collection), rather than trying to manage all the files and directories spread across all collections in the repository.

Like the Guggenheim project, this project also used the video codec registry concept, where videos would remain in their original format and be noted on a registry. Also like that project, accession records and descriptive metadata would be entered into ArchivesSpace and made available to users through online finding aids. Getting descriptive information from staff transferees involved use of email rather than Excel forms to further simplify the process. Unlike the Guggenheim project, no PREMIS, METS or Dublin Core metadata was created.

Methods

The findings are drawn from reflections on the consulting experience from 2013 to 2016, augmented by re-reading the extensive documentation produced by both projects. Emphasis will be placed not on the long-term success of the projects, which are too young for that to be determined, but rather on the ease with which archivists can accession, process, and make accessible the born-digital documentation. Specific documentation that will be re-examined is included in Table 1.

Table 1. Documents analyzed in producing findings.

Institution | Type | Documentation | Date
Guggenheim Museum | Report | Three-tiered Plan for Managing Electronic Records | 2014 Sept. 2
Guggenheim Museum | Manual | Electronic Records Processing Manual | 2014 Aug. 25
Guggenheim Museum | Report | Pilot Projects Overview | 2014 Jan. 23
Guggenheim Museum | Report | Pilot 1: Setup of an Open Archival Information System (OAIS) Compliant Repository | 2014 Sept. 15
Guggenheim Museum | Report | Pilot 2: Digital Records Processing and Obsolete Media Ingestion Workstation | 2014 Aug. 25
Guggenheim Museum | Report | Pilot 3: Selection of Electronic Records from Departmental Network Storage for Transfer to the Archives | 2014 July 1
Guggenheim Museum | Report | Pilot 4: Workflows for Staff Transfer of Electronic Records to the Archives | 12 Feb
Guggenheim Museum | Report | Pilot 5: Transferring Select Permanent Electronic Records to an Open Archival Information System | 15 May 2014
Guggenheim Museum | Report | Pilot 6: Problem Records: Obscure or Specialized Formats | 15 May 2014
Guggenheim Museum | Report | Pilot 7: Problem Records: Obsolete Formats | 30 May 2014
Guggenheim Museum | Report | Pilot 8: Problem Records: Removable Media, including Obsolete Media and Removable Hard Drives | 5 June 2014
Guggenheim Museum | Report | Pilot 9: Problem Records: Preserving Significant Correspondence | 25 Aug 2014
Guggenheim Museum | Report | Pilot 10: Procedures for preserving SRGF websites and microsites | 8 Sept
Guggenheim Museum | Report | Pilot 11: Problem Records: Very large files (e.g., video) | 7 July 2014
Guggenheim Museum | Manual | Processing Obsolete or Removable Media | 15 Sept
Guggenheim Museum | Report | Tool Summary | 2 Sept
Guggenheim Museum | Report | Institutional Electronic Records Research and Inventory | 28 Feb
Guggenheim Museum | Policy | Records Retention Schedule | 3 July 2014
Bard Graduate Center | Policy | Archives and Records Preservation Policy Draft | 1 April 2016
Bard Graduate Center | Report | BGC Archive Project: Proposed Digital Infrastructure | 24 May 2016
Bard Graduate Center | Policy | Records Retention Schedule Draft | 2 June 2016
Bard Graduate Center | Manual | Preferred File Formats Draft | 2 June 2016
Bard Graduate Center | Manual | Accessioning Materials into the Digital Archives | 24 May 2016
Bard Graduate Center | Manual | Electronic Records Processing Manual | 24 May 2016
Bard Graduate Center | Manual | Submitting Electronic Files to the BGC Archives | 24 May 2016
Bard Graduate Center | Manual | Accessing the Digital Record Repository | 24 May 2016

Findings

Findings from these projects, which may be useful to other small and medium-sized institutions in planning a digital archives and preservation initiative, include the following:

- Software Infrastructure - Using simple digital storage interfaces, such as Windows network shares, makes getting files into and out of the repository easy, while also enabling access control and full-text searching. It also has the advantage of not locking the institution into sophisticated software that would be difficult to move away from and expensive to maintain (in terms of staff or consultants' time, among other costs).
- Hardware Infrastructure - Storage arrays that implement RAID 6, permitting two hard drives to fail without losing any information, are a great solution for archives that are interested more in storage integrity and less in array speed.
- Data Replication, not Mirroring - Several storage solutions offer data mirroring capabilities so that the primary copy and secondary copy are kept constantly in sync.
The issue with this is that if something should happen to the primary copy (e.g., an archivist accidentally deletes a file), the secondary copy is impacted immediately. This can be addressed by having the secondary and/or tertiary copies be backups of the primary copy, so that if any issues are found with the primary copy (e.g., bit-rot, bit-corruption, accidental deletion), the primary copy can be restored from the secondary or tertiary copy. Backups should be retained for a month so that if an issue is identified, earlier copies can be restored.
- Cloud Storage - This can be a great option for storing a tertiary copy of repository contents. However, keep in mind that Internet connection speed and the amount of data being transmitted, as well as other competing demands on the network, could make this option less appealing. This can be addressed somewhat by running data copies overnight, when there is less demand on the network.
- Hardware Refreshes - When purchasing hardware, plan on replacing all hard drives every five to eight years to reduce the chances of hardware failure. Research indicates that, as with all machines, hard drive failure rates increase with age.
- File integrity monitoring - The BagIt standard and related software, originally developed by the Library of Congress, is a great method to enact file integrity monitoring. It allows for monitoring for bit-corruption, as well as tampering such as files being added to or removed from a collection. It also allows archivists to manage at the bag or collection level, rather than the file level, greatly simplifying monitoring.
- Metadata Requirements - Creating descriptive metadata at the collection level is essential, and this can be accomplished using standard online finding aid tools (e.g., ArchivesSpace, Archivists' Toolkit, Archon, AtoM, etc.).[23] However, metadata at lower levels, such as the item level, is not necessary. Further, creating technical metadata, such as PREMIS metadata (which can be wrapped in METS metadata), can be hugely labor intensive, even when using automation tools like Archivematica.
This technical metadata has not proven to enhance digital preservation, and is likely too labor intensive for small and medium-sized institutions to create.
- File format normalization - A registry of file formats accessioned into the archives should be maintained (e.g., like the one in Appendix 2). Files should be migrated or normalized to new formats only when the original format is seriously endangered. For example, WordStar files should be normalized, but binary Microsoft Word files (.doc), supplanted by XML-based .docx files in 2007, should remain in the original format. Should Microsoft discontinue support for .doc files, this should trigger a preservation action where files are normalized. In short, files should be normalized to new formats only when there is a serious threat to the format's ability to remain intellectually accessible. Normalized files should have obvious filenames, such as appending _normalized.pdf to the existing filename.
- Video - Most video created today is born-digital. Rather than attempt to normalize all video into a single format, it is recommended that the CODECs and wrappers be monitored for obsolescence, and that normalization be performed only when necessary. It is both too time consuming to normalize all video and too disk-space intensive to maintain both the original and the normalized versions, especially if the original CODEC and wrappers are not seriously endangered. A registry of CODECs and wrappers and their respective collections should be maintained.
- Prepare for emulation and maintaining old software - Many two-dimensional, non-interactive records in unusual formats can be converted into PDF documents for long-term preservation. However, pay attention to important information that may occur outside of the printed area and will not be captured in the PDF. Three-dimensional records (e.g., architectural records) cannot easily be represented as PDFs (although a 3D PDF format has been developed). The same goes for interactive records that may be software-based. In these cases, it is necessary to maintain a copy of the original software and be prepared to present the content to users with the original software, possibly in an emulated environment.
- Maintaining Trustworthiness - A repository is trustworthy only when it maintains that trustworthiness over time. Resources such as the trustworthiness checklist included in Appendix 1 should be reviewed yearly to confirm that the repository maintains its trustworthiness.

Conclusion

In conclusion, this project offered several lessons learned from engaging in digital archives and preservation projects from 2013 to 2016. For small and medium-sized institutions, the general approach offered here is toward low-cost solutions that do not lock the institution into a specific technology that could be difficult to maintain over time.
Further, although metadata records like PREMIS are intended to enhance digital preservation, they are time consuming to produce even when using automation tools, and the records themselves have not proven to enhance the intelligibility of digital information. Rather than taking an overly cautious or meticulous approach, the one advocated here is taking action when serious threats are encountered, such as file formats no longer being supported by their vendors. For example, the default format policy for JPG files in Archivematica is to normalize them to uncompressed TIFF for preservation. Although this policy can be modified in Archivematica, this is an overly cautious approach as there is no immediate danger that the JPG format will become unintelligible. Rather than engaging in actions around hypothetical "what ifs," which can prove very labor intensive as well as disk-space demanding, this paper, like the POWRR project, advocates for a "good enough" approach to digital preservation.[24]
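The bag-level integrity monitoring recommended in the findings can be illustrated with a short Python sketch. This is not the script used at the Bard Graduate Center, and it is not a full BagIt validator: it assumes each bag has a `bagit.txt` declaration, a `manifest-sha256.txt` payload manifest, and a `data` directory, and it ignores tag manifests and bag metadata. The function names are hypothetical, and sending the report to the archivist is left out.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files (e.g., video) do not fill memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_bag(bag_dir, algorithm="sha256"):
    """Return a list of problems found in one bag; an empty list means it passed."""
    bag_dir = Path(bag_dir)
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    if not manifest.is_file():
        return [f"missing {manifest.name}"]

    # Each manifest line is "<checksum> <path relative to the bag root>".
    expected = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if line.strip():
            checksum, rel_path = line.split(None, 1)
            expected[rel_path.strip()] = checksum.lower()

    problems = []
    # Files listed in the manifest that are gone or whose contents changed.
    for rel_path, checksum in sorted(expected.items()):
        target = bag_dir / rel_path
        if not target.is_file():
            problems.append(f"removed: {rel_path}")
        elif file_digest(target, algorithm) != checksum:
            problems.append(f"checksum mismatch: {rel_path}")

    # Payload files on disk that the manifest does not list.
    data_dir = bag_dir / "data"
    on_disk = set()
    if data_dir.is_dir():
        on_disk = {
            p.relative_to(bag_dir).as_posix()
            for p in data_dir.rglob("*")
            if p.is_file()
        }
    for rel_path in sorted(on_disk - set(expected)):
        problems.append(f"added: {rel_path}")
    return problems


def check_all_bags(root):
    """Check every subdirectory of root that contains a bagit.txt declaration."""
    report = {}
    for child in sorted(Path(root).iterdir()):
        if (child / "bagit.txt").is_file():
            report[child.name] = check_bag(child)
    return report
```

Because the report is keyed by bag rather than by file, an archivist reviewing a weekly run only needs to look at the collections whose problem list is not empty.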

Appendix 1. Trustworthiness Checklist

The following checklist should be used to verify the trustworthiness of the digital archive once a year.

Staffing: Archives
The Archives department has adequately trained staff (at least one person) to perform deposits of digital content into the digital archives, respond to researcher requests, verify that the file integrity procedures are working as designed (e.g., read automated reports), and monitor and respond to technological obsolescence risks.

Staffing: Information Technology
The IT department has adequately trained staff (at least one person) for managing the technical elements of the digital archives, such as the VPS and storage array, maintaining documentation such as passwords, and regularly verifying that the data replication procedures are working as designed.

Redundancy
At least two copies of the digital archives are maintained and not co-located. The second copy is not a mirror (in case the primary copy becomes corrupt), but should replicate the primary copy within a 2-week window. This provides adequate time to restore the primary copy from the secondary copy if there is a file integrity issue. Secondary copies are maintained for a month.

Storage Hardware
All digital archive content (primary and secondary) is stored on enterprise storage hardware that implements at least RAID 6 (that is, two hard drives can fail at the same time and the storage array remains available).

File Integrity
File integrity checks happen weekly. Procedures for responding to a file integrity issue are well understood and are carried out promptly (within a couple of business days at most).

Descriptive Metadata
Maintain documentation on descriptive metadata requirements, such as the minimum metadata required for a deposit. Evaluate whether descriptive metadata requirements are being regularly met.

Rights Metadata
Maintain documentation on rights metadata requirements, and ensure that they are consistent with policy.

Access Requests
Ensure that researcher requests for materials from the digital archives are handled in a timely fashion and that rights are being properly applied (e.g., restricted information is being secured, and access for staff and the public is clearly communicated in the access policy).

IT Infrastructure
Ensure IT infrastructure is secure (physical plant, disaster recovery plan, backup procedures in place, etc.), consistent with ISO 17799.25

IT Documentation
Maintain documentation related to managing the digital archives, such as documentation on storage management, disaster planning, security, backups and hardware support.

Hardware Refreshes
Hardware should be refreshed regularly to prevent hardware failures. For example, hard drives should be replaced every five to eight years to limit the chances of failure. IT documentation should include the dates that hard drives were put into service, as well as vendor support information.

File Format Inventories
The archives maintains a list of preservation and access file formats, and updates it judiciously (e.g., when new file types need to be preserved). The archives also maintains a list of video CODECs used in the digital archives.

Software Risk
The archives monitors the wider environment for changes to software that will render AIP or DIP content inaccessible, especially preservation and access file formats and video CODECs.

File Rendering
The archives has the means to render or open every file contained within an AIP or DIP (e.g., has access to all required software), and provides access to content upon request (if rights are granted).

Preservation Action
If AIP or DIP content is in danger of not being renderable (e.g., a company discontinues the software for rendering a file), then the archives performs a preservation action, such as format migration or emulation setup, to ensure that the content is preserved.

Transparency
Documentation related to the digital archives should be available to stakeholders upon request, for example, listings of AIPs under storage, submission requests, and access requests.

Financial

The digital archives has short- and long-term business planning processes in place to sustain the repository over time. If institutional failure is foreseeable, a succession plan should be developed. If institutional failure is imminent, the succession plan should be executed.

Evaluation
The users of the digital archives (e.g., staff researchers) are satisfied with it, and access requests result in the desired digital object.

Appendix 2. File format normalization chart

Overview

The following table outlines the types of files that are managed by the digital records repository. File types that have poor long-term viability, such as file types that are proprietary and not openly documented, are recommended to be normalized into formats that are well documented and/or have better long-term viability. The original file should always be retained. Note that this table should be updated as new file types are incorporated into the digital archives. The archives should not incorporate any files that it cannot make accessible to a researcher.

| Media type | Submission file format | Normalized version for preservation | Notes |
|---|---|---|---|
| MS Office (2007 and after): Word, PowerPoint, Excel | DOCX, PPTX, XLSX | Use original format | These files are zipped-up XML files, so they work well for preservation. |
| MS Office (pre-2007): Word, PowerPoint, Excel | DOC, PPT, XLS | Use original format | Proprietary binary formats, but MS has openly documented them, and they show no signs of being dropped from MS Office, so they continue to be acceptable. |
| Plain text | TXT | Use original format | |
| Rich Text Format | RTF | Use original format | |
| FileMaker | FMP, FP5, FP7 | Use original format, PDF, CSV | Use the options in FileMaker to export PDF and CSV versions. Put CSV files in a subfolder called: mydatabasename_normalized_csv |
| Portable Document Format | PDF | PDF/A or High Quality PDF | Embed fonts; avoid using the Smallest file size save option, which does not embed fonts. |
| Graphics - sustainable | JPG, TIFF, PNG, GIF | Use original format | |
| Graphics - low sustainability | PSD, BMP | TIFF | Flatten any layers. |
| Adobe Illustrator | AI | PDF/A or High Quality PDF | Save as high-quality PDF. |
| Adobe InDesign | INDD | PDF/A or High Quality PDF | Save as high-quality PDF. |
| Audio | AC3, AIFF, MP3, WAV, WMA | Use original format, see note | Monitor formats for obsolescence issues. Migrate to WAV LPCM if the format becomes endangered. |
| Video | AVI, FLV, MOV, MP1, MP2, MP4, SWF, WMV, QT, M4V | Use original format, see note | Use original format, but register the CODEC and wrapper in the Video Codec Registry. Migrate to FFV1/LPCM in MKV if the format becomes endangered. |
| Camera raw files | CR2, CRW, RAW, DNG | TIFF | Use Photoshop or another tool that reads these files. |
| WordPerfect | WPD | PDF | Convert to PDF using MS Word for Windows or the wpd_to_pdf.vbs script (requires MS Word for Windows). |
| QuarkXPress | QXD | PDF/A or High Quality PDF | Convert to PDF using InDesign. Save as high-quality PDF. |
| SketchUp (2-dimensional drawing) | SKP (SKB is the backup of the SKP file) | PDF/A or High Quality PDF | Open in SketchUp and export as PDF. |
| SketchUp (3-dimensional model) | SKP (SKB is the backup of the SKP file) | Use original format, PDF/A or High Quality PDF | For 3-dimensional models, a copy of SketchUp is required to read the original SKP file (although the PDF should still be created for easy reference to what is in the model). |
| Websites | HTML, various web assets | WARC, screenshots | Create WARC files using Webrecorder and use Webarchiveplayer to play them back.26 Webrecorder should also be used for creating screenshots. If Webrecorder screenshots are not sufficient, create additional screenshots as PNG files (e.g., use Awesome Screenshot). |
| Website files, local | HTML, various web assets | Use original format, see note | Common files used on the web, such as HTML, JPG and GIF, can be maintained as is. Individual files may need to be normalized if significantly endangered. |
| Gmail | ZIP file of inbox via Google Takeout | MBOX | Google Takeout uses MBOX, so there is nothing to do other than download the file from Takeout. |

Acknowledgements

Thank you to Francine Snyder, Sarah Haug, Heather Topcik, Jack Szwergold, Robert Rosenthal and Malinda Rathnayake for their critical feedback on their respective institution's project. A special thanks to Mike Satalof for the aforementioned reasons and for developing an earlier version of the table in Appendix 2.

Author Biography

Anthony Cocciolo is an Associate Professor at Pratt Institute School of Information, where his research and teaching are in the archives area. Prior to Pratt, he was the Head of Technology for the Gottesman Libraries at Teachers College, Columbia University. He completed his doctorate in the Communication, Media and Learning Technologies Design program at Teachers College, Columbia University, and a BS in Computer Science from the University of California, Riverside. You can find out more about him at his

website:

1 Preserving Digital Objects with Restricted Resources; Amanda Kay Rinehart and Patrice-Andre Prud'homme, Overwhelmed to action: digital preservation challenges at the under-resourced institution, OCLC Systems & Services 30, no. 1 (2014).
2 Consultative Committee for Space Data Systems, Reference Model for an Open Archival Information System (OAIS) (2012) (accessed 1 April 2014).
3 Christopher A. Lee and Helen Tibbo, Where's the Archivist in Digital Curation? Exploring the Possibilities through a Matrix of Knowledge and Skills, Archivaria 72 (Fall 2011) (accessed 11 July 2016).
4 Brian F. Lavoie, The Open Archival Information System (OAIS) Reference Model: Introductory Guide, 2nd edition (Glasgow, UK: Digital Preservation Coalition, 2014) (accessed 11 July 2016).
5 RLG & OCLC, Trusted Digital Repositories: Attributes and Responsibilities (Mountain View, CA: RLG, 2002) (accessed 1 April 2014); Steve Marks, Becoming a Trusted Digital Repository (Chicago: Society of American Archivists, 2015).
6 Cornell University Library, Digital Preservation Management: Implementing Short-term Strategies for Long-term Problems (2007) (accessed 1 April 2014).
7 OCLC & CRL, Trustworthy Repositories Audit & Certification: Criteria and Checklist (Chicago, IL: CRL and Dublin, OH: OCLC, 2007) (accessed 1 April 2014).
8 National Digital Stewardship Alliance, NDSA Levels of Preservation (2013) (accessed 11 July 2016).
9 Archivematica, Preservica, ContentDM.
10 Ben Goldman, Bridging the Gap: Taking Practical Steps Toward Managing Born-Digital Collections in Manuscript Repositories, RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage 12, no. 1 (2011).
11 Guggenheim Museum, Guggenheim Receives Grant to Plan an E-Repository (February 13, 2013) (accessed 12 July 2016).
12 Guggenheim Museum Electronic Records Management Start-Up Project.
13 Christopher J. Prom, Preserving Email (Glasgow, UK: Digital Preservation Coalition, 2011), available at: dpctw11-01pdf (accessed 27 January ).
14 More information on PREMIS, METS and Dublin Core metadata structures can be found on the respective standards websites.
15 MPLP is short for More Product, Less Process, and is influential in paper processing. More information on it can be found in: Mark A. Greene and Dennis Meissner, More Product, Less Process: Revamping Traditional Archival Processing, American Archivist 68 (Fall/Winter 2005).
16 MS Office file normalization script.
17 AceAudit.
18 Amazon Glacier.
19 WPD to PDF script.
20 Library of Congress Bagger.
21 BagIt Validation Script.
22 Eduardo Pinheiro, Wolf-Dietrich Weber and Luiz Andre Barroso, Failure Trends in a Large Disk Drive Population, Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST 07), February 13-16, 2007, San Jose, CA.
23 ArchivesSpace, Archivists Toolkit, Archon, AtoM.
24 Jamie Schumacher, Lynne M. Thomas, Drew VandeCreek, et al., From Theory to Action: Good Enough Digital Preservation Solutions for Under-Resourced Cultural Heritage Institutions (2014) (accessed 8 July 2016).
25 ISO 17799.
26 Web Recorder, Web Archive Player.
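Appendix 1 calls for weekly file integrity checks. As a minimal sketch of what such a check involves, assuming a hypothetical manifest that maps relative file paths to SHA-256 checksums, the Python below builds a manifest for a directory and later reports files that are missing or changed. It is an illustration only and is not the AceAudit or BagIt validation tooling cited in notes 17 and 21; the function names are invented for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative POSIX-style path) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Compare files under root with a stored manifest.

    Returns a list of (relative_path, problem) tuples, where problem is
    "missing" or "changed". An empty list means every listed file matched.
    Files added since the manifest was built are not reported.
    """
    root = Path(root)
    problems = []
    for rel_path, expected in sorted(manifest.items()):
        target = root / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(target) != expected:
            problems.append((rel_path, "changed"))
    return problems
```

A scheduled weekly job could call verify() against the stored manifest and send the resulting list to archives staff, who would then restore any affected file from the secondary copy.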


More information

Data Curation Handbook Steps

Data Curation Handbook Steps Data Curation Handbook Steps By Lisa R. Johnston Preliminary Step 0: Establish Your Data Curation Service: Repository data curation services should be sustained through appropriate staffing and business

More information

NDSA Web Archiving Survey

NDSA Web Archiving Survey NDSA Web Archiving Survey Introduction In 2011 and 2013, the National Digital Stewardship Alliance (NDSA) conducted surveys of U.S. organizations currently or prospectively engaged in web archiving to

More information

What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011

What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011 What do you do when your file formats become obsolete? Lydia T. Motyka Florida Center for Library Automation USETDA 2011 The FCLA, the FDA, and DAITSS FDA: a service of the Florida Center for Library Automation

More information

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment Shigeo Sugimoto Research Center for Knowledge Communities Graduate School of Library, Information

More information

The Making of PDF/A. 1st Intl. PDF/A Conference, Amsterdam Stephen P. Levenson. United States Federal Judiciary Washington DC USA

The Making of PDF/A. 1st Intl. PDF/A Conference, Amsterdam Stephen P. Levenson. United States Federal Judiciary Washington DC USA 1st Intl. PDF/A Conference, Amsterdam 2008 United States Federal Judiciary Washington DC USA 2008 PDF/A Competence Center, PDF/A for all Eternity? A file format is a critical part of a preservation model

More information

Networked Access to Library Resources

Networked Access to Library Resources Institute of Museum and Library Services National Leadership Grant Realizing the Vision of Networked Access to Library Resources An Applied Research and Demonstration Project to Establish and Operate a

More information

Kroll Ontrack VMware Forum. Survey and Report

Kroll Ontrack VMware Forum. Survey and Report Kroll Ontrack VMware Forum Survey and Report Contents I. Defining Cloud and Adoption 4 II. Risks 6 III. Challenging Recoveries with Loss 7 IV. Questions to Ask Prior to Engaging in Cloud storage Solutions

More information

Its All About The Metadata

Its All About The Metadata Best Practices Exchange 2013 Its All About The Metadata Mark Evans - Digital Archiving Practice Manager 11/13/2013 Agenda Why Metadata is important Metadata landscape A flexible approach Case study - KDLA

More information

Sparta Systems TrackWise Solution

Sparta Systems TrackWise Solution Systems Solution 21 CFR Part 11 and Annex 11 Assessment October 2017 Systems Solution Introduction The purpose of this document is to outline the roles and responsibilities for compliance with the FDA

More information

Sparta Systems TrackWise Digital Solution

Sparta Systems TrackWise Digital Solution Systems TrackWise Digital Solution 21 CFR Part 11 and Annex 11 Assessment February 2018 Systems TrackWise Digital Solution Introduction The purpose of this document is to outline the roles and responsibilities

More information

Working with a Preservation Software Vendor - The Kentucky Experience Glen McAninch

Working with a Preservation Software Vendor - The Kentucky Experience Glen McAninch Working with a Preservation Software Vendor - The Kentucky Experience Glen McAninch Kentucky Department for Libraries and Archives November 2014 Best Practices Exchange Montgomery, Alabama Who We Are Kentucky

More information

Managing Records in Electronic Formats. An Introduction

Managing Records in Electronic Formats. An Introduction Managing Records in Electronic Formats An Introduction Jefferson County Public Schools Archives and Records Center November 2012 Managing Records in Electronic Format As we create and use more and more

More information

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments *

Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments * Scalable, Reliable Marshalling and Organization of Distributed Large Scale Data Onto Enterprise Storage Environments * Joesph JaJa joseph@ Mike Smorul toaster@ Fritz McCall fmccall@ Yang Wang wpwy@ Institute

More information

DRS Policy Guide. Management of DRS operations is the responsibility of staff in Library Technology Services (LTS).

DRS Policy Guide. Management of DRS operations is the responsibility of staff in Library Technology Services (LTS). Harvard University Library Office for Information Systems DRS Policy Guide This Guide defines the policies associated with the Harvard Library Digital Repository Service (DRS) and is intended for Harvard

More information

Libraries and Disaster Recovery

Libraries and Disaster Recovery Libraries and Disaster Recovery A Framework for Regional Co-operation in Digital Preservation and Recovery Presentation to CDNLAO Meeting, Tokyo By N Varaprasad, NLB Singapore World Disasters & Impact

More information

An Introduction to PREMIS. Jenn Riley Metadata Librarian IU Digital Library Program

An Introduction to PREMIS. Jenn Riley Metadata Librarian IU Digital Library Program An Introduction to PREMIS Jenn Riley Metadata Librarian IU Digital Library Program Outline Background and context PREMIS data model PREMIS data dictionary Implementing PREMIS Adoption and ongoing developments

More information

Trials And Tribulations Of Moving Forward With Digital Preservation Workflows And Strategies

Trials And Tribulations Of Moving Forward With Digital Preservation Workflows And Strategies Trials And Tribulations Of Moving Forward With Digital Preservation Workflows And Strategies Thursday, October 26 8:30am - 10:00am NDSA Digital Preservation 2017 Discuss implementation of digital preservation

More information

31 March 2012 Literature Review #4 Jewel H. Ward

31 March 2012 Literature Review #4 Jewel H. Ward CITATION Ward, J.H. (2012). Managing Data: Preservation Standards & Audit & Certification Mechanisms (i.e., "policies"). Unpublished Manuscript, University of North Carolina at Chapel Hill. Creative Commons

More information

Digital Preservation Workshop

Digital Preservation Workshop Digital Preservation Workshop 10 November 2010 University of Victoria Peter Van Garderen Artefactual Systems Workshop Agenda 10:00 Introductions 10:15 What is digital preservation? From strategy to implementation

More information

ISO Information and documentation Digital records conversion and migration process

ISO Information and documentation Digital records conversion and migration process INTERNATIONAL STANDARD ISO 13008 First edition 2012-06-15 Information and documentation Digital records conversion and migration process Information et documentation Processus de conversion et migration

More information

How do Small Archives Steward their Moving Image and Sound Collections? A Qualitative Study

How do Small Archives Steward their Moving Image and Sound Collections? A Qualitative Study How do Small Archives Steward their Moving Image and Sound Collections? A Qualitative Study Anthony Cocciolo Society of American Archivists Research Forum July 25, 2017 Portland, Oregon Available at SAA

More information

Selecting an Electronic Records Repository Platform

Selecting an Electronic Records Repository Platform Selecting an Electronic Records Repository Platform How we conjured something from nothing A Presentation By Bryan Collars and Brian Thomas South Carolina Department of Archives and History BPE 2015 Topics

More information

Introduction to Archivists Toolkit Version (update 5)

Introduction to Archivists Toolkit Version (update 5) Introduction to Archivists Toolkit Version 2.0.0 (update 5) ** DRAFT ** Background Archivists Toolkit (AT) is an open source archival data management system. The AT project is a collaboration of the University

More information

DCH-RP Trust-Building Report

DCH-RP Trust-Building Report DCH-RP Trust-Building Report Raivo Ruusalepp Estonian Ministry of Culture DCH-RP and EUDAT workshop Stockholm, June 3 rd, 2014 Topics Trust in a digital repository Trust in a distributed digital repository

More information

HTM, HTML, MHT, MHTML Web document Brightspace Learning Environment strips the <title> tag and text within the tag from user created web documents

HTM, HTML, MHT, MHTML Web document Brightspace Learning Environment strips the <title> tag and text within the tag from user created web documents Dropbox basics What is Dropbox? Learners use the tool to upload and submit assignment submissions to assignment submission folders in Brightspace Learning Environment, eliminating the need to mail, fax,

More information

Digital Preservation Workshop

Digital Preservation Workshop Digital Preservation Workshop 5 November 2010 University of Calgary Peter Van Garderen Artefactual Systems Workshop Agenda 10:00 Introductions 10:15 What is digital preservation? From strategy to implementation

More information

SharePoint Archival Storage Strategies & Technologies January Porter-Roth Associates 1

SharePoint Archival Storage Strategies & Technologies January Porter-Roth Associates 1 SharePoint Archival Storage Strategies & Technologies January 2009 Porter-Roth Associates 1 Bud Porter-Roth Porter-Roth Associates 415-381-6217 budpr@erms.com http://www.erms.com Porter-Roth Associates

More information

Workshop Background. Purpose. Context. To provide you with resources and tools to help you know how to handle file format decisions as a researcher.

Workshop Background. Purpose. Context. To provide you with resources and tools to help you know how to handle file format decisions as a researcher. Workshop Background Purpose To provide you with resources and tools to help you know how to handle file format decisions as a researcher. Context Workshop Series: Preservation and Curation of ETD Research

More information

MEDIA RELATED FILE TYPES

MEDIA RELATED FILE TYPES MEDIA RELATED FILE TYPES Data Everything on your computer is a form of data or information and is ultimately reduced to a binary language of ones and zeros. If all data stayed as ones and zeros the information

More information

A Collaboration Model between Archival Systems to Enhance the Reliability of Preservation by an Enclose-and-Deposit Method

A Collaboration Model between Archival Systems to Enhance the Reliability of Preservation by an Enclose-and-Deposit Method A Collaboration Model between Archival Systems to Enhance the Reliability of Preservation by an Enclose-and-Deposit Method Koichi Tabata, Takeshi Okada, Mitsuharu Nagamori, Tetsuo Sakaguchi, and Shigeo

More information

Building for the Future

Building for the Future Building for the Future The National Digital Newspaper Program Deborah Thomas US Library of Congress DigCCurr 2007 Chapel Hill, NC April 19, 2007 1 What is NDNP? Provide access to historic newspapers Select

More information

State Government Digital Preservation Profiles

State Government Digital Preservation Profiles July 2006 2006 Center for Technology in Government The Center grants permission to reprint this document provided this cover page is included. This page intentionally left blank. Introduction The state

More information

Conch Appendix: Discovery Questionnaire. Questionnaire Summary

Conch Appendix: Discovery Questionnaire. Questionnaire Summary Conch Appendix: Discovery Questionnaire Project Acronym: PREFORMA Grant Agreement number: 619568 Project Title: PREservation FORMAts for culture information/e-archives Prepared by: MediaArea.net SARL Erik

More information

The Salesforce Migration Playbook

The Salesforce Migration Playbook The Salesforce Migration Playbook By Capstorm Table of Contents Salesforce Migration Overview...1 Step 1: Extract Data Into A Staging Environment...3 Step 2: Transform Data Into the Target Salesforce Schema...5

More information

Long-term digital preservation of UNSWorks

Long-term digital preservation of UNSWorks Long-term digital preservation of UNSWorks UNSW Library Arif Shaon, Maude Frances CAUL Community Days 2014 UNSW Australia The University of New South Wales at a Glance: https://www.unsw.edu.au/sites/default/files/documents/unsw4009_miniguide_2012_aw2_v2.pdf

More information

Digital Preservation in Theory and Practice

Digital Preservation in Theory and Practice PASIG Digital Preservation Boot camp Digital Preservation in Theory and Practice Preservation and Archiving Special Interest Group (PASIG) Boot Camp Tom Cramer Chief Technology Strategist Stanford University

More information

ZYNSTRA TECHNICAL BRIEFING NOTE

ZYNSTRA TECHNICAL BRIEFING NOTE ZYNSTRA TECHNICAL BRIEFING NOTE Backup What is Backup? Backup is a service that forms an integral part of each Cloud Managed Server. Its purpose is to regularly store an additional copy of your data and

More information

University of Maryland Libraries: Digital Preservation Policy

University of Maryland Libraries: Digital Preservation Policy University of Maryland Libraries: Digital Preservation Policy July 28, 2013 Approved by the Library Management Group: January 7, 2014 Digital Preservation Policy Task Force: Joanne Archer Jennie Levine

More information

Data Curation Profile Human Genomics

Data Curation Profile Human Genomics Data Curation Profile Human Genomics Profile Author Profile Author Institution Name Contact J. Carlson N. Brown Purdue University J. Carlson, jrcarlso@purdue.edu Date of Creation October 27, 2009 Date

More information