A structured workflow for implementing digital archiving standards in an organisation
Dr Peter MU Schmitz & Antony K Cooper
Logistics and Quantitative Methods, CSIR Built Environment
PO Box 395, Pretoria, 0001, South Africa
Email: {pschmitz, acooper}@csir.co.za
African Digital Scholarship & Curation 2009, Pretoria, 12-14 May 2009
Content
  Introduction
  Existing archiving standards
  Structured workflow
  Example
  Conclusions
  Acknowledgements
Slide 2
Introduction
National Archives and Records Service (NARS)
Responsible for:
  National archival heritage for use by the government and people of South Africa
  Promoting efficient, accountable and transparent government through the proper management and care of government records
NARS standards:
  Designing and maintaining records classification systems
  Identifying records with archival value
  Determining the conditions for the management of micrographic and electronic records systems
  Training public servants in records management
  Inspecting the records management practices of governmental offices
Slide 3
Introduction
NARS key documents:
  Performance criteria for records managers of governmental bodies
  Records management policy manual
  Managing electronic records in governmental bodies: Policy, principles and requirements
Other standards:
  ISO
  SANS
But: All these standards and guidelines can be overwhelming, and it can be difficult to understand how they all fit together.
Proposal: To use some form of structured workflow to implement a data archiving standard in a government department.
Slide 4
Existing archiving standards
Open Archival Information System (OAIS), published as ISO 14721:2003.
PANDORA (National Library of Australia)
PANDAS: a workflow-based archiving system using worktrays to assist a person to archive information in PANDORA
Important standards for South Africa:
  ISO 14721:2003, Reference Model for an Open Archival Information System (OAIS).
  ISO/TR 15801:2005, Electronic Imaging - Information stored electronically - Recommendations for trustworthiness and reliability (published locally as SANS 15801:2005).
  ISO 18938:2008, Imaging materials - Optical discs - Care and handling for extended storage.
Slide 5
Existing archiving standards
Important standards for South Africa:
  ISO/TR 18492:2005, Document Management Applications - Long term preservation of electronic document-based information.
  ISO 19005-1:2005, Document management - Electronic document file format for long term preservation. Part 1: Use of PDF 1.4 (PDF/A-1) (published locally as SANS 19005-1:2006).
  ISO 15489-1:2001, Information and documentation - Records management. Part 1: General (published locally as SANS 15489-1:2004).
  ISO/TR 15489-2:2001, Information and documentation - Records management. Part 2: Guidelines (published locally as SANS 15489-2:2004).
Most of the above are embedded in the key NARS documents.
Slide 6
Structured workflow
Definition:
  The automation of business processes during which documents and tasks are passed among participants according to a set of procedural rules and assigned roles (WfMC, 1998, as quoted in Li and Coleman, 2004:4).
Digital Curation Centre (UK) Workflow manual:
  The conception, storage, use and preservation of digital materials must all be undertaken according to the workflow plan that is best equipped to ensure longevity and sustained accessibility. (Digital Curation Centre, 2008b).
  Currently only a shell is available.
Slide 7
Example
To facilitate using the workflow, we have prepared a detailed example for planning and conducting one of the department's key annual surveys.
The example shows where and when the various archiving steps should be taken when the project is executed.
The survey plan consists of various milestones that need to be achieved when conducting the survey.
The various archiving steps are included in the appropriate milestones so as to guide the archiving process for this survey.
Slide 8
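One way to picture a survey plan with archiving steps embedded in its milestones is the minimal Python sketch below. It is an illustration only, not the tool used for the department's survey: the `Task` and `Milestone` classes, the `archiving` flag and the `archiving_steps` helper are hypothetical names, and the choice of which tasks count as archiving steps is our reading of the example plan.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Task:
    number: str
    name: str
    archiving: bool = False  # assumption: marks a task as an archiving step


@dataclass
class Milestone:
    number: str
    name: str
    tasks: List[Task] = field(default_factory=list)


def archiving_steps(plan: List[Milestone]) -> List[Tuple[str, str, str]]:
    """Return (milestone number, task number, task name) for every
    archiving step, in the order the plan is executed."""
    return [
        (milestone.number, task.number, task.name)
        for milestone in plan
        for task in milestone.tasks
        if task.archiving
    ]


# A small excerpt of the example survey plan (only a few tasks are shown).
plan = [
    Milestone("1", "Information Needs Determination", [
        Task("1.1", "Branches Submit Inputs"),
        Task("1.5", "Set up file plan to archive hardcopy with registry", archiving=True),
        Task("1.6", "Set up file plan to archive electronic data with registry", archiving=True),
    ]),
    Milestone("5", "Revision of Hardcopy Survey (pilot)", [
        Task("5.6", "Finalising the form"),
        Task("5.7", "Set archiving schedule and archiving period", archiving=True),
        Task("5.8", "Determine the location of the repository", archiving=True),
    ]),
]

for milestone_no, task_no, name in archiving_steps(plan):
    print(f"Milestone {milestone_no}: {task_no} {name}")
```

Keeping the archiving steps inside the milestones, rather than in a separate checklist, means a project manager who follows the plan in order meets each archiving step at the point where it has to be done.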
Example
Slide 9
START PROCESS
Columns: No; Milestones / Tasks; Responsibility (A, B, C); Verification Process; Performance Indicator
1 Information Needs Determination X X Report compiled
1.1 Branches Submit Inputs X % Inputs Received
1.2 B's Submit Inputs X X X % Inputs Received
1.3 Changes proposed after data analysis X X Changes proposed are implemented
1.4 External Inputs
1.5 Set up file plan to archive hardcopy with registry X X File plans should be determined in consultation between A and B's
1.6 Set up file plan to archive electronic data with registry x X
Slide 10
5 Revision of Hardcopy Survey (pilot) X Report Compiled
5.1 Question Formulation revision X
5.2 Revised survey Collation X
5.3 H Steering committee approval X X
5.4 Writing submission to HQ X
5.5 H approval X X
5.6 Finalising the form X
5.7 Set archiving schedule and archiving period (i.e. 10 or 20 years) X X
5.8 Determine the location of the repository X X
Slide 11
7 Revision + Distribution of Electronic Capture Tool X X Report Compiled
7.1 Revisions made to electronic tool X
7.2 Testing of tool X X
7.3 Documentation X
7.4 Distribution to provinces & user support X X X Guidelines for software usage; Training of Institutions on software tools
7.5 Determine file formats and storage media for archiving, including migration strategy X
Slide 12
10 Provincial Processing X Report Compiled
10.1 B Quality Control X X X % Compliant data
10.2 B Data capturing X X % capturing Errors
10.3 B Data Cleaning & Quality Control X X X Data Quality rating
10.4 Dataset Locked X Dataset numbering, version
10.5 Dataset technical report compiled X
10.6 Dataset & Technical Report couriered to A X X
10.7 Monitoring of B Process X X Revision of dataset
10.8 Archiving of Surveys (Hardcopy) at designated repository X X
Slide 13
13 Masterlist procedures and final publication of dataset X X Report Compiled
13.1 Generating NatIT Numbers to align with Masterlist X
13.2 Documentation of the above X
13.3 Receipt Process of B Quarterly Master List X X
13.4 Updating Masterlist with new institutions X X X % Updated files received
13.5 Documentation of the above X
13.6 Creating metadata X X
13.7 Matching to Masterlist (final integrated data compared with the Masterlist and aligned accordingly) X X
13.8 Make data available for internal interrogation X X
13.9 Data IS Report (indication of data availability) X
13.10 Perform change management procedures X
13.11 Documentation of the above X
13.12 Data Ready for Server X
14 Analysis & Reporting on the Data X Report Compiled
14.1 Analysis of responses received as answers to Questions X X
14.2 Documentation of the above X
14.3 Compilation of Survey Change request X
14.4 Dissemination of data
14.5 Archive electronic data at designated archive X
14.6 Create metadata X
14.7 Authenticate archived data X
14.8 Determine archive backup strategy and create backups X
14.9 Archive audit X
14.10 Create/update archive inventory X
END PROCESS Report Compiled
Slide 14
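The plan does not say how steps 14.7 (authenticate archived data), 14.9 (archive audit) and 14.10 (create/update archive inventory) are carried out. As one possible approach, and purely as an assumption on our part, the sketch below records a SHA-256 checksum for every file in the archive and later re-checks the files against that inventory. The function names `sha256_of`, `build_inventory` and `audit` are hypothetical; only Python's standard `hashlib` and `pathlib` modules are used.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(archive_dir: str) -> Dict[str, str]:
    """Map each file under archive_dir (relative path) to its checksum.
    Assumption: this stands in for 'create/update archive inventory'."""
    root = Path(archive_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def audit(archive_dir: str, inventory: Dict[str, str]) -> List[str]:
    """Return the inventoried files that are missing or whose checksum
    no longer matches. An empty list means every file was verified."""
    current = build_inventory(archive_dir)
    return sorted(
        name for name, checksum in inventory.items()
        if current.get(name) != checksum
    )
```

In use, `build_inventory` would be run when the data is archived and the result stored with the archive; `audit` would then be run against that stored inventory at each archive audit and after each restore from backup.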
Conclusions
We have described here a structured workflow for implementing digital archiving standards in an organisation.
We feel that this workflow is sufficiently general to be applicable to digital curation in other environments.
There are many standards available for archiving digital and analogue data and documents, but they come from various sources and it can be difficult to understand which standards should be used, and where.
This workflow provides an easy-to-use outline of the processes to be followed when archiving digital data, indicating which standards need to be used for each of the process steps in the workflow.
Slide 15
Acknowledgements
  The conference organisers for inviting us to participate in the conference.
  Our client for allowing us to use their archiving process as an example for this conference.
  CSIR Built Environment for financial support to attend the conference.
Slide 16
Thank you! Q & A.