ISO PDF/A Standard
Archive file format standard for long-term preservation

Marc Straat
22 March 2005
Project ArchiSafe
Working Groups on National & International Standards: Legal Framework, Procedures, Formats, Definitions

Agenda
- PDF: An Introduction
- PDF: From Public Spec, De Facto to De Jure (ISO)
- Requirements for Electronic Archives
- PDF/A: Archive file format ISO standard
- Some observations: Archiving standards for ERM/DocMgt

Adobe's philosophy TIFF

PDF: Portable Document Format
- View and print on any platform: UNIX/Linux, Mac OS, Microsoft Windows, Symbian, i-mode
- Preserves fonts, images, graphics, and layout of any source document
- Model: free reader, paid writer
- PDF files are compact and complete, and can be shared, viewed, and printed by anyone with the free Reader software
- Electronic format to read any document
  - From e.g. MS Word, scanned paper, and other documents into PDF
  - JPEG, MPEG, audio formats
- Also PDF/XML for electronic forms and electronic documents in workflows

PDF Main Features
- Free reader to view a PDF file
  - More features through Adobe server products
- Tagged PDF files contain information on content and structure
  - To repurpose content, e.g. for mobile devices
  - Accessibility: screen readers
- Digital rights to document / digital signatures
  - Document control and fidelity
  - Secure distribution and exchange of electronic documents and forms

PDF's Intellectual Property: Public Specs
- PDF as the format for electronic documents is public
  - Download at www.adobe.com or buy at www.amazon.co.uk
- Adobe owns the copyright of the specification
- PDF's public specifications
  - For 3rd-party PDF tool developers to develop their own versions of PDF software
  - Create, generate, or manipulate PDFs

Multiple PDF tools on the market
- PDF's public specifications attracted nearly 2000 3rd-party PDF tool developers
- Independent commercial implementations
  - PDF creation tools and PDF viewers from Canon, HP, Apple, Oracle, ScanSoft, and Sun
- Open-source PDF implementations
  - OpenOffice and Ghostscript
- Communities
  - PlanetPDF, PDFzone

Multiple PDF tools on the market (cont'd.)
- A downside
  - Created PDFs can differ and behave inconsistently
  - Potentially hinders interoperability
- Demand for standardisation
  - To enable interoperability and quality levels
  - Longevity of specs/standard beyond Adobe's lifetime
  - For critical application areas within electronic documents: pre-press, archiving, accessibility, engineering

PDF-based ISO standards: existing and in development
- Publishing: PDF/X
  - Predictable printing of digital files anywhere in the world
  - ISO ratified in 2001
- Archiving: PDF/A
  - Reliable file format for archiving and preservation
  - Draft International Standard April 2005, ISO ratified 2005
- Accessibility: PDF/Access
  - Support for assistive technology
  - WG started 2004
- Architecture, Engineering, and Construction: PDF/E
  - Exchange of complex technical documents for architecture, engineering, construction, manufacturing, and geospatial industries
  - WG started 2004, NWI released March 2005

From Specs and De Facto to ISO standard
- PDF as a commercial application meeting market requirements
- Open-source and commercial 3rd-party PDF tool developers
- PDF specifications made public
- PDF-based standards, ratified by ISO

Specifications and Standardisation
- PDF Reference specifications as a base for ISO standardisation
  - Subset of PDF Reference 1.4 as base for PDF/A-1, to be ISO ratified
- Two interrelated speed levels
  - Speed level 1: determined by Adobe's speed of innovation
  - Speed level 2: determined by ISO's standardisation process

Leverage innovative specs for standardisation
- ISO-ratified standards based on the PDF Reference: PDF/X, PDF/A-1, PDF/A-2 (?)
- PDF Reference public specifications: PDF 1.3, PDF 1.4, PDF 1.5, PDF 1.6, PDF 1.x

Archiving Issues
- How do you preserve an electronic document today or 30 years from today?
- How do you preserve both paper-based and electronic records in a consistent format?
- How do you provide consistency in the integrity of your archives?
- How do you ease the search for an archived document?

Which archiving file format?
- Other formats and technologies in use as standards: TIFF, SGML, HTML, ASCII, PDF
- But also: word processing formats, spreadsheet formats, etc.

Scope of PDF/A
- The International Standard specifies the use of the Portable Document Format (PDF) in a form suitable for the long-term preservation of electronic documents
- Based on the business and technical needs of governments, regulated industries, corporations, educational institutions, and libraries

ISO PDF/A Participants and Process
- Joint initiative of accredited standards bodies
  - AIIM International (co-ordination)
  - NPES
- Working group with many vendors, universities, and government agencies, e.g. from the UK, US, Sweden, Japan, Australia, France
- International Organization for Standardization (ISO)
  - ISO work initiated Fall 2002
  - ISO Draft International Standard (DIS) March 2005

ISO PDF/A Standards Process & Timeline
- Work organized by accredited standards bodies
  - AIIM International (the Association for Information and Image Management)
  - NPES (the Association for Suppliers of Printing, Publishing and Converting Technologies)
- International Organization for Standardization (ISO) status
  - March 2003: issued as a New Work Item (NWI)
  - October 2003: ISO Working Draft issued
  - December 2003: ISO Committee Draft (CD)
  - 31 March 2004: ISO international meetings (New York)
  - April 2004: second ISO CD
  - 4-5 October 2004: international meeting, BSI/National Archives, London
  - DIS by March 2005, followed by ISO ratification mid 2005

Why a Standard Version of PDF
- PDF is too powerful and flexible
- Higher degree of reliability than required by the published specification
- Compatibility into the future
- Reliable migration
- Developed and maintained by an external organization

PDF/A Specifies
- Subset of the PDF Reference Version 1.4
- Specifies constrained features
- Specifies required features
- Specifies prohibited features
(Diagram: PDF/A within the PDF 1.4 Reference)

Requirements of PDF/A
- Constrains, requires, prohibits features
  - No external resources needed
  - All fonts embedded
  - No encryption
  - No embedded executables
  - Metadata
- Guidance documents
  - Business overview
  - Technical overview
  - Guidance for industry
  - Implementation documents

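The requirements listed above can be illustrated with a rough pre-check. The sketch below is not a PDF/A validator: it only scans the raw bytes of a file for a few PDF key names that correspond to the bullets (`/Encrypt` for encryption, `/JavaScript` and `/Launch` for executable content, `/FontFile` for embedded font programs, an XMP packet for metadata). The mapping from bullets to key names, the function name `quick_red_flags`, and the messages are assumptions made for illustration. A byte scan misses anything stored in compressed streams, and the font check is deliberately coarse.

```python
from typing import List

# Hypothetical mapping from PDF key names to the slide's "prohibited" bullets.
RED_FLAGS = {
    b"/Encrypt": "encryption dictionary present",
    b"/JavaScript": "JavaScript action present",
    b"/Launch": "launch action (external executable) present",
}


def quick_red_flags(pdf_bytes: bytes) -> List[str]:
    """Return human-readable warnings found by a naive byte scan."""
    findings = []
    if not pdf_bytes.startswith(b"%PDF-"):
        findings.append("missing %PDF- header")
    for token, message in RED_FLAGS.items():
        if token in pdf_bytes:
            findings.append(message)
    # Fonts are described by /FontDescriptor; an embedded font program
    # appears as /FontFile, /FontFile2 or /FontFile3 (all share the prefix).
    # This only detects the case where no font program is embedded at all.
    if b"/FontDescriptor" in pdf_bytes and b"/FontFile" not in pdf_bytes:
        findings.append("font descriptor without embedded font program")
    # Metadata: look for an uncompressed XMP packet.
    if b"<x:xmpmeta" not in pdf_bytes and b"<rdf:RDF" not in pdf_bytes:
        findings.append("no XMP metadata packet found")
    return findings


if __name__ == "__main__":
    sample = b"%PDF-1.4\n1 0 obj << /Type /Catalog >> endobj\ntrailer << /Encrypt 5 0 R >>"
    for warning in quick_red_flags(sample):
        print(warning)
```

An empty result only means that none of these particular markers were seen; conformance still has to be established with a real PDF/A validation tool.
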
ISO/TC171/SC2 Document Imaging Applications
PDF/A Conversion, Archive, Store
Concluding
- Native formats are not suitable for archiving
  - Right now: avoid vendor-proprietary archive formats, use PDF and TIFF instead
  - Originating application might not be backwards compatible
- Use PDF/A when business context is important
  - PDF/A represents not only the data contained in the document, but also the exact form of the document
  - A PDF/A file can be viewed without the originating application
- Document fidelity and accessibility through XML metadata (RDF/XML, XMP), providing a natural connection to an archive DMS

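As an illustration of how XMP metadata can connect a file to an archive DMS, the sketch below reads the PDF/A identification properties from an XMP packet using only the Python standard library. It assumes the `pdfaid` namespace and the `part` / `conformance` properties as defined in the published ISO 19005-1, which postdates these slides. The function name `read_pdfa_id` and the sample packet are hypothetical, and the sample omits the `<?xpacket ...?>` wrapper that real packets usually carry.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"  # namespace from ISO 19005-1

# Hypothetical sample packet, trimmed to the identification properties.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def _property(description: ET.Element, name: str) -> Optional[str]:
    """Read a pdfaid property written either as an attribute or as a child element."""
    qualified = "{%s}%s" % (PDFAID_NS, name)
    value = description.get(qualified)
    if value is not None:
        return value
    child = description.find(qualified)
    if child is not None and child.text:
        return child.text.strip()
    return None


def read_pdfa_id(xmp_packet: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (part, conformance) if the packet claims PDF/A, else None."""
    root = ET.fromstring(xmp_packet)
    for description in root.iter("{%s}Description" % RDF_NS):
        part = _property(description, "part")
        if part is not None:
            return part, _property(description, "conformance")
    return None


if __name__ == "__main__":
    print(read_pdfa_id(SAMPLE_XMP))
```

A DMS could index such values at ingest. The properties are only a claim made by the file, so they do not replace validation.
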
Archiving standards for ERM/DocMgt (1)
- PRONOM/TNA2002
  - By The National Archives (TNA, UK)
  - Tight specifications, for public sector use only
  - Heavy testing to certify vendors -> costly/lengthy
- MoReq v1 (2001)
  - Funded by EC, high profile and backing
  - Move to a European approach/standard
  - Lessons learnt from NARA and VERS

Archiving standards for ERM/DocMgt (2)
- MoReq
  - Looks beyond functional requirements of records management
  - Also attractive to private sector -> Deutsche Bank
- TNA decided to stop TNA2002 testing for certification and transition to MoReq
  - Upgrade to MoReq v.2 will include elements of existing PRONOM/TNA2002
  - EC funding not granted yet
- Alternative to TNA2002/MoReq: ISO 15489
  - However, it is seen as too generic -> doesn't define authenticity, reliability, integrity, usability

For More Information
- PDF/A standard web site: www.aiim.org/pdf_a
- Contacts: Marc Straat, Head of Standards Development Europe, mstraat@adobe.com