PDF/A - The Basics. From the Understanding PDF White Papers PDF Tools AG

Similar documents
Chapter 9 Section 3. Digital Imaging (Scanned) And Electronic (Born-Digital) Records Process And Formats

PDF/arkivering PDF/A. Per Haslev Adobe Systems Danmark Adobe Systems Incorporated. All Rights Reserved.

The Making of PDF/A. 1st Intl. PDF/A Conference, Amsterdam Stephen P. Levenson. United States Federal Judiciary Washington DC USA

This document is a preview generated by EVS

Implementing a Standardized PDF/A Document Storage System with LEADTOOLS

ISO INTERNATIONAL STANDARD. Document management Engineering document format using PDF Part 1: Use of PDF 1.6 (PDF/E-1)

PDF Specification for IEEE Xplore (Part A-Core Requirements)

ISO PDF/A -Standard Archive file format standard for long-term preservation

Portal Subcommittee Agenda Thursday, May 10, :45-11:45. Criminal Case Initiation Workgroup Update Judge Martin Bidwill

ISO (PDF/A-2)

compart PDF/A-Support in Compart Products PDF/A White Paper White Paper May 2006

Smarter Document Capture

Emerging Trends in Records Management Technology. Jessie Weston, CRA 2018 MISA Conference October 11-12, 2018

ISO/IEC INTERNATIONAL STANDARD

PDF/A in the Product Lifecycle

Preparing PDF Files for ALSTAR

ISO INTERNATIONAL STANDARD

ISO Document management applications Electronic document file format enhancement for accessibility Part 1: Use of ISO (PDF/UA-1)

The Future of PDF/A and Validation

ISO/IEC INTERNATIONAL STANDARD. Information technology Coding of audio-visual objects Part 22: Open Font Format

ISO/IEC This is a preview - click here to buy the full publication INTERNATIONAL STANDARD. First edition

ISO/IEC INTERNATIONAL STANDARD. Information technology Office equipment Print quality attributes for machine readable Digital Postage Marks

GDUFA s Impact on ectds / Case Study: Implementing PDF Standards

ISO INTERNATIONAL STANDARD

Graham Taylor.

ISO/IEC INTERNATIONAL STANDARD. Conformity assessment Supplier's declaration of conformity Part 1: General requirements

ISO INTERNATIONAL STANDARD. Intelligent transport systems Communications access for land mobiles (CALM) 2G Cellular systems

ISO/TR TECHNICAL REPORT. Information and documentation Implementation guidelines for digitization of records

ISO INTERNATIONAL STANDARD. Document management Electronic document file format for long-term preservation Part 1: Use of PDF 1.

PDF/A in Healthcare. Dr. Bernd Wild. intarsys. Webinar: PDF/A in Healthcare Dr. Bernd Wild.

Preserving PDF at the coalface

ISO/IEC INTERNATIONAL STANDARD. Information technology Software asset management Part 2: Software identification tag

TCG. TCG Certification Program. TNC Certification Program Suite. Document Version 1.1 Revision 1 26 September 2011

Creating PDF/A using Office Apps which Include 1- button PDF Creators

DOCUMENT NAVIGATOR SALES GUIDE ADD NAME. KONICA MINOLTA Document Navigator Sales Guide

Guidelines for PDF Creation from Office Applications

ISO INTERNATIONAL STANDARD. Information and documentation The Dublin Core metadata element set

ISO/IEC INTERNATIONAL STANDARD. Conformity assessment Supplier's declaration of conformity Part 2: Supporting documentation

How to Create a PDF. Using Acrobat Distiller. Acrobat Distiller settings. Adobe Acrobat Professional 8.0 Guide

ISO/IEC INTERNATIONAL STANDARD

ISO INTERNATIONAL STANDARD. Information and documentation Records management Part 1: General

Ten Ways to Share Your Publications With the World: A Guide to Creating Accessible PDF Documents in Adobe Acrobat Professional 7.

INSTRUCTIONS FOR CREATING PDF/A-COMPLIANT FILES FOR ONLINE PUBLISHING AT THE TU BERLIN.

PDF/A for Scanned Documents

Introducing PDF/UA. The new International Standard for Accessible PDF Technology. Solving PDF Accessibility Problems

ISO INTERNATIONAL STANDARD. Geographic information Quality principles. Information géographique Principes qualité. First edition

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia application format (MPEG-A) Part 4: Musical slide show application format

Best practices for producing high quality PDF files

Using Save As to Conform to PDF/A

ISO/IEC TR TECHNICAL REPORT. Information technology Procedures for achieving metadata registry (MDR) content consistency Part 1: Data elements

ISO/IEC INTERNATIONAL STANDARD. Information technology CDIF transfer format Part 3: Encoding ENCODING.1

ISO/IEC INTERNATIONAL STANDARD. Information technology Security techniques Key management Part 4: Mechanisms based on weak secrets

ISO/IEC INTERNATIONAL STANDARD

ISO/IEC Information technology Icon symbols and functions for controlling multimedia software applications

ISO/IEC INTERNATIONAL STANDARD. Information technology Coding of audio-visual objects Part 18: Font compression and streaming

This is a preview - click here to buy the full publication TECHNICAL REPORT. Part 101: General guidelines

ISO/IEC TR TECHNICAL REPORT. Information technology Coding of audio-visual objects Part 24: Audio and systems interaction

ISO/IEC INTERNATIONAL STANDARD. Conformity assessment General requirements for third-party marks of conformity

ISO INTERNATIONAL STANDARD. Graphic technology Variable printing data exchange Part 1: Using PPML 2.1 and PDF 1.

ISO/IEC INTERNATIONAL STANDARD. Information technology Keyboard interaction model Machine-readable keyboard description

ISO INTERNATIONAL STANDARD. Information and documentation Managing metadata for records Part 2: Conceptual and implementation issues

ISO INTERNATIONAL STANDARD

ISO/IEC INTERNATIONAL STANDARD

Scanshare Sales Guide V1.2

ISO 3901 INTERNATIONAL STANDARD. Information and documentation International Standard Recording Code (ISRC)

TIFF-PDF/A White Paper

ISO/IEC INTERNATIONAL STANDARD. Information technology Icon symbols and functions for controlling multimedia software applications

ISO/IEC Information technology Coding of audio-visual objects Part 15: Advanced Video Coding (AVC) file format

ISO/IEC INTERNATIONAL STANDARD. Software engineering Lifecycle profiles for Very Small Entities (VSEs) Part 2: Framework and taxonomy

INTERNATIONAL STANDARD

Welcome to the Introduction to Concordance On Demand Training Series!

ISO INTERNATIONAL STANDARD

INTERNATIONAL STANDARD

ISO INTERNATIONAL STANDARD. Intelligent transport systems Communications access for land mobiles (CALM) Mobile wireless broadband using HC-SDMA

ISO/IEC TR TECHNICAL REPORT

ISO/IEC Information technology Automatic identification and data capture techniques Bar code scanner and decoder performance testing

ISO/IEC INTERNATIONAL STANDARD. General requirements for the competence of testing and calibration laboratories

ISO INTERNATIONAL STANDARD. Fire safety Vocabulary. Sécurité au feu Vocabulaire. This is a preview - click here to buy the full publication

Adobe Acrobat 7.0 Professional The complete solution for document collaboration and PDF print production.

ISO/IEC INTERNATIONAL STANDARD

ISO/IEC INTERNATIONAL STANDARD. Information technology Security techniques Information security management system implementation guidance

How. Can Acrobat Help My Bar Association? Catherine Sanders Reach ABA Legal Technology Resource Center

ABBYY FineReader 14 YOUR DOCUMENTS IN ACTION

POLICY AND GUIDELINES FOR THE RETENTION AND DISPOSITION OF ORIGINAL COUNTY RECORDS COPIED ONTO OPTICAL IMAGING AND DATA STORAGE SYSTEMS

PDF and Accessibility

SharePoint Archival Storage Strategies & Technologies January Porter-Roth Associates 1

ISO/TR TECHNICAL REPORT. Health Informatics Dynamic on-demand virtual private network for health information infrastructure

ISO/IEC INTERNATIONAL STANDARD. Systems and software engineering Measurement process. Ingénierie des systèmes et du logiciel Processus de mesure

This document is a preview generated by EVS

ACCESSIBILITY IN PDF FORMS

ISO/IEC INTERNATIONAL STANDARD. Information technology ECMAScript for XML (E4X) specification

ISO INTERNATIONAL STANDARD. Ergonomics of human-system interaction Part 110: Dialogue principles

ISO/IEC INTERNATIONAL STANDARD. Identification cards Machine readable travel documents Part 3: Machine readable official travel documents

ISO INTERNATIONAL STANDARD

Transcription:

White Paper PDF/A - The Basics From the Understanding PDF White Papers PDF Tools AG Why is PDF/A necessary? What is the PDF/A standard? What are PDF/A-1a, PDF/A-1b, PDF/A2? How should the PDF/A Standard be applied? Is PDF/A the answer to long-term archiving? Version: 2.0 Date: January 22, 2007 Copyright 2007 PDF Tools AG. All Rights Reserved. Other names and brands may be claimed as the property of others. Information regarding third party products is provided solely for educational purposes. PDF Tool AG is not responsible for the performance or support of third party products and does not make any representations or warranties whatsoever regarding quality, reliability, functionality, or compatibility of these devices or products.

White Paper - PDF/A Page 2 of 9 Contents Contents...2 Introduction...3 Background... 3 Why the PDF/A initiative?... 3 The PDF/A Standard...5 The Goal of PDF/A... 5 PDF vs PDF/A... 5 The PDF/A, A-1a, A-1b, A-2 "Babylon"... 5 Using the Standard...7 Obtaining a Copy... 7 Who should read the PDF/A Standard... 7 What tools are available?... 7 PDF/A requires a complete solution... 7 Summary...8 PDF/A as the new archiving standard... 8 How will the market react... 8 Hot air or a long-term solution?... 8

White Paper - PDF/A Page 3 of 9 Introduction PDF/A - A new Standard for Long-Term Archiving Background On September 28, 2005 the International Standards Organization (ISO) approved a new Standard governing electronic document archiving: ISO-19005-1 - Document management - Electronic document file format for long-term preservation - Part 1: Use of PDF 1.4 (PDF/A-1). This standard is a product of over 3 years of meetings, discussions, and review by organizations and companies worldwide. The initiative to create a standard format for electronically archived documents based on Adobe's PDF format was launched in May 2002 in the United States by AIIM (Association for Information and Image Management), the NPES (National Printing Equipment Association) and the Administrative Office of the U.S. Courts. The kick-off meeting was held in October 2002 and included numerous electronic documentation users and PDF suppliers including Adobe Systems, Library of Congress, Surety Inc., Quality Associates Inc., Appligent, Merck, EMC, PDF Sages, and NARA (National Archives & Records Administration). Subsequent attendees included Xerox, Honeywell, EDS, and Glaxo Smith Kline to name a few. The US initiative prepared a first draft and submitted their project to the ISO to be registered as an international Standard. The ISO assigned the project to a Technical Committee (TC 171 - Document Management Applications). TC 171 is comprised of 13 participating countries (who each have one vote) and 21 observer countries. After numerous reviews and amendments the Standard was approved by the ISO participating companies in September 2005. PDF/A was approved as an international standard in September 2005. Numerous organizations, users and suppliers were involved in the process. Why the PDF/A initiative? Approved archiving formats vary from country to country. Traditional archiving methods (paper and microfilm / microfiche) guarantee reproducibility but are outdated for modern technology. Large documents cannot be quickly sent around the globe and it is extremely difficult to search archived documents for specific content. In a first step towards electronic archiving, many organizations implemented TIFF archives. TIFF guarantees reproducibility in the long-term and has an established structure. TIFF is also easy to transmit in a worldwide business environment but is not easily searchable. A movement then began towards PDF. PDF is a more attractive archiving format than TIFF for a variety of reasons: PDF stores structured objects (e.g. text, vector graphics, raster images), allowing for an efficient full-text search in an entire archive. TIFF is a raster format and must first be scanned with an OCR engine (optical character recognition) before it can be searched. New technologies have paved the way for archiving documents electronically. PDF has numerous advantages over TIFF format.

White Paper - PDF/A Page 4 of 9 PDF files are more compact and require only a fraction of the memory space of respective TIFF files, often with a better quality. The smaller file size is especially advantageous for electronic file transfer (FTP, e-mail attachment etc.). Metadata like title, author, creation date, modification date, subject, keywords, etc. can be embedded in a PDF file. PDF files can be automatically classified based on the metadata, without requiring human intervention. Page content in a PDF document is usually device-independent, i.e. it does not depend on a specific raster resolution, color system etc. The pages are first mapped to a raster when they are reproduced for viewing or printing (rendering process). PDF will therefore profit from technological advances in reproduction devices (printers, monitors etc.) even long into the future. The inventor of the de facto PDF Standard, Adobe Systems, has published seven new versions of their "PDF Reference Manual" during the past 12 years. Each new version has enriched the format with countless new features and has updated some of the older features. It was therefore necessary to define a stable derivative of the PDF format, based on Adobe's proprietary PDF specification, that could be internationally accepted as a Standard for long-term electronic archiving. The result: PDF/A.

White Paper - PDF/A Page 5 of 9 The PDF/A Standard The Goal of PDF/A ISO 19005-1 defines "a file format based on PDF, known as PDF/A, which provides a mechanism for representing electronic documents in a manner that preserves their visual appearance over time, independent of the tools and systems used for creating, storing or rending the files." (from ISO 19005-1). The Standard does not define an archiving strategy or the goals of an archiving system. It identifies a "profile" for electronic documents that ensures the documents can be reproduced in years to come. A key element to this reproducibility is the requirement for PDF/A documents to be 100 % self-contained. All of the information necessary for displaying the document in the same manner every time is embedded in the file. This includes all visible content like text, raster images, vector graphics, fonts, color information and much more. A PDF/A document however is not permitted to be reliant on any information from direct or indirect external sources, for example links to external image files or font that are not embedded. PDF vs PDF/A PDF in its native form cannot guarantee long-term reproducibility and not even the "WYSISYG" principle (what you see is what you get). Certain restrictions and amendments had to be incorporated into the Standard. To be accepted, PDF/A needed to be based on an existing version of the PDF Reference and not on anticipated functionality in a future version. The ISO TC 171 chose the Adobe PDF Reference 1.4, which Adobe implemented in Acrobat 5, as the basis for the Standard. The ISO Standard states that PDF/A "shall adhere to all requirements of PDF Reference as modified by this part of ISO 19005". The Standard itself identifies only differences with respect to the PDF Reference. In order to fully understand PDF/A, you have to also understand the PDF Reference 1.4. Certain functionality allowed in PDF 1.4 has been specifically excluded from PDF/A, for example transparency and sound and movie actions. There are also elements described in the PDF Reference 1.4 that are not mandatory. PDF/A on the other hand requires these elements to be implemented, for example embedded fonts. In short, PDF/A is based on the PDF Reference 1.4, with specific features being either mandatory, recommended, restricted, or prohibited. The PDF/A, A-1a, A-1b, A-2 "Babylon" PDF/A has been established as a row of standards with several parts. Currently only PDF/A-1 (Part 1) has been approved. PDF/A-1 is further subdivided into two levels of compliance: PDF/A-1a and PDF/A-1b. PDF/A-1a (referred to as Level A Conformance) denotes full compliance with the currently approved PDF/A Standard (ISO 19005-1): Part 1. PDF/A files are self-contained. All of the information required to display a document the same way every time is embedded in the file. PDF/A is currently based on the PDF Reference Version 1.4. There are different levels of PDF/A. PDF/A-1a denotes full compliance with the currently approved PDF/A Standard.

White Paper - PDF/A Page 6 of 9 There is also a "minimal compliance" level for PDF/A: PDF/A-1b (referred to as Level B Conformance). PDF/A-1b requirements are meant to ensure that the rendered visual appearance of the file is reproducible over the long-term. PDF/A-1a and PDF/A-1b differ primarily with respect to text extraction. PDF/A-1a ensures the preservation of a document's logical structure and content text stream in natural reading order. The text extraction is especially important when the document must be displayed on a mobile device (for example a PDA) or other devices in accordance with Section 508 of the US Rehabilitation Act. In such cases the text must be reorganized on the limited screen size (re-flow). This feature is also known as "Tagged PDFs". PDF/A-1b ensures that the text (and additional content) can be correctly displayed (e.g. on a computer monitor), but does not guarantee that extracted text will be legible or comprehensible. It therefore does not guarantee compliance with Section 508. The difference between PDF/A-1a and -1b has no impact for scanned documents, provided the files have not been enhanced by means of OCR for searching. A new part to the standard, ISO 19005-1, Part-2 (PDF/A-2), is currently being worked on by the Technical Committee. PDF/A-2 will address some of the new feature added with versions 1.5, 1.6 and 1.7 of the PDF Reference. PDF/A-2 should be backwards compatible, i.e. all valid PDF/A-1 documents should also be compliant with PDF/A-2. However PDF/A-2 compliant files will not necessarily be PDF/A-1 compliant. PDF/A- 1b does not guarantee full text extraction functionality (as required by Section 508 of the US Rehabilitation Act). PDF/A-2 is currently being worked on and is taking into account the PDF Reference Versions 1.5, 1,6 and 1.7

White Paper - PDF/A Page 7 of 9 Using the Standard Obtaining a Copy The ISO 19005-1 Standard can be purchased from the ISO website. Copies can be ordered in paper or in PDF format and, like all other ISO Standards, they are protected by copyright. For this reason, publishing freely available version on the internet is illegal. The Standard is currently only available in English. Who should read the PDF/A Standard PDF/A format is meant to support and enhance a good archiving strategy. The Standard itself is quite technical and can only be fully understood by experts with a fundamental knowledge in page description languages like PostScript and PDF. The main standard is fairly small, however the volume of related documents is enormous. The PDF Reference alone contains almost 1000 pages, excluding the additional reference documents like font formats, XML specification, compression formats, RFCs etc. In addition, PDF/A alone does not guarantee long-term archiving. A good approach is to enlist an expert who will help you understand the requirements of PDF/A, determine how to implement PDF/A into your archiving strategy, and explain what steps you need to take to ensure your overall archiving goals can be met. The PDF/A Standard (ISO 19005-1) can be obtained at: www.iso.org. PDF/A is complex: alone the PDF Reference 1.4 has almost 1000 pages. What tools are available? Tools for creating, processing and validating PDF/A documents have been on the market since mid 2006. Adobe itself has integrated respective functions in Version 8 of the Adobe Acrobat, released in the fall of 2006. Even Microsoft has made available a separate downloadable plug-in for their new Office 2007 package, allowing for the creation of PDF/A compliant files directly from Office products. Due to the number of products that have already appeared for creating PDF/A, it has become extremely important to properly verify if the PDF/A documents are fully compliant with the ISO standard. PDF/A requires a complete solution PDF/A is only part of a complete archiving solution. PDF/A alone does not guarantee long-term archiving and it does not guarantee that information will be displayed as desired. PDF/A also does not claim that a PDF/A-based archive is always the best solution. However, it you decide to use PDF, then PDF/A defines a set of requirements that make long-term archiving possible. Other aspects that must be taken into account when implementing a PDF/Acompliant archive include, for example, corporate standards and procedures, reliable data sources, reliable fonts, quality management and special individual requirements. The migration of current paper- or TIFF-based archives to PDF/A compliant archives is not an insignificant task and must be well planned. PDF/A is only one element of a complete archiving strategy. PDF/A alone cannot guarantee long-term archiving, but it makes longterm archiving possible.

White Paper - PDF/A Page 8 of 9 Summary PDF/A as the new archiving standard PDF/A is expected to establish itself as the new electronic archiving standard. PDF is prevalent in public and private sectors worldwide and is already an accepted archiving format in countless markets. The PDF/A Standard will help ensure that users get the guarantee of long-term reproducibility. The establishment of the PDF/A Standard will probably (and it should) also have an impact on the future development of PDF itself. Adobe will continue to improve its PDF offerings and add new technology. Examples of these are 3D und XFA for dynamic PDF forms. The Standard will therefore come under further pressure, because a principle concept behind standards, and especially an archiving standard, is that they remain constant and don't regularly change. How will the market react? Don't expect the market to be flooded with PDF/A products in the short term. Understanding the technology behind PDF/A requires considerable knowledge. In addition, users have higher quality expectations for standards-compliant software. The first tools have been on the market since mid 2006. In demand are tools for creating and validating PDF/A compliant documents, as well as simple conversions of existing PDF files into compliant PDF/A files. The appearance of the first professional PDF/A tools has initiated processes for implementing PDF/A compliant archives. One cannot however expect too much functionality too quickly. Plan on only the more restrictive PDF/A-1b format being readily available, with the full functionality of PDF/A-1a coming later. Expect also to find numerous products that claim to support PDF/A, but don't really. Expertise for evaluations and serious suppliers will especially be in demand during the market introduction phase. PDF/A is expected to quickly establish itself as a preferred electronic archiving format. PDF/A tools are readily available. Verify the quality of the tool and the expertise of the supplier. Hot air or a long-term strategy? PDF/A is not being viewed as just "hot air". The drive towards PDF-based archives has been going on for years, and PDF is already an established archiving format. The PDF/A Standard will help ensure the long-term preservation of the electronic files. With Microsoft now supporting the direct generation of PDF/A from their new Office products, the signal is loud and clear. PDF/A, internationally accepted, is here to stay.

White Paper - PDF/A Page 9 of 9