Implementing a Standardized PDF/A Document Storage System with LEADTOOLS

Introduction

Electronic document archival has evolved far beyond the simple days of scanning a paper document and saving it as an image or PDF. Nowadays, many documents never exist in physical form and may be in any of many open or proprietary formats. Adding to the disparity caused by varying file formats is how and where files are stored. Many enterprises have their documents spread across numerous "data islands" including local computers, networked file shares and cloud services. Finally, the prevalence of mobile devices and tablets, which may or may not support some formats, further reinforces the need for standardized document archival.

Companies run on information, and as digital archives grow in both scale and diversity, the ability to efficiently and accurately find data within them often fails to keep up. PDF/A is built for this purpose, but migrating all of your various file formats remains a challenge, since raster image formats such as TIFF and JPEG offer little to nothing to search on beyond the file name. This white paper will explore how to take full advantage of PDF/A as your universal document storage format by using the state of the art technology within LEADTOOLS Document Imaging SDKs.

Creating a Searchable Document Archive with PDF/A

For years, PDF has been widely recognized and adopted as the best format for document archival, content management, record retention, risk management, litigation and discovery. This is especially true for the PDF/A sub-format, which is specifically designed with archival and future-proofing in mind. PDF/A is completely self-contained and stores fonts, color management information, annotations, images and more within the file itself. This ensures the document will stay true and not change its appearance for years on end while operating systems, devices, monitors and default fonts change all around it.

Normalizing your archive will yield many benefits in storage allocation, productivity and costs. The problem of being able to find and view your documents is drastically reduced since PDF is such a widely supported format. Making the choice to use PDF/A as your sole document archival format is certainly wise, but it solves only a small part of the overall problem. Yet to be addressed are the issues of converting a divergent archive and ensuring that all further storage is done in a uniform fashion.
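Converting a divergent archive starts with knowing what it contains. The following is a minimal sketch, not part of LEADTOOLS, that sorts files by the kind of processing they would likely need on the way to PDF/A. The `ArchiveInventory` class, the `ConversionPath` enum and the list of raster extensions are all hypothetical names and assumptions introduced for illustration; a real archive would need a longer list and content-based detection rather than file extensions alone.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical buckets for the conversion work a file would need.
public enum ConversionPath
{
    AlreadyPdf,   // a PDF, but still to be checked or normalized to PDF/A
    NeedsOcr,     // raster image with no text to search
    NeedsConvert  // everything else: office documents, email, vector, ...
}

// Hypothetical helper (not a LEADTOOLS class).
public static class ArchiveInventory
{
    // Assumed raster extensions; extend this to match the actual archive.
    private static readonly HashSet<string> RasterExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".tif", ".tiff", ".jpg", ".jpeg", ".png", ".bmp", ".gif" };

    // Classifies a single file by its extension only.
    public static ConversionPath Classify(string path)
    {
        string ext = Path.GetExtension(path);
        if (string.Equals(ext, ".pdf", StringComparison.OrdinalIgnoreCase))
            return ConversionPath.AlreadyPdf;
        return RasterExtensions.Contains(ext)
            ? ConversionPath.NeedsOcr
            : ConversionPath.NeedsConvert;
    }

    // Counts the files under a root folder for each conversion path.
    public static Dictionary<ConversionPath, int> Summarize(string root)
    {
        return Directory
            .EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .GroupBy(p => Classify(p))
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Running `ArchiveInventory.Summarize` over each "data island" gives a rough estimate of how much of the archive is raster content that would need OCR and how much only needs format conversion.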

A handful of applications and scanners natively come with the ability to save as PDF, but these can be unnecessary and cost-prohibitive. In addition, documents can come from many sources both inside and outside your organization, so at some level your documents must be processed and converted. Without a well designed and automated process, the benefits of a normalized archive are hard to fully realize. Many organizations therefore shy away from going fully digital due to the challenges involved in properly correcting and maintaining their newly envisioned document storage system. They feel trapped, knowing that they need to change but not how to accomplish their goals in a holistic and cost-effective manner.

Making it All Possible with LEADTOOLS Document Imaging SDKs

If all or part of this situation sounds familiar, look no further than LEADTOOLS. Its Document Imaging SDKs cover the gamut of imaging technology needed to make a universal PDF/A document archive a reality.

Full PDF and PDF/A File Format Support

LEADTOOLS provides full control over the PDF format, including advanced capabilities such as extracting text, hyperlinks, bookmarks and metadata as well as updating, splitting and merging pages from existing PDF documents. With LEAD Technologies' decades of expertise in image compression, its PDF SDK also offers the industry's best performing and most diverse PDF compression options including JBIG, JPEG2000 and Mixed Raster Content. Also included are features often difficult to find in similar commercial SDKs, including reading, displaying, editing and writing native PDF annotations and markup that work seamlessly with Adobe Acrobat and other compliant PDF viewers. Rather than being at the mercy of the PDF file format and the often exorbitant costs of PDF editing capabilities, LEADTOOLS will open up incredible opportunities for your archival system and keep all the decision making and customization in your court.
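When existing PDFs are folded into the archive, it helps to know which of them already claim to be PDF/A. PDF/A files identify themselves in their XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below is a rough heuristic that uses only the .NET base library, not LEADTOOLS: the `PdfAClaim` class is a hypothetical name, and the code assumes the XMP packet is stored uncompressed in the file. A declared part is only a claim, so real conformance checking still requires a proper validator.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper (not a LEADTOOLS class): reads the PDF/A part that a
// file declares in its XMP metadata. Assumes the XMP packet is uncompressed.
public static class PdfAClaim
{
    // Matches both XMP forms:
    //   <pdfaid:part>1</pdfaid:part>   and   pdfaid:part="1"
    private static readonly Regex PartPattern = new Regex(
        @"pdfaid:part\s*(?:>|=\s*[""'])\s*(\d)",
        RegexOptions.Compiled);

    // Returns the declared part (1, 2, 3, ...) or null when none is found.
    public static int? DeclaredPart(string rawPdfText)
    {
        Match m = PartPattern.Match(rawPdfText);
        return m.Success ? int.Parse(m.Groups[1].Value) : (int?)null;
    }

    // Overload for raw file contents, e.g. from File.ReadAllBytes.
    public static int? DeclaredPart(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one character, so binary
        // stream data cannot make the decoding step fail.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        return DeclaredPart(latin1.GetString(pdfBytes));
    }
}
```

A file for which `DeclaredPart` returns null is a candidate for conversion to PDF/A; one that returns a value can be routed to validation instead.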

Optical Character Recognition (OCR) and Conversion

LEADTOOLS comfortably tackles the problem of migrating an existing archive with mixed file formats to a unified PDF/A archive. With the ability to load, save and convert over 150 raster, vector and document file formats, you can rest assured that you will have your bases covered. Since not all formats are text-based and searchable, LEADTOOLS can use its fast and highly accurate Optical Character Recognition technology to convert those images to searchable PDF/A. The advanced OCR SDK in LEADTOOLS supports over forty languages and character sets including English, Spanish, French, German, Japanese, Chinese, Arabic and more, making it a reliable solution for the largest of enterprises running and providing services in multiple countries across the globe. Most text-based PDF files also have smaller file sizes than the original raster image from which they were converted. Moreover, all of this can be done in as few as three lines of code.

    IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.Advantage, false);
    ocrEngine.Startup(null, null, null, null);
    ocrEngine.AutoRecognizeManager.Run(_strInputFile, _strOutputFile, DocumentFormat.Pdf, null, null);

Virtual Printing

If there is anything that the vast majority of applications have in common, it's the ability to print. This is, after all, where the need for document archival started. Instead of printing documents to paper and then later using scanners and OCR to convert them back into a searchable digital medium, the LEADTOOLS Virtual Printer can get it done right from the start. This approach not only handles the documents which you would normally print, but also allows you to archive many other sources of information including emails, faxes, websites, social media and virtually any file format. As an added benefit, the vast majority of documents and materials you print are textual, which means the resulting PDFs will already be searchable, require no special processing and be 100% accurate to the original document.

    DocumentWriter _documentWriter;

    public void _printer_EmfEvent(object sender, EmfEventArgs e)
    {
        // Create a new document page and pass the EMF in e.Stream
        DocumentPage documentPage = DocumentPage.Empty;
        documentPage.EmfHandle = new Metafile(e.Stream).GetHenhmetafile();

        // Load EMF as raster for image over text
        e.Stream.Position = 0;
        documentPage.Image = _codec.Load(e.Stream);

        // Add the page
        _documentWriter.AddPage(documentPage);
    }

    public void _printer_JobEvent(object sender, JobEventArgs e)
    {
        if (e.JobEventState == EventState.JobStart)
        {
            // Initialize DocumentWriter
            PdfDocumentOptions pdfOptions = new PdfDocumentOptions();
            pdfOptions.DocumentType = PdfDocumentType.PdfA;
            pdfOptions.FontEmbedMode = DocumentFontEmbedMode.Auto;
            pdfOptions.ImageOverText = true;
            _documentWriter = new DocumentWriter();
            _documentWriter.SetOptions(DocumentFormat.Pdf, pdfOptions);
            _documentWriter.BeginDocument(_pdfFileName, DocumentFormat.Pdf);
        }
        else if (e.JobEventState == EventState.JobEnd)
        {
            // Add fonts and end the document
            AddAndInstallFonts(e.JobID);
            _documentWriter.EndDocument();

            // Open the finished PDF in the default viewer
            System.Diagnostics.Process.Start(_pdfFileName);
        }
    }

Finally, LEADTOOLS Virtual Printers can also be configured to run on a server and made accessible over your company's LAN or the web with Internet Printing Protocol (IPP). This flexibility makes Virtual Printing an excellent solution for maintaining your archive into the future by providing a large funnel into which nearly any piece of information can be printed and then automatically archived through a central business workflow process.

HTML5 Zero Footprint Viewer

Just because you are saving your documents as PDF doesn't mean you can't benefit from a viewer. Though PDF is so widely adopted that few think about someone not being able to load it, plug-ins and viewing applications are still required in most situations. By using the HTML5 and JavaScript based viewer in LEADTOOLS, you can build a true cloud-based image viewing solution which requires no plug-ins or downloads. All of the heavy image processing and display is done on the client-side, yielding fast display times and a responsive user interface.

Conclusion

With LEADTOOLS, standardizing your document storage to PDF/A is no longer an arduous, complex and costly endeavor. Everything you need to convert your existing files, manage and normalize your PDFs, and create all-inclusive business workflows is included in programmer-friendly libraries for multiple platforms. You can rest easy knowing that all the information your company relies on for efficient and productive operation will be properly archived and readily accessible.

This is just one of many real world solutions you can tackle with LEADTOOLS. Its state of the art Document Imaging SDK is the most flexible and powerful product in its class, and LEADTOOLS offers an incredible value with its comprehensive family of toolkits for raster, document, medical and multimedia imaging. For more information on how LEAD Technologies can image-enable your application and boost your ROI, visit www.leadtools.com to download a free evaluation, or give us a call at +1-704-332-5532.

About LEAD Technologies

With a rich history of over twenty years, LEAD has established itself as the world's leading provider of software development toolkits for document, medical, multimedia, raster and vector imaging. LEAD's flagship product, LEADTOOLS, holds the top position in every major country throughout the world and boasts a healthy, diverse customer base and strong list of corporate partners including some of the largest and most influential organizations from around the globe.

Sales: (704) 332-5532, sales@leadtools.com
Support: (704) 372-9681, support@leadtools.com
LEAD Technologies, Inc., 1927 South Tryon Street, Suite 200, Charlotte, NC 28203