FineReader Engine Overview & New Features in V10


Semyon Sergunin, ABBYY Headquarters, September 2010
Michael Fuchs, ABBYY Europe GmbH, September 2010

FineReader Engine Processing Steps
- Step 1: Image/Document Input
- Step 2: Image Pre-processing Algorithms
- Step 3: Document & Layout Analysis
- Step 4: Recognition
- Step 5: Verification of the Recognition Results
- Step 6: Synthesis & Export

Step 1 Image Input

Step 1. Input: Opening existing images
- Load images from disc or memory
  - BMP, PCX, DCX, GIF, PNG, DjVu
  - JPEG and JPEG2000 (part 1)
  - TIFF
    - B&W (uncompressed, CCITT3, CCITT3FAX, CCITT4, PackBits, ZIP, LZW)
    - Grayscale (uncompressed, PackBits, JPEG, ZIP, LZW)
    - Colour (uncompressed, JPEG, ZIP, LZW)
  - PDF
    - Adobe PDF Library 9.0
    - Access to internal data (metadata, annotations, text objects, etc.)
  - Memory image formats: Raw, Bitmap [HBITMAP], DIB
- Load images from digital cameras
  - Advanced image pre-processing algorithms in FRE available!
- Screenshot Reader
  - Capture any area from the screen
  - Any formats (including Flash)
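As a rough illustration of handling several of the input formats listed above, a loader can identify a file by its leading bytes before choosing a decoder. This is a generic sketch in Python, not FineReader Engine code. The function name `sniff_image_format` is made up for the example, and PCX/DCX and raw JPEG2000 codestreams are left out for brevity.

```python
from typing import Optional

# Well-known leading-byte signatures for some of the listed formats.
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JPEG2000"),  # JP2 file signature box
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
    (b"%PDF-", "PDF"),
    (b"AT&TFORM", "DjVu"),
    (b"BM", "BMP"),
]


def sniff_image_format(header: bytes) -> Optional[str]:
    """Return a format name for the first bytes of a file, or None if unknown."""
    for signature, name in _SIGNATURES:
        if header.startswith(signature):
            return name
    return None
```

Reading the first 16 bytes of a file and passing them to `sniff_image_format` is enough for every signature in the table.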

Step 1. Input: Scanning documents (TWAIN)
- Scanning via TWAIN interface
  - ADF (Automatic Document Feeder)
  - Manual paper feeder
- Scanner settings
  - Brightness
  - Colour
  - Resolution
  - Image compression
  - Define scanning area (zone)
  - Simplex / Duplex
  - Orientation / automatic rotation / manual rotation
  - Paper format
  - Paper Top/Bottom/Left/Right
  - Etc.
- Visual Component: alternatively, the original dialogue from the scanner driver can be used

Step 2 Image Pre-Processing

Step 2. Image pre-processing: Available options
- Automatic rotation
- Automatic deskewing
- Cropping
- Automatic image splitting
- Straighten lines of text
- Noise removal
- Despeckling
- Scale images (i.e. interpolate images with low resolution)
- Rotation (90°, 180° and 270°)
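The fixed-angle rotation option is the simplest of these to illustrate. The sketch below rotates a raster stored as a list of rows. It is a generic Python example, not the Engine's implementation, and `rotate_clockwise` is a name chosen for the example.

```python
from typing import List

Raster = List[List[int]]


def rotate_clockwise(image: Raster, degrees: int) -> Raster:
    """Rotate a raster (list of pixel rows) clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    for _ in range((degrees // 90) % 4):
        # Reversing the row order and transposing gives one 90-degree clockwise turn.
        image = [list(row) for row in zip(*image[::-1])]
    # Copy the rows so the caller's raster is never returned by reference.
    return [list(row) for row in image]
```

Because the pixels are only rearranged, these rotations lose no image information, unlike deskewing by an arbitrary angle, which needs interpolation.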

Step 2. Image pre-processing: Binarisation overview
- Intelligent background filtering
- Adaptive binarisation
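To make the term concrete: adaptive binarisation picks a threshold per pixel from its neighbourhood instead of one global threshold for the page. The sketch below uses a plain local-mean threshold. This is a textbook technique chosen for illustration and is not ABBYY's algorithm. The names `adaptive_binarise`, `radius` and `offset` are assumptions of the example.

```python
from typing import List


def adaptive_binarise(gray: List[List[int]], radius: int = 1, offset: int = 0) -> List[List[int]]:
    """Binarise a grayscale raster with a local-mean threshold (0 = ink, 1 = paper)."""
    height, width = len(gray), len(gray[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood around (x, y), clipped at the image border.
            window = [
                gray[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            # A pixel darker than its surroundings (by more than offset) counts as ink.
            row.append(0 if gray[y][x] < local_mean - offset else 1)
        result.append(row)
    return result
```

A production binariser would use a much larger window and a faster running-sum computation. The nested loops here only keep the idea readable.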

Step 2. Image pre-processing
New V10: New binarisation
[Figure: original scan, previous binarisation, new binarisation]

Step 2. Image pre-processing
New V10: Binarisation, textured background optimisations
[Figure: original scan, previous binarisation, new binarisation]

Step 2. Image pre-processing
New V10: Binarisation for the IMPACT project
[Figure: original, previous binarisation, new binarisation (no text from the other page)]

Step 2. Image pre-processing
New V10: Colour filtering (stamps and marks)

Step 2. Image pre-processing: Camera OCR
New V10: Automatic correction of 3D perspective distortions
[Figure: before, after]

Step 2. Image pre-processing: Camera OCR
New V10: Blurred images correction
[Figure: before, after]

Step 2. Image pre-processing: Camera OCR
New V10: ISO noise reduction
[Figure: before, after]

Step 3 Document & Layout Analysis

Step 3. Document & Layout Analysis
Detect the sections of a document, analyse the layout and find barcodes

Step 3. Document & Layout Analysis
3 layout analysis modes are available:
- Document Analysis Normal
  - Returns text, tables, graphics (pictures), barcodes & patchcodes, lines (separators)
- Document Analysis for full text indexing
  - Graphics & pictures are OCRed as well
  - Returns text, tables, graphics (pictures), text inside of pictures and diagrams, barcodes & patchcodes, lines (separators)
- Document Analysis for invoices (DAI)
  - Optimized for small fonts
  - Returns text, tables as plain text, text inside of pictures and diagrams, barcodes & patchcodes, lines (separators)
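The differences between the three modes are easier to compare as data. The sketch below records the element types each mode returns, as listed above, and looks up which modes return a given element. The mode keys and the helper `modes_returning` are illustrative labels, not FineReader Engine identifiers.

```python
from typing import Dict, FrozenSet, List

# Element types returned by every mode, per the list above.
COMMON = frozenset({"text", "barcodes & patchcodes", "lines (separators)"})

# Mode keys are labels chosen for this example, not Engine constants.
MODE_OUTPUTS: Dict[str, FrozenSet[str]] = {
    "normal": COMMON | {"tables", "graphics (pictures)"},
    "full_text_indexing": COMMON
    | {"tables", "graphics (pictures)", "text inside pictures and diagrams"},
    "invoices": COMMON | {"tables as plain text", "text inside pictures and diagrams"},
}


def modes_returning(element: str) -> List[str]:
    """List the analysis modes whose results include the given element type."""
    return [mode for mode, outputs in MODE_OUTPUTS.items() if element in outputs]
```

For example, `modes_returning("text inside pictures and diagrams")` shows that only the full-text-indexing and invoice modes return text found inside pictures.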

Step 3. Document & Layout Analysis
New V10:
- Improved detection of charts and graphics
  [Figure: old technology vs. V10 technology]
- Improved detection of pictures (photographs)
  [Figure: old technology vs. V10 technology]

Step 3. Document & Layout Analysis
New V10: Improvements for magazine-style pages
[Figure: old technology, wrong detection of image and text blocks; V10 technology, correct detection of image and text blocks]

Step 4 Recognition

Step 4. Recognition
After line detection, character recognition is applied with different classifiers:
- Raster classifier
- Contour classifier
- Structure classifier
- Feature differentiating classifier

Step 4. Recognition: Processing speed vs. accuracy balance
The old conflict between recognition accuracy and processing speed still exists. Engine 10 addresses it with different approaches!
Image quality does matter!
- New Accurate Mode for low resolution/quality images (slightly slower)
- Slightly improved accuracy in Normal Mode
- Significant speed increase on good quality images in a new enhanced Fast Mode

Step 4. Recognition
New V10: Accurate mode for low resolution scans
- Additional classifier trained on low resolution scans and faxes
- About 20% more accurate for low resolution scans
- About 10% slower than Normal mode
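A quick way to reason about this trade-off is to apply both figures to a known Normal-mode run. The sketch below assumes that "20% more accurate" means 20% fewer recognition errors and that "10% slower" means 10% more processing time. The slide does not define either figure precisely, so treat this as one possible reading. `accurate_mode_estimate` is a name made up for the example.

```python
from typing import Tuple


def accurate_mode_estimate(
    normal_errors: float,
    normal_seconds: float,
    error_reduction: float = 0.20,  # assumed reading of "about 20% more accurate"
    slowdown: float = 0.10,         # assumed reading of "about 10% slower"
) -> Tuple[float, float]:
    """Estimate Accurate-mode errors and run time from Normal-mode measurements."""
    errors = normal_errors * (1 - error_reduction)
    seconds = normal_seconds * (1 + slowdown)
    return errors, seconds
```

Under these assumptions, a low resolution batch that produces 50 errors in 100 seconds in Normal mode would produce roughly 40 errors in roughly 110 seconds in Accurate mode.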

Step 4. Recognition: Accuracy improvements*
[Chart: FRE10 Normal mode vs. FRE9 Normal mode]
*based on ABBYY internal tests; number of recognition errors normalized relative to FRE9_R1 values
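The normalisation mentioned in the footnote can be sketched as dividing every error count by the count of the baseline release, so that the baseline reads 1.0 and lower values mean fewer errors. The function name `normalise_to_baseline` is an assumption of this example, and the slide does not give the underlying counts.

```python
from typing import Dict


def normalise_to_baseline(errors: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Express each error count as a fraction of the baseline's error count."""
    base = errors[baseline]
    if base <= 0:
        raise ValueError("baseline error count must be positive")
    return {name: count / base for name, count in errors.items()}
```

With made-up placeholder counts (not ABBYY measurements) of 200 errors for `FRE9_R1` and 150 for a newer release, the newer release would be reported as 0.75.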

Step 4. Recognition: Speed improvements - important notes*
- The speed and accuracy values are meaningful only for comparing ABBYY OCR technologies with each other, under these particular conditions and on these particular test batches.
- Please DO NOT USE these numbers as absolute values or compare them with results of OCR technologies obtained on different batches!
[Legend: background colour keys]
*based on ABBYY internal tests

Step 4. Recognition: Speed comparison of FRE 8, 9, 10 modes*
[Chart: speed comparison]
*based on ABBYY internal tests

Step 4. Recognition: Increased speed for European languages*
[Chart: speed by language]
*based on ABBYY internal tests

[Chart: recognition test, Chinese Simplified, books; series FRE10_R1, FRE9_R1, FRE9_R7]

Step 4. Recognition: Speed improvements through multi-core support* & tuned profiles
- Built-in multi-core support for multi-page documents
  - Added in V9
  - Improvements in V10
- New V10: New tuned processing profiles increase the overall performance for specific scenarios
  - 2 sessions tomorrow!
[Chart: recognition performance increase rate for multi-core systems compared to a one-core system, for 2 cores and 4 cores; x-axis: pages in a document (2 to 20); y-axis: rate, times (1.0 to 4.0)]
*based on ABBYY internal tests
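One reason the gain depends on the number of pages can be seen in an idealised model. It assumes that every page takes the same time, that pages are processed independently with one page per core at a time, and that there is no scheduling overhead. None of these assumptions comes from the ABBYY measurements, and `ideal_speedup` is a name chosen for the sketch.

```python
import math


def ideal_speedup(pages: int, cores: int) -> float:
    """Upper bound on the multi-core gain when whole pages are spread over cores."""
    if pages < 1 or cores < 1:
        raise ValueError("pages and cores must be at least 1")
    # With equal page times, the job finishes after this many rounds of parallel pages.
    rounds = math.ceil(pages / cores)
    return pages / rounds
```

In this model a 2-page document cannot run more than 2 times faster on 4 cores, a 5-page document reaches at most 2.5 times, and the full 4 times is reached only when the page count is a multiple of 4.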

Step 6 Synthesis & Export

Step 6. Document Export: New API for PDF export in FineReader Engine 10

FRE 9.0 PDF export parameters (25 parameters):
- Author
- Bw Format
- Color Format
- Creator
- Embed Fonts
- Encryption Info
- Export Mode
- Font Mode
- Gray Format
- Keep Text And Background Color
- Keywords
- Paper Height
- Paper Width
- PDF Version
- Picture Format
- Picture Resolution
- Producer
- Quality
- Replace Uncertain Words With Image
- Running Title Mode
- Set Page Size By Layout Size
- Subject
- Title
- Write Links
- Write Tagged PDF
- MRC Params (READ ONLY)

FRE 10 PDF export parameters (7 parameters):
- Scenario
- MRC Mode
- PDFA Compliance Mode
- Resolution
- Resolution Type
- Colority
- Text Export Mode
- PDF Features (READ ONLY)
- Picture Compression Params (READ ONLY)

Scenario profiles:
- Max Quality (MAX PDF quality)
- Balanced (balanced quality, size and speed)
- Min Size (MIN PDF size)
- Max Speed (MAX export speed)

PDF Features:
- Embed Fonts
- Encryption Info
- Meta Data Writing Params
- Paper Size
- PDF Version
- Replace Uncertain Words With Image
- Running Title Mode
- Write Links
- Write Tagged PDF

Fast and easy adjustment of PDF export, with the ability to set up any or all parameters.
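The idea of choosing a scenario first and then adjusting individual parameters can be sketched as a preset lookup followed by overrides. The code below is a hypothetical Python model of that pattern and is not the FineReader Engine API. `Scenario`, `resolve_export_settings` and all preset values are placeholders invented for the example; only the scenario and parameter names come from the lists above.

```python
from enum import Enum
from typing import Any, Dict, Mapping, Optional


class Scenario(Enum):
    MAX_QUALITY = "Max Quality"
    BALANCED = "Balanced"
    MIN_SIZE = "Min Size"
    MAX_SPEED = "Max Speed"


def resolve_export_settings(
    scenario: Scenario,
    presets: Mapping[Scenario, Mapping[str, Any]],
    overrides: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Start from the scenario's preset, then apply any per-parameter overrides."""
    settings = dict(presets[scenario])  # copy, so the preset itself stays untouched
    settings.update(overrides or {})
    return settings


# Placeholder values for illustration only; these are NOT ABBYY defaults.
EXAMPLE_PRESETS = {
    Scenario.MAX_QUALITY: {"Resolution": 300, "MRC Mode": "off"},
    Scenario.MIN_SIZE: {"Resolution": 150, "MRC Mode": "on"},
}
```

Calling `resolve_export_settings(Scenario.MIN_SIZE, EXAMPLE_PRESETS, {"Resolution": 200})` keeps the preset's `MRC Mode` and replaces only the resolution.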

Step 6. Synthesis & Export: 2nd generation of ADRT
- New elements and enhancements over the previous ADRT
  - New elements
  - Overall enhancement of ADRT 1.0 work
- Engine 10 offers a new API to the internal ADRT results

Step 6. Synthesis & Export: New XML output formats
- E-book readers: PDFs can be displayed, but the new formats allow much more flexible rendering when switching from portrait to landscape mode
  - FB2*
  - epub*
- Libraries: AltoXML*
- Open Document Text format: .odt*
  - ISO standard, XML-based export format
  - More and more often required in public projects
*planned for a Maintenance Release of FRE 10

FineReader Engine 10 Jumpstart Samples and Source Code for Developers

FineReader Engine 10: The must-have SDK!
ABBYY made significant technology optimisations in Engine 10:
- Image pre-processing: new binarisation = better OCR = better results
- Speed improvements: new Fast Mode, improved multi-core support
- Quality improvements: new mode for low resolution images, improved Fraktur OCR
- New and improved language support
- Improved document analysis and ADRT
- New API calls and optimised processing profiles
- New and improved export formats

Any questions? Thank you for your attention!