The Processing of Form Documents

David S. Doermann and Azriel Rosenfeld
Document Processing Group, Center for Automation Research
University of Maryland, College Park

Abstract

In this paper we present an overview of our approach to the generic modeling and processing of known forms. Our system provides a methodology by which models are generated from regions in the document based on their usage. We propose automatic extraction of an optimal set of features to be used for registration and show how specialized detectors can be designed for each feature based on its position, orientation and width properties. Registration of the form with the model is accomplished using probing to establish correspondence. We detect and isolate form components which are corrupted by markings, interpret the intersections, and use the properties of the non-form markings to reconstruct the strokes through the intersections. The feasibility of these ideas is demonstrated through an implementation of key components of our system.

1 Introduction

The machine understanding of form documents is a problem which is essential to the advancement of office automation. Many systems have been developed, each with different goals and assumptions [2, 6, 9, 10]. Our goal is to provide a processing environment for known forms which allows greater flexibility in the types and numbers of forms which are processed. By separating the form from the filled-in information, the filled-in components can be processed separately, and the data can be stored in a compressed format. We assume that the operator has been given a copy of the pre-designed form, so that design is not an issue, and that a homogeneous batch of forms will be processed. The task is to extract the hand or machine printed markings from such a batch of completed instances of the form and pass the recovered information to a storage or OCR system.
We address the modeling, pre-processing and text extraction stages of the system.

2 Form Modeling

There are three levels of abstraction at which a form can be modeled, depending on the a priori knowledge about the domain. We may analyze a form document 1) given no specific information about the format or application of the form, but only general heuristic knowledge about the form domain, 2) given the class of forms to which the document belongs, but not the exact form layout, or 3) given a detailed model of the original form or a set of specific models for a small group of forms. These cases fall into the general categories of unknown forms (e.g. general document processing), known classes of forms (e.g. checks and invoices), and known forms (e.g. report forms, tax forms and surveys). We have demonstrated our approach on the class of known forms.

Most form processing systems are designed or can be tuned for a specific known form or small set of forms. For applications which require the mass processing of these forms, we may assume that the format of the original form is known a priori. The fact that we know the form layout suggests that such systems will benefit from a top-down component, whereas the class of unknown forms typically requires a primarily bottom-up approach. The processing requirements depend on the model knowledge and the functionality of the form.

In general, the model of a form document may include 1) contextual knowledge about generic document domains, such as constraints on the size variations between text and graphic components, locations of key regions of interest, and relationships between known components; 2) abstract representations of lines, text, and regions, such as graphs; 3) bitmap or image representations of form components which are traditionally difficult to model, such as logos; and/or 4) generative models, such as form languages or form grammars. In addition, models for form images may include a characterization of noise in the imaging process as well as defects found in typical documents.

/93 $ IEEE

2.1 A Simplified Model

We begin by providing an approach which allows direct modeling of typical office documents. We define a set of basic constructs which allows us to model a large number of simple forms. The goal is to define a model for the original form in such a way as to be able to symbolically subtract the original form from a scanned copy of a completed document. The model contains three primary components: line segments, form regions, and landmark features.

Line segments are often used as guides for the form user and are defined in the model by two endpoints and a constant width. More complex graphic constructs such as boxes can be constructed from combinations of line segments.

Regions are used to define areas on the form which are to be considered as a single unit. A region is classified as: 1) modeled: filled-in information within the region is of interest, but the presence of the original form components requires us to have a model to separate form from non-form information; 2) non-modeled: the assumption can be made that any marking within the region is inconsequential, whether it contains filled-in information or not; or 3) data: the system recovers the interior of the region, but ignores any markings which extend past its boundaries (limited data region) or analyzes the exterior if markings extend outside of the region (extended data region). The non-modeled region type also acts as a catch-all for form components which we currently do not model. Since the forms we are dealing with are known, it is not necessary to examine all form components in detail for interaction with filled-in information. Rather, we will use a focus of attention (Section 4) to concentrate on areas which are believed to contain non-form information. Our current model space limits regions to be upright rectangles.
Regions can therefore be defined by their lower-left and upper-right coordinates. In the case of modeled regions, an image or bitmap of the region is included.

Landmark features describe line segment endpoints and the intersections of line segments with themselves and with region components. Line segment landmarks occur in 13 basic configurations: endpoints, t-junctions, and corners, each in four orientations, and one crossing configuration. Similarly, a line segment may intersect a region from any of four directions. These relationships are invariant to translation, rotation and scale, and are useful for solving the alignment problem and for analysis of stroke/form interactions. Our current implementation relies primarily on the existence of line landmarks extracted from horizontal and vertical line segments to perform alignment, but the techniques described in the next section can be extended to use arbitrary regions for alignment at the expense of increased computational cost.

2.2 The Form Modeler

A prototype automated modeling system is being developed using the KBVision system [1] to aid the user in the modeling of form documents. The current system takes a scanned version of the document as input and uses classical image processing techniques to extract character regions, line segments, and landmark points and assign them appropriate classifications. An interactive modeler is then used to refine the derived approximation of the form model. The scanning and image processing stages are necessary for forms which are not designed on-line. Ideally, if the form were created on-line, a form model would be produced at the time the form is defined, or a model could be derived from a CAD-based description of the form. The form model is extracted and stored in a file with attributes defining the location and size of each component and landmark. The model components are later organized spatially into a quadtree data structure, as described briefly below.
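
The three model components of Section 2.1 can be illustrated with a small data-structure sketch. This is a minimal sketch and not the file format used by the form modeler: the class names, field names, the `box` helper and the orientation labels are all assumptions introduced here. It shows a line segment defined by two endpoints and a constant width, an upright rectangular region defined by two corners and a type, a box built from four line segments, and the count of 13 line landmark configurations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple

Point = Tuple[float, float]


class RegionKind(Enum):
    """Region types from Section 2.1 (identifier names are illustrative)."""
    MODELED = "modeled"
    NON_MODELED = "non-modeled"
    LIMITED_DATA = "limited data"
    EXTENDED_DATA = "extended data"


@dataclass(frozen=True)
class LineSegment:
    """A form line: two endpoints and a constant width."""
    start: Point
    end: Point
    width: float


@dataclass(frozen=True)
class Region:
    """An upright rectangle given by its lower-left and upper-right corners."""
    lower_left: Point
    upper_right: Point
    kind: RegionKind


# Hypothetical labels for the four orientations; the paper does not name them.
ORIENTATIONS = ("north", "east", "south", "west")


def landmark_configurations() -> List[str]:
    """Endpoints, t-junctions and corners in four orientations, plus one crossing."""
    configs = [
        f"{shape}-{orientation}"
        for shape in ("endpoint", "t-junction", "corner")
        for orientation in ORIENTATIONS
    ]
    configs.append("crossing")
    return configs


def box(lower_left: Point, upper_right: Point, width: float) -> List[LineSegment]:
    """Build a box, a more complex construct, from four line segments."""
    (x0, y0), (x1, y1) = lower_left, upper_right
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return [LineSegment(corners[i], corners[(i + 1) % 4], width) for i in range(4)]
```

Keeping the components immutable (`frozen=True`) is a design choice of this sketch; it makes them safe to share between a spatial index and the alignment stage.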
3 Alignment

An approach is proposed in which an optimal basis set of landmark features is extracted directly from the model and the locations of these points are used to invert the transformation and perform a coarse alignment. The alignment is then verified and refined using higher order features. The advantage of automatic feature extraction is that it not only eliminates the human factor in modeling, but also quantifies the process of detecting which features are more or less likely to be confused with each other.

Our approach to skew detection and correction is based on geometric probing [7]. A convolution probe can be defined to be more robust to different types of distortions, such as rotation. By passing the probe over an area bounded by the limits on the translation and rotation, we can approximate the locations of candidate feature points. When we have obtained a set of possible feature points, we apply the constraints from the model to reduce the set of possible transformations. Other landmarks can then be verified and a fine scale alignment obtained.

Our current research is centered around developing algorithms for automatic selection of an optimal set of features from the model, automatic definition of detectors, extracting and constraining features for coarse alignment and adjusting the fine-scale alignment.

4 Form Delineation

Simply subtracting the form model from the image is not sufficient because of possible interaction of the filled-in information with the form and the possibility of single- or sub-pixel shifts. To ensure that we capture all non-form information (marks) in the image, we perform a symbolic subtraction of the model on a component by component basis. Non-modeled regions are discarded immediately without consideration of their internal pixels. Line segments and modeled regions are interpreted using detailed analysis of the pixels around their boundaries. To isolate line segment areas or modeled regions for analysis, we project onto the image plane the rectangle which corresponds to the boundary of the region or, for a line segment, the rectangle defined by the segment endpoints and the width.

4.1 Detection of Anomalies

Intensity and gradient information is used to determine if a model component is isolated in the image or if it is interfered with by noise or markings on the form. By examining the image at the location of the boundary of each model component, we detect anomalies at the points where non-form features interfere with the form components. If boundary pixels are found to be corrupt, we hypothesize a corrupted feature and attempt to recover the conditions or markings in the document which gave rise to the corruption. If the boundary pixels are not corrupt, the entire model feature can be removed from the document image, and the remaining markings on the page are the filled-in data. There are several situations which give rise to anomalies.
Small dents or bumps may be detected with a local analysis of the pixels in the neighborhood of the anomaly. We set a lower bound on the expected size of valid page markings, and delay analysis of features which are smaller than this threshold. The existence of such features is preserved for possible analysis with higher level context. For example, a small bump in the boundary may be the result of a decimal point touching a form line.

Larger, non-stroke-like features which were added to a form (e.g. stickers, coffee stains, scratched-out regions, etc.) also appear as anomalies. For features which are much larger than the width of the line segment, precise representation of the interference may not be necessary. We assume that the feature is continuous across the line segment, so the line segment is removed and the region patched.

A final case is the intersection of a line segment with a stroke-like marking. Since stroke-like markings are presumably more common, and represent the data which we are trying to recover, we must take special care to reconstruct as accurately as possible the stroke which gave rise to the interference.

Figure 1: A small region, anomalies and the computed MBR.

4.2 Focus of Attention

In order to evaluate the intersection of the model and the form information, the region which requires analysis must be defined. The initial region is defined in such a way that it is almost certainly large enough for properties of the offending strokes to be measured. This region is then modified to take into account the geometry of the form model and the locations of nearby detected anomalies. Based on the expected locations of markings on the completed form, we are able to define a reasonable bound on the size of the region required for analysis. If it is found that this region is too small, it can be extended. The region is on the order of 25 times the expected stroke width or model segment width (Figure 1a).
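
The boundary test of Section 4.1 and the initial region of Section 4.2 can be sketched together. The sketch below is a simplified illustration under stated assumptions, not the implementation described in this paper: it works on a binary image (1 for ink) rather than on intensity and gradient information, it treats a line segment as an axis-aligned pixel rectangle, and the function names `boundary_anomalies` and `initial_region` are hypothetical. The factor of 25 is taken from the text above.

```python
from typing import List, Tuple

Image = List[List[int]]           # image[row][col]; 1 = ink, 0 = background
Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive pixels


def boundary_anomalies(image: Image, segment: Rect) -> List[Tuple[int, int]]:
    """Return (row, col) of ink pixels on the one-pixel ring just outside
    the rectangle occupied by a model line segment."""
    left, top, right, bottom = segment
    rows, cols = len(image), len(image[0])
    ring = []
    for c in range(left - 1, right + 2):   # rows above and below the segment
        ring.append((top - 1, c))
        ring.append((bottom + 1, c))
    for r in range(top, bottom + 1):       # columns left and right of it
        ring.append((r, left - 1))
        ring.append((r, right + 1))
    hits = {
        (r, c)
        for r, c in ring
        if 0 <= r < rows and 0 <= c < cols and image[r][c] == 1
    }
    return sorted(hits)


def initial_region(anomaly: Tuple[int, int], width: int,
                   shape: Tuple[int, int], factor: int = 25) -> Rect:
    """Square region of side about factor * width centred on an anomaly,
    clipped to an image of the given (rows, cols) shape."""
    row, col = anomaly
    rows, cols = shape
    half = (factor * width) // 2
    return (max(0, col - half), max(0, row - half),
            min(cols - 1, col + half), min(rows - 1, row + half))


# A 9 x 20 image with a one-pixel form line on row 4 (columns 2..17)
# crossed by a vertical stroke in column 10 (rows 1..7).
image = [[0] * 20 for _ in range(9)]
for c in range(2, 18):
    image[4][c] = 1
for r in range(1, 8):
    image[r][10] = 1

anomalies = boundary_anomalies(image, (2, 4, 17, 4))
```

In this toy image the stroke touches the ring above and below the line, so two anomaly points are reported. An empty result would mean the segment is isolated and can be removed from the image directly.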
In an attempt to avoid unnecessary analysis or repeated analysis of the same region, the region can be modified in two ways: it can be expanded to include additional detected anomalies, and it can be constrained by regions which lack anomalies (Figure 1b). If the region boundary is only corrupted on one side, the region of analysis is limited to that side. The computation of the Maximum Bounding Region (MBR) of an anomaly point is described more fully in [3]. The final result is a region (MBR) which surrounds a given corrupted point, but does not engulf additional uncorrupted form components. A quadtree data structure can be used to efficiently index into the space of line segment and region boundaries [8] and to implement the sweeping algorithms.

5 Stroke/Form Interaction

Our approach to the problem of interfering contours is based on the detection, analysis and detailed representation of the stroke-like and non-stroke-like regions in the document image [5]. The process involves two parts, an interpretation of the intersecting region and a reconstruction of the strokes or line segments which formed it. An interpretation of a region is derived from the local configuration of stroke segments and properties of the strokes themselves such as curvature, width and intensity. By treating the strokes as features and retaining a more complete representation of the document, we can use criteria and clues for interpretation which are not available with traditional approaches to document processing.

5.1 The Stroke Recovery Platform

The framework we use to address the interpretation and reconstruction problems is based on the concept of a stroke recovery platform described in [4, 3]. The platform provides a hierarchical representation of the stroke-like features in a document that extends from the pixel level up through an attributed stroke graph which represents the relationships between strokes and non-stroke features such as endpoints and intersections. The platform attempts to provide a complete representation which links higher level abstract representations with the pixels and other local features. Unfortunately, in many situations a complete interpretation based on low-level information either may not be possible or cannot be obtained with the desired confidence.
In such cases, feedback from higher-level application-dependent modules is necessary and the platform can be amended dynamically.

5.2 Interpretation and Reconstruction

Our first step is to construct a partial stroke platform for the region of the document image defined by the MBR in Section 4 and identify portions of the stroke graph which uniquely correspond to the given form component. The platform provides us with a representation which contains, most importantly, a set of cross-section groupings which exhibit stroke-like properties (hypothesized stroke segments), regions which are classified as possible junctions or endpoints, and the underlying contours or contour fragments of the stroke segments, junctions, endpoints and unclassified features in the image (Figure 2). The stroke graph supports top-down access to the pixels through the strokes, junctions, cross-sections and retinotopic information.

Figure 2: Several components of the platform which result from the processing of a small document region.

Figure 3: The candidate anchor points derived from the cross section endpoints.

We then identify those portions of the image which correspond to interacting features. Since the properties of the form components such as position, width and orientation are known a priori, the stroke graph can be examined and the features identified. In a more general application, we may identify line segments based on the regularity and size of the cross-sections comprising the hypothesized stroke segments. In either case, if the segment intersects another feature, we will observe a node in the stroke graph corresponding to the intersection. If the intersection occurs over an extended region, the affected portions of the stroke graph will have cross-section widths which are inconsistent with the rest of the form component.
In Figure 2a, for example, the top-center stroke segment is bounded by two apparent junctions and has cross-sections of significantly greater width than the corresponding left or right end segments. Since these changes contradict the consistency assumptions, such a situation is examined for possible interpretation as the result of an interaction of features.
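
The width-consistency check described above can be illustrated as follows. This is a minimal sketch, assuming that the cross-sections along a form line are available as an ordered list of measured widths and that the model width is known a priori; the function name `inconsistent_runs` and the relative `tolerance` parameter are assumptions of this sketch and are not part of the stroke recovery platform.

```python
from typing import List, Tuple


def inconsistent_runs(widths: List[float], model_width: float,
                      tolerance: float = 0.5) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges of consecutive cross-sections
    whose width differs from the model width by more than
    tolerance * model_width."""
    runs = []
    start = None
    for i, w in enumerate(widths):
        bad = abs(w - model_width) > tolerance * model_width
        if bad and start is None:
            start = i                      # a run of inconsistent widths begins
        elif not bad and start is not None:
            runs.append((start, i - 1))    # the run ended at the previous index
            start = None
    if start is not None:                  # a run reaching the last cross-section
        runs.append((start, len(widths) - 1))
    return runs


# Cross-section widths along a form line of known width 3; the middle
# portion is widened where another feature overlaps the line.
print(inconsistent_runs([3, 3, 3, 7, 8, 7, 3, 3], model_width=3))
```

Each returned range marks a candidate interaction between a form component and another feature; its two ends are where anchor points for reconstruction would be sought.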

Once we have an approximate delineation of a form component, we begin the reconstruction. As stated earlier, this is based on properties of the portions of the document that do not involve feature interactions. We identify anchor points which are used to connect the reconstructed feature segments to known feature segments. We then identify a set of candidate anchor point pairs from the cross-sections at the ends of the affected segments (Figure 3). Since the form component is of known dimensions, we generate (or recall as part of our a priori knowledge) a cross-section representation of the model line segment and register it with the representation given by the platform. From this correspondence, we can easily identify the isolated line segment features which are uncorrupted and refine the registration if necessary.

We then classify the contours between the anchor points of the hypothesized stroke segments according to whether or not they fit the model. Bounded portions of the contours can be described as follows. A visible contour is a boundary representation of a stroke or region that is derivable from areas of high gradient activity in the image. An occluded contour is a boundary of a stroke or line segment which is obscured or otherwise distorted by another stroke or line segment. A contour is said to be stable if it corresponds to an uncorrupted portion of the stroke and is itself free from distortion caused by noise in the intensity image. A contour is unstable if its location or orientation may be corrupted by neighboring strokes. The platform can thus be annotated to reflect line-segment, non-line-segment and possible-combination cross sections, contours and stroke graph components.

The occluded stroke segment contours are then reconstructed from the remaining visible contours and anchor points. For the intersection in this example, the visible stroke contour is connected to the remaining part of the stroke segment, so we can assume that it is part of the same stroke.
We use the properties of the unoccluded stroke segments to reconstruct the occluded contour and delineate the region which corresponds to the occluded stroke. Figure 4 shows the results of the cross section computation and reconstruction from the intersection with a known line. Once this analysis has been performed for each MBR in the form document, and the markings reconstructed, the remaining markings on the page are passed to an interpreter.

Figure 4: Reconstruction of an "e" touching a line.

References

[1] Amerinex Artificial Intelligence, Inc. KBVision system,
[2] R. Casey, D. Ferguson, K. Mohiuddin, and E. Walach. Intelligent forms processing system. Machine Vision and Applications, 5: ,
[3] D. S. Doermann. Document Image Understanding: Integrating Recovery and Interpretation. PhD thesis, University of Maryland, College Park,
[4] D. S. Doermann and A. Rosenfeld. Recovery of temporal information from static images of handwriting. Technical Report CAR-TR-595, Center for Automation Research, University of Maryland, College Park, Maryland, To appear in the International Journal of Computer Vision.
[5] D. S. Doermann and A. Rosenfeld. The interpretation and reconstruction of interfering strokes. In Proceedings of the International Workshop on Frontiers in Handwriting Recognition, pages 41-50,
[6] G. Maderlechner. Symbolic Subtraction from fixed formatted graphics and text from filled in forms. In Machine Vision and Applications, pages ,
[7] K. Romanik. Approximate testing theory. Technical report, University of Maryland,
[8] H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading, MA,
[9] S. Liebowitz Taylor, R. Fritzson, and J. A. Pastor. Extraction of data from preprinted forms. Machine Vision and Applications, 5: , 1992.
[10] D. Wang and S. N. Srihari. Analysis of form images. In Proceedings of the International Conference on Document Analysis and Recognition, pages ,


More information

Model Based Perspective Inversion

Model Based Perspective Inversion Model Based Perspective Inversion A. D. Worrall, K. D. Baker & G. D. Sullivan Intelligent Systems Group, Department of Computer Science, University of Reading, RG6 2AX, UK.

More information

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE) Features Points Andrea Torsello DAIS Università Ca Foscari via Torino 155, 30172 Mestre (VE) Finding Corners Edge detectors perform poorly at corners. Corners provide repeatable points for matching, so

More information

2D rendering takes a photo of the 2D scene with a virtual camera that selects an axis aligned rectangle from the scene. The photograph is placed into

2D rendering takes a photo of the 2D scene with a virtual camera that selects an axis aligned rectangle from the scene. The photograph is placed into 2D rendering takes a photo of the 2D scene with a virtual camera that selects an axis aligned rectangle from the scene. The photograph is placed into the viewport of the current application window. A pixel

More information

Structured light 3D reconstruction

Structured light 3D reconstruction Structured light 3D reconstruction Reconstruction pipeline and industrial applications 11/05/2010 3D Reconstruction 3D reconstruction is the process of capturing the shape and appearance

More information


DATA EMBEDDING IN TEXT FOR A COPIER SYSTEM DATA EMBEDDING IN TEXT FOR A COPIER SYSTEM Anoop K. Bhattacharjya and Hakan Ancin Epson Palo Alto Laboratory 3145 Porter Drive, Suite 104 Palo Alto, CA 94304 e-mail: {anoop, ancin} Abstract

More information


IRIS SEGMENTATION OF NON-IDEAL IMAGES IRIS SEGMENTATION OF NON-IDEAL IMAGES William S. Weld St. Lawrence University Computer Science Department Canton, NY 13617 Xiaojun Qi, Ph.D Utah State University Computer Science Department Logan, UT 84322

More information

A Novel Logo Detection and Recognition Framework for Separated Part Logos in Document Images

A Novel Logo Detection and Recognition Framework for Separated Part Logos in Document Images Australian Journal of Basic and Applied Sciences, 5(9): 936-946, 2011 ISSN 1991-8178 A Novel Logo Detection and Recognition Framework for Separated Part Logos in Document Images Sina Hassanzadeh, Hossein

More information

Defect Inspection of Liquid-Crystal-Display (LCD) Panels in Repetitive Pattern Images Using 2D Fourier Image Reconstruction

Defect Inspection of Liquid-Crystal-Display (LCD) Panels in Repetitive Pattern Images Using 2D Fourier Image Reconstruction Defect Inspection of Liquid-Crystal-Display (LCD) Panels in Repetitive Pattern Images Using D Fourier Image Reconstruction Du-Ming Tsai, and Yan-Hsin Tseng Department of Industrial Engineering and Management

More information

Problems with template matching

Problems with template matching Problems with template matching The template represents the object as we expect to find it in the image The object can indeed be scaled or rotated This technique requires a separate template for each scale

More information

Reconstruction of 3D Interacting Solids of Revolution from 2D Orthographic Views

Reconstruction of 3D Interacting Solids of Revolution from 2D Orthographic Views Reconstruction of 3D Interacting Solids of Revolution from 2D Orthographic Views Hanmin Lee, Soonhung Han Department of Mechanical Engeneering Korea Advanced Institute of Science & Technology 373-1, Guseong-Dong,

More information

I. INTRODUCTION. Figure-1 Basic block of text analysis

I. INTRODUCTION. Figure-1 Basic block of text analysis ISSN: 2349-7637 (Online) (RHIMRJ) Research Paper Available online at: Detection and Localization of Texts from Natural Scene Images: A Hybrid Approach Priyanka Muchhadiya Post Graduate Fellow,

More information

Multi-scale Techniques for Document Page Segmentation

Multi-scale Techniques for Document Page Segmentation Multi-scale Techniques for Document Page Segmentation Zhixin Shi and Venu Govindaraju Center of Excellence for Document Analysis and Recognition (CEDAR), State University of New York at Buffalo, Amherst

More information

Textureless Layers CMU-RI-TR Qifa Ke, Simon Baker, and Takeo Kanade

Textureless Layers CMU-RI-TR Qifa Ke, Simon Baker, and Takeo Kanade Textureless Layers CMU-RI-TR-04-17 Qifa Ke, Simon Baker, and Takeo Kanade The Robotics Institute Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA 15213 Abstract Layers are one of the most well

More information

Patch-based Object Recognition. Basic Idea

Patch-based Object Recognition. Basic Idea Patch-based Object Recognition 1! Basic Idea Determine interest points in image Determine local image properties around interest points Use local image properties for object classification Example: Interest

More information



More information

(Refer Slide Time 00:17) Welcome to the course on Digital Image Processing. (Refer Slide Time 00:22)

(Refer Slide Time 00:17) Welcome to the course on Digital Image Processing. (Refer Slide Time 00:22) Digital Image Processing Prof. P. K. Biswas Department of Electronics and Electrical Communications Engineering Indian Institute of Technology, Kharagpur Module Number 01 Lecture Number 02 Application

More information


GENERAL AUTOMATED FLAW DETECTION SCHEME FOR NDE X-RAY IMAGES GENERAL AUTOMATED FLAW DETECTION SCHEME FOR NDE X-RAY IMAGES Karl W. Ulmer and John P. Basart Center for Nondestructive Evaluation Department of Electrical and Computer Engineering Iowa State University

More information


CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM CORRELATION BASED CAR NUMBER PLATE EXTRACTION SYSTEM 1 PHYO THET KHIN, 2 LAI LAI WIN KYI 1,2 Department of Information Technology, Mandalay Technological University The Republic of the Union of Myanmar

More information

CS 231A Computer Vision (Fall 2012) Problem Set 3

CS 231A Computer Vision (Fall 2012) Problem Set 3 CS 231A Computer Vision (Fall 2012) Problem Set 3 Due: Nov. 13 th, 2012 (2:15pm) 1 Probabilistic Recursion for Tracking (20 points) In this problem you will derive a method for tracking a point of interest

More information

Color Space Invariance for Various Edge Types in Simple Images. Geoffrey Hollinger and Dr. Bruce Maxwell Swarthmore College Summer 2003

Color Space Invariance for Various Edge Types in Simple Images. Geoffrey Hollinger and Dr. Bruce Maxwell Swarthmore College Summer 2003 Color Space Invariance for Various Edge Types in Simple Images Geoffrey Hollinger and Dr. Bruce Maxwell Swarthmore College Summer 2003 Abstract This paper describes a study done to determine the color

More information

Color Characterization and Calibration of an External Display

Color Characterization and Calibration of an External Display Color Characterization and Calibration of an External Display Andrew Crocker, Austin Martin, Jon Sandness Department of Math, Statistics, and Computer Science St. Olaf College 1500 St. Olaf Avenue, Northfield,

More information

Using temporal seeding to constrain the disparity search range in stereo matching

Using temporal seeding to constrain the disparity search range in stereo matching Using temporal seeding to constrain the disparity search range in stereo matching Thulani Ndhlovu Mobile Intelligent Autonomous Systems CSIR South Africa Email: Fred Nicolls Department

More information

Auto-Digitizer for Fast Graph-to-Data Conversion

Auto-Digitizer for Fast Graph-to-Data Conversion Auto-Digitizer for Fast Graph-to-Data Conversion EE 368 Final Project Report, Winter 2018 Deepti Sanjay Mahajan Sarah Pao Radzihovsky Ching-Hua (Fiona) Wang

More information

Advanced geometry tools for CEM

Advanced geometry tools for CEM Advanced geometry tools for CEM Introduction Modern aircraft designs are extremely complex CAD models. For example, a BAE Systems aircraft assembly consists of over 30,000 individual components. Since

More information

Combining Appearance and Topology for Wide

Combining Appearance and Topology for Wide Combining Appearance and Topology for Wide Baseline Matching Dennis Tell and Stefan Carlsson Presented by: Josh Wills Image Point Correspondences Critical foundation for many vision applications 3-D reconstruction,

More information


STEREO BY TWO-LEVEL DYNAMIC PROGRAMMING STEREO BY TWO-LEVEL DYNAMIC PROGRAMMING Yuichi Ohta Institute of Information Sciences and Electronics University of Tsukuba IBARAKI, 305, JAPAN Takeo Kanade Computer Science Department Carnegie-Mellon

More information

Medial Scaffolds for 3D data modelling: status and challenges. Frederic Fol Leymarie

Medial Scaffolds for 3D data modelling: status and challenges. Frederic Fol Leymarie Medial Scaffolds for 3D data modelling: status and challenges Frederic Fol Leymarie Outline Background Method and some algorithmic details Applications Shape representation: From the Medial Axis to the

More information

Local Feature Detectors

Local Feature Detectors Local Feature Detectors Selim Aksoy Department of Computer Engineering Bilkent University Slides adapted from Cordelia Schmid and David Lowe, CVPR 2003 Tutorial, Matthew Brown,

More information



More information


EDGE BASED REGION GROWING EDGE BASED REGION GROWING Rupinder Singh, Jarnail Singh Preetkamal Sharma, Sudhir Sharma Abstract Image segmentation is a decomposition of scene into its components. It is a key step in image analysis.

More information

Optical flow and tracking

Optical flow and tracking EECS 442 Computer vision Optical flow and tracking Intro Optical flow and feature tracking Lucas-Kanade algorithm Motion segmentation Segments of this lectures are courtesy of Profs S. Lazebnik S. Seitz,

More information

Finally: Motion and tracking. Motion 4/20/2011. CS 376 Lecture 24 Motion 1. Video. Uses of motion. Motion parallax. Motion field

Finally: Motion and tracking. Motion 4/20/2011. CS 376 Lecture 24 Motion 1. Video. Uses of motion. Motion parallax. Motion field Finally: Motion and tracking Tracking objects, video analysis, low level motion Motion Wed, April 20 Kristen Grauman UT-Austin Many slides adapted from S. Seitz, R. Szeliski, M. Pollefeys, and S. Lazebnik

More information


WATERMARKING FOR LIGHT FIELD RENDERING 1 ATERMARKING FOR LIGHT FIELD RENDERING 1 Alper Koz, Cevahir Çığla and A. Aydın Alatan Department of Electrical and Electronics Engineering, METU Balgat, 06531, Ankara, TURKEY. e-mail:,,

More information

Chapter 3 Image Registration. Chapter 3 Image Registration

Chapter 3 Image Registration. Chapter 3 Image Registration Chapter 3 Image Registration Distributed Algorithms for Introduction (1) Definition: Image Registration Input: 2 images of the same scene but taken from different perspectives Goal: Identify transformation

More information

Optimal Grouping of Line Segments into Convex Sets 1

Optimal Grouping of Line Segments into Convex Sets 1 Optimal Grouping of Line Segments into Convex Sets 1 B. Parvin and S. Viswanathan Imaging and Distributed Computing Group Information and Computing Sciences Division Lawrence Berkeley National Laboratory,

More information

Character Recognition

Character Recognition Character Recognition 5.1 INTRODUCTION Recognition is one of the important steps in image processing. There are different methods such as Histogram method, Hough transformation, Neural computing approaches

More information

Fingerprint Classification Using Orientation Field Flow Curves

Fingerprint Classification Using Orientation Field Flow Curves Fingerprint Classification Using Orientation Field Flow Curves Sarat C. Dass Michigan State University Anil K. Jain Michigan State University Abstract Manual fingerprint classification

More information

Structural and Syntactic Pattern Recognition

Structural and Syntactic Pattern Recognition Structural and Syntactic Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent

More information

Automatic Detection of Change in Address Blocks for Reply Forms Processing

Automatic Detection of Change in Address Blocks for Reply Forms Processing Automatic Detection of Change in Address Blocks for Reply Forms Processing K R Karthick, S Marshall and A J Gray Abstract In this paper, an automatic method to detect the presence of on-line erasures/scribbles/corrections/over-writing

More information

Your Flowchart Secretary: Real-Time Hand-Written Flowchart Converter

Your Flowchart Secretary: Real-Time Hand-Written Flowchart Converter Your Flowchart Secretary: Real-Time Hand-Written Flowchart Converter Qian Yu, Rao Zhang, Tien-Ning Hsu, Zheng Lyu Department of Electrical Engineering { qiany, zhangrao, tiening, zhenglyu}

More information

Segmentation of Images

Segmentation of Images Segmentation of Images SEGMENTATION If an image has been preprocessed appropriately to remove noise and artifacts, segmentation is often the key step in interpreting the image. Image segmentation is a

More information


HISTOGRAMS OF ORIENTATIO N GRADIENTS HISTOGRAMS OF ORIENTATIO N GRADIENTS Histograms of Orientation Gradients Objective: object recognition Basic idea Local shape information often well described by the distribution of intensity gradients

More information

Edge and corner detection

Edge and corner detection Edge and corner detection Prof. Stricker Doz. G. Bleser Computer Vision: Object and People Tracking Goals Where is the information in an image? How is an object characterized? How can I find measurements

More information

Document Image Restoration Using Binary Morphological Filters. Jisheng Liang, Robert M. Haralick. Seattle, Washington Ihsin T.

Document Image Restoration Using Binary Morphological Filters. Jisheng Liang, Robert M. Haralick. Seattle, Washington Ihsin T. Document Image Restoration Using Binary Morphological Filters Jisheng Liang, Robert M. Haralick University of Washington, Department of Electrical Engineering Seattle, Washington 98195 Ihsin T. Phillips

More information

Occlusion Robust Multi-Camera Face Tracking

Occlusion Robust Multi-Camera Face Tracking Occlusion Robust Multi-Camera Face Tracking Josh Harguess, Changbo Hu, J. K. Aggarwal Computer & Vision Research Center / Department of ECE The University of Texas at Austin,,

More information

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy BSB663 Image Processing Pinar Duygulu Slides are adapted from Selim Aksoy Image matching Image matching is a fundamental aspect of many problems in computer vision. Object or scene recognition Solving

More information

Detecting Object Instances Without Discriminative Features

Detecting Object Instances Without Discriminative Features Detecting Object Instances Without Discriminative Features Edward Hsiao June 19, 2013 Thesis Committee: Martial Hebert, Chair Alexei Efros Takeo Kanade Andrew Zisserman, University of Oxford 1 Object Instance

More information

Chapter 12 Solid Modeling. Disadvantages of wireframe representations

Chapter 12 Solid Modeling. Disadvantages of wireframe representations Chapter 12 Solid Modeling Wireframe, surface, solid modeling Solid modeling gives a complete and unambiguous definition of an object, describing not only the shape of the boundaries but also the object

More information

Coarse-to-fine image registration

Coarse-to-fine image registration Today we will look at a few important topics in scale space in computer vision, in particular, coarseto-fine approaches, and the SIFT feature descriptor. I will present only the main ideas here to give

More information

Recognition. Clark F. Olson. Cornell University. work on separate feature sets can be performed in

Recognition. Clark F. Olson. Cornell University. work on separate feature sets can be performed in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 907-912, 1996. Connectionist Networks for Feature Indexing and Object Recognition Clark F. Olson Department of Computer

More information

Real-time Detection of Illegally Parked Vehicles Using 1-D Transformation

Real-time Detection of Illegally Parked Vehicles Using 1-D Transformation Real-time Detection of Illegally Parked Vehicles Using 1-D Transformation Jong Taek Lee, M. S. Ryoo, Matthew Riley, and J. K. Aggarwal Computer & Vision Research Center Dept. of Electrical & Computer Engineering,

More information


Journal of Asian Scientific Research FEATURES COMPOSITION FOR PROFICIENT AND REAL TIME RETRIEVAL IN CBIR SYSTEM. Tohid Sedghi Journal of Asian Scientific Research, 013, 3(1):68-74 Journal of Asian Scientific Research journal homepage: FEATURES COMPOSTON FOR PROFCENT AND REAL TME RETREVAL

More information

The SIFT (Scale Invariant Feature

The SIFT (Scale Invariant Feature The SIFT (Scale Invariant Feature Transform) Detector and Descriptor developed by David Lowe University of British Columbia Initial paper ICCV 1999 Newer journal paper IJCV 2004 Review: Matt Brown s Canonical

More information

Biomedical Image Analysis. Point, Edge and Line Detection

Biomedical Image Analysis. Point, Edge and Line Detection Biomedical Image Analysis Point, Edge and Line Detection Contents: Point and line detection Advanced edge detection: Canny Local/regional edge processing Global processing: Hough transform BMIA 15 V. Roth

More information

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale.

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale. Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe presented by, Sudheendra Invariance Intensity Scale Rotation Affine View point Introduction Introduction SIFT (Scale Invariant Feature

More information

Qualitative Physics and the Shapes of Objects

Qualitative Physics and the Shapes of Objects Qualitative Physics and the Shapes of Objects Eric Saund Department of Brain and Cognitive Sciences and the Artificial ntelligence Laboratory Massachusetts nstitute of Technology Cambridge, Massachusetts

More information



More information

Motion in 2D image sequences

Motion in 2D image sequences Motion in 2D image sequences Definitely used in human vision Object detection and tracking Navigation and obstacle avoidance Analysis of actions or activities Segmentation and understanding of video sequences

More information

Time Stamp Detection and Recognition in Video Frames

Time Stamp Detection and Recognition in Video Frames Time Stamp Detection and Recognition in Video Frames Nongluk Covavisaruch and Chetsada Saengpanit Department of Computer Engineering, Chulalongkorn University, Bangkok 10330, Thailand E-mail:

More information

An Adaptive Eigenshape Model

An Adaptive Eigenshape Model An Adaptive Eigenshape Model Adam Baumberg and David Hogg School of Computer Studies University of Leeds, Leeds LS2 9JT, U.K. Abstract There has been a great deal of recent interest

More information