Stabilization of infra-red aerial image sequences using robust estimation

Similar documents
Multi-stable Perception. Necker Cube

Structure from Motion

LEAST SQUARES. RANSAC. HOUGH TRANSFORM.

Image Alignment CSC 767

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros.

Angle-Independent 3D Reconstruction. Ji Zhang Mireille Boutin Daniel Aliaga

What are the camera parameters? Where are the light sources? What is the mapping from radiance to pixel color? Want to solve for 3D geometry

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

Image warping and stitching May 5 th, 2015

CS 534: Computer Vision Model Fitting

Feature Reduction and Selection

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

An efficient method to build panoramic image mosaics

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15

RELATIVE ORIENTATION ESTIMATION OF VIDEO STREAMS FROM A SINGLE PAN-TILT-ZOOM CAMERA. Commission I, WG I/5

Robust Vanishing Point Detection for MobileCam-Based Documents

A Novel Accurate Algorithm to Ellipse Fitting for Iris Boundary Using Most Iris Edges. Mohammad Reza Mohammadi 1, Abolghasem Raie 2

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Computer Animation and Visualisation. Lecture 4. Rigging / Skinning

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Unsupervised Learning and Clustering

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Parallelism for Nested Loops with Non-uniform and Flow Dependences

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

Ecient Computation of the Most Probable Motion from Fuzzy. Moshe Ben-Ezra Shmuel Peleg Michael Werman. The Hebrew University of Jerusalem

A Robust Method for Estimating the Fundamental Matrix

A Binarization Algorithm specialized on Document Images and Photos

Prof. Feng Liu. Spring /24/2017

y and the total sum of

Multi-view 3D Position Estimation of Sports Players

Fitting and Alignment

Geometric Transformations and Multiple Views

A Methodology for Document Image Dewarping Techniques Performance Evaluation

Improved SIFT-Features Matching for Object Recognition

Support Vector Machines

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

Unsupervised Learning and Clustering

S1 Note. Basis functions.

Hermite Splines in Lie Groups as Products of Geodesics

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

Calibrating a single camera. Odilon Redon, Cyclops, 1914

A method for real-time implementation of HOG feature extraction

Feature-based image registration using the shape context

MOTION BLUR ESTIMATION AT CORNERS

Kinematics Modeling and Analysis of MOTOMAN-HP20 Robot

Introduction to Geometrical Optics - a 2D ray tracing Excel model for spherical mirrors - Part 2

3D Modeling Using Multi-View Images. Jinjin Li. A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Inverse-Polar Ray Projection for Recovering Projective Transformations

A NEW IMPLEMENTATION OF THE ICP ALGORITHM FOR 3D SURFACE REGISTRATION USING A COMPREHENSIVE LOOK UP MATRIX

MOTION PANORAMA CONSTRUCTION FROM STREAMING VIDEO FOR POWER- CONSTRAINED MOBILE MULTIMEDIA ENVIRONMENTS XUNYU PAN

Pictures at an Exhibition

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

Smoothing Spline ANOVA for variable screening

New dynamic zoom calibration technique for a stereo-vision based multi-view 3D modeling system

AUTOMATIC IMAGE REGISTRATION OF MULTI-ANGLE IMAGERY FOR CHRIS/PROBA

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Data Mining: Model Evaluation

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Optimal Combination of Stereo Camera Calibration from Arbitrary Stereo Images.

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

Lecture 9 Fitting and Matching

Takahiro ISHIKAWA Takahiro Ishikawa Takahiro Ishikawa Takeo KANADE

Machine Learning: Algorithms and Applications

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

IMAGE MATCHING WITH SIFT FEATURES A PROBABILISTIC APPROACH

Cluster Analysis of Electrical Behavior

Acoustic Camera Image Mosaicing and Super-resolution

An Optimal Algorithm for Prufer Codes *

Biostatistics 615/815

3D vector computer graphics

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

X- Chart Using ANOM Approach

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Programming in Fortran 90 : 2017/2018

PHOTOGRAMMETRIC ANALYSIS OF ASYNCHRONOUSLY ACQUIRED IMAGE SEQUENCES

Robust Computation and Parametrization of Multiple View. Relations. Oxford University, OX1 3PJ. Gaussian).

Graph-based Clustering

Range Data Registration Using Photometric Features

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

A Comparison and Evaluation of Three Different Pose Estimation Algorithms In Detecting Low Texture Manufactured Objects

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

2D Raster Graphics. Integer grid Sequential (left-right, top-down) scan. Computer Graphics

An Image Fusion Approach Based on Segmentation Region

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Active Contours/Snakes

Lecture 4: Principal components

Unsupervised Learning

Help for Time-Resolved Analysis TRI2 version 2.4 P Barber,

USING LINEAR REGRESSION FOR THE AUTOMATION OF SUPERVISED CLASSIFICATION IN MULTITEMPORAL IMAGES

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION

CORRELATION ICP ALGORITHM FOR POSE ESTIMATION BASED ON LOCAL AND GLOBAL FEATURES

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

n others; multple brghtness values n one mage may map to a sngle brghtness value n the other mage, and vce versa. In other words, the two mages are us

Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input

Transcription:

Stablzaton of nfra-red aeral mage sequences usng robust estmaton Danel McRenolds, Pascal Marchand and Yunlong Sheng COP, aval Unverst, Ste-Fo, Qc, G1K 7P4, Canada éandre Sévgn Defence Research Establshment Valcarter, 459 Boul. Pe XI nord, C.P. 8800, Courcelette, Qc, G0A 1R0, Canada angs Gagnon ockheed Martn Electronc Sstems Canada, 6111 Roalmount Ave., Montreal, Qc, H4P 1K6, Canada Kewords: mage stablzaton, M-estmaton, nfra-red, dfferental nvarants, mage matchng, affne camera Current address: Centre de Recherche Informatque de Montreal 550 Sherbrooke Ouest, Montreal Quebec, CANADA, H3A1B9

Stablzaton of nfra-red aeral mage sequences usng robust estmaton Abstract Image stablzaton s the mage regstraton appled to one vdeo mage sequence from a sngle camera, whch has been dentfed as the ke frst step for the task of mult-spectral mage fuson for aeral survellance applcatons. The stablzaton of nfra-red IR aeral mage sequences s challengng owng to the low contrast and sgnal-to-nose rato of the mages, and to potentall large vewpont changes that result n mages wth large rotaton and scale change and perspectve dstorton. Block matchng methods are relable and accurate even for mages of poor contrast but tpcall cannot effcentl handle mage rotaton and scalng. We demonstrate a new feature-based method for IR aeral mage sequence stablzaton. We use the grelevel dfferental nvarant GDI matchng due to Schmd and Mohr whch s nvarant to rotaton and scalng. Etensons to the basc GDI method are ntroduced that mprove the performance of the method. We descrbe the orthographc proecton as a close appromaton to perspectve proecton n the case of aeral mages. Instead of the least mean square soluton, we use M-estmaton for the mage regstraton parameters. The method s robust to outlers returned b the GDI method. We verf the pont correspondence under orthographc proecton usng the eppolar constrant. Epermental results are reported for real-world and sntheszed IR mage sequences. Introducton Imagng sensors of dfferent modaltes, for eample, vsble and nfra-red, provde complementar nformaton whch, f properl fused, can assst the observer n the mage nterpretaton task. Image stablzaton s a ke step n ths process. Image stablzaton s the mage regstraton appled to one vdeo mage sequence from a sngle camera. When the camera s mounted on an unstead or a movng platform and obects are far from the camera, the 3-D space moton of the camera wll affect the mages. Even small movements of the platform can result n large dsplacements of the mages. There s also moton of the target wth respect to the statonar scene background. Ths moton s, however, not to be removed b the stablzaton process. Untl now, the prmar means avalable to stablze mages from a camera on a movng vehcle has been to mount the camera on an electro-mechancal stablzng platform. These stablzers are bulk and epensve. Ther performance degrades wth vbraton n the crtcal 0 0 Hz range. An alternatve stablzaton method uses mage processng technques to frst estmate and then elmnate the scene moton that s due to camera moton b warpng each mage frame nto precse algnment wth a reference frame. A

3 temporal flter frame dfference then elmnates the background scene whle hghlghtng targets that are tracked over multple frames. The mage stablzaton task has been specfed n general terms as follows. Gven an mage sequence usuall consstng of at least 10 to 30 seconds of data, a specal frame called the reference frame s chosen at or near the begnnng of the sequence. Frames subsequent to the reference frame shall be regstered to the reference frame n such a wa that the frames are precsel algned wth the reference mage. The conventonal mage regstraton method uses block matchng technques. The mages are frst parttoned nto a matr of blocks. Then, cross-correlaton between the correspondng blocks n two mages are performed, resultng n a matr of dsplacement vectors. The least mean square estmaton can then ft the dsplacement vectors nto a geometrcal dstorton model, resultng n the parameters of mage dstorton wth sub-pel accurac. The block matchng utlzes the full mage nformaton and can be appled to an tpe of mage, rch or poor n teture. The block correlaton s robust aganst random nose and has hgh accurac. However, block matchng can account for onl translatons and onl appromatel for other mage dstortons. It s epensve to compute and becomes prohbtve when the mage dsplacement s large. Also, the crosscorrelaton based on the mage ntenst smlart s senstve to envronment changes and s not applcable for regstraton of mult-spectral mages [16]. On the other hand, feature-based mage matchng can account for an mage deformaton and can be nsenstve to mult-sensor modaltes b selectng structurall salent features. It s quck to compute. However, the feature-based methods wll fal to fnd matches n structure-less areas. Its relablt depends on the feature etracton process. Therefore, robustness of the matchng process s a crtcal ssue. The onl known geometrc constrant between two mages of a sngle scene s the eppolar constrant. The eppolar geometr descrbes relatons between correspondng ponts n two mages, and s, therefore, useful for mage matchng. Zhang et al. [5] proposed a robust technque for mage matchng for uncalbrated cameras based on geometrc verfcaton usng the eppolar constrant. However, for mage stablzaton tasks t s mpossble to know, a pror, the fundamental matr of the eppolar geometr whch depends on the 3-D dsplacement and ntrnsc parameters of the camera. The ntal matchng s then requred for estmatng the fundamental matr of the uncalbrated camera. Zhang et al. use cross-correlaton between local supports of the feature ponts for the ntal matchng, whch s not nvarant to scalng, rotaton and other deformatons, and s the man lmtaton of ther approach. The use east Medan of Squares MedS to dscard outlers, that allows up to 50% false ntal matches n the estmaton of the eppolar geometr. Our eperments show that Zhang s approach can provde a large number of correct matches for outdoor mages from a ground based camera where the perspectve proecton results n local translatons of the correspondng ponts. However, Zhang s approach fals n ts ntal matchng wth aeral mages where there s, for eample, a large global rotaton of the nput scene. In ths paper we propose an approach for the stablzaton of real-world aeral nfra-red mage sequences. In the aeral mage sequences the frame to frame moton can be ver large. The IR mages are

4 generall nos wth low contrast. We vew the task of mage stablzaton as a problem n mage regstraton, that can be accomplshed usng mage matchng methods. A classc approach to mage matchng s to hpothesze then verf the correspondence of features between mages [6]. In ths approach, the ntal matchng s crtcal for a successful matchng and regstraton. To mprove the ntal matchng b crosscorrelaton Derche [13] added the drectons of the gradent, curvature, and a dspart measure. Hu [14] suggested to appl the cross-correlaton n several drectons and Ballard [15] used steerable flters. Schmd and Mohr [1] ntroduce grelevel dfferental nvarant GDI as a local measure for pont matchng, whch s nvarant to mage rotaton, scalng and translaton. We etract feature ponts usng the Harrs-Stephens corner detector [1], whch s effcent to compute and has been shown to be one of the best detectors n terms of repeatablt wth vewpont change and other scene varables [11]. It s effcent also to etract corner ponts n the teture of the IR mages. We use the grelevel dfferental nvarant matchng method to fnd an ntal set of matches. Three addtonal enhancements to the basc grelevel dfferental nvarant matchng method are ntroduced and tested: 1 Search over the GDI space wth k-d trees to speed up the matchng process, snce a quer fnshes n logarthmc epected tme, Scale-space verfcaton of matches to mprove the rato of true to false matches, and 3 Determnng the covarance matr for normalzng dfferental nvarants at runtme. The GDI method provdes a set of matches that often contan outlers whch can serousl perturb the estmaton of the mage transformaton. We use M-estmaton whch s robust to outlers and s computatonall effcent, and a varant of least medan of squares, whch s more computatonall epensve, n the case that M- estmaton falure s detected. For aeral mages, the dstance from scene to camera s much larger than the depth of the scene, and the feld of vew ma be small. In ths case, the orthographc proecton provdes a good appromaton to perspectve proecton, as shown b Prtt [4, 10], and then the unque optmal soluton, n a least squares sense, for regsterng a par of mages, s the global affne transformaton wth local dstorton modeled b a dsplacement along parallel eppolar lnes proportonal to the heght of the scene pont. However, Prtt does not address the crtcal ssue of the ntal matchng, estmaton of the transformaton parameters n the presence of outlers, nor are results presented for real world mage data. We ft the GDI matches to the global affne transformaton, that accounts for mage scalng and rotaton n the mportant ntal matchng stage and mplement a robust mage regstraton. It s b ncorporatng robust estmaton nto the mage regstraton process that we perform a match verfcaton process usng the eppolar constrant. Enhanced grelevel dfferental nvarant matchng At each feature pont n the mage found b the Harrs-Stephens corner detector, a grelevel dfferental nvarant GDI vector s computed. The GDI representaton and matchng method are nvarant to mage rotaton, scalng and translaton [1]. A normalzed verson of the representaton s also nvarant to brghtness scalng. The representaton s farl robust wth respect to rotaton n depth that leads to foreshortenng of surface patches,.e., n general, a local affne dstorton of the brghtness surface.

5 The grelevel dfferental nvarants are based on the dervatves of the Gaussan fltered mage n a representaton called the local et. The local et represents the truncated Talor seres epanson of the mage functon and s useful for encodng the poston dependent geometr of the brghtness surface. The use of Gaussan smoothng makes the dfferentaton well posed n addton to havng other nce mathematcal propertes. The dfferental nvarant vector V s gven b [1] 1 = V 3 3 3 3 where s the Gaussan fltered mage functon and subscrpts denote partal dfferentaton n a Cartesan coordnate sstem. For eample, s the average brghtness, s the gradent magntude squared, and s the aplacan of the brghtness functon. The components of the vector, V, of nvarants, are the complete and rreducble set of dfferental nvarants up to thrd order. These functons are rotatonall smmetrc. Dfferental nvarants can be also nvarant to an affne transformaton of the brghtness functon gven b b a =,, ~ I I. These nvarants are the last seven components of the dfferental nvarant vector, V, normalzed b an approprate power of the gradent magntude squared. To gve scale nvarance over a fed range we compute a set of scales centered on a reference scale, σ 0, such that 0 5 6 σ = σ / where -n -1, 0, 1 n. A value of four for n elds the scale factor range 0.48 to.07, hence there are nne dfferental nvarant vectors for each kepont. Ths s a mult-scale representaton. The 0 percent factor s emprcall derved and reflects the epected dfferental scale range over whch the nvarants do not change apprecabl. Nearest neghbour search b k-d tree The Mahalanobs dstance s used to determne the nearest neghbour. Ponts are declared to be correspondng when the par of ponts from two mages are mutuall selected as closest. The space of dfferental nvarant vectors can be organzed n a hash table [1] or wth a tree representaton such as the k-d tree. Nearest neghbour searchng wth k-d trees reduces the search tme to logarthmc epected tme. The epected number of records eamned to satsf a quer s gven b the epresson k k b k G b b k R } ] / {[, / 1 1, where k s the record dmenson, b s the number of records n a bucket, and 1 k G s a constant that accounts for the geometrc propertes of the norm used for the dstance measure [7]. For our mplementaton of GDI matchng, 1 = k G, k s 7 and b s 1, hence R7,1=18.

6 Match verfcaton n scale-space Epermental results suggest that nearest neghbour matchng wth dfferental nvarants s scale senstve over moderate vewpont changes. A local multscale analss s used to flter out scale-space unstable and hence potentall ncorrect matches. The full dfferental nvarant matchng process s carred out at three reference scales that dffer b multples of ten percent of the reference scale σ 0,.e., σ 1 * 0.1 for k=0,..,. The value of ten percent s half the epected scale senstvt of 0 k dfferental nvarants. Ths value was found epermentall to eld good results. Too large a value elmnates most matches and too small a value does not effectvel elmnate scale unstable matches. A match s scalespace verfed f t ests at all three scales. Epermental results demonstrate that ths verfcaton step sgnfcantl ncreases the rato of correct to ncorrect matches. Ths step ncreases the computatonal cost b a factor of three but t s parallelzable. Normalzaton The varant of k-d tree used here requres that the data satsf a Eucldean norm. An approprate transformaton of the GDI vectors b a sutable covarance matr accomplshes ths. The covarance matr descrbes the epected varance of the dfferental vector components wth respect to changes n vewpont, llumnaton, sensor propertes, and nose. For an mage retreval task [8] the covarance matr s determned b averagng together over all keponts the covarance matr generated for each kepont b trackng observatons over an etended mage sequence. It s not clear, however, f ths s sutable for the mage matchng problem. Frst, we assume that trackng s generall not an opton. Secondl, the resultng covarance matr ma not be sutable for all mage pars to be matched, ecept, of course, the mages from the sequence tself. Computng and combnng covarance matrces over several dfferent sequences, thereb enlargng the sample sze, ma be suffcent for a varet of mage retreval tasks. Instead, we adopt a classcal perturbaton approach to estmatng the covarance matr for the dfferental nvarant vectors. Such an approach s frequentl used when an analtc soluton s not feasble. For eample, ths approach s used to determne datum/model compatblt for the RANSAC algorthm [9]. The RANSAC procedure uses data perturbaton to estmate mpled error bounds for a gven model. The covarance matr s estmated b averagng together a large number of covarance matrces generated from dfferental nvarant vectors and ther nose perturbed devates. We compute a covarance matr drectl from the gven mage par b modelng brghtness varaton due to all the factors mentoned above b normall dstrbuted nose added to the mages. The nose varance s some fracton of the varance of the mage brghtness over the entre feld of vew. A value of 5 percent of the brghtness varance has been used n the descrbed eperments. Ths s an emprcal value that has elded good results. Furthermore, the matchng results are not overl senstve to ths value.

7 Image regstraton wth M-estmators Image regstraton procedure Gven two mages, the GDI method returns a set of hpotheszed correspondences. Gross outlers are elmnated usng the gradent angle consstenc constrant. The resultng set of matches are then used to estmate the orthographc proecton usng M-estmaton. An ntal estmate of the transformaton uses the so called "Far" functon whch provdes an ntal set of weghts. The fnal estmate s computed usng the "Tuke b-weght" functon whch can elmnate outlers. The mage s then resampled nto the reference coordnate frame usng b-lnear nterpolaton. If the ntal estmate should cause the fnal estmaton process to elmnate 50 percent or more of the matches then a process smlar to least medan of squares s eecuted whch seeks to determne the best ntal estmate usng random subsets of 4 matches from the entre set of matches. The mplct assumpton s that there are fewer than 50 percent outlers, snce the best sub-sample s predcated on the medan resdual. The revsed ntal estmate s used to compute the new fnal estmate. The mage s then resampled and regstered as before. Eppolar Constrant The onl known geometrc constrant that ests between correspondng pont matches from two or more vews s the eppolar constrant, whch can be used for mage matchng. When the feld of vew s small and varaton n depth of the scene s small compared to ts average dstance to the camera, the orthographc and scaled orthographc proecton s a close appromaton to the perspectve proecton model [4][10]. The appromaton s also good for large feld of vew f the separaton angle between two cameras s small [4]. Under ths appromaton, the model locates the optcal center at nfnt, hence, the proecton ras are parallel to one another and perpendcular to the mage plane. The eppoles are stuated at nfnt n the mage plane and the eppolar lnes are parallel. The parallel proecton s lnear wth the regstraton functon epressed b [4] F, = A, h, e, 1 1 1 1 1 1 where F =, maps ponts 1, 1 transformaton gven b 3 ˆ ˆ p = p 1 4 p p 5 from the frst mage to the second, A 1, 1 s an affne 1 p3 1 p6 1, and h 1, 1 s the z coordnate of the non-occluded pont n the nput scene. Its value depends on the choce of the scene coordnate sstem. The vector e s the eppolar vector. The parallel eppolar lnes are epressed as A 1, 1 αe whch depends onl on the poston and orentaton of the two mage coordnate sstems relatve to the scene coordnate sstem and α s a real number. The regstraton functon s globall defned b an affne

transformaton. Eq. accounts eplctl for the depth varaton va the heght functon h 1, 1 as a local dstorton whch are dsplacements along eppolar lnes and proportonal to the heghts n the scene. The choce of parameters A 1, 1, h 1, 1 and e n Eq. s not unque. The soluton becomes unque wth normalzaton constrants proposed b Prtt, namel that the zero and frst order moments the scene s heght functon h 1, 1 be equal to zero. Under ths constrant, all the terms nvolvng h dsappear n the equatons for the least mean square soluton of Eq.. Ths s equvalent to the least mean square soluton of equaton 4 F 1, 1 = A 1, 1 for the affne transformaton A 1, 1. The affne component of the regstraton functon determnes the least squares estmate of a planar appromaton of the scene structure. After the determnaton of A 1, 1, the eppolar vector e and the scene heght functon can be recovered from the resdual errors: 5 s = - [p 1 1 p 1 p 3 ] and 6 t = - [p 4 1 p 5 1 p 6 ] where, s the observed pont n the second mage that corresponds wth 1, 1. The slope of the eppolar vector e s gven b 1/ 7 m = [ β δ β 4 ] ZKHUH Σt - s / Σs t DQG VJQΣs t. Ths estmate for the slope of the eppolar lne s the least squares soluton for the obectve functon that mnmzes the orthogonal dstances of the ponts s, t to the lne. The robust estmaton of the affne transformaton parameters therefore mplements a verfcaton of the matches usng the eppolar constrant. The eppolar vector e s gven b 8 e = e 1,e T = [1/1m 1/, m/1m 1/ ]. The soluton for h s gven b 9 h = se1 te. 8 M-estmaton appled to mage regstraton et rs,, t be the th resdual,.e., the dfference between the th observaton and the current estmate of the ftted value. The classc least squares problem seeks to mnmze r whch s known to be unstable n the presence of outlers n the data. The M-estmators are desgned to reduce ths nstablt b replacng the squared resduals r b a functon of the resduals whch elds the problem [3] mn ρ r,

9 where ρ s a smmetrc, postve defnte functon wth a unque mnmum at zero and s chosen to be less ncreasng than square. The problem s not solved drectl but s mplemented as an terated re-weghted lnear least squares soluton. We seek to estmate p for =1 to 6 from Eq.3 whch s also the desred soluton for Eq.4. We solve for the parameters p 1 to p 3 usng the M-estmaton formulaton r 10 w r r = 0 over =1,,k data ponts and =1,,3. Smlarl for p 4 to p 6 we solve r 11 w r r = 0 over =1,,k data ponts and =4,5,6. Choce of obectve functons Some ρ-functons assure a unque soluton, but snce the onl reduce the effect of outlers, the estmator can be based. Other ρ-functons do not guarantee a unque soluton, but reduce consderabl the effect of outlers or even more, elmnate them. Snce no ρ-functon s perfect, we use both tpes of ρ-functons as proposed b Huber []. We began the estmaton process wth the obectve functon Far and then refne the estmate wth Tuke's b-weght functon. For both functons, the convergence threshold used s 0.1% n the relatve dfference between terated parameter estmates. The Far dstrbuton has contnuous dervatves up to thrd order and elds a unque soluton. Gven these choces for the obectve functon the estmator of the global affne transformaton can be wrtten as follows from Eq. 10 for the resdual n the component 1 w [ = - p1 1 p 1 p3] 1 0 1 w = [ - p1 1 p 1 p3 ] 1 0 w = [ - p1 1 p 1 p3] 1 0 r where r = 1, r = 1, and = 1. The th teraton of the weght, w, s one of Far or Tuke's b- weght as descrbed above. Smlarl, for the component from Eq. 11 w [ = - p4 1 p5 1 p6] 1 0 13 w = [ - p4 1 p5 1 p6 ] 1 0 w = [ - p4 1 p5 1 p6] 1 0 r r r where = 1, = 1, and = 1. 4 5 3 6 Eqs.1 and 13 gve a lnear sstem of 6 equatons n the 6 unknown parameter varables p 1 to p 6 whch are solved for teratvel n two phases. The frst phase uses the Far weghtng functon to eld an

10 ntal estmate. The fnal estmate uses the Tuke b-weght functon ntalzed wth the weghts and parameters from the Far estmate. The regstraton functon for the epermental results reported here s gven b F, = A,,.e., the heght term s gnored. The assumpton s that the heght functon values are not large enough to sgnfcantl perturb the regstraton for the aeral mage sequences. It s straghtforward to generalze the approach b the addton of the term h, e to the global affne transformaton. The drawback s the necesst to estmate h, over the entre feld of vew. Ths requres us to nterpolate the heght functon for known pont correspondences whch wll eld a poor estmate for a sparse set of pont matches or, alternatvel, one can compute a dense matchng usng a method such as optcal flow. Match denst for pont features can be ncreased b searchng for addtonal matchng ponts along the estmated eppolar lnes, a one dmensonal search. Ill-condtoned solutons In some rare cases, convergence gven b the Far functon s not the one we would epect. Ths can happen when the GDI method returns matches where the number of outlers s not relatvel small or ther dstrbuton perturbs the estmator sgnfcantl. In ths case, the mnmum reached depends upon the nfluence of the outlers and the ntal value of the resdual error functon. The Far functon ma converge to a mnmum such that man of the matches have relatvel small weghts, and therefore the estmate s supported b relatvel few matches. After convergence of the Far functon, Tuke s b-weght functon seeks to mprove the estmate of the parameters b elmnatng the effect of large outlers, but contnues to converge n the same drecton that the Far functon dd. If we then look at the ω, we see that most of the matches are zero weghted because the estmate causes them to have a ver large resdual. Random search for least medan of squares resdual To overrde ths stuaton, we added a process whch provdes a new ntal parameter estmate that s then passed to the Tuke s b-weght functon process for computng a fnal estmate. In prncple, Tuke s bweght functon then converges to the global mnmum eldng a better parameter estmate. We defned as a threshold for computng a revsed ntal estmate that at least 50% of all matches must be 0-weghted from the fnal estmate provded b M-estmaton wth Tuke s b-weght functon. The process that fnds a better ntal estmate randoml chooses 4 matches, whch s the mnmum number of matches to do a mnmall overconstraned estmate, from the lst provded b the GDI method less the gross outlers detected b the gradent angle constrant. These four matches then pass through the Far and Tuke s b-weght estmaton processes as before. The process then computes the resdual for all matches found b GDI accordng to the parameter estmate found for ths subset. The process takes note of ths parameter estmate, the weghts ω of the subset and the medan resdual over the entre set of matches. If one or more of the subset ω s 0, the subset result s

11 gnored. The process terates untl 71 subsets are selected. Ths number of subsets guarantees wth a probablt of 0.99, assumng no more than 50 percent outlers, that at least one of the subsets s composed of 4 good matches. At the end, the process takes the ω and parameter estmate of the subset wth the lowest medan resdual. All matches are then ncorporated and the 4 matches of the selected subset keep ther recorded weght. All other ω are put to 0.5, where ω can var from 0 to 1, to avod over-weghtng potental outlers. Wth these values for the weghts a fnal estmate s made usng the Tuke s b-weght functon process untl convergence. Ths gves a more robust estmate of the global affne transformaton parameter vector. Note that ths random search process s more computatonall epensve than the M-estmaton process and, therefore, s not used as the man estmaton method. Infra-red aeral mage sequence DIM01 Epermental results Fgure 1 gves the GDI matchng results for frames 01 and 0 from the sequence DIM01, a sequence provded b Defence Research Establshment Valcarter for algorthm development purposes. The sequence s acqured at a 4Hz frame rate b a helcopter mounted nfra-red camera. Note the large scale change due to the camera translaton n the foreground and a rotaton about a pont n the lower left mage quadrant. The scale of Fgure 1: Grelevel dfferental nvarant matches. eft mage s frame 01 and rght mage s frame 0. Matches 7, 8 and 1 are ncorrect. the smoothng kernel for the dfferental nvarants s fve pels wth Gaussan pre-smoothng of the mages at a scale of four pels to reduce the IR scan nose as well as to compensate for the lack of fdelt of the actual mage transformaton wth the assumed transformaton. The GDI model assumes the sgnal s transformed b a smlart transformaton. The actual model s better appromated b a local affne transformaton that n turn s a local appromaton of the effect of the perspectve dstorton due to the translatng camera. Also, because the scale of the sgnal s ncreasng, more detal s present n the nonreference mage that leads to a perturbaton of the nvarants. Addtonal smoothng of the reference and nonreference mages reduces the perturbng effects of the scale change. Of the 1 matches 3 are ncorrect and are elmnated b the gradent angle constrant. The gradent angle constrant deletes matches whose angle change s dfferent b /- 45 from the medan angle change for all matches. These matches are defned as the gross outlers.

1 Fgure : eft: Regstered frame 0, compare wth frame 01 n Fg. 1. Note presence of artfacts along bottom from source mage resampled nto regstered frame. Rght: Dfference of reference and regstered frames. The left mage of Fgure shows frame 0 resampled nto the mage coordnate sstem of frame 01 n accordance wth the global affne transformaton robustl estmated from nne matches. Resamplng was b blnear nterpolaton. The rght mage of Fgure s the dfference between the regstered frame 0 and the reference frame 01. The contrast of the mage s stretched to fll the range 0 to 55. The wdths of brghtness dscontnut hghlghts the error n the regstraton. The errors are due to the lack of match ponts n the foreground where perspectve dstorton s greatest. Also, the local dstorton term has not been ncorporated. Nevertheless, the regstraton result s good. Grelevel dfferental nvarant matchng etensons Fgure 3 shows the matches returned b the GDI matchng method for the frst of three scale levels wth the dfferental nvarant flter scale equal to 5 pels. The remanng two reference scale levels are 5.5 and 6.05 pels respectvel. We note the larger number of matches, n all. Of these matches, 11 are correct or 50%. Match 7 and 11 are correct but do not appear at all three scales. For the 5.5-pel scale level the number of matches s 31 of whch 14 are correct 45%, and for the last scale level there are 19 out of 9 matches correct 66%. Wth scale-space trackng there are 9 out of 1 correct matches 75% a large mprovement n terms of the rato of correct to total matches. We also note the hgh senstvt to small perturbatons of scale ehbted b the GDI matchng process. Fnall, the average number of records searched over both k-d trees over the three reference scale levels wth 900 records n each tree was 76 n keepng wth the epected logarthmc tme to complete a quer. Fgure 3: GDI matches for the frst of three scales for nfra-red DIM01 sequence. eft: frame 01. Rght: frame 0.

13 Fgure 4. Eppolar lnes for frames 01 and 0. Observatons are marked wth "", reproected scene ponts b " DQGWKHKHJKW corrected reproectons b 7KHODVWWZRIUDPHV are a zoomed vew of pont 8 from frames 01 and 0 respectvel. Eppolar estmaton The eppolar lnes and heght corrected reproectons were estmated accordng to Eqs. 8 and 9 and are plotted n Fgure 4. The slope angles for the eppolar lnes n frames 01 and 0 were.7 and -8.1 respectvel for a fronto-parallel rotaton angle of 5.4. The RMS resdual for frame 01 reproected ponts before heght correcton s 0.85 pels and after heght correcton t s 0.5 pels. The RMS resdual for frame 0 s 0.97 and 0.69 pels before and after heght correcton. The last two frames of Fgure 4 shows pont 8 wth the ftted eppolar lne and the reproected matchng pont from the other mage before and after correcton b the heght local dstorton factor. Snthetc mage sequence ockheed Boats8 Fgure 5: Reference mage 00 wth matches to frame 08 from GDI method. Matches 15, 17, and 0 are ncorrect. Fgure 6: Frame 08 wth matches to reference mage. Image 08 s rotated 40º from the reference mage wth addtve nose and a random affne dstorton. Net s an eample of the regstraton process from a sntheszed moton sequence. These mages are from a sequence composed of 11 frames sntheszed from a real nfra-red mage. The moton conssts of a contnuous rotaton of the sequence reference mage wth addtve Gaussan nose and a random affne dstorton. In ths eample the non-reference mage s frame 08, whch s rotated b a 40º angle from the reference mage. Fgures 5 and 6 show the matches obtaned from the grelevel dfferental nvarant matchng method. The GDI method provdes 0 matches, where 17 are correct. Matches 15, 17, and 0 from Fgures 5 and 6 are ncorrect.

14 Fgure 7:Frame 08 regstered to reference mage. Fgure 8: Dfference between regstered mage 08 and the reference mage. See tet. Fve matches, those labeled 3,8,15,16, and 0 from fgures 5 and 6 were elmnated usng the gradent angle constrant. Note that matches 3,8 and 16 were ncorrectl elmnated. Ths ma be due to the added nose and the shear ntroduced b the affne transformaton n addton to the pure rotaton. Fgure 7 shows mage 08 regstered to the reference mage after M-estmaton of the global affne transformaton and resamplng b blnear nterpolaton. Incorrect match 17 from Fgures 5 and 6 was correctl elmnated b the Tuke b-weght robust estmaton functon,.e., ts weght was set to zero. Fgure 9. Eppolar lnes for frames 00 and 08. Observatons are marked wth "", reproected scene ponts b " DQG the heght corrected UHSURMHFWRQVE\ Fgure 8 gves the mage dfference between the regstered frame 08 and the reference mage. The scene obect can stll be dstngushed because the pel value plus addtve nose eceeded the mamum pel value and therefore was clpped. The nose s normall dstrbuted wth a zero mean and a standard devaton of 15 percent of the standard devaton of the reference brghtness mage. The brghtness dstrbuton s therefore not a normal dstrbuton after mage subtracton n that area where the brghtness values were clpped. Fgure 9 shows the estmated eppolar geometr for the two vews. The eppolar lne slope angle for frame 00 s 8.3º and 13º for frame 08 whch gves a fronto-parallel rotaton angle of 41.3º. Concluson We have demonstrated the stablzaton of aeral mage sequences wth scale, rotaton and large translatons usng robust estmaton. The method s computatonall nepensve and theoretcall well founded owng to the approprateness of the orthographc proecton model whch s a close appromaton to the perspectve proecton for the aeral mages. We have shown the grelevel dfferental nvarant GDI method that provdes matches for mage transformatons wth rotaton, scalng and some shear. Three etensons to the basc GDI matchng method were descrbed and epermentall verfed. The etensons allow for runtme computaton of the GDI vector normalzaton, ncreases the rato of true to false matches and reduces the quer tme to an epected logarthmc tme. We have appled the robust M-estmaton to the soluton of the

regstraton functon under the parallel proecton model. We have shown the robustness of the method to outlers n the ntal GDI matchng. In comparson, Zhang s method can provde man more matches than the GDI method n man cases, but cannot provde an correct matches when the affne transformaton ncludes a large rotaton or scalng. Therefore, the proposed approach s useful for aeral mage regstraton and Zhang s method can be used for ground-based mage regstraton. The ntal matchng sub-task, before eppolar verfcaton, s clearl seen to be the maor hurdle to a full automatc feature-based regstraton method. New methods for ths sub-task are presentl under nvestgaton. References [1] Schmd, C., Mohr, R., "Matchng b local nvarants", Rapport de recherche, N. 644, INRIA, August 1995. [] Huber, P.J., "Robust Statstcs", John Wle & Sons, New York, 1981. [3] Zhang, Z., "Parameter estmaton technques: a tutoral wth applcaton to conc fttng," Image and Vson Computng, 15, pp. 59-76, 1997. [4] Prtt, M.D., "Image regstraton wth use of the eppolar constrant for parallel proectons," Journal of the Optcal Socet of Amerca A, Vol. 10, No. 10, pp. 187-19, October 1993. [5] Zhang, Z., Derche, R., Faugeras, O., Quang-Tuan,., "A robust technque for matchng two uncalbrated mages through the recover of the unknown eppolar geometr," Artfcal Intellgence, 781-, 87-119, 1995. [6] McRenolds, D.P., owe, D.G., "Rgdt checkng of 3D pont correspondences under perspectve proecton," IEEE Transactons PAMI,181, 1174-1185, 1996. [7] Sproull, R.F., "Refnements to nearest-neghbour searchng n k-dmensonal trees," Algorthmca, 6, 579-589, 1991. [8] Schmd, C., Mohr, R., "ocal Gravalue Invarants for Image Retreval," IEEE PAMI19, No. 5, 530-535, Ma 1997. [9] Fschler, M.A., Bolles, R.C., "Random sample consensus: a paradgm for model fttng wth applcatons to mage analss and automated cartograph," Comm. of the ACM, Vol. 4, No. 6, 381-395, June 1981. [10] Shapro,.S., Zsserman, A., Brad, M., "3D moton recover va affne eppolar geometr," Internatonal Journal of Computer Vson, 16, 147-18, 1995. [11] Schmd, C., Mohr, R., and Bauckhage, C., "Comparng and Evaluatng Interest Ponts," Proc. Internatonal Conference on Computer Vson, 30-35, Bomba, Inda, 1998. [1] Harrs, C., Stephens, M., "A combned corner and edge detector", Proceedngs 4th Alve Vson Conference, 147-151, 1988. [13] Derche, R., Zhang, Z., uong, Q.-T., Faugeras, O., "Robust recover of the eppolar geometr for an uncalbrated stereo rg," Proc. European Conference on Computer Vson, 567-576, 1994. [14] Hu, X., Ahua, N., "Feature etracton and matchng as sgnal detecton," Internatonal Journal of Pattern Recognton and Artfcal Intellgence, 86, 1343-1379, 1994. [15] Ballard, D.H., Wson,.E., "Obect recognton usng steerable flters at multple scales," IEEE Workshop on Qualtatve Vson, n conuncton wth the conference on Computer Vson and Pattern Recognton, New York, NY, -10, June 1993. [16] Belanger, ous N.; Gagnon, Roger; essard, Perre; Maheu, Jean; Blanchard, Auguste, "Pel and subpel mage stablzaton at vdeo rate for arborne long-range magng sensors," n Proc. SPIE Vol. 69, 15-159, Infrared Technolog XX, Born F. Andresen; Ed., October 1994. 15