
Notes on Robust Estimation
David J. Fleet and Allan Jepson
March 30, 2005

Robust Estimation. The field of robust statistics [3, 4] is concerned with estimation problems in which the data contains gross errors, or outliers, that do not conform to the statistical assumptions made for the majority of the data (e.g., Gaussian noise). The main goals of robust statistics are: "(i) to describe the structure best fitting the bulk of the data, and (ii) to identify deviating data points (outliers) or deviating substructures for further treatment, if desired."

Robust Regularization.

Denoising (revisited). Let us again consider the problem of denoising a one-dimensional input signal. A generalization of the previous regularization formulation is to minimize the objective function

$$E(v) = \sum_{x=1}^{N} \rho(v[x] - u[x];\, \sigma_d) + \sum_{x=1}^{N-1} \rho(v[x+1] - v[x];\, \sigma_s), \qquad (1)$$

where $\rho$ is an error norm and $\sigma_d$ and $\sigma_s$ are scale parameters. The least-squares formulation discussed in the previous regularization notes is the special case in which $\rho(z; \sigma) = z^2$. But the least-squares approach is notoriously sensitive to outliers; the problem is that outliers contribute too much to the overall solution.

To analyze the behavior of a $\rho$-function, we consider its derivative (denoted $\rho'$), which is called the influence function. The influence function characterizes the bias that a particular measurement has on the solution. For example, the quadratic $\rho$-function has a linear $\rho'$-function:

$$\rho(z) = z^2, \qquad \rho'(z) = 2z.$$

For least-squares estimation, the influence of outliers therefore increases linearly and without bound (Fig. 1). To increase robustness, an estimator must be more forgiving about outlying measurements; that is, their influence should increase less rapidly. For example, consider the Lorentzian error norm:

$$\rho(z; \sigma) = \log\!\left(1 + \frac{z^2}{2\sigma^2}\right), \qquad \rho'(z; \sigma) = \frac{2z}{2\sigma^2 + z^2}. \qquad (2)$$

The Lorentzian error norm ($\rho$) is plotted along with its influence function ($\rho'$) in Fig. 1. Examination of the $\rho'$-function reveals that when the absolute value of a residual increases beyond a threshold, its influence decreases. The Lorentzian is one of many possible robust error functions (see [3] for other options).

The least-squares regularization formulation assumes that the signal is equally smooth throughout. Figure 2 shows a "step" input with added noise. Linear regularization removes the noise appropriately, but it also blurs the step edge. Robust regularization smooths the smooth part while keeping the step sharp.

(Based on a handout by David Fleet and David Heeger at Stanford University.)
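As an illustration, the two error norms above can be written down directly; the following is a minimal Python/NumPy sketch (not part of the original handout, with illustrative function names):

import numpy as np

def rho_quadratic(z):
    # Least-squares error norm rho(z) = z^2 and its influence function rho'(z) = 2z.
    return z ** 2, 2.0 * z

def rho_lorentzian(z, sigma):
    # Lorentzian error norm rho(z; sigma) = log(1 + z^2 / (2 sigma^2))
    # and its influence function rho'(z; sigma) = 2z / (2 sigma^2 + z^2).
    return np.log1p(0.5 * (z / sigma) ** 2), 2.0 * z / (2.0 * sigma ** 2 + z ** 2)

# The quadratic influence grows without bound with the residual, while the
# Lorentzian influence peaks near |z| = sqrt(2)*sigma and then decays.
z = np.array([0.5, 1.0, 5.0, 50.0])
print(rho_quadratic(z)[1])         # [1, 2, 10, 100]
print(rho_lorentzian(z, 1.0)[1])   # roughly [0.44, 0.67, 0.37, 0.04]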

Figure 1: A: Least-squares error ($\rho$) function. B: Least-squares influence ($\rho'$) function. C: Lorentzian error ($\rho$) function. D: Lorentzian influence ($\rho'$) function.

Figure 2: A: Original step signal. B: Noisy step signal. C: Regularization result using a quadratic error norm for both the data constraint and the smoothness constraint. D: Regularization result using a quadratic norm for the data constraint, but a robust Lorentzian error norm for the smoothness constraint.

Likewise, the least-squares regularization formulation assumes that the error/noise in the measurements is independent and identically distributed (IID). Consider a signal in which the k-th input measurement is very badly corrupted. This will force the corresponding output value to be way off in order to minimize the squared difference between the two, $(u[k] - v[k])^2$. This in turn forces neighboring output values to shift so as to minimize the smoothness terms $(v[k+1] - v[k])^2$ and $(v[k] - v[k-1])^2$. The effect ripples from one neighboring sample to the next, so that the one bad measurement throws off the entire solution. Robust regularization instead allows the data constraint to be violated near the bad measurement.

Robust regularization can interpolate missing data, simply by leaving out the data constraint for those samples (as was done above for linear regularization). Robust regularization, like linear regularization, can be extended to two or more dimensions, and it can handle endpoints and edge pixels in a natural way.

Implementation. Given a robust regularization formulation, there are numerous techniques that can be employed to recover the smoothed and interpolated signal. In general, robust formulations do not admit closed-form solutions. A standard approach is to use a numerical optimization algorithm such as Newton's method. Newton's method minimizes a function iteratively. For example, to minimize a function $f(z)$, Newton's method iterates

$$z_m^{(i+1)} = z_m^{(i)} - \frac{f'(z_m^{(i)})}{f''(z_m^{(i)})}, \qquad (3)$$

where $z_m$ is the estimate of the location of the minimum (i.e., $f(z_m) \leq f(z)$ for all $z$). The functions $f'(z)$ and $f''(z)$ are, respectively, the first and second derivatives of $f$.

To apply Newton's method to minimize Eq. (1), we write its first and second derivatives with respect to $v[k]$:

$$\frac{\partial E}{\partial v[k]} = \rho'(v[k] - u[k];\, \sigma_d) + \rho'(v[k] - v[k-1];\, \sigma_s) - \rho'(v[k+1] - v[k];\, \sigma_s),$$

$$\frac{\partial^2 E}{\partial v[k]^2} = \rho''(v[k] - u[k];\, \sigma_d) + \rho''(v[k] - v[k-1];\, \sigma_s) + \rho''(v[k+1] - v[k];\, \sigma_s).$$

For example, using a quadratic error function (least-squares) and substituting these derivatives into Eq. (3) gives

$$v^{(i+1)}[k] = v^{(i)}[k] - \frac{2(v^{(i)}[k] - u[k]) + 2(2v^{(i)}[k] - v^{(i)}[k-1] - v^{(i)}[k+1])}{6} = \frac{1}{3}\left(u[k] + v^{(i)}[k-1] + v^{(i)}[k+1]\right),$$

which is identical to the Jacobi iteration introduced in the linear regularization notes.

Robust formulations typically result in nonconvex optimization problems. To find a globally optimal solution when the objective function is nonconvex, we choose a robust $\rho$-function with a scale parameter and adjust the scale parameter so as to construct a convex approximation. This approximation is readily minimized. Then successively better approximations of the true objective function are constructed by slowly adjusting the scale parameter back toward its original value. This process is sometimes called graduated non-convexity [2].
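The coordinate-wise Newton update above can be sketched in a few lines of Python/NumPy (an illustration under the assumptions of Eq. (1) with Lorentzian norms, not part of the original handout; taking the magnitude of the curvature and clamping it away from zero is a practical safeguard added here, since $\rho''$ is negative for large residuals):

import numpy as np

def lorentzian_d1(z, sigma):
    # First derivative (influence function) of the Lorentzian norm.
    return 2.0 * z / (2.0 * sigma ** 2 + z ** 2)

def lorentzian_d2(z, sigma):
    # Second derivative of the Lorentzian norm.
    return 2.0 * (2.0 * sigma ** 2 - z ** 2) / (2.0 * sigma ** 2 + z ** 2) ** 2

def robust_denoise_step(v, u, sigma_d, sigma_s):
    # One Jacobi-style Newton update of Eq. (1): each sample moves by
    # -E'(v[k]) / E''(v[k]), holding the neighbouring values fixed.
    d = v - u                                   # data residuals v[k] - u[k]
    f = np.diff(v)                              # forward differences v[k+1] - v[k]
    g = lorentzian_d1(d, sigma_d)               # dE/dv[k], data term
    h = np.abs(lorentzian_d2(d, sigma_d))       # curvature magnitude, data term
    g[1:] += lorentzian_d1(f, sigma_s)          #  + rho'(v[k] - v[k-1])
    g[:-1] -= lorentzian_d1(f, sigma_s)         #  - rho'(v[k+1] - v[k])
    h[1:] += np.abs(lorentzian_d2(f, sigma_s))
    h[:-1] += np.abs(lorentzian_d2(f, sigma_s))
    return v - g / np.maximum(h, 1e-8)          # guarded Newton step

# Usage: iterate the update on a noisy step signal.
u = np.concatenate([np.zeros(32), np.ones(32)]) + 0.05 * np.random.randn(64)
v = u.copy()
for _ in range(200):
    v = robust_denoise_step(v, u, sigma_d=1.0, sigma_s=0.1)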

Figure 3: Lorentzian error functions (left) and influence functions (right) for several values of $\sigma$ (1/2, 1, 2, and 4).

For example, Fig. 3 shows a family of Lorentzian error and influence functions for several values of $\sigma$. For larger values of $\sigma$, the error functions become more like quadratics and the influence functions become more like lines. The nonconvex Lorentzian error function becomes a simple (convex) quadratic when $\sigma$ is very large. To compute the result in Fig. 2D, we initially set $\sigma_s$ to a large value and then gradually reduced it to its final, much smaller value. The graduated non-convexity algorithm begins with the convex (quadratic) approximation, so the initial estimate contains no outliers. Outliers are gradually introduced by lowering the value of $\sigma$ and repeating the minimization. While this approach works well in practice, it is not guaranteed to converge to the global minimum since, as with least squares, the solution to the initial convex approximation may be arbitrarily bad.

Measurements beyond some threshold, $\tau$, can be considered outliers. The point where the influence of outliers first begins to decrease occurs where the second derivative of the $\rho$-function is zero. For the Lorentzian, the second derivative,

$$\rho''(z) = \frac{2(2\sigma^2 - z^2)}{(2\sigma^2 + z^2)^2},$$

is zero when $z = \pm\sqrt{2}\,\sigma$. If the maximum expected (absolute) residual is $\tau$, then choosing $\sigma = \tau/\sqrt{2}$ will result in a convex optimization problem. A similar treatment applies to other robust $\rho$-functions. Note that this also gives a simple test of whether or not a particular residual is treated as an outlier: in the case of the Lorentzian, a residual is an outlier if $|z| \geq \sqrt{2}\,\sigma$.

Figure 4 shows an example of applying robust regularization to a two-dimensional image, using a quadratic error norm for the data constraint and a Lorentzian error norm for the (first-order) membrane model smoothness constraint. Note that the regularization result is essentially a piecewise constant image, i.e., an image made up of constant regions separated by sharp discontinuities.

Piecewise Smoothness. In linear regularization, the (first-order) membrane model smoothness constraints are satisfied perfectly when the first derivative of the output $v$ is everywhere zero, i.e., when the output is a constant signal. For robust regularization (Eq. (1)), the robust first-order smoothness constraints are minimized by a piecewise constant signal (see Figs. 2 and 4). Likewise, the (second-order) thin-plate model smoothness constraints in linear regularization are satisfied when the second derivative of the output is everywhere zero, i.e., when $v$ is a linear ramp. For robust regularization, the robust second-order smoothness constraints are minimized by a signal made up of linear ramps separated by sharp discontinuities.
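The graduated non-convexity schedule and the outlier test described above can be sketched as follows (again an illustration, not part of the original handout; `step` stands for any single update of the robust objective, such as the Jacobi-style Newton update sketched earlier, and the geometric schedule of scale values is an assumption):

import numpy as np

def graduated_nonconvexity(u, step, sigma_d, sigma_target, n_levels=8, iters=100):
    # Begin with a large smoothness scale, where the Lorentzian is nearly
    # quadratic (convex), then shrink the scale toward its target value,
    # re-minimizing the objective at each level.
    v = u.copy()
    for sigma_s in np.geomspace(100.0 * sigma_target, sigma_target, n_levels):
        for _ in range(iters):
            v = step(v, u, sigma_d, sigma_s)
    return v

def lorentzian_outliers(residuals, sigma):
    # A residual is treated as an outlier by the Lorentzian once its influence
    # begins to decrease, i.e. when |z| >= sqrt(2) * sigma.
    return np.abs(residuals) >= np.sqrt(2.0) * sigma

# Example (assuming the robust_denoise_step function from the earlier sketch):
# v = graduated_nonconvexity(u, robust_denoise_step, sigma_d=1.0, sigma_target=0.1)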

Figure 4: (Left) Input image. (Middle) Regularization result using a robust Lorentzian error norm for the smoothness constraint. (Right) Edges obtained at pixels which are indicated to be outliers according to the Lorentzian error norm.

Regularization with Line Processes. Another approach to extending linear regularization has been to add line processes, which allow one to recover a piecewise smooth signal/image by marking the specific locations of discontinuities. For example, regularization using the (first-order) membrane model smoothness constraint with a line process minimizes the following error function:

$$E(v, l) = \sum_{x=1}^{N} (v[x] - u[x])^2 + \sum_{x=1}^{N-1} \left[ (v[x+1] - v[x])^2\, l[x] + \Psi(l[x]) \right], \qquad (4)$$

where $l[x]$ takes on values $0 \leq l[x] \leq 1$. The line process indicates the presence ($l[x] \to 0$) or absence ($l[x] \to 1$) of a discontinuity between neighboring samples. The function $\Psi(l[x])$ can be thought of as the "penalty" for introducing each discontinuity. Penalty functions typically go to 1 as $l[x]$ tends to 0, and vice versa. Thus, when no discontinuity is present, the smoothness term has the original least-squares form, but when a discontinuity is introduced, the smoothness term is dominated by $\Psi$. An example penalty function is $\Psi(l) = (1 - l)$. Minimizing Eq. (4) with respect to $v[x]$ and $l[x]$ gives a piecewise smooth signal with breaks where the spatial gradient is too large.

Black and Rangarajan [1] have recently unified the line process regularization formulation with the robust regularization formulation. That is, given a penalty function in Eq. (4), one can derive a robust error norm for the smoothness constraint in Eq. (1), and vice versa, so that the two optimization problems are identical. For example, using the penalty function $\Psi(l) = l - 1 - \log(l)$ in Eq. (4) is the same as using a Lorentzian error norm for the smoothness constraint in Eq. (1).

Regularization of Vector Fields. Imagine that a user has interactively specified the spatial displacement for an image warp, but only for a handful of pixels. This is commonly done for image morphing. Regularization can be used to compute a dense spatial warp from the sparse and scattered specification. Keeping the same notation as before, $\vec{u}$ is now the sparse specification of the spatial displacements and $\vec{v}$ is the desired dense field of spatial displacements. Both are vector-valued, so we use vector distances in the objective function:

$$E(\vec{v}) = \sum_{[x,y] \in P} \left\|\vec{v}[x,y] - \vec{u}[x,y]\right\|^2 + \sum_{\text{all } [x,y]} \left\|4\vec{v}[x,y] - \vec{v}[x-1,y] - \vec{v}[x,y-1] - \vec{v}[x+1,y] - \vec{v}[x,y+1]\right\|^2. \qquad (5)$$

This objective function can again be minimized with simple iterative techniques.
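A minimal Python/NumPy sketch of minimizing Eq. (5) by plain gradient descent follows (an illustration, not part of the original handout; the boolean constraint mask, the border handling, the step size, and the approximation of the smoothness gradient by applying the Laplacian twice are assumptions made here):

import numpy as np

def laplacian(v):
    # Discrete Laplacian 4*v[x,y] minus the 4-neighbour sum, replicating the border.
    p = np.pad(v, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return 4.0 * v - p[:-2, 1:-1] - p[2:, 1:-1] - p[1:-1, :-2] - p[1:-1, 2:]

def dense_warp_from_sparse(u, mask, n_iters=5000, step=0.01):
    # u: (H, W, 2) displacement constraints, valid only where mask is True.
    # Returns a dense field v that matches u on the constrained pixels (data
    # term of Eq. (5)) and has a small Laplacian everywhere (smoothness term).
    v = np.where(mask[..., None], u, 0.0).astype(float)
    for _ in range(n_iters):
        grad = 2.0 * laplacian(laplacian(v))         # smoothness gradient (approximate)
        grad += 2.0 * mask[..., None] * (v - u)      # data gradient on the set P
        v -= step * grad
    return v

# Hypothetical usage: pin the displacement at two corners and fill in the rest.
H, W = 32, 32
u = np.zeros((H, W, 2))
mask = np.zeros((H, W), dtype=bool)
u[0, 0] = (2.0, 0.0)
u[-1, -1] = (0.0, 2.0)
mask[0, 0] = mask[-1, -1] = True
v = dense_warp_from_sparse(u, mask)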

A similar process can also be used in painterly rendering. Here a given input image is to be copied in a painterly style. The spatial orientation of the individual paint strokes is given by some orientation field. One way to automatically generate constraints on a desirable orientation field is to use strong edges observed in the original image (say, using the Canny edge detector). The orientation of these edges can be represented by a 2D unit vector $\vec{u}[x,y]$. However, these strong edges are only observed at a small subset $P$ of the image pixels. What orientation should we use for paint strokes away from these edges? We could again minimize Eq. (5) to approximately interpolate the observed edge orientations at the pixels in $P$ to a smooth orientation field $\vec{v}[x,y]$ across the full image. Alternatively, a robust estimator could be used on the spatial term in Eq. (5) to allow for spatial discontinuities in the orientation field.

One issue that arises specifically for orientation fields is that we may not want to distinguish the sign of the gradient at an edge. For example, we may wish to consider a vertical edge where the x-component of the image gradient is positive to be the same orientation as another vertical edge having a negative x-component. If we used the angle of the image gradient to represent the orientation of these two edges, then these angles would differ by $\pm\pi$. A standard trick that is useful here is to use a doubled-angle representation for edge orientation. That is, we encode an edge with tangent direction $\vec{t}[x,y] = (\cos(\theta[x,y]), \sin(\theta[x,y]))^T$ in terms of the doubled angle $2\theta[x,y]$; so $\vec{t}[x,y]$ is represented by the vector $\vec{u}[x,y] = (\cos(2\theta[x,y]), \sin(2\theta[x,y]))^T$. In this case, image gradients with opposite signs have angles which differ by $\pm\pi$, so their doubled angles differ by $\pm 2\pi$. As a result, they are represented by identical vectors $\vec{u}[x,y]$. The orientation field can then be filled in by minimizing an objective function such as the one in Eq. (5), or a robust version of it. Once the vector field $\vec{v}[x,y]$ is computed, the orientation to be used for paint strokes can be set to half the angle of $\vec{v}[x,y]$. (A small sketch of this doubled-angle encoding appears after the references below.)

References

[1] M. J. Black and A. Rangarajan. On the unification of line processes, outlier rejection, and robust statistics with applications in early vision. International Journal of Computer Vision, 19(1):57-91, 1996.

[2] A. Blake and A. Zisserman. Visual Reconstruction. MIT Press, Cambridge, MA, 1987.

[3] F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel. Robust Statistics: The Approach Based on Influence Functions. John Wiley and Sons, New York, NY, 1986.

[4] P. J. Huber. Robust Statistics. John Wiley and Sons, New York, 1981.
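As noted above, here is a small Python/NumPy sketch of the doubled-angle encoding and decoding (an illustration, not part of the original handout; function names are illustrative):

import numpy as np

def orientation_to_vector(theta):
    # Encode an orientation theta (defined modulo pi) as a doubled-angle unit
    # vector, so that theta and theta + pi map to the same vector.
    return np.stack([np.cos(2.0 * theta), np.sin(2.0 * theta)], axis=-1)

def vector_to_orientation(v):
    # Decode a (possibly interpolated) doubled-angle vector back to an
    # orientation in [0, pi): half the angle of the vector.
    return 0.5 * np.arctan2(v[..., 1], v[..., 0]) % np.pi

# Gradient directions that differ by pi receive the same encoding, so
# averaging or regularizing the vectors cannot cancel them out.
theta_a, theta_b = 0.3, 0.3 + np.pi
va, vb = orientation_to_vector(theta_a), orientation_to_vector(theta_b)
assert np.allclose(va, vb)
print(vector_to_orientation(0.5 * (va + vb)))   # approximately 0.3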