Multiview Stereo COSC450. Lecture 8

Stereo Vision So Far
- Stereo and epipolar geometry
  - Fundamental matrix captures geometry: $x'^{\top} F x = 0$
  - 8-point algorithm
  - Essential matrix with calibrated cameras
  - 5-point algorithm
- Intersect rays to recover 3D structure
- Errors and uncertainty
  - Rays don't intersect: use the point of closest approach
  - Outliers upset estimation: use RANSAC
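The "closest approach" idea for rays that fail to intersect can be sketched as below. This is a minimal illustration, not the lecture's own code: the function name `closest_approach_midpoint` and the toy rays are assumptions, and the rays are treated as infinite lines (no check that the parameters are positive, and parallel rays are not handled).

```python
import numpy as np

def closest_approach_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    c1, d1, c2, d2 = (np.asarray(v, dtype=float) for v in (c1, d1, c2, d2))
    # Least-squares solve of s*d1 - t*d2 = c2 - c1 for the ray parameters s, t.
    A = np.stack([d1, -d2], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, c2 - c1, rcond=None)
    p1 = c1 + s * d1  # closest point on the first ray
    p2 = c2 + t * d2  # closest point on the second ray
    return 0.5 * (p1 + p2), float(np.linalg.norm(p1 - p2))

# Two toy rays that miss each other by 0.2 units along the y axis.
point, gap = closest_approach_midpoint([0, 0, 0], [1, 0, 1], [2, 0.2, 0], [-1, 0, 1])
```

The returned gap gives a rough indication of how inconsistent the two rays are, which relates to the "errors and uncertainty" point above.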

Multi-view Stereo
- Can use more than two images
  - Multiple camera rigs
  - Single moving camera
  - Moving multi-camera systems
- Can reconstruct larger areas
- Can resolve more details
- Can use two-view methods
  - Fundamental matrix between each pair
  - Scales are not independent
  - Non-overlapping views are a problem
- Incremental approaches are common
  - Start with two cameras
  - Recover motion and structure
  - Determine a third camera's pose
  - Recover more structure
  - Repeat until done
- Incremental multiview problems
  - Determine order of reconstruction
  - Recover pose from 2D-3D matches
  - Stopping errors causing drift

Determining Reconstruction Order
- What frames to start with
  - Should have many matching features
  - Should have good geometry
- Choose the pair with the most matches?
  - If we have $n$ images, there are $O(n^2)$ pairs
  - Matching can be expensive
- Can use image search techniques for large $n$
  - Represent images as Bags of Words
  - Find nearest neighbours
  - $O(n \log n)$ once the kd-tree is built
- Once the first pair is done, what next?
  - We have some 3D points
  - We'll need many 2D-3D matches
  - Image with many matches with the first pair
  - Again, direct matching or kd-tree
- This repeats in a cycle
  - Determine pose of the new image
  - Compute new 3D structure
  - Update existing 3D points
  - Can add multiple images at once
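To make the $O(n^2)$ cost of exhaustive pair selection concrete, here is a small sketch. The names `count_pairs`, `best_initial_pair`, and the `toy_matches` table are hypothetical, and the sketch only uses the "most matches" criterion; it ignores the "good geometry" requirement.

```python
from itertools import combinations

def count_pairs(n):
    """Number of unordered image pairs among n images."""
    return n * (n - 1) // 2

def best_initial_pair(num_images, match_count):
    """Exhaustively pick the pair (i, j), i < j, with the most feature matches."""
    return max(combinations(range(num_images), 2), key=lambda pair: match_count(*pair))

# Hypothetical match counts for four images.
toy_matches = {(0, 1): 120, (0, 2): 45, (0, 3): 10, (1, 2): 310, (1, 3): 80, (2, 3): 95}
pair = best_initial_pair(4, lambda i, j: toy_matches[(i, j)])
pairs_for_100_images = count_pairs(100)
```

With 100 images there are already 4,950 pairs to match, which is why image search techniques become attractive for large collections.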

Perspective-n-Point Pose
- Can use 2D-3D matches directly
- Have 6 unknowns $(R, t)$
- Each 2D-3D match gives
  $k \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K [R \mid t] \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$
- We want to determine $R$ and $t$
- How many matches do we need?
  - We have six unknowns for $R$, $t$
  - Each point adds three equations
  - But also 1 unknown ($k$)
- If we have $n$ matching points
  - We have $6 + n$ unknowns
  - And we get $3n$ equations
- Therefore $n = 3$ matches are needed, since $3n \ge 6 + n$ gives $n \ge 3$
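The projection equation for a single 2D-3D match can be evaluated directly. The sketch below assumes a toy calibration matrix and an identity pose purely for illustration; the function name `project` is not from the lecture.

```python
import numpy as np

def project(K, R, t, X):
    """Evaluate k [u v 1]^T = K [R | t] [x y z 1]^T and return ((u, v), k)."""
    P = K @ np.hstack([R, t.reshape(3, 1)])  # 3x4 projection matrix
    h = P @ np.append(X, 1.0)                # homogeneous image point
    k = h[2]                                 # the per-point unknown scale
    return h[:2] / k, k

# Toy calibration and pose (assumed values).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
uv, k = project(K, np.eye(3), np.zeros(3), np.array([0.2, -0.1, 2.0]))
```

Each match contributes the three rows of this equation and one extra unknown `k`, which is where the $3n$ equations and $6 + n$ unknowns come from.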

Perspective-n-Point Pose
- This is a non-linear problem
  - Homogeneous points
  - Rotation matrix
- The geometry is simpler
  - We know 3D points, $A$, $B$, $C$
  - We know their projections, $a$, $b$, $c$
  - The camera is at some point, $P$
  - This defines a tetrahedron, giving us $P$
  - Aligning $PA$ with $Pa$ etc. gives us $R$
- RANSAC can be used for robust estimation

[Figure: tetrahedron with the camera centre P as apex and the 3D points A, B, C as base]
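One way to see how the tetrahedron fixes $P$ is through the law of cosines on its three side faces. The notation here is an assumption introduced for illustration: $s_A, s_B, s_C$ are the unknown distances from $P$ to $A, B, C$, and $\theta_{ab}$ is the angle between the viewing rays through $a$ and $b$, which is known once the camera is calibrated.

```latex
\begin{aligned}
|AB|^2 &= s_A^2 + s_B^2 - 2\, s_A s_B \cos\theta_{ab} \\
|BC|^2 &= s_B^2 + s_C^2 - 2\, s_B s_C \cos\theta_{bc} \\
|CA|^2 &= s_C^2 + s_A^2 - 2\, s_C s_A \cos\theta_{ca}
\end{aligned}
```

These are three quadratic equations in three unknowns. They generally admit more than one solution, so in practice an additional match is commonly used to pick the right one.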

Reprojection Error
- Many steps minimise some function
  - $\|Af\|$ to estimate $F$
  - $\|Ax\|$ for triangulation
  - PnP model for $n > 3$
- It's not always clear what these mean
- Taking a step back
  - We measure points in images
  - We have a model
  - We want the model and the measurements to agree
- Our measurements are $\mathbf{u}_{i,j} = (u_{i,j}, v_{i,j})$, the $i$th point in the $j$th image
- Our model consists of
  - The 3D location, $\mathbf{x}_i = (x_i, y_i, z_i)$, of the $i$th point
  - The calibration, $K_j$, of the $j$th camera
  - The pose, $(R_j, t_j)$, of the $j$th camera
- Can predict measurements from the model (equality up to scale):
  $\begin{bmatrix} \tilde{\mathbf{u}}_{i,j} \\ 1 \end{bmatrix} \simeq K_j [R_j \mid t_j] \begin{bmatrix} \mathbf{x}_i \\ 1 \end{bmatrix}$

Reprojection Error
- We want to minimise
  $\sum_{i=1}^{M} \sum_{j=1}^{N} \| \mathbf{u}_{i,j} - \tilde{\mathbf{u}}_{i,j} \|$
  - $M$ is the number of 3D points
  - $N$ is the number of images
- This is non-linear
  - Minimising it is not simple
  - But it has a clear meaning
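The double sum can be written out directly. This is a sketch under stated assumptions: every point is visible in every image (as the sum implies), the cameras and points are toy values, and the names `predict` and `reprojection_error` are illustrative. Least-squares solvers typically square each term, which would be a one-line change here.

```python
import numpy as np

def predict(K, R, t, X):
    """Predicted pixel location of 3D point X in a camera with calibration K and pose (R, t)."""
    h = K @ (R @ X + t)
    return h[:2] / h[2]

def reprojection_error(points3d, cameras, observations):
    """Sum over points i and images j of ||u_ij - predicted u_ij||."""
    total = 0.0
    for i, X in enumerate(points3d):
        for j, (K, R, t) in enumerate(cameras):
            total += np.linalg.norm(observations[i, j] - predict(K, R, t, X))
    return float(total)

# Toy scene: two cameras, two points, perfect observations except one.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
cameras = [(K, np.eye(3), np.zeros(3)), (K, np.eye(3), np.array([-1.0, 0.0, 0.0]))]
points3d = np.array([[0.0, 0.0, 4.0], [1.0, 1.0, 5.0]])
observations = np.array([[predict(Kj, Rj, tj, X) for (Kj, Rj, tj) in cameras] for X in points3d])
observations[0, 1] += np.array([3.0, 4.0])  # shift one measurement by 5 pixels
error = reprojection_error(points3d, cameras, observations)
```

The error is measured in pixels, which is the "clear meaning" referred to above: it is how far the model's predictions land from what was actually observed.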

Non-Linear Least Squares
- Linear least squares is (fairly) easy
  - To estimate some parameters, $p$
  - Form the linear equation $Ap = b$
  - Solve $A^{\top} A p = A^{\top} b$
- Non-linear least squares is (much) harder
  - Form an initial guess of $p$
  - Our model is $f(p) = b$
  - Here $f$ is any (continuous) function
  - Make a linear approximation to $f$
  - Use this to update the estimate of $p$
- We start with a 1D example
  - We are given some measurements $m(x_i) = y_i$
  - We assume that the measurements come from some function with a parameter, $p$, to estimate: $y_i \approx f(x_i, p)$
  - And we have an initial guess, $p_0$
  - We find a series of estimates, $p_1, p_2, \ldots$
  - Each estimate is more accurate
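The linear case can be shown in a few lines. This sketch fits a straight line to toy data by solving the normal equations $A^{\top} A p = A^{\top} b$; the data and variable names are assumptions for illustration.

```python
import numpy as np

# Toy data generated from y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
b = 2.0 * x + 1.0

# Each row of A is [x_i, 1], so A @ p evaluates slope * x_i + intercept.
A = np.column_stack([x, np.ones_like(x)])

# Solve the normal equations A^T A p = A^T b for p = (slope, intercept).
p = np.linalg.solve(A.T @ A, A.T @ b)
```

For larger or poorly conditioned problems, a routine such as `np.linalg.lstsq` is usually preferred over forming $A^{\top} A$ explicitly.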

Non-Linear Least Squares
- We can write this in vector form
  $y \approx f(x, p)$
- And we minimise the squared error
  $\epsilon = \| y - f(x, p) \|^2$
- We have an initial error,
  $\epsilon_0 = \| y - f(x, p_0) \|^2$
- We can approximate $f(x, p)$ by
  $f(x, p_0 + \delta) \approx f(x, p_0) + \left. \frac{\partial f}{\partial p} \right|_{p = p_0} \delta$
- The error becomes
  $\epsilon \approx \left\| y - f(x, p_0) - \frac{\partial f}{\partial p} \delta \right\|^2$
- We want to update $p$ by $\delta$ to minimise $\epsilon$

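Updating the Parameters
- A simple method is to step along the gradient
  - Step along the negative gradient
  - A small enough step always helps
  - Stepping too far can be a problem
  - Can search for a good step size
- However, this can be slow to converge
  - Slow when the gradient is small
  - Valleys in multiple dimensions
- Alternatively, at the minimum error
  $0 = \frac{\partial \epsilon}{\partial \delta}$
  $0 = -2 \left( \frac{\partial f}{\partial p} \right)^{\top} \left( y - f(x, p_0) - \frac{\partial f}{\partial p} \delta \right)$
  $\left( \frac{\partial f}{\partial p} \right)^{\top} \left( \frac{\partial f}{\partial p} \right) \delta = \left( \frac{\partial f}{\partial p} \right)^{\top} \left( y - f(x, p_0) \right)$
- This is the Gauss-Newton algorithm
  - Faster to converge in most cases
  - But not guaranteed to converge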
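A one-parameter Gauss-Newton iteration can be sketched as follows. The model $f(x, p) = e^{px}$, the data, and the starting guess are assumptions chosen for illustration; the update divides the sum of derivative-times-residual by the sum of squared derivatives, which is the scalar form of the normal equations.

```python
import math

def gauss_newton_1d(xs, ys, p0, iterations=10):
    """Estimate p in y = exp(p * x) by repeated linearisation around the current p."""
    p = p0
    for _ in range(iterations):
        residuals = [y - math.exp(p * x) for x, y in zip(xs, ys)]  # y - f(x, p)
        derivs = [x * math.exp(p * x) for x in xs]                 # df/dp at each x
        # Scalar normal equation: (sum d^2) * delta = sum d * r
        delta = sum(d * r for d, r in zip(derivs, residuals)) / sum(d * d for d in derivs)
        p += delta
    return p

xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(0.5 * x) for x in xs]  # noise-free data with true p = 0.5
p_est = gauss_newton_1d(xs, ys, p0=0.3)
```

There is no step-size control in this sketch, so a poor starting guess can make the iteration diverge, in line with the caveat that Gauss-Newton is not guaranteed to converge.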

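Levenberg-Marquardt Algorithm
- We've considered a single parameter, $p$
  - Generally this is a vector, $p = [\, p_1 \; p_2 \; \ldots \; p_n \,]^{\top}$
- The function is also vector-valued
  $f(x, p) = [\, f_1(x, p) \; \ldots \; f_m(x, p) \,]^{\top}$
- The derivative becomes a matrix
  $J = \begin{bmatrix} \frac{\partial f_1}{\partial p_1} & \cdots & \frac{\partial f_1}{\partial p_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial p_1} & \cdots & \frac{\partial f_m}{\partial p_n} \end{bmatrix}$
  - This is called the Jacobian
- We now solve
  $J^{\top} J \delta = J^{\top} (y - f(x, p))$
  where $y - f(x, p)$ is the residual vector at the current estimate
- Levenberg suggested solving
  $(J^{\top} J + \lambda I) \delta = J^{\top} (y - f(x, p))$
  - If $\lambda$ is small, this is Gauss-Newton
  - If $\lambda$ is large, this is gradient descent
- Marquardt noted that it is more stable to use
  $\left( J^{\top} J + \lambda \, \mathrm{diag}(J^{\top} J) \right) \delta = J^{\top} (y - f(x, p))$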
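A compact Levenberg-Marquardt loop using Marquardt's diagonal scaling might look like the sketch below. The rule for adapting $\lambda$ (divide by 10 after a successful step, multiply by 10 after a failed one) is a common convention assumed here, not something stated in the slides, and the exponential model and data are toy choices.

```python
import numpy as np

def levenberg_marquardt(f, jac, p0, y, lam=1e-3, iterations=50):
    """Minimise ||y - f(p)||^2 with damped Gauss-Newton steps."""
    p = np.asarray(p0, dtype=float)
    r = y - f(p)   # residual vector
    err = r @ r    # squared error
    for _ in range(iterations):
        J = jac(p)
        JtJ = J.T @ J
        # Marquardt's form: (J^T J + lambda * diag(J^T J)) delta = J^T r
        delta = np.linalg.solve(JtJ + lam * np.diag(np.diag(JtJ)), J.T @ r)
        r_new = y - f(p + delta)
        err_new = r_new @ r_new
        if err_new < err:  # accept: behave more like Gauss-Newton
            p, r, err = p + delta, r_new, err_new
            lam /= 10.0
        else:              # reject: behave more like gradient descent
            lam *= 10.0
    return p

# Toy model y = a * exp(b * x) with true parameters (a, b) = (2, -1).
x = np.linspace(0.0, 3.0, 20)
y = 2.0 * np.exp(-1.0 * x)
model = lambda p: p[0] * np.exp(p[1] * x)
jacobian = lambda p: np.column_stack([np.exp(p[1] * x), p[0] * x * np.exp(p[1] * x)])
p_est = levenberg_marquardt(model, jacobian, [1.5, -0.5], y)
```

Because a step is only accepted when it lowers the error, the squared error never increases from one iteration to the next.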

Bundle Adjustment
- For $N$ images and $F$ features, $y = f(p)$
- $y$ are our measurements
  - 2D locations of image features
  - There are $2NF$ measurements
- $p$ are our parameters
  - $R$, $t$ and maybe $K$ for each image
  - 3D locations for each feature
  - There are at least $6N + 3F$ parameters
- The Jacobian is at least $2NF \times (6N + 3F)$
  - If we take 100 images (easy to do)
  - And each has 1,000 features (not many)
  - $J$ is about $200{,}000 \times 3{,}600$
  - Just storing $J$ as floats needs nearly 3 GB of RAM, let alone doing the maths
- Fortunately $J$ is sparse
  - Each 2D measurement depends on just one camera and one 3D point
  - This means each row has 9 non-zeros
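The size estimate can be checked with a few lines of arithmetic. This sketch assumes 4-byte floats and, for the sparse figure, counts only the non-zero values (index storage would add to it).

```python
N, F = 100, 1000           # images and features, as in the example above

rows = 2 * N * F           # one x and one y measurement per feature per image
cols = 6 * N + 3 * F       # 6 pose parameters per image, 3 coordinates per point

bytes_per_float = 4        # assumption: single-precision floats
dense_bytes = rows * cols * bytes_per_float

nonzeros = rows * 9        # 6 camera entries + 3 point entries per row
sparse_value_bytes = nonzeros * bytes_per_float

print(f"J is {rows} x {cols}")
print(f"dense storage:  {dense_bytes / 1e9:.2f} GB")
print(f"sparse values:  {sparse_value_bytes / 1e6:.1f} MB")
```

Only 9 of the 3,600 entries in each row are non-zero, which is 0.25% of the matrix.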

Sparse Structure

[Figure: sparsity pattern of the Jacobian. Rows are the measurements $x_{1,1}, y_{1,1}, x_{1,2}, y_{1,2}, \ldots, x_{N,F}, y_{N,F}$. Columns are the camera parameters $R_1, t_1, R_2, t_2, \ldots, R_N, t_N$ followed by the point coordinates $X_1, Y_1, Z_1, X_2, Y_2, Z_2, \ldots, X_F, Y_F, Z_F$.]
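The sparsity pattern can be generated for a small problem to see its block structure. This sketch assumes 6 pose parameters per camera, every feature visible in every image, and rows ordered image by image; the function name `jacobian_pattern` is illustrative.

```python
import numpy as np

def jacobian_pattern(N, F):
    """Boolean mask of the non-zero entries of the bundle adjustment Jacobian."""
    mask = np.zeros((2 * N * F, 6 * N + 3 * F), dtype=bool)
    for j in range(N):          # image index
        for f in range(F):      # feature index
            row = 2 * (j * F + f)                                   # x row; y row follows
            mask[row:row + 2, 6 * j:6 * j + 6] = True               # pose of camera j
            mask[row:row + 2, 6 * N + 3 * f:6 * N + 3 * f + 3] = True  # 3D point f
    return mask

mask = jacobian_pattern(N=2, F=3)
nonzeros_per_row = mask.sum(axis=1)
```

Printing `mask.astype(int)` shows a block-diagonal camera part on the left and a repeated diagonal point part on the right.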

Multi-View Stereo Recap
1. Pick an initial pair of images (many features in common)
2. Determine their relative pose (8- or 5-point algorithm)
3. Determine initial 3D structure (triangulation)
4. Refine the initial estimate (bundle adjustment)
5. Pick the next image(s) to be added (many 2D-3D matches)
6. Estimate their pose and additional 3D structure
7. Refine the estimate (bundle adjustment)
8. If there are more images, go to 5

Dense Stereo Estimation
- The structure tends to be sparse
  - Made from feature correspondences
  - We reject many matches to find good ones
- Once camera poses are estimated
  - We know epipolar geometry
  - We can recover more reliable matches
  - We can expand these to form patches
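Knowing the epipolar geometry lets a candidate match be tested by its distance to the epipolar line. The sketch below assumes a fundamental matrix `F` mapping points in the first image to lines in the second; the toy `F` corresponds to a pure sideways translation with identity calibration, and the function name is illustrative.

```python
import numpy as np

def epipolar_distance(F, x, x_prime):
    """Pixel distance from x_prime (image 2) to the epipolar line of x (image 1)."""
    a, b, c = F @ np.array([x[0], x[1], 1.0])  # line a*u + b*v + c = 0 in image 2
    return abs(a * x_prime[0] + b * x_prime[1] + c) / np.hypot(a, b)

# Toy fundamental matrix: translation along x, so epipolar lines are horizontal.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

d_off = epipolar_distance(F, (10.0, 20.0), (50.0, 23.0))  # 3 pixels off the line
d_on = epipolar_distance(F, (10.0, 20.0), (50.0, 20.0))   # exactly on the line
```

A dense matcher could restrict its search to pixels within a small distance of the line, turning a 2D search into a nearly 1D one.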

Surface Estimation
- Point clouds are limited models
- We want to fit surfaces to points
  - This is an ill-posed problem
  - Interpolating vs approximating surfaces
- Once we have a surface
  - We can reproject the images
  - This gives fine texture detail
  - Need to merge images (mosaicing)