Lecture 8.2 Structure from Motion. Thomas Opsahl

Similar documents
Step-by-Step Model Buidling

Structure from Motion. Introduction to Computer Vision CSE 152 Lecture 10

Image correspondences and structure from motion

Structure from motion

CS 395T Lecture 12: Feature Matching and Bundle Adjustment. Qixing Huang October 10 st 2018

Large Scale 3D Reconstruction by Structure from Motion

Geometry for Computer Vision

Structure from motion

Structure from motion

Lecture 10: Multi view geometry

Dense 3D Reconstruction. Christiano Gava

Structure from Motion CSC 767

Lecture 10: Multi-view geometry

CS231M Mobile Computer Vision Structure from motion

International Journal for Research in Applied Science & Engineering Technology (IJRASET) A Review: 3D Image Reconstruction From Multiple Images

Multiview Stereo COSC450. Lecture 8

Camera Drones Lecture 3 3D data generation

Structure from Motion

Bundle Adjustment. Frank Dellaert CVPR 2014 Visual SLAM Tutorial

BIL Computer Vision Apr 16, 2014

arxiv: v1 [cs.cv] 28 Sep 2018

Computer Vision I - Algorithms and Applications: Multi-View 3D reconstruction

Camera Calibration. COS 429 Princeton University

C280, Computer Vision

A Systems View of Large- Scale 3D Reconstruction

Structure from Motion

1 Projective Geometry

Fast and robust techniques for 3D/2D registration and photo blending on massive point clouds

Camera Models and Image Formation. Srikumar Ramalingam School of Computing University of Utah

CS 532: 3D Computer Vision 7 th Set of Notes

Improving Initial Estimations for Structure from Motion Methods

Colorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.

Dense 3D Reconstruction. Christiano Gava

arxiv: v1 [cs.cv] 28 Sep 2018

Structure from Motion

Homographies and RANSAC

Today. Stereo (two view) reconstruction. Multiview geometry. Today. Multiview geometry. Computational Photography

Lecture 9: Epipolar Geometry

Hartley - Zisserman reading club. Part I: Hartley and Zisserman Appendix 6: Part II: Zhengyou Zhang: Presented by Daniel Fontijne

Stereo and Epipolar geometry

Camera Models and Image Formation. Srikumar Ramalingam School of Computing University of Utah

CS231A Course Notes 4: Stereo Systems and Structure from Motion

Project 2: Structure from Motion

Camera calibration. Robotic vision. Ville Kyrki

Computational Optical Imaging - Optique Numerique. -- Multiple View Geometry and Stereo --

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

EECS 442: Final Project

Computational Optical Imaging - Optique Numerique. -- Single and Multiple View Geometry, Stereo matching --

Structure from Motion

Geometric camera models and calibration

Outline. ETN-FPI Training School on Plenoptic Sensing

Telecommunications Engineering Thesis

Structure from Motion

Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview

Vision Review: Image Formation. Course web page:

L15. POSE-GRAPH SLAM. NA568 Mobile Robotics: Methods & Algorithms

Research Article Resection-Intersection Bundle Adjustment Revisited

Euclidean Reconstruction Independent on Camera Intrinsic Parameters

Fundamental matrix. Let p be a point in left image, p in right image. Epipolar relation. Epipolar mapping described by a 3x3 matrix F

Index. 3D reconstruction, point algorithm, point algorithm, point algorithm, point algorithm, 263

CS 6320 Computer Vision Homework 2 (Due Date February 15 th )

Augmenting Reality, Naturally:

On the improvement of three-dimensional reconstruction from large datasets

Multiple View Geometry

Project: Camera Rectification and Structure from Motion

Robust extraction of image correspondences exploiting the image scene geometry and approximate camera orientation

A New Representation for Video Inspection. Fabio Viola

N-View Methods. Diana Mateus, Nassir Navab. Computer Aided Medical Procedures Technische Universität München. 3D Computer Vision II

Monocular Visual-Inertial SLAM. Shaojie Shen Assistant Professor, HKUST Director, HKUST-DJI Joint Innovation Laboratory

Metric Structure from Motion

calibrated coordinates Linear transformation pixel coordinates

Chaplin, Modern Times, 1936

Epipolar Geometry Prof. D. Stricker. With slides from A. Zisserman, S. Lazebnik, Seitz

A Global Linear Method for Camera Pose Registration

Unit 3 Multiple View Geometry

Computer Vision Lecture 17

A SEMI-AUTOMATIC PROCEDURE FOR TEXTURING OF LASER SCANNING POINT CLOUDS WITH GOOGLE STREETVIEW IMAGES

Computer Vision Lecture 17

Epipolar Geometry and Stereo Vision

Multi-View Geometry Part II (Ch7 New book. Ch 10/11 old book)

Data Acquisition, Leica Scan Station 2, Park Avenue and 70 th Street, NY

3D Scene Reconstruction with a Mobile Camera

ITERATIVE DENSE CORRESPONDENCE CORRECTION THROUGH BUNDLE ADJUSTMENT FEEDBACK-BASED ERROR DETECTION

COMPARISON OF LASER SCANNING, PHOTOGRAMMETRY AND SfM-MVS PIPELINE APPLIED IN STRUCTURES AND ARTIFICIAL SURFACES

An Overview of Matchmoving using Structure from Motion Methods

Index. 3D reconstruction, point algorithm, point algorithm, point algorithm, point algorithm, 253

Augmented Reality, Advanced SLAM, Applications

Target Shape Identification for Nanosatellites using Monocular Point Cloud Techniques

Multiview Photogrammetry 3D Virtual Geology for everyone

Srikumar Ramalingam. Review. 3D Reconstruction. Pose Estimation Revisited. School of Computing University of Utah

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

Sampling based Bundle Adjustment using Feature Matches between Ground-view and Aerial Images

Epipolar Geometry and Stereo Vision

Computer Vision and Remote Sensing. Lessons Learned

Error Minimization in 3-Dimensional Model Reconstruction Using Sparse Bundle Adjustment and the Levenberg-Marquardt Algorithm on Stereo Camera Pairs

Visual Pose Estimation System for Autonomous Rendezvous of Spacecraft

Robot Mapping. Graph-Based SLAM with Landmarks. Cyrill Stachniss

Last lecture. Passive Stereo Spacetime Stereo

The end of affine cameras

A Theory of Multi-Layer Flat Refractive Geometry

Transcription:

Lecture 8.2 Structure from Motion Thomas Opsahl

More-than-two-view geometry Correspondences (matching) More views enables us to reveal and remove more mismatches than we can do in the two-view case More views also enables us to predict correspondences that can be tested with or without the use of descriptors Scene geometry (structure) Effect of more views on determining the 3D structure of the scene? Camera geometry (motion) Effect of more views on determining camera poses? u ii = P i X j 2

Structure from Motion Problem Given m images of n fixed 3D points, estimate the m projection matrices P j and the n points X j from the m n correspondences u ii u kj u ii = P i X j 3

Structure from Motion Problem Given m images of n fixed 3D points, estimate the m projection matrices P j and the n points X j from the m n correspondences u ii u kj We can solve for structure and motion when 2mm 11m + 3n 15 In the general/uncalibrated case, cameras and points can only be recovered up to a projective ambiguity (u ii = P i Q 1 QX j ) In the calibrated case, they can be recovered up to a similarity (scale) Known as Euclidean/metric reconstruction u ii = P i X j 4

Structure from motion Problem Given m images of n fixed 3D points, estimate the m projection matrices P j and the n points X j from the m n correspondences u ii u kj We can solve for structure and motion when 2mm 11m + 3n 15 In the general/uncalibrated case, cameras and points can only be recovered up to a projective ambiguity (u ii = P i Q 1 QX j ) In the calibrated case, they can be recovered up to a similarity (scale) Known as Euclidean/metric reconstruction Projective reconstruction Metric reconstruction Images courtesy of Hartley & Zisserman http://www.robots.ox.ac.uk/~vgg/hzbook/ 5

Structure from motion Problem Given m images of n fixed 3D points, estimate the m projection matrices P j and the n points X j from the m n correspondences u ii u kj We can solve for structure and motion when 2mm 11m + 3n 15 In the general/uncalibrated case, cameras and points can only be recovered up to a projective ambiguity (u ii = P i Q 1 QX j ) In the calibrated case, they can be recovered up to a similarity (scale) Known as Euclidean/metric reconstruction This problem has been studied extensively and several different approaches have been suggested We will take a look at a couple of these Sequential structure from motion Bundle adjustment 6

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Structure X j j Initialize the 3D structure by triangulation Cameras P i i 7

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Structure X j j Initialize the 3D structure by triangulation For each additional view Determine the projection matrix P i, e.g. from 2D- 3D correspondences u ii X j Refine and extend the 3D structure by triangulation Cameras P i i 8

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Initialize the 3D structure by triangulation For each additional view Determine the projection matrix P i, e.g. from 2D- 3D correspondences u ii X j Refine and extend the 3D structure by triangulation Cameras P i i Structure X j New observations of old points j New points observed 9

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Structure X j j Extended structure Initialize the 3D structure by triangulation For each additional view Determine the projection matrix P i, e.g. from 2D- 3D correspondences u ii X j Refine and extend the 3D structure by triangulation Cameras P i i Refine old structure by triangulation over 3 views 10

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Structure X j j Initialize the 3D structure by triangulation For each additional view Determine the projection matrix P i, e.g. from 2D- 3D correspondences u ii X j Refine and extend the 3D structure by triangulation Cameras P i i 11

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Structure X j j Initialize the 3D structure by triangulation Cameras P i Extend structure For each additional view Determine the projection matrix P i, e.g. from 2D- 3D correspondences u ii X j Refine and extend the 3D structure by triangulation i Refine structure 12

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Structure X j j Initialize the 3D structure by triangulation For each additional view Determine the projection matrix P i, e.g. from 2D- 3D correspondences u ii X j Refine and extend the 3D structure by triangulation Cameras P i i 13

Sequential structure from motion Initialize motion from two images F P 1, P 2 E P 1, P 2 = K 1 I 0, K 2 1R 2 1 t 2 Structure X j j Initialize the 3D structure by triangulation For each additional view Determine the projection matrix P i, e.g. from 2D- 3D correspondences u ii X j Refine and extend the 3D structure by triangulation Cameras P i i The resulting structure and motion can be refined in a process known as bundle adjustment 14

Bundle adjustment Non-linear method that refines structure and motion by minimizing the sum of squared reprojection errors m n ε = d u ii, P i X j 2 i=1 j=1 Camera calibration can be solved as part of bundle adjustment by including intrinsic parameters and skew parameters in the cost function Need initial estimates for all parameters! 3 per 3D point ~12 per camera depending on parameterization Some intrinsic parameters, like the focal length, can be initialized from image EXIF data 15

Bundle adjustment There are several strategies that deals with the potentially extreme number of parameters Reduce the number of parameters by not including all the views and/or all the points Perform bundle adjustment only on a subset and compute missing views/points based on the result Divide views/points into several subsets which are bundle adjusted independently and merge the results Interleaved bundle adjustment Alternate minimizing the reprojection error by varying only the cameras or only the points This is viable since each point is estimated independently given fixed cameras, and similarly each camera is estimated independently from fixed points 16

Bundle adjustment Sparse bundle adjustment For each iteration, iterative minimization methods need to determine a vector Δ of changes to be made in the parameter vector In Levenberg-Marquardt each such step is determined from the equation J T J + λλ Δ = J T ε where J is the Jacobian matrix of the cost function and ε is the vector of errors For the bundle adjustment problem the Jacobian matrix has a sparse structure that can be exploited in computations The sparse structure of the Jacobian matrix for a bundle adjustment problem with 3 cameras and 4 3D points Figure courtesy of Hartley & Zisserman http://www.robots.ox.ac.uk/~vgg/hzbook/ 17

Bundle adjustment Sparse bundle adjustment For each iteration, iterative minimization methods need to determine a vector Δ of changes to be made in the parameter vector In Levenberg-Marquardt each such step is determined from the equation J T J + λλ Δ = J T ε where J is the Jacobian matrix of the cost function and ε is the vector of errors For the bundle adjustment problem the Jacobian matrix has a sparse structure that can be exploited in computations Combined with parallel processing the before mentioned strategies has made it possible to solve extremely large SfM problems 18

Bundle adjustment Sparse bundle adjustment For each iteration, iterative minimization methods need to determine a vector Δ of changes to be made in the parameter vector In Levenberg-Marquardt each such step is determined from the equation J T J + λλ Δ = J T ε where J is the Jacobian matrix of the cost function and ε is the vector of errors For the bundle adjustment problem the Jacobian matrix has a sparse structure that can be exploited in computations Combined with parallel processing the before mentioned strategies has made it possible to solve extremely large SfM problems S. Agarwal et al, Building Rome in a Day, 2011 Cluster of 62-computers 150 000 unorganized images from Rome ~37 000 image registered Total processing time ~21 hours SfM time ~7 hours J. Heinly et al, Reconstructing the World in Six Days, 2015 1 dual processor PC with 5 GPU s (CUDA) ~96 000 000 unordered images spanning the globe ~1.5 000 000 images registered Total processing time ~5 days SfM time ~17 hours 19

Bundle adjustment SBA Sparse Bundle Adjustment A generic sparse bundle adjustment C/C++ package based on the Levenberg-Marquardt algorithm Code (C and Matlab mex) available at http://www.ics.forth.gr/~lourakis/sba/ CVSBA is an OpenCV wrapper for SBA www.uco.es/investiga/grupos/ava/node/39/ Ceres By Google (used in production since 2010) A C++ library for modeling and solving large, complicated optimization problems like SfM Homepage: www.ceres-solver.org Code available on GitHub https://github.com/ceres-solver/ceres-solver GTSAM Georgia Tech Smoothing and Mapping A C++ library based on factor graphs that is well suited for SfM ++ Code (C++ library and Matlab toolbox) available at https://borg.cc.gatech.edu/borg/download g 2 o General Graph Optimization Open source C++ framework for optimizing graphbased nonlinear error functions Homepage: https://openslam.org/g2o.html Code available on GitHub https://github.com/rainerkuemmerle/g2o 20

Bundle adjustment Bundler A structure from motion system for unordered image collections written in C and C++ SfM based on a modified version SBA (default) or Ceres Homepage: http://www.cs.cornell.edu/~snavely/bundler/ Code available on GitHub https://github.com/snavely/bundler_sfm RealityCapture A state-of-the-art photogrammetry software that automatically extracts accurate 3D models from images, laser-scans and other input Homepage: https://www.capturingreality.com/ VisualSfM A GUI application for 3D reconstruction using structure from motion Output works with other tools that performs dense 3D reconstruction Homepage: http://ccwu.me/vsfm/ 21

Example Holmenkollen 2-view SfM More than 10 000 points are registered in addition to the 2 cameras 22

Example Holmenkollen 2-view SfM 23

Example Holmenkollen 2-view SfM 24

Example Holmenkollen 3-view SfM More than 16 10 000 points are registered in addition to the 32 cameras 25

Example Holmenkollen 4-view SfM More than 1620 10000 points are are registered in in addition to to the the 3 4 2 cameras 26

Example Holmenkollen 110-view SfM ~470 More 000 than 16 points 20 10000 are points registered are are registered in addition in in to addition the 110 to to the cameras the 3 4 2 cameras 27

Example Holmenkollen 110-view SfM Maximal ~470 More 000 than reprojection 16 points 20 10000 are points registered error are are ~3 registered pixels in addition in in Mean to addition the reprojection 110 to to the cameras the 3 4 2 cameras error ~0.7 pixels 28

Summary Structure from motion Sequential SfM Bundle adjustment m n ε = d u ii, P i X j 2 i=1 j=1 Additional reading: Szeliski: 7.3-7.5 Optional reading: Snavely N. Seitz S. M., Szeliski R., Modeling the World from Internet Photo Collections, 2007 S. Agarwal et al, Building Rome in a Day, 2011 J. Heinly et al, Reconstructing the World in Six Days, 2015 29