Week 2: Two-View Geometry. Padua Summer 08 Frank Dellaert

Mosaicking Outline: 2D Transformation Hierarchy; RANSAC; Triangulation of 3D Points; Cameras; Triangulation via SVD; Automatic Correspondence; Essential and Fundamental Matrix; A Recipe for Correspondence

Mosaicking www.cs.cmu.edu/~dellaert/mosaicking

Hierarchy of 2D Transforms. Subgroup structure: Translation (2 DOF), Rigid 2D (3 DOF), Affine (6 DOF), Projective (8 DOF).
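For reference, the standard homogeneous parameterizations behind these DOF counts can be sketched as follows (a LaTeX reminder, not taken from the slides):

```latex
% Homogeneous 3x3 forms of the 2D subgroups listed above (acting on (x, y, 1)^T)
\underbrace{\begin{bmatrix} I_{2} & t \\ \mathbf{0}^{\top} & 1 \end{bmatrix}}_{\text{translation: } 2\text{ DOF}}
\qquad
\underbrace{\begin{bmatrix} R(\theta) & t \\ \mathbf{0}^{\top} & 1 \end{bmatrix}}_{\text{rigid: } 3\text{ DOF}}
\qquad
\underbrace{\begin{bmatrix} A & t \\ \mathbf{0}^{\top} & 1 \end{bmatrix}}_{\text{affine: } 6\text{ DOF}}
\qquad
\underbrace{\; H \sim \lambda H,\ H \in \mathbb{R}^{3\times 3} \;}_{\text{projective: } 8\text{ DOF (up to scale)}}
```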

Rigid 2D Transform Take Notes

Mosaicking Outline: 2D Transformation Hierarchy; RANSAC; Triangulation of 3D Points; Cameras; Triangulation via SVD; Automatic Correspondence; Essential and Fundamental Matrix; A Recipe for Correspondence

Motivation: estimating motion models, typically from points in two images. Candidates: translation, rotation, 2D rigid transform, homography.

Simpler Example: fitting a straight line.

Discard Outliers: reject every point with d > t. RANSAC (RANdom SAmple Consensus, Fischler & Bolles 1981) copes with a large proportion of outliers.

Main Idea: select 2 points at random and fit a line; support = number of inliers. The line with the most inliers wins.

Why will this work?

The Best Line has the most support: more support -> a better fit.

In General: fit a more general model; a sample is a minimal subset. Translation? Homography? Fundamental matrix?

RANSAC. Objective: robust fit of a model to a data set S. Algorithm: randomly select s points; instantiate a model; get the consensus set S_i; if |S_i| > T, terminate and return the model; otherwise repeat for N trials and return the model with the largest consensus set.
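A minimal sketch of this loop for the line-fitting example, in Python/NumPy; the helper fit_line and the default trial count are illustrative choices, not part of the lecture:

```python
import numpy as np

def fit_line(p, q):
    """Line a*x + b*y + c = 0 through two points, with (a, b) normalized."""
    a, b = q[1] - p[1], p[0] - q[0]
    c = -(a * p[0] + b * p[1])
    n = np.hypot(a, b)
    return np.array([a, b, c]) / n

def ransac_line(points, t, N=100, T=None, rng=np.random.default_rng(0)):
    """RANSAC for a 2D line: sample s=2 points, fit, count inliers within distance t."""
    best_line, best_inliers = None, np.zeros(len(points), dtype=bool)
    for _ in range(N):
        i, j = rng.choice(len(points), size=2, replace=False)
        line = fit_line(points[i], points[j])
        d = np.abs(points @ line[:2] + line[2])        # point-to-line distances
        inliers = d < t
        if inliers.sum() > best_inliers.sum():
            best_line, best_inliers = line, inliers
        if T is not None and best_inliers.sum() > T:   # early termination
            break
    return best_line, best_inliers
```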

Distance Threshold: requires a noise distribution. For Gaussian noise with standard deviation σ, the squared distance follows a chi-squared distribution with m DOF. Taking the 95% cumulative value: for a line or F, m = 1 and t² = 3.84σ²; for a translation or homography, m = 2 and t² = 5.99σ². I.e., a true inlier satisfies d < t with 95% probability.

How Many Samples? We want at least one sample with all inliers. We can't guarantee this, only require it with probability p, e.g. p = 0.99.

Calculate N: let w be the proportion of inliers, w = 1 − ε, where ε is the outlier fraction. Then P(a sample is all inliers) = w^s, P(a sample contains an outlier) = 1 − w^s, and P(all N samples contain an outlier) = (1 − w^s)^N. We want (1 − w^s)^N < 1 − p, so N > log(1 − p) / log(1 − w^s).

Example (p = 0.99): s = 2, ε = 5% ⇒ N = 2; s = 2, ε = 50% ⇒ N = 17; s = 4, ε = 5% ⇒ N = 3; s = 4, ε = 50% ⇒ N = 72; s = 8, ε = 5% ⇒ N = 5; s = 8, ε = 50% ⇒ N = 1177.
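These N values follow from the formula on the previous slide, N > log(1 − p)/log(1 − w^s); a quick illustrative check in Python (ε is the outlier fraction):

```python
import math

def num_samples(p, eps, s):
    """Number of RANSAC trials so that, with probability p, at least one
    sample of size s is outlier-free, given outlier fraction eps."""
    w = 1.0 - eps                      # inlier fraction
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** s))

for s, eps in [(2, 0.05), (2, 0.5), (4, 0.05), (4, 0.5), (8, 0.05), (8, 0.5)]:
    print(s, eps, num_samples(0.99, eps, s))   # reproduces N = 2, 17, 3, 72, 5, 1177
```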

Remarks: N is a function of ε and s, not of the number of points; N increases steeply with s.

Threshold T. Remember: terminate if |S_i| > T. Rule of thumb: T ≈ the expected number of inliers, so T = (1 − ε)n.

Adaptive N: what if ε is unknown? Start with ε = 50%, N = ∞. Repeat: sample s points and fit a model; update ε as (#outliers of the best model so far)/n; set N = f(ε, s, p). Terminate once N samples have been seen.
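A sketch of this adaptive scheme; the sample_and_count_inliers callback (fit a model on s random points and count its inliers) is a hypothetical placeholder for whatever model is being estimated:

```python
import math

def adaptive_ransac(n, s, p, sample_and_count_inliers):
    """Run RANSAC trials, re-estimating the outlier ratio eps from the best
    consensus set so far and updating the required number of trials N."""
    eps, N, trials = 0.5, float("inf"), 0
    best = None
    while trials < N:
        model, num_inliers = sample_and_count_inliers()    # fit on s random points
        if best is None or num_inliers > best[1]:
            best = (model, num_inliers)
            eps = min(eps, 1.0 - num_inliers / n)          # update outlier ratio
            w = 1.0 - eps
            N = math.log(1.0 - p) / math.log(max(1.0 - w ** s, 1e-12))  # N = f(eps, s, p)
        trials += 1
    return best
```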

Mosaicking Outline: 2D Transformation Hierarchy; RANSAC; Triangulation of 3D Points; Cameras; Triangulation via SVD; Automatic Correspondence; Essential and Fundamental Matrix; A Recipe for Correspondence

Pinhole Camera: a 3D point (x, y, z) projects to (f x/z, f y/z).

Perspective Camera Model: a linear transformation of homogeneous (projective) coordinates, p = (u, v, w)^T = [I 0] P, with P = (X, Y, Z, T)^T. Recover image (Euclidean) coordinates by normalizing: û = u/w = X/Z, v̂ = v/w = Y/Z.

Normalized Image Coordinates: u = x/z is dimensionless!

Pixel Units: pixels lie on a grid of a certain dimension; u = k f X/Z is in pixels, with [f] = m (meters) and [k] = pixels/m.

Pixel Coordinates: we put the pixel coordinate origin at the top left; u = u0 + k f X/Z.

Pixel Coordinates in 2D: (u0 + k f X/Z, v0 + l f Y/Z). For a 640 × 480 image, pixel coordinates run from (0.5, 0.5) to (640.5, 480.5), with principal point (u0, v0).

Important: the MATLAB convention starts at (1,1)! Just as good as any other convention.

Summary: the Intrinsic 3×3 Calibration Matrix K. p = (u, v, w)^T = K [I 0] P, with K = [α s u0; 0 β v0; 0 0 1] (s is the skew) and P = (X, Y, Z, T)^T. Recover image (Euclidean) coordinates by normalizing: û = u/w = (αX + sY + u0 Z)/Z, v̂ = v/w = (βY + v0 Z)/Z. K has 5 degrees of freedom!
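A tiny numeric sketch of p = K[I 0]P followed by normalization; the particular α, β, s, u0, v0 and the point are made-up values for illustration:

```python
import numpy as np

# Intrinsic calibration matrix K (alpha, beta in pixels; s = skew; (u0, v0) = principal point)
alpha, beta, s, u0, v0 = 800.0, 800.0, 0.0, 320.0, 240.0
K = np.array([[alpha, s,    u0],
              [0.0,   beta, v0],
              [0.0,   0.0,  1.0]])

P = np.array([0.1, -0.2, 2.0, 1.0])                    # homogeneous point (X, Y, Z, T)
p = K @ np.hstack([np.eye(3), np.zeros((3, 1))]) @ P   # p = K [I 0] P
u_hat, v_hat = p[0] / p[2], p[1] / p[2]                # u_hat = (alpha*X + s*Y + u0*Z)/Z
print(u_hat, v_hat)                                    # -> 360.0, 160.0
```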

Camera Pose: in order to apply the camera model, objects in the scene must be expressed in camera coordinates. (Figure: world coordinates vs. camera coordinates.) The calibration target looks tilted from the camera viewpoint; this can be explained as a difference in coordinate systems.

Hierarchy of 3D Transforms. Subgroup structure: Translation (3 DOF), Rigid 3D (6 DOF), Affine (12 DOF), Projective (15 DOF).

Rigid Body Transformations: we need a way to specify the six degrees of freedom of a rigid body. Why are there 6 DOF? A rigid body is a collection of points whose positions relative to each other can't change. Fixing one point takes three DOF; fixing a second point takes two more (it must maintain the distance constraint); a third point adds one more DOF, for the rotation around the line joining the first two.

Notation: a leading superscript references the coordinate frame; ^A P is the coordinates of P in frame A, ^B P is the coordinates of P in frame B. Example: ^A P = (^A x, ^A y, ^A z)^T, i.e. OP = ^A x · i_A + ^A y · j_A + ^A z · k_A.

Translation: ^B P = ^A P + ^B O_A, where ^B O_A is the origin of frame A expressed in frame B.

Translation: using homogeneous coordinates, translation can be expressed as a matrix multiplication: ^B P = ^A P + ^B O_A, i.e. (^B P; 1) = [I ^B O_A; 0 1] (^A P; 1). Note: composing two translations is commutative.

Rotation: OP = (i_A j_A k_A)(^A x, ^A y, ^A z)^T = (i_B j_B k_B)(^B x, ^B y, ^B z)^T, so ^B P = ^B_A R ^A P, where ^B_A R describes frame A in the coordinate system of frame B.

Rotation: ^B_A R = [i_A·i_B  j_A·i_B  k_A·i_B; i_A·j_B  j_A·j_B  k_A·j_B; i_A·k_B  j_A·k_B  k_A·k_B]; its columns are the frame-A axes expressed in frame B, and its rows are (^A i_B^T, ^A j_B^T, ^A k_B^T). It is an orthogonal matrix!

Example: rotation about the z axis. What is the rotation matrix?
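For completeness, the standard answer (rotation by an angle θ about the z axis; the sign of the sine terms depends on the active/passive convention used in the lecture):

```latex
R_z(\theta) =
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta  & 0 \\
0          & 0            & 1
\end{bmatrix}
```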

Rotation in Homogeneous Coordinates: using homogeneous coordinates, rotation can be expressed as a matrix multiplication: ^B P = ^B_A R ^A P, i.e. (^B P; 1) = [^B_A R 0; 0 1] (^A P; 1). Note: composing two rotations is not commutative.

Rigid Transformations: ^B P = ^B_A R ^A P + ^B O_A.

Rigid Transformations (cont'd): unified treatment using homogeneous coordinates. (^B P; 1) = [I ^B O_A; 0 1] [^B_A R 0; 0 1] (^A P; 1) = [^B_A R ^B O_A; 0 1] (^A P; 1) = ^B_A T (^A P; 1).
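A short NumPy sketch of this unified 4×4 treatment; the rotation angle and translation are made-up values:

```python
import numpy as np

def rigid_transform(R, t):
    """Build the 4x4 homogeneous matrix [R t; 0 1]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

theta = np.deg2rad(30.0)
R_ab = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                 [np.sin(theta),  np.cos(theta), 0.0],
                 [0.0,            0.0,           1.0]])
t_ab = np.array([1.0, 0.0, 0.5])            # origin of frame A expressed in frame B

T_ab = rigid_transform(R_ab, t_ab)          # maps A-coordinates to B-coordinates
P_a = np.array([0.2, 0.3, 1.0, 1.0])        # homogeneous point in frame A
P_b = T_ab @ P_a                            # rotate then translate, in one multiplication
assert np.allclose(P_b[:3], R_ab @ P_a[:3] + t_ab)
```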

3D-2D Projective mapping Projection Matrix (3x4)

Projective Camera Matrix: Camera = Calibration × Projection × Extrinsics. p = (u, v, w)^T = K [I 0] T P = K [R t] P = M P, where K = [α s u0; 0 β v0; 0 0 1] and T = [R t; 0 1]. 5 + 6 DOF = 11!

Projective Camera Matrix: p = (u, v, w)^T = K [R t] P = M P, with M = [m11 m12 m13 m14; m21 m22 m23 m24; m31 m32 m33 m34]. 5 + 6 DOF = 11!

Columns & Rows of M: write M = [m1 m2 m3 m4] in terms of its columns, or in terms of its rows m^1, m^2, m^3. Then u_i = m^1 P_i / m^3 P_i and v_i = m^2 P_i / m^3 P_i.

Mosaicking Outline: 2D Transformation Hierarchy; RANSAC; Triangulation of 3D Points; Cameras; Triangulation via SVD; Automatic Correspondence; Essential and Fundamental Matrix; A Recipe for Correspondence

Why Consider Multiple Views? Answer: to extract 3D structure via triangulation.

Stereo Rig (top view): matches lie on scanlines, which is convenient when searching for correspondences.

Arbitrary 2-View Triangulation: given p and p', find P.

Linear Triangulation Method: take the projection equations p ∝ M P and p' ∝ M' P and apply the cross-product trick: [p]× M P = 0 and [p']× M' P = 0. Stack the equations and solve for P via SVD, taking the singular vector with the smallest singular value. Generalizes to N views.
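A sketch of this linear (DLT) triangulation in NumPy, assuming homogeneous image points p, p' (3-vectors) and 3×4 camera matrices M, M'; an illustrative implementation of the cross-product trick, not the exact code from the lecture:

```python
import numpy as np

def skew(p):
    """Skew-symmetric matrix [p]_x such that [p]_x q = p x q."""
    return np.array([[0, -p[2], p[1]],
                     [p[2], 0, -p[0]],
                     [-p[1], p[0], 0]])

def triangulate(p, M, p_prime, M_prime):
    """Linear triangulation: stack [p]_x M P = 0 and [p']_x M' P = 0, solve by SVD."""
    A = np.vstack([skew(p) @ M, skew(p_prime) @ M_prime])   # 6 x 4 system
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1]                       # right singular vector with smallest singular value
    return P / P[3]                  # de-homogenize
```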

Resectioning = finding a camera given known 3D points. Apply the cross-product trick and solve via SVD: the 6-point algorithm. (Take notes.)
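Resectioning admits the same treatment: each known 3D/2D correspondence contributes two rows of a linear system in the 12 entries of M, solved by SVD. A hedged sketch (assumes homogeneous 3D points as 4-vectors, image points as 3-vectors, and n ≥ 6 correspondences):

```python
import numpy as np

def resection(points_3d, points_2d):
    """DLT resectioning: solve [p_i]_x (M P_i) = 0 for the 12 entries of M."""
    rows = []
    for P, p in zip(points_3d, points_2d):          # P: 4-vector, p: 3-vector
        # two independent rows of the cross-product constraint, in terms of M's rows
        rows.append(np.hstack([np.zeros(4), -p[2] * P,  p[1] * P]))
        rows.append(np.hstack([ p[2] * P,  np.zeros(4), -p[0] * P]))
    A = np.array(rows)                               # 2n x 12
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)                      # camera matrix M, up to scale
```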

Mosaicking Outline: 2D Transformation Hierarchy; RANSAC; Triangulation of 3D Points; Cameras; Triangulation via SVD; Automatic Correspondence; Essential and Fundamental Matrix; A Recipe for Correspondence

Feature Matching!

Real-World Challenges. Bad news: good correspondences are hard to find. Good news: geometry constrains the possible correspondences. There are 4 DOF between x and x' but only 3 DOF in X; the constraint is manifest in the fundamental matrix F, which can be calculated either from the camera matrices or from a set of good correspondences.

Geometry of Two Views: what if we do not know R and t? Caveat: my exposition follows the Forsyth & Ponce book conventions, which are (IMHO) more intuitive but different from Hartley & Zisserman: F&P use [R^T  −R^T t] camera matrices, H&Z use [R t].

Epipolar Geometry: where can the corresponding point appear? Cameras M = [I 0] and M' = [R^T  −R^T t].

Image of the Camera Center: the epipole. (Cameras M = [I 0], M' = [R^T  −R^T t].)

Example: cameras pointing at each other (top view; the epipolar lines).

Epipoles: the second camera center, at t, appears in the first view as e = [I 0] (t; 1) = t; the origin C appears in the second view as e' = [R^T  −R^T t] (0; 1) = −R^T t.

Image of a Camera Ray? It projects to a line through the epipole in the other view. (Cameras M = [I 0], M' = [R^T  −R^T t].)

Point at Infinity: given p', what is the corresponding point at infinity (x; 0)? For any camera M' = [A a]: p' = [A a](x; 0) = A x, so x = A^{-1} p'. A^{-1} is the infinite homography. In our case M' = [R^T  −R^T t], so x = R p'.

Sidebar: Infinite Homographies. A homography between the image plane and the plane at infinity. Navigation by the stars: the image of the stars is a function of the rotation R only! Traveling on a sphere only rotates the viewer.

Essential Matrix

Epipolar Line Calculation: 1) point 1 = the epipole, e = t; 2) point 2 = the image of the point at infinity, p = [I 0](Rp'; 0) = Rp'; 3) the epipolar line is the join of points 1 and 2: l = t × Rp'.

Epipolar Lines (figure): e = t, the point at infinity maps to Rp', and l = t × Rp'. Cameras M = [I 0], M' = [R^T  −R^T t].

Epipolar Lines (figure, detail): e = t, l = t × Rp'.

Epipolar Plane (figure): the plane through P and both camera centers; it meets the two images in the epipolar lines l and l', which pass through the epipoles.

Essential Matrix: the mapping from p' to the epipolar line l. l = t × Rp' = [t]× R p' = E p', where E is a 3×3 matrix. Because p lies on l, we have p^T E p' = 0.
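A small numeric check of this relation under the slide's conventions (M = [I 0], M' = [R^T  −R^T t]); the pose and the 3D point are made-up values, and the printed value should be zero up to round-off:

```python
import numpy as np

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

# Illustrative relative pose: second camera center at t, rotation R
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0, np.sin(theta)],
              [0, 1, 0],
              [-np.sin(theta), 0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.0])

E = skew(t) @ R                                  # essential matrix E = [t]_x R

P = np.array([0.3, -0.2, 4.0])                   # 3D point in the first camera frame
p = P / P[2]                                     # image in M = [I 0] (normalized coords)
p_prime = R.T @ (P - t)                          # image in M' = [R^T  -R^T t]
p_prime = p_prime / p_prime[2]

print(p @ E @ p_prime)                           # ~0: epipolar constraint p^T E p' = 0
```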

E's Degrees of Freedom: R and t give 6 DOF; however, because of the scale ambiguity, E has only 5 DOF.

Fundamental Matrix

Uncalibrated Case (figure): write the second camera as M' = [A a]; the point at infinity corresponding to p' maps to A^{-1} p', and the epipolar line is l = e × A^{-1} p'.

Uncalibrated Case, Forsyth & Ponce version: the Fundamental Matrix (Faugeras and Luong, 1992).

Fundamental Matrix: the mapping from p' to the epipolar line l. l = e × A^{-1} p' = [e]× A^{-1} p' = F p', where F is a 3×3 matrix. Because p lies on l, we have p^T F p' = 0.

Properties of the Fundamental Matrix: Fp' is the epipolar line associated with p'; F^T p is the epipolar line associated with p; F e' = 0 and F^T e = 0; F is singular.

The Eight-Point Algorithm (Longuet-Higgins, 1981): minimize Σ_i (p_i^T F p_i')² under the constraint ||F||² = 1.

Non-Linear Least-Squares Approach (Luong et al., 1993): minimize the sum of squared distances of the points to their epipolar lines, with respect to the coefficients of F, using an appropriate rank-2 parameterization.

The Normalized Eight-Point Algorithm (Hartley, 1995): center the image data at the origin, and scale it so the mean squared distance between the origin and the data points is 2 pixels: q_i = T p_i, q_i' = T' p_i'. Use the eight-point algorithm to compute F from the points q_i and q_i'. Enforce the rank-2 constraint. Output T^T F T'.
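A compact sketch of this whole pipeline in NumPy, using the p^T F p' = 0 convention of these slides; input matches are n×2 pixel-coordinate arrays, and the normalization, SVD solve, rank-2 enforcement, and denormalization steps mirror the recipe above (an illustrative implementation, not the lecture's code):

```python
import numpy as np

def normalize(pts):
    """Similarity T that centers pts and scales the mean squared distance to 2."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2.0 / np.mean(np.sum((pts - mean) ** 2, axis=1)))
    return np.array([[scale, 0, -scale * mean[0]],
                     [0, scale, -scale * mean[1]],
                     [0, 0, 1]])

def eight_point(p, p_prime):
    """Normalized eight-point algorithm for F with p^T F p' = 0 (n >= 8 matches)."""
    T, Tp = normalize(p), normalize(p_prime)
    q = (T @ np.column_stack([p, np.ones(len(p))]).T).T
    qp = (Tp @ np.column_stack([p_prime, np.ones(len(p_prime))]).T).T
    # Each match gives one linear equation q_i^T F q'_i = 0 in the 9 entries of F
    A = np.stack([np.outer(qi, qpi).ravel() for qi, qpi in zip(q, qp)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)          # enforce the rank-2 constraint
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    return T.T @ F @ Tp                  # back to original units: T^T F T'
```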