Lecture 10: Multi-view geometry

Lecture 10: Multi-view geometry Professor Stanford Vision Lab 1

What we will learn today?
- Review for stereo vision
- Correspondence problem (Problem Set 2 (Q3))
- Active stereo vision systems
- Structure from motion
Reading: [HZ] Chapters: 4, 9, 11; [FP] Chapters: 10

Recovering structure from a single view
Pinhole perspective projection
[Figure: pinhole camera (labels C, K, center $O_w$) viewing a point P on a calibration rig or scene, with image point p.]
Why is it so difficult? Intrinsic ambiguity of the mapping from 3D to image (2D).

Two eyes help!
[Figure: two cameras with centers $O_1$, $O_2$ and known intrinsics $K_1$, $K_2$, related by $R, T$, observe image points $p_1$, $p_2$.]
This is called triangulation.
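A minimal sketch of triangulation, assuming both 3×4 projection matrices are known. It uses the linear (DLT) formulation, which is one common choice and not necessarily the one used in the lecture. The function name `triangulate`, the intrinsics `K`, and the pose `R`, `T` are illustrative values, not taken from the slides.

```python
import numpy as np

def triangulate(M1, M2, p1, p2):
    """Linear (DLT) triangulation of one 3D point from two pixel observations."""
    # Each observation (u, v) gives two linear equations in the homogeneous point X:
    #   u * M[2] - M[0] = 0   and   v * M[2] - M[1] = 0   (rows of M applied to X).
    A = np.stack([
        p1[0] * M1[2] - M1[0],
        p1[1] * M1[2] - M1[1],
        p2[0] * M2[2] - M2[0],
        p2[1] * M2[2] - M2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                      # null vector of A (smallest singular value)
    return X[:3] / X[3]             # back to inhomogeneous coordinates

def project(M, P):
    """Project a 3D point with a 3x4 camera matrix and return pixel coordinates."""
    x = M @ np.append(P, 1.0)
    return x[:2] / x[2]

# Hypothetical calibrated stereo pair (values chosen only for illustration).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([-1.0, 0.0, 0.1])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = K @ np.hstack([R, T[:, None]])

P_true = np.array([0.3, -0.2, 5.0])
p1, p2 = project(M1, P_true), project(M2, P_true)
P_est = triangulate(M1, M2, p1, p2)
print(P_est)
```

With noisy observations the two rays no longer intersect, and the SVD solution becomes a least-squares estimate rather than an exact one.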

Epipolar geometry
[Figure: point P observed as $p_1$ and $p_2$ from camera centers $O_1$ and $O_2$.]
- Epipolar plane
- Baseline
- Epipolar lines
- Epipoles $e_1$, $e_2$
  = intersections of baseline with image planes
  = projections of the other camera center
  = vanishing points of camera motion direction

Triangulation
[Figure: point P observed as p and p' from camera centers O and O'.]
$p^T F p' = 0$
F = Fundamental Matrix (Faugeras and Luong, 1992)

Why is F useful?
[Figure: point p in the left image and its epipolar line $l' = F^T p$ in the right image.]
- Suppose F is known
- No additional information about the scene and camera is given
- Given a point on left image, how can I find the corresponding point on right image?
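A small numerical sketch of this use of F, taking the constraint as $p^T F p' = 0$ with p in the left image and p' in the right image, so the epipolar line in the right image is $l' = F^T p$. The camera values and the helper `skew` are assumptions made only for this illustration; F is built from a known synthetic pose so that the result can be checked.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ a == np.cross(t, a)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Hypothetical setup: camera 1 at the origin, camera 2 with X2 = R @ X1 + T.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([-1.0, 0.0, 0.1])

# With this pose, p'^T (K^-T [T]_x R K^-1) p = 0; transposing gives p^T F p' = 0.
K_inv = np.linalg.inv(K)
F = (K_inv.T @ skew(T) @ R @ K_inv).T

# One 3D point and its two homogeneous image points.
P = np.array([0.3, -0.2, 5.0])
p = K @ P
p = p / p[2]
p_prime = K @ (R @ P + T)
p_prime = p_prime / p_prime[2]

# Epipolar line in the right image for the left point p: l' = F^T p.
l_prime = F.T @ p
residual = p_prime @ l_prime        # zero when p' lies on the line
print(l_prime, residual)
```

The search for the match of p can then be restricted to pixels (u, v) satisfying `l_prime @ [u, v, 1] == 0`, which is what makes F useful when nothing else about the scene is known.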

The Eight-Point Algorithm (Longuet-Higgins, 1981) (Hartley, 1995)
Estimating F
[Figure: point P observed as p and p' from camera centers O and O'.]
$W f = 0$: homogeneous system
- Rank 8: a non-zero solution exists (unique)
- If N > 8: least-squares solution by SVD with $\|f\| = 1$, giving $\hat{F}$
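The estimation step can be sketched as follows, assuming the convention $p^T F p' = 0$ and noise-free synthetic correspondences. The function names, the camera values, and the use of Hartley-style normalization before the SVD are choices made for this illustration rather than details stated on the slide.

```python
import numpy as np

def normalize(pts):
    """Shift points to zero mean and scale to mean distance sqrt(2). pts is (N, 2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - mean, axis=1).mean()
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return homog @ T.T, T

def eight_point(p, p_prime):
    """Estimate F with p^T F p' = 0 from N >= 8 correspondences of shape (N, 2)."""
    x, T = normalize(p)
    xp, Tp = normalize(p_prime)
    # Row n of W is the flattened outer product x_n x'_n^T, so W @ F.ravel() = 0.
    W = np.einsum('ni,nj->nij', x, xp).reshape(len(x), 9)
    _, _, Vt = np.linalg.svd(W)
    F = Vt[-1].reshape(3, 3)        # unit-norm least-squares solution
    U, s, Vt = np.linalg.svd(F)     # enforce rank 2
    s[2] = 0.0
    F = U @ np.diag(s) @ Vt
    F = T.T @ F @ Tp                # undo the normalization
    return F / np.linalg.norm(F)

def project(K, R, T, P):
    """Project (N, 3) points with camera K [R | T]; returns (N, 2) pixels."""
    x = (K @ (R @ P.T + T[:, None])).T
    return x[:, :2] / x[:, 2:]

# Hypothetical synthetic scene: 20 points in front of two cameras.
rng = np.random.default_rng(0)
P = np.column_stack([rng.uniform(-1, 1, 20), rng.uniform(-1, 1, 20),
                     rng.uniform(4, 8, 20)])
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([-1.0, 0.0, 0.1])

p = project(K, np.eye(3), np.zeros(3), P)
p_prime = project(K, R, T, P)
F = eight_point(p, p_prime)

ones = np.ones((len(p), 1))
residuals = np.einsum('ni,ij,nj->n', np.hstack([p, ones]), F,
                      np.hstack([p_prime, ones]))
print(np.abs(residuals).max())
```

Normalizing the points first keeps the entries of W at comparable magnitudes, which matters once the correspondences are noisy.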

Rectification
- Make two camera images parallel
- Correspondence problem becomes easier

Application: view morphing S. M. Seitz and C. R. Dyer, Proc. SIGGRAPH 96, 1996, 21-30 10

Stereo-view geometry
Previous lecture (#9)
- Correspondence: Given a point in one image, how can I find the corresponding point p' in another one?
- Camera geometry: Given corresponding points in two images, find camera matrices, position and pose.
- Scene geometry: Find coordinates of 3D point from its projection into 2 or multiple images.

Stereo-view geometry
This lecture (#10)
- Correspondence: Given a point in one image, how can I find the corresponding point p' in another one?
- Camera geometry: Given corresponding points in two images, find camera matrices, position and pose.
- Scene geometry: Find coordinates of 3D point from its projection into 2 or multiple images.

Depth estimation
[Figure: point P observed as p and p' from camera centers $O_1$ and $O_2$.]
Subgoals:
- Solve the correspondence problem
- Use corresponding observations to triangulate

Correspondence problem
[Figure: point P observed as p and p' from camera centers $O_1$ and $O_2$.]
Given a point in 3D, discover corresponding observations in left and right images [also called binocular fusion problem].

Correspondence problem
- A Cooperative Model (Marr and Poggio, 1976)
- Correlation Methods (1970--)
- Multi-Scale Edge Matching (Marr, Poggio and Grimson, 1979-81)
[FP] Chapters: 11

Correlation Methods
- Pick up a window around p(u,v)
- Build vector w
- Slide the window along the v line in image 2 and compute w'
- Keep sliding until $w \cdot w'$ is maximized.

Correlation Methods
Normalized correlation; maximize:
$\frac{(w - \bar{w}) \cdot (w' - \bar{w}')}{\|w - \bar{w}\| \, \|w' - \bar{w}'\|}$
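A minimal sketch of window matching with normalized correlation, assuming rectified images so that the match of a left pixel lies on the same row of the right image. The function names, the window half-size, and the random test image are assumptions made for this example.

```python
import numpy as np

def ncc(w, w2):
    """Normalized correlation of two equally sized windows (mean-subtracted)."""
    a = (w - w.mean()).ravel()
    b = (w2 - w2.mean()).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(a @ b / denom)

def match_along_scanline(left, right, u, v, half=2):
    """Find the column in `right` whose window best matches the window at (u, v) in `left`."""
    w = left[v - half:v + half + 1, u - half:u + half + 1]
    best_u, best_score = None, -np.inf
    for u2 in range(half, right.shape[1] - half):
        w2 = right[v - half:v + half + 1, u2 - half:u2 + half + 1]
        score = ncc(w, w2)
        if score > best_score:
            best_u, best_score = u2, score
    return best_u, best_score

# Hypothetical test pair: the left image is the right image shifted by 3 columns.
rng = np.random.default_rng(0)
right = rng.random((20, 40))
left = np.roll(right, 3, axis=1)

u, v = 20, 10
best_u, best_score = match_along_scanline(left, right, u, v)
disparity = u - best_u
print(best_u, disparity, best_score)
```

Subtracting the window means and dividing by the norms makes the score insensitive to a gain or offset change in brightness between the two images.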

Correlation Methods
[Figure: left and right images with a scanline, and the normalized correlation score along it.]
Credit slide S. Lazebnik

Correlation Methods
Smaller window
- More detail
- More noise
Larger window
- Smoother disparity maps
- Less prone to noise
[Figure: disparity maps for window size = 3 and window size = 20.]
Credit slide S. Lazebnik

Issues with Correlation Methods
- Foreshortening effect
- Occlusions
It is desirable to have a small B/z ratio (baseline B over depth z)!

Issues with Correlation Methods
Homogeneous regions: hard to match pixels in these regions

Issues with Correlation Methods
Repetitive patterns

Results with window search
[Figure: data, ground truth, and window-based matching result.]
Credit slide S. Lazebnik

Improving correspondence: Non-local constraints
- Uniqueness: For any point in one image, there should be at most one matching point in the other image
- Ordering: Corresponding points should be in the same order in both views
Courtesy J. Ponce

Dynamic Programming [Uses ordering constraint] (Baker and Binford, 1981)
- Nodes = matched feature points (e.g., edge points).
- Arcs = matched intervals along the epipolar lines.
- Arc cost = discrepancy between intervals.
Find the minimum-cost path going monotonically down and right from the top-left corner of the graph to its bottom-right corner.
Courtesy J. Ponce
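The path search can be illustrated with a simplified per-pixel variant, which is not the edge-based formulation of Baker and Binford: each pixel on a scanline is either matched or skipped at a fixed occlusion cost, and the monotonic path enforces the ordering constraint. The function name, the squared-difference cost, and the occlusion penalty `occ` are assumptions made for this sketch.

```python
import numpy as np

def dp_scanline(left, right, occ=1.0):
    """Match two 1D scanlines by dynamic programming; returns disparities and total cost."""
    n, m = len(left), len(right)
    C = np.zeros((n + 1, m + 1))
    C[:, 0] = occ * np.arange(n + 1)      # skipping only left pixels
    C[0, :] = occ * np.arange(m + 1)      # skipping only right pixels
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = C[i - 1, j - 1] + (left[i - 1] - right[j - 1]) ** 2
            C[i, j] = min(match, C[i - 1, j] + occ, C[i, j - 1] + occ)

    # Backtrack from the bottom-right corner to recover the matches.
    disparity = np.full(n, -1)            # -1 marks left pixels with no match
    i, j = n, m
    while i > 0 and j > 0:
        match = C[i - 1, j - 1] + (left[i - 1] - right[j - 1]) ** 2
        if C[i, j] == match:
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif C[i, j] == C[i - 1, j] + occ:
            i -= 1                        # left pixel skipped
        else:
            j -= 1                        # right pixel skipped
    return disparity, C[n, m]

# Hypothetical scanlines: the left row is the right row shifted by 2 pixels.
right = np.array([1.0, 5.0, 9.0, 3.0, 7.0, 2.0, 30.0, 40.0])
left = np.array([10.0, 20.0, 1.0, 5.0, 9.0, 3.0, 7.0, 2.0])
disparity, cost = dp_scanline(left, right)
print(disparity, cost)
```

Here the first two left pixels and the last two right pixels have no counterpart, so the cheapest path skips them and matches the remaining six pixels at disparity 2.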

Dynamic Programming (Baker and Binford, 1981) (Ohta and Kanade, 1985) 30

Improving correspondence: Non-local constraints
- Uniqueness: For any point in one image, there should be at most one matching point in the other image
- Ordering: Corresponding points should be in the same order in both views
- Smoothness: Disparity is typically a smooth function of x (except in occluding boundaries)

Stereo matching as energy minimization
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 01
[Figure: images $I_1$, $I_2$ and disparity map $D$; window $W_1(i)$ in $I_1$ is compared with window $W_2(i + D(i))$ in $I_2$.]
$E = \alpha E_{data}(I_1, I_2, D) + \beta E_{smooth}(D)$
$E_{data} = \sum_i \left( W_1(i) - W_2(i + D(i)) \right)^2$
$E_{smooth} = \sum_{\text{neighbors } i,j} \rho\left( D(i) - D(j) \right)$
Energy functions of this form can be minimized using graph cuts.
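To make the two terms concrete, here is a sketch that only evaluates such an energy for a given disparity map on a single scanline; it does not perform the graph-cut minimization. Single pixels stand in for the windows $W_1$, $W_2$, and the truncated absolute difference used for $\rho$, the threshold `tau`, and the function name are assumptions made for this example.

```python
import numpy as np

def stereo_energy(I1, I2, D, alpha=1.0, beta=1.0, tau=2.0):
    """Evaluate E = alpha * E_data + beta * E_smooth for a 1D disparity map D."""
    # Data term: compare pixel i of I1 with pixel i + D(i) of I2 (clipped to the image).
    idx = np.clip(np.arange(len(I1)) + D, 0, len(I2) - 1)
    e_data = np.sum((I1 - I2[idx]) ** 2)
    # Smoothness term: rho is a truncated absolute difference between neighbors.
    e_smooth = np.sum(np.minimum(np.abs(np.diff(D)), tau))
    return alpha * e_data + beta * e_smooth

# Hypothetical scanlines: I1 appears in I2 shifted by one pixel.
I1 = np.array([1.0, 2.0, 3.0, 4.0])
I2 = np.array([9.0, 1.0, 2.0, 3.0, 4.0])

E_good = stereo_energy(I1, I2, np.array([1, 1, 1, 1]))   # correct, constant disparity
E_bad = stereo_energy(I1, I2, np.array([0, 1, 1, 1]))    # one wrong disparity
print(E_good, E_bad)
```

The wrong disparity is penalized twice: once by the data term (the intensities disagree) and once by the smoothness term (it differs from its neighbor).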

Stereo matching as energy minimization
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 01
[Figure: ground truth, window-based matching result, and energy minimization result.]

What we will learn today?
- Review for stereo vision
- Correspondence problem (Problem Set 2 (Q3))
- Active stereo vision systems (see Friday Oct 28 TA Section)
- Structure from motion
Reading: [HZ] Chapters: 4, 9, 11; [FP] Chapters: 10

Active stereo (point)
[Figure: a projector and a camera with center $O_2$; the projected point P is observed as p.]
Replace one of the two cameras by a projector
- Single camera
- Projector geometry calibrated
- What's the advantage of having the projector? Correspondence problem solved!

Active stereo (stripe)
- Projector and camera are parallel
- Correspondence problem solved!

Active stereo (stripe)
Digital Michelangelo Project http://graphics.stanford.edu/projects/mich/
Optical triangulation
- Project a single stripe of laser light
- Scan it across the surface of the object
- This is a very precise version of structured light scanning
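Optical triangulation with a stripe can be sketched as a ray-plane intersection: the laser stripe defines a known plane, and each lit pixel defines a ray from the camera. The example assumes a calibrated camera at the origin and a light plane $n \cdot X + d = 0$ given in the camera frame; the function name and all numeric values are made up for the illustration and do not describe the Digital Michelangelo scanner.

```python
import numpy as np

def intersect_ray_with_plane(K, pixel, n, d):
    """Back-project a pixel through a camera at the origin and intersect the ray
    with the light plane n . X + d = 0 (plane expressed in the camera frame)."""
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    t = -d / (n @ ray)              # ray parameter where the plane is hit
    return t * ray

# Hypothetical camera intrinsics and a surface point lit by the stripe.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
X_true = np.array([0.2, -0.1, 2.0])

# Hypothetical calibrated light plane chosen to pass through X_true.
n = np.array([1.0, 0.0, 0.5])
n = n / np.linalg.norm(n)
d = -n @ X_true

# Pixel at which the camera sees the lit point.
x = K @ X_true
pixel = x[:2] / x[2]

X_est = intersect_ray_with_plane(K, pixel, n, d)
print(X_est)
```

No search for a matching pixel is needed, since knowing which plane produced the stripe fixes the depth along each camera ray.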

Active stereo (shadows)
J. Bouguet & P. Perona, 99; S. Savarese, J. Bouguet & Perona, 00
[Figure: a light source and a camera with center O.]
- 1 camera, 1 light source
- very cheap setup
- calibrated light source

Active stereo (color-coded stripes)
L. Zhang, B. Curless, and S. M. Seitz 2002; S. Rusinkiewicz & Levoy 2002
- Dense reconstruction
- Correspondence problem again
- Get around it by using color codes

Active stereo (color-coded stripes)
Rapid shape acquisition: Projector + stereo cameras
L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002

Multiple view geometry - the structure from motion problem
[Figure: 3D point $P_j$ observed as $p_{1j}, p_{2j}, \dots, p_{mj}$ by cameras $M_1, M_2, \dots, M_m$.]
Given m images of n fixed 3D points:
$p_{ij} = M_i P_j, \quad i = 1, \dots, m, \quad j = 1, \dots, n$

Multiple view geometry - the structure from motion problem Courtesy of Oxford Visual Geometry Group 45

Multiple view geometry - the structure from motion problem
From the m×n correspondences $p_{ij}$, estimate:
- m projection matrices $M_i$ (motion)
- n 3D points $P_j$ (structure)
If 2 cameras and 2 frames, then back to the stereopsis case: Essential & Fundamental matrices.
But SfM is usually good for a generic scenario: no camera information, multiple images, etc. E.g. a video recorded from a handheld camcorder.

Structure from motion ambiguity
- SFM can be solved up to an N-degree-of-freedom ambiguity
- In the general case (nothing is known) the ambiguity is expressed by an arbitrary affine or projective transformation H
$p_{ij} = M_i P_j, \quad M_i = K_i [R_i \;\; T_i]$
$P_j \to H P_j, \quad M_i \to M_i H^{-1}$
$p_{ij} = M_i P_j = (M_i H^{-1})(H P_j)$
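The ambiguity can be checked numerically: transforming the point by H and the camera by $H^{-1}$ leaves the image point unchanged, so the images alone cannot distinguish the two reconstructions. The random camera, point, and H below are arbitrary values chosen for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary 3x4 camera and homogeneous 3D point (illustrative values only).
M = rng.normal(size=(3, 4))
P = np.append(rng.normal(size=3), 1.0)

# An arbitrary invertible 4x4 transformation H (a perturbation of the identity).
H = np.eye(4) + 0.3 * rng.normal(size=(4, 4))

p = M @ P                                   # original projection
p_alt = (M @ np.linalg.inv(H)) @ (H @ P)    # transformed camera and point

print(p, p_alt)
```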

Affine ambiguity 49

Projective ambiguity 50

Structure from motion ambiguity
- A scale (similarity) ambiguity exists even for calibrated cameras
- For calibrated cameras, the similarity ambiguity is the only ambiguity [Longuet-Higgins 81]
- A reconstruction up to scale is called metric

Types of ambiguity
- Projective (15 dof): $\begin{bmatrix} A & t \\ v^T & v \end{bmatrix}$. Preserves intersection and tangency.
- Affine (12 dof): $\begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix}$. Preserves parallelism, volume ratios.
- Similarity (7 dof): $\begin{bmatrix} sR & t \\ 0^T & 1 \end{bmatrix}$. Preserves angles, ratios of length.
- Euclidean (6 dof): $\begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix}$. Preserves angles, lengths.
With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction.
Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean.

Multiple view geometry - overview
1. Recover camera and geometry up to ambiguity
   - Use affine approximation if possible (affine SFM)
   - Algebraic methods
   - Factorization methods
2. Metric upgrade to obtain solution up to scale (remove projective or affine ambiguity)
   - Self-calibration
3. Bundle adjustment (optimize solution across all observations)

Affine structure from motion - a simpler problem
[Figure: 3D point P observed as p and p' in two images; M = camera matrix.]
From the m×n correspondences $p_{ij}$, estimate:
- m projection matrices $M_i$ (affine cameras)
- n 3D points $P_j$

Affine structure from motion - a simpler problem
$p = \begin{bmatrix} u \\ v \end{bmatrix} = A P + b = M \begin{bmatrix} P \\ 1 \end{bmatrix}; \quad M = [A \;\; b]$

Affine structure from motion - a simpler problem
Given m images (m cameras) of n fixed points $P_j$ we can write
$p_{ij} = A_i P_j + b_i, \quad i = 1, \dots, m, \quad j = 1, \dots, n$
Problem: estimate the m 2×4 matrices $M_i$ and the n positions $P_j$ from the m×n correspondences $p_{ij}$.
How many equations and how many unknowns? 2m×n equations in 8m+3n unknowns.
Two approaches:
- Algebraic approach (affine epipolar geometry; estimate F; cameras; points)
- Factorization method

Factorization method
C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method, 1992.
Two steps:
- Centering the data
- Factorization
Centering: subtract the centroid of the image points
$\hat{p}_{ij} = p_{ij} - \frac{1}{n}\sum_{k=1}^{n} p_{ik} = A_i P_j + b_i - \frac{1}{n}\sum_{k=1}^{n} (A_i P_k + b_i) = A_i \left( P_j - \frac{1}{n}\sum_{k=1}^{n} P_k \right) = A_i \hat{P}_j$
Assume that the origin of the world coordinate system is at the centroid of the 3D points, so that $\hat{P}_j = P_j$.

Factorization method
After centering, each normalized point $\hat{p}_{ij}$ is related to the 3D point $P_j$ by
$\hat{p}_{ij} = A_i P_j$

Factorization method
Let's create a 2m × n data (measurement) matrix:
$D = \begin{bmatrix} \hat{p}_{11} & \hat{p}_{12} & \cdots & \hat{p}_{1n} \\ \hat{p}_{21} & \hat{p}_{22} & \cdots & \hat{p}_{2n} \\ \vdots & & & \vdots \\ \hat{p}_{m1} & \hat{p}_{m2} & \cdots & \hat{p}_{mn} \end{bmatrix}$
Rows: cameras (2m). Columns: points (n).

Factorization method
$D = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} \begin{bmatrix} P_1 & P_2 & \cdots & P_n \end{bmatrix} = M S$
Here M stacks the cameras (2m × 3) and S holds the points (3 × n), so D is 2m × n.
The measurement matrix D = M S has rank 3 (it's a product of a 2m×3 matrix and a 3×n matrix).

Factorization method
Singular value decomposition of D:
[Figure: SVD of the measurement matrix D.]
Since rank(D) = 3, there are only 3 non-zero singular values.
Source: M. Hebert

Factorization method
Obtaining a factorization from SVD:
[Figure: the SVD of D split into a motion (cameras) factor and a structure factor.]
What is the issue here? D has rank > 3 because of
- measurement noise
- affine approximation
Source: M. Hebert

Factorization method
1. Given: m images and n features $p_{ij}$
2. For each image i, center the feature coordinates
3. Construct a 2m × n measurement matrix D:
   - Column j contains the projection of point j in all views
   - Row i contains one coordinate of the projections of all the n points in image i
4. Factorize D:
   - Compute SVD: $D = U W V^T$
   - Create $U_3$ by taking the first 3 columns of U
   - Create $V_3$ by taking the first 3 columns of V
   - Create $W_3$ by taking the upper left 3 × 3 block of W
5. Create the motion and shape matrices:
   $M = U_3$ and $S = W_3 V_3^T$ (or $M = U_3 W_3^{1/2}$ and $S = W_3^{1/2} V_3^T$)
Source: M. Hebert
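The steps above can be sketched in a few lines of NumPy, assuming noise-free points seen by affine cameras. The function name `factorize`, the array layout, and the random synthetic data are assumptions made for this example; the recovered M and S agree with the true cameras and points only up to an affine ambiguity.

```python
import numpy as np

def factorize(p):
    """Affine factorization. p has shape (m, n, 2): m images, n points, (u, v).
    Returns motion M (2m x 3), structure S (3 x n) and the centered matrix D."""
    m, n, _ = p.shape
    # Step 2: center the feature coordinates in each image.
    centered = p - p.mean(axis=1, keepdims=True)
    # Step 3: rows 2i and 2i+1 hold the u and v coordinates of all points in image i.
    D = centered.transpose(0, 2, 1).reshape(2 * m, n)
    # Step 4: SVD and truncation to rank 3.
    U, w, Vt = np.linalg.svd(D, full_matrices=False)
    # Step 5: motion and shape matrices (the M = U3, S = W3 V3^T variant).
    M = U[:, :3]
    S = np.diag(w[:3]) @ Vt[:3]
    return M, S, D

# Hypothetical synthetic data: 4 affine cameras observing 10 points.
rng = np.random.default_rng(0)
m, n = 4, 10
P = rng.normal(size=(3, n))            # 3D points
A = rng.normal(size=(m, 2, 3))         # affine camera matrices A_i
b = rng.normal(size=(m, 2, 1))         # affine camera offsets b_i
p = (A @ P + b).transpose(0, 2, 1)     # image points, shape (m, n, 2)

M, S, D = factorize(p)
print(np.abs(M @ S - D).max())
```

With noisy measurements D has rank greater than 3, and keeping only the three largest singular values gives the closest rank-3 matrix in the least-squares sense.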

Self-calibration [HZ] Chapters: 18
- Prior knowledge on cameras or scene can be used to add constraints and remove ambiguities
- Obtain metric reconstruction (up to scale)

Condition                                                      N. views
Constant internal parameters                                   3
Aspect ratio and skew known; focal length and offset vary      4*
Aspect ratio and skew constant; focal length and offset vary   5*
Skew = 0, all other parameters vary                            8*

Bundle adjustment [HZ] Chapters: 18
- Non-linear method for refining structure and motion
- Minimizing re-projection error
- It can be used before or after metric upgrade
$E(M, P) = \sum_{i=1}^{m} \sum_{j=1}^{n} D(p_{ij}, M_i P_j)^2$
where $D(\cdot, \cdot)$ is the image distance between an observed point and a reprojected point.
[Figure: cameras $M_1$, $M_2$, $M_3$ reproject point $P_j$ as $M_1 P_j$, $M_2 P_j$, $M_3 P_j$, compared with the observations $p_{1j}$, $p_{2j}$, $p_{3j}$.]
Advantages
- Handle large number of views
- Handle missing data
Limitations
- Large minimization problem (parameters grow with number of views)
- Requires good initial condition
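A sketch of the quantity being minimized, assuming projective 3×4 cameras and Euclidean pixel distance for D. It only evaluates the re-projection error; a full bundle adjustment would pass such a cost to a non-linear least-squares solver. The function name, the array layout, and the camera values are assumptions made for this example.

```python
import numpy as np

def reprojection_error(Ms, Ps, obs):
    """Sum of squared pixel distances between observations and reprojections.
    Ms: (m, 3, 4) cameras, Ps: (n, 3) points, obs: (m, n, 2) observed pixels."""
    Ph = np.column_stack([Ps, np.ones(len(Ps))])       # homogeneous points (n, 4)
    proj = np.einsum('mij,nj->mni', Ms, Ph)            # M_i P_j for all i, j
    uv = proj[..., :2] / proj[..., 2:3]                # perspective division
    return np.sum((obs - uv) ** 2)

# Hypothetical setup: two cameras and five points in front of them.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
Ms = np.stack([M1, M2])

rng = np.random.default_rng(0)
Ps = np.column_stack([rng.uniform(-1, 1, 5), rng.uniform(-1, 1, 5),
                      rng.uniform(4, 8, 5)])

# Exact observations give zero error.
Ph = np.column_stack([Ps, np.ones(len(Ps))])
proj = np.einsum('mij,nj->mni', Ms, Ph)
obs = proj[..., :2] / proj[..., 2:3]
E_exact = reprojection_error(Ms, Ps, obs)

# Displacing one observation by (3, 4) pixels adds 3^2 + 4^2 = 25 to the error.
obs_noisy = obs.copy()
obs_noisy[0, 0] += np.array([3.0, 4.0])
E_noisy = reprojection_error(Ms, Ps, obs_noisy)
print(E_exact, E_noisy)
```

Missing data can be handled in such a cost by summing only over the (i, j) pairs that were actually observed.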

Stereo SDK vision software development kit A. Criminisi, A. Blake and D. Robertson 72

Foreground/background segmentation using stereo V. Kolmogorov, A. Criminisi, A. Blake, G. Cross and C. Rother. Bi-layer segmentation of binocular stereo video CVPR 2005 http://research.microsoft.com/~antcrim/demos/acriminisi_recognition_cowdemo.wmv 73

3D Urban Scene Modeling using a stereo system 3D Urban Scene Modeling Integrating Recognition and Reconstruction, N. Cornelis, B. Leibe, K. Cornelis, L. Van Gool, IJCV 08. http://www.vision.ee.ethz.ch/showroom/index.en.html# 74

Photo synth Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," ACM Transactions on Graphics (SIGGRAPH Proceedings),2006, http://phototour.cs.washington.edu/bundler/ Bundler is a structure-from-motion (SfM) system for unordered image collections (for instance, images from the Internet) written in C and C++. 75

Additional references D. Nistér, PhD thesis 01 See also 5-point algorithm M. Pollefeys et al 98--- M. Brown and D. G. Lowe, 05 76

Two-frame stereo correspondence algorithms http://vision.middlebury.edu/stereo/ 77

What we have learned today
- Review for stereo vision
- Correspondence problem (Problem Set 2 (Q3))
- Active stereo vision systems (see Friday Oct 28 TA Section)
- Structure from motion
Reading: [HZ] Chapters: 4, 9, 11; [FP] Chapters: 10