Lecture 6 Stereo Systems Multi- view geometry Professor Silvio Savarese Computational Vision and Geometry Lab Silvio Savarese Lecture 6-24-Jan-15
Lecture 6 Stereo Systems Multi- view geometry Stereo systems Rectification Correspondence problem Multi- view geometry The SFM problem Affine SFM Reading: [HZ] Chapter: 9 Epip. Geom. and the Fundam. Matrix Transf. [HZ] Chapter: 18 N view computational methods [FP] Chapters: 7 Stereopsis [FP] Chapters: 8 Structure from Motion Silvio Savarese Lecture 5-26-Jan-15
Epipolar geometry P p p Epipolar Plane Epipoles e 1, e 2 Baseline Epipolar Lines e e O O = intersections of baseline with image planes = projections of the other camera center
Epipolar Constraint P p p O p T E p = 0 E = Essential Matrix (Longuet-Higgins, 1981) E O [ T ] R =
Epipolar Constraint P p p O O p T F p = 0 F [ ] 1 = K T T R K F = Fundamental Matrix (Faugeras and Luong, 1992)
Example: Parallel image planes P e p p e y z x O O Epipolar lines are horizontal Epipoles go to infinity y-coordinates are equal p = " u v 1 % p' = " u' v 1 %
Rectification: making two images parallel H Courtesy figure S. Lazebnik
Why are parallel images useful? Makes the correspondence problem easier Makes triangulation easy Enables schemes for image interpolation
Application: view morphing S. M. Seitz and C. R. Dyer, Proc. SIGGRAPH 96, 1996, 21-30
Why are parallel images useful? Makes the correspondence problem easier Makes triangulation easy Enables schemes for image interpolation
Point triangulation P v v e 2 p p e 2 y u u z O x O
Computing depth P z f u u O B O u u' = B z f = disparity [Eq. 1] Note: Disparity is inversely proportional to depth
Disparity maps B f u u' = z u u Stereo pair Disparity map / depth map Disparity map with occlusions http://vision.middlebury.edu/stereo/
Why are parallel images useful? Makes the correspondence problem easier Makes triangulation easy Enables schemes for image interpolation
Correspondence problem P p p O 1 O 2 Given a point in 3D, discover corresponding observations in left and right images [also called binocular fusion problem]
Correspondence problem P p p O 1 O 2 When images are rectified, this problem is much easier
Correspondence problem A Cooperative Model (Marr and Poggio, 1976) Correlation Methods (1970--) Multi-Scale Edge Matching (Marr, Poggio and Grimson, 1979-81) [FP] Chapters: 7
Correlation Methods (1970--) p p' image 1 Image 2 148 210 195 150 209 What s the problem with this?
Correlation Methods (1970--) p image 1 Pick up a window W around p = (u, v)
Correlation Methods (1970--) W v v image 1 u u + d p = (u, v) Pick up a window W around Build vector w Slide the window W along v = v in image 2 and compute w Compute the dot product w w for each u and retain the maximum value
Correlation Methods (1970--) W W v v image 1 u u + d Image 2 p = (u, v) Pick up a window W around Build vector w Slide the window W along v = v in image 2 and compute w Compute the dot product w w for each u and retain the maximum value w
Correlation Methods (1970--) W W v v image 1 u u + d Image 2 w What s the problem with this?
Changes of brightness/exposure Changes in the mean and the variance of intensity values in corresponding windows
Correlation Methods (1970--) v v image 1 u u + d Image 2 w Normalized crosscorrelation: Maximize: (w w)( w" w ") (w w) ( w" w ") [Eq. 2]
Example Image 1 Image 2 v scanline NCC u Credit slide S. Lazebnik
Effect of the window s size Smaller window - More detail - More noise Window size = 3 Window size = 20 Larger window - Smoother disparity maps - Less prone to noise Credit slide S. Lazebnik
Fore shortening effect Issues O O Occlusions O O
Issues It is desirable to have small B / z ratio Small error in measurements implies large error in estimating depth O O O O u u u e u u u e
Issues Homogeneous regions mismatch
Repetitive patterns Issues
Correspondence problem is difficult - Occlusions - Fore shortening - Baseline trade-off - Homogeneous regions - Repetitive patterns Apply non-local constraints to help enforce the correspondences
Non-local constraints Uniqueness For any point in one image, there should be at most one matching point in the other image Ordering Corresponding points should be in the same order in both views Smoothness Disparity is typically a smooth function of x (except in occluding boundaries)
Lecture 6 Stereo Systems Multi- view geometry Stereo systems Rectification Correspondence problem Multi- view geometry The SFM problem Affine SFM Silvio Savarese Lecture 5-24-Jan-15
Structure from motion problem Courtesy of Oxford Visual Geometry Group
Structure from motion problem X j x 1j M m M 1 x mj x 2j M 2 Given m images of n fixed 3D points x ij = M i X j, i = 1,, m, j = 1,, n
Structure from motion problem X j x 1j M m M 1 x mj x 2j M 2 From the mxn correspondences x ij, estimate: m projection matrices M i n 3D points X j motion structure
Affine structure from motion (simpler problem) Image World Image From the mxn correspondences x ij, estimate: m projection matrices M i (affine cameras) n 3D points X j
M = m 1 m 2 m 3 " % = m 1 m 2 0 0 0 1 " % = m 1 m 2 m 3 " % X = m 1 X m 2 X 1 " % x E = (m 1 X, m 2 X) x = M X = A 2 x3 b 2 x1 0 1x3 1 " % magnification " % = 3 2 1 m m m x = M X " % = 1 v b A M = m 1 m 2 m 3 " % X = m 1 X m 2 X m 3 X " % x E = ( m 1 X m 3 X, m 2 X m 3 X ) Perspective Affine = A b " X = A b " x y z 1 " % % % % = AX E + b [Eq. 3]
Affine cameras Image 1 x 1j X j. x i j Image i Camera matrix M for the affine case x ij = A i X j + b i = M i [Eq. 4] " X j 1 ; M = A b " %
The Affine Structure-from-Motion Problem Given m images of n fixed points X j we can write x ij = A i X j + b i = M i X j Problem: estimate the m 2 4 matrices M i and the n positions X j from the m n correspondences x ij. How many equations and how many unknown? 2m n equations in 8m+3n unknowns " 1 % ; N. of cameras N. of points Two approaches: - Algebraic approach (affine epipolar geometry; estimate F; cameras; points) - Factorization method
A factorization method Tomasi Kanade algorithm C. Tomasi and T. KanadeShape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992. Data centering Factorization
A factorization method - Centering the data Centering: subtract the centroid of the image points [Eq. 5] xˆ ij = x ij 1 n n k = 1 x ik - x i X k - x i x ik Image i
A factorization method - Centering the data ik Centering: subtract the centroid of the image points n n 1 1 xˆ ij = xij xik = AiX j + bi n k = n 1 k = 1 x = A X + b [Eq. 4] i k i X k ( A + ) ixk bi [Eq. 6] - x i x ik
A factorization method - Centering the data Centering: subtract the centroid of the image points n n 1 1 xˆ ij = xij xik = AiX j + bi n k = n 1 k = 1 n 1 x = A X + b ik [Eq. 4] i k i = A i X j X % n k = 1 X X k ( A + ) ixk bi k " = A Xˆ [Eq. 7] i j - x i x ik X = 1 n n X k k=1 Centroid of 3D points
A factorization method - Centering the data Centering: subtract the centroid of the image points n n 1 1 xˆ ij = xij xik = AiX j + bi n k = n 1 k = 1 n 1 = A i X j X % n k = 1 ( A + ) ixk bi k " = [Eq. 7] j A Xˆ i After centering, each normalized observed point is related to the 3D point by xˆ = ij A i Xˆ j [Eq. 8]
A factorization method - Centering the data X X j - x i x ij If the centroid of points in 3D = center of the world reference system x ˆ = A Xˆ = ij i j A i X j [Eq. 9]
" % = mn m m n n x x x x x x x x x D ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ 2 1 2 22 21 1 12 11 " cameras (2 m ) points (n ) A factorization method - factorization Let s create a 2m n data (measurement) matrix:
Let s create a 2m n data (measurement) matrix: [ ] n m mn m m n n X X X A A A x x x x x x x x x D " 2 1 2 1 2 1 2 22 21 1 12 11 ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ " % = " % = cameras (2 m 3) points (3 n ) The measurement matrix D = M S has rank 3 (it s a product of a 2mx3 matrix and 3xn matrix) A factorization method - factorization (2 m n) M S
Factorizing the Measurement Matrix 2m Measurements Motion Structure = 3 n 3 n D = MS
Factorizing the Measurement Matrix Singular value decomposition of D: 2m = D U W V T n n n n n
Factorizing the Measurement Matrix Since rank (D)=3, there are only 3 non-zero singular values
Factorizing the Measurement Matrix S = structure M = Motion (cameras)
Factorizing the Measurement Matrix What is the issue here? D has rank>3 because of: measurement noise affine approximation D 3, U 3 W 3 V 3 T Theorem: When has a rank greater than is the best possible rank- approximation of D in the sense of the Frobenius norm. 3 D = U W V T 3 3 3 " M U 3 T % S W 3 V 3 S M
Affine Ambiguity = D M S The decomposition is not unique. We get the same D by using any 3 3 matrix C and applying the transformations M MC, S C - 1 S We can enforce some Euclidean constraints to resolve this ambiguity (more on next lecture)
Reconstruction results C. Tomasi and T. Kanade. Shape and motion from image streams under orthography: A factorization method. IJCV, 9(2):137-154, November 1992.
Next lecture Multiple view geometry Perspective structure from Motion