Undergrad HTAs / TAs: Help me make the course better! HTA deadline today (sorry!). TA applications open March 5th, deadline March 21st.
Project 2: Well done. Open-ended parts, lots of opportunity for mistakes. Real implementation experience of a tricky vision system.
Episcopal Gaudi: the haunted palace
Harder to mark: Part 2 is somewhat open ended, and many of you came up with different solutions -> we may have a few issues in the marking. Let us know if you think we've made an error.
MATLAB tip - thresholding. No need to iterate:

    img = im2double( imread( 'a.jpg' ) );
    imgt = img .* double(img > 0.5);
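The same loop-free thresholding translated to Python with NumPy for comparison; the small array below is an illustrative stand-in for an image already converted to doubles (this sketch is not part of the original slide):

```python
import numpy as np

# Stand-in for a float image in [0, 1] (what im2double would return).
img = np.array([[0.2, 0.6],
                [0.7, 0.4]])

# Elementwise mask-and-multiply: keep values above 0.5, zero out the rest.
imgt = img * (img > 0.5)
```

As in the MATLAB one-liner, the comparison produces a boolean mask that multiplication treats as 0/1, so no explicit loop is needed.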
Average Accuracy (across Notre Dame, Mt. Rushmore, and Gaudi's Episcopal Palace):
1. 76% - Katya Schwiegershausen
2. 72% - Prasetya Utama
3. 71.6% - Jessica Fu
4. 68.67% - Tiffany Chen
5. Gaudi's choice award: 34% - Spencer Boyum (1st in Episcopal Palace)
Outline Recap camera calibration Epipolar Geometry
Oriented and Translated Camera [figure: world coordinate frame (i_w, j_w, k_w) with origin O_w, related to the camera frame by rotation R and translation t]
Degrees of freedom

    [su]       [r11 r12 r13 tx] [x]
    [sv] = K * [r21 r22 r23 ty] [y]
    [s ]       [r31 r32 r33 tz] [z]
                                [1]

K (intrinsics): 5 degrees of freedom. [R | t] (extrinsics): 6 degrees of freedom.
How to calibrate the camera?

    [su]             [X]
    [sv] = K [R | t] [Y]
    [s ]             [Z]
                     [1]

Given known 3D points (X, Y, Z) and their measured image projections (su, sv, s), solve for the unknown K, R, t.
How do we calibrate a camera? [table: measured 2D image projections (su, sv, s) paired with the known 3D coordinates (X, Y, Z) of many calibration points]
Method: homogeneous linear system. Each 3D-2D correspondence (Xi, Yi, Zi) <-> (ui, vi) contributes two equations in the entries of M:

    [Xi Yi Zi 1  0  0  0  0  -ui*Xi -ui*Yi -ui*Zi -ui] [m11 m12 m13 m14 m21 ... m34]' = 0
    [ 0  0  0  0 Xi Yi Zi 1  -vi*Xi -vi*Yi -vi*Zi -vi]

Solve for m's entries using linear least squares (Am = 0 form):

    [U, S, V] = svd(A);
    M = V(:, end);
    M = reshape(M, [], 3)';
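The same least-squares step can be sketched in Python/NumPy (the slides use MATLAB; the function name `calibrate_dlt` and the data shapes are illustrative, not from the course code):

```python
import numpy as np

def calibrate_dlt(pts3d, pts2d):
    """Estimate the 3x4 projection matrix M from n >= 6 correspondences
    (X, Y, Z) <-> (u, v) by solving the homogeneous system A m = 0."""
    A = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    # The solution is the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    m = Vt[-1]
    # NumPy reshapes row-major, so this matches reshape(M, [], 3)' in MATLAB.
    return m.reshape(3, 4)
```

M is recovered only up to scale, which is why reprojection (dividing by the third homogeneous coordinate) is the right way to check it.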
For project 3, we want the camera center
Estimate of camera center [slide: numeric example data for the camera-center estimate]
Oriented and Translated Camera [figure: world coordinate frame (i_w, j_w, k_w) with origin O_w, related to the camera frame by rotation R and translation t]
Recovering the camera center

    [su]             [x]
    [sv] = K [R | t] [y] ,   M = K [R | t] = [Q | m4],  where Q = KR and m4 = Kt
    [s ]             [z]
                     [1]

t is not the camera center C: it is -RC (because a point will be rotated before t_x, t_y, and t_z are added). So K^-1 m4 = t = -RC, and we need C = -R^-1 K^-1 m4. Since Q = KR, we just need C = -Q^-1 m4.
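A minimal sketch of this recovery in Python/NumPy, assuming a 3x4 projection matrix M as input (the function name is illustrative):

```python
import numpy as np

def camera_center(M):
    """Recover the camera center C from M = K [R | t].
    Q = KR is the left 3x3 block, m4 = Kt the last column; C = -Q^{-1} m4."""
    Q, m4 = M[:, :3], M[:, 3]
    # Solving Q C = -m4 avoids forming the explicit inverse.
    return -np.linalg.solve(Q, m4)
```

Because the slide's derivation uses t = -RC, building M from a known center and then calling `camera_center` should return that center exactly.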
Estimate of camera center [slide: numeric example data for the camera-center estimate]
Epipolar Geometry and Stereo Vision Many slides adapted from James Hays, Derek Hoiem, Lana Lazebnik, Silvio Saverese, Steve Seitz, Hartley & Zisserman
Depth from disparity: image I(x, y), disparity map D(x, y), image I'(x', y'), with (x', y') = (x + D(x, y), y). If we could find the corresponding points in two images, we could estimate relative depth. James Hays
What do we need to know?
1. Calibration for the two cameras, i.e., the camera projection matrices.
2. Correspondence for every pixel. Like project 2, but project 2 is sparse. We need dense correspondence!
2. Correspondence for every pixel. Where do we need to search?
Depth from Stereo. Goal: recover depth z by finding the image coordinate x' that corresponds to x. [figure: scene point X at depth z projects to x and x' in two cameras with centers C and C', focal length f, separated by baseline B]
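For the setup in the figure (rectified cameras, baseline B, focal length f), similar triangles give the standard depth-disparity relation, which the slide assumes but does not write out:

```latex
\text{disparity} \;=\; x - x' \;=\; \frac{B \, f}{z}
\qquad\Longleftrightarrow\qquad
z \;=\; \frac{B \, f}{x - x'}
```

Depth is inversely proportional to disparity: nearby points shift more between the two views.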
Depth from Stereo. Goal: recover depth by finding the image coordinate x' that corresponds to x.
Sub-problems:
1. Calibration: How do we recover the relation of the cameras (if not already known)?
2. Correspondence: How do we search for the matching point x'?
Epipolar geometry Relates cameras from two positions
Wouldn't it be nice to know where matches can live? To constrain our 2D search to 1D?
Key idea: Epipolar constraint. Potential matches for x have to lie on the corresponding line l'. Potential matches for x' have to lie on the corresponding line l.
VLFeat's 800 most confident matches among 10,000+ local features.
Epipolar lines
Keep only the matches that are inliers with respect to the best fundamental matrix.
Epipolar geometry: notation
- Baseline: line connecting the two camera centers
- Epipoles: intersections of the baseline with the image planes = projections of the other camera center
- Epipolar plane: plane containing the baseline (a 1D family)
- Epipolar lines: intersections of an epipolar plane with the image planes (always come in corresponding pairs)
Think-Pair-Share: Where are the epipoles? What do the epipolar lines look like? [figure: four camera configurations a)-d); dot = camera center]
Example: Converging cameras
Example: Motion parallel to image plane
Example: Forward motion. The epipole e has the same coordinates in both images. Points move along lines radiating from e: the focus of expansion.
What is this useful for? Find x': if I know x and have calibrated cameras (known intrinsics K, K' and the extrinsic relationship), I can restrict x' to lie along l'. Discover disparity for stereo.
What is this useful for? Given candidate x <-> x' correspondences, estimate the relative position and orientation between the cameras and the 3D positions of corresponding image points.
What is this useful for? Model fitting: see if candidate x <-> x' correspondences fit estimated projection models of cameras 1 and 2.
VLFeat's 800 most confident matches among 10,000+ local features.
Epipolar lines
Keep only the matches that are inliers with respect to the best fundamental matrix.
Epipolar constraint: Calibrated case

    x̂ = K^-1 x        x̂' = K'^-1 x'

x and x' are 2D pixel coordinates (homogeneous); x̂ and x̂' are normalized homogeneous 2D points, i.e., 3D rays toward the scene point X, with x̂' expressed in the 2nd camera's 3D coordinates. Because x̂', R x̂, and t are co-planar:

    x̂' . [t x (R x̂)] = 0
Essential matrix

    x̂' . [t x (R x̂)] = 0   =>   x̂'^T E x̂ = 0,  with E = [t]x R

E is a 3x3 matrix which relates corresponding pairs of normalized homogeneous image points across pairs of images, for calibrated cameras (known K). Essential Matrix (Longuet-Higgins, 1981). Estimates relative position/orientation. Note: [t]x is the matrix representation of the cross product.
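A quick Python/NumPy sketch of E = [t]x R (function names are illustrative; the usage below assumes the convention that a point X in camera 1's frame maps to R X + t in camera 2's frame, consistent with the co-planarity argument above):

```python
import numpy as np

def skew(t):
    """[t]x: the 3x3 matrix with skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0,  -t[2],  t[1]],
                     [t[2],  0.0,  -t[0]],
                     [-t[1], t[0],  0.0]])

def essential(R, t):
    """Essential matrix E = [t]x R for relative pose (R, t)."""
    return skew(t) @ R

# Example: a small rotation about z and a translation.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.2, 0.1])
E = essential(R, t)
```

For any scene point, the two normalized image points satisfy x̂'^T E x̂ = 0, which is easy to verify numerically with synthetic data.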
Epipolar constraint: Uncalibrated case. If we don't know K and K', then we can write the epipolar constraint in terms of unknown normalized coordinates:

    x̂'^T E x̂ = 0,  where x̂ = K^-1 x and x̂' = K'^-1 x'
The Fundamental Matrix. Without knowing K and K', we can define a similar relation using the unknown normalized coordinates:

    x̂'^T E x̂ = (K'^-1 x')^T E (K^-1 x) = x'^T (K'^-T E K^-1) x

so

    x'^T F x = 0,  with F = K'^-T E K^-1

Fundamental Matrix (Faugeras and Luong, 1992)
Properties of the Fundamental matrix: x'^T F x = 0, with F = K'^-T E K^-1
- F x is the epipolar line l' associated with x
- F^T x' is the epipolar line l associated with x'
- F is singular (rank two): det(F) = 0
- F e = 0 and F^T e' = 0 (the nullspace of F is e; the nullspace of F^T is e')
- F has seven degrees of freedom: 9 entries, but defined only up to scale, and det(F) = 0
F in more detail. F is a 3x3 matrix, determined only up to scale, with 7 degrees of freedom. Rank 2 -> projection; one column is a linear combination of the other two:

    [ a  b  αa + βb ]
    [ c  d  αc + βd ]
    [ e  f  αe + βf ]

where α and β are scalars (and the overall scale can be normalized out, e.g., by fixing one entry). Given x projected from X into image 1, F constrains the projection of X into image 2 to an epipolar line.
Estimating the Fundamental Matrix. 8-point algorithm: least squares solution using SVD on equations from 8 pairs of correspondences; then enforce the det(F) = 0 constraint using SVD on F. Note: estimation of F (or E) is degenerate for a planar scene.
8-point algorithm
1. Solve a system of homogeneous linear equations
   a. Write down the system of equations: x'^T F x = 0 expands to

      u'u f11 + u'v f12 + u' f13 + v'u f21 + v'v f22 + v' f23 + u f31 + v f32 + f33 = 0

   Stacking one row per correspondence gives Af = 0:

      [ u1'u1  u1'v1  u1'  v1'u1  v1'v1  v1'  u1  v1  1 ]   [f11]
      [  ...    ...   ...   ...    ...   ...  ... ... ... ] [f12]  = 0
      [ un'un  un'vn  un'  vn'un  vn'vn  vn'  un  vn  1 ]   [...]
                                                            [f33]
8-point algorithm
1. Solve a system of homogeneous linear equations
   a. Write down the system of equations
   b. Solve f from Af = 0 using SVD
      Matlab: [U, S, V] = svd(A); f = V(:, end); F = reshape(f, [3 3])';
Need to enforce singularity constraint
8-point algorithm
1. Solve a system of homogeneous linear equations
   a. Write down the system of equations
   b. Solve f from Af = 0 using SVD
      Matlab: [U, S, V] = svd(A); f = V(:, end); F = reshape(f, [3 3])';
2. Resolve the det(F) = 0 constraint using SVD
      Matlab: [U, S, V] = svd(F); S(3,3) = 0; F = U*S*V';
8-point algorithm
1. Solve a system of homogeneous linear equations
   a. Write down the system of equations
   b. Solve f from Af = 0 using SVD
2. Resolve the det(F) = 0 constraint by SVD
Notes: Use RANSAC to deal with outliers (sample 8 points per iteration). How to test for outliers?
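The two SVD steps above, sketched in Python/NumPy rather than the slides' MATLAB (the function name `eight_point` and the data layout are illustrative):

```python
import numpy as np

def eight_point(x1, x2):
    """Estimate F from n >= 8 correspondences.
    x1, x2: (n, 2) arrays of image coordinates, with x2^T F x1 = 0."""
    u, v = x1[:, 0], x1[:, 1]
    up, vp = x2[:, 0], x2[:, 1]
    # One row of A per correspondence, matching the f11..f33 ordering.
    A = np.column_stack([up*u, up*v, up, vp*u, vp*v, vp, u, v, np.ones_like(u)])
    # Step 1b: f is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Step 2: enforce rank 2 by zeroing the smallest singular value of F.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt
```

On clean synthetic correspondences the estimated F satisfies the epipolar constraint to machine precision and is exactly singular after step 2.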
Problem with the eight-point algorithm: each row of A mixes terms of very different magnitudes,

    [ u'u  u'v  u'  v'u  v'v  v'  u  v  1 ]

Products of pixel coordinates can be on the order of 10^4-10^6 while the last entry is 1 -> poor numerical conditioning. Can be fixed by rescaling the data.
The normalized eight-point algorithm (Hartley, 1995)
- Center the image data at the origin, and scale it so the mean squared distance between the origin and the data points is 2 pixels
- Use the eight-point algorithm to compute F from the normalized points
- Enforce the rank-2 constraint (for example, take the SVD of F and throw out the smallest singular value)
- Transform the fundamental matrix back to original units: if T and T' are the normalizing transformations in the two images, then the fundamental matrix in original coordinates is T'^T F T
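The normalization step can be sketched as follows in Python/NumPy (`normalize_points` is an illustrative name; it assumes the mean-squared-distance-of-2 convention stated on the slide):

```python
import numpy as np

def normalize_points(pts):
    """Similarity transform T: centroid -> origin, mean squared distance -> 2.
    Returns (normalized points, T), where T maps homogeneous [x, y, 1]^T."""
    centroid = pts.mean(axis=0)
    msd = ((pts - centroid) ** 2).sum(axis=1).mean()
    s = np.sqrt(2.0 / msd)  # isotropic scale factor
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    ph = np.column_stack([pts, np.ones(len(pts))])
    return (ph @ T.T)[:, :2], T
```

After estimating F from points normalized in both images (with transforms T and T'), the matrix in original pixel coordinates is `T2.T @ F @ T1`, matching the T'^T F T expression above.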
VLFeat's 800 most confident matches among 10,000+ local features.
Epipolar lines
Keep only the matches that are inliers with respect to the best fundamental matrix.
Comparison of estimation algorithms

                  8-point       Normalized 8-point   Nonlinear least squares
    Av. Dist. 1   2.33 pixels   0.92 pixel           0.86 pixel
    Av. Dist. 2   2.18 pixels   0.85 pixel           0.80 pixel
Let's recap: the Fundamental matrix song, http://danielwedge.com/fmatrix/