Agenda: Rotations, Camera calibration, Homography, RANSAC


Geometric Transformations
Let's define families of transformations by the properties that they preserve.

Transformation      Matrix        # DoF   Preserves
translation         [ I | t ]     2       orientation
rigid (Euclidean)   [ R | t ]     3       lengths
similarity          [ sR | t ]    4       angles
affine              [ A ]         6       parallelism
projective          [ H ]         8       straight lines

(Each family also preserves the properties listed in the rows below it.)

Rotations
Linear transformations that preserve distances and angles.
Definition: an orthogonal transformation preserves dot products:
\[ a^T b = F(a)^T F(b), \quad \text{where } F(a) = Aa,\ a \in \mathbb{R}^n,\ A \in \mathbb{R}^{n \times n} \]
\[ a^T b = a^T A^T A\, b \iff A^T A = I \]
[can conclude by setting a, b = coordinate vectors]
Defn: A is a rotation matrix if $A^T A = I$, $\det(A) = 1$.
Defn: A is a reflection matrix if $A^T A = I$, $\det(A) = -1$.

2D Rotations
\[ R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]
1 DOF
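The two defining properties of a rotation can be checked numerically. This is a minimal NumPy sketch; the helper name `rotation_2d`, the 30-degree angle, and the test vectors are illustrative choices, not part of the slides.

```python
import numpy as np

def rotation_2d(theta):
    """Return the 2x2 matrix that rotates counter-clockwise by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

R = rotation_2d(np.pi / 6)          # hypothetical 30-degree rotation
a = np.array([1.0, 2.0])
b = np.array([-3.0, 0.5])

# A rotation satisfies R^T R = I and det(R) = +1.
is_orthogonal = np.allclose(R.T @ R, np.eye(2))
det_R = np.linalg.det(R)

# Dot products, and therefore lengths and angles, are unchanged.
dot_before = a @ b
dot_after = (R @ a) @ (R @ b)
```

Flipping the sign of one column of `R` keeps `R.T @ R` equal to the identity but changes the determinant to -1, which is the reflection case.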

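3D Rotations
\[ R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \]
Think of it as a change of basis, where $r_i = R(i,:)$ are the orthonormal basis vectors of the rotated coordinate frame.
[Figure: rotated coordinate frame with axes $r_1$, $r_2$, $r_3$.]
How many DOFs? 3 = (2 to point $r_1$ + 1 to rotate along $r_1$)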

3D Rotations
Lots of parameterizations that try to capture 3 DOFs.
Helpful one for vision: axis-angle representation.
Represent a 3D rotation with a unit vector that represents the axis of rotation, and an angle of rotation about that vector.

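Recall: cross-product
Dot product:
\[ a \cdot b = \|a\| \, \|b\| \cos\theta \]
Cross product:
\[ a \times b = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} i - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} j + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} k \]
Cross product matrix:
\[ a \times b = \hat{a} b = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \]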
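The cross-product matrix can be compared against NumPy's built-in cross product. This is a small sketch; the function name `hat` and the sample vectors are assumptions made for illustration.

```python
import numpy as np

def hat(a):
    """Return the 3x3 skew-symmetric matrix a_hat such that a_hat @ b == a x b."""
    a1, a2, a3 = a
    return np.array([[0.0, -a3,  a2],
                     [ a3, 0.0, -a1],
                     [-a2,  a1, 0.0]])

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 4.0])

via_matrix = hat(a) @ b        # cross product written as a matrix-vector product
via_numpy = np.cross(a, b)     # reference result
```

Writing the cross product as a matrix is what lets the rotation formulas that follow be expressed purely with matrix sums and products.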

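Approach
Given a unit axis $\omega$ and an angle $\theta$, what is $R(\omega, \theta)\, x$?
[Figure: x decomposed into components $x_\parallel$ and $x_\perp$, parallel and perpendicular to $\omega$.]
1. Write x as the sum of its components parallel and perpendicular to $\omega$.
2. Rotate the perpendicular component by a 2D rotation of $\theta$ in the plane orthogonal to $\omega$.
\[ R = I + \hat{\omega} \sin\theta + \hat{\omega}\hat{\omega}\,(1 - \cos\theta) \]
[Rx can simplify to cross and dot product computations]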
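The axis-angle formula above (often called Rodrigues' formula) can be sketched in a few lines of NumPy. The function names `hat` and `rodrigues` and the test rotation about the z axis are illustrative assumptions.

```python
import numpy as np

def hat(w):
    """Skew-symmetric cross-product matrix of a 3-vector."""
    w1, w2, w3 = w
    return np.array([[0.0, -w3,  w2],
                     [ w3, 0.0, -w1],
                     [-w2,  w1, 0.0]])

def rodrigues(omega, theta):
    """Rotation by theta radians about the axis omega: I + W sin + W W (1 - cos)."""
    omega = np.asarray(omega, dtype=float)
    omega = omega / np.linalg.norm(omega)   # guard against a non-unit axis
    W = hat(omega)
    return np.eye(3) + np.sin(theta) * W + (1.0 - np.cos(theta)) * (W @ W)

# Rotating the x axis by 90 degrees about the z axis should give the y axis.
R_z = rodrigues([0.0, 0.0, 1.0], np.pi / 2)
x_rotated = R_z @ np.array([1.0, 0.0, 0.0])
```

Because `W @ x` is `omega x x` and `W @ W @ x` is `omega x (omega x x)`, the same rotation can be applied to a single point with two cross products, without forming the 3x3 matrix.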

Exponential map
\[ R = \exp(\hat{v}), \quad \text{where } v = \omega\theta \]
\[ R = I + \hat{v} + \frac{1}{2!}\hat{v}^2 + \dots \]
[standard Taylor series expansion of exp(x) at x = 0: $1 + x + \frac{1}{2!}x^2 + \dots$]
Implication: we can approximate the change in position due to a small rotation as $v \times x$.
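The small-rotation approximation can be checked by summing the Taylor series directly. This is a sketch under assumed values: the rotation vector `v` (about 0.001 rad), the point `x`, and the 20-term truncation are arbitrary choices.

```python
import numpy as np

def hat(v):
    """Skew-symmetric cross-product matrix of a 3-vector."""
    v1, v2, v3 = v
    return np.array([[0.0, -v3,  v2],
                     [ v3, 0.0, -v1],
                     [-v2,  v1, 0.0]])

def exp_so3_series(v, terms=20):
    """Approximate exp(hat(v)) by truncating its Taylor series after `terms` terms."""
    V = hat(v)
    R = np.eye(3)
    term = np.eye(3)
    for k in range(1, terms):
        term = term @ V / k      # now holds V^k / k!
        R = R + term
    return R

v = 1e-3 * np.array([0.3, -0.5, 0.8])   # axis times angle, a small rotation
x = np.array([1.0, 2.0, 3.0])

R = exp_so3_series(v)
displacement = R @ x - x        # true change in position
first_order = np.cross(v, x)    # approximation v x x
```

The difference between `displacement` and `first_order` is second order in the rotation angle, which is why the approximation is only trusted for small rotations.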


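Perspective projection
[Figure: right-handed coordinate system (x, y, z) with the center of projection (COP); the 3D point (X, Y, Z) projects to the image point (x, y).]
\[ x = \frac{f}{Z} X, \qquad y = \frac{f}{Z} Y \]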

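Perspective projection revisited
\[ \lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \]
Given (X, Y, Z) and f, compute (x, y) and $\lambda$:
\[ \lambda x = fX, \qquad \lambda = Z, \qquad x = \frac{\lambda x}{\lambda} = \frac{fX}{Z} \]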

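Special case: f = 1
Natural geometric intuition: the 3D point is obtained by scaling the ray pointed at the image coordinate. Scale factor = true depth of the point.
[Figure: ray from the COP through the image point (x, y, 1) to the 3D point (X, Y, Z).]
\[ Z \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \]
[Aside: given an image with a focal length f, resize by 1/f to obtain a unit-focal-length image]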
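The projection equations above can be written as a short function. This is a minimal sketch; the function name `project` and the sample point are illustrative, and the point is assumed to be expressed in the camera frame with Z > 0.

```python
import numpy as np

def project(point_3d, f=1.0):
    """Pinhole projection of a camera-frame point (X, Y, Z) with focal length f.

    Returns the image point (x, y) and the scale factor lambda.
    """
    K = np.diag([f, f, 1.0])
    homog = K @ np.asarray(point_3d, dtype=float)   # equals lambda * (x, y, 1)
    lam = homog[2]                                   # lambda is the depth Z
    return homog[:2] / lam, lam

P = np.array([2.0, 4.0, 8.0])
xy, lam = project(P, f=2.0)        # x = f X / Z, y = f Y / Z

# With f = 1, scaling the ray (x, y, 1) by the depth recovers the 3D point.
xy_unit, depth = project(P, f=1.0)
recovered = depth * np.append(xy_unit, 1.0)
```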

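Homogeneous notation
\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \equiv \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \]
For now, think of the above as shorthand notation for
\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \equiv \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \iff \exists \lambda \ \text{s.t.} \ \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \]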

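Camera projection
\[ \lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \]
First matrix: camera intrinsic matrix K (can include skew & non-square pixel size).
Second matrix: camera extrinsics (rotation and translation).
Vector (X, Y, Z, 1): 3D point in world coordinates.
[Figure: camera frame with axes $r_1$, $r_2$, $r_3$, offset by T from the world coordinate frame.]
Aside: homogeneous notation is shorthand for $x = \frac{\lambda x}{\lambda}$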

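Fancier intrinsics
Non-square pixels:
\[ x_s = s_x x, \qquad y_s = s_y y \]
Shifted origin:
\[ x' = x_s + o_x, \qquad y' = y_s + o_y \]
Skewed image axes:
\[ x'' = x' + s_\theta y' \]
\[ K = \begin{bmatrix} s_x & s_\theta & o_x \\ 0 & s_y & o_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} f s_x & f s_\theta & o_x \\ 0 & f s_y & o_y \\ 0 & 0 & 1 \end{bmatrix} \]
[In K, the skew and offset entries absorb the products $s_\theta s_y$ and $s_\theta o_y$ that appear when the three steps are composed.]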

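Notation [Using Matlab's rows x columns]
\[ \lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f s_x & f s_\theta & o_x \\ 0 & f s_y & o_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \]
\[ = K_{3 \times 3} \begin{bmatrix} R_{3 \times 3} & T_{3 \times 1} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = M_{3 \times 4} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \]
Claims (without proof):
1. A 3x4 matrix M can be a camera matrix iff det(A) is not zero, where A is the left 3x3 block of M.
2. M is determined only up to a scale factor.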
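A camera matrix can be assembled from its parts and applied to a world point as follows. This is a sketch with made-up numbers: the intrinsics (`f`, `o_x`, `o_y`, ...), the identity rotation, and the translation are all hypothetical values chosen so the result is easy to verify by hand.

```python
import numpy as np

# Hypothetical intrinsics: focal length, pixel scales, skew, image-center offset.
f, s_x, s_y, s_theta, o_x, o_y = 500.0, 1.0, 1.0, 0.0, 320.0, 240.0
K = np.array([[f * s_x, f * s_theta, o_x],
              [0.0,     f * s_y,     o_y],
              [0.0,     0.0,         1.0]])

R = np.eye(3)                           # camera axes aligned with world axes
T = np.array([[0.0], [0.0], [5.0]])     # world origin is 5 units in front of the camera
M = K @ np.hstack([R, T])               # 3x4 camera matrix M = K [R T]

def project(M, P):
    """Project a 3D world point P with camera matrix M and divide out lambda."""
    p = M @ np.append(P, 1.0)
    return p[:2] / p[2]

P = np.array([1.0, -1.0, 5.0])
uv = project(M, P)
uv_scaled = project(3.0 * M, P)         # same pixel: M is only defined up to scale
det_A = np.linalg.det(M[:, :3])         # left 3x3 block must be non-singular
```

Here the point sits at depth 10 in the camera frame, so the pixel is (500 * 1 / 10 + 320, 500 * (-1) / 10 + 240).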

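Notation (more)
\[ M_{3 \times 4} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} A_{3 \times 3} & b_{3 \times 1} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = A \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + b \]
\[ M = \begin{bmatrix} m_1^T \\ m_2^T \\ m_3^T \end{bmatrix}, \qquad A = \begin{bmatrix} a_1^T \\ a_2^T \\ a_3^T \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \]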

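Applying the projection matrix
\[ x = \frac{1}{\lambda} \left( \begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1 \right), \qquad y = \frac{1}{\lambda} \left( \begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2 \right), \qquad \lambda = \begin{bmatrix} X & Y & Z \end{bmatrix} a_3 + b_3 \]
Set of 3D points that project to x = 0:
\[ \begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1 = 0 \]
Set of 3D points that project to y = 0:
\[ \begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2 = 0 \]
Set of 3D points that project to x = inf or y = inf:
\[ \begin{bmatrix} X & Y & Z \end{bmatrix} a_3 + b_3 = 0 \]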

Rows of the projection matrix describe the three planes defined by the image coordinate system
[Figure: COP, image plane with x and y axes, and the plane normals $a_1$, $a_2$, $a_3$.]

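Other geometric properties
[Figure: ray from the COP through the image point (x, y) to the 3D point (X, Y, Z).]
What's the set of (X, Y, Z) points that project to the same (x, y)?
\[ \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \lambda w + \tilde{b}, \quad \text{where } w = A^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \quad \tilde{b} = -A^{-1} b \]
What's the position of the COP / pinhole?
\[ A \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + b = 0 \implies \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = -A^{-1} b \]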
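Both properties can be verified on a synthetic camera. This is a sketch under assumed values: the intrinsics `K`, the 0.3 rad rotation about the y axis, the translation `T`, and the pixel (400, 100) are arbitrary.

```python
import numpy as np

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, 0.0, s],
              [0.0, 1.0, 0.0],
              [-s, 0.0, c]])
T = np.array([0.1, -0.2, 4.0])
M = K @ np.column_stack([R, T])

A, b = M[:, :3], M[:, 3]

# Center of projection: the point that M maps to the zero vector.
cop = -np.linalg.solve(A, b)

# Direction of the ray of 3D points that all project to pixel (x, y).
x, y = 400.0, 100.0
w = np.linalg.solve(A, np.array([x, y, 1.0]))

def project(M, P):
    p = M @ np.append(P, 1.0)
    return p[:2] / p[2]

points_on_ray = [cop + lam * w for lam in (1.0, 2.5, 10.0)]
pixels = [project(M, P) for P in points_on_ray]
```

Since A = K R for this camera, the recovered center also equals `-R.T @ T`, the usual expression for the camera position in world coordinates.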

Affine Cameras
\[ m_3^T = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix} \]
\[ x = \begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1, \qquad y = \begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2 \]
Image coordinates (x, y) are an affine function of world coordinates (X, Y, Z).
Affine transformations = linear transformations plus an offset.
Example: weak-perspective projection model.
Projection defined by 8 parameters.
Parallel lines are projected to parallel lines.
The transformation can be written as a direct linear transformation.

Geometric Transformations
Euclidean (trans + rot): preserves lengths + angles
Affine: preserves parallel lines
Projective: preserves lines
[Figure: nested sets, Projective ⊃ Affine ⊃ Euclidean.]


Calibration: Recover M from scene points $P_1, \dots, P_N$ and the corresponding projections in the image plane $p_1, \dots, p_N$.
Find M that minimizes the distance between the actual points in the image, $p_i$, and their predicted projections $M P_i$.
Problems:
The projection is (in general) non-linear.
M is defined up to an arbitrary scale factor.

PnP = Perspective n-point

The math for the calibration procedure follows a recipe that is used in many (most?) problems involving camera geometry, so it's worth remembering:
Write the relation between image point, projection matrix, and point in space:
\[ p_i \equiv M P_i \]
Write the non-linear relations between coordinates:
\[ u_i = \frac{m_1^T P_i}{m_3^T P_i}, \qquad v_i = \frac{m_2^T P_i}{m_3^T P_i} \]
Make them linear:
\[ m_1^T P_i - (m_3^T P_i)\, u_i = 0, \qquad m_2^T P_i - (m_3^T P_i)\, v_i = 0 \]

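Write them in matrix form:
\[ \begin{bmatrix} P_i^T & 0^T & -u_i P_i^T \\ 0^T & P_i^T & -v_i P_i^T \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
Put all the relations for all the points into a single matrix:
\[ \begin{bmatrix} P_1^T & 0^T & -u_1 P_1^T \\ 0^T & P_1^T & -v_1 P_1^T \\ \vdots & \vdots & \vdots \\ P_N^T & 0^T & -u_N P_N^T \\ 0^T & P_N^T & -v_N P_N^T \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix} \]
In the noise-free case: Lm = 0 (vector of 0's)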

What about the noisy case?
\[ \min_{\|m\| = 1} \|Lm\|^2 \]
Min right singular vector of L (or eigenvector of $L^T L$)
Is this the right error to minimize? If not, what is?
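The algebraic solution can be sketched with NumPy's SVD. This is a minimal, unnormalized version run on synthetic noise-free data; the function name `calibrate_dlt`, the ground-truth camera, and the random scene points are assumptions for illustration. The rows of L are stacked as all u-equations followed by all v-equations, which has the same null space as the interleaved ordering.

```python
import numpy as np

def calibrate_dlt(P, uv):
    """Estimate a 3x4 camera matrix from N >= 6 scene points P (N, 3) and pixels uv (N, 2)."""
    N = P.shape[0]
    Ph = np.hstack([P, np.ones((N, 1))])                 # homogeneous scene points
    zeros = np.zeros_like(Ph)
    rows_u = np.hstack([Ph, zeros, -uv[:, [0]] * Ph])    # m1.P - u (m3.P) = 0
    rows_v = np.hstack([zeros, Ph, -uv[:, [1]] * Ph])    # m2.P - v (m3.P) = 0
    L = np.vstack([rows_u, rows_v])                      # (2N, 12)
    _, _, Vt = np.linalg.svd(L)
    return Vt[-1].reshape(3, 4)     # right singular vector of the smallest singular value

# Synthetic ground truth (hypothetical values).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
M_true = K @ np.hstack([np.eye(3), np.array([[0.1], [-0.2], [5.0]])])

rng = np.random.default_rng(0)
P = rng.uniform(-1.0, 1.0, size=(8, 3))                  # non-coplanar scene points
proj = np.hstack([P, np.ones((8, 1))]) @ M_true.T
uv = proj[:, :2] / proj[:, [2]]

M_est = calibrate_dlt(P, uv)
# M is only defined up to scale, so compare after fixing one entry to 1.
M_est_n = M_est / M_est[2, 3]
M_true_n = M_true / M_true[2, 3]
```

With noisy pixels the same code returns the minimizer of the algebraic error, which is typically used only as a starting point.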

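Ideal error
[Figure: scene point $P_i$, its observed image point $(u_i, v_i)$, and its predicted projection $M P_i$.]
\[ \text{Error}(M) = \sum_i \left( u_i - \frac{m_1^T P_i}{m_3^T P_i} \right)^2 + \left( v_i - \frac{m_2^T P_i}{m_3^T P_i} \right)^2 \]
Initialize nonlinear optimization with the "algebraic" solution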
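The reprojection error is straightforward to evaluate, and a function like the one below is what would be handed to a nonlinear least-squares solver. This is a sketch; the name `reprojection_error`, the synthetic camera, and the perturbation are illustrative assumptions.

```python
import numpy as np

def reprojection_error(M, P, uv):
    """Sum of squared pixel distances between observations uv and projections of P."""
    Ph = np.hstack([P, np.ones((len(P), 1))])
    proj = Ph @ M.T
    predicted = proj[:, :2] / proj[:, [2]]
    return float(np.sum((uv - predicted) ** 2))

# Hypothetical camera and scene points.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
M_true = K @ np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
rng = np.random.default_rng(1)
P = rng.uniform(-1.0, 1.0, size=(8, 3))
proj = np.hstack([P, np.ones((8, 1))]) @ M_true.T
uv = proj[:, :2] / proj[:, [2]]

err_true = reprojection_error(M_true, P, uv)
err_scaled = reprojection_error(7.0 * M_true, P, uv)   # unaffected by the scale of M

M_bad = M_true.copy()
M_bad[0, 3] += 50.0                                     # perturb one entry
err_bad = reprojection_error(M_bad, P, uv)
```

Unlike the algebraic error, this quantity is measured in pixels and does not depend on how M is scaled.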

Radial Lens Distortions
[Figure: three panels: No Distortion, Barrel Distortion, Pincushion Distortion.]

Correcting Radial Lens Distortions
[Figure: an image before and after radial distortion correction. Source: http://www.grasshopperonline.com/barrel_distortion_correction_software.html]

Overall approach
Minimize reprojection error: Error(M, k's)
Initialize with algebraic solution (approaches in literature based on various assumptions)

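Revisiting homographies
Place the world coordinate frame on the object plane:
\[ \lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix} \]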

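Projection of planar points
\[ \lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix} \]
\[ = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & t_x \\ r_{21} & r_{22} & t_y \\ r_{31} & r_{32} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} = \begin{bmatrix} f r_{11} & f r_{12} & f t_x \\ f r_{21} & f r_{22} & f t_y \\ r_{31} & r_{32} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} \]
Convert between a 2D location on the object plane and an image coordinate with a 3x3 matrix H.
(The above holds for any intrinsic matrix K.)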

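Two-views of a plane
[Figure: image correspondences between two views of a plane.]
\[ \lambda_1 \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = H_1 \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}, \qquad \lambda_2 \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = H_2 \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} \]
\[ \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} \equiv H_2 H_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} \]
[LHS and RHS are related by a scale factor]
[Aside: H usually invertible]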

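Computing homography projections
Given $(x_1, y_1)$ and H, how do we compute $(x_2, y_2)$?
\[ \lambda \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} \]
\[ x_2 = \frac{\lambda x_2}{\lambda} = \frac{a x_1 + b y_1 + c}{g x_1 + h y_1 + i} \]
Is this operation linear in H or $(x_1, y_1)$?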
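Applying a homography amounts to a matrix product followed by a division. This is a small sketch; the function name `apply_homography` and the example matrix are illustrative.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of points through the 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    ph = np.hstack([pts, np.ones((len(pts), 1))])   # append the homogeneous 1
    q = ph @ H.T                                     # rows are lambda * (x2, y2, 1)
    return q[:, :2] / q[:, [2]]                      # divide out lambda

H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.5, 0.0, 1.0]])
p = np.array([[2.0, 4.0]])

mapped = apply_homography(H, p)            # ((2 + 2) / 2, (4 + 3) / 2)
mapped_scaled = apply_homography(5.0 * H, p)   # scaling H changes nothing

# The mapping is not linear in the point: doubling the input does not double the output.
mapped_double = apply_homography(H, 2.0 * p)
```

The division by the third coordinate is what makes the map non-linear in the point coordinates, while rescaling H leaves the result unchanged.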

Estimating homographies
Given corresponding 2D points in the left and right image, estimate H.
[Figure: image correspondences.]
How many corresponding points are needed? How many degrees of freedom in H?
\[ x_2 (g x_1 + h y_1 + i) = a x_1 + b y_1 + c \]
\[ \dots \]
\[ A\, H(:) = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \]
Homogeneous linear system

Estimating homographies
Given corresponding 2D points in the left and right image, estimate H.
[Figure: image correspondences.]
\[ A\, H(:) = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \]
H is determined only up to a scale factor (8 DOFs). Need 4 points minimum.
How to handle more points?
\[ \min_{\|H(:)\| = 1} \|A\, H(:)\|^2 \]
Minimum right singular vector of A (eigenvector of $A^T A$)
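The linear system above can be built and solved in a few lines. This is a minimal sketch without coordinate normalization; the names `estimate_homography` and `apply_homography`, the ground-truth matrix, and the unit-square correspondences are assumptions. The nine unknowns are ordered row by row (a, b, c, d, e, f, g, h, i), which differs from Matlab's column-major H(:) only in ordering.

```python
import numpy as np

def apply_homography(H, pts):
    ph = np.hstack([pts, np.ones((len(pts), 1))])
    q = ph @ H.T
    return q[:, :2] / q[:, [2]]

def estimate_homography(src, dst):
    """Estimate H (up to scale) from N >= 4 correspondences src[i] -> dst[i]."""
    rows = []
    for (x1, y1), (x2, y2) in zip(src, dst):
        # a x1 + b y1 + c - x2 (g x1 + h y1 + i) = 0
        rows.append([x1, y1, 1.0, 0.0, 0.0, 0.0, -x2 * x1, -x2 * y1, -x2])
        # d x1 + e y1 + f - y2 (g x1 + h y1 + i) = 0
        rows.append([0.0, 0.0, 0.0, x1, y1, 1.0, -y2 * x1, -y2 * y1, -y2])
    A = np.asarray(rows)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)      # unit-norm minimizer of ||A h||

H_true = np.array([[1.0, 0.2, 3.0],
                   [0.1, 1.5, -2.0],
                   [0.1, 0.2, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dst = apply_homography(H_true, src)

H_est = estimate_homography(src, dst)
H_est = H_est / H_est[2, 2]          # fix the free scale for comparison
```

With more than four pairs the same code returns the least-squares solution of the homogeneous system.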

Frontalizing planes using homographies Estimate homography on (at least) 4 pairs of corresponding points (e.g., corners of quad/rect) Apply homography on all (x,y) coordinates inside target rectangle to compute source pixel location
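The second step, looking up a source pixel for every target pixel, can be sketched as an inverse warp. This version uses nearest-neighbour sampling on a single-channel image; the function name `warp_to_rectangle` and the toy image and homography are assumptions, and the homography is assumed to map target coordinates to source coordinates.

```python
import numpy as np

def warp_to_rectangle(image, H_target_to_source, out_h, out_w):
    """Fill an out_h x out_w target by sampling `image` at H * (x, y, 1)."""
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    target = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])   # 3 x N
    source = H_target_to_source @ target
    sx = np.rint(source[0] / source[2]).astype(int)   # nearest source column
    sy = np.rint(source[1] / source[2]).astype(int)   # nearest source row
    inside = (sx >= 0) & (sx < image.shape[1]) & (sy >= 0) & (sy < image.shape[0])
    out = np.zeros(out_h * out_w, dtype=image.dtype)  # pixels outside the source stay 0
    out[inside] = image[sy[inside], sx[inside]]
    return out.reshape(out_h, out_w)

# Toy example: a pure shift, so target (x, y) reads source (x + 2, y + 1).
image = np.arange(30).reshape(5, 6)
H_shift = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 1.0],
                    [0.0, 0.0, 1.0]])
patch = warp_to_rectangle(image, H_shift, 3, 3)
```

Iterating over target pixels rather than source pixels guarantees that every pixel of the output rectangle receives a value, with no holes.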


Special case of 2 views: rotations about the camera center
Can be modeled as planar transformations, regardless of scene geometry!
[Figure: panorama example: two input images (incline L.jpg, incline R.jpg), one image warped into the other's frame, and the final panorama view.]

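Derivation
\[ \lambda_1 \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = K_1 \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \qquad K_1 = \begin{bmatrix} f_1 & 0 & 0 \\ 0 & f_1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
\[ \lambda_2 \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = K_2 R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \lambda_1 K_2 R K_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} \]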

Take-home points for homographies
\[ \lambda \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} \]
If the camera rotates about its center, then the images are related by a homography irrespective of scene depth.
If the scene is planar, then images from any two cameras are related by a homography.
A homography mapping is a 3x3 matrix with 8 degrees of freedom.
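The first take-home point can be checked numerically: for a camera that only rotates, one matrix maps pixels between the views whatever the depth. This sketch assumes the same intrinsics `K` for both views, a 0.2 rad rotation about the y axis, and an arbitrary viewing ray.

```python
import numpy as np

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.2), np.sin(0.2)
R = np.array([[c, 0.0, s],
              [0.0, 1.0, 0.0],
              [-s, 0.0, c]])

H = K @ R @ np.linalg.inv(K)        # homography induced by a pure rotation

def pixel(K, X):
    """Project a camera-frame point X and divide out the depth."""
    p = K @ X
    return p[:2] / p[2]

ray = np.array([0.5, -0.3, 4.0])
direct, via_H = [], []
for depth_scale in (1.0, 2.5):      # two points on the same ray, at different depths
    X = depth_scale * ray
    p1 = pixel(K, X)                # view 1
    direct.append(pixel(K, R @ X))  # view 2, computed from the 3D point
    q = H @ np.append(p1, 1.0)
    via_H.append(q[:2] / q[2])      # view 2, computed from the view-1 pixel alone
```

If the second camera were also translated, the two points would land on different pixels in view 2 and no single H could reproduce both.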

Matching features What do we do about the bad matches?

General problem: we are trying to fit a (geometric) model to noisy data.
How about we choose the average vector (least-squares soln)? Why will/won't this work?

Let's generalize the problem a bit
Estimate the best model (a line) that fits data $\{x_i, y_i\}$:
\[ \min_{w,b} \sum_i \left( y_i - f_{w,b}(x_i) \right)^2, \qquad f_{w,b}(x_i) = w x_i + b \]

Let's generalize the problem a bit
Least-squares solution
[Figure: least-squares line fit to the data; axes x and y.]
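A short experiment shows why plain least squares struggles with bad matches. This is a sketch on made-up data: the true line y = 2x + 1, the single corrupted point, and the helper name `fit_line` are all assumptions.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = w x + b; returns (w, b)."""
    X = np.column_stack([x, np.ones_like(x)])
    (w, b), *_ = np.linalg.lstsq(X, y, rcond=None)
    return w, b

x = np.arange(10.0)
y = 2.0 * x + 1.0                    # clean data on the line y = 2x + 1

w_clean, b_clean = fit_line(x, y)

y_bad = y.copy()
y_bad[-1] += 100.0                   # a single gross outlier
w_bad, b_bad = fit_line(x, y_bad)
```

Because the error is squared, one far-away point dominates the sum and drags the whole line toward it, which motivates a fitting procedure that ignores outliers.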

RANSAC Line Fitting Example Sample two points

RANSAC Line Fitting Example Fit Line

RANSAC Line Fitting Example Total number of points within a threshold of line.

RANSAC Line Fitting Example Repeat until we get a good result
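The line-fitting example can be sketched as follows. The function name `ransac_line`, the iteration count, the distance threshold, and the synthetic data (a line y = 2x + 1 with three gross outliers) are illustrative assumptions; the final least-squares refit on the inliers is an extra step borrowed from the general RANSAC recipe and assumes a non-vertical line.

```python
import numpy as np

def ransac_line(x, y, n_iters=200, threshold=0.5, seed=0):
    """Fit y = w x + b while ignoring outliers; returns (w, b, inlier_mask)."""
    rng = np.random.default_rng(seed)
    pts = np.column_stack([x, y])
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)   # sample two points
        p, d = pts[i], pts[j] - pts[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0.0:
            continue                                          # coincident points, no line
        normal = np.array([-d[1], d[0]]) / norm               # unit normal of the line
        dist = np.abs((pts - p) @ normal)                     # point-to-line distances
        inliers = dist < threshold                            # points within the threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the largest consensus set with ordinary least squares.
    X = np.column_stack([x[best_inliers], np.ones(best_inliers.sum())])
    (w, b), *_ = np.linalg.lstsq(X, y[best_inliers], rcond=None)
    return w, b, best_inliers

x = np.arange(20.0)
y = 2.0 * x + 1.0
y[[3, 8, 15]] += np.array([40.0, -30.0, 50.0])    # three gross outliers

w, b, inliers = ransac_line(x, y)
```

Here "a good result" is taken to mean the sample with the most inliers after a fixed number of iterations; other stopping rules are possible.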

RAndom SAmple Consensus Select one match, count inliers

Least squares fit Find average translation vector for the largest group of inliers

RANSAC for estimating transformation
RANSAC loop:
1. Select feature pairs (at random)
2. Compute transformation T (exact)
3. Compute inliers (point matches where $\|p_i' - T p_i\| < \varepsilon$)
4. Keep largest set of inliers
5. Re-compute least-squares estimate of transformation T using all of the inliers
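The five steps of the loop can be sketched for the simplest case, where T is assumed to be a pure 2D translation, so a single feature pair determines it exactly and the least-squares re-estimate is the mean offset of the inliers. The function name `ransac_translation`, the threshold, and the synthetic matches are illustrative assumptions.

```python
import numpy as np

def ransac_translation(p, p_prime, n_iters=100, eps=1.0, seed=0):
    """Estimate t with p_prime ~ p + t from matches contaminated by bad pairs."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(p), dtype=bool)
    for _ in range(n_iters):
        k = rng.integers(len(p))                 # 1. select a feature pair at random
        t = p_prime[k] - p[k]                    # 2. exact translation from that pair
        residual = np.linalg.norm(p_prime - (p + t), axis=1)
        inliers = residual < eps                 # 3. matches that agree with t
        if inliers.sum() > best.sum():           # 4. keep the largest inlier set
            best = inliers
    t_refined = (p_prime[best] - p[best]).mean(axis=0)   # 5. least-squares re-estimate
    return t_refined, best

rng = np.random.default_rng(1)
p = rng.uniform(0.0, 100.0, size=(30, 2))
t_true = np.array([5.0, -3.0])
p_prime = p + t_true
# Corrupt the first 8 matches with hypothetical wrong offsets.
bad_offsets = np.array([[40.0, 10.0], [-25.0, 30.0], [15.0, -45.0], [60.0, 60.0],
                        [-50.0, -20.0], [33.0, -7.0], [-12.0, 48.0], [70.0, -35.0]])
p_prime[:8] = p[:8] + bad_offsets

t_est, inliers = ransac_translation(p, p_prime)
```

For a homography, step 1 would draw the minimum number of pairs that determine H and step 5 would refit H on all inliers; the structure of the loop stays the same.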

RANSAC for alignment

Planar object recognition (what transformation is used; how many pairs must be selected in the initial step?)