Geometry of image formation
Tomáš Svoboda, svoboda@cmp.felk.cvut.cz
Czech Technical University in Prague, Center for Machine Perception
http://cmp.felk.cvut.cz
Last update: November 3, 2008

Talk Outline
- Pinhole model
- Camera parameters
- Estimation of the parameters
- Camera calibration
Motivation
- parallel lines
- window sizes
- image units
- distance from the camera
What will we learn
- How does the 3D world project to the 2D image plane?
- How is a camera modeled?
- How can we estimate the camera model?
Pinhole camera
[Figure: pinhole camera. Source: http://en.wikipedia.org/wiki/pinhole_camera]

Camera Obscura
[Figure: camera obscura. Source: http://en.wikipedia.org/wiki/camera_obscura]

Camera Obscura, room-sized
[Figure: room-sized camera obscura used by the art department at the UNC at Chapel Hill. Source: http://en.wikipedia.org/wiki/camera_obscura]
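1D pinhole camera: projects 2D to 1D
[Figure: image plane at distance f behind the camera center C, optical axis z, scene points at depths Z_1, Z_2, Z_3, image coordinate x]

$$\frac{x}{f} = -\frac{X}{Z} \quad\Rightarrow\quad x = -\frac{fX}{Z}$$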
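Problems with perspective I
[Figure: two scene points of equal size at different depths project to different image sizes]

$$X_1 = X_2 \quad\text{but}\quad x_1 \neq x_2$$

Problems with perspective II
[Figure: two different scene points on the same ray through C project to the same image point]

$$X_2 \neq X_3 \quad\text{but}\quad x_2 = x_3$$

Get rid of the (−) sign
[Figure: a second image plane placed at distance f in front of the camera center C]
With the image plane in front of C the image is no longer inverted, so $x = \frac{fX}{Z}$.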
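The two perspective problems can be checked numerically. The following is a minimal sketch, assuming the image plane sits in front of the camera center (so x = fX/Z without the sign flip); the function name `project_1d` and all numeric values are hypothetical and chosen only for illustration.

```python
def project_1d(X, Z, f=1.0):
    """1D pinhole projection with the image plane in front of C: x = f * X / Z."""
    if Z <= 0:
        raise ValueError("the point must lie in front of the camera (Z > 0)")
    return f * X / Z


f = 2.0  # hypothetical focal length, arbitrary units

# Problem I: equal sizes at different depths give different image sizes.
x_near = project_1d(1.0, 4.0, f)
x_far = project_1d(1.0, 8.0, f)

# Problem II: different scene points on one ray share a single image point.
x_a = project_1d(1.0, 4.0, f)
x_b = project_1d(2.0, 8.0, f)

print(x_near, x_far, x_a == x_b)  # 0.5 0.25 True
```

Doubling the depth halves the image size, and scaling X and Z together leaves x unchanged, which is why a single image cannot recover depth.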
How does the 3D world project to the 2D image plane?
A 3D point X in a world coordinate system
[Figure: world coordinate axes x, y, z with origin [0, 0, 0] and the 3D point X]

A pinhole camera observes a scene
[Figure: the camera added to the scene]

Point X projects to the image plane, point x
[Figure: the point X and its projection x in the image plane]

Scene projection
[Figure: the whole scene projected onto the image plane]
3D Scene projection: observations
- 3D lines project to 2D lines,
- but the angles change: parallel lines are no longer parallel,
- area ratios change: note the front and back sides of the house.
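The observation about parallel lines can be illustrated with a few projected points. This is a sketch under stated assumptions: the projection is x = fX/Z, y = fY/Z in the camera frame, and the two lines, their shared direction and the helper names are hypothetical.

```python
def project(point, f=1.0):
    """Perspective projection of a 3D point given in the camera frame."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


def point_on_line(origin, direction, s):
    """Point origin + s * direction on a 3D line."""
    return tuple(o + s * d for o, d in zip(origin, direction))


f = 1.0
direction = (1.0, 0.0, 2.0)  # shared direction, so the two lines are parallel in 3D
line_a = (-1.0, -1.0, 4.0)   # a point on the first line
line_b = (1.0, -1.0, 4.0)    # a point on the second line


def image_gap(s):
    """Image distance between the two lines at the same parameter s."""
    xa, ya = project(point_on_line(line_a, direction, s), f)
    xb, yb = project(point_on_line(line_b, direction, s), f)
    return ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5


gaps = [image_gap(s) for s in (0.0, 10.0, 1000.0)]

# Both image lines approach the same point, set by the 3D direction alone.
vanishing_point = (f * direction[0] / direction[2], f * direction[1] / direction[2])
far_point = project(point_on_line(line_a, direction, 1e6), f)
print(gaps, vanishing_point, far_point)
```

The gap between the two image lines shrinks as the points move away from the camera, and both lines head toward the same image point.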
Put the sketches into equations
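3D → 2D Projection
We remember that:

$$\mathbf{x} = \left[\frac{fX}{Z}, \frac{fY}{Z}\right]^\top$$

$$\lambda\begin{bmatrix}\mathbf{x}\\1\end{bmatrix} = \begin{bmatrix}fX\\fY\\Z\end{bmatrix} = \begin{bmatrix}f&0&0&0\\0&f&0&0\\0&0&1&0\end{bmatrix}\begin{bmatrix}\mathbf{X}\\1\end{bmatrix}$$

Use the homogeneous coordinates (for the notation conventions, see the talk notes):

$$\lambda\begin{bmatrix}\mathbf{x}\\1\end{bmatrix}_{[3\times 1]} = \mathtt{K}_{[3\times 3]}\begin{bmatrix}\mathtt{I} & \mathbf{0}\end{bmatrix}\begin{bmatrix}\mathbf{X}\\1\end{bmatrix}_{[4\times 1]}$$

Here K = diag(f, f, 1) and λ = Z, but...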
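The homogeneous form can be evaluated step by step. A minimal sketch, assuming the point is already given in the camera frame and K = diag(f, f, 1); the helper names `matvec` and `project_camera_frame` and the numbers are hypothetical.

```python
def matvec(M, v):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def project_camera_frame(X_cam, f):
    """Evaluate lambda * [x, 1] = K [I | 0] [X, 1] and divide by lambda."""
    K = [[f, 0.0, 0.0],
         [0.0, f, 0.0],
         [0.0, 0.0, 1.0]]
    I0 = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]
    X_h = list(X_cam) + [1.0]             # homogeneous 4-vector
    lam_x = matvec(K, matvec(I0, X_h))    # equals [f*X, f*Y, Z]
    lam = lam_x[2]                        # the scale lambda is the depth Z
    return [lam_x[0] / lam, lam_x[1] / lam], lam


x, lam = project_camera_frame((2.0, -1.0, 4.0), f=0.5)
print(x, lam)  # [0.25, -0.125] 4.0
```

The division by λ is the only non-linear step; everything before it is a matrix product.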
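... we need the X in the camera coordinate system
Rotate the vector:

$$\mathbf{X}_{cam} = \mathtt{R}(\mathbf{X}_w - \mathbf{C}_w)$$

R is a 3×3 rotation matrix. The point coordinates X_cam are now in the camera frame.
Use homogeneous coordinates to get a matrix equation:

$$\begin{bmatrix}\mathbf{X}_{cam}\\1\end{bmatrix} = \begin{bmatrix}\mathtt{R} & -\mathtt{R}\mathbf{C}_w\\ \mathbf{0}^\top & 1\end{bmatrix}\begin{bmatrix}\mathbf{X}_w\\1\end{bmatrix}$$

The camera center C_w is often replaced by the translation vector

$$\mathbf{t} = -\mathtt{R}\mathbf{C}_w$$

[Figure: world frame with origin [0, 0, 0], camera center C_w, scene point X_w and its camera-frame coordinates X_cam]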
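A short numeric check shows that the two ways of writing the transform agree. This sketch assumes a hypothetical rotation (a quarter turn about the y axis), a hypothetical camera center and point, and helper names that are not part of the lecture.

```python
def matvec(M, v):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def world_to_camera(X_w, R, C_w):
    """X_cam = R (X_w - C_w)."""
    return matvec(R, [x - c for x, c in zip(X_w, C_w)])


def translation_from_center(R, C_w):
    """t = -R C_w."""
    return [-v for v in matvec(R, C_w)]


# Hypothetical pose: a proper rotation (determinant +1) and a camera center.
R = [[0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0]]
C_w = [1.0, 2.0, 3.0]
X_w = [4.0, 2.0, 3.0]

X_cam = world_to_camera(X_w, R, C_w)
t = translation_from_center(R, C_w)
X_cam_via_t = [a + b for a, b in zip(matvec(R, X_w), t)]  # R X_w + t

print(X_cam, t, X_cam_via_t)
```

In this example the point ends up 3 units along the camera z axis, and R X_w + t gives the same coordinates as R (X_w - C_w).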
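External (extrinsic) parameters
The translation vector t and the rotation matrix R are called the external parameters of the camera.

$$\lambda\begin{bmatrix}\mathbf{x}\\1\end{bmatrix} = \mathtt{K}\begin{bmatrix}\mathtt{I} & \mathbf{0}\end{bmatrix}\begin{bmatrix}\mathbf{X}_{cam}\\1\end{bmatrix} = \mathtt{K}\begin{bmatrix}\mathtt{R} & \mathbf{t}\end{bmatrix}\begin{bmatrix}\mathbf{X}_w\\1\end{bmatrix}$$

[Figure: world frame, camera center C_w, scene point X_w and its projection x]
Camera parameters (so far): f, R, t. Is that all? What can we model?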
What is the geometry good for?
[Video: zoom out vs. motion away from the scene]
- How would you characterize the difference?
- Would you guess the motion type?
Enough geometry (just for a moment), look at real images
From geometry to pixels and back again
[Figure: full image; pixel axes run to about 1000 horizontally and 700 vertically]

Problems with pixels
[Figure: the same full image]

Is this a straight line?
[Figure: zoomed detail of the image]

Problems with pixels
[Figure: the same full image]

What are we looking at?
[Figure: close-up of a small patch of pixels]

Did you recognize it?
[Figure: the same full image]
Pixel images revisited
[Figure: the full image together with the two zoomed details]
- There are no negative coordinates. Where is the principal point?
- Lines are not lines any more.
- Pixels, considered independently, do not carry much information.
Pixel coordinate system
Assume normalized geometrical coordinates x = [x, y, 1]^T.

$$u = m_u(-x) + u_0$$
$$v = m_v\,y + v_0$$

where m_u, m_v are the scale factors set by the pixel sizes (pixels per unit length along each axis) and [u_0, v_0]^T are the pixel coordinates of the principal point. The minus sign reflects the opposite orientation of the u and x axes.
[Figure: pixel axes u, v, image axes x, y, and the principal point [u_0, v_0]]
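Put pixels and geometry together
From 3D to image coordinates:

$$\begin{bmatrix}\lambda x\\ \lambda y\\ \lambda\end{bmatrix} = \begin{bmatrix}f&0&0\\0&f&0\\0&0&1\end{bmatrix}\begin{bmatrix}\mathtt{R} & \mathbf{t}\end{bmatrix}\mathbf{X}_{[4\times 1]}$$

From normalized coordinates to pixels:

$$\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}-m_u&0&u_0\\0&m_v&v_0\\0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}$$

Put them together:

$$\lambda\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}-fm_u&0&u_0\\0&fm_v&v_0\\0&0&1\end{bmatrix}\begin{bmatrix}\mathtt{R} & \mathbf{t}\end{bmatrix}\mathbf{X}$$

Finally:

$$\mathbf{u} \simeq \mathtt{K}\begin{bmatrix}\mathtt{R} & \mathbf{t}\end{bmatrix}\mathbf{X}$$

Introducing a 3×4 camera projection matrix P:

$$\mathbf{u} \simeq \mathtt{P}\mathbf{X}$$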
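The complete chain from a world point to pixel coordinates fits in a few lines. This is a sketch under stated assumptions: K follows the sign convention u = m_u(−x) + u_0 from the pixel coordinate slide, and the focal length (8 mm), pixel density (125 pixels per mm), principal point, pose and helper names are all hypothetical.

```python
def matvec(M, v):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    """Matrix-matrix product for matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def projection_matrix(f, m_u, m_v, u_0, v_0, R, C_w):
    """P = K [R | t] with t = -R C_w."""
    K = [[-f * m_u, 0.0, u_0],   # minus sign: the u axis is flipped against x
         [0.0, f * m_v, v_0],
         [0.0, 0.0, 1.0]]
    t = [-c for c in matvec(R, C_w)]
    Rt = [R[i] + [t[i]] for i in range(3)]  # the 3x4 matrix [R | t]
    return matmul(K, Rt)


def project_to_pixels(P, X_w):
    """Apply P to the homogeneous point and divide by lambda."""
    lu, lv, lam = matvec(P, list(X_w) + [1.0])
    return (lu / lam, lv / lam)


R = [[0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0]]
P = projection_matrix(f=8.0, m_u=125.0, m_v=125.0, u_0=320.0, v_0=240.0,
                      R=R, C_w=[1.0, 2.0, 3.0])
u, v = project_to_pixels(P, [5.0, 2.5, 2.0])
print(u, v)  # 70.0 365.0
```

The same result follows by hand: the point has camera coordinates (1, 0.5, 4), so x = 2 mm and y = 1 mm, giving u = 125·(−2) + 320 = 70 and v = 125·1 + 240 = 365.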
Non-linear distortion
Several models exist; they are less standardized than the linear model. We will consider a simple one-parameter radial distortion. Let x_n denote the linear image coordinates and x_d the distorted ones:

$$\mathbf{x}_d = (1 + \kappa r^2)\,\mathbf{x}_n$$

where κ is the distortion parameter and $r^2 = x_n^2 + y_n^2$ is the squared distance from the principal point.
Only the distorted pixel coordinates are observable:

$$\mathbf{u}_d = \mathtt{K}\,\mathbf{x}_d$$

Assume that we know κ. How to get the lines back?
Undoing Radial Distortion
From pixels to distorted image coordinates: $\mathbf{x}_d = \mathtt{K}^{-1}\mathbf{u}_d$
From distorted to linear image coordinates: $\mathbf{x}_n = \dfrac{\mathbf{x}_d}{1 + \kappa r^2}$
Where is the problem? $r^2 = x_n^2 + y_n^2$: the unknown x_n appears on both sides of the equation.
Iterative solution:
1. initialize $\mathbf{x}_n = \mathbf{x}_d$
2. $r^2 = x_n^2 + y_n^2$
3. compute $\mathbf{x}_n = \dfrac{\mathbf{x}_d}{1 + \kappa r^2}$
4. go to 2 (and repeat a few times)
And back to pixels: $\mathbf{u}_n = \mathtt{K}\,\mathbf{x}_n$
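The iterative solution above can be sketched as follows. The code works directly in image coordinates, so the conversions with K and its inverse are left out; the function names, the value of κ and the test point are hypothetical. The fixed-point iteration is only expected to converge for mild distortion (roughly κr² < 1 for positive κ).

```python
def distort(x_n, kappa):
    """Forward model: x_d = (1 + kappa * r^2) * x_n, with r^2 = x_n^2 + y_n^2."""
    r2 = x_n[0] ** 2 + x_n[1] ** 2
    s = 1.0 + kappa * r2
    return (s * x_n[0], s * x_n[1])


def undistort(x_d, kappa, iterations=20):
    """Recover x_n from x_d by repeating x_n = x_d / (1 + kappa * r^2)."""
    x_n = x_d                                # step 1: initialize with x_d
    for _ in range(iterations):
        r2 = x_n[0] ** 2 + x_n[1] ** 2       # step 2: radius from the current estimate
        s = 1.0 + kappa * r2
        x_n = (x_d[0] / s, x_d[1] / s)       # step 3: update the estimate
    return x_n


kappa = 0.1                  # hypothetical distortion parameter
x_true = (0.3, -0.4)         # hypothetical linear image point
x_d = distort(x_true, kappa)
x_rec = undistort(x_d, kappa)
print(x_d, x_rec)
```

For this example each iteration reduces the error by roughly a factor of twenty, so a handful of repetitions already suffices.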
Undoing Radial Distortion
[Video]
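Estimation of camera parameters: camera calibration
The goal: estimate the 3×4 camera projection matrix P, and possibly the parameter κ of the non-linear distortion, from images.
Assume a known projection [u, v]^T of a 3D point X with known coordinates:

$$\begin{bmatrix}\lambda u\\ \lambda v\\ \lambda\end{bmatrix} = \begin{bmatrix}\mathbf{P}_1^\top\\ \mathbf{P}_2^\top\\ \mathbf{P}_3^\top\end{bmatrix}\mathbf{X}, \qquad \mathbf{X} = [X, Y, Z, 1]^\top$$

$$\frac{\lambda u}{\lambda} = \frac{\mathbf{P}_1^\top\mathbf{X}}{\mathbf{P}_3^\top\mathbf{X}} \quad\text{and}\quad \frac{\lambda v}{\lambda} = \frac{\mathbf{P}_2^\top\mathbf{X}}{\mathbf{P}_3^\top\mathbf{X}}$$

Re-arrange and assume λ ≠ 0 (see some notes about λ = 0 in the talk notes) to get a set of homogeneous equations:

$$u\,\mathbf{X}^\top\mathbf{P}_3 - \mathbf{X}^\top\mathbf{P}_1 = 0$$
$$v\,\mathbf{X}^\top\mathbf{P}_3 - \mathbf{X}^\top\mathbf{P}_2 = 0$$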
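Estimation of the P matrix

$$u\,\mathbf{X}^\top\mathbf{P}_3 - \mathbf{X}^\top\mathbf{P}_1 = 0$$
$$v\,\mathbf{X}^\top\mathbf{P}_3 - \mathbf{X}^\top\mathbf{P}_2 = 0$$

Re-shuffle into a matrix form:

$$\underbrace{\begin{bmatrix}-\mathbf{X}^\top & \mathbf{0}^\top & u\,\mathbf{X}^\top\\ \mathbf{0}^\top & -\mathbf{X}^\top & v\,\mathbf{X}^\top\end{bmatrix}}_{\mathtt{A}_{[2\times 12]}}\underbrace{\begin{bmatrix}\mathbf{P}_1\\ \mathbf{P}_2\\ \mathbf{P}_3\end{bmatrix}}_{\mathbf{p}_{[12\times 1]}} = \mathbf{0}_{[2\times 1]}$$

A correspondence u_i ↔ X_i forms two homogeneous equations. P has 12 parameters but scale does not matter, which leaves 11 unknowns. Each pair contributes two equations, so we need at least 6 2D ↔ 3D pairs to get a solution. We stack the rows into the data matrix A, of size [2n × 12] for n ≥ 6 pairs, and solve

$$\mathbf{p}^* = \arg\min_{\mathbf{p}} \|\mathtt{A}\mathbf{p}\| \quad\text{subject to}\quad \|\mathbf{p}\| = 1$$

which is a constrained LSQ problem. p* minimizes the algebraic error.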
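The estimation can be sketched without any external library. Note the swapped-in technique: the constrained minimizer is normally taken from the SVD of A, while this sketch finds the same vector as the eigenvector of AᵀA with the smallest eigenvalue, using inverse power iteration with a tiny diagonal shift. All function names, the synthetic camera `P_true` and the eight points are hypothetical, and the data are noise-free.

```python
import math


def dlt_rows(X, u, v):
    """Two rows of A for one correspondence (u, v) <-> X = (X, Y, Z)."""
    Xh = [X[0], X[1], X[2], 1.0]
    zero = [0.0] * 4
    row_u = [-a for a in Xh] + zero + [u * a for a in Xh]
    row_v = zero + [-a for a in Xh] + [v * a for a in Xh]
    return [row_u, row_v]


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def estimate_P(points_3d, points_2d, iterations=25):
    """Unit vector p minimizing ||A p||, reshaped into a 3x4 matrix."""
    A = []
    for X, (u, v) in zip(points_3d, points_2d):
        A.extend(dlt_rows(X, u, v))
    n = 12
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    shift = 1e-10 * max(M[i][i] for i in range(n))  # keeps the solves well defined
    for i in range(n):
        M[i][i] += shift
    p = [1.0] * n
    for _ in range(iterations):
        p = solve(M, p)                              # one inverse-iteration step
        norm = math.sqrt(sum(a * a for a in p))
        p = [a / norm for a in p]                    # enforce ||p|| = 1
    return [p[0:4], p[4:8], p[8:12]]


def project(P, X):
    """Project a 3D point with P and divide by lambda."""
    Xh = [X[0], X[1], X[2], 1.0]
    lu, lv, lam = (sum(a * b for a, b in zip(row, Xh)) for row in P)
    return (lu / lam, lv / lam)


# Synthetic check with a hypothetical camera and eight non-coplanar points.
P_true = [[2.0, 0.0, 0.5, 1.0], [0.0, 2.0, 0.4, 0.5], [0.0, 0.0, 1.0, 2.0]]
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
             (1.0, 1.0, 0.5), (0.5, 1.0, 1.0), (1.0, 0.3, 1.0), (0.2, 0.7, 0.4)]
points_2d = [project(P_true, X) for X in points_3d]
P_est = estimate_P(points_3d, points_2d)
max_error = max(abs(a - b)
                for X, uv in zip(points_3d, points_2d)
                for a, b in zip(project(P_est, X), uv))
print(max_error)
```

The estimate equals `P_true` only up to scale and sign, which is why the check compares reprojected points rather than matrix entries. With noisy measurements the same code returns the algebraic least-squares solution, not an exact fit.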
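Decomposition of P into the calibration parameters

$$\mathtt{P} = \begin{bmatrix}\mathtt{K}\mathtt{R} & \mathtt{K}\mathbf{t}\end{bmatrix} \quad\text{and}\quad \mathbf{C} = -\mathtt{R}^\top\mathbf{t}$$

We know that R should be 3×3 orthonormal, and K upper triangular.

    P = P ./ norm(P(3,1:3));   % fix the scale: the third row of K*R gets unit norm
    [K,R] = rq(P(:,1:3));      % RQ decomposition; rq is a helper, not a MATLAB built-in
    t = K \ P(:,4);            % same as inv(K)*P(:,4)
    C = -R' * t;               % camera center in world coordinates

See the slide notes for more details.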
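The same decomposition can be sketched in plain Python. This is not the `rq` routine used above: for a 3×3 matrix the RQ factors are obtained here by Gram-Schmidt on the rows, starting from the bottom row. The sketch assumes K has a positive diagonal and R is a proper rotation; the function names and the test camera are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def rq3(M):
    """M = K R with K upper triangular (positive diagonal), R with orthonormal rows."""
    m1, m2, m3 = M
    k33 = math.sqrt(dot(m3, m3))
    r3 = [a / k33 for a in m3]
    k23 = dot(m2, r3)
    w = [a - k23 * b for a, b in zip(m2, r3)]
    k22 = math.sqrt(dot(w, w))
    r2 = [a / k22 for a in w]
    k13, k12 = dot(m1, r3), dot(m1, r2)
    w = [a - k13 * b - k12 * c for a, b, c in zip(m1, r3, r2)]
    k11 = math.sqrt(dot(w, w))
    r1 = [a / k11 for a in w]
    return [[k11, k12, k13], [0.0, k22, k23], [0.0, 0.0, k33]], [r1, r2, r3]


def decompose_P(P):
    """Split P ~ [K R | K t] into K, R, t and the camera center C = -R^T t."""
    M = [row[:3] for row in P]
    scale = math.sqrt(dot(M[2], M[2]))
    if det3(M) < 0:
        scale = -scale                      # choose the sign so that det(R) = +1
    P = [[a / scale for a in row] for row in P]
    K, R = rq3([row[:3] for row in P])
    p4 = [row[3] for row in P]
    t3 = p4[2] / K[2][2]                    # back-substitution for K t = p4
    t2 = (p4[1] - K[1][2] * t3) / K[1][1]
    t1 = (p4[0] - K[0][1] * t2 - K[0][2] * t3) / K[0][0]
    t = [t1, t2, t3]
    C = [-sum(R[j][i] * t[j] for j in range(3)) for i in range(3)]
    return K, R, t, C


# Hypothetical camera: K = [[1000, 0, 320], [0, 1000, 240], [0, 0, 1]],
# R a quarter turn about the y axis, C = (1, 2, 3), all scaled by 2.5.
P = [[800.0, 0.0, -2500.0, 6700.0],
     [600.0, 2500.0, 0.0, -5600.0],
     [2.5, 0.0, 0.0, -2.5]]
K, R, t, C = decompose_P(P)
print(K, R, t, C)
```

If the estimated K is expected to carry a negative entry on its diagonal, as with the flipped u axis of the pixel coordinate slide, the factors returned here differ from that convention by sign changes of the matching columns of K and rows of R.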
An example of a calibration object
[Figure: calibration object]

2D projections localized
[Figure: calibration image with the localized 2D projections marked]

Reprojection for linear model
[Figure: reprojected points for the linear model]

Reprojection for full model
[Figure: reprojected points for the full model]

Reprojection errors: comparison between full and linear model
[Figure: sorted 2D reprojection errors in pixels for the full model and the linear model]
References
The book [2] is the ultimate reference. It is a must-read for anyone wanting to use cameras for 3D computing. Details about the matrix decompositions used throughout the lecture can be found in [1].
[1] Gene H. Golub and Charles F. Van Loan. Matrix Computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, USA, 3rd edition, 1996.
[2] Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge, 2nd edition, 2003.
End