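52466S Machine Vision Exercise # Camera models.

1. Pinhole camera. The perspective projection equations for a pinhole camera are

$$x_n = f\,\frac{x_c}{z_c}, \qquad y_n = f\,\frac{y_c}{z_c},$$

where $\mathbf{x}_n = [x_n, y_n]$ are the normalized image coordinates, $\mathbf{x}_c = [x_c, y_c, z_c]$ is the imaged point in the camera coordinate frame and $f$ is the focal length. Give a geometric reasoning for the perspective projection equations. How do the equations change if we assume a virtual image located at a distance $f$ in front of the pinhole?

The following figure illustrates the 3D geometry of a pinhole camera that projects point $\mathbf{x}_c$ to $\mathbf{x}_n$:

[Figure: 3D pinhole geometry showing the camera axes, the point $\mathbf{x}_c = [x_c, y_c, z_c]$ and its image $\mathbf{x}_n = [x_n, y_n]$.]

From similar triangles we get:

$$\frac{x_n}{f} = \frac{x_c}{z_c} \;\Rightarrow\; x_n = f\,\frac{x_c}{z_c}, \qquad \frac{y_n}{f} = \frac{y_c}{z_c} \;\Rightarrow\; y_n = f\,\frac{y_c}{z_c}$$

The relation between the coordinates is better understood in 2D. For example, if we take the plane $x_c = 0$, the relation between $y_n$ and $y_c$ can be clearly seen:

[Figure: 2D side view in the plane $x_c = 0$ showing $y_c$, $z_c$, $f$ and $y_n$.]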
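The similar-triangles result can be checked numerically with a minimal Python sketch. The function name `project_pinhole` and the sample point are illustrative assumptions, not part of the exercise.

```python
def project_pinhole(point_c, f=1.0):
    """Project a point given in the camera frame onto the image plane.

    point_c is (x_c, y_c, z_c); f is the focal length.
    Returns (x_n, y_n) = (f * x_c / z_c, f * y_c / z_c).
    """
    x_c, y_c, z_c = point_c
    if z_c <= 0:
        # Points on or behind the pinhole plane have no valid projection here.
        raise ValueError("point must lie in front of the pinhole (z_c > 0)")
    return (f * x_c / z_c, f * y_c / z_c)


# Hypothetical sample point: doubling all three coordinates leaves the image unchanged,
# because every point on the same ray through the pinhole maps to the same image point.
print(project_pinhole((2.0, 1.0, 4.0), f=2.0))
print(project_pinhole((4.0, 2.0, 8.0), f=2.0))
```

Both calls return the same image point, which is the ray-through-the-pinhole argument from the geometric reasoning expressed in numbers.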
If the image plane is replaced by a virtual image plane in front of the pinhole, the equations do not change. Notice that the directions of the coordinate axes are reversed in the virtual image plane!

[Figure: pinhole geometry with the virtual image plane placed in front of the pinhole.]
2. Pixel coordinate frame. The normalized image coordinates $x_n$ and $y_n$ given by the perspective projection equations above are not in pixel units. The $x_n$ and $y_n$ coordinates have the same unit as distance (typically millimeters) and the origin of the coordinate frame is the principal point (the point where the optical axis pierces the image plane). Now, give a formula which transforms the point $\mathbf{x}_n$ to its pixel coordinates $\mathbf{p} = [u, v]$ when the number of pixels per unit distance in the $u$ and $v$ directions are $m_u$ and $m_v$, respectively, the pixel coordinates of the principal point are $(u_0, v_0)$ and
a) the $u$ and $v$ axes are parallel to the $x$ and $y$ axes, respectively.
b) the $u$ axis is parallel to the $x$ axis and the angle between the $u$ and $v$ axes is $\theta$.

a) Parallel axes:

[Figure: pixel axes $u$, $v$ parallel to the image axes, with the principal point at $(u_0, v_0)$.]

We must express $\mathbf{x}_n$ in pixel coordinates. From the picture we get:

$$u - u_0 = m_u x_n \;\Rightarrow\; u = m_u x_n + u_0$$
$$v - v_0 = m_v y_n \;\Rightarrow\; v = m_v y_n + v_0$$

b) Angled axes:

[Figure: skewed pixel axes with angle $\theta$ between $u$ and $v$, the point $(x_n, y_n)$, and the triangles ABC and DEP used below.]

We first determine the distances $\tilde{x}$ and $\tilde{y}$ in the skewed $(u, v)$ coordinate frame. From triangle ABC we get:

$$\tilde{y} = \frac{y_n}{\sin\theta}$$

From triangle DEP we get:

$$\frac{y_n}{x_n - \tilde{x}} = \tan\theta \;\Rightarrow\; \tilde{x} = x_n - \frac{y_n}{\tan\theta}$$

Then

$$u = m_u \tilde{x} + u_0 = m_u x_n - \frac{m_u}{\tan\theta}\, y_n + u_0$$
$$v = m_v \tilde{y} + v_0 = \frac{m_v}{\sin\theta}\, y_n + v_0$$
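The two cases can be sketched in one Python function, since the parallel-axis case is the angled case with $\theta = 90°$. The name `to_pixels` and all numeric values below are hypothetical, chosen only for illustration.

```python
import math


def to_pixels(x_n, y_n, m_u, m_v, u0, v0, theta=math.pi / 2):
    """Convert image-plane coordinates (distance units) to pixel coordinates.

    m_u, m_v : pixels per unit distance along u and v
    u0, v0   : pixel coordinates of the principal point
    theta    : angle between the u and v axes (pi/2 gives case a)
    """
    x_tilde = x_n - y_n / math.tan(theta)  # distance measured along the u axis
    y_tilde = y_n / math.sin(theta)        # distance measured along the skewed v axis
    return (m_u * x_tilde + u0, m_v * y_tilde + v0)


# Case a): perpendicular axes, so u = m_u * x_n + u0 and v = m_v * y_n + v0
# up to floating-point rounding (tan(pi/2) is huge but finite in floating point).
print(to_pixels(0.5, -0.25, 100, 200, 320, 240))

# Case b): axes at 80 degrees (an assumed skew angle).
print(to_pixels(0.5, -0.25, 100, 200, 320, 240, theta=math.radians(80)))
```

Defaulting `theta` to `math.pi / 2` keeps a single code path for both cases; the skew term is then negligible rather than exactly zero.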
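3. Intrinsics matrix. Use homogeneous coordinates to represent cases (2.a) and (2.b) as a matrix $K_{3\times3}$, also known as the camera's intrinsic matrix, so that $\hat{\mathbf{p}} = K\mathbf{x}_c$, where $\hat{\mathbf{p}}$ is $\mathbf{p}$ in homogeneous coordinates.

$$K = \begin{bmatrix} f m_u & -\dfrac{f m_u}{\tan\theta} & u_0 \\ 0 & \dfrac{f m_v}{\sin\theta} & v_0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} f_u & s & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}$$

$$\hat{\mathbf{p}} = K \begin{bmatrix} x_c \\ y_c \\ z_c \end{bmatrix} = \begin{bmatrix} f_u x_c + s\,y_c + u_0 z_c \\ f_v y_c + v_0 z_c \\ z_c \end{bmatrix}$$

Case (2.a) is the special case $\theta = 90°$, for which $s = 0$ and $f_v = f m_v$. Dividing $\hat{\mathbf{p}}$ by its third component $z_c$ gives the pixel coordinates $[u, v]$.

4. Vanishing point. Assume a pinhole camera looking into the direction of the z-axis when the pinhole is placed at the origin. There are two parallel lines in the plane $y = 1$. The lines are parallel with the z-axis, point $(1, 1, 1)$ is on the first line and point $(-1, 1, 1)$ is on the second. What are the images of these lines? Compute the intersection of the projected lines. (You can work with the normalized image coordinates and assume that the focal length is 1.)

[Figure: the two parallel 3D lines A and B in the camera coordinate frame.]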
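3D points on line A:

$$A(\alpha) = A + \alpha \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \alpha \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 + \alpha \end{bmatrix}$$

3D points on line B:

$$B(\beta) = B + \beta \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ 1 + \beta \end{bmatrix}$$

[Figure: image plane with the projected half lines starting at $a(0)$ and $b(0)$ and meeting at $a(\infty) = b(\infty)$.]

Using the perspective equation we get 2D points on the lines and the line equations:

$$a(\alpha) = \left[ \frac{1}{1+\alpha},\; \frac{1}{1+\alpha} \right] \;\Rightarrow\; y = x$$
$$b(\beta) = \left[ \frac{-1}{1+\beta},\; \frac{1}{1+\beta} \right] \;\Rightarrow\; y = -x$$

Pinhole cameras project 3D lines as 2D lines (if the line is not on the principal plane). The lines $y = x$ and $y = -x$ intersect at $(0, 0)$. The projected lines intersect at the origin of the image coordinate plane, where the optical axis crosses the image plane. Note that the 3D lines are infinite but the 2D lines stop at the origin (i.e. they are only half lines).

Note: In fact, it holds that the projections of two parallel lines lying on some plane $\Pi$ converge on a horizon line $L$ formed by the intersection of the image plane with the plane parallel to $\Pi$ and passing through the pinhole.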
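The convergence of the two projected lines can be observed numerically. The sketch below assumes focal length 1 and, as illustrative starting points, $(1, 1, 1)$ and $(-1, 1, 1)$ on two lines parallel to the z-axis; the helper names `project` and `point_on_line` are hypothetical.

```python
def project(point, f=1.0):
    """Perspective projection of a camera-frame point (x, y, z) with z > 0."""
    x, y, z = point
    return (f * x / z, f * y / z)


def point_on_line(start, alpha, direction=(0.0, 0.0, 1.0)):
    """Point start + alpha * direction on a 3D line."""
    return tuple(s + alpha * d for s, d in zip(start, direction))


A0 = (1.0, 1.0, 1.0)    # assumed point on the first line
B0 = (-1.0, 1.0, 1.0)   # assumed point on the second line

# Walk along both lines away from the camera and print the projected points.
for alpha in (0.0, 1.0, 9.0, 999.0):
    a = project(point_on_line(A0, alpha))
    b = project(point_on_line(B0, alpha))
    print(alpha, a, b)
```

As `alpha` grows, both image points move along $y = x$ and $y = -x$ toward $(0, 0)$, and never pass it, which is why the projections are half lines.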
5. Radial lens distortion. According to a common model for radial lens distortion, the distorted image coordinates $\mathbf{x}_d = [x_d, y_d]$ are obtained by

$$x_d = x_n + (x_n - x_{n0})\left(k_1 r^2 + k_2 r^4 + k_3 r^6 + \dots\right)$$
$$y_d = y_n + (y_n - y_{n0})\left(k_1 r^2 + k_2 r^4 + k_3 r^6 + \dots\right),$$

where $k_i$ are parameters of the model (radial distortion coefficients), $x_n$ and $y_n$ are the normalized image coordinates, $\mathbf{x}_{n0} = [x_{n0}, y_{n0}]$ is the distortion center, and $r = \sqrt{(x_n - x_{n0})^2 + (y_n - y_{n0})^2}$. Assume that $k_i = 0$ for all $i > 1$, $\mathbf{x}_{n0} = [300, 300]$, and $\mathbf{x}_d = [50, 50]$ is the distorted position of $\mathbf{x}_n = [100, 100]$. Determine $k_1$.

We may use either coordinate of the point correspondence $\mathbf{x}_d \leftrightarrow \mathbf{x}_n$ to solve for $k_1$. By using $x$ we get:

$$x_d = x_n + (x_n - x_{n0})\,k_1 r^2 = x_n + (x_n - x_{n0})\,k_1\left((x_n - x_{n0})^2 + (y_n - y_{n0})^2\right)$$

$$k_1 = \frac{x_d - x_n}{(x_n - x_{n0})\left((x_n - x_{n0})^2 + (y_n - y_{n0})^2\right)} = \frac{50 - 100}{(100 - 300)\left((100 - 300)^2 + (100 - 300)^2\right)} = 3.125 \times 10^{-6}$$
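The one-coefficient solve can be cross-checked by applying the distortion model forward again. This is a minimal sketch: the function names are hypothetical, and the numeric values (distortion center $[300, 300]$, undistorted point $[100, 100]$, distorted point $[50, 50]$) are assumed example values.

```python
def distort(x_n, center, k1):
    """Apply the radial model with a single coefficient k1 (all higher k_i = 0)."""
    dx, dy = x_n[0] - center[0], x_n[1] - center[1]
    r2 = dx * dx + dy * dy  # r squared
    return (x_n[0] + dx * k1 * r2, x_n[1] + dy * k1 * r2)


def solve_k1(x_n, x_d, center):
    """Solve k1 from the x coordinate of one correspondence x_d <-> x_n.

    Requires x_n[0] != center[0]; otherwise the y coordinate should be used instead.
    """
    dx, dy = x_n[0] - center[0], x_n[1] - center[1]
    r2 = dx * dx + dy * dy
    return (x_d[0] - x_n[0]) / (dx * r2)


center = (300.0, 300.0)  # assumed distortion center
x_n = (100.0, 100.0)     # assumed undistorted point
x_d = (50.0, 50.0)       # assumed distorted point

k1 = solve_k1(x_n, x_d, center)
print(k1)
print(distort(x_n, center, k1))  # should reproduce x_d up to rounding
```

Re-applying `distort` with the recovered coefficient is a cheap consistency check on both the algebra and the sign of $k_1$.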
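6. Full camera model. Points are often expressed in an arbitrary frame of reference denoted the world reference frame or {w}. We denote the transformation between the world reference frame {w} and the camera reference frame {c} as ${}^{w}T_c$ or simply $T_c$. This is often a simple rigid transformation consisting of a 3D rotation $R$ and translation $\mathbf{t}$:

$$\mathbf{x}_c = R_c \mathbf{x}_w + \mathbf{t}_c$$

Write down the equations that convert a point from world coordinates $\mathbf{x}_w$ to pixel coordinates $\mathbf{p}_c$ using homogeneous coordinates (ignore radial distortion).

$$\mathbf{x}_c = \begin{bmatrix} R & \mathbf{t} \end{bmatrix} \hat{\mathbf{x}}_w$$
$$\hat{\mathbf{p}} = K\mathbf{x}_c = K \begin{bmatrix} R & \mathbf{t} \end{bmatrix} \hat{\mathbf{x}}_w$$

Here $\hat{\mathbf{x}}_w = [x_w, y_w, z_w, 1]$ is the world point in homogeneous coordinates. We can define $P = K[R \;\; \mathbf{t}]$ as the $3 \times 4$ projection matrix that completely defines a pinhole camera in space.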
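The full chain from world coordinates to pixels can be sketched in plain Python. The sketch assumes that $K$ is built from $f$, $m_u$, $m_v$, $\theta$ and the principal point as in problems 2 and 3; the helper names and all numeric values are hypothetical.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def intrinsics(f, m_u, m_v, u0, v0, theta=math.pi / 2):
    """Intrinsic matrix K, including the skew term for a non-perpendicular v axis."""
    return [[f * m_u, -f * m_u / math.tan(theta), u0],
            [0.0, f * m_v / math.sin(theta), v0],
            [0.0, 0.0, 1.0]]


def projection_matrix(K, R, t):
    """P = K [R | t], a 3x4 matrix."""
    Rt = [list(R[i]) + [t[i]] for i in range(3)]
    return matmul(K, Rt)


def project_world(P, x_w):
    """Map a world point to pixel coordinates via homogeneous coordinates."""
    x_hat = [[x_w[0]], [x_w[1]], [x_w[2]], [1.0]]
    u, v, w = (row[0] for row in matmul(P, x_hat))
    return (u / w, v / w)  # divide by the third homogeneous component


# Assumed example: identity rotation, camera frame shifted 5 units along z.
K = intrinsics(f=1.0, m_u=100.0, m_v=100.0, u0=320.0, v0=240.0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 5.0]
P = projection_matrix(K, R, t)
print(project_world(P, (1.0, 2.0, 5.0)))  # approximately (330, 260)
```

With these values the camera-frame point is $(1, 2, 10)$, so $u \approx 100 \cdot 1/10 + 320$ and $v \approx 100 \cdot 2/10 + 240$; the result is only approximate because the skew term is tiny but not exactly zero at $\theta = \pi/2$ in floating point.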