Perspective Mappings David Eberly, Geometric Tools, Redmond WA 98052 https://www.geometrictools.com/ This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. Created: August 16, 2011 Last Modified: November 3, 2012 Contents 1 Introduction 2 2 Perspective Mapping from a Quadrilateral to a Square 2 3 Perspective Mapping from a Cuboid to a Cube 5 4 Perspective Mapping from a Hypercuboid to a Hypercube 8 1
1 Introduction This document combines and supersedes three others Perspective Mapping Between Two Convex Quadrilaterals (file Perspective.pdf. A brief summary of the fractional linear transformation that map a convex quadrilateral to a unit square. The inverse of such a transformation is easy to compute, so in fact the extension to a mapping between two convex quadrilaterals is simple to construct. Perspective Mappings Between Quadrilaterals (file Perspective2.pdf. A 3D geometric motivation for the perspective projection of one convex quadrilateral onto another. The idea is to place one quadrilateral in the xy-plane and set z-values to zero. One of the vertices is translated to the origin. The other quadrilateral has a vertex translated to the origin. The quadrilateral is placed on some plane through the origin. This plane and an eyepoint for the perspective projection of the rotated quadrilateral onto the other is constructed, which leads to a 2D-to-2D mapping fractional linear transformation between the quadrilaterals. Perspective Mappings Between Cuboids (file Perspective3.pdf. A 4D geometric motivation for the perspective projection of one cuboid (convex polyhedron with 6 planar faces onto another cuboid. This is the extension of the idea discussed in Perspective Mappings Between Quadrilaterals. The current document simplifies the constructions. It also shows the extension to higher dimensions (perspective projection of hypercuboids to hypercuboids. It is sufficient to construct the mapping from the canonical hypercube (axis-aligned, vertex at the origin, edge of length one to a hypercuboid. The mapping is a fractional linear transformation, which is easy invertible, so you easily obtain the mapping from the hypercuboid to a hypercube. 2 Perspective Mapping from a Quadrilateral to a Square Let the square vertices be (0, 0, (1, 0, (1, 1, and (0, 1. The corresponding quadrilateral vertices are q 00, q 10, q 11, and q 01. To map the quadrilateral to the square, consider the problem as a perspective projection in three dimensions. The view plane is y 2 = 0 and the eyepoint is E = (e 0, e 1, e 2 where e 2 0 (the eyepoint is not on the view plane. The perspective projection of a point r = (r 0, r 1, r 2 to the view plane is the point p = (x 0, x 1, 0, which lies on the ray originating at E and contains r. The intersection of the ray with the plane is the projection point: p = E + t(r E for some t. Some algebra leads to t = e 2 /(e 2 r 2, x 0 = (e 2 r 0 e 0 r 2 /(e 2 r 2, and x 1 = (e 2 r 1 e 1 r 2 /(e 2 r 2. Embed the square vertices in the view plane as (i 0, i 1, 0 for i 0 and i 1 in {0, 1}. Embed the quadrilateral vertices as (q i0i 1 q 00, 0, which translates one of the vertices to the origin. Rotate the embedded quadrilateral vertices so that they lie in a plane containing the origin 0 and having normal vector N = (n 0, n 1, n 2. Let the rotated points be denoted r i0i 1 = R(q i0i 1 q 00, 0, where R is a rotation matrix corresponding to the normal; thus, N r i0i 1 = 0. The problem is to construct E and N so that E + t i0i 1 (r i0i 1 E = (i 0, i 1, 0, i j {0, 1} (1 for some scalars t i0i 1. Equation (1 says that the embedded quadrilateral vertices are projected perspectively to the square vertices. Figure 1 shows the geometric configuration. 2
Figure 1. Projection of a quadrilateral onto a square. The rotated quadrilateral points r i0i 1 are projected to the square points (i, j, 0. The figure shows a rotated quadrilateral interior point projected to a square interior point. The vector q 11 q 00 can be represented as a linear combination of two quadrilateral edges, namely, q 11 q 00 = a 0 (q 10 q 00 + a 1 (q 01 q 00 (2 The coefficients a 0 and a 1 are constructed as the solution of two linear equations in two unknowns. These coefficients show up in the equations for the perspective projection. The convexity of the quadrilateral guarantees that a 0 0, a 1 0, and a 0 + a 1 > 1. Rotation preserves the relative positions of the vertices, so we know that r 11 = a 0 r 10 + a 1 r 01 (3 Equation (1 contains four relationships, E + t 00 (r 00 E = (0, 0, 0 E + t 10 (r 10 E = (1, 0, 0 E + t 01 (r 01 E = (0, 1, 0 E + t 11 (r 11 E = (1, 1, 0 (4 It must be that t 00 = 1, because r 00 = 0. Dotting the last three equations with the normal vector leads to (1 t 10 N E = n 0, (1 t 01 N E = n 1, (1 t 11 N E = n 0 + n 1 Geometrically, it is clear that N E 0, so these three equations imply t 11 = t 10 + t 01 1 (5 3
Substituting equations (3 and (5 into the last line of equation (4 produces (2 t 10 t 01 E + (t 10 + t 01 1(a 0 r 10 + a 1 r 01 = (1, 1, 0 (6 Adding the second and third lines of equation (4, Subtracting equation (7 from (6, we have (2 t 10 t 01 E + t 10 r 10 + t 01 r 01 = (1, 1, 0 (7 [a 0 (t 10 + t 01 1 t 10 ]r 10 + [a 1 (t 10 + t 01 1 t 01 ]r 01 = 0 (8 The linear independence of r 10 and r 01 guarantees that the coefficients in equation (8 must both be zero. We now have two linear equations in the two unknowns t 10 and t 01. These are easily solved, and together with equation (5 produce t 00 = 1, t 10 = a 0 a 0 + a 1 1, t 01 = a 1 a 0 + a 1 1, t 11 = 1 a 0 + a 1 1 It turns out that computing t 11 explicitly in terms of a 0 and a 1 is not necessary to construct the fractional linear transformations. At this time we may attempt to construct E and N, but this is not necessary to actually construct the perspective projection. Instead, consider a rotated quadrilateral point r that is projected to a square point (x 0, x 1, 0, as illustrated in Figure 1. We may write r as a linear combination of two rotated quadrilateral edges, r = y 0 r 10 + y 1 r 01 (10 The ray equation for the pair of points is (x 0, x 1, 0 = E + t(r E = E + t(y 0 r 10 + y 1 r 01 E (11 for some scalar t. Substituting our now known values for t i0i 1 into the second and third lines of equation (4, solving for r 10 and r 01, replacing in equation (11, and grouping like terms produces ( 1 ty 0(1 t 10 ty ( ( 1(1 t 01 ty0 ty1 t E + x 0 (1, 0, 0 + x 1 (0, 1, 0 = (0, 0, 0 (12 t 10 t 01 t 10 t 01 By linear independence of E, (1, 0, 0, and (0, 0, 1, the coefficients in equation (12 must all be zero. This leads to the fractional linear transformation that maps the quadrilateral to the square (x 0, x 1 = (a 1(a 0 + a 1 1y 0, a 0 (a 0 + a 1 1y 1 a 0 a 1 + a 1 (a 1 1y 0 + a 0 (a 0 1y 1 (13 The fractional linear transformation that maps the square to the quadrilateral is (y 0, y 1 = (a 0 x 0, a 1 x 1 (a 0 + a 1 1 + (1 a 1 x 0 + (1 a 0 x 1 (14 Example. Intuitively, perspective transformations map lines to lines. They also map conic sections to conic sections. The proof is straightforward. Let the perspective transformation be y 0 = (a 0 x 0 +a 1 x 1 +a 2 /(c 0 x 0 + c 1 x 1 + c 2 and y 1 = (b 0 x 0 + b 1 x 1 + b 2 /(c 0 x 0 + c 1 x 1 + c 2. The inverse transformation is of the same form, so (9 4
it suffices to show a conic section in (y 0, y 1 coordinates satisfying Ay 2 0 + By 0 y 1 + Cy 2 1 + Dy 0 + Ey 1 + F = 0 is mapped to a conic section in (x 0, x 1 coordinates satisfying Āx2 0 + Bx 0 x 1 + Cx 2 1 + Dx 0 + Ēx 1 + F = 0. Substituting the formulas for y 0 and y 1 into the quadratic equation, multiplying by (c 0 x 0 + c 1 x 1 + c 2 2, expanding the products, and grouping the appropriate terms yields a quadratic in x 0 and x 1. The left image in Figure 2 shows a square containing several conic sections. The top curve is a parabola, the left and right curves are hyperbolas, the center curve is a circle, and the bottom curve is an ellipse. The grid consists of straight lines. The right image in Figure 2 shows a perspective mapping of these conic sections. Figure 2. Projection of conic sections in a square to conic sections in a quadrilateral. 3 Perspective Mapping from a Cuboid to a Cube The ideas of the previous section extend to 3D. A cuboid is a convex polyhedron that has 6 planar faces. For example, a typical view frustum is a cuboid. A cube is a special case of a cuboid, all six faces are squares, and opposite faces are parallel. Let the vertices of the cuboid be denoted q i0i 1i 2 where i j {0, 1}. The canonical cube vertices are (i 0, i 1, i 1. To project perspectively the cuboid to the cube, we can embed the problem in 4D. The view hyperplane is y 3 = 0 and the eyepoint is E = (e 0, e 1, e 2, e 3 where e 3 0. The cuboid is translated to the origin and then rotated into a hyperplane with normal N = (n 0, n 1, n 2, n 3. The rotation matrix is R and the rotated and translated vertices are r i0i 1i 2 = R(q i0i 1i 2 q 000, 0. Our goal is that for some scalars t i0i 1i 2. E + t i0i 1i 2 (r i0i 1i 2 E = (i 0, i 1, i 2 0, i j {0, 1} (15 The vector q 111 q 000 can be represented as a linear combination of three cuboid edges, namely, q 111 q 000 = a 0 (q 100 q 000 + a 1 (q 010 q 000 + a 2 (q 001 q 000 (16 5
The coefficients a 0, a 1, and a 2 are constructed as the solution of three linear equations in three unknowns. These coefficients show up in the equations for the perspective projection. The convexity of the cuboid guarantees that a 0 0, a 1 0, a 2 0, and a 0 + a 1 + a 2 > 1. Rotation preserves the relative positions of the vertices, so we know that r 111 = a 0 r 100 + a 1 r 010 + a 2 r 001 (17 Dotting the relationships in equation (15 with the normal vector produces (1 t 100 N E = n 0, (1 t 110 N E = n 0 + n 1 (1 t 010 N E = n 1, (1 t 101 N E = n 0 + n 2 (1 t 001 N E = n 2, (1 t 011 N E = n 1 + n 2 (1 t 111 N E = n 0 + n 1 + n 2 Because N E 0, we obtain t 110 = t 100 + t 010 1, t 101 = t 100 + t 001 1, t 011 = t 010 + t 001 1, t 111 = t 100 + t 010 + t 001 2 (18 Substituting the relationship for t 111 from equation (18 into the equation of (15 involving t 111 leads to (3 t 100 t 010 t 001 E + (t 100 + t 010 + t 001 2(a 0 r 100 + a 1 r 010 + a 2 r 001 = (1, 1, 1 (19 Adding the relationships of equation (15 involving t 100, t 010, and t 001 produces (3 t 100 t 010 t 001 E + t 100 r 100 + t 010 r 010 + t 001 r 001 = (1, 1, 1 (20 Subtracting equation (20 from (19, we have [a 0 (t 100 + t 010 + t 001 2 t 100 ]r 100 + [a 1 (t 100 + t 010 + t 001 2 t 010 ]r 010 + [a 2 (t 100 + t 010 + t 001 2 t 001 ]r 001 = 0 (21 The linear independence of r 100, r 010, and r 001 guarantees that the coefficients in equation (21 must all be zero. We now have three linear equations in three unknowns t 100, t 010, and t 001. These are easily solved, and together with previous equations produce t 000 = 1, t 100 = t 010 = t 001 = 2a 0 a 0+a 1+a 2 1, 2a 1 a 0+a 1+a 2 1, 2a 2 a 0+a 1+a 2 1, t 110 = +a0+a1 a2+1 a 0+a 1+a 2 1 t 101 = +a0 a1+a2+1 a 0+a 1+a 2 1 t 011 = a0+a1+a2+1 a 0+a 1+a 2 1 t 111 = +a0+a1+a2+1 a 0+a 1+a 2 1 (22 It turns out that computing t 110, t 101, t 011, and t 111 explicitly in terms of a 0, a 1, and a 2 is not necessary to construct the fractional linear transformations. As in the 2D problem, we can construct the fractional linear transformations for the projection without explicitly constructing E and N. We may write a rotated interior cuboid point r as r = y 0 r 100 + y 1 r 010 + y 2 r 001 (23 6
The ray equation for the pair of points is (x 0, x 1, x 2, 0 = E + t(r E = E + t(y 0 r 100 + y 1 r 010 + y 2 r 001 E (24 for some scalar t. Substituting the values for t i0i 1i 2 into equation (15; solving for r 100, r 010, and r 001 ; replacing in equation (24, and grouping like terms produces ( ty1(1 t010 t 010 1 ty0(1 t100 t ( 100 + ty0 t 100 x 0 (1, 0, 0, 0 + ty2(1 t001 t 001 E ( ty1 t 010 x 1 (0, 1, 0, 0 + ( ty2 t 001 x 2 (0, 0, 1, 0 = (0, 0, 0, 0 By linear independence of E, (1, 0, 0, 0, (0, 1, 0, 0, and (0, 0, 1, 0, the coefficients in equation (25 must all be zero. This leads to the fractional linear transformation that maps the cube to the cuboid, (y 0, y 1, y 2 = (25 2(a 0 x 0, a 1 x 1, a 2 x 2 d 0 x 0 + d 1 x 1 + d 2 x 2 + d 3 (26 where d 0 = a 0 a 1 a 2 + 1, d 1 = a 0 + a 1 a 2 + 1, d 2 = a 0 a 1 + a 2 + 1, and d 3 = a 0 + a 1 + a 2 + 1. The fractional linear transformation that maps the cube to the cuboid is (x 0, x 1, x 2 = d 3 (a 1 a 2 y 0, a 0 a 2 y 1, a 0 a 1 y 2 2a 0 a 1 a 2 a 1 a 2 d 0 y 0 a 0 a 2 d 1 y 1 a 0 a 1 d 2 y 2 (27 Example. When drawing a view volume in 3D graphics, one typically creates a view frustum with near plane, far plane, left plane, right plane, bottom plane, and top plane. The vertices of geometric primitives are transformed from model space to world space, then from world space to view space (camera space. The view-space vertices are finally transformed to clip space (homogeneous coordinates by a 4 4 perspective projection matrix. The rasterizer does the clipping in homogeneous coordinates and then performs the perspective divide to obtain the normalized window coordinates (x, y [ 1, 1] 2 and normalized depth z [0, 1] (for DirectX; OpenGL maps depth to [ 1, 1]. The viewport is the full rectangular window, but it may be chosen to be an subwindow specified as an axis-aligned rectangle. It is possible to specify a viewport that is a convex quadrilateral (nonrectangular. Let this viewport have vertices in camera coordinates, q i0i 10 for i j {0, 1}, and let them be counterclockwise oriented. This polygon acts as the near face of a cuboidal view volume. The quadrilateral may be extruded toward the far plane of the view frustum, thus constructing the far face of the cuboidal view volume. Let the camera forward direction be D. Let n be the near-plane distance from the eyepoint and let f be the far-plane distance from the eyepoint. The viewport vertex q i0i 10 is on the near plane, so D q i0i 1 = n. The corresponding far-plane vertex is q i0i 11 = (f/nq i0i 1 so that D q i0i 11 = f. We may construct a projection matrix that maps the cuboidal view volume to the cube [ 1, 1] 2 [0, 1]. Define U 0 = q 100 q 000, U 1 = q 010 q 000, and U 2 = q 001 q 000. Define the matrix M = [U 0 U 1 U 2 ] whose columns are the specified vectors. Define a to be the column vector whose components are the a i in the fractional linear transformations. Then a = M 1 (q 111 q 000 In homogeneous block form, a 1 = M q 000 0 T 1 1 q 111 1 = M 1 M 1 q 000 0 T 1 q 111 1 = M 1 (q 111 q 000 1 7
Define H to be the homogeneous matrix whose upper-left block is the matrix M and whose translation component is the vector q 000. The projection of equation (27 maps the cuboidal volume to the canonical cube [0, 1] 3. We need to perform one additional scaling and translation to map (x 0, x 1 from [0, 1] 2 to [ 1, 1] 2. With this additional transformation, the cuboid-to-cube mapping may be written in homogeneous form as P = a 2 a 0 (2d 3 + d 0 a 2 a 1 d 1 d 2 2a 2 a 2 a 0 d 0 a 2 a 1 (2d 3 + d 1 d 2 2a 2 0 0 d 3 0 a2 a 0 d 0 a2 a 1 d 1 d 2 2a 2 where the output (y 0, y 1, y 2, w is processed by the rasterizer in the usual manner, performing the perspective divide by w to obtain the cube points. The projection matrix from camera space to clip space is therefore P H 1. The matrix H 1 maps camera coordinates to cuboid coordinates relative to three linearly independent edges of the cuboid. These points are then perspectively projected by P into the cube. 4 Perspective Mapping from a Hypercuboid to a Hypercube The ideas in the previous sections generalize to higher dimensions. Let the d-dimensional hypercuboid have vertices q I, where I = (i 1, i 2,... i d is a multiindex with components i j {0, 1}. The canonical hypercube vertices are I, with bold face used to distinguish between vectors and multiindices. The view hyperplane is y d = 0 and the eyepoint is E = (e 1, e 2,..., e d with e d 0. The hypercuboid is translated to the origin and then rotated into a hyperplane with normal N = (n 1, n 2,..., n d. Let R be the corresponding rotation matrix and define the rotated hypercuboid vertices by r I = R(q I q O, where O is the multiindex of all zeros. Our goal is that E + t I (r I E = (I, 0 (28 for some scalars t I. Let U be the multiindex of all ones. The vector q U q O may be written as q U q O = a j (q Bj q O (29 where B j is the multiindex with 0 in all components except for a 1 in component j. The coefficients are nonnegative and d a j > 0. By linearity of the rotation matrix, r U = a j r Bj (30 Let I = (i 1, i 2,..., i d be a multiindex with at least one nonzero component. Suppose that I has m components that are 1; it is the case that 1 m d. Let those components be i j1 through i jm. Dotting the ray 8
equation with the normal vector, we have (1 t I N E = m n ik (31 Whenever m 2, the t I variable is related to t Bj variables by summations. Specifically, I = m k=1 i kb ik and m m (1 t I N E = n ik = (1 t Bik N E (32 k=1 Therefore, once we determine the relationships between t Bi and a j, the t I with 2 or more 1-valued indices are determined. It turns out that for the construction of the fractional linear transformations, we only need to know t U explicitly in terms of the a j, t U = k=1 k=1 t Bi (d 1 (33 Substituting the relationship for t U into equation (28 leads to ( ( d t Bi E + t Bj (d 1 a i r Bi = U (34 Adding the relationships of equation (28 involving the t Bi produces ( d t Bi E + t Bi r Bi = U (35 Subtracting the two equations, we have a i t Bj (d 1 t Bi r Bi = 0 (36 The linear independence of the r Bi guarantees that all the coefficients in this equation are zero. This leads to an invertible linear system of d equations in the d unknowns t B1 through t Bd. As in dimensions 2 and 3, we need not construct E and N explicitly. We may write a rotated interior hypercuboid point r as r = y i r Bi (37 The ray equation for the pair of points is ( (x 1,..., x d, 0 = E + t(r E = E + t y i r Bi E for some scalar t. Substituting the values for t Bi into equation (28, solving for r Bi, replacing in equation (38, and grouping like terms produces ( ty i (1 t Bi ( tyi 1 E + x i (B i, 0 = 0 (39 t Bi t Bi 9 (38
By linear independence of E and the (B i, 0, the coefficients in equation (39 must all be zero. This leads to the fractional linear transformation that maps the hypercube to the hypercuboid, y i = (d 1a i x i d k=1 c kx k + c d+1, 1 i d (40 where c k = b kj a j + 1, 1 k d; c d+1 = d 2, k = j a j 1; b kj = 1, k j The fractional linear transformation that maps the hypercuboid to the hypercube is [ d ] a j c d+1 (y i /a i x i = [ d ] ( a j (d 1, 1 i d (42 d c j(y j /a j The products of the a j appear as if they cancel, but if any of the a j are zero, you need to formally multiply through by the product and then evaluate the expression using the a j. (41 10