2D and 3D Transformations
AUI Course
Denbigh Starkey

1. Introduction
2. 2D transformations using Cartesian coordinates
   2.1 Translation
   2.2 Rotation
   2.3 Scaling
3. Introduction to homogeneous coordinates
4. 2D transformations using homogeneous coordinates
5. Rotations and scaling relative to a fixed point
6. Comparison of Cartesian and homogeneous approaches
7. 3D transformations using homogeneous coordinates
8. Example
1. Introduction

Translation, rotation, and scaling are the most common operations performed in computer graphics, so doing them efficiently and effectively is critical. I'll have two sets of notes related to these operations: this page, and a page on using quaternions for rotations about arbitrary axes, which I'll cover later.

All three transformations are affine, which means that if they are applied to a straight line then the result is still a straight line. So to transform a polygonal object all we have to do is transform the finite number of vertices (control points) and then redraw the lines or polygons. Affine transformations also preserve spline curves and surfaces, which I'll be covering later in this course, and so when transforming splines we also only have to transform the control points that define the curves and surfaces.

The biggest complication when dealing with these transformations is that to do them efficiently we need to use a whole new coordinate system. You've been using Cartesian coordinates for years, and are probably happy with them, but as I'll show in these notes, using homogeneous coordinates, where for example a 2D point is represented with three components, is much more efficient.

The basic approach is that I'll first use Cartesian coordinates for 2D transformations, and will then introduce and use 2D homogeneous coordinates. This lets me compare the two approaches in the simpler 2D case. Then, having shown how much better the homogeneous coordinates are, I'll use them for 3D transformations, which are the most common in graphics. I'll finish up with an example.

There are other types of affine transformations which I won't cover because they are less important. In particular I won't cover shears.
2. 2D Transformations Using Cartesian Coordinates

2.1 Translation

Translation shifts an object in space without changing its size or orientation. I.e., it adds a fixed $(\delta x, \delta y)$ to every control point. In equation terms, for every point $P = (x, y)$ we get a new point $P' = (x', y')$, where we can either say

\[ P' = (\delta x, \delta y) + P \]

if we are comfortable with point arithmetic, or

\[ x' = \delta x + x, \qquad y' = \delta y + y \]

using more traditional equations. If we move to vector notation, the points $P$ and $P'$ are defined by

\[ P = \begin{pmatrix} x \\ y \end{pmatrix} \quad\text{and}\quad P' = \begin{pmatrix} x' \\ y' \end{pmatrix}. \]

Then we can write

\[ P' = \begin{pmatrix} \delta x \\ \delta y \end{pmatrix} + P \quad\text{or}\quad P' = T + P, \]

where $T$ is the column vector of $\delta$s.

E.g., if we have a simple house as shown below

[Figure: a simple house with five vertices.]
and we translate by (2, 1), then the vector $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$ will be added to each of the five control points (vertices) to get

[Figure: the house before and after the translation.]

The dotted lines show the original house and the solid lines the transformed house.

2.2 Rotation

For rotations we need a little bit of trigonometry. The convention in graphics is that rotations are specified counterclockwise (ccw). In the picture below we have the original point $P$, which is rotated by $\theta$ (degrees,
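radians, or whatever) around the origin. This gives a new point, $P'$, which will be the same distance from the origin. Call that distance, the radius of the sweep, $R$.

[Figure: the point $P = (x, y)$ at angle $\phi$ from the x axis and the rotated point $P' = (x', y')$ at angle $\theta + \phi$, both at distance $R$ from the origin. $P$ has components $R\cos(\phi)$ and $R\sin(\phi)$; $P'$ has components $R\cos(\theta + \phi)$ and $R\sin(\theta + \phi)$.]

Now to do the math, we need to know the original angle between the line from the origin to $P$ and the x axis. Call that $\phi$ as shown. Basic trigonometry tells us that

\[ x = R\cos(\phi) \tag{1} \]
\[ y = R\sin(\phi) \tag{2} \]

and

\[ x' = R\cos(\phi + \theta) = R\cos(\phi)\cos(\theta) - R\sin(\phi)\sin(\theta) \tag{3} \]
\[ y' = R\sin(\phi + \theta) = R\cos(\phi)\sin(\theta) + R\sin(\phi)\cos(\theta) \tag{4} \]

We can now use the substitutions from (1) and (2) to get rid of the $R$ and $\phi$ in (3) and (4) to get:

\[ x' = x\cos(\theta) - y\sin(\theta) \]
\[ y' = x\sin(\theta) + y\cos(\theta) \]

In matrix terms, $P' = R\,P$, where the rotation matrix $R$ (not the radius $R$ used above) is

\[ R = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}. \]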
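The rotation equations above can be checked numerically. This is a minimal sketch using only Python's standard library; the function name `rotate` and the sample point (3, 1) are choices made here for illustration.

```python
import math

def rotate(point, theta_deg):
    """Rotate a 2D point counterclockwise about the origin by theta_deg degrees."""
    x, y = point
    theta = math.radians(theta_deg)
    c, s = math.cos(theta), math.sin(theta)
    # x' = x cos(theta) - y sin(theta),  y' = x sin(theta) + y cos(theta)
    return (x * c - y * s, x * s + y * c)

p = (3.0, 1.0)
q = rotate(p, 90)
print(q)  # approximately (-1.0, 3.0), up to floating point rounding

# Rotation about the origin keeps the distance R from the origin unchanged.
print(math.hypot(*p), math.hypot(*q))
```

Because cos(90°) is not exactly zero in floating point, the printed coordinates can differ from the exact answer in the last few digits.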
E.g., if we want to rotate our standard house by 90° (reminder: cos(90°) = 0 and sin(90°) = 1) we will get expressions of the form:

\[ P' = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} P. \]

E.g., looking at the house corner at (3, 1),

\[ P' = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 3 \end{pmatrix} \]

says that the new point corresponding to (3, 1) is (-1, 3). In the figure below I have shown this 90° ccw rotation with two lines to make the rotation a bit more clear.

[Figure: the house and its 90° ccw rotation about the origin.]

Obviously one possible surprise is that the house has swung away from its original location because we are rotating around the origin. We will deal with this problem later when we learn how to rotate around a fixed point in the figure (e.g., the lower left corner or the center of gravity of the house).

2.3 Scaling

Scaling lets the user stretch or squash the object in both the x and y directions. E.g., one might want to double the width of the object and have
2/3 of the height, which would be a scale of (2, 2/3). Mathematically scaling is defined as:

\[ x' = S_x\,x, \qquad y' = S_y\,y \]

where $S_x$ and $S_y$ are the scaling factors, and $(S_x, S_y)$ is the scale. In matrix terms, $P' = S\,P$ where

\[ S = \begin{pmatrix} S_x & 0 \\ 0 & S_y \end{pmatrix}. \]

E.g., applying the (2, 2/3) scale that I discussed above to the standard house, we get the picture below.

[Figure: the house before and after the (2, 2/3) scale.]

Note that the house has also shifted right and down because of the scale. We can avoid this by fixing a point (e.g., the lower left) and I'll look at how to do this later.

If the scaling factors are equal (i.e., $S_x = S_y$) then the scale is called uniform. As the above example shows, non-uniform scaling leads to distortion of the shape of the object. There are many times when we use non-uniform scaling. E.g., if we have a model of a house and we want to create a street of houses that all look
somewhat different then we can scale the basic house in different ways and drop each new house onto lots down the street.

If one of the scaling factors is zero, then the object is projected (squashed down) onto the opposite axis. E.g., if we apply $S_x = 0$ and $S_y = 1$ to the house then the result will be the line on the y axis shown as the wider line below:

[Figure: the house and its projection onto the y axis.]
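Translation and scaling in Cartesian coordinates amount to one addition or one multiplication per component. The sketch below is illustrative only: the vertex list `HOUSE` is a hypothetical stand-in for the house in the figures, and the translation (2, 1) and scale (2, 2/3) are values assumed here for the demonstration.

```python
# Hypothetical house vertices, assumed for illustration only.
HOUSE = [(1, 1), (3, 1), (3, 3), (2, 4), (1, 3)]

def translate(points, dx, dy):
    """Add the fixed offset (dx, dy) to every control point."""
    return [(x + dx, y + dy) for x, y in points]

def scale(points, sx, sy):
    """Multiply every x by sx and every y by sy (scaling about the origin)."""
    return [(sx * x, sy * y) for x, y in points]

moved = translate(HOUSE, 2, 1)
stretched = scale(HOUSE, 2, 2 / 3)   # non-uniform: wider and shorter
flattened = scale(HOUSE, 0, 1)       # S_x = 0 projects onto the y axis

print(moved)
print(stretched)
print(flattened)
```

Since the scale is taken about the origin, the lower left vertex of `stretched` has a larger x and a smaller y than the original vertex, which is the "shifted right and down" effect described in the text.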
3. Introduction to Homogeneous Coordinates

A 2D homogeneous coordinate is a triple $(x, y, w)$ where $w \neq 0$. It is the same point as the Cartesian coordinate $\left(\frac{x}{w}, \frac{y}{w}\right)$. So to convert a homogeneous coordinate to its Cartesian coordinate, just divide the first two components by the third component, which is called the weight. E.g., the four homogeneous coordinates (2, -3, 1), (-2, 3, -1), (20, -30, 10), and (-1, 1.5, -0.5) all represent the same point, which in Cartesian coordinates is the point (2, -3).

When doing transformations we will only use homogeneous points where the weight is 1. So any Cartesian point $(x, y)$ will be represented by the homogeneous point $(x, y, 1)$. This might seem to give no benefits, but as we will see we can use this extra 1 to greatly increase the efficiency of computing transformations. Projections, which I won't be covering, use general homogeneous coordinates where $w \neq 1$.
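The conversion between the two coordinate systems is a one-line computation in each direction. A minimal sketch follows; the function names and the sample triples are chosen here for illustration.

```python
def to_cartesian(h):
    """Convert a 2D homogeneous triple (x, y, w) to Cartesian (x/w, y/w)."""
    x, y, w = h
    if w == 0:
        raise ValueError("the weight w must be nonzero")
    return (x / w, y / w)

def to_homogeneous(p):
    """Represent the Cartesian point (x, y) with weight 1."""
    x, y = p
    return (x, y, 1)

# Triples that differ only by a common nonzero factor name the same point.
samples = [(2, -3, 1), (-2, 3, -1), (20, -30, 10), (-1, 1.5, -0.5)]
for h in samples:
    print(h, "->", to_cartesian(h))
```

Every sample maps to the same Cartesian point because multiplying all three components by a common nonzero factor does not change the two ratios.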
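4. 2D Transformations Using Homogeneous Coordinates

Any points $P$ and $P'$ will now have the vector form

\[ P = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \quad\text{and}\quad P' = \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}. \]

It is easy to confirm that the equations for 2D translate, rotate, and scale can be written as

Translate:
\[ P' = \begin{pmatrix} 1 & 0 & \delta x \\ 0 & 1 & \delta y \\ 0 & 0 & 1 \end{pmatrix} P \]

Rotate:
\[ P' = \begin{pmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix} P \]

Scale:
\[ P' = \begin{pmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & 1 \end{pmatrix} P \]

E.g., if we want to translate by (2, 1) then we will compute:

\[ \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}, \]

which, when we multiply the two matrices on the right, gives the three equations $x' = x + 2$, $y' = y + 1$, and $1 = 1$,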
which give the point that we want.

So far we seem to have lost when going from Cartesian to homogeneous, since we have traded 2×2 matrix multiplications for 3×3 multiplications when doing rotation or scaling, and have traded a two-element vector addition for a 3×3 matrix multiplication when translating. However, as we will see below, having all three operations as multiplications lets us create a combined transformation matrix for a sequence of operations, instead of having to do them one at a time, and this is what gives the homogeneous approach its efficiency.

Before we do the comparison we'll first look at how to stop objects flying away by doing rotation and scaling with a fixed point.
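The three homogeneous matrices of this section can be written down directly. The sketch below uses plain nested lists rather than a matrix library; the function names and the sample points are assumptions made for illustration.

```python
import math

def translation(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]

def rotation(theta_deg):
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def apply(m, p):
    """Multiply a 3x3 matrix by the homogeneous column vector p = (x, y, 1)."""
    return tuple(sum(m[r][c] * p[c] for c in range(3)) for r in range(3))

t = apply(translation(2, 1), (3, 1, 1))   # x' = x + 2, y' = y + 1, weight stays 1
r = apply(rotation(90), (3, 1, 1))        # approximately (-1, 3, 1)
s = apply(scaling(2, 2 / 3), (3, 3, 1))   # approximately (6, 2, 1)
print(t, r, s)
```

In all three cases the third component of the result is 1, so the output is again a homogeneous point with weight 1.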
5. Rotations and Scaling Relative to a Fixed Point

Say we want to take our standard house and rotate it 90° around its lower left hand corner, which is at (1, 1) Cartesian. I.e., we want to produce the house shown below:

[Figure: the house rotated 90° ccw about its lower left hand corner.]

We will do this in three steps. First we'll translate the house so that its lower left hand corner is at the origin, which is a translation of (-1, -1). Then we'll do the 90° rotation, where the lower left hand corner will be fixed at the origin. Finally we'll translate it back so that the lower left hand corner is back at (1, 1), which is a translation of (1, 1). I.e., we'll compute

\[ P' = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix} P, \]

which is

\[ P' = T(1, 1)\,R(90^\circ)\,T(-1, -1)\,P. \]

Note that the order of the three operations is right to left, since first $T(-1, -1)$ is applied to $P$, then $R(90^\circ)$, then $T(1, 1)$. In general, if we want to rotate by $\theta$ fixing the point $(x, y)$ we need the computation

\[ P' = T(x, y)\,R(\theta)\,T(-x, -y)\,P. \]
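The translate, rotate, translate back recipe can be sketched as a product of three matrices. The helper names below (`rotate_about`, `matmul`, `apply`) are hypothetical, and the fixed point (1, 1) and the test point (3, 1) are values assumed for the demonstration.

```python
import math

def translation(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]

def rotation(theta_deg):
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def apply(m, p):
    """Apply m to the Cartesian point p via its homogeneous form (x, y, 1)."""
    h = (p[0], p[1], 1)
    return tuple(sum(m[r][c] * h[c] for c in range(3)) for r in range(2))

def rotate_about(theta_deg, fx, fy):
    # Read right to left: T(-fx, -fy) first, then R(theta), then T(fx, fy).
    return matmul(translation(fx, fy),
                  matmul(rotation(theta_deg), translation(-fx, -fy)))

m = rotate_about(90, 1, 1)
corner = apply(m, (1, 1))   # the fixed point, stays at about (1, 1)
moved = apply(m, (3, 1))    # ends up at about (1, 3)
print(corner, moved)
```

Swapping the two translations would rotate about (-1, -1) instead, which is why the right-to-left order matters.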
We can do the same sort of thing when scaling and fixing a point. E.g., say we want to do the scale by (2, 2/3) that we showed earlier, but we now want to fix the center of the house, which is at (2, 2). I.e., we'd like to produce the scale shown below:

[Figure: the house scaled by (2, 2/3) with its center fixed.]

To do this we just use

\[ P' = T(2, 2)\,S(2, \tfrac{2}{3})\,T(-2, -2)\,P, \]

following the same logic that we used for rotations around a fixed point. In general the equation to scale by $(S_x, S_y)$ fixing the point $(x, y)$ will be:

\[ P' = T(x, y)\,S(S_x, S_y)\,T(-x, -y)\,P. \]
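Scaling about a fixed point follows the same three-step pattern. In this sketch the helper names are hypothetical, and the scale (2, 2/3), the fixed point (2, 2), and the test vertex (1, 1) are assumed values used only to exercise the formula.

```python
def translation(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def matmul(a, b):
    """3x3 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def apply(m, p):
    """Apply m to the Cartesian point p via its homogeneous form (x, y, 1)."""
    h = (p[0], p[1], 1)
    return tuple(sum(m[r][c] * h[c] for c in range(3)) for r in range(2))

def scale_about(sx, sy, fx, fy):
    # Read right to left: T(-fx, -fy), then S(sx, sy), then T(fx, fy).
    return matmul(translation(fx, fy),
                  matmul(scaling(sx, sy), translation(-fx, -fy)))

m = scale_about(2, 2 / 3, 2, 2)
center = apply(m, (2, 2))   # the fixed point, stays at about (2, 2)
vertex = apply(m, (1, 1))   # moves to about (0, 4/3)
print(center, vertex)
```

The vertex moves away from the fixed point in x (factor 2) and toward it in y (factor 2/3), while the fixed point itself does not move.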
6. Comparison of Cartesian and Homogeneous Approaches

Although it appears that the homogeneous approach is much more complicated than the Cartesian approach, it provides dramatic performance benefits if we want to apply a number of transformations to a figure with a fairly large number of points, which is the normal situation in a graphics program.

For example, say one has a relatively simple graphics scene with 5,000 control points to be transformed, and one wants to first rotate with some fixed point, then scale with another fixed point, and then rotate again with a third fixed point. In the Cartesian environment we'll have to do something like

\[ P' = T_6 + (R_2 (T_5 + (T_4 + (S (T_3 + (T_2 + (R_1 (T_1 + P)))))))) \]

5,000 times. We can simplify things a bit by pre-computing $T_7 = T_5 + T_4$ and $T_8 = T_3 + T_2$, but the computation done 5,000 times still looks like

\[ P' = T_6 + R_2 (T_7 + S (T_8 + R_1 (T_1 + P))). \]

With homogeneous coordinates this initially looks worse, because the additions become matrix multiplications, as in

\[ P' = T_6 (R_2 (T_5 (T_4 (S (T_3 (T_2 (R_1 (T_1 P)))))))) \]

but matrix multiplication is associative and so this is equivalent to

\[ P' = (T_6 R_2 T_5 T_4 S T_3 T_2 R_1 T_1)\,P. \]

We can now (one time only) compute the Combined Transformation Matrix, say $C$, as

\[ C = T_6 R_2 T_5 T_4 S T_3 T_2 R_1 T_1 \]

and then, 5,000 times, just compute $P' = C\,P$. This will cost about 20,064 additions and 20,048 multiplications, whereas the Cartesian approach will take 70,004 additions and 60,000
multiplications. And this was for a trivial case. As the number of transformation operations goes up, or when we move from 2D to 3D, the gain in efficiency increases significantly.

One thing to note here is that the bottom row of all three homogeneous matrices is (0, 0, 1), and if you multiply two matrices like this the result will also have the same bottom row. I.e., all general 2D homogeneous multiplications will have the form:

\[ \begin{pmatrix} a & c & e \\ b & d & f \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} g & i & k \\ h & j & l \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} ag + ch & ai + cj & ak + cl + e \\ bg + dh & bi + dj & bk + dl + f \\ 0 & 0 & 1 \end{pmatrix} \]

and so only 12 multiplies and 8 additions are needed to compute the new matrix.

Further efficiencies exist for some specific multiplications. For example, it is easy to see that multiplying two homogeneous translation matrices takes two additions and no multiplications, because one is just adding the two translations. Multiplying a rotation or scaling matrix on the left by a translation matrix requires no additions and no multiplications, since the translation just fills in the last column. So further efficiencies are possible when computing the combined transformation matrix, but this is only done once, so they are not very important.
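The combined transformation matrix idea can be sketched as follows. The product routine uses the 12-multiply, 8-addition form given above; the particular sequence of operations, the fixed points, and the point list are hypothetical, and integer-valued matrices are used so that the comparison is exact.

```python
from functools import reduce

def affine_product(m, n):
    """m * n for 3x3 matrices with bottom row (0, 0, 1): 12 multiplies, 8 additions."""
    (a, c, e), (b, d, f), _ = m
    (g, i, k), (h, j, l), _ = n
    return [
        [a * g + c * h, a * i + c * j, a * k + c * l + e],
        [b * g + d * h, b * i + d * j, b * k + d * l + f],
        [0, 0, 1],
    ]

def apply(m, p):
    """Transform the Cartesian point p = (x, y): 4 multiplies and 4 additions."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

def T(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]

def S(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

R90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # 90 degrees ccw, exact integers

# Nine operations in the order they are applied: rotate about (1, 1),
# scale about (2, 2), rotate about (3, 0). All values are illustrative.
steps = [T(-1, -1), R90, T(1, 1),
         T(-2, -2), S(2, 3), T(2, 2),
         T(-3, 0), R90, T(3, 0)]

# Later operations multiply on the left, so C = steps[8] * ... * steps[0].
C = reduce(lambda acc, m: affine_product(m, acc), steps)

def one_at_a_time(p):
    for m in steps:
        p = apply(m, p)
    return p

points = [(1, 1), (3, 1), (3, 3), (2, 4), (1, 3)]
print(all(apply(C, p) == one_at_a_time(p) for p in points))
```

Building `C` costs eight matrix products once, after which each point needs only a single `apply`, whereas `one_at_a_time` repeats all nine steps for every point.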
7. 3D Transformations Using Homogeneous Coordinates

The efficiency gains for homogeneous coordinates over Cartesian coordinates are much higher when we go to 3D, and so there is no reason to consider Cartesian coordinates further. So from now on we will use 3D homogeneous coordinates, which in the general case will have the form $(x, y, z, w)$, but for now will be $(x, y, z, 1)$, as in the 2D case. The point, in vector form, will be:

\[ P = \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \]

Translation and scaling matrices follow directly from the 2D analysis, and are:

Translation:
\[ P' = \begin{pmatrix} 1 & 0 & 0 & \delta x \\ 0 & 1 & 0 & \delta y \\ 0 & 0 & 1 & \delta z \\ 0 & 0 & 0 & 1 \end{pmatrix} P \]

Scaling:
\[ P' = \begin{pmatrix} S_x & 0 & 0 & 0 \\ 0 & S_y & 0 & 0 \\ 0 & 0 & S_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} P \]

Rotation is more complicated, because it is not immediately clear what we should be rotating around in a 3D environment. We could rotate around an arbitrary axis, but most users (including me) often have a hard time visualizing what is going to happen if, for example, they rotate by 30° around an axis with direction vector (2, 3, -1). To avoid this problem we will only consider rotations around one of the three primary axes, x, y, and z. Any complex rotation will then have to be built up as a sequence of rotations around these axes. (We'll look at arbitrary rotations when we study quaternions.)
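The 3D translation and scaling matrices described above can be sketched as 4×4 nested lists. The function names and the sample points here are assumptions made for illustration.

```python
def translation3(dx, dy, dz):
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]

def scaling3(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def apply(m, p):
    """Apply a 4x4 matrix to the Cartesian point p via (x, y, z, 1)."""
    h = (p[0], p[1], p[2], 1)
    return tuple(sum(m[r][c] * h[c] for c in range(4)) for r in range(3))

print(apply(translation3(1, 0, 1), (1, 2, -1)))   # (2, 2, 0)
print(apply(scaling3(2, 1, 1), (3, 1, 0)))        # (6, 1, 0)
```

Only the first three rows are evaluated in `apply`, since the bottom row (0, 0, 0, 1) always reproduces the weight 1.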
The direction of rotation will, by generally accepted convention, be counterclockwise when looking down the axis towards the origin. I.e., as the picture below shows, a rotation around the x axis will move parallel to the y/z plane in the direction from y to z, a rotation around the y axis will move parallel to the z/x plane in the direction from z to x, and a rotation around the z axis will move parallel to the x/y plane in the direction from x to y.

[Figure: the x, y, and z axes with arrows showing the direction of rotation around x, around y, and around z.]

Before we get into matrices, we should note one potential problem, which is that if one does rotations by breaking them down into rotations around the primary axes, then the order of rotations matters. E.g., if we do a $\theta$ rotation around x followed by a $\phi$ rotation around z, then in general the result will be different from doing a $\phi$ rotation around z followed by a $\theta$ rotation around x.

To see this, consider a line lying along the x axis, presumably as an edge on some object. If we do a 90° rotation around x, it will spin in place and still lie on the x axis. If we follow that with a 90° rotation around z it will now lie along the y axis. If we reverse these operations, then the 90° rotation around z will put it on the y axis, and then the rotation around x will leave it lying along z.

The simplest rotation matrix to develop is the rotation around z, which we'll call $R_z$. When we do this the effect on x and y values will mirror the 2D case, and the z value will be unchanged, and so the formula will be:
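\[ P' = \begin{pmatrix} \cos(\theta) & -\sin(\theta) & 0 & 0 \\ \sin(\theta) & \cos(\theta) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} P \]

We can now derive $R_x$ and $R_y$ from $R_z$ by symmetry:

\[ R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(\theta) & -\sin(\theta) & 0 \\ 0 & \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad R_y(\theta) = \begin{pmatrix} \cos(\theta) & 0 & \sin(\theta) & 0 \\ 0 & 1 & 0 & 0 \\ -\sin(\theta) & 0 & \cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

The only thing to be careful of here is the location of the negative sign on the sin function in $R_y$. In $R_z$ the rotation is from x to y, and the negative sign is on the xy element (row 1, column 2) of the matrix. So since $R_y$ rotates from z to x, symmetry requires that the negative sign must be on the zx element of the matrix (row 3, column 1).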
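The three axis rotations, and the fact that their order matters, can be checked with a short sketch. The function names are hypothetical; the test point (1, 0, 0) stands in for the end of a line lying along the x axis.

```python
import math

def _cs(deg):
    t = math.radians(deg)
    return math.cos(t), math.sin(t)

def rot_x(deg):
    c, s = _cs(deg)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def rot_y(deg):
    c, s = _cs(deg)
    # The negative sine sits in row 3, column 1 (the zx element).
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def rot_z(deg):
    c, s = _cs(deg)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def apply(m, p):
    h = (p[0], p[1], p[2], 1)
    return tuple(sum(m[r][c] * h[c] for c in range(4)) for r in range(3))

p = (1, 0, 0)                                # a point on the x axis
x_then_z = matmul(rot_z(90), rot_x(90))      # rotation around x is applied first
z_then_x = matmul(rot_x(90), rot_z(90))      # rotation around z is applied first

a = apply(x_then_z, p)   # approximately (0, 1, 0): ends up on the y axis
b = apply(z_then_x, p)   # approximately (0, 0, 1): ends up on the z axis
print(a, b)
```

As in 2D, the matrix for the operation applied first is the rightmost factor in the product.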
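8. Example

Build the combined transformation matrix, $C$, for a clockwise rotation of 90° around the y axis fixing the point (1, 0, 1), followed by a scale of (2, 1, 1) fixing the point (0, 1, 1). Apply the matrix to the Cartesian point (1, 2, -1).

The order of the operations will be:

1. T(-1, 0, -1)
2. $R_y$(-90°) (remember that the default is ccw)
3. T(1, 0, 1)
4. T(0, -1, -1)
5. S(2, 1, 1)
6. T(0, 1, 1)

Remembering to put these right to left, and noting that the clockwise rotation changes the signs of the sines, we have:

\[ C = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

Calling these matrices A, B, D, E, F, and G, and taking these a pair at a time we have:

\[ AB = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad DE = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad FG = \begin{pmatrix} 0 & 0 & -1 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

\[ ABDE = \begin{pmatrix} 2 & 0 & 0 & 2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad C = ABDEFG = \begin{pmatrix} 0 & 0 & -2 & 4 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]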
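Applying this matrix to the Cartesian (1, 2, -1):

\[ \begin{pmatrix} 0 & 0 & -2 & 4 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 2 \\ 1 \\ 1 \end{pmatrix}, \]

and so (1, 2, -1) is transformed to (6, 2, 1).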
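A combined matrix of this kind can be built mechanically by multiplying the six matrices in right-to-left order. The sketch below assumes the following values for the example: a clockwise 90° rotation around y fixing (1, 0, 1), a scale of (2, 1, 1) fixing (0, 1, 1), and the test point (1, 2, -1). Treat these as assumptions of the sketch. The helper names are hypothetical, and exact integer matrices are used so that no rounding occurs.

```python
def matmul(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def T(dx, dy, dz):
    return [[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]]

def S(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

# R_y(-90 degrees): cos = 0 and sin = -1, so the signs of the sines flip.
RY_CW_90 = [[0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]

def apply(m, p):
    h = (p[0], p[1], p[2], 1)
    return tuple(sum(m[r][c] * h[c] for c in range(4)) for r in range(3))

# Operations in the order they are applied (assumed values, see lead-in).
ops = [T(-1, 0, -1), RY_CW_90, T(1, 0, 1),
       T(0, -1, -1), S(2, 1, 1), T(0, 1, 1)]

# Each later operation multiplies on the left of what has been built so far.
C = ops[0]
for m in ops[1:]:
    C = matmul(m, C)

for row in C:
    print(row)
print(apply(C, (1, 2, -1)))   # (6, 2, 1)
```

Under these assumptions `C` maps (x, y, z) to (4 - 2z, y, x), which can be read directly off its rows.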