A case study in geometric algebra: Fitting room models to 3D point clouds

BSc thesis (15 ECTS)

Author: Moos Hueting
Date: July 15, 2011
Supervisors: Dr. Marcel Worring, Dr. Daniël Fontijne

Abstract

Many geometrical problems have been researched thoroughly, but always using classical methods such as linear algebra as the framework for the problem. Because linear algebra is based on coordinates and numbers as basic elements of computation, this leads to long-winded and non-universal code. Geometric algebra is an alternative formalism in which geometric objects are the basic elements of computation. Using this formalism to represent geometrical problems can often yield more readable and more compact code. In this paper we present a case study of such a problem, specifically fitting room models to 3D point clouds, and the advantages geometric algebra has over classical methods in solving it.

Contents

1 Introduction
2 Context
3 Research question
4 Method
  4.1 Geometric algebra
    4.1.1 Overview of geometric algebra
    4.1.2 On compactness of expression
    4.1.3 Example: Plane through three points, two methods
  4.2 RANSAC
  4.3 Hough transform
    4.3.1 Nearest Neighbour Hough Transform
    4.3.2 Unique representation
  4.4 3D to 2D
5 Experiments
  5.1 Software
    5.1.1 GA implementation
    5.1.2 Generating the datasets
  5.2 Data
    5.2.1 Artificial set
    5.2.2 Real set
  5.3 RANSAC
  5.4 Hough transform
  5.5 Hough transform, 3D to 2D
  5.6 On computational speed
6 Conclusions
7 Discussion

1 Introduction

When tackling geometrical problems, one of the most common approaches is to model the problem in classical linear algebra and solve it in that formalism, using numbers as the basic elements of computation. This method has been used for a long time and yields satisfactory results (see section 2). However, using numbers as basic elements in a geometric problem generates long-winded expressions which are in most cases far from intuitive. Here lies a possibility for improvement.

A cleaner way of representing geometric problems would arise if we could represent the objects about which we are reasoning more directly. Geometric algebra (GA) [1], an alternative algebra for representing and computing with geometric objects and problems, fills that void. In GA, complete subspaces such as planes and lines are the elements of computation. As a result, computations can be done directly on these elements, without manually manipulating any of their coordinates. This compactness of expression leads to clean and compact code.

The advantages of this algebra over classical methods are easily stated but hard to clarify with an in-depth theoretical analysis. After all, going very deep into the nuts and bolts of the algebra will most likely not demonstrate the compactness of expression which is apparent when we deal with it on the surface. We have therefore chosen a case study to show the advantages of using GA over classical methods in geometric problems.

For the case study, we tackle the problem of creating 3D models from multiple 2D representations. This is a prime example of a geometrical problem which has been discussed using classical formalisms many times (see section 2). As we want to show the advantages of using GA for problems with a geometrical context, this problem makes for a good case study.

2D representations of the environment such as photographs have many uses, but are quite limited. The world we live in is not two-dimensional. By definition, when creating a 2D representation of a 3D world (one in which the dimensions are independent, as they are in ours), information is lost. Using 3D models of the environment thus creates new possibilities. Creating 3D models of any kind of environment from scratch using manual modeling software is expensive, and the process takes too much time to be used effectively on a regular basis. 2D imagery, on the other hand, is easy to create and quick to come by. Using multiple 2D views of an environment to create 3D models thus removes the difficulties posed by generating these models directly.

This method of reconstruction is twofold. First, using multiple view geometry, a 3D point cloud is created from the 2D imagery. Then, using this 3D point cloud, a 3D surface reconstruction of the captured scene is created. The generation of these point clouds is fairly straightforward and can be done in many different ways; in the following section we briefly discuss the research done on this topic. Once the point cloud has been generated, fitting it to a 3D surface is more challenging and has been the subject of many diverse approaches. All past efforts used classical, linear-algebra-based techniques. We discuss the same problem, but using the formalism of GA instead.

In section 2, we describe previous work on generating 3D surfaces from reconstructed point clouds, and briefly mention the work done on generating the point clouds themselves. Afterwards, we detail the research question posed in this paper and the added value of this work. In section 4 we describe the methods used, combining geometric algebra with classic techniques like RANSAC and Hough transforms. The experiments done with these methods and the results generated are presented in section 5. The conclusions that can be drawn from the research are discussed in section 6. Finally, in section 7 we discuss possible future work and relevant improvements.

2 Context

The problem in our case study is not a new one. Much research has been done on how to reconstruct 3D models from 2D imagery most effectively. This process is twofold. The first step, also called multiple view geometry, deals with reconstructing a 3D point cloud from the captured 2D data (in most cases photography). In the second step, the actual surface model of the environment is reconstructed from that point cloud.

The standard work on multiple view geometry was written by Hartley & Zisserman [2], which describes the complete process of generating 3D point clouds from 2D imagery. One recent method for generating a point cloud from multiple 2D representations of an environment was presented by Esteban [3]. It is based on creating a simulated stereo vision model using a single camera: two photographs taken shortly after each other represent two virtual cameras, which are used as input for a stereo vision model. This removes the need for a dual camera system when reconstructing 3D environments in real time.

When the point cloud has been constructed, it should be fitted to actual surface models to regenerate the environment from which the original data was captured. This particular problem has been studied numerous times. One method for regenerating the surface model of the environment from the generated point cloud is based on RANSAC and was presented by Schnabel et al. [4]. They proposed a modified version of RANSAC (which we will discuss in section 4.2) specifically tailored to finding a number of different shapes in a point cloud. Another method often used for finding specific shapes within a dataset is the Hough transform, which we will discuss in section 4.3. Although commonly used for detecting lines and circles in 2D datasets, it can also be used in 3D environments or point clouds derived from them. Recently, Borrmann et al. [5] devised a method of implementing the Hough transform in 3D environments, especially designed for finding planes.

We have seen that many methods have been proposed for resolving the problem at hand, and although a good deal of research has already been performed, none of it mentions geometric algebra. The compactness of expression inherent to GA when dealing with geometric problems should be apparent when applying it to our case. Although geometric algebra was first mentioned and discussed by Grassmann [6], we have turned mostly to a quite recent book by Dorst et al. [1] for the geometric algebra used in our research. It gives a complete overview of the algebra as well as how it can be applied in a computer-driven setting.

3 Research question

Much research has been conducted on reconstructing a 3D surface model from a reconstructed 3D point cloud. However, in all of this research the basic elements of computation are real numbers. This makes for tedious equations and computations, as the surfaces and points in question are only parametrized by their numerical constituents. It would be efficient if the points and surfaces could be viewed as elements themselves. This is where geometric algebra comes in.

Geometric algebra is an algebra in which different kinds of geometrical shapes and objects, such as lines and planes, are the basic elements of computation. Once they are created, operators are available which can easily compute a number of relations between the terms of the equation, such as the distance between two points or the intersection of a plane and a line. This algebra does not make new computations possible, but it does simplify many computations significantly, which means that tasks involving geometrical computations will often be represented more compactly than with classical methods.

The question addressed in this paper arises from these points. Although the problem at hand has been solved with classical techniques many times before, does the use of geometric algebra result in more compact equations and code? Furthermore, does the use of geometric algebra present any other advantages that one does not readily acknowledge? In section 4 we compare the code that uses geometric algebra with the code that solves the same problem with classical methods, and present any other findings regarding the difference between them. Section 5 shows some results gathered with an implementation of the methods described in section 4.

4 Method

4.1 Geometric algebra

When one needs to represent geometric objects, geometric algebra offers an alternative to the algebraic approach. In geometric algebra, geometric entities are the basic elements of computation and can be handled without working with the coordinate constituents of the objects. Because of its geometric nature, many problems concerning the manipulation of geometric space and objects yield more intuitive computations than when using a classical representation. For reference, we discuss some (but not all) of the operators and objects available in geometric algebra. For a more in-depth resource on the workings of GA we refer to Dorst et al. [1].

4.1.1 Overview of geometric algebra

Basis vectors. Similar to linear algebra, the most basic elements of GA are the basis vectors which span the directional space. In 3D space, these are $e_1$, $e_2$, $e_3$, and each of these corresponds to one of the x, y, z directions. (It is possible to use a representational space in which the basis vectors correspond to completely different directions, but we will not cover that possibility here.) The specific model we deal with (Conformal Geometric Algebra, or CGA) expands the directional space with 2 extra dimensions, which correspond to the point at the origin $o$ (the origin can be chosen arbitrarily) and the point at infinity $\infty$. This point at infinity is a point which all lines and planes have in common and which does not change under Euclidean transformations. By making this point explicit, the algebraic patterns in geometrical statements become more universal [1].

Outer product. The outer product, also called the wedge product, is denoted as $\wedge$ and spans the subspace comprised of its constituents. For example, $e_1 \wedge e_2$ denotes the subspace of all linear combinations of $e_1$ and $e_2$. Such an outer product results in a blade, and its dimensionality is called its grade. (An element comprised of blades of different grades is called a multivector.) This product is defined over all elements of GA and is purely algebraic.

Figure 1: Basis vectors $e_1$ and $e_2$.

Figure 2: The blade which results from the outer product between $e_1$ and $e_2$: $e_1 \wedge e_2$. It represents the subspace spanned by the constituents ($e_1$, $e_2$). In a 2D space, this is of course the complete space, in which case the blade is called the pseudoscalar.
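As a small worked illustration of the outer product, assume two arbitrary vectors $a = a_1 e_1 + a_2 e_2 + a_3 e_3$ and $b = b_1 e_1 + b_2 e_2 + b_3 e_3$ (the names $a$, $b$ and their coefficients are introduced here purely for illustration). Using only the standard antisymmetry of the product, $e_i \wedge e_j = -e_j \wedge e_i$ and hence $e_i \wedge e_i = 0$, the product expands to a weighted sum of three basis blades of grade 2:

```latex
\begin{aligned}
a \wedge b ={}& (a_1 b_2 - a_2 b_1)\, e_1 \wedge e_2 \\
            &+ (a_2 b_3 - a_3 b_2)\, e_2 \wedge e_3 \\
            &+ (a_3 b_1 - a_1 b_3)\, e_3 \wedge e_1
\end{aligned}
```

Setting $b = a$ makes every coefficient vanish, which matches the geometric reading that a single direction does not span a plane.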

A blade with the same grade as the representational space is called a pseudoscalar and is denoted as $I_n$, where $n$ is the grade of the pseudoscalar. All pseudoscalars of the same space are scalar multiples of each other (see figure 2).

Contraction. The contraction is a more abstract product and has been expressed by Dorst et al. [1] as:

    The contraction $A \rfloor B$ of a blade $A$ of grade $a$ on a blade $B$ of grade $b$ is a specific sub-blade of $B$ of grade $b - a$ perpendicular to $A$, with a weight proportional to the norm of $B$ and to the norm of the projection of $A$ onto $B$.

It can be used to take a certain subspace out of another subspace. For example, we can use it to retrieve one of the original vectors from which a blade was made up:

$$A = e_1 \wedge e_2, \qquad e_1 \rfloor A = e_2 \qquad (1)$$

For vectors, the contraction is quite similar to the more familiar dot product from linear algebra. In the model we use (conformal geometric algebra), though, the dot product has to be extended to the added dimensions $o$ and $\infty$, which is where it differs from the classical dot product. The result table is listed in figure 3.

Figure 3: Table of outcomes for the contraction between basis vectors (rows and columns $o$, $e_1$, $e_2$, $e_3$, $\infty$).

Note that the rules for using the contraction are not as straightforward as this example may suggest. An in-depth discussion of these algebraic rules is offered in [1].

Geometric product. The geometric product is the fundamental product of geometric algebra and all other products are derived from it. It is simply denoted by a space: $a\,b$ means the geometric product between $a$ and $b$. For vectors, it is defined as

$$a\,b = a \cdot b + a \wedge b \qquad (2)$$

In more concrete terms, the geometric product between two elements contains every relationship between those elements (for example the distance between them, the angle between them, their containment relationship, etc.). The definition of the geometric product for objects of higher grade (dimensionality) is too involved to list here. More details may be found in Dorst et al. [1].

Dual form. Any expression in geometric algebra has a one-to-one mapping with its dual form. This means that any object in GA can be expressed in 2 ways: directly and dually. The dual form of an object is denoted by the operator $^{*}$ and can be computed from the direct form using a simple equation:

$$P^{*} = P\, I_n^{-1} \qquad (3)$$

Conversely, the direct form can be retrieved from the dual form using undualization:

$$P = P^{*} I_n, \quad \text{also written as} \quad P = (P^{*})^{-*} \qquad (4)$$

The geometric interpretation of the dual form is not always easily derived, but for many objects the two representations both correspond to well-known classical representations. For example, the direct form of a plane is the outer product (which spans a subspace between elements, as we have just discussed) between three points on the plane and the point at infinity, while its dual form uses a combination of the plane's normal vector and distance to the origin to define it fully, a representation which users of linear algebra should be familiar with.

Conformal geometric algebra. Geometric algebra can be implemented using many different models. The model we use is called conformal geometric algebra, which is specifically designed for Euclidean geometry and its transformations. All Euclidean transformations (those comprised of rotations, reflections, translations and their compositions) can be expressed using the versor product. A versor is simply an object which represents some transformation. The creation of such a versor is often quite simple, and we will encounter such a computation in section 4.4. The transformation is applied by computing the versor product. With a versor $V$ and a to-be-transformed object $O$ this is done as follows:

$$O_t = V\, O\, V^{-1} \qquad (5)$$

Here $V^{-1}$ is the inverse of the versor. All orthogonal transformations (i.e. transformations that preserve angles and lengths of vectors) can be represented like this, and in CGA all Euclidean transformations are orthogonal. This creates a very compact way of expressing quite complex transformations in a universal way.

Moreover, in CGA, points can be expressed explicitly and differently from vectors. While a simple vector comprised of multiples of the 3 basic direction vectors ($e_1$, $e_2$, $e_3$) denotes a direction in space, we would like an explicit representation of an actual point in space. In CGA, we represent such a point with

$$p = o + v + \tfrac{1}{2}\,(v \cdot v)\,\infty \qquad (6)$$

This representation, combined with the contraction table listed before in figure 3, means we can simply use the following equation to find the distance between two points:

$$D = p_1 \rfloor p_2 \qquad (7)$$

This distance measure also means that we can easily check if two points are identical: if the above equation returns 0, the points in question are clearly in the same position and thus the same point.

4.1.2 On compactness of expression

The advantages of geometric algebra we will show are based on its compactness of expression. However, it must be noted that this does not simply constitute a shorter way of writing down the problem definition. After all, using natural language we can easily define the complete problem at hand as "fit a room model to this given point cloud". The difference lies in the fact that the representation given by geometric algebra is completely deterministic: following the rules of the algebra (of which we have listed a selection in section 4.1.1), one can calculate the result of any single expression. There are no external functions involved other than the basic rules for calculating the different products in the algebra. Solving the problem stated as "fit a room model to this given point cloud" is obviously not as straightforward. We are thus talking about compactness of expression while retaining the possibility of directly calculating the result of the expression.

4.1.3 Example: Plane through three points, two methods

We will try to make the workings of geometric algebra more concrete using an example. Here we list a geometric problem together with its solution using classical methods. Then, for comparison, we solve the same problem using GA to show its compactness of expression. Two geometrical operations we need for our case study are creating the plane $P$ through three given points $p_1$, $p_2$, $p_3$ and calculating the distance between any arbitrary point and such a plane (see section 4.2).
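The two operations just named can be tried out numerically. The following is a minimal sketch, not the thesis implementation: it uses plain Python tuples instead of GAIGEN, all function names are hypothetical, and it assumes a common CGA metric ($e_i \cdot e_j = \delta_{ij}$, $o \cdot \infty = -1$, $o \cdot o = \infty \cdot \infty = 0$), which may differ in normalisation from the table in figure 3. The dual plane is also built directly as "unit normal plus origin distance times $\infty$" rather than by dualising an outer product. Under these assumptions the inner product of an embedded point (equation (6)) with the dual plane reproduces the classical point-to-plane distance.

```python
import math

# Components are ordered (o, e1, e2, e3, inf). Assumed metric (a common CGA
# convention, not taken from the thesis): e_i.e_j = delta_ij, o.inf = -1,
# o.o = inf.inf = 0.


def inner(a, b):
    """Inner product of two CGA vectors given as 5-tuples."""
    euclidean = sum(x * y for x, y in zip(a[1:4], b[1:4]))
    return euclidean - a[0] * b[4] - a[4] * b[0]


def cga_point(v):
    """Embed position v as p = o + v + 0.5 (v.v) inf, following equation (6)."""
    return (1.0, *v, 0.5 * sum(x * x for x in v))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def classical_distance(p1, p2, p3, pa):
    """Classical route: normal from a cross product, then a projection onto it."""
    n = cross(sub(p1, p2), sub(p1, p3))
    return abs(dot(n, sub(pa, p1))) / math.sqrt(dot(n, n))


def dual_plane(p1, p2, p3):
    """Dual plane n + d*inf: unit normal n and signed distance d to the origin."""
    n = cross(sub(p1, p2), sub(p1, p3))
    length = math.sqrt(dot(n, n))
    n = tuple(x / length for x in n)
    return (0.0, *n, dot(n, p1))


corners = ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0))  # plane z = 1
query = (2.0, 3.0, 4.0)

d_classical = classical_distance(*corners, query)
d_cga = abs(inner(cga_point(query), dual_plane(*corners)))
print(d_classical, d_cga)
```

With this particular metric, `inner(cga_point(a), cga_point(b))` evaluates to minus half the squared Euclidean distance between `a` and `b`, so it is zero exactly when the two points coincide, in line with the identity check described for equation (7).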
In linear algebra, given these three points, one first calculates the normal of the plane:

$$n = (p_1 - p_2) \times (p_1 - p_3) \qquad (8)$$

where $\times$ denotes the cross product. This normal vector combined with any of the three points ($p_d$) defines the plane fully. Calculating the distance between an arbitrary point $p_a$ and this plane is a relatively involved operation. One first calculates the vector $w$ from $p_d$ to $p_a$, and then projects this vector onto the normal $n$. The length of the resulting vector is equal to the distance $D$ of the point to the plane:

$$D = \|\mathrm{proj}_n\, w\| = \frac{|n \cdot w|}{\|n\|} \qquad (9)$$

Even though these computations are relatively cheap to perform, quite some mathematics is involved and the process is not intuitive.

In conformal geometric algebra, we can use the outer product ($\wedge$) to span a subspace $S$ using as many constituents as necessary. A sphere, for example, is defined by any four points on its surface. Remembering that the point at infinity is common to all planes and lines (section 4.1), with three given points $p_1$, $p_2$, $p_3$ the plane through these points is thus created with the simple equation

$$P = p_1 \wedge p_2 \wedge p_3 \wedge \infty \qquad (10)$$

This fully defines the plane. Using this representation, one can easily calculate the distance between $P$ and an arbitrary point $p$. To do so, the plane is converted to dual form, after which the distance is given using the contraction [1]:

$$D = p \rfloor P^{*} \qquad (11)$$

This method, although not necessarily faster (this depends on the efficiency of the implementation; see section 5.6), is more intuitive and generates code that is clear and easy to maintain.

4.2 RANSAC

RANSAC is an iterative method used to estimate parameters of some (mathematical) model from a set of datapoints containing outliers. First published in 1981 by Fischler and Bolles [7], it has seen quite some variations, but the core has remained the same. The assumption upon which RANSAC is based is that a dataset contains valid datapoints and outliers. In an iterative manner, datapoints are randomly selected from the set and a model is fitted to those points. An error measure is calculated from that model given the rest of the points, and noted. Then the process starts over again. This process is repeated a number of times, after which the best model (with the lowest error measure) is returned as the right model. RANSAC can be described textually as presented in algorithm 1.

Algorithm 1 RANSAC
1. Select at random the minimum number of points necessary to determine the model parameters
2. Create a model from these points
3. Determine how many points in the total set of points lie within a predefined threshold θ of the model
4. If this model has a lower error measure than the current best model, save it
5. Repeat steps 1 through 4 for a predetermined number of N steps
6. Return the best model

A plane is defined by just three points, and thus for the problem at hand we select 3 points at random from the dataset and generate the plane through these points. In section 4.1 this was shown to be defined as

$$P = p_1 \wedge p_2 \wedge p_3 \wedge \infty \qquad (12)$$

This is a basic element of computation, and the original elements from which the plane was constructed are not needed for any further computation involving the plane. (Conversely, the original elements are also impossible to recreate from the combined representation.) The distance $D$ between the plane $P$ and any arbitrary point $p$ was defined as

$$D = P^{*} \rfloor p \qquad (13)$$

where $P^{*}$ is the dual form of $P$. These two computations are performed iteratively as shown in algorithm 1. The results gathered using this method are presented in section 5.3.

4.3 Hough transform

The Hough transform is a technique used for feature extraction, mostly seen in image analysis. It provides a method for finding imperfect instances of a certain class of shapes within a dataset using a voting procedure. This voting procedure is carried out in parameter space, whose dimensionality is equal to the number of unknown parameters of the shape class to be detected. The idea behind the transform is relatively straightforward: for each point in the dataset, the shapes that can be formed containing that point are generated. An accumulator array stores the occurrences of these shapes using their parameters. If a certain shape is present in the dataset, all points in the set that lie on this shape should cluster around its parameters in the accumulator array. In the end, the local maxima in the accumulator space correspond to shapes found in the dataset. In its most basic form, the Hough transform can be described as in algorithm 2.

Algorithm 2 Hough Transform
A ← {}
for all p ∈ dataset do
    par ← parameters of p
    A[par] ← A[par] + 1
end for
return local maxima in A

This process is computationally quite expensive. For each point in the dataset, a quite large number of parametrized shapes have to be generated (the number of shapes generated for each point dictates the precision with which shapes can be detected). With the task at hand, a dataset with points is not out of the ordinary. This yields a computation which is infeasibly expensive.

Another version of the Hough transform called the Randomized Hough Transform (RHT), presented in 1990 by Xu [8], removes this problem. For a shape class defined by n parameters, instead of passing through each point, n points are selected at random and mapped to 1 point in the accumulator array.
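Both sampling schemes described so far, algorithm 1 and the randomized voting idea just introduced, can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions rather than the thesis code: it uses the classical normal-and-distance parametrisation instead of CGA objects, scores RANSAC candidates by inlier count (step 3 of algorithm 1), bins Hough votes by simple rounding with a hypothetical `cell` size, and fixes the sign of the normal by making its largest-magnitude component positive. All function and parameter names are invented for the example.

```python
import math
import random


def plane_through(p1, p2, p3):
    """Return (unit normal n, offset d) with n.x = d, or None if degenerate."""
    u = [a - b for a, b in zip(p1, p2)]
    w = [a - b for a, b in zip(p1, p3)]
    n = [u[1] * w[2] - u[2] * w[1],
         u[2] * w[0] - u[0] * w[2],
         u[0] * w[1] - u[1] * w[0]]
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:          # collinear sample: no unique plane
        return None
    n = [c / length for c in n]
    # Sign convention (an assumption here): largest-magnitude component positive.
    if max(n, key=abs) < 0:
        n = [-c for c in n]
    return n, sum(a * b for a, b in zip(n, p1))


def distance(plane, p):
    """Unsigned distance from point p to a plane given as (unit normal, offset)."""
    n, d = plane
    return abs(sum(a * b for a, b in zip(n, p)) - d)


def ransac_plane(points, threshold, steps, rng):
    """RANSAC for planes: keep the sampled plane with the most inliers."""
    best, best_count = None, -1
    for _ in range(steps):
        plane = plane_through(*rng.sample(points, 3))
        if plane is None:
            continue
        count = sum(distance(plane, p) <= threshold for p in points)
        if count > best_count:
            best, best_count = plane, count
    return best, best_count


def randomized_hough(points, cell, steps, rng):
    """Randomized voting: each 3-point sample casts one vote for its plane."""
    votes = {}
    for _ in range(steps):
        plane = plane_through(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        key = tuple(round(c / cell) for c in (*n, d))   # quantised parameters
        votes[key] = votes.get(key, 0) + 1
    return max(votes, key=votes.get), votes


# Synthetic cloud: a wall at y = 2 plus outliers well away from it.
rng = random.Random(0)
wall = [(float(x), 2.0, float(z)) for x in range(10) for z in range(10)]
noise = [(rng.uniform(0, 9), rng.uniform(3, 8), rng.uniform(0, 9)) for _ in range(20)]
cloud = wall + noise

plane, inliers = ransac_plane(cloud, threshold=0.05, steps=200, rng=rng)
peak, _ = randomized_hough(cloud, cell=0.1, steps=500, rng=rng)
print(plane, inliers, peak)
```

The only difference between the two loops is what is done with each sampled plane: RANSAC scores it against the whole cloud, while the randomized Hough variant merely records a vote, which is why a single iteration of the latter is so much cheaper.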
The random selection and voting step is then repeated. After some time, the accumulator array will show local maxima at the parameters corresponding to shapes in the dataset.

Algorithm 3 Randomized Hough Transform for planes
A ← {}
repeat
    p_1, p_2, p_3 ← random selection of three points from dataset
    par ← parameters of plane (normal and distance to origin) defined by p_1, p_2, p_3
    A[par] ← A[par] + 1   {Here A[par] can be a new cell or an already existing cell with a maximum distance to the current parameters of δ}
until accumulator array has clear maxima
return local maxima in A

The error threshold δ specified in algorithm 3 is introduced because the dataset used could be quite noisy, making the parametrized planes not identical. In our case, even though two parametrized planes could represent the same real plane in the room environment, they could have parameters which are not identical because of measurement noise. With the threshold, these planes still fall in the same accumulator cell.

4.3.1 Nearest Neighbour Hough Transform

To make the process even faster, we have opted not to choose the three points from the dataset at random, but to first create a table listing the 2 nearest neighbours for each of the datapoints. Then, the accumulator array is filled by passing over each point in the dataset and creating the plane through it and its two nearest neighbours. This increases the speed significantly, as the probability that 3 points that lie very close together are part of the same plane is much higher than for a selection of 3 random points from the set.

4.3.2 Unique representation

For storing the generated planes in the accumulator array, we need to make sure that the planes generated are unique: a plane generated from a set of three points $S$ should render the exact same representation as a plane generated from another set of three points $S'$ on that same plane, otherwise a local maximum will never form in the parameter space. Such a unique representation is easily derived from the representation used in the previous section. The dual form of a plane in CGA is a simple vector, in which the $e_1$, $e_2$, $e_3$ components (the Euclidean part) denote the normal direction and the $\infty$ component is proportional to the distance of the plane from the origin. If the plane is normalized, the $\infty$ component is equal to the distance to the origin. When the plane is normalized, just 3 values need to be saved in the accumulator array in order to uniquely store the plane:

$$P_n^{*} = \frac{P^{*}}{\sqrt{P^{*} \cdot P^{*}}} \qquad (14)$$

Now just two Euclidean components and the $\infty$ component need to be saved; the third Euclidean component can be retrieved by acknowledging that, because the plane is normalized, the following equation must hold for the Euclidean components:

$$e_1^2 + e_2^2 + e_3^2 = 1 \qquad (15)$$

The fact that three components need to be saved is not surprising. In classical techniques, the most commonly used unique representation of planes is one where the angle θ of the normal with the (x, y) plane is saved together with the angle φ of the normal with the (x, z) plane and the distance to the origin. These are exactly the degrees of freedom a plane in 3D has, so our unique representation in GA cannot possibly get any more compact.

There is one more issue which we need to take into account: a plane defined by normal vector $E$ is in our case the same as the plane defined by normal vector $-E$. By specifying that we want the largest component of the normal vector to always be positive, we circumvent this problem. If the constructed plane does not satisfy this constraint, we can simply multiply by $-1$ to get the representation we want.

4.4 3D to 2D

Although the methods listed above are correct and should render good results (setting aside noise in the data), when looking at the problem closely we should notice that we are essentially dealing with just 2 dimensions: the walls of a room stand straight up and are (most of the time) not tilted. Certainly we should use this information to our advantage. One method of using this extra piece of information is to look at the data from above and treat that view as a 2D dataset, in which lines need to be found instead of planes.

However, the data generated by multiple view geometry methods is often tilted: it is very likely that the pictures taken were not completely level with the horizontal axis, and thus looking at the data directly from above does not correspond with looking at the room directly from above. This should be corrected first. By first looking for the bottom or top plane (i.e., floor or ceiling) in the dataset, we can then rotate the complete dataset so that this plane is level with the horizontal plane. Then, we can look from above as mentioned before and will be left with a lower-dimensional problem.

Finding the bottom or top plane by means of RANSAC or the Hough transform can be done by selecting the points used for generating the planes from a small portion of the set which has the lowest or highest vertical component ($e_2$ in our model). This plane $P$ should then be rotated to be level with the horizontal plane. With $p_1$ as the point corresponding with $e_1$ and $p_3$ corresponding with $e_3$, the horizontal plane $H$ is defined as

$$H = o \wedge p_1 \wedge p_3 \wedge \infty \qquad (16)$$

Now, as pointed out in section 4.1.1, we can create a versor to perform the rotation we want. In geometric algebra, the versor rotating one object $A$ to another $B$ can be computed as

$$V = 1 + BA \qquad (17)$$

which maps to our problem as

$$V = 1 + HP \qquad (18)$$

This is the complete definition of the versor. It can be applied to each point in the dataset, and the result will be the dataset rotated so that the found plane is level with the horizontal plane.

Now that the point cloud has been rotated appropriately, we can project it onto the horizontal plane. The horizontal plane can be defined in dual form by its normal ($e_2$). We can then span the line through a single point $p$ in the cloud perpendicular to the horizontal plane using the outer product, spanning another subspace:

$$L = e_2 \wedge p \wedge \infty \qquad (19)$$

The projection of the point $p$ on the horizontal plane is then given by the meet operation, which is defined as

$$P_{\mathrm{proj}} = (L^{*} \wedge P^{*})^{-*} \qquad (20)$$

where $^{-*}$ means undualization (section 4.1.1). Doing this operation for all points in the point cloud results in the cloud being projected onto the horizontal plane, which is what we strived for.

Here we see a striking difference between linear algebra and geometric algebra. The same problem can be tackled in linear algebra, but is much more involved. We list it here for comparison.

In linear algebra, planes are not direct objects of computation and are represented by a combination of different vectors, describing the angle in space and the location. In our problem we thus want to rotate the angle vectors describing the found plane to the angle vectors describing the horizontal plane. The rotation of one vector $v_1$ to another vector $v_2$ is a process comprised of two distinct steps. First, the axis and angle of rotation need to be computed. Then, using this axis and angle, a matrix can be computed which performs the wanted rotation.

Given the two vectors, the axis of rotation is calculated using the cross product, which returns a vector perpendicular to both constituents:

$$a = v_1 \times v_2 \qquad (21)$$

We will need to normalize this axis vector before we can use it:

$$\hat{a} = \frac{a}{\|a\|} \qquad (22)$$

Then, we calculate the angle between these vectors using the dot product:

$$\varphi = \operatorname{acos}\left( \frac{v_1 \cdot v_2}{\|v_1\|\,\|v_2\|} \right) \qquad (23)$$

where acos is the arc cosine. If we denote the x, y and z components of the normalized axis vector as $x$, $y$, $z$ respectively, the following matrix performs the rotation:

$$R = \begin{pmatrix}
(1 - \cos\varphi)\,x^2 + \cos\varphi & (1 - \cos\varphi)\,xy - \sin\varphi\, z & (1 - \cos\varphi)\,xz + \sin\varphi\, y \\
(1 - \cos\varphi)\,xy + \sin\varphi\, z & (1 - \cos\varphi)\,y^2 + \cos\varphi & (1 - \cos\varphi)\,yz - \sin\varphi\, x \\
(1 - \cos\varphi)\,xz - \sin\varphi\, y & (1 - \cos\varphi)\,yz + \sin\varphi\, x & (1 - \cos\varphi)\,z^2 + \cos\varphi
\end{pmatrix} \qquad (24)$$

This should then be applied to each point in the point cloud. Afterwards, the projection onto the horizontal plane is done by throwing away the vertical component, which, with $e_2$ as the vertical axis, can be achieved using the following matrix:

$$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad (25)$$

Compare this equation and the step before with the simple versor and the versor product listed above. Clearly, the problem is in this case expressed much more compactly in geometric algebra, which leads to much cleaner code.
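For comparison, the versor of equation (17) can be sketched without a geometric algebra library. This is not the GAIGEN interface: the sketch assumes that planes are given by their normal vectors and stores the versor as a scalar part plus a bivector part, which in 3D coincides with a unit quaternion. The names `rotor_between`, `apply_rotor` and `project_to_horizontal` are hypothetical.

```python
import numpy as np


def rotor_between(a, b):
    """Normalized versor V = 1 + b a that rotates direction a onto direction b.

    Returned as (scalar part, bivector part). The bivector b ^ a is stored
    through the axis vector a x b, which is its dual up to sign.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    scalar = 1.0 + np.dot(a, b)
    bivector = np.cross(a, b)
    norm = np.sqrt(scalar ** 2 + np.dot(bivector, bivector))
    if norm < 1e-12:
        raise ValueError("anti-parallel directions: 1 + b a vanishes")
    return scalar / norm, bivector / norm


def apply_rotor(rotor, points):
    """Sandwich product V x V^-1 for one point (3,) or many points (N, 3)."""
    w, u = rotor
    points = np.asarray(points, dtype=float)
    uv = np.cross(u, points)
    return points + 2.0 * w * uv + 2.0 * np.cross(u, uv)


def project_to_horizontal(points):
    """Drop the vertical (e2) component, as the meet with the horizontal plane does."""
    flat = np.array(points, dtype=float)   # copy, the input is left untouched
    flat[..., 1] = 0.0
    return flat
```

A typical use would be `project_to_horizontal(apply_rotor(rotor_between(found_normal, [0, 1, 0]), cloud))`, where `found_normal` and `cloud` stand for the detected plane normal and the point array. Unlike the matrix route, no angle or trigonometric function is needed to build the versor.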


5 Experiments

We implemented the methods presented in section 4 and tested the resulting implementations on different datasets. Here we present the results using two datasets specifically, one of which was generated artificially and another which was reconstructed from pictures of an actual room using a multiple view geometry algorithm.

5.1 Software

5.1.1 GA implementation

For our geometric algebra expressions we used GAIGEN by Daniel Fontijne, which is a code generator for geometric algebra. It was written in C and outputs C code. For ease of use, we created a coupling between Python and the generated C code from GAIGEN. This conversion results in a significantly slower implementation than the original C implementation, but makes implementation of the proposed algorithms straightforward. Implementing the algorithms directly in C would significantly increase their speed.

5.1.2 Generating the datasets

For generating the datasets, a combination of Microsoft Photosynth [9] and PMVS [10] was used. As input they use pictures of an environment, and the output is a 3D reconstructed point cloud of the environment. The discussion of these programs is outside the scope of this work.

5.2 Data

5.2.1 Artificial set

The first dataset we have used is one which was created artificially, in which two 2D images of a computer generated room were given to a multiple view geometry algorithm along with handcrafted feature pairs, thus resulting in a very clean dataset with a low amount of noise. It consists of roughly datapoints in 3D space.

Figure 4: Artificial dataset (see footnote 6), front view
Figure 5: Artificial dataset, side view

As can be seen especially in the side view, the back plane consists of many points, and is expected to be easily found using any of the methods used.


5.2.2 Real set

The second dataset used was generated from a number of pictures taken of a room filled with furniture and other objects. This results in many points inside the room being added to the point cloud, which, for our algorithms, is simply a form of noise.

Figure 6: Real dataset, front view
Figure 7: Real dataset, side view

5.3 RANSAC

We implemented RANSAC as presented in section 4.2 using GA, and ran the resulting implementation on our two datasets. As expected, the back plane of the artificial dataset was easily discovered. After the back plane, the right wall was the next most likely plane to be detected, but this result varied. In some cases, the bottom plane was found first (see figure 8). This is due to the random aspect of the algorithm.

Unfortunately, RANSAC proved to be insufficiently powerful for the much noisier real dataset. As can be seen in figure 9, planes were found that do not correspond at all with the walls in the room. This was to be expected, as the point cloud of the non-empty room contains many datapoints corresponding to furniture and other objects present in the environment.

5.4 Hough transform

We followed the Hough transform algorithm as listed in section 4.3, in particular the variant specified in section 4.3.1, which uses the nearest neighbours of points to increase the speed of the algorithm. The results were quite promising. With the artificial dataset, 4 of the 5 expected planes were easily found, with the left plane sometimes being overlooked. As can be seen in the dataset (section 5.2), this is also the wall that is the least represented in the point cloud.

Footnote 6: N.B.: Only 1 in every 100 points shown.
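To make the RANSAC baseline of section 5.3 concrete, the following is a minimal sketch of plane detection by random sampling, written with plain NumPy rather than the GA implementation used in the experiments. The function name `ransac_plane`, the iteration count and the inlier threshold are illustrative assumptions and do not reflect the settings used for the reported results.

```python
import numpy as np


def ransac_plane(points, n_iter=200, threshold=0.02, seed=None):
    """Return ((normal, d), inlier_mask) for the best plane normal . x = d."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    best_plane = None
    best_inliers = np.zeros(len(points), dtype=bool)

    for _ in range(n_iter):
        # Three random points span a candidate plane.
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        length = np.linalg.norm(normal)
        if length < 1e-12:          # (nearly) collinear sample, skip it
            continue
        normal = normal / length
        d = np.dot(normal, p0)

        # Support of the candidate: points closer than `threshold` to it.
        inliers = np.abs(points @ normal - d) < threshold
        if inliers.sum() > best_inliers.sum():
            best_plane, best_inliers = (normal, d), inliers

    return best_plane, best_inliers
```

Because the candidate with the largest support wins, dense clutter such as furniture can outvote a sparsely sampled wall, which is consistent with the behaviour reported for the real dataset.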

Figure 8: RANSAC run on the artificial dataset. In this particular instance, the bottom and top planes were found quite well
Figure 9: RANSAC run on the real dataset. As can be seen, the dataset is way too noisy to be successfully processed

More importantly, the results on the real dataset are much better than with RANSAC. Although the side walls are still not found, the top and bottom planes are found quite well.


Figure 10: Hough run on the artificial dataset.
Figure 11: Hough run on the real dataset. The ceiling and floor of the room are generated quite well.

5.5 Hough transform, 3D to 2D

As described in section 4.4, we have implemented a Hough transform which first finds the bottom or top plane, which it then rotates to be level with the horizontal plane. Then the whole point cloud is projected onto the horizontal plane. The resulting flattened datasets are shown in figures 12 and 13.

Figure 12: Flattened artificial dataset
Figure 13: Flattened real dataset

Using these flattened datasets, we could run the Hough transform again, but this time trying to find lines which correspond to the walls of the room. The results of this second Hough transform can be seen in figures 14 and 15. The green lines in both images correspond to actual walls in the dataset. The yellow line in the real set corresponds to a quite accurate line within the dataset, but one which is not an actual wall. The other (blue) lines are caused by datapoints which do not correspond with any wall and should thus not appear in a perfect version.

Interestingly, the right wall is found quite well in the artificial set, something that didn't happen with the original Hough transform (see section 5.4). However, the left wall is not found as one of the top lines in the 2D set. The biggest improvement can be seen in the real set: not a single wall was found in the original Hough transform (only the ceiling and floor), but the left wall is found perfectly in this 2D version.

Figure 14: The 2D Hough transform run on the flattened artificial dataset
Figure 15: The 2D Hough transform run on the flattened real dataset

5.6 On computational speed

We have seen that the representational power of geometric algebra overshadows that of linear algebra when it comes to geometric problems. However, it must be noted that geometric algebra is not a set of algorithms but a formalism. This means that in and of itself geometric algebra will not offer an increase in computational speed over methods incorporating linear algebra. At the time of writing, computer hardware is optimized for computing linear algebra expressions (see footnote 7). No such hardware optimizations are present for geometric algebra, although research on it has been done (Mishra and Wilson [11], [12]).

Footnote 7: A graphics card is in essence nothing more than a very quick matrix multiplier.

6 Conclusions

The classic algorithms that have been discussed in section 2 work well as they stand. However, using geometric algebra significantly increases the compactness of expression. This could already be seen with our implementations of RANSAC and the Hough transform, but was most apparent when the full power of geometric algebra could be used when converting the originally three-dimensional problem to a two-dimensional one.

Overall, the methods used for solving the problem at hand were not sufficiently powerful to offer a complete, start-to-finish, foolproof 3D reconstruction implementation. However, the results were promising (especially those of the Hough transform), and could be the starting point for more intricate algorithms, all the while using the compactness of expression of geometric algebra to keep the code clean. It must be kept in mind that geometric algebra is only a formalism and thus methods incorporating it are not inherently quicker than those based on classical methods.

7 Discussion

The results were not unexpected. As a formalism tailored specifically to geometric problems, geometric algebra seemed to fit the problem at hand like a glove. Our expectation of much more compact code was met, as we have seen in the previous sections.

However, the full power of geometric algebra has not been revealed yet. Although a significant improvement over classical methods with respect to compactness of code has been shown for the 3D-to-2D Hough transform, many more intricate details of geometric algebra, and their advantages when put to use in geometric problems, have not been touched on because of the nature of the case. A case study involving a higher level geometric problem could be the basis of a better display of representational power.

Furthermore, the universal power of geometric algebra means we could easily extend the algorithms discussed to rooms that are not strictly planar but may have spherical components, without a great increase of representational complexity. More research into this could demonstrate an even more striking difference between the representational power of classical methods and geometric algebra concerning geometric problems.

The datasets used were very noisy and thus did not give the results we would have liked. Revising the datasets to be more noise-free or researching methods of cleaning up the data could resolve this issue and is a good start for future efforts.

When the planes generated are sufficiently accurate, the corners of the room still have to be extrapolated. Simply intersecting all the planes is not enough, as figure 16 demonstrates. By computing what area of the plane is actually supported by the dataset it could be possible to differentiate between actual corners and regular plane intersections. In figure 16 for example, plane P will not find any support along the line between point A and point B, thus the intersection at A could be reasoned to be just an intersection, and not an actual corner. As all steps in this procedure (see footnote 8) are quite easily represented in geometric algebra, implementing this is a suggestion for future research.

Implementation-wise, as it stands, the speed of the algorithms implemented could be significantly increased by porting the implementation to a lower level language like C. The translation steps currently necessary to switch between C and Python are an enormous bottleneck for speed. Although porting will not give novel results, it will make new datasets available for processing which are currently too large to handle.

Footnote 8: Finding support by calculating distances between points and planes and finding the intersecting lines between planes.
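The two steps named in footnote 8 can be sketched in NumPy as follows. This is only an illustration of the proposed future work, written in classical vector form rather than in geometric algebra; the function names, the thresholds and the plane representation (normal n and offset d with n . x = d) are assumptions.

```python
import numpy as np


def plane_support(points, normal, d, threshold=0.02):
    """Boolean mask of the points closer than `threshold` to the plane normal . x = d."""
    normal = np.asarray(normal, dtype=float)
    distance = np.abs(np.asarray(points, dtype=float) @ normal - d)
    return distance / np.linalg.norm(normal) < threshold


def plane_intersection_line(n1, d1, n2, d2):
    """Intersection of two planes as (point on line, unit direction), or None if parallel."""
    n1 = np.asarray(n1, dtype=float)
    n2 = np.asarray(n2, dtype=float)
    direction = np.cross(n1, n2)
    norm = np.linalg.norm(direction)
    if norm < 1e-12:
        return None
    # Pick the point on both planes that is closest to the origin along the line.
    system = np.vstack([n1, n2, direction])
    point = np.linalg.solve(system, np.array([d1, d2, 0.0]))
    return point, direction / norm


def support_along_segment(points, a, b, radius=0.05):
    """Number of points closer than `radius` to the segment from a to b."""
    points, a, b = (np.asarray(v, dtype=float) for v in (points, a, b))
    ab = b - a
    t = np.clip((points - a) @ ab / np.dot(ab, ab), 0.0, 1.0)
    closest = a + t[:, None] * ab
    return int(np.sum(np.linalg.norm(points - closest, axis=1) < radius))
```

Under these assumptions, a stretch of an intersection line such as the one between A and B in figure 16 would count as unsupported when `support_along_segment` returns a value near zero, so the intersection at A would not be reported as a corner.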

Figure 16: When simply calculating plane intersections, all the room corners do get returned, but also some intersections which are not actual corners

List of Figures

1 Basis vectors e_1 and e_2
2 Outer product of e_1 and e_2
3 Table of outcomes for the contraction between basis vectors
4 Artificial dataset, front view
5 Artificial dataset, side view
6 Real dataset, front view
7 Real dataset, side view
8 RANSAC run on the artificial dataset. In this particular instance, the bottom and top planes were found quite well
9 RANSAC run on the real dataset. As can be seen, the dataset is way too noisy to be successfully processed
10 Hough run on the artificial dataset
11 Hough run on the real dataset. The ceiling and floor of the room are generated quite well
12 Flattened artificial dataset
13 Flattened real dataset
14 The 2D Hough transform run on the flattened artificial dataset
15 The 2D Hough transform run on the flattened real dataset
16 When simply calculating plane intersections, all the room corners do get returned, but also some intersections which are not actual corners

References

[1] L. Dorst, D. Fontijne, and S. Mann. Geometric algebra for computer science: an object-oriented approach to geometry. Morgan Kaufmann.
[2] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press.
[3] I. Esteban, J. Dijk, and F. Groen. Automatic 3D modeling of the urban landscape. In Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), 2010 International Congress on. IEEE.
[4] R. Schnabel, R. Wahl, and R. Klein. Efficient RANSAC for Point-Cloud Shape Detection. In Computer Graphics Forum, volume 26. Wiley Online Library.
[5] D. Borrmann, J. Elseberg, et al. The 3D Hough Transform for Plane Detection in Point Clouds: A Review and a new Accumulator Design. 3DR Express.
[6] H. Grassmann. Die lineale Ausdehnungslehre ein neuer Zweig der Mathematik. O. Wigand.
[7] R. C. Bolles and M. A. Fischler. A RANSAC-based approach to model fitting and its application to finding cylinders in range data. In International Joint Conference on Artificial Intelligence. Citeseer.
[8] L. Xu, E. Oja, and P. Kultanen. A new curve detection method: randomized Hough transform (RHT). Pattern Recognition Letters, 11(5).
[9] Microsoft Photosynth.
[10] Y. Furukawa and J. Ponce. PMVS.
[11] B. Mishra and P. Wilson. Color edge detection hardware based on geometric algebra. In Visual Media Production (CVMP), European Conference on. IET.
[12] B. Mishra and P. Wilson. Hardware implementation of a geometric algebra processor core. In IMACS International Conference on Applications of Computer Algebra.
[13] G. Vosselman, S. Dijkman, et al. 3D building model reconstruction from point clouds and ground plans. International Archives of Photogrammetry Remote Sensing and Spatial Information Sciences, 34(3/W4):37-44.
