MSc. Thesis dissertation: 3D Visual SLAM applied to large-scale underwater scenarios


Author: Josep Aulinas (a)
Supervisors: Dr. Joaquim Salvi (a) and Dr. Yvan Petillot (b)
(a) Institute of Informatics and Applications, University of Girona, Girona (Spain)
(b) Ocean Systems Lab, Heriot-Watt University, Edinburgh (UK)

A Thesis Submitted for the Degree of MSc Erasmus Mundus in Vision and Robotics (VIBOT), 2008

Abstract

This MSc thesis studies the most recently published techniques in the field of Visual Simultaneous Localization and Mapping (V-SLAM). In particular it focuses on large-scale underwater environments and on the existing techniques available to speed up the process. The main research field we investigate is filtering algorithms as a way of reducing the amount of data. It seems that almost all the current approaches cannot produce consistent maps for large areas, mainly due to the increase of the computational cost and to the uncertainties that become prohibitive when the scenario becomes larger.

"I have not failed. I have just found 10,000 ways that will not work." - Thomas A. Edison

Contents

Acknowledgments

1 Introduction
1.1 Context and motivations
1.2 Objectives
1.3 Methodology
1.4 Schedule
1.5 Structure of the Document

2 Literature review
2.1 Research Groups
2.2 Current Solutions to the SLAM Problem
2.3 Summary from Current Solutions: Filters in SLAM
2.3.1 Kalman Filters and its Variations
2.3.2 Particle Filter Based Methods
2.3.3 Expectation Maximization Based Methods
2.4 Classification: Pros and Cons

3 Theoretical Background
3.1 SLAM Fundamentals
3.1.1 Introduction

3.1.2 Process Model
3.1.3 Observation Model
3.1.4 Other Issues: Measuring and Data Association
3.1.5 Notation
3.2 Kalman Filter
3.2.1 Introduction
3.2.2 Into the Maths: Assumptions
3.2.3 Kalman Filter Algorithm
3.3 Extended Kalman Filter
3.3.1 Introduction
3.3.2 Into the Maths: Non-linear Systems
3.3.3 Extended Kalman Filter Algorithm
3.4 Compressed Extended Kalman Filter
3.4.1 Introduction
3.4.2 Into the Maths: the Active Region
3.4.3 Compressed Kalman Filter Algorithm
3.5 Unscented Kalman Filter
3.5.1 Introduction
3.5.2 Into the Maths: the Unscented Transform
3.5.3 Unscented Kalman Filter Algorithm
3.6 Information Filters
3.6.1 Introduction
3.6.2 Into the Maths: Canonical Representation
3.6.3 Information Filter Algorithm

4 Simulations and Results
4.1 Simulation Issues
4.2 Implementation Issues

4.3 Results
4.3.1 SLAM in 1D
4.3.2 SLAM in 2D
4.3.3 SLAM in 3D

5 Conclusions
5.1 Contributions
5.2 Future Work

Bibliography

List of Figures

1.1 Typical underwater V-SLAM scenario
1.2 Gantt diagram: MSc thesis planning
2.1 Sensor comparison
2.2 Normal EKF SLAM solution vs. Divide and Conquer approach
2.3 Two level hierarchical SLAM model
2.4 Comparison between SEIF and EKF
2.5 Underwater acoustic SLAM example
3.1 SLAM schematics
3.2 Map administration for the CKF algorithm
3.3 Unscented Transform graphic example
3.4 Example of how active regions are defined in a CKF
4.1 1DOF slam simulation setup
4.2 Behavior of the system against different rates of process noise
4.3 Normalized error and normalized gradient
4.4 Landmark measurement noise evaluation
4.5 Position and velocity measurement noise evaluation
4.6 Comparison between KF, IF and UKF in a 1DOF SLAM problem
4.7 2DoF simulation, first series of experiments setup

4.9 2DOF simulations with different process noise
4.10 KF based SLAM algorithm under different process noise conditions
4.11 Summary of mean error committed in each algorithm
4.12 Measurement noises analysis
4.13 2DoF simulation, second series of experiments setup
4.14 Algorithms performance when facing loop closure
4.15 2DOF simulation, third series of experiments setup
4.16 Performance when dealing with highly populated maps
4.17 3D slam setup
4.18 3D SLAM simulation setup
4.19 3D SLAM simulation example at different time steps

List of Tables

1.1 Master thesis scheduling
2.1 SLAM filtering approaches: Pros and Cons comparative
3.1 Notation
3.2 Kalman Filter algorithm
3.3 Extended Kalman Filter algorithm
3.4 Compressed Kalman Filter algorithm
3.5 Unscented Kalman Filter algorithm
3.6 Information Filter algorithm
4.1 Effect of the process noise on each algorithm
4.2 Current underwater navigation sensors and systems
4.3 Quantitative comparison of a 1DOF SLAM problem, using KF, IF and UKF
4.4 Comparison of all methods for 2DOF simulation

Acknowledgments

I want to thank my tutors, Joaquim Salvi and Yvan Petillot, for supervising my work, and also Xavier Lladó, who has listened patiently to all my doubts and has given me helpful advice, useful suggestions, and valuable feedback.

Chapter 1

Introduction

Simultaneous Localization and Mapping (SLAM), also known as Concurrent Mapping and Localization (CML), is one of the fundamental challenges of robotics, dealing with the necessity of building a map of the environment while simultaneously determining the location of the robot within this map. Visual SLAM (V-SLAM) is the convergence of robotics and artificial vision with the aim of providing new solutions to the SLAM challenge. The aim of this thesis is to survey the advantages and disadvantages of currently available techniques, to compare and contrast them, and finally to identify gaps, i.e. possible new research directions or further improvements. This chapter presents a general overview of the topic as well as its motivation, objectives, methodology and planning. In addition, the outline of the thesis is presented at the end of the chapter.

1.1 Context and motivations

SLAM is a process by which a mobile robot can build a map of an environment and at the same time use this map to deduce its location. Initially, neither the map nor the vehicle position is known; the vehicle has a known kinematic model and moves through the unknown environment, which is populated with artificial or natural landmarks. A simultaneous estimate of both robot and landmark locations is required. The SLAM problem involves finding an appropriate representation for both the observation and the motion models [1]. In order to do so, the vehicle must be equipped with a sensor system capable of taking measurements of the relative location between landmarks and the vehicle itself. Several research groups and researchers have worked and are currently working on SLAM, and the most commonly used sensors can be categorized into laser-based, sonar-based, and vision-based systems. Additional sensor sources are used to better perceive robot state information and the outside world [2], such as compasses, infrared technology and the Global Positioning

Figure 1.1: Typical underwater V-SLAM scenario: underwater vehicle equipped with vision systems [3].

System (GPS). However, all these sensors are subject to errors, often referred to as measurement noise, and to range limitations that make it necessary to navigate through the environment; for instance, light and sound cannot penetrate walls. Laser ranging systems are accurate active sensors. Their most common form operates on the time-of-flight principle, sending a laser pulse in a narrow beam towards the object and measuring the time taken by the pulse to be reflected off the target and returned to the sender. Sonar-based systems are fast and provide measurement and recognition capabilities similar to vision. However, their dependence on inertial sensors, such as odometers, implies that a small error can have large effects on later position estimates [2]. Vision systems, on the other hand, are passive; they have long range and high resolution, but their computational cost is considerably high and good visual features are more difficult to extract and match. Vision is used to estimate the 3D structure, feature locations and robot pose, for instance by means of stereo camera pairs or monocular cameras with structure-from-motion recovery. This 3D information seems to be a key factor in increasing the robustness of SLAM algorithms, due to the fact that more distinguishable landmarks can be found, easing the association when revisiting a region. This fact is one of the main motivations for using vision sensors. However, computing the cloud of points that defines a landmark must be faced as a completely independent problem; for the SLAM context studied in this thesis, it is only necessary to know the location of this cloud of points, i.e. its gravity center.
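As a toy illustration of that last point, the sketch below (with invented coordinates, not real sensor data) collapses a landmark's 3D point cloud into the single map entry used by the filter, its gravity center:

```python
# Toy sketch: reduce a landmark's 3D point cloud to a single map entry,
# its gravity center (centroid). The point data below is illustrative.

def centroid(points):
    """Return the gravity center of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

cloud = [(1.0, 2.0, 0.5), (1.2, 2.1, 0.4), (0.8, 1.9, 0.6)]
landmark = centroid(cloud)  # approximately (1.0, 2.0, 0.5)
```

The SLAM filter then tracks only this single 3D point per landmark, regardless of how many points the vision system extracted.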

Further classification can be made in terms of working environment, for instance ground indoor, ground outdoor, airborne or underwater (Figure 1.1). Most of the work done so far focuses on ground and mainly indoor environments [4] [5] [6] [7]; only a few papers deal with airborne applications [8] [9], and a few more present SLAM in underwater conditions, generally working with acoustic data [10]. Recently there is a growing interest in SLAM for underwater scenarios [11], and in the latter cases vision plays an important role [12] [13] [3], in most cases combined with other sensory systems to acquire both depth and appearance information of the scene, for instance acoustic or inertial sensors.

Section 2 presents the objectives of this master thesis. The methodology is given in Section 3. The schedule of the master thesis is detailed in Section 4. Finally, Section 5 lists the structure of the document, briefly describing the main points of each chapter.

1.2 Objectives

This MSc thesis aims to study the most recently published techniques in the field of Visual Simultaneous Localization and Mapping (V-SLAM). In particular we will focus on large-scale underwater environments and on the existing techniques available to speed up the process. The main research field we plan to investigate is filtering algorithms as a way of reducing the amount of data. It seems that almost all the current approaches cannot produce consistent maps for large areas, mainly due to the increase of the computational cost and to the uncertainties that become prohibitive when the scenario becomes larger. Thus the objectives are:

- Conduct a literature review, to obtain a basis and fundamentals on the topic, as well as to define an updated state-of-the-art.
- Classify methods in the literature, to group all methods surveyed into a reduced number of variants depending on their main features.
- Identify pros and cons of the main approaches, to aid in the selection of the most interesting techniques in the context investigated in this thesis.
- Implement the most common filters, to gain insightful knowledge of their working principles.
- Detect possible further improvements for large mission scenarios, to guarantee future investigation.

1.3 Methodology

The methodology followed in this work consists of several stages. The first stage is the literature review on the topic of study. From this bibliographical research, valuable information on the state-of-the-art is obtained, which is then useful to classify current approaches and identify their pros and cons. Once the main existing techniques are identified, those most representative for our work are implemented, simulated and tested. Their performance and accuracy are then intensively analyzed, and the outcome of this analysis contains the information necessary to draw some conclusions and provide feedback on the work done so far.

1.4 Schedule

The master thesis is scheduled for a period of 15 weeks (3rd of March to 13th of June 2008), in which three milestones are defined: the submission of the master thesis proposal (7th of April 2008), the submission of the master thesis dissertation (13th of June 2008) and the final presentation of the master thesis (16th of June 2008). The main tasks are listed in Table 1.1.

Table 1.1: Master thesis scheduling.

Task                                       Estimated timing
Survey (bibliographical research)          4 weeks / 3rd - 28th March
Classification (pros and cons)             1 week  / 31st March - 4th April
Develop the most significant algorithms    3 weeks / 7th - 25th April
Identify gaps and further improvements     4 weeks / 28th April - 23rd May
Write the master thesis dissertation       3 weeks / 26th May - 13th June

A Gantt diagram for the project is presented in Figure 1.2. In this diagram all meetings with the tutors are marked as milestones; these meetings are mainly to check the work done so far and, depending on this work, to decide what to do next. The diagram may change after each meeting. Notice that although these meetings are explicitly fixed in the schedule, there is permanent contact with the tutors during the whole period.

1.5 Structure of the Document

Visual SLAM is a very important topic in robotics and requires know-how from computer vision. The primary goal is to obtain the map of an unknown environment while simultaneously locating the robot within this map. This information is useful in many applications. Hence, Chapter

Figure 1.2: Gantt diagram: MSc thesis planning.

2 begins with a brief overview of the most common techniques developed so far to deal with the SLAM problem. The main research groups on this topic are listed, and their most significant works in the SLAM context are briefly described, noting their advantages and disadvantages. Finally, a general classification is given, comparing the pros and cons of each approach. The representation of the observation and the motion models is generally performed by computing their prior and posterior distributions using probabilistic algorithms, i.e. KF, EKF or IF. In order to understand their principles in depth, the main theoretical characteristics are described in Chapter 3, summarizing the SLAM fundamentals and detailing each algorithm individually. Chapter 4 presents some experiments performed on synthetic data for a series of different scenarios, for instance 1D, 2D and 3D problems, and with various operating conditions, for example measuring only the vehicle's velocity, its trajectory or both, or with a limited-range sensor. These experiments allow us to compare the accuracy of each technique with respect to the ground truth. Both qualitative and quantitative results are computed and analyzed, bringing insightful information concerning the potential and robustness of each method surveyed. Finally, Chapter 5 presents the conclusions of the thesis, including a discussion on further work derived from the results and some additional perspectives.

Chapter 2

Literature review

A solution to the SLAM problem has been seen as a holy grail for the mobile robotics community, as it would provide the means to make a robot truly autonomous. For this reason, during the last two decades several research groups and researchers have been working on different approaches. The main research groups on this topic are listed in section 1. In section 2, the most significant works in the SLAM context are briefly described, noting their advantages and disadvantages. Finally, a general classification comparing the pros and cons is given in section 3.

2.1 Research Groups

The main research groups are:

- Active Vision Group, University of Oxford (Andrew Davison, David Murray and Ian Reid, among others). They have long been working in visual SLAM.
- Robotics Research Group, University of Oxford (Paul Newman, among others).
- Robotics Perception and Real-time Group, University of Zaragoza (José A. Castellanos, Juan D. Tardós, José Neira, Carlos Estrada and José M. Martínez-Montiel, among others).
- Department of Ocean Engineering at MIT, Computer Science and Artificial Intelligence Laboratory (John Leonard, Matthew Walter, among others).

- Australian Centre for Field Robotics, University of Sydney (Stefan Williams, Oscar Pizarro, among others).

2.2 Current Solutions to the SLAM Problem

The following paragraphs summarize some significant works in the SLAM context. Zhang and Faugeras [14] (pioneers in this area, using mobile robots equipped with trinocular stereo rigs to build small indoor environments) modeled the world as a collection of straight line segments and tracked the lines' positional uncertainty with Kalman filters.

Figure 2.1: Left: sensor comparison [15]. Right: multisensory approach [16].

Perez, Castellanos et al. [15] were the first to present a comparative study of the performance of map-based localization using different sensing devices, such as monocular, trinocular or laser rangefinder (Figure 2.1, left). They concluded that all methods have comparable precision, but vision systems provide more information, showing that the use of efficient matching algorithms could provide a way to achieve better results. Later on, they presented a multisensor approach (Figure 2.1, right) [16] in which they demonstrated that using a combination of 2D laser rangefinders and monocular vision increases the reliability and precision of the environment map (which is actually the expected result due to

the use of redundant information). They presented several papers in the context of SLAM, most of them using laser rangefinders. Even though they did not use V-SLAM, their recent work must be mentioned, because they identify and address two limitations of standard EKF-based SLAM in large areas [17] by using the strategy of Divide and Conquer SLAM. Figure 2.2 shows a typical situation: EKF SLAM (left) is overconfident: errors are larger than the computed covariances, so loop closing is not possible. Divide and Conquer SLAM (right) computes estimates and covariances that allow it to easily close the loop.

Figure 2.2: From left to right: EKF SLAM solution, Divide and Conquer approach, mean consistency index and mean absolute error for EKF (black) and Divide and Conquer SLAM (blue) [17].

The cost of updating the covariance matrix at each step limits the use of EKF SLAM in large environments, and the effect of linearization in the final vehicle and landmark estimates reduces the consistency. These considerations can be easily observed in Figure 2.2, in which the inconsistency rate keeps growing during the process in the EKF case, while it stays under a certain boundary for the multimap case. This fact justifies the approach they present, in which several submaps of equal size are generated as the vehicle traverses the environment and then joined at fixed intervals during the process to produce the final absolute map, reducing the computational cost considerably. To optimize the computational cost and maximize consistency in EKF-based SLAM for large environments, they [18] combined Local Mapping with Map Joining in a way that the total cost of computing the final map is minimized compared to full global EKF-SLAM. Similarly, they [5] presented a hierarchical approach that reduces the computational time and memory requirements and obtains accurate metric maps of large environments in real time. Essentially, the method consists of a lower (or local) map level, which is composed of a set of local maps that are guaranteed to be statistically independent, and an upper (or global) level, which is an adjacency graph whose arcs are labeled with the relative location between local maps (Figure 2.3, left). An estimate of these relative locations is maintained at this level in a relative stochastic map. They simulated their method and concluded that it performs acceptably well when closing the loop, increasing map consistency (Figure 2.3, right).
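The upper (global) level of such a hierarchy can be pictured with a small sketch: each local map stores only its pose relative to the previous one, and global poses are recovered by composing transformations along the graph. The relative poses below are invented for illustration.

```python
import math

# Sketch of the upper level of a hierarchical SLAM map: global poses of
# local maps are obtained by compounding relative (x, y, theta) poses.

def compose(a, b):
    """Express pose b, given in the frame of pose a, in the global frame."""
    x, y, th = a
    bx, by, bth = b
    return (x + bx * math.cos(th) - by * math.sin(th),
            y + bx * math.sin(th) + by * math.cos(th),
            th + bth)

relative = [(0.0, 0.0, 0.0),          # local map 0 defines the origin
            (2.0, 0.0, math.pi / 2),  # map 1, as seen from map 0
            (2.0, 0.0, math.pi / 2)]  # map 2, as seen from map 1

global_poses = []
pose = (0.0, 0.0, 0.0)
for rel in relative:
    pose = compose(pose, rel)
    global_poses.append(pose)
# map 2 ends up near (2, 2) with heading pi
```

Only the short chain of relative poses carries global uncertainty, which is what keeps the covariances at this level small.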

Figure 2.3: Left: Two level hierarchical SLAM model. Right plots: map before and after closing the first, second and third loop [5].

Bosse et al. [19] also presented a hierarchical strategy that achieves efficient mapping of large-scale environments. They used a graph of coordinate frames, with each vertex in the graph representing a local frame, and each edge representing the transformation between adjacent frames. In each frame, they build a map that captures the local environment and the current robot pose along with the uncertainties of each. They claim that loop closing is achieved via an efficient map matching algorithm. Following the same idea of a Divide and Conquer, or hierarchical, SLAM, other strategies appear in the literature proposing partitioned update methods that reduce the computational effort. A first group of approaches use what is known as the Compressed EKF (CEKF) [20] [21] [22] [23], which operates in local regions of a global map but maintains the information for this global map through a global coordinate reference system. A second group is based on short-term submaps with their own local coordinates, avoiding very large global covariances, which makes them more numerically stable and less sensitive to linearization errors. These are known as constrained local submap filters (CLSF) [24]. Se, Lowe and Little [25] demonstrated the use of SIFT point features as landmarks for the SLAM problem. These features are invariant to image translation and scaling, and partially invariant to rotation, illumination changes and affine or 3D projection [26], which makes them suitable landmarks for mobile robot localization and map building. With their trinocular stereo vision system, these landmarks are localized and robot ego-motion is estimated by least-squares minimization of the matched landmarks. Feature viewpoint variation and occlusion are taken

into account by maintaining a view direction for each landmark. Experiments show that these visual landmarks are robustly matched, robot pose is estimated and a consistent 3D map is built. In further work [27] they build multiple 3D submaps which are subsequently merged together, instead of building a map continuously. Global localization is done by using the Hough transform and RANSAC for matching groups of descriptors to a global map. The high dimensionality of the descriptor is a drawback of SIFT: for on-line applications each of the three steps (i.e. detection, description and matching) should be faster still. A recent approach [28], named Speeded-Up Robust Features (SURF), reduces the computation time and increases the robustness by using a very basic approximation, i.e. approximating second order Gaussian derivatives with box filters, and describing a distribution of Haar-wavelet responses within the interest point neighborhood. SURF can be computed more efficiently and yields a lower dimensional feature descriptor, resulting in faster matching. Murillo et al. [29] introduced the use of SURF to improve the performance of appearance-based localization and mapping methods that perform image retrieval in large data sets. In their approach they use omnidirectional images; they tested both SIFT and SURF, showing that SURF offers the best compromise between efficiency and accuracy, always performing best or being much faster in cases of similar accuracy. Davison [4] argued that, for mobile robotics, mapping research is more relevant when tackling the SLAM problem than structure from motion, which mainly works off-line. He found promising results using a single camera in indoor environments, dealing with the uncertainty by means of the EKF. In a different work, with Murray [30], he presented the first example of a general system for autonomous localization using active vision, i.e. a high-performance stereo head, which detects features by means of Harris corner detection, stereo matching (epipolar geometry) and intensity levels (appearance matching). In a more recent work, with Cid and Kita [31], he demonstrated a single-camera SLAM algorithm capable of learning a set of 3D point features in real time. Their algorithm uses a mix of particle distributions with the EKF to overcome the initial uncertainty arising from the use of a single camera. Further works by Davison and several collaborators improve this 3D MonoSLAM [32], aiming to show that the fundamental computer vision problem of structure from motion with a single camera can be tackled using the sequential, probabilistic methodology of monocular SLAM [33] [34] [35]. Jensfelt et al. [36] presented a framework for 3D vision-based SLAM using a single camera. They focused on the management of the features to achieve real-time performance in extraction, matching and loop detection, using the SIFT descriptor and the Harris-Laplace detector to match image features to map landmarks. To reduce the complexity, only a few high quality image features are used for landmark mapping, while the rest of the features are used for matching. Experiments show how the robot is able to successfully match current image features to the map when revisiting an area.
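The kind of feature-to-landmark matching used in these systems can be sketched as a nearest-neighbour search plus Lowe's ratio test. The three-dimensional "descriptors" below are invented stand-ins for real 128-dimensional SIFT vectors.

```python
# Toy sketch of descriptor matching for landmark re-identification:
# accept a match only if the closest map landmark is clearly closer
# than the second closest (Lowe's ratio test). Data is invented.

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def match(descriptor, landmarks, ratio=0.8):
    """Return index of matched landmark, or None if the match is ambiguous."""
    order = sorted(range(len(landmarks)), key=lambda i: dist(descriptor, landmarks[i]))
    best, second = order[0], order[1]
    if dist(descriptor, landmarks[best]) < ratio * dist(descriptor, landmarks[second]):
        return best
    return None

map_landmarks = [(0.1, 0.9, 0.2), (0.8, 0.1, 0.7), (0.4, 0.4, 0.4)]
print(match((0.12, 0.88, 0.21), map_landmarks))  # clearly landmark 0
print(match((0.25, 0.65, 0.3), map_landmarks))   # equidistant -> None
```

Rejecting ambiguous matches is what keeps wrong data associations, which are catastrophic for the filter, rare.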

Thrun et al. [37] proposed the Sparse Extended Information Filter (SEIF), probably the best known SLAM information formulation, which is based upon representing the SLAM posterior through the dual of the EKF, i.e. the Extended Information Filter (EIF). SEIFs maintain a sparse information matrix representation and have been demonstrated to be efficient and scalable, to allow for explicit representation of cyclic environments, and to address data association. Figure 2.4 (left) plots a comparison of average CPU time between SEIF and EKF. In SEIFs, the computation time virtually levels off regardless of the number of landmarks involved, while in EKFs the time increases quadratically with the number of landmarks, making the EKF prohibitively slow for larger maps. The same happens for memory usage (Figure 2.4, center), which increases linearly for SEIFs and quadratically for EKFs. However, the mean error of SEIFs is larger, mainly due to the various assumptions made (Figure 2.4, right). The SEIF breaks links between the robot and features by taking advantage of sparsification. Sparsification is necessarily an approximate step. In the context of SLAM, it suffices to remove links between the robot pose and individual features in the map; if done correctly, this also limits the number of links between pairs of features.

Figure 2.4: Comparison between SEIF and EKF. Left: average CPU time; Center: memory usage; Right: root mean square distance error. [37]

Montemerlo, Thrun, Koller and Wegbreit [38] presented the FastSLAM algorithm. FastSLAM takes advantage of an important characteristic of the SLAM problem: landmark estimates are conditionally independent given the robot's path [39]. The FastSLAM algorithm decomposes the SLAM problem into a robot localization problem and a collection of landmark estimation problems that are conditioned on the robot pose estimate. FastSLAM employs Rao-Blackwellized particle filtering [40]. A key characteristic of FastSLAM is that each particle makes its own local data association. In contrast, EKF techniques must commit to a single data association hypothesis for the entire filter. In addition, FastSLAM uses a particle filter to sample over robot paths, which requires less memory and computation time than a standard EKF or KF. Its scalability has been demonstrated in simulation by building maps with over 50,000 features [38]. The main drawback is that the theoretical relationship between the size of the mapped area and the number of required particles is poorly understood [41].
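A minimal one-dimensional sketch of the FastSLAM factorization, with invented prior and noise values: particles sample the robot position, and each particle carries its own scalar Kalman filter for a landmark, conditioned on that particle's pose hypothesis.

```python
import math
import random

# Toy Rao-Blackwellized update in 1D: one range measurement z of a
# landmark, processed independently by every particle's landmark filter.
# Prior and noise values are invented for illustration.

random.seed(1)
z = 4.0    # observed range from robot to landmark
R = 0.25   # assumed measurement noise variance

# each particle: [robot_pos, landmark_mean, landmark_var]
particles = [[random.gauss(1.0, 0.3), 0.0, 100.0] for _ in range(100)]

weights = []
for p in particles:
    innov = z - (p[1] - p[0])  # measurement minus this particle's prediction
    s = p[2] + R               # innovation variance
    k = p[2] / s               # Kalman gain of this particle's landmark filter
    p[1] += k * innov
    p[2] *= 1.0 - k
    weights.append(math.exp(-0.5 * innov ** 2 / s))  # importance weight

# Every particle's landmark estimate is now consistent with its OWN
# pose hypothesis: (landmark_mean - robot_pos) is close to z.
```

The weights would then drive resampling; the point of the sketch is that the landmark filters are scalar and per-particle, which is exactly the conditional independence FastSLAM exploits.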

Leonard mainly analyzes acoustics-based SLAM, but his work is interesting because he uses different strategies. Leonard and Newman [6] presented an approach, referred to as constant-time SLAM (CTS), using multiple overlapping submaps, each built with respect to a local frame of reference defined by one of the features in the submap. The global position of each submap is estimated using information from other submaps in an efficient, provably consistent manner. For situations where the mobile robot is able to make repeated visits to all regions of the environment, the method achieves convergence and maintains consistent error bounds. In more recent research with Eustice, Walter and Ballard [11], they described a vision-based large-area SLAM algorithm. They presented a novel strategy for efficiently accessing and maintaining consistent covariance bounds within a SLAM information filter, greatly increasing the reliability of data association. In other works they introduced new variations of Thrun's SEIF approach [37], addressing the error estimates that become overconfident when expressed in the global reference frame. The Exactly Sparse Extended Information Filter (ESEIF) [42] [43] achieves exact sparseness while being computationally efficient. Newman et al. studied the loop closing problem in various works [44]. They combined visual and spatial appearance to find salient features. Local spatial appearance (using laser rangefinder data) can be combined with a purely camera-based system to produce superior performance. In a more recent work [45] they propose a loop closer independent of the SLAM estimates. They use vision combined with laser to capture the appearance of a local scene, and then encode the similarity between all possible pairings of scenes. They base their approach on finding similar scene sequences, which indicate that the vehicle is back in a previously visited zone.
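The scene-similarity idea can be sketched as follows; the appearance vectors are invented stand-ins for real scene descriptors, and the threshold is arbitrary.

```python
# Sketch of appearance-based loop detection: encode each scene as a
# vector, score similarity between all pairs, and flag highly similar
# non-adjacent scenes as loop-closure candidates. Data is invented.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

scenes = [(1.0, 0.1, 0.0),    # scene 0
          (0.0, 1.0, 0.3),    # scene 1
          (0.2, 0.1, 1.0),    # scene 2
          (0.9, 0.15, 0.05)]  # scene 3: looks like scene 0 again

candidates = [(i, j)
              for i in range(len(scenes))
              for j in range(i + 2, len(scenes))  # skip adjacent scenes
              if cosine(scenes[i], scenes[j]) > 0.95]
print(candidates)  # scene 3 matches scene 0: a loop-closure hypothesis
```

Because this comparison uses only appearance, it stays independent of the SLAM state estimates, which is the property the cited work argues for.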
2.3 Summary from Current Solutions: Filters in SLAM

Robotic map-building can be traced back 25 years, and since the 1990s probabilistic approaches (i.e. Kalman Filters (KF), Particle Filters (PF) and Expectation Maximization (EM)) have become dominant in SLAM. The three techniques are mathematical derivations of the recursive Bayes rule. The main reason for the popularity of these probabilistic techniques is the fact that robot mapping is characterized by uncertainty and sensor noise, and probabilistic algorithms tackle the problem by explicitly modeling different sources of noise and their effects on the measurements [2].

2.3.1 Kalman Filters and its Variations

Kalman filters (KF) are Bayes filters that represent posteriors using Gaussians, i.e. unimodal multivariate distributions that can be represented compactly by a small number of parameters.
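A one-dimensional Kalman filter makes this compactness concrete: the entire belief is just a mean and a variance, updated in a predict/update cycle. The motion and noise values below are illustrative.

```python
# Scalar Kalman filter sketch: the whole posterior is (mean, var).

def kf_predict(mean, var, u, q):
    """Move by u with process noise variance q."""
    return mean + u, var + q

def kf_update(mean, var, z, r):
    """Fuse measurement z with measurement noise variance r."""
    k = var / (var + r)  # Kalman gain
    return mean + k * (z - mean), (1 - k) * var

mean, var = 0.0, 1.0
mean, var = kf_predict(mean, var, u=1.0, q=0.5)  # belief: N(1.0, 1.5)
mean, var = kf_update(mean, var, z=1.2, r=0.5)   # posterior: N(1.15, 0.375)
```

The posterior mean lies between the prediction (1.0) and the measurement (1.2), and the variance shrinks below both the prior and the measurement noise, which is the characteristic behavior the rest of this section builds on.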

KF SLAM relies on the assumption that the state transition and the measurement functions are linear with added Gaussian noise, and that the initial posteriors are also Gaussian. There are two main variations of the KF in state-of-the-art SLAM: the Extended Kalman Filter (EKF) and the related Information Filter (IF) or Extended IF (EIF). The EKF accommodates the nonlinearities of the real world by approximating the robot motion model using linear functions. Several existing SLAM approaches use the EKF [4] [6] [36] [25]. The IF is implemented by propagating the inverse of the state error covariance matrix. There are several advantages of the IF over the KF. Firstly, the data is filtered by simply summing the information matrix and vector, providing more accurate estimates [46]. Secondly, IFs are more stable than KFs [47]. Finally, the EKF is relatively slow when estimating high dimensional maps, because every single vehicle measurement generally affects all parameters of the Gaussian, so the updates require prohibitive time when dealing with environments containing many landmarks [37]. However, IFs have some important limitations. A primary disadvantage is the need to recover a state estimate in the update step when applied to nonlinear systems. This step requires the inversion of the information matrix. Further matrix inversions are required for the prediction step of the information filter. For high dimensional state spaces, the need to compute all these inversions is generally believed to make the IF computationally inferior to the Kalman filter. In fact, this is one of the reasons why the EKF has been vastly more popular than the EIF [48]. In many robotics problems, however, the information matrix may be sparse, and algorithms exist to perform the basic update and estimation equations efficiently for these approximately sparse information matrices [42] [43].
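The "filtering by summing" property is easiest to see in the scalar case, where the information matrix is just 1/variance and the information vector is mean/variance; the values below are illustrative.

```python
# Scalar information-filter sketch: in canonical form, fusing a
# measurement is pure addition of information quantities.

def to_canonical(mean, var):
    return mean / var, 1.0 / var  # (information vector, information matrix)

e, L = to_canonical(0.0, 4.0)          # weak prior on a landmark position
for z, r in [(2.1, 0.5), (1.9, 0.5)]:  # two measurements with noise var r
    e += z / r                          # update = summing information
    L += 1.0 / r

mean, var = e / L, 1.0 / L  # recover moments form (the costly step in
                            # high dimensions, where 1/L is a matrix inverse)
```

In the multivariate case the division by L becomes a matrix inversion, which is exactly the state-recovery cost discussed in the paragraph above.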
The Unscented Kalman Filter (UKF) [49] addresses the approximation issues of the EKF and the linearity assumptions of the KF, which introduce large errors in the true posterior mean and covariance, sometimes leading to divergence of the filter. In the UKF the state distribution is also represented by a Gaussian, but it is now specified using a minimal set of carefully chosen sample points. These sample points completely capture the true mean and covariance of the Gaussian, and when propagated through the true non-linear system, they capture the posterior mean and covariance accurately to the 3rd order for any nonlinearity. To do so, the unscented transform is used. One of the main drawbacks of the EKF and KF implementations is the fact that for long-duration missions the number of landmarks will increase and, eventually, computer resources will not be sufficient to update the map in real time. This scaling problem arises because each landmark is correlated to all other landmarks. The Compressed Extended Kalman Filter (CEKF) [20] algorithm significantly reduces the computational requirement without introducing any penalty in the accuracy of the results. A CEKF stores and maintains all the information gathered in a local area with a cost proportional to the square of the number of landmarks in the area. This information is then transferred to the rest of the global map with a cost that is

similar to that of full SLAM, but in only one iteration.

Particle Filter Based Methods

Particle Filters (PF) are also called the sequential Monte Carlo (SMC) method. They perform SMC estimation with a set of random point clusters ("particles") representing the Bayesian posterior. In contrast to parametric filters (e.g., the KF), a PF represents the distribution by a set of samples drawn from it, which makes it capable of handling highly nonlinear sensors and non-Gaussian noise. However, this ability produces a growth in computational complexity with the state dimension as new landmarks are detected, making it unsuitable for real-time applications [38]. For this reason, PFs have only been applied successfully to localization, i.e. determining the position and orientation of the robot, but not to map building, i.e. estimating landmark positions and orientations. Therefore, there are no important papers using a PF for the whole SLAM framework, but a few works deal with the SLAM problem using a combination of a PF with other techniques, for instance FastSLAM [38] and FastSLAM 2.0 [50].

Expectation Maximization Based Methods

Expectation Maximization (EM) estimation is a statistical algorithm developed in the context of maximum likelihood (ML) estimation; it offers an optimal solution, making it an ideal option for map building. The EM algorithm is able to build a map when the robot's pose is known, for instance by means of expectation [51]. EM iterates two steps: an expectation step (E-step), where the posterior over robot poses is calculated for a given map, and a maximization step (M-step), in which the most likely map is calculated given these pose expectations. The final result is a series of increasingly accurate maps.
The main advantage of EM over the KF is that it tackles the correspondence problem (data association problem) surprisingly well [2], because it repeatedly localizes the robot relative to the present map in the E-step, generating various hypotheses as to where the robot might have been (different possible correspondences). In the subsequent M-step, these correspondences are translated into features in the map, which are then reinforced in the next E-step or gradually disappear. However, the need to process the same data several times to obtain the most likely map makes EM inefficient, not incremental, and not suitable for real-time applications [52]. These problems could be avoided if the data association were known [53], that is, if the E-step were simplified or eliminated. For this reason, EM is usually combined with a PF, for instance using EM to construct the map (only the M-step) while the localization is done by a PF-based localizer that estimates poses from odometry readings [2].
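As a concrete illustration of the sample-based representation used by particle filters (including the PF-based localizer just mentioned), the following is a minimal predict-weight-resample sketch for 1D localization. The motion and measurement models, noise levels and particle count are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def particle_filter_step(particles, weights, control, measurement,
                         motion_noise=0.1, meas_noise=0.5):
    """One predict-weight-resample cycle of a 1D particle-filter localizer."""
    n = len(particles)
    # Predict: propagate every sample through the (noisy) motion model.
    particles = particles + control + np.random.normal(0.0, motion_noise, n)
    # Weight: measurement likelihood of each sample (Gaussian sensor model).
    weights = weights * np.exp(-0.5 * ((measurement - particles) / meas_noise) ** 2)
    weights /= weights.sum()
    # Resample: draw n samples with probability proportional to weight.
    idx = np.random.choice(n, size=n, p=weights)
    return particles[idx], np.full(n, 1.0 / n)

# Usage: localize a robot moving +1.0 per step along a line.
np.random.seed(0)
particles = np.random.uniform(-5.0, 5.0, 1000)
weights = np.full(1000, 1.0 / 1000)
true_state = 0.0
for _ in range(20):
    true_state += 1.0
    z = true_state + np.random.normal(0.0, 0.5)
    particles, weights = particle_filter_step(particles, weights, 1.0, z)
```

Note how the state dimension enters only through the particle array; extending the state with landmarks multiplies the cost per particle, which is the complexity growth discussed above.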

2.4 Classification: Pros and Cons

From the previous section, it seems clear that few works have been published on underwater SLAM [6] [3] [54] [55], and even fewer on underwater visual SLAM [56]. Most of the underwater approaches use sonar (see Figure 2.5) or other non-visual sensory systems. There exist various V-SLAM approaches for terrestrial applications [4] [5] [36] [25]; most of them deal with the uncertainties by using the Kalman Filter (KF) or its variation, the Extended Kalman Filter (EKF) [4] [36] [25], and others use some improved information filter [46] [47] [42] [43], e.g. the sparse extended information filter (SEIF).

Figure 2.5: Left: schematic representation of the environment used by Ribas [54]. Middle: sonar data represented in Cartesian coordinates; notice the distortion produced by the movement of the vehicle. Right: undistorted image after integration with the vehicle displacement, i.e. odometry.

It seems clear that almost none of the current approaches can produce consistent maps for large areas, mainly due to the increase in computational cost and in the uncertainties. Therefore this is possibly the most important issue that needs to be improved. Some recent publications tackle the problem by using multiple maps, or sub-maps, that are later used to build a larger global map [5] [27] [17]. However, these methods rely considerably on the assumption of correct data association, which is another important issue that needs to be improved. Table 2.1 lists advantages and disadvantages of the different SLAM strategies in terms of the method used to deal with uncertainties. Essentially, the most challenging problems still unsolved are those of enabling large-scale implementations in increasingly unstructured environments, i.e. underwater, and especially in situations where other current solutions are unavailable or unreliable.
According to the bibliographical survey, SLAM solutions could be improved by formulating filtering algorithms that are more efficient and consistent for large scenarios, and by solving the data association problem in a very robust way. For the first case, different filters applied to the SLAM framework must be studied, for instance the Compressed Extended Kalman Filter (CEKF) or information filters. The second issue is currently addressed using SIFT and SURF, which seem to be considerably good solutions for the data association problem; however, they become computationally expensive when dealing with high-dimensional maps.

Table 2.1: Advantages and disadvantages of filtering approaches applied to the SLAM framework.

Kalman Filter and Extended KF (KF/EKF) [4] [6] [36] [25]
  Pros: high convergence; handles uncertainty
  Cons: Gaussian assumption; slow in high-dimensional maps

Compressed Extended KF (CEKF) [20] [21] [22]
  Pros: reduced uncertainty; reduced memory usage; handles large areas; increases map consistency
  Cons: requires very robust features; data association problem; requires multiple map merging

Information Filters (IF) [46] [47]
  Pros: stable and simple; accurate; fast for high-dimensional maps
  Cons: data association problem; may need to recover a state estimate; computationally expensive in high dimensions

Particle Filter (PF) [38] [50]
  Pros: handles nonlinearities; handles non-Gaussian noise
  Cons: growth in complexity; only successful in localization

Expectation Maximization (EM) [46] [51]
  Pros: optimal for map building; solves data association
  Cons: inefficient, cost growth; unstable for large scenarios; only successful in map building

Chapter 3

Theoretical Background

As seen in Chapter 2, most of the current solutions to the SLAM problem use probabilistic approaches, i.e. the KF, EKF or IF. In order to understand their principles in depth, the main theoretical characteristics are described in the following sections: section 1 summarizes the SLAM fundamentals, section 2 presents the KF, section 3 the EKF, section 4 the CKF, section 5 the UKF, and finally section 6 explains the IF.

3.1 SLAM Fundamentals

Introduction

SLAM is a process by which a mobile robot can build a map of an environment and at the same time use this map to deduce its location. Initially, neither the map nor the vehicle position is known; the vehicle has a known kinematic model and moves through the unknown environment, which is populated with artificial or natural landmarks. The SLAM problem involves finding appropriate representations for both the observation and the motion models. The process model is described in subsection 2, subsection 3 presents the observation model, and subsection 4 describes several issues related to the SLAM problem, such as the data association problem, loop closing, and the dynamism of the environment.

Process Model

The SLAM problem begins with an unknown map and an unknown vehicle position within this map. The kinematic model of the vehicle is known and the environment is populated with various artificial or natural features. The idea of SLAM algorithms is to build an incremental

navigation map and to localize the vehicle in this map at the same time. In order to do so, the state of the system at a given moment k is modeled with the state of the vehicle x_v (3.1), which can contain data concerning the position, velocity, orientation, etc. This vehicle state is augmented with the positions of the landmarks. Assuming that N landmarks have been validated and incorporated into the system, the vector of landmarks is denoted by p_k (3.1). The landmark locations are assumed constant, i.e. all landmarks are considered stationary, thus p_k = p_{k-1}. For this reason the model for the evolution of the landmarks does not contain any uncertainty.

x_v = [x, y, z, v_x, v_y, v_z, θ, φ]^T,   p = [p_1^T, ..., p_N^T]^T = [x_1, y_1, z_1, ..., x_N, y_N, z_N]^T,   x = [x_v^T, p^T]^T   (3.1)

The augmented state vector (3.1), containing the state of the vehicle and the state of all landmarks, is then used to estimate the next state by means of the state transition model (3.2). A state transition matrix F_k, a vector of control inputs u_k and a vector of temporally uncorrelated process noise ν_k with zero mean and covariance Q_k are used to model the motion of the vehicle through the environment. Note that I_{p_i} is the identity matrix of size equal to the dimension of a landmark and 0_{p_i} is a vector of zeros of length equal to the landmark dimension. For instance, in a 3D case where a landmark contains only its coordinate (x, y, z) information, the dimension would be equal to three. Writing [a; b] for vertical stacking:

[x_{v,k+1}; p_1; ...; p_N] = diag(F_k, I_{p_1}, ..., I_{p_N}) [x_{v,k}; p_1; ...; p_N] + [u_{v,k+1}; 0_{p_1}; ...; 0_{p_N}] + [ν_{v,k+1}; 0_{p_1}; ...; 0_{p_N}]   (3.2)
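A minimal numerical sketch of the stacked state and the block transition of (3.2), with a trivial identity vehicle model and two hypothetical 3D landmarks (the noise term is omitted for clarity):

```python
import numpy as np

# Augmented state: vehicle state (here just a 3D position, for illustration)
# stacked with N = 2 stationary 3D landmarks, as in (3.1).
x_v = np.array([0.0, 0.0, 0.0])
landmarks = np.array([1.0, 2.0, 0.5,     # landmark 1: (x1, y1, z1)
                      -1.0, 0.5, 2.0])   # landmark 2: (x2, y2, z2)
x = np.concatenate([x_v, landmarks])

# Block transition of (3.2): F_k acts on the vehicle part, identities on the
# landmark blocks, so the stationary-world assumption p_k = p_{k-1} holds.
F_v = np.eye(3)                          # trivial vehicle model, illustrative
F = np.block([[F_v, np.zeros((3, 6))],
              [np.zeros((6, 3)), np.eye(6)]])

# Control input moves the vehicle only; zeros for every landmark block.
u = np.concatenate([np.array([0.1, 0.0, 0.0]), np.zeros(6)])

x_next = F @ x + u                       # process noise omitted in this sketch
```

After the transition the landmark part of the state is unchanged, which is exactly the uncertainty-free landmark evolution described above.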

Figure 3.1: SLAM schematics.

Observation Model

The observation model (3.3) estimates the observations using the observation matrix H_k, which relates the output of a sensor to the state vector x_k. A vector w_k of temporally uncorrelated observation errors with zero mean and covariance R_k is used.

z_k = H_k x_k + w_k   (3.3)

The estimates obtained by the process and observation models are then corrected thanks to the true observations provided by sensors that can obtain the relative location of landmarks with respect to the vehicle. The SLAM problem can then be seen as a black box, as shown in Figure 3.1, from which the vehicle pose and the map are obtained from several inputs coming from the observations and the odometry. To provide estimates of the robot and landmark locations, various probabilistic approaches are used. The basis of these approaches is presented later in this chapter.

Other Issues: Measuring and Data Association

The most fundamental topic in all SLAM solutions is the data association problem, which arises when landmarks cannot be uniquely identified; consequently, the number of possible hypotheses may grow exponentially, making SLAM absolutely unviable for large areas. Data association in SLAM can be simply presented as a feature correspondence problem, which identifies two features observed at different positions and different points in time as belonging to the same physical object in the world. Two common applications of such data association are: i. matching two successive scenes, and ii. closing the loop of a long trajectory, when a robot goes back to the starting point or a previously visited point of the trajectory. So far, most computer vision approaches only use 2D information to perform data association, but in underwater scenarios this data association is more complicated due to more

significant levels of distortion and noise. Therefore, in order to succeed when solving the correspondence problem, very robust features are necessary, even under weak lighting conditions or different points of view. The use of vision sensors offers the possibility to extract landmarks considering both 2D and 3D information [12] [56], hence more robust features. Feature recognition, tracking and 3D reconstruction are important steps that feed measurements to the SLAM framework. Feature tracking is the problem of estimating the locations of features in an image sequence (for instance, using the Harris corner detector with Random Sample Consensus (RANSAC), the Scale Invariant Feature Transform (SIFT) [25], or Speeded-Up Robust Features (SURF) [29]). 3D reconstruction is the problem of obtaining the 3D coordinates and the camera pose from two or more 2D images (for instance, by using epipolar geometry and the fundamental matrix). Recent advances in computer vision, more precisely in feature extraction, enable the use of high-level vision-based landmarks (complex and natural structures), in contrast to early attempts using low-level features (e.g., vertical edges, line segments [14], etc.) and artificial beacons. However, the use of vision has several limitations and practical difficulties: for example, assumptions about the environment must be made in order to simplify the problem; in regions without texture or with repeating structures there is no method for finding true matches; and images are always noisy. The loop-closing problem (a robot turning back to the starting point of its trajectory) requires successful identification of revisited landmarks to build a consistent map in large-scale environments. Due to accumulated errors along the trajectory (drift), the reconstructed map is not consistent, i.e., the loop of the trajectory is not closed properly.
In order to solve this problem, correct data association is required to uniquely identify the landmarks corresponding to previously seen ones. Different techniques are applied to correct the map, for example Kalman smoother-based techniques (used by most of the current solutions to the SLAM problem) or EM-based techniques [51]. Bearing-only SLAM arises from the limitations of using computer vision in SLAM, where it is not possible to calculate a meaningful range from a single measurement. Thus, an estimate of a landmark position is not possible until successive measurements from different points of view are made. Cameras can calculate the angle to landmark sightings and apply triangulation to determine an appropriate initial location estimate. The dynamism of the robot environment, i.e. changes in the location of other agents in the environment, creates a big challenge, because it introduces inconsistent sensor measurements and because there are almost no algorithms that can deal with mapping in dynamic environments.
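As an illustration of descriptor-based data association (the role that SIFT or SURF descriptors play in the matching step above), here is a minimal nearest-neighbour matcher with a distance-ratio test. The descriptors are synthetic stand-ins and the ratio threshold is a common heuristic, not a value from the thesis.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour descriptor matching with a distance-ratio test.

    desc_a, desc_b: (Na, D) and (Nb, D) arrays of feature descriptors.
    Returns (i, j) pairs accepted as correspondences: a match is kept only
    when the best candidate is clearly closer than the second best.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)   # distance to every candidate
        j1, j2 = np.argsort(dists)[:2]               # best and second best
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches

# Usage: synthetic descriptors; row i of desc_a corresponds to row 4 - i of desc_b.
rng = np.random.default_rng(1)
desc_a = rng.normal(size=(5, 8))
desc_b = desc_a[::-1] + rng.normal(scale=0.01, size=(5, 8))
matches = ratio_test_matches(desc_a, desc_b)
```

The ratio test is what rejects the ambiguous matches mentioned above (repeating structures, textureless regions): when the two best candidates are nearly equally close, no correspondence is declared.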

Notation

In the following sections, the same notation is used when explaining the different algorithms, in order to facilitate comparison between methods and to standardize the names of the variables involved. These common variables and their meanings are listed in Table 3.1.

Table 3.1: Notation
x_k: actual state vector at the current time step
P_k: actual covariance matrix at the current time step
z_{k+1}: next measurement vector for the current time step
x̂_{k+1}: next state vector estimate
P̂_{k+1}: next covariance matrix estimate
ẑ_{k+1}: next measurement vector estimate
x_{k+1}: corrected state vector
P_{k+1}: corrected covariance matrix
F_k: transition matrix (process model)
H_k: measurement matrix (observation model)
u_k: control vector
Q_k: process uncertainty matrix
R_k: observation uncertainty matrix
ν_{k+1}: innovation vector
S_{k+1}: innovation associated matrix
W_{k+1}: Kalman filter gain

3.2 Kalman Filter

Introduction

The Kalman Filter (KF) was invented in the 1950s by Rudolph Emil Kalman as a technique for filtering and prediction in linear systems. The KF implements belief computation for continuous states; it is not applicable to discrete or hybrid state spaces. The KF estimates the belief over the state of a dynamic system at a certain time t from a series of incomplete and noisy measurements, through the mean µ_t and the covariance Σ_t.

Into the Maths: Assumptions

In order to obtain the correct posterior, three assumptions have to be fulfilled. First, the initial belief must be Gaussian (3.4). Second, the state transition probability must be composed of

a function that is linear in its argument with added independent Gaussian noise (3.5). Third, the same applies to the measurement probability (3.6): it must also be linear in its argument, with added Gaussian noise. Systems that meet these assumptions are called linear Gaussian systems.

bel(x_0) = p(x_0) = |2πΣ_0|^{-1/2} exp{ -1/2 (x_0 - µ_0)^T Σ_0^{-1} (x_0 - µ_0) }   (3.4)

x_t = A_t x_{t-1} + B_t u_t + ε_t   (3.5)

z_t = C_t x_t + δ_t   (3.6)

In (3.5) and (3.6), A_t, B_t and C_t are matrices that multiply the state, control and measurement vectors, which allows the transition and measurement functions to be treated as linear systems. ε_t and δ_t are Gaussian random vectors with zero mean and covariances Q_t and R_t respectively.

Kalman Filter Algorithm

The KF algorithm adapted to the SLAM context is presented in Table 3.2. The algorithm is composed of three main steps (prediction, observation and update¹), which keep repeating over time. In the prediction stage, the process model is used to obtain estimates of the state vector x̂_{k+1}, the covariance matrix P̂_{k+1} and the measurement ẑ_{k+1} for instant k+1, from the state vector x_k and the covariance matrix P_k of the previous instant k. The observation stage computes the innovation vector ν as the difference between the real and estimated measurements, and the associated innovation matrix S. This matrix S is then used in the update stage to compute the Kalman gain W. Both ν and W are then used to correct the state vector x_{k+1} and the covariance matrix P_{k+1}, which are used as input for the next iteration.

3.3 Extended Kalman Filter

Introduction

The assumptions of linear state transitions and linear measurements with added Gaussian noise are rarely fulfilled in practice.
For example, a robot that moves with constant translational and rotational velocities typically moves on a circular trajectory, which cannot be described by linear

¹ Various authors define only two steps, prediction and update; they consider the observation a part of the update stage.

Table 3.2: Kalman Filter algorithm
Kalman Filter(x_k, P_k, z_{k+1}):
Prediction (estimate)
1: x̂_{k+1} = F_k x_k + u_k
2: P̂_{k+1} = F_k P_k F_k^T + Q_k
3: ẑ_{k+1} = H_k x̂_{k+1}
Observation (innovation vector and matrix)
4: ν_{k+1} = z_{k+1} - ẑ_{k+1}
5: S_{k+1} = H_k P̂_{k+1} H_k^T + R_k
Update (correction, Kalman gain)
6: W_{k+1} = P̂_{k+1} H_k^T S_{k+1}^{-1}
7: x_{k+1} = x̂_{k+1} + W_{k+1} ν_{k+1}
8: P_{k+1} = P̂_{k+1} - W_{k+1} S_{k+1} W_{k+1}^T
Iterate with x_{k+1} and P_{k+1} in the next step.

state transitions. This observation, along with the assumption of unimodal beliefs, makes the plain KF inapplicable to all but the most trivial robotics problems. The Extended Kalman Filter (EKF)² overcomes the linearity assumption.

Into the Maths: Non-linear Systems

In the EKF the assumption is that the next-state probability (3.7) and the measurement probability (3.8) are governed by nonlinear functions g and h. This is essentially the equivalent of (3.5) and (3.6), but now the function g is used instead of A_t and B_t, and h instead of C_t.

x_t = g(u_t, x_{t-1}) + ε_t   (3.7)

z_t = h(x_t) + δ_t   (3.8)

Unfortunately, with arbitrary functions g and h, the belief is no longer a Gaussian. For this reason the EKF calculates a Gaussian approximation to the true belief. Thus, the EKF inherits the basic belief representation from the KF, but differs in that this belief is only approximate, not exact as was the case for the KF. The technique for linearizing nonlinear functions used by the EKF is the Taylor expansion, which constructs linear approximations by taking advantage of the partial derivative of the

² Although the EKF is not implemented in this thesis, it is presented here due to its importance in the literature.

nonlinear function (3.9). To do so, the Jacobian of the nonlinear function is computed. Within the EKF framework, g and h give rise to G and H, the Jacobians of g and h.

g'(u_t, x_{t-1}) = ∂g(u_t, x_{t-1}) / ∂x_{t-1}   (3.9)

Extended Kalman Filter Algorithm

The EKF algorithm applied to the SLAM context is presented in Table 3.3. The iterative process is similar to that of the KF algorithm (Table 3.2), but now, instead of the transition matrix F and the measurement matrix H, the corresponding nonlinear functions g and h and their Jacobians G and H are used. Note that now the matrix H is the Jacobian of the nonlinear function h.

Table 3.3: Extended Kalman Filter algorithm
Extended Kalman Filter(x_k, P_k, z_{k+1}):
Prediction (estimate)
1: x̂_{k+1} = g(x_k, u_{k+1})
2: P̂_{k+1} = G_k P_k G_k^T + Q_k
3: ẑ_{k+1} = h(x̂_{k+1})
Observation (innovation vector and matrix)
4: ν_{k+1} = z_{k+1} - ẑ_{k+1}
5: S_{k+1} = H_k P̂_{k+1} H_k^T + R_k
Update (correction, Kalman gain)
6: W_{k+1} = P̂_{k+1} H_k^T S_{k+1}^{-1}
7: x_{k+1} = x̂_{k+1} + W_{k+1} ν_{k+1}
8: P_{k+1} = P̂_{k+1} - W_{k+1} S_{k+1} W_{k+1}^T
Iterate with x_{k+1} and P_{k+1}.

3.4 Compressed Extended Kalman Filter

Introduction

The Compressed Extended Kalman Filter (CEKF)³ takes advantage of the system structure, giving an estimation that is identical to the standard EKF while reducing the computational

³ Although the CEKF is normally implemented for the nonlinear case, note that in this thesis only linear cases are implemented, i.e. the Compressed Kalman Filter (CKF).

complexity. Long sequences of prediction and observation steps depend only on a reduced set of states, because they are independent of most of the elements in the state vector, making it unnecessary to perform a full SLAM update when working in local areas. The CEKF takes advantage of this characteristic by working only over local regions, performing local SLAM with the features in the vicinity of the vehicle, independently of the size of the whole map. Then a global update is conducted, so the final solution is identical to that of the full EKF, but the expensive cost of a full EKF is avoided. Figure 3.2 depicts the overall idea of the CEKF, in which a local region around the actual vehicle location is defined, i.e. region A. A plain KF-based SLAM (or EKF for the nonlinear case) is conducted on this local region until the robot moves out of its boundaries; then the global map is updated and a new local region is defined, continuing with the same idea.

Figure 3.2: Map administration for the compressed KF algorithm. The local region A comprises the range of the robot sensors and a larger area around the vehicle, while the global region B includes the whole map.

Into the Maths: the Active Region

The discrete dynamic system and the observation model presented in (3.2) and (3.3) are used. Following the example presented in Figure 3.2, the state vector can be divided into two subvectors (3.10). The first one, referred to as X_a, contains the state of the vehicle x_v and the states of the landmarks inside the local region A, numbered 3 and 4 respectively. The second sub-

vector, called X_b, contains the states of the landmarks outside the local region, which have been observed and introduced into the system previously (landmarks 1 and 2 in the example). Similarly, the covariance matrix P is divided into four sub-matrices (3.10): P_aa is the covariance matrix associated with the state vector of the local region X_a, P_bb is the one associated with X_b, and P_ab and P_ba are the cross-covariances between both state vectors. Writing [a; b] for vertical stacking:

X = [X_a; X_b],   X_a = [x_v; lm_3; lm_4],   X_b = [lm_1; lm_2],   P = [P_aa, P_ab; P_ba, P_bb]   (3.10)

From (3.10) it can be seen that it is not necessary to perform a full SLAM update when working in a local area; instead a simple KF-based SLAM algorithm can be applied using X_a and P_aa. This is possible thanks to the fact that the innovation matrix S_a and the Kalman gain W_a for the local region are independent of the vector and matrices related to region B, i.e. X_b, P_bb and P_ab. The final output of this local SLAM is then used to update the global X and P. In order to perform this global update, the vector X_b and the matrices P_bb and P_ab must also be updated. For this reason several auxiliary computations are added in the prediction and update stages, without interfering with the local SLAM procedure, cost or final result. These extra computations only generate auxiliary variables (φ, ψ and θ) (3.11), which accumulate the effect of each single iteration during the local SLAM process:

φ_0 = I,   φ_{k+1} = (I - µ_k) φ_k
ψ_0 = 0,   ψ_{k+1} = ψ_k + φ_k^T β_k φ_k   (3.11)
θ_0 = 0,   θ_{k+1} = θ_k + φ_{k+1}^T H_a^T S_a^{-1} z_{a,k}

where:

β = H_a^T S_a^{-1} H_a,   µ = P_aa β   (3.12)

These variables are then used to update the values of X_b, P_bb and P_ab in the global map context (3.13).

P_{ab,k+1} = φ P_{ab,k}
P_{bb,k+1} = P_{bb,k} - P_{ab,k}^T ψ P_{ab,k}   (3.13)
X_{b,k+1} = X_{b,k} + P_{ba,k} θ

Compressed Kalman Filter Algorithm

The CKF algorithm adapted to a linear SLAM context is presented in Table 3.4. The first step is to define the boundaries of a region in the vicinity of the robot, referred to as the active region. Then the state vector x_k and the covariance matrix P_k are divided into the active part (X_{a,k} and P_{aa,k}), containing the vehicle state and those landmarks inside the active region that were previously incorporated into the system, and the passive part (X_{b,k}, P_{bb,k} and P_{ab,k}), which contains the landmarks outside the active region. Note that, at the very beginning, X_{a,0} contains the vehicle state but no landmarks, and X_{b,0} is empty; similarly, the covariance matrices only contain information about the vehicle state. The state vector and the covariance matrix at time step k for the current active region are used as inputs for a plain KF-based SLAM algorithm. Notice that at this point a different loop begins, whose timing is expressed with an index n in order to avoid confusion. During the KF iterations the prediction, observation and update are performed as usual. In addition, extra computations are conducted to obtain the auxiliary variables, which are then used in the global update stage. The moment the robot leaves the active region, the global update procedure starts, i.e. x_{k+1} and P_{k+1} are calculated. This corrected state vector and covariance matrix are the input for the next iteration of the CKF algorithm.

3.5 Unscented Kalman Filter

Introduction

The KF performs properly in linear cases, and is accepted as an efficient method for analytically propagating a Gaussian random variable (GRV) through linear system dynamics. For nonlinear models, the EKF approximates the optimal terms by linearizing the dynamic equations.
The EKF can be viewed as a first-order approximation to the optimal solution. In these approximations the state distribution is approximated by a GRV, which is then propagated analytically through the first-order linearization of the nonlinear system. These approximations can introduce large errors in the true posterior mean and covariance, which may sometimes lead

Table 3.4: Compressed Kalman Filter algorithm
Compressed Kalman Filter(x_k, P_k, z_{k+1}):
01: Define the actual active region
02: Define X_{a,k}, P_{aa,k} and X_{b,k}, P_{ab,k}, P_{bb,k}
03: Initialize F_{a,k}, H_{a,k}, Q_{a,k}, R_{a,k}, u_{a,k} for the active region
04: x_n = X_{a,k}
05: P_n = P_{aa,k}
06: F_n = F_{a,k}, H_n = H_{a,k}, Q_n = Q_{a,k}, R_n = R_{a,k}, u_n = u_{a,k}
07: φ_n = I, ψ_n = 0, θ_n = 0
08: Active Region Kalman Filter(x_n, P_n, φ_n, ψ_n, θ_n):
    Prediction (estimate)
    01: x̂_{n+1} = F_n x_n + u_n
    02: P̂_{n+1} = F_n P_n F_n^T + Q_n
    03: ẑ_{n+1} = H_n x̂_{n+1}
    04: φ_{n+1} = φ_n, ψ_{n+1} = ψ_n, θ_{n+1} = θ_n
    Observation (innovation vector and matrix)
    05: ν_{n+1} = z_{n+1} - ẑ_{n+1}
    06: S_{n+1} = H_n P̂_{n+1} H_n^T + R_n
    Update (correction, Kalman gain)
    07: W_{n+1} = P̂_{n+1} H_n^T S_{n+1}^{-1}
    08: x_{n+1} = x̂_{n+1} + W_{n+1} ν_{n+1}
    09: P_{n+1} = P̂_{n+1} - W_{n+1} S_{n+1} W_{n+1}^T
    10: β_n = H_n^T S_{n+1}^{-1} H_n
    11: µ_n = P_{n+1} β_n
    12: φ_{n+1} = (I - µ_n) φ_n
    13: ψ_{n+1} = ψ_n + φ_n^T β_n φ_n
    14: θ_{n+1} = θ_n + φ_{n+1}^T H_n^T S_{n+1}^{-1} z_{n+1}
    Iterate with x_{n+1}, P_{n+1}, φ_{n+1}, ψ_{n+1} and θ_{n+1}
Global Update
09: X_{a,k+1} = x_{n+1}
10: P_{aa,k+1} = P_{n+1}
11: X_{b,k+1} = X_{b,k} + P_{ab,k}^T θ_{n+1}
12: P_{bb,k+1} = P_{bb,k} - P_{ab,k}^T ψ_{n+1} P_{ab,k}
13: P_{ab,k+1} = φ_{n+1} P_{ab,k}
14: x_{k+1} = [X_{a,k+1}; X_{b,k+1}]
15: P_{k+1} = [P_{aa,k+1}, P_{ab,k+1}; P_{ab,k+1}^T, P_{bb,k+1}]
Iterate with x_{k+1} and P_{k+1}
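A schematic numerical sketch of the bookkeeping in Table 3.4, as reconstructed here: a plain KF step on the active region accumulates φ, ψ and θ, and a global update then propagates the result to region B. The dimensions, models, values and the use of the measurement vector in the θ update are illustrative assumptions and should be checked against [20].

```python
import numpy as np

def cekf_local_iteration(x, P, z, F, H, u, Q, R, phi, psi, theta):
    """One active-region iteration: a plain KF step on the local state
    plus accumulation of the auxiliary variables phi, psi and theta."""
    x_pred = F @ x + u
    P_pred = F @ P @ F.T + Q
    nu = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    Sinv = np.linalg.inv(S)
    W = P_pred @ H.T @ Sinv
    x_new = x_pred + W @ nu
    P_new = P_pred - W @ S @ W.T
    # Auxiliary accumulation (cf. lines 10-14 of Table 3.4).
    beta = H.T @ Sinv @ H
    mu = P_new @ beta
    phi_new = (np.eye(len(x)) - mu) @ phi
    psi_new = psi + phi.T @ beta @ phi
    theta_new = theta + phi_new.T @ H.T @ Sinv @ z
    return x_new, P_new, phi_new, psi_new, theta_new

def cekf_global_update(x_a, P_aa, X_b, P_ab, P_bb, phi, psi, theta):
    """Global update (cf. lines 09-15): propagate the local result to region B."""
    X_b_new = X_b + P_ab.T @ theta
    P_bb_new = P_bb - P_ab.T @ psi @ P_ab
    P_ab_new = phi @ P_ab
    x_full = np.concatenate([x_a, X_b_new])
    P_full = np.block([[P_aa, P_ab_new], [P_ab_new.T, P_bb_new]])
    return x_full, P_full

# Usage: a 2-state active region, a 2-state passive region, two local steps.
x_a, P_aa = np.zeros(2), np.eye(2)
phi, psi, theta = np.eye(2), np.zeros((2, 2)), np.zeros(2)
F, H, u = np.eye(2), np.array([[1.0, 0.0]]), np.zeros(2)
Q, R = 0.01 * np.eye(2), np.array([[0.5]])
for z in (np.array([0.2]), np.array([0.3])):
    x_a, P_aa, phi, psi, theta = cekf_local_iteration(
        x_a, P_aa, z, F, H, u, Q, R, phi, psi, theta)
X_b, P_ab, P_bb = np.zeros(2), 0.1 * np.ones((2, 2)), np.eye(2)
x_full, P_full = cekf_global_update(x_a, P_aa, X_b, P_ab, P_bb, phi, psi, theta)
```

Note that the local loop never touches X_b, P_ab or P_bb; the cost of each step depends only on the size of the active region, which is the point of the compression.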

to divergence of the filter. The Unscented Kalman Filter (UKF) addresses the approximation issues of the EKF. The state distribution is again represented by a GRV, but is now specified using a minimal set of carefully chosen sample points. These sample points completely capture the true mean and covariance of the GRV, and when propagated through the true nonlinear system, they capture the posterior mean and covariance accurately to the 3rd order (of the Taylor series expansion) for any nonlinearity. In order to do that, the unscented transform is used.

Into the Maths: the Unscented Transform

The Unscented Transform (UT) is a method for calculating the statistics of a random variable which undergoes a nonlinear transformation [49]. Consider propagating a random variable x (of dimension L) through a nonlinear function y = f(x). Assume x has mean x̄ and covariance P_x. To calculate the statistics of y, we form a matrix χ of 2L + 1 sigma vectors χ_i according to (3.14):

χ_0 = x̄
χ_i = x̄ + (√((L + λ) P_x))_i,        i = 1, ..., L   (3.14)
χ_i = x̄ - (√((L + λ) P_x))_{i-L},    i = L + 1, ..., 2L

where λ = α²(L + κ) - L is a scaling parameter. The constant α determines the spread of the sigma points around x̄ and is usually set to a small positive value, for instance 1e-4. The constant κ is a secondary scaling parameter, usually set to 0, and β is used to incorporate prior knowledge of the distribution of x (for Gaussian distributions β = 2). (√((L + λ) P_x))_i denotes the i-th column of the matrix square root, which should be computed using a Cholesky factorization. Once the sigma vectors have been calculated, they are propagated through the nonlinear function (3.15):

Y_i = f(χ_i),   i = 0, ..., 2L   (3.15)

and from a weighted sample of them, the mean ȳ (3.16) and the covariance P_y (3.17) are approximated. The weights W_i used are given by (3.18).

ȳ ≈ Σ_{i=0}^{2L} W_i^{(m)} Y_i   (3.16)

P_y ≈ Σ_{i=0}^{2L} W_i^{(c)} {Y_i - ȳ}{Y_i - ȳ}^T   (3.17)

W_0^{(m)} = λ / (L + λ)
W_0^{(c)} = λ / (L + λ) + (1 - α² + β)   (3.18)
W_i^{(m)} = W_i^{(c)} = 1 / (2(L + λ)),   i = 1, ..., 2L

The UT is able to approximate with 3rd-order accuracy for Gaussian inputs for all nonlinearities. For non-Gaussian inputs, the approximations are accurate to at least the 2nd order, with the accuracy of the 3rd and higher order moments determined by the choice of α and β. An example of the superior performance of the UT with respect to the EKF and the true mean and covariance is depicted in Figure 3.3.

Figure 3.3: Example of UT for mean and covariance propagation: a) actual, b) first-order linearization (EKF), c) UT.

Unscented Kalman Filter Algorithm

The UKF is an extension of the UT to recursive estimation. An advantage is that no explicit calculation of Jacobians or Hessians is necessary to implement the algorithm. Furthermore, the overall number of computations is of the same order as for the EKF. This implementation is given

in Table 3.5. The UKF algorithm has two main stages: prediction and update. First, the weighting variables and the sigma points are computed in order to perform the unscented transform. Then the prediction stage begins, first predicting the sigma points χ̂_{k+1} for the next step k + 1, which are then used in the estimation of the state vector x̂_{k+1} and the covariance matrix P̂_{k+1}. The observation estimates Z_{k+1} corresponding to the predicted sigma points are used to predict the measurement ẑ_{k+1}. At this point the update begins with the computation of the covariance P_zz and cross-covariance P_xz matrices, which are then used to find the Kalman gain K_{k+1}. The last two steps correct the state vector and the covariance matrix, which are the inputs for the next UKF iteration.

Table 3.5: Unscented Kalman Filter algorithm
Unscented Kalman Filter(x_k, P_k, z_{k+1}):
01: Compute the weights as in (3.18)
Calculate sigma points
02: γ = √(L + λ)
03: χ_k = [x_k,  x_k + γ√P_k,  x_k - γ√P_k]
Prediction (estimate)
04: χ̂_{k+1} = F_k χ_k + u_k
05: x̂_{k+1} = Σ_{i=0}^{2L} W_i^m χ̂_{i,k+1}
06: P̂_{k+1} = Σ_{i=0}^{2L} W_i^c [χ̂_{i,k+1} - x̂_{k+1}][χ̂_{i,k+1} - x̂_{k+1}]^T + Q_k
07: Z_{k+1} = H_k χ̂_{k+1}
08: ẑ_{k+1} = Σ_{i=0}^{2L} W_i^m Z_{i,k+1}
Update (correction)
09: P_zz = Σ_{i=0}^{2L} W_i^c [Z_{i,k+1} - ẑ_{k+1}][Z_{i,k+1} - ẑ_{k+1}]^T + R_k
10: P_xz = Σ_{i=0}^{2L} W_i^c [χ̂_{i,k+1} - x̂_{k+1}][Z_{i,k+1} - ẑ_{k+1}]^T
11: K_{k+1} = P_xz P_zz^{-1}
12: x_{k+1} = x̂_{k+1} + K_{k+1} (z_{k+1} - ẑ_{k+1})
13: P_{k+1} = P̂_{k+1} - K_{k+1} P_zz K_{k+1}^T
Iterate with x_{k+1} and P_{k+1}

3.6 Information Filters

Introduction

Gaussians can be represented in two different ways: the moments representation and the canonical representation. The moments representation consists of the mean (first moment)

and the covariance (second moment) of the Gaussian. The canonical, or natural, representation consists of an information matrix and an information vector. Both representations are duals of each other, and each can be recovered from the other via matrix inversion. When applied to robotics problems, the Information Filter (IF) possesses several advantages over the KF. For example, representing global uncertainty is simple in the information filter: when using moments, such global uncertainty amounts to a covariance of infinite magnitude, which is especially problematic when sensor measurements carry information about only a strict subset of all state variables, a situation often encountered in robotics. Incorporating measurements in a KF-based algorithm is considerably more costly than doing the same in an IF. Furthermore, the IF tends to be numerically more stable than the KF. These advantages are offset by important limitations. A primary disadvantage is the need to recover a state estimate in the update step when applied to nonlinear systems. This step, if implemented as stated here, requires the inversion of the information matrix; in many robotics problems, the EKF does not involve the inversion of matrices of comparable size. IFs are not widely used because, for high-dimensional state spaces, they seem to be computationally inferior to the KF.

Into the Maths: Canonical Representation

The moments representation of a multivariate Gaussian is given by a matrix Σ expressing the covariance and a vector µ defining the mean. The probability density of such a normal distribution is presented in (3.19); rearranging the term inside the exponential, the equation becomes (3.20).
p(x) = |2πΣ|^(−1/2) exp{−(1/2)(x − µ)^T Σ^(−1)(x − µ)}   (3.19)

p(x) = η exp{−(1/2) x^T Σ^(−1) x + x^T Σ^(−1) µ},   with η = |2πΣ|^(−1/2) exp{−(1/2) µ^T Σ^(−1) µ}   (3.20)

This suggests the canonical form (3.21) of the Gaussian, composed of Ω and ξ, leading to the final probability function (3.22), more compact than the moments representation; note that the cost of evaluating the exponent grows quadratically with the dimension of the variable x (i.e., in the SLAM framework, with the size of the state vector).

Ω = Σ^(−1),   ξ = Σ^(−1) µ   (3.21)
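As a concrete instance of (3.21), the following sketch converts a 2D Gaussian between the moments and canonical representations and back, using plain 2x2 linear algebra to stay dependency-free; the numerical values are purely illustrative.

```python
# Moments ↔ canonical conversion per (3.21): Ω = Σ⁻¹ and ξ = Σ⁻¹µ;
# the moments are recovered as Σ = Ω⁻¹ and µ = Ω⁻¹ξ.

def inv2(A):
    # closed-form inverse of a 2x2 matrix
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def to_canonical(mu, Sigma):
    Omega = inv2(Sigma)
    xi = matvec(Omega, mu)
    return xi, Omega

def to_moments(xi, Omega):
    Sigma = inv2(Omega)
    mu = matvec(Sigma, xi)
    return mu, Sigma

mu = [1.0, -2.0]
Sigma = [[2.0, 0.5], [0.5, 1.0]]
xi, Omega = to_canonical(mu, Sigma)
mu2, Sigma2 = to_moments(xi, Omega)   # round trip recovers mu and Sigma
```

The round trip makes the duality of the two representations explicit: no information is lost in either direction, only matrix inversions are involved.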

p(x) = η exp{−(1/2) x^T Ω x + x^T ξ}   (3.22)

Information Filter Algorithm

The IF adapted to the SLAM context is presented in Table 3.6. The inputs to the algorithm are the information vector y_k, the information matrix Y_k and the current observation data z_k+1. The information vector and matrix are the duals of the state vector x_k and the covariance matrix P_k, and are computed prior to the iterative process (3.23).

Y_k = P_k^(−1),   y_k = P_k^(−1) x_k   (3.23)

The IF algorithm comprises three steps: prediction, observation and update. The update step is clearly less expensive than that of a plain KF, because no inversions are involved. However, the prediction step becomes more costly due to the inversion operations. Despite these differences, a single IF iteration is believed to be less time consuming than a single KF iteration. One of the greatest advantages of the IF is that N measurements can be filtered at each time step by simply summing their information matrices and vectors.

Table 3.6: Information Filter algorithm
Information Filter: (y_k, Y_k, z_k+1)
Prediction (estimate)
1: Ŷ_k+1 = (F_k Y_k^(−1) F_k^T + Q_k)^(−1)
2: ŷ_k+1 = Ŷ_k+1 (F_k Y_k^(−1) y_k + u_k)
Observation
3: I_k = H_k^T R_k^(−1) H_k
4: i_k = H_k^T R_k^(−1) z_k+1
Update (correction)
5: y_k+1 = ŷ_k+1 + i_k
6: Y_k+1 = Ŷ_k+1 + I_k
Iterate with y_k+1 and Y_k+1
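For a scalar linear system, one iteration of Table 3.6 can be written out and checked against a plain KF update, since the two filters are algebraically equivalent in that case; all parameter values below are illustrative, not taken from the thesis experiments.

```python
# One scalar information filter iteration following Table 3.6, with
# Y = 1/P and y = x/P as in (3.23).
def if_step(y, Y, z, F=1.0, u=0.0, Q=0.01, H=1.0, R=0.1):
    Y_pred = 1.0 / (F * (1.0 / Y) * F + Q)        # step 1
    y_pred = Y_pred * (F * (1.0 / Y) * y + u)     # step 2 (F Y⁻¹ y = F x)
    I = H * (1.0 / R) * H                         # step 3
    i = H * (1.0 / R) * z                         # step 4
    return y_pred + i, Y_pred + I                 # steps 5-6

# Plain scalar KF for comparison.
def kf_step(x, P, z, F=1.0, u=0.0, Q=0.01, H=1.0, R=0.1):
    x_p, P_p = F * x + u, F * P * F + Q
    K = P_p * H / (H * P_p * H + R)
    return x_p + K * (z - H * x_p), (1 - K * H) * P_p

# Same prior and measurement through both filters.
x0, P0, z = 0.2, 1.0, 0.5
y1, Y1 = if_step(x0 / P0, 1.0 / P0, z)
x1, P1 = kf_step(x0, P0, z)
# Recover the moments from the information form: x = y / Y, P = 1 / Y.
```

Note how the update (steps 5-6) is a plain sum, while the prediction (steps 1-2) carries the inversions, matching the cost discussion above.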

Chapter 4

Simulations and Results

This chapter presents the experiments performed to validate the different filtering algorithms described in the previous chapters. These experiments have been conducted on synthetic data for a series of different scenarios (1D, 2D and 3D problems) and under various operating conditions, for example measuring only the vehicle's velocity, its trajectory or both, or using a limited-range sensor. These experiments allow comparing the accuracy of each technique with respect to the ground truth. Both qualitative and quantitative results are computed and analyzed, bringing insightful information concerning the potential and robustness of each method surveyed. First, several simulation issues are described. Then, some particular details of each implementation are presented. Finally, results are given and compared for each implementation and simulation.

4.1 Simulation Issues

In order to conduct the simulations, four different toolboxes sharing the same structure have been developed, one for each filtering algorithm. These toolboxes follow the idea introduced by Joaquim Salvi in his KF Toolbox¹. Salvi's toolbox begins with very simple 1D examples with a motionless robot observing all landmarks (see Figure 4.2), continues by increasing the complexity in several 2D simulations (see Figure 4.8), and ends up with more complete 3D examples (see Figure 4.18), in which the vehicle has limited-range sensors and the landmarks are only incorporated into the system once observed.

¹ available on his web page: eia.udg.edu/~qsalvi

Assumptions

In all implementations several assumptions are made:

1. Both the process model and the observation model are always linear. For this reason no results from extended (nonlinear) models are shown, which remains for future work.

2. All sources of noise are defined as additive Gaussian noise with zero mean, whose variance depends on the nature of the noise. Ideally, this noise should be as similar as possible to its real-world equivalent; for instance, the landmark measurement noise should match the uncertainty of the sensor that measures the landmarks, and the odometry noise should be as uncertain as the measurements provided by a Doppler Velocity Log (DVL). Although Gaussian noise is added to the models, synthetic data does not always reflect all the problems of real data. This is another aspect to be studied further in the future.

3. The data association problem is addressed by using prior knowledge of the synthetic scenarios, which essentially amounts to assuming perfect data association. In the real world this trick is not possible, so data association algorithms must be faced.

Desired Information

Running a series of predefined simulations should return results allowing the following features of each algorithm to be evaluated:

1. Noise effect on the system. It is well known that underwater environments are considerably noisy, which increases the uncertainty of all measurements. For this reason it is interesting to know how the algorithms behave when dealing with different amounts of noise.

2. Accuracy, which can be computed by means of the distance between the estimates and the known ground truth of the synthetic data. It would also be interesting to know which of the methods converges faster.

3. Computational complexity, which can be determined by observing the computation time spent in each step of the simulation. Another good indicator is the size of the state vector and matrix.

4.
Map complexity, which is strongly related to the algorithm's performance on large scenarios, densely populated landmark maps, long navigation periods and loop-closing situations.

PC specifications

The experiments conducted in this work have been run on a PC equipped with a 64-bit Core2Duo 2.66 GHz CPU and 4 GB of RAM.
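The additive zero-mean Gaussian noise of assumption 2 can be sketched for the 1D corridor scenario as below; the landmark positions follow the 1D setup used later in this chapter, while the noise variances, speeds and function names are illustrative assumptions.

```python
import random

# Synthetic 1D data generation: the true motion is perturbed by process
# noise, and each landmark is measured relative to the robot with its own
# additive Gaussian measurement noise.
def simulate_1d(steps, dt=1.0, velocity=1.0, p_n=0.01, lm_n=0.1, seed=0):
    rng = random.Random(seed)
    landmarks = [10.0, 12.0, 19.0, 31.0, 42.0]   # fixed corridor positions
    truth, observations = [], []
    x = 0.0
    for _ in range(steps):
        # zero-mean Gaussian process noise with variance p_n
        x += velocity * dt + rng.gauss(0.0, p_n ** 0.5)
        truth.append(x)
        # relative landmark measurements with variance lm_n
        observations.append([lm - x + rng.gauss(0.0, lm_n ** 0.5)
                             for lm in landmarks])
    return landmarks, truth, observations

landmarks, truth, obs = simulate_1d(steps=50)
```

Seeding the generator makes each simulation repeatable, which is convenient when the same setup must be rerun while only one noise source is varied.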

4.2 Implementation Issues

The algorithms described in Chapter 3 have been implemented in Matlab 7.6 (R2008a). Several implementation issues have been faced and solved, for instance the data association problem, or the update of all matrices and vectors involved in the algorithms when the state is augmented with new landmarks.

KF implementation particular details

A KF algorithm can be easily implemented; however, the data association problem must be addressed. Since data association is not the main topic of interest of this thesis, known data association is assumed. This assumption translates into the implementation by means of several vectors that store the history of the landmarks observed so far, together with the landmarks detected and not detected at the current time step. These vectors are then used to identify when a landmark is observed for the very first time, and also allow augmenting the matrices and vectors involved in the algorithm. This implementation trick is also used in all other approaches.

IF implementation particular details

The IF algorithm contains several matrix inversions. Those that do not depend on the current state are computed only once, to speed up the process. However, it is not possible to eliminate two auxiliary inversions at the end of each iteration. These computations are necessary to augment the information matrix Y_k involved in the process: new columns and rows need to be added, and their values must be properly chosen, for instance using the initial landmark measurement and odometry noises. Unfortunately, the matrix Y_k is the inverse of the state covariance matrix P_k, and adding the inverse of a single value into a matrix does not produce the same result as adding the value and then inverting the whole matrix.
For this reason it is necessary to first convert the information matrix Y_k back to its covariance form P_k, then augment it, and finally return to the information form.

UKF implementation particular details

The UKF algorithm computes the unscented transform at each time step. One of the steps of this transformation is the square root of a matrix, which should be calculated using numerically efficient and stable methods, for instance the Cholesky decomposition. After examining the implementation itself, it seems that the potential of this approach would be better demonstrated on nonlinear cases.
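A minimal sketch of that Cholesky-based matrix square root, and its use to spread sigma points around the mean (step 03 of the UKF algorithm), is given below; the matrix values and function names are illustrative.

```python
import math

# Cholesky decomposition: returns lower-triangular G with G Gᵀ = A for a
# symmetric positive-definite A.
def cholesky(A):
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(G[i][k] * G[j][k] for k in range(j))
            G[i][j] = (math.sqrt(A[i][i] - s) if i == j
                       else (A[i][j] - s) / G[j][j])
    return G

# Sigma points: the mean plus/minus the scaled columns of the square root.
def sigma_points(x, P, gamma):
    G = cholesky(P)
    n = len(x)
    pts = [list(x)]
    for j in range(n):
        pts.append([x[i] + gamma * G[i][j] for i in range(n)])
    for j in range(n):
        pts.append([x[i] - gamma * G[i][j] for i in range(n)])
    return pts

P = [[4.0, 2.0], [2.0, 3.0]]
pts = sigma_points([1.0, 0.0], P, gamma=math.sqrt(3.0))  # 2L + 1 = 5 points
```

In Matlab the same factorization is available through the built-in `chol` function, which is the natural choice for the toolbox implementation.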

CKF implementation particular details

The CKF algorithm computes simple KFs for small local regions and then updates the data into a global map. For this reason it is necessary to decide the size and shape of a local region, and to define when a new local region begins. Initially, the size of the local area has been chosen as three times the sensor range, because it is big enough to allow computing local regions with a few landmarks, yet sufficiently small to properly limit the complexity of each local KF problem within it. For simplicity reasons the shape is square. The question that remains is when to start a new region. Two different options come up: the first starts a new local KF once the robot crosses the boundaries of the current active region, which produces overlap between two consecutive regions (see Figure 4.1, left); the second consists of initializing a new local region only when the vehicle has gone far enough to avoid overlap (see Figure 4.1, right).

Figure 4.1: Example of how active regions are defined in a CKF. Left: overlapping. Right: avoiding overlap.

4.3 Results

SLAM in 1D

A 1 degree of freedom (DOF) mobile robot moves along a straight line detecting some motionless landmarks. The SLAM algorithms compute the position and velocity of the robot and the position of the landmarks. In this example, all measurements are relative to the robot pose, all landmarks are seen during the whole simulation, and all algorithms tested are linear, i.e. KF, IF and UKF. Figure 4.2 shows the synthetic data setup for the 1D simulations, emulating a straight corridor with several salient objects (landmarks) located at fixed positions along a

line (i.e. at 10, 12, 19, 31, 42, 50, 60, 73, 80 and 91 meters), while the vehicle moves forward and backward along the direction of the landmarks. Several parameters must be tuned, such as the process noise, the landmark measurement uncertainty, and the robot pose and velocity measurement noise.

Figure 4.2: Left: 1 DOF SLAM simulation setup. Right: example of simulation results along time.

Process Noise

The process noise (p_n) is assumed to be Gaussian with zero mean and a certain variance. Depending on the order of this variance, each algorithm behaves differently. Obviously, in reality noise sources cannot be tuned that easily, but with synthetic data the noise can be defined by the user; therefore, for simulation purposes, a proper value of p_n is chosen. In order to find out which value of p_n is proper, a set of simulations is conducted, varying only this noise from simulation to simulation (the other noises are set to constant nonzero values). Very high and very low values of process noise are tested, approaching the upper limit (p_n → ∞) and the lower one (p_n → 0). These results are presented in Table 4.1, which summarizes the maximum and mean errors of each simulation.

Table 4.1: Effect of the process noise on each algorithm
          Low (p_n → 0)     p_n = 0.01       High p_n
method    mean     max      mean     max     mean     max
KF
IF
UKF
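The accuracy figures reported in tables like Table 4.1, together with the variability rate used later for tuning p_n, can be sketched as follows; the signal values are illustrative, standing in for a smooth-but-biased low-p_n run and a noisy unbiased high-p_n run.

```python
# Error is the distance between estimate and ground truth (mean and maximum
# are the quantities tabulated); the variability rate is the mean absolute
# time-gradient of the estimates.
def error_metrics(estimates, ground_truth, dt=1.0):
    errors = [abs(e - g) for e, g in zip(estimates, ground_truth)]
    grads = [abs(estimates[i + 1] - estimates[i]) / dt
             for i in range(len(estimates) - 1)]
    return (sum(errors) / len(errors), max(errors),
            sum(grads) / len(grads))

truth = [10.0] * 6                              # constant ground truth, 10 m
smooth = [9.0, 9.2, 9.4, 9.5, 9.6, 9.7]         # low p_n: slow, biased
noisy = [10.4, 9.6, 10.3, 9.8, 10.2, 9.9]       # high p_n: oscillating
mean_s, max_s, var_s = error_metrics(smooth, truth)
mean_n, max_n, var_n = error_metrics(noisy, truth)
# low p_n yields larger error but lower variability; high p_n the opposite.
```

This pair of metrics is exactly the trade-off used below to pick a process noise value: minimizing error alone favors high p_n, minimizing variability alone favors low p_n.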

In addition, qualitative results of these experiments are shown in Figure 4.3, where the left plots correspond to a simulation with low p_n and the right plots to one with high p_n; in both cases the error is plotted at the bottom. From these images it can be said that the process noise is important in terms of the speed of reaction of the system. For instance, a low p_n makes the system considerably slow, so once the state is perturbed it takes a really long time to converge back to the ground truth. Conversely, a high p_n produces a constant oscillation of the estimates, and the algorithms become hypersensitive to any small perturbation. Both situations must be avoided.

Figure 4.3: Behavior of the system under different rates of p_n. Only the vehicle trajectory and a landmark located at 12 m are plotted. Left: low process noise. Right: high process noise. Top: the simulation itself. Bottom: evolution of the error during the simulation.

In order to find an appropriate value for the variance of the process noise, both the error and the variability rate are computed and shown in Figure 4.4. The error is computed as the distance from the estimates to the ground truth, as in the previous tests. The variability rate is obtained by means of the gradient (i.e. the derivative) of the estimates along time. All methods react similarly to different levels of process noise; for instance, a high p_n produces a fall in the error but increases the oscillation rate. A good trade-off between these two quantities suggests p_n < 1. The right plot in Figure 4.4 contains a zoom of the best function fitting the curves, and the intersection between the error curve and the variability curve is found around p_n

Figure 4.4: Normalized error and normalized gradient. Left plot: simulation over a wide range of p_n. Right plot: zoom over the low range of p_n.

Landmark Measurement Noise

In order to evaluate the effect of the landmark measurement noise (lm_n) on each of the methods, the simulations are run at the limits, with very high and very low noise values. The results of these simulations are provided in Figure 4.5. The left plot shows that for a very low noise level (lm_n → 0), the estimates of both the robot pose and the landmark positions tend to the ground truth. In this example the noise level is really low, and it suggests that the UKF performs slightly better than the KF and IF, which seem to carry a certain offset error from the beginning. However, all methods perform similarly when the noise values become higher, as depicted in the right plot. The error increases with lm_n; thus the uncertainty of the landmark measurement is of vital importance for the stability and convergence of the system. In order to choose a realistic noise source, it seems a good idea to read the precision parameters in the specifications of existing technology. For instance, in the current thesis vision sensors are supposed to be in charge of the landmark measurement task, therefore the noise used in the simulations should be chosen according to the vision system.

Odometry Noise

Odometry is normally used as a correction input in the update stage. For this reason it is interesting to have as many data sources as possible, for instance the orientation (heading), roll and pitch, velocity and position of the vehicle. All this information comes from different sensors, which have different precisions and ranges; therefore each source of information must be simulated with its corresponding noise. In the examples, only synthetic position and velocity data is used.
See Table 4.2 for the specifications of various underwater sensor systems (adapted from [57] and [3]).

Figure 4.5: Landmark measurement noise evaluation. Left: low noise. Right: high noise. Top: the simulation itself. Bottom: error plot.

Table 4.2: Current underwater navigation sensors and systems [57] [3].
INSTRUMENT            VARIABLE          UPDATE RATE      PRECISION    RANGE
Acoustic Altimeter    Z - Altitude      varies           m            varies
Pressure Sensor       Z - Depth         medium: 1 Hz     0.01%        full-ocean
Inclinometer          Roll, Pitch       fast: 1-10 Hz                 ±45°
Magnetic Compass      Heading           medium: 1-2 Hz
Gyro Compass          Heading           fast: 1-10 Hz
Ring-Laser Gyro       Heading           fast
Bottom-Lock Doppler   XYZ - Velocity    fast: 1-5 Hz     %            m
12 kHz LBL*           XYZ - Position    varies           m            5-10 km
300 kHz LBL           XYZ - Position                     ±0.002 m     100 m
*LBL stands for long-baseline

It seems clear that higher noise levels produce worse estimates, and hence more unstable results, while really low noise sources return almost the ground truth, as presented in Figure 4.6. Also, there are no significant differences between the methods when dealing with different odometric uncertainty values.

Comparison of Algorithms

Assuming that the noise levels are appropriately chosen, various tests are conducted in order to compare the three methods KF, IF and UKF. Note that the CKF is exactly the same as a KF in

Figure 4.6: Vehicle position and velocity measurement noise evaluation. Left: high noise. Right: low noise.

this example, because all landmarks are observed during the whole simulation, and also because there are only a few landmarks. Figure 4.7 depicts a comparison between the three algorithms. At each time step the algorithms estimate the position and velocity of the vehicle and the position of the landmarks; these estimates are far from the ground truth at the beginning and after a perturbation of the system (for instance, a change in the vehicle's motion), but show a clear trend towards the ground truth as time goes by, suggesting that all methods converge. In the bottom plot, the error committed at each time step is drawn, with its maxima and means. Quantitative results are presented in Table 4.3. These results are obtained by running exactly the same setup several times (for instance, 100 times) and then computing the mean values over all simulations. The three algorithms perform similarly in terms of accuracy, because the three have similar error rates, and qualitatively they behave almost the same. The results suggest that the UKF performs slightly better, because its maximum and mean errors are slightly lower than the rest. However, as far as time consumption is concerned, the KF is clearly the best, while the UKF requires longer times at each step, mainly due to the time spent computing the unscented transform. The IF, as expected, has a very low update time, but at the cost of higher prediction times.

Table 4.3: Quantitative comparison of the 1 DOF SLAM problem, using KF, IF and UKF.
          Error (m)       Max times (ms)        Mean times (ms)
method    mean    max     predict   update      predict   update   per cycle
KF
IF
UKF

Figure 4.7: Comparison between KF, IF and UKF in a 1 DOF SLAM problem. Top: the simulation itself. Bottom: evolution of the error during the simulation.

Finally, it would be interesting to evaluate whether any of the methods converges faster. However, from the figures shown so far all methods appear to perform similarly: under a certain perturbation all methods suffer a great disturbance, but after several time steps they tend back to the ground truth in a similar way.

SLAM in 2D

A 2 DOF mobile robot moves along a path detecting some motionless landmarks, which are introduced into the system once observed. The range of the sensors is limited. The position and velocity of the vehicle and the landmark poses are computed by means of Simultaneous Localization and Mapping (SLAM) using linear algorithms, i.e. KF, IF, UKF and CKF. Different synthetic setups are simulated in 2D, from which valuable information about the accuracy, computational cost and performance is expected.

First Simulation Set: Noise Evaluation

The first setup, shown in Figure 4.8, is a simple square map of 100 m x 100 m in which 9 landmarks are located at fixed positions. A short trajectory of the robot is simulated. With this simple setup, the effect of changing the noise sources is evaluated.
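The limited-range sensing used in these 2D simulations can be sketched as below: only landmarks within the sensor range are observed, and each landmark enters the system the first time it is seen. The positions, range and function names are illustrative, not the exact values of the experiments.

```python
import math

# Return the indices of the landmarks currently within sensor range.
def visible_landmarks(robot, landmarks, sensor_range):
    return [i for i, (lx, ly) in enumerate(landmarks)
            if math.hypot(lx - robot[0], ly - robot[1]) <= sensor_range]

landmarks = [(10.0, 10.0), (50.0, 50.0), (90.0, 90.0)]  # inside a 100 m map
seen = set()
for robot in [(0.0, 0.0), (45.0, 45.0), (95.0, 95.0)]:  # poses along the path
    for i in visible_landmarks(robot, landmarks, sensor_range=20.0):
        seen.add(i)   # first sighting: landmark i would augment the state here
```

This is the mechanism that makes the state vector grow along the trajectory, which is what the later computational-cost plots measure.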

Figure 4.8: 2 DOF simulation, first series of experiments setup.

The setup shown in Figure 4.8 is simulated using the KF algorithm while varying the process noise. The behavior of the system is presented in Figure 4.9, showing the evolution of the robot position and the landmark estimates along the whole simulation. In the leftmost plot (with p_n → 0) it can be seen that convergence is too slow. In the middle cases (including p_n = 0.01) the algorithm performs acceptably well; the main difference between the two lies in the smoothness of the response and the time required to converge after a perturbation. In the rightmost plot (with high p_n), the vehicle trajectory estimate oscillates strongly around the ground truth, the landmark location estimates are considerably uncertain, and this uncertainty increases dramatically.

Figure 4.9: 2 DOF simulation: from left to right, the same simulation with different process noise. Red: estimated trajectory of the vehicle; green: ground truth; blue circles: uncertainty of the landmark estimates; pink: estimates of the landmark poses.

As previously seen in the 1D examples, it seems clear that a low p_n reduces the speed of reaction of the system, while a high p_n produces high-frequency oscillations around the ground truth. However, the results suggest that the landmark position estimates are strongly unstable under high process noise. This outcome is depicted in Figure 4.10. The top left plot compares the trajectory of the vehicle under different p_n levels, and the distances from these trajectories to the ground truth are drawn in the bottom left plot. It seems that the trajectory of the robot is better estimated with a high p_n. The graphics on the right present information on the mean error and uncertainty of all landmarks. The upper plot suggests that a low process noise does

not properly estimate the uncertainty boundaries. The system becomes overconfident: the uncertainty ellipses are kept considerably small compared to the error, with the consequence that a landmark may lie outside the 3σ ellipse. On the other hand, the bottom right plot shows that a high p_n leads to an insecure system: the uncertainty boundaries increase so rapidly that, according to them, a single landmark could be located anywhere on the map.

Figure 4.10: Comparison of the behavior of a KF-based SLAM algorithm under different p_n conditions. Left: robot trajectory and its error. Right: landmark position error.

Table 4.4 summarizes quantitative information for all methods, listing the vehicle state and landmark pose errors, and the timing of each step involved in the algorithms. In terms of error, all cases perform similarly, but the computational times are considerably different for each approach.

Table 4.4: Quantitative comparison of all implemented methods for the 2 DOF SLAM simulation.
          Mean error (m)        Computational times (ms)
method    Localize   Map        Predict   DA*   Update   Auxiliary   Per cycle
KF
IF
UKF
CKF
*DA stands for Data Association

This information is also presented graphically in Figure 4.11. The results shown in the left plots suggest that all implemented methods perform similarly in terms of accuracy, for both robot and landmark location, independently of the process noise rate. On the right, it can be seen that the mean time required to complete every stage is also independent of the process noise; this plot represents the mean times of the prediction, data association, update and auxiliary computations. It seems clear that the most time-demanding algorithm is the UKF, because the unscented transform increases the prediction time considerably. The IF approach has the fastest update step (as expected from the theoretical background), but at the cost of a considerably expensive prediction stage. Finally, the most interesting outcome is that the CKF consumes less time in the prediction, data association and update stages than the normal KF. The reason for this improvement is that the CKF only deals with small amounts of data until it finishes with a local region, hence the computations are lighter and cheaper.

Figure 4.11: Summary of the mean error committed by each algorithm. Top-left: robot pose. Bottom-left: landmark location. Right: time consumption of each method under different p_n conditions.

As far as the landmark position, robot pose and velocity measurement noises are concerned, their effect is tested on the KF algorithm and presented in Figure 4.12. The left plot demonstrates that increasing the landmark measurement noise produces larger errors in the mapping stage, because the landmark positions are estimated farther from their ground truth. In addition, the uncertainty ellipses supposed to contain each landmark with a probability of 99% are larger, hence the confidence of the system is lower. However, the robot localization is not significantly affected

by these landmark measurement noise variations. The right plots show the differences between simulations after raising the robot position measurement noise: for low noise, the robot localization stays within a small uncertainty range around its ground truth, while large noise increases these uncertainty boundaries.

Figure 4.12: Left: landmark measurement noise analysis. Right: robot position measurement noise analysis.

Second Simulation Set: Loop Closing

A second setup is depicted in Figure 4.13, where the vehicle closes the loop several times. This setup allows analyzing the performance and behavior when returning to a previously observed region of the map, i.e. facing the data association problem.

Figure 4.13: 2 DOF simulation, second series of experiments setup.

The four algorithms KF, IF, UKF and CKF are tested using this setup. All of them appear to perform properly when closing the loop. Figure 4.14 presents the results obtained after

running the simulation with each approach under exactly the same noise conditions. The top left plot shows the trajectories of the vehicle for each method during the whole simulation. These trajectories are almost equal, hence no significant differences in localization are visible. The bottom left plot depicts the evolution of the mean map error during the simulation, which shows a dramatic peak every time a new landmark is observed for the very first time. This peak is then progressively corrected. The correction is even more important when one of the first visited landmarks is reobserved (i.e. when closing the loop), because all landmarks are updated and corrected accordingly. Every time the robot closes a loop, the mean error tends to zero in all approaches, as shown in the bottom left plot of Figure 4.14. In the right plots, the computational cost is drawn by means of the size of the state vector (top) and the time spent in each stage (bottom). It seems that the best approach is the CKF, because it performs every single iteration faster, and because the vectors and matrices involved in the computations are always smaller.

Figure 4.14: Algorithm performance when facing loop closing. Top-left: vehicle trajectory. Bottom-left: robot localization mean error. Top-right: evolution of the state size, as a measure of the rise in computational cost. Bottom-right: time consumption.

No numerical data is given for this experiment, because the values follow the same structure as those in Table 4.4 and do not provide any significant additional information. Since known data association is assumed, it is not necessary to present further results on the performance of the methods against noise, because the behavior is similar whether or not loops are closed.

Third Simulation Set: Highly Populated Map

A third setup is shown in Figure 4.15. This map contains a high population of landmarks (81 landmarks) in a map of 200 m x 200 m, and permits evaluating the performance of each method, specifically the increase in computational cost when dealing with large state vectors and matrices.

Figure 4.15: 2 DOF simulation, third series of experiments setup.

It seems obvious that the best approach is the CKF, because at every iteration it only handles a few landmarks, i.e. those inside the active local region (see Figure 4.16, top left), while the rest of the algorithms carry all the landmarks seen so far. For this reason, the mean time per cycle is considerably lower for the CKF (see Figure 4.16, bottom left). In addition, in the CKF case almost all iteration stages require the same time (see Figure 4.16, right). On the contrary, the prediction step of the IF and UKF becomes more time consuming every time a new landmark is added to the system, and the time needed for the KF or UKF update step increases in a similar way. For all methods except the CKF, the data association step becomes more time demanding as the state dimension increases. Further results are presented in Figure 4.17 in order to contrast the KF and the CKF. In this set, the idea is to test both algorithms with a very high number of landmarks (400, as drawn on the left). It seems clear that the CKF outperforms the KF in terms of computational cost, because the prediction and update steps of the KF become slower as the state is augmented with new landmarks (see Figure 4.17, bottom central plot), while the CKF spends almost the same time independently of the number of landmarks seen so far (see Figure 4.17, top right). In addition, the mean time of a single KF iteration (see Figure 4.17, bottom right plot) is now considerably higher than that of a single CKF iteration.
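The active-region bookkeeping that gives the CKF this advantage can be sketched as below, with square regions of side three times the sensor range and the two restart policies of Figure 4.1; the function names and 2D geometry are illustrative assumptions.

```python
# Square active region of side 3 * sensor_range, centred on the pose where
# the region was started.
def in_region(pos, centre, sensor_range):
    half = 1.5 * sensor_range
    return (abs(pos[0] - centre[0]) <= half and
            abs(pos[1] - centre[1]) <= half)

def maybe_start_region(pos, centre, sensor_range, overlap=True):
    """Return the (possibly new) region centre for the current robot pose.
    overlap=True : restart as soon as the robot leaves the active square,
                   so consecutive regions overlap (Figure 4.1, left).
    overlap=False: restart only once the robot is a full region side away,
                   avoiding overlap (Figure 4.1, right)."""
    if overlap:
        return pos if not in_region(pos, centre, sensor_range) else centre
    far = 3.0 * sensor_range
    if abs(pos[0] - centre[0]) >= far or abs(pos[1] - centre[1]) >= far:
        return pos
    return centre

centre = (0.0, 0.0)
centre = maybe_start_region((20.0, 0.0), centre, sensor_range=10.0)
# the robot has left the 30 m square, so a new region starts at its pose
```

Only the landmarks inside the current square enter the local KF, which is why the per-cycle time stays flat as the global map grows.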

Figure 4.16: Performance when dealing with highly populated maps. Top-left: growth of the state vector size. Bottom-left: time consumption. Right: evolution of time consumption along the whole simulation.

Figure 4.17: 3D SLAM setup.

SLAM in 3D

The final set of experiments simulates the environment drawn in Figure 4.18, which represents a synthetic underwater environment with several salient 3D objects. These 3D objects are defined as the landmarks used in the SLAM problem. All algorithms are tested, and their performance is similar to that described for the 1D and 2D experiments. Figure 4.19 depicts the uncertainty boundaries of the landmark estimates, and the vehicle trajectory, at various time steps of a 3D KF-based SLAM.

Figure 4.18: 3D SLAM simulation setup.

Figure 4.19: 3D SLAM simulation example at different time steps.


More information

Localization and Map Building

Localization and Map Building Localization and Map Building Noise and aliasing; odometric position estimation To localize or not to localize Belief representation Map representation Probabilistic map-based localization Other examples

More information

Overview. EECS 124, UC Berkeley, Spring 2008 Lecture 23: Localization and Mapping. Statistical Models

Overview. EECS 124, UC Berkeley, Spring 2008 Lecture 23: Localization and Mapping. Statistical Models Introduction ti to Embedded dsystems EECS 124, UC Berkeley, Spring 2008 Lecture 23: Localization and Mapping Gabe Hoffmann Ph.D. Candidate, Aero/Astro Engineering Stanford University Statistical Models

More information

Optimization of the Simultaneous Localization and Map-Building Algorithm for Real-Time Implementation

Optimization of the Simultaneous Localization and Map-Building Algorithm for Real-Time Implementation 242 IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 17, NO. 3, JUNE 2001 Optimization of the Simultaneous Localization and Map-Building Algorithm for Real-Time Implementation José E. Guivant and Eduardo

More information

Stereo and Epipolar geometry

Stereo and Epipolar geometry Previously Image Primitives (feature points, lines, contours) Today: Stereo and Epipolar geometry How to match primitives between two (multiple) views) Goals: 3D reconstruction, recognition Jana Kosecka

More information

Particle Filter in Brief. Robot Mapping. FastSLAM Feature-based SLAM with Particle Filters. Particle Representation. Particle Filter Algorithm

Particle Filter in Brief. Robot Mapping. FastSLAM Feature-based SLAM with Particle Filters. Particle Representation. Particle Filter Algorithm Robot Mapping FastSLAM Feature-based SLAM with Particle Filters Cyrill Stachniss Particle Filter in Brief! Non-parametric, recursive Bayes filter! Posterior is represented by a set of weighted samples!

More information

(W: 12:05-1:50, 50-N202)

(W: 12:05-1:50, 50-N202) 2016 School of Information Technology and Electrical Engineering at the University of Queensland Schedule of Events Week Date Lecture (W: 12:05-1:50, 50-N202) 1 27-Jul Introduction 2 Representing Position

More information

Autonomous Mobile Robot Design

Autonomous Mobile Robot Design Autonomous Mobile Robot Design Topic: EKF-based SLAM Dr. Kostas Alexis (CSE) These slides have partially relied on the course of C. Stachniss, Robot Mapping - WS 2013/14 Autonomous Robot Challenges Where

More information

Probabilistic Robotics

Probabilistic Robotics Probabilistic Robotics FastSLAM Sebastian Thrun (abridged and adapted by Rodrigo Ventura in Oct-2008) The SLAM Problem SLAM stands for simultaneous localization and mapping The task of building a map while

More information

What is the SLAM problem?

What is the SLAM problem? SLAM Tutorial Slides by Marios Xanthidis, C. Stachniss, P. Allen, C. Fermuller Paul Furgale, Margarita Chli, Marco Hutter, Martin Rufli, Davide Scaramuzza, Roland Siegwart What is the SLAM problem? The

More information

Probabilistic Robotics. FastSLAM

Probabilistic Robotics. FastSLAM Probabilistic Robotics FastSLAM The SLAM Problem SLAM stands for simultaneous localization and mapping The task of building a map while estimating the pose of the robot relative to this map Why is SLAM

More information

Direct Methods in Visual Odometry

Direct Methods in Visual Odometry Direct Methods in Visual Odometry July 24, 2017 Direct Methods in Visual Odometry July 24, 2017 1 / 47 Motivation for using Visual Odometry Wheel odometry is affected by wheel slip More accurate compared

More information

CS 231A Computer Vision (Fall 2012) Problem Set 3

CS 231A Computer Vision (Fall 2012) Problem Set 3 CS 231A Computer Vision (Fall 2012) Problem Set 3 Due: Nov. 13 th, 2012 (2:15pm) 1 Probabilistic Recursion for Tracking (20 points) In this problem you will derive a method for tracking a point of interest

More information

Visual Bearing-Only Simultaneous Localization and Mapping with Improved Feature Matching

Visual Bearing-Only Simultaneous Localization and Mapping with Improved Feature Matching Visual Bearing-Only Simultaneous Localization and Mapping with Improved Feature Matching Hauke Strasdat, Cyrill Stachniss, Maren Bennewitz, and Wolfram Burgard Computer Science Institute, University of

More information

Zürich. Roland Siegwart Margarita Chli Martin Rufli Davide Scaramuzza. ETH Master Course: L Autonomous Mobile Robots Summary

Zürich. Roland Siegwart Margarita Chli Martin Rufli Davide Scaramuzza. ETH Master Course: L Autonomous Mobile Robots Summary Roland Siegwart Margarita Chli Martin Rufli Davide Scaramuzza ETH Master Course: 151-0854-00L Autonomous Mobile Robots Summary 2 Lecture Overview Mobile Robot Control Scheme knowledge, data base mission

More information

Robot Localization based on Geo-referenced Images and G raphic Methods

Robot Localization based on Geo-referenced Images and G raphic Methods Robot Localization based on Geo-referenced Images and G raphic Methods Sid Ahmed Berrabah Mechanical Department, Royal Military School, Belgium, sidahmed.berrabah@rma.ac.be Janusz Bedkowski, Łukasz Lubasiński,

More information

Introduction to Mobile Robotics. SLAM: Simultaneous Localization and Mapping

Introduction to Mobile Robotics. SLAM: Simultaneous Localization and Mapping Introduction to Mobile Robotics SLAM: Simultaneous Localization and Mapping The SLAM Problem SLAM is the process by which a robot builds a map of the environment and, at the same time, uses this map to

More information

L17. OCCUPANCY MAPS. NA568 Mobile Robotics: Methods & Algorithms

L17. OCCUPANCY MAPS. NA568 Mobile Robotics: Methods & Algorithms L17. OCCUPANCY MAPS NA568 Mobile Robotics: Methods & Algorithms Today s Topic Why Occupancy Maps? Bayes Binary Filters Log-odds Occupancy Maps Inverse sensor model Learning inverse sensor model ML map

More information

SLAM with SIFT (aka Mobile Robot Localization and Mapping with Uncertainty using Scale-Invariant Visual Landmarks ) Se, Lowe, and Little

SLAM with SIFT (aka Mobile Robot Localization and Mapping with Uncertainty using Scale-Invariant Visual Landmarks ) Se, Lowe, and Little SLAM with SIFT (aka Mobile Robot Localization and Mapping with Uncertainty using Scale-Invariant Visual Landmarks ) Se, Lowe, and Little + Presented by Matt Loper CS296-3: Robot Learning and Autonomy Brown

More information

L10. PARTICLE FILTERING CONTINUED. NA568 Mobile Robotics: Methods & Algorithms

L10. PARTICLE FILTERING CONTINUED. NA568 Mobile Robotics: Methods & Algorithms L10. PARTICLE FILTERING CONTINUED NA568 Mobile Robotics: Methods & Algorithms Gaussian Filters The Kalman filter and its variants can only model (unimodal) Gaussian distributions Courtesy: K. Arras Motivation

More information

Particle Filters. CSE-571 Probabilistic Robotics. Dependencies. Particle Filter Algorithm. Fast-SLAM Mapping

Particle Filters. CSE-571 Probabilistic Robotics. Dependencies. Particle Filter Algorithm. Fast-SLAM Mapping CSE-571 Probabilistic Robotics Fast-SLAM Mapping Particle Filters Represent belief by random samples Estimation of non-gaussian, nonlinear processes Sampling Importance Resampling (SIR) principle Draw

More information

Practical Course WS12/13 Introduction to Monte Carlo Localization

Practical Course WS12/13 Introduction to Monte Carlo Localization Practical Course WS12/13 Introduction to Monte Carlo Localization Cyrill Stachniss and Luciano Spinello 1 State Estimation Estimate the state of a system given observations and controls Goal: 2 Bayes Filter

More information

Simultaneous Localization and Mapping

Simultaneous Localization and Mapping Sebastian Lembcke SLAM 1 / 29 MIN Faculty Department of Informatics Simultaneous Localization and Mapping Visual Loop-Closure Detection University of Hamburg Faculty of Mathematics, Informatics and Natural

More information

Introduction to SLAM Part II. Paul Robertson

Introduction to SLAM Part II. Paul Robertson Introduction to SLAM Part II Paul Robertson Localization Review Tracking, Global Localization, Kidnapping Problem. Kalman Filter Quadratic Linear (unless EKF) SLAM Loop closing Scaling: Partition space

More information

Localization of Multiple Robots with Simple Sensors

Localization of Multiple Robots with Simple Sensors Proceedings of the IEEE International Conference on Mechatronics & Automation Niagara Falls, Canada July 2005 Localization of Multiple Robots with Simple Sensors Mike Peasgood and Christopher Clark Lab

More information

Robotics. Lecture 5: Monte Carlo Localisation. See course website for up to date information.

Robotics. Lecture 5: Monte Carlo Localisation. See course website  for up to date information. Robotics Lecture 5: Monte Carlo Localisation See course website http://www.doc.ic.ac.uk/~ajd/robotics/ for up to date information. Andrew Davison Department of Computing Imperial College London Review:

More information

Humanoid Robotics. Monte Carlo Localization. Maren Bennewitz

Humanoid Robotics. Monte Carlo Localization. Maren Bennewitz Humanoid Robotics Monte Carlo Localization Maren Bennewitz 1 Basis Probability Rules (1) If x and y are independent: Bayes rule: Often written as: The denominator is a normalizing constant that ensures

More information

Stable Vision-Aided Navigation for Large-Area Augmented Reality

Stable Vision-Aided Navigation for Large-Area Augmented Reality Stable Vision-Aided Navigation for Large-Area Augmented Reality Taragay Oskiper, Han-Pang Chiu, Zhiwei Zhu Supun Samarasekera, Rakesh Teddy Kumar Vision and Robotics Laboratory SRI-International Sarnoff,

More information

CS 223B Computer Vision Problem Set 3

CS 223B Computer Vision Problem Set 3 CS 223B Computer Vision Problem Set 3 Due: Feb. 22 nd, 2011 1 Probabilistic Recursion for Tracking In this problem you will derive a method for tracking a point of interest through a sequence of images.

More information

Combining Appearance and Topology for Wide

Combining Appearance and Topology for Wide Combining Appearance and Topology for Wide Baseline Matching Dennis Tell and Stefan Carlsson Presented by: Josh Wills Image Point Correspondences Critical foundation for many vision applications 3-D reconstruction,

More information

ELEC Dr Reji Mathew Electrical Engineering UNSW

ELEC Dr Reji Mathew Electrical Engineering UNSW ELEC 4622 Dr Reji Mathew Electrical Engineering UNSW Review of Motion Modelling and Estimation Introduction to Motion Modelling & Estimation Forward Motion Backward Motion Block Motion Estimation Motion

More information

Vision-Based Underwater SLAM for the SPARUS AUV

Vision-Based Underwater SLAM for the SPARUS AUV Vision-Based Underwater SLAM for the SPARUS AUV Josep Aulinas, Institute of Informatics and Applications, Girona/Spain, jaulinas@eia.udg.edu Yvan R. Petillot, Oceans Systems Laboratory, Edinburgh/UK, Y.R.Petillot@hw.ac.uk

More information

Long-term motion estimation from images

Long-term motion estimation from images Long-term motion estimation from images Dennis Strelow 1 and Sanjiv Singh 2 1 Google, Mountain View, CA, strelow@google.com 2 Carnegie Mellon University, Pittsburgh, PA, ssingh@cmu.edu Summary. Cameras

More information

EKF Localization and EKF SLAM incorporating prior information

EKF Localization and EKF SLAM incorporating prior information EKF Localization and EKF SLAM incorporating prior information Final Report ME- Samuel Castaneda ID: 113155 1. Abstract In the context of mobile robotics, before any motion planning or navigation algorithm

More information

IROS 05 Tutorial. MCL: Global Localization (Sonar) Monte-Carlo Localization. Particle Filters. Rao-Blackwellized Particle Filters and Loop Closing

IROS 05 Tutorial. MCL: Global Localization (Sonar) Monte-Carlo Localization. Particle Filters. Rao-Blackwellized Particle Filters and Loop Closing IROS 05 Tutorial SLAM - Getting it Working in Real World Applications Rao-Blackwellized Particle Filters and Loop Closing Cyrill Stachniss and Wolfram Burgard University of Freiburg, Dept. of Computer

More information

Application questions. Theoretical questions

Application questions. Theoretical questions The oral exam will last 30 minutes and will consist of one application question followed by two theoretical questions. Please find below a non exhaustive list of possible application questions. The list

More information

Kaijen Hsiao. Part A: Topics of Fascination

Kaijen Hsiao. Part A: Topics of Fascination Kaijen Hsiao Part A: Topics of Fascination 1) I am primarily interested in SLAM. I plan to do my project on an application of SLAM, and thus I am interested not only in the method we learned about in class,

More information

Dealing with Scale. Stephan Weiss Computer Vision Group NASA-JPL / CalTech

Dealing with Scale. Stephan Weiss Computer Vision Group NASA-JPL / CalTech Dealing with Scale Stephan Weiss Computer Vision Group NASA-JPL / CalTech Stephan.Weiss@ieee.org (c) 2013. Government sponsorship acknowledged. Outline Why care about size? The IMU as scale provider: The

More information

Ensemble of Bayesian Filters for Loop Closure Detection

Ensemble of Bayesian Filters for Loop Closure Detection Ensemble of Bayesian Filters for Loop Closure Detection Mohammad Omar Salameh, Azizi Abdullah, Shahnorbanun Sahran Pattern Recognition Research Group Center for Artificial Intelligence Faculty of Information

More information

SLAM for Robotic Assistance to fire-fighting services

SLAM for Robotic Assistance to fire-fighting services SLAM for Robotic Assistance to fire-fighting services Sid Ahmed Berrabah *,**, Yvan Baudoin *, Hichem Sahli ** * Royal Military Academy of Belgium (RMA), Av. de la Renaissance 30, B1000 Brussels, Belgium

More information

Simultaneous Localization and Mapping (SLAM)

Simultaneous Localization and Mapping (SLAM) Simultaneous Localization and Mapping (SLAM) RSS Lecture 16 April 8, 2013 Prof. Teller Text: Siegwart and Nourbakhsh S. 5.8 SLAM Problem Statement Inputs: No external coordinate reference Time series of

More information

LOCAL AND GLOBAL DESCRIPTORS FOR PLACE RECOGNITION IN ROBOTICS

LOCAL AND GLOBAL DESCRIPTORS FOR PLACE RECOGNITION IN ROBOTICS 8th International DAAAM Baltic Conference "INDUSTRIAL ENGINEERING - 19-21 April 2012, Tallinn, Estonia LOCAL AND GLOBAL DESCRIPTORS FOR PLACE RECOGNITION IN ROBOTICS Shvarts, D. & Tamre, M. Abstract: The

More information

Accurate Motion Estimation and High-Precision 3D Reconstruction by Sensor Fusion

Accurate Motion Estimation and High-Precision 3D Reconstruction by Sensor Fusion 007 IEEE International Conference on Robotics and Automation Roma, Italy, 0-4 April 007 FrE5. Accurate Motion Estimation and High-Precision D Reconstruction by Sensor Fusion Yunsu Bok, Youngbae Hwang,

More information

Artificial Intelligence for Robotics: A Brief Summary

Artificial Intelligence for Robotics: A Brief Summary Artificial Intelligence for Robotics: A Brief Summary This document provides a summary of the course, Artificial Intelligence for Robotics, and highlights main concepts. Lesson 1: Localization (using Histogram

More information

Removing Scale Biases and Ambiguity from 6DoF Monocular SLAM Using Inertial

Removing Scale Biases and Ambiguity from 6DoF Monocular SLAM Using Inertial 28 IEEE International Conference on Robotics and Automation Pasadena, CA, USA, May 9-23, 28 Removing Scale Biases and Ambiguity from 6DoF Monocular SLAM Using Inertial Todd Lupton and Salah Sukkarieh Abstract

More information

This chapter explains two techniques which are frequently used throughout

This chapter explains two techniques which are frequently used throughout Chapter 2 Basic Techniques This chapter explains two techniques which are frequently used throughout this thesis. First, we will introduce the concept of particle filters. A particle filter is a recursive

More information

Visual Odometry. Features, Tracking, Essential Matrix, and RANSAC. Stephan Weiss Computer Vision Group NASA-JPL / CalTech

Visual Odometry. Features, Tracking, Essential Matrix, and RANSAC. Stephan Weiss Computer Vision Group NASA-JPL / CalTech Visual Odometry Features, Tracking, Essential Matrix, and RANSAC Stephan Weiss Computer Vision Group NASA-JPL / CalTech Stephan.Weiss@ieee.org (c) 2013. Government sponsorship acknowledged. Outline The

More information

L12. EKF-SLAM: PART II. NA568 Mobile Robotics: Methods & Algorithms

L12. EKF-SLAM: PART II. NA568 Mobile Robotics: Methods & Algorithms L12. EKF-SLAM: PART II NA568 Mobile Robotics: Methods & Algorithms Today s Lecture Feature-based EKF-SLAM Review Data Association Configuration Space Incremental ML (i.e., Nearest Neighbor) Joint Compatibility

More information

Final Exam Study Guide

Final Exam Study Guide Final Exam Study Guide Exam Window: 28th April, 12:00am EST to 30th April, 11:59pm EST Description As indicated in class the goal of the exam is to encourage you to review the material from the course.

More information

An image to map loop closing method for monocular SLAM

An image to map loop closing method for monocular SLAM An image to map loop closing method for monocular SLAM Brian Williams, Mark Cummins, José Neira, Paul Newman, Ian Reid and Juan Tardós Universidad de Zaragoza, Spain University of Oxford, UK Abstract In

More information

3D Model Acquisition by Tracking 2D Wireframes

3D Model Acquisition by Tracking 2D Wireframes 3D Model Acquisition by Tracking 2D Wireframes M. Brown, T. Drummond and R. Cipolla {96mab twd20 cipolla}@eng.cam.ac.uk Department of Engineering University of Cambridge Cambridge CB2 1PZ, UK Abstract

More information

CSE 527: Introduction to Computer Vision

CSE 527: Introduction to Computer Vision CSE 527: Introduction to Computer Vision Week 10 Class 2: Visual Odometry November 2nd, 2017 Today Visual Odometry Intro Algorithm SLAM Visual Odometry Input Output Images, Video Camera trajectory, motion

More information

arxiv: v1 [cs.cv] 28 Sep 2018

arxiv: v1 [cs.cv] 28 Sep 2018 Camera Pose Estimation from Sequence of Calibrated Images arxiv:1809.11066v1 [cs.cv] 28 Sep 2018 Jacek Komorowski 1 and Przemyslaw Rokita 2 1 Maria Curie-Sklodowska University, Institute of Computer Science,

More information

A novel approach to motion tracking with wearable sensors based on Probabilistic Graphical Models

A novel approach to motion tracking with wearable sensors based on Probabilistic Graphical Models A novel approach to motion tracking with wearable sensors based on Probabilistic Graphical Models Emanuele Ruffaldi Lorenzo Peppoloni Alessandro Filippeschi Carlo Alberto Avizzano 2014 IEEE International

More information

Simultaneous Localization

Simultaneous Localization Simultaneous Localization and Mapping (SLAM) RSS Technical Lecture 16 April 9, 2012 Prof. Teller Text: Siegwart and Nourbakhsh S. 5.8 Navigation Overview Where am I? Where am I going? Localization Assumed

More information

Monocular SLAM with Inverse Scaling Parametrization

Monocular SLAM with Inverse Scaling Parametrization Monocular SLAM with Inverse Scaling Parametrization D. Marzorati 2, M. Matteucci 1, D. Migliore 1, and D. G. Sorrenti 2 1 Dept. Electronics and Information, Politecnico di Milano 2 Dept. Informatica, Sistem.

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 14 130307 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Stereo Dense Motion Estimation Translational

More information

Gaussian Processes for Robotics. McGill COMP 765 Oct 24 th, 2017

Gaussian Processes for Robotics. McGill COMP 765 Oct 24 th, 2017 Gaussian Processes for Robotics McGill COMP 765 Oct 24 th, 2017 A robot must learn Modeling the environment is sometimes an end goal: Space exploration Disaster recovery Environmental monitoring Other

More information

Evaluation of Moving Object Tracking Techniques for Video Surveillance Applications

Evaluation of Moving Object Tracking Techniques for Video Surveillance Applications International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Evaluation

More information

Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview

Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview Arun Das 05/09/2017 Arun Das Waterloo Autonomous Vehicles Lab Introduction What s in a name? Arun Das Waterloo Autonomous

More information

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS Cognitive Robotics Original: David G. Lowe, 004 Summary: Coen van Leeuwen, s1460919 Abstract: This article presents a method to extract

More information

Master Automática y Robótica. Técnicas Avanzadas de Vision: Visual Odometry. by Pascual Campoy Computer Vision Group

Master Automática y Robótica. Técnicas Avanzadas de Vision: Visual Odometry. by Pascual Campoy Computer Vision Group Master Automática y Robótica Técnicas Avanzadas de Vision: by Pascual Campoy Computer Vision Group www.vision4uav.eu Centro de Automá

More information

Robotic Mapping. Outline. Introduction (Tom)

Robotic Mapping. Outline. Introduction (Tom) Outline Robotic Mapping 6.834 Student Lecture Itamar Kahn, Thomas Lin, Yuval Mazor Introduction (Tom) Kalman Filtering (Itamar) J.J. Leonard and H.J.S. Feder. A computationally efficient method for large-scale

More information

PROGRAMA DE CURSO. Robotics, Sensing and Autonomous Systems. SCT Auxiliar. Personal

PROGRAMA DE CURSO. Robotics, Sensing and Autonomous Systems. SCT Auxiliar. Personal PROGRAMA DE CURSO Código Nombre EL7031 Robotics, Sensing and Autonomous Systems Nombre en Inglés Robotics, Sensing and Autonomous Systems es Horas de Horas Docencia Horas de Trabajo SCT Docentes Cátedra

More information

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Structured Light II Johannes Köhler Johannes.koehler@dfki.de Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Introduction Previous lecture: Structured Light I Active Scanning Camera/emitter

More information

Implementation of Odometry with EKF for Localization of Hector SLAM Method

Implementation of Odometry with EKF for Localization of Hector SLAM Method Implementation of Odometry with EKF for Localization of Hector SLAM Method Kao-Shing Hwang 1 Wei-Cheng Jiang 2 Zuo-Syuan Wang 3 Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung,

More information

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT SIFT: Scale Invariant Feature Transform; transform image

More information

Feature based SLAM using Side-scan salient objects

Feature based SLAM using Side-scan salient objects Feature based SLAM using Side-scan salient objects Josep Aulinas, Xavier Lladó and Joaquim Salvi Computer Vision and Robotics Group Institute of Informatics and Applications University of Girona 17071

More information

Localization, Where am I?

Localization, Where am I? 5.1 Localization, Where am I?? position Position Update (Estimation?) Encoder Prediction of Position (e.g. odometry) YES matched observations Map data base predicted position Matching Odometry, Dead Reckoning

More information

A Relative Mapping Algorithm

A Relative Mapping Algorithm A Relative Mapping Algorithm Jay Kraut Abstract This paper introduces a Relative Mapping Algorithm. This algorithm presents a new way of looking at the SLAM problem that does not use Probability, Iterative

More information

Augmented Reality, Advanced SLAM, Applications

Augmented Reality, Advanced SLAM, Applications Augmented Reality, Advanced SLAM, Applications Prof. Didier Stricker & Dr. Alain Pagani alain.pagani@dfki.de Lecture 3D Computer Vision AR, SLAM, Applications 1 Introduction Previous lectures: Basics (camera,

More information

Automatic Tracking of Moving Objects in Video for Surveillance Applications

Automatic Tracking of Moving Objects in Video for Surveillance Applications Automatic Tracking of Moving Objects in Video for Surveillance Applications Manjunath Narayana Committee: Dr. Donna Haverkamp (Chair) Dr. Arvin Agah Dr. James Miller Department of Electrical Engineering

More information

Geometrical Feature Extraction Using 2D Range Scanner

Geometrical Feature Extraction Using 2D Range Scanner Geometrical Feature Extraction Using 2D Range Scanner Sen Zhang Lihua Xie Martin Adams Fan Tang BLK S2, School of Electrical and Electronic Engineering Nanyang Technological University, Singapore 639798

More information

Planetary Rover Absolute Localization by Combining Visual Odometry with Orbital Image Measurements

Planetary Rover Absolute Localization by Combining Visual Odometry with Orbital Image Measurements Planetary Rover Absolute Localization by Combining Visual Odometry with Orbital Image Measurements M. Lourakis and E. Hourdakis Institute of Computer Science Foundation for Research and Technology Hellas

More information

Probabilistic Robotics

Probabilistic Robotics Probabilistic Robotics Bayes Filter Implementations Discrete filters, Particle filters Piecewise Constant Representation of belief 2 Discrete Bayes Filter Algorithm 1. Algorithm Discrete_Bayes_filter(

More information

Kalman Filter Based. Localization

Kalman Filter Based. Localization Autonomous Mobile Robots Localization "Position" Global Map Cognition Environment Model Local Map Path Perception Real World Environment Motion Control Kalman Filter Based Localization & SLAM Zürich Autonomous

More information

Estimating Geospatial Trajectory of a Moving Camera

Estimating Geospatial Trajectory of a Moving Camera Estimating Geospatial Trajectory of a Moving Camera Asaad Hakeem 1, Roberto Vezzani 2, Mubarak Shah 1, Rita Cucchiara 2 1 School of Electrical Engineering and Computer Science, University of Central Florida,

More information

Uncertainties: Representation and Propagation & Line Extraction from Range data

Uncertainties: Representation and Propagation & Line Extraction from Range data 41 Uncertainties: Representation and Propagation & Line Extraction from Range data 42 Uncertainty Representation Section 4.1.3 of the book Sensing in the real world is always uncertain How can uncertainty

More information

Using Geometric Blur for Point Correspondence

Using Geometric Blur for Point Correspondence 1 Using Geometric Blur for Point Correspondence Nisarg Vyas Electrical and Computer Engineering Department, Carnegie Mellon University, Pittsburgh, PA Abstract In computer vision applications, point correspondence

More information

08 An Introduction to Dense Continuous Robotic Mapping

08 An Introduction to Dense Continuous Robotic Mapping NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy

More information

Introduction to Autonomous Mobile Robots
