A Constant-Time Efficient Stereo SLAM System


A Constant-Time Efficient Stereo SLAM System

Christopher Mei, Gabe Sibley, Mark Cummins, Ian Reid and Paul Newman
Department of Engineering Science, University of Oxford, OX1 3PJ, Oxford, UK

Abstract. Continuous, real-time mapping of an environment using a camera requires a constant-time estimation engine. This rules out optimal global solving such as bundle adjustment. In this article, we investigate the precision that can be achieved with only local estimation of motion and structure provided by a stereo pair. We introduce a simple but novel representation of the environment in terms of a sequence of relative locations. We demonstrate precise local mapping and easy navigation using the relative map, and importantly show that this can be done without requiring a global minimisation after loop closure. We discuss some of the issues that arise from using a relative representation, and evaluate our system on long sequences processed at a constant 30-45 Hz, obtaining precisions down to a few metres over distances of a few kilometres.

I. INTRODUCTION

The goal of building an autonomous platform using vision sensors has encouraged many developments in low-level image processing and in estimation techniques. Recent improvements have led to real-time solutions on standard hardware. However, these often rely on global solutions that do not scale with the size of the environment. Furthermore, few systems integrate a loop closure or relocalisation mechanism, which is essential for working in non-controlled environments where tracking assumptions are often violated. In this work, we investigate how to efficiently combine relevant approaches to obtain a vision-based solution that provides a high frame rate, constant-time exploration, resilience to motion blur, and a relocalisation and loop closure mechanism.

The system presented in this paper combines a world representation enabling loop closure in real-time (Section II) with carefully engineered low-level image processing adapted to stereo image pairs. The novelty in the world representation comes from the continuous relative formulation that avoids map merging and the transfer of statistics between sub-maps. The low-level image processing provides an initialisation mechanism to improve resilience to strong rotational motion, a feature selection mechanism based on quadtrees, and a method for obtaining scale invariance by using the metric information provided by the cameras ("true scale"). The experimental section (Section IV) provides an insight into the aspects that enable exploration of large environments in constant time with good accuracy. The accuracy is down to a few metres over distances of a few kilometres.

Fig. 1. Estimated trajectories for the four data sets detailed in Tables I and II: (a) Begbroke, (b) New College 1, (c) New College 2, (d) New College 3.

A. Related work

Several approaches have been proposed for estimating the motion of a single camera and the structure of the scene (also known as simultaneous localisation and mapping (SLAM)) in real-time. In [6], the authors estimate the pose and landmark positions using an extended Kalman filter (EKF). The number of landmarks that can be processed is limited to a few hundred due to the quadratic complexity of the EKF. Recent real-time monocular systems have used local [17] or global [12, 7, 8] bundle adjustment as the underlying estimator.
This choice is motivated by a complexity linear in the number of landmarks (but cubic in the number of poses), enabling the processing of more landmarks than an EKF and leading to overall better precision. However, a careful choice of keyframes is required to keep the solving tractable. [17] does not provide a loop closing mechanism and cannot, as such, reduce drift when returning to a previously explored region. [12] uses a separate thread for bundle adjustment, enabling the building of accurate maps for small environments, but is not adapted to large-scale exploration. [7, 8] share strong similarities with the present work, combining a relative representation and a non-probabilistic loop closure mechanism. In this work, however, the low-level image processing is adapted to stereo vision and the focus is on precise exploration without requiring global graph minimisation.

TABLE I. Results for the Begbroke and first New College data sets (average, minimum and maximum of: distance travelled (km), frames processed, velocity (m/s), angular velocity (deg/s), frames per second, features per frame, feature track length, reprojection error).

TABLE II. Results for the second and third New College data sets (same statistics; New College 2: 49,114 frames processed; New College 3: 29,489 frames processed).

Recent research has also provided some real-time solutions using stereo pairs [19, 13]. [19] relies on local bundle adjustment but does not address the problem of loop closing. The closest related work is that of Konolige and Agrawal [13]. The difference with the present work is their focus on recovering globally metrically coherent maps, which renders the complexity of the system linear in the number of poses. However, for many applications (path planning, object manipulation, dynamic object detection, ...) a relative local map is sufficient and can be recovered in constant time (the complexity being related to the size of the working space). It is also important to note that global metric maps and relative maps are not equivalent (see Section II-B) and have different properties.

II. MAP MANAGEMENT

A. World representations

The position of the robot in the world can be represented in different ways:

Global coordinates. This is the most common representation. An arbitrary initial frame (usually set to be the identity transform) is chosen and all subsequent position and landmark estimates are represented with respect to this frame.

Robo-centric coordinates. This is similar to using global coordinates, but the initial frame is chosen to be the current robot position. The map has to be updated at each new position estimate. This representation has been shown to improve consistency for EKF SLAM estimation [3].

Relative representation [10, 2, 8, 13]. In this framework, each camera position is connected by an edge transform to another position, forming a graph structure. There is no privileged position, and recovering landmark estimates requires a graph traversal (e.g. breadth-first search or shortest path computation).

Sub-maps. Sub-maps consist of representing a map by local frames and can be used with any of the previously discussed map representations. There are mainly two reasons for using sub-maps: reducing computation and improving consistency (mainly in filtering frameworks, to reduce the effect of propagating inconsistent statistics).

In this work, the robot position and map are represented in a continuous relative framework. This differs from most relative representation approaches in the literature, which use sub-maps and thus require map merging and the transfer of statistics through privileged nodes.

B. Relative representation

Notation: the following notation is adopted (Fig. 2): a 6-D robot pose i is denoted $x_{p_i}$ and represented by a blue triangle in the figures, with the current robot pose filled in; a 3-D landmark j is written $x_{m_j}$ and represented by a red star; $z_{ijk}$ indicates a measurement of the j-th landmark observed from the i-th pose and represented in base frame k. A base frame is defined as the pose in which a landmark is represented (i.e. where its 3-D coordinates are kept).

A continuous relative representation was chosen to represent the world. Figure 2(a) shows pose estimates of the robot.
The continuous line between poses indicates estimated transforms obtained during the exploration. We define an active region to be the set of poses within a given graph distance of the current pose. In the example, an active region of size two was chosen.
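To make the graph structure concrete, here is a minimal Python sketch of one way such a continuous relative map could be held in memory. The names (RelativeMap, Pose, Edge, Landmark) and the 4x4-matrix edge transforms are our illustrative assumptions, not the paper's; edge covariances are omitted.

```python
# Minimal sketch of a continuous relative map: poses connected by relative
# transforms, with landmarks stored against a base frame, not a world frame.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Landmark:
    lid: int
    x_base: np.ndarray        # 3-D coordinates in the landmark's base frame

@dataclass
class Pose:
    pid: int
    landmarks: list = field(default_factory=list)  # landmarks based in this frame

@dataclass
class Edge:
    src: int
    dst: int
    T_src_dst: np.ndarray     # 4x4 transform taking points from dst frame to src frame

class RelativeMap:
    """Graph of poses connected by relative transforms; no privileged frame."""
    def __init__(self):
        self.poses = {}
        self.edges = {}        # adjacency: pid -> list of (neighbour pid, transform)

    def add_pose(self, pose):
        self.poses[pose.pid] = pose
        self.edges.setdefault(pose.pid, [])

    def add_edge(self, edge):
        # Store both directions so a traversal can walk the graph freely.
        self.edges[edge.src].append((edge.dst, edge.T_src_dst))
        self.edges[edge.dst].append((edge.src, np.linalg.inv(edge.T_src_dst)))
```

Storing each landmark against a base frame rather than in a world frame is what later allows it to be projected into any nearby pose purely by composing edge transforms.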

Fig. 2. Relative representation. Triangles represent robot poses, with the current pose filled in; stars represent landmarks. (a) Graph representing a robot trajectory. The active region of size two contains the latest two poses. (b) Trajectory after loop closure, with active region. The robot in $x_{p_8}$ makes an observation of landmark $x_{m_1}$. This provides an estimate between poses $x_{p_1}$ and $x_{p_8}$, represented by the new link in the graph. The active region of size two now comprises the older poses $x_{p_1}$ and $x_{p_2}$. The link between $x_{p_4}$ and $x_{p_5}$ still exists but is no longer represented, as we do not enforce the composition of the transforms along cycles to be the identity. (c) Trajectory after localisation. The robot was previously connected to $x_{p_2}$, but the latest estimate of its position relative to the poses in the active region has shown a closer proximity to $x_{p_3}$. In this case, the old link is discarded and a new link is created, here between $x_{p_n}$ and $x_{p_3}$.

Generally, the active region will be comprised of the latest poses. However, in the case of a loop closure, as illustrated by Fig. 2(b), older poses will also form the active region. The active region provides a way to find the landmarks visible from the current frame. Landmarks with base frames belonging to poses from the active region are projected into the current frame by composing the transforms along the edges. These estimates are then used to establish matches and compute the position of the robot. After a loop closure, the newly created edge can be used to project the estimates into the current frame, thus enabling data association without requiring a global minimisation.

In this work, the discovery of the active region is made by a breadth-first search (BFS). A probabilistic approach could also be used, with a Dijkstra shortest-path discovery using the covariance of the edge transformations. However, previous research has shown that the resulting improvement in map quality is negligible [2].

It should be noted that the global solution minimising the reprojection error in a relative framework is not equivalent to the global solution using a global reference with the same measurements. An experiment can help clarify this point. An indoor sequence was recorded in which the second floor of a building was explored, followed by a flight of stairs down to the first floor. The lift was then taken back up to the second floor (Fig. 3(a)). A loop closure was triggered on the second floor, which created an edge transform in the graph. The reprojection error does not increase in this framework. However, the trajectory cannot be represented in Euclidean space, so a global bundle adjustment, for example, would fail. A BFS used to draw the map shows a tear (Fig. 3(b)). Travelling in the lift has removed the compatibility between the relative and a Euclidean representation. The transition between the building floor and the lift is the transition between two maps that can each be represented in a Euclidean space. It is important to note that the relative map stays usable for navigation and path planning.
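As a concrete illustration of the active-region discovery described above, the following sketch (reusing the hypothetical RelativeMap structure from the previous snippet) performs the BFS and projects landmarks from their base frames into the current frame by composing edge transforms.

```python
from collections import deque
import numpy as np

def active_region(rmap, current_pid, size):
    """BFS from the current pose; returns {pid: T_current_pid}, the transform
    composing edges from each pose within `size` hops into the current frame."""
    region = {current_pid: np.eye(4)}
    frontier = deque([(current_pid, 0)])
    while frontier:
        pid, depth = frontier.popleft()
        if depth == size:
            continue
        for nbr, T_pid_nbr in rmap.edges[pid]:
            if nbr not in region:
                # Transform from nbr's frame into the current frame by composition.
                region[nbr] = region[pid] @ T_pid_nbr
                frontier.append((nbr, depth + 1))
    return region

def visible_landmarks(rmap, region):
    """Landmarks whose base frame lies in the active region, expressed in the
    current frame, ready to be projected and matched."""
    out = []
    for pid, T_cur_pid in region.items():
        for lm in rmap.poses[pid].landmarks:
            p = T_cur_pid @ np.append(lm.x_base, 1.0)
            out.append((lm.lid, p[:3]))
    return out
```

Because a loop-closure edge is just another edge in the graph, landmarks across the loop become reachable by the same traversal, which is why no global minimisation is needed for data association.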
C. Relocalisation

Real-time relocalisation methods have been presented in the context of visual SLAM. In [22], the authors provide an approach using classification with randomised trees [14]. However, the number of classes required leads to high memory usage (1.2 MB per landmark). Using a SIFT descriptor with FAST feature extraction and scale provided by landmark triangulation, as presented in Section III-F.2, has a comparatively low cost. Combined with RANSAC, the experiments show good relocalisation capabilities.

D. Loop closure

Loop closure detection is a well-known difficulty for traditional SLAM systems. The proposed system employs an appearance-based approach to detect loop closure, i.e. using sensory similarity to determine when the robot is revisiting a previously mapped area. Loop closure cues based on sensory similarity are independent of the robot's estimated position, and so are robust even in situations where there is significant error in the metric position estimate, for example after traversing a large loop where turning angles have been poorly estimated.

The FABMAP approach (Fast Appearance Based Mapping) described in [5]¹ is based on a probabilistic notion of similarity and incorporates a generative model for typical place appearance, which allows the system to correctly assign loop closure probability to observations even in environments where many places have similar sensory appearance, a problem known as perceptual aliasing.

¹ The FABMAP software used for this work is available at mjc/software.htm
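To convey the flavour of such a probabilistic appearance query, here is a deliberately naive sketch: independent per-word likelihoods plus a uniform new-place hypothesis. It is far simpler than the actual FABMAP generative model (which notably also models word co-occurrence); all names and constants are illustrative.

```python
import numpy as np

def place_pdf(obs_words, places, p_new=0.1, n_words=1000, eps=0.01):
    """Toy appearance query. obs_words: set of visual-word ids seen in the
    current image; places: list of sets of words previously observed at each
    place. Returns a PDF over [place_0, ..., place_{n-1}, new place]."""
    logl = []
    for place in places:
        ll = 0.0
        for w in range(n_words):
            seen_now, seen_before = (w in obs_words), (w in place)
            # Words in the place model are likely to be re-observed; others are
            # only seen through detector noise eps.
            p = (1.0 - eps) if seen_before else eps
            ll += np.log(p if seen_now else 1.0 - p)
        logl.append(ll)
    # New-place hypothesis: every word equally likely a priori.
    logl.append(n_words * np.log(0.5))
    logl = np.array(logl)
    post = np.exp(logl - logl.max())
    prior = np.append(np.full(len(places), (1.0 - p_new) / max(len(places), 1)), p_new)
    post *= prior
    return post / post.sum()
```

For instance, `place_pdf({3, 17, 256}, [{3, 17}, {900}])` returns a three-entry distribution whose last entry is the probability that the robot is somewhere new, which is the quantity used to decide between updating an existing place model and creating a fresh one.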

Fig. 3. Sequence illustrating the difference between a global Euclidean representation and a relative representation. (a) After exploring a floor, then taking a flight of stairs followed by a lift, the robot returns to the same floor. No loop closure has yet been detected. (b) A loop closure is triggered. The trajectory cannot be represented in Euclidean space and a tear appears (in this example, in the staircase). The size of the tear is related to the distance travelled in the lift. The built graph and related map are still usable for exploration because of their relative nature.

This probabilistic approach differs from the work by Eade and Drummond [8], which provides best matches in real-time but does not take into account repeated structure, a common failure mode for loop closure. Appearance is represented using the bag-of-words model developed for image retrieval systems in the computer vision community [21, 18]. At time k the appearance map consists of a set of $n_k$ discrete locations, each location being described by a distribution over which appearance words are likely to be observed there. Incoming sensory data is converted into a bag-of-words representation; for each location, a query is made that returns how likely it is that the observation came from that location's distribution. An expression for the probability that the observation came from a place not in the map is also produced. This yields a PDF over location, which can be used to make a data association decision and either create a new place model or update the belief about the appearance of an existing place. Essentially this is a SLAM algorithm in the space of appearance, which runs in parallel with the relative mapping system.

In a filtering framework, incorrect loop closures are often catastrophic, as the statistical estimates are corrupted. The proposed relative representation makes it possible to recover from erroneous loop closures: the simple removal of a link in the graph and of the incorrectly associated measurements will return the system to its previous state.

E. Frame dropping and localisation

When exploring environments over long periods of time, memory usage becomes an issue. The presented system uses a simple heuristic, similar to that presented in [17], based on the number of new features to decide which frames to keep. Furthermore, when exploring a previously mapped area detected by loop closure, the system localises with respect to the active region, thus reducing the number of new features created and simultaneously reducing drift. Figure 2(c) illustrates a typical localisation situation that would arise from pose $x_{p_3}$ being in the active region of pose $x_{p_n}$ and sufficiently close for a link to be connected between the poses. The metric for deciding when to connect poses is based on a distance (typically 1 m) and an angle (1 deg) between frames.
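The connection test itself reduces to thresholding the relative transform between the current pose and a candidate pose in the active region. A minimal sketch, taking the distance and angle thresholds quoted in the text (the function name and transform convention are ours):

```python
import numpy as np

def should_connect(T_cur_cand, max_dist=1.0, max_angle_deg=1.0):
    """Decide whether the current pose should be linked to a candidate pose,
    given the estimated 4x4 relative transform between them. Thresholds follow
    the values quoted in the text."""
    dist = np.linalg.norm(T_cur_cand[:3, 3])
    # Rotation angle recovered from the trace of the 3x3 rotation block.
    cos_a = (np.trace(T_cur_cand[:3, :3]) - 1.0) / 2.0
    angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    return dist < max_dist and angle < max_angle_deg
```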
III. STEREO IMAGE PROCESSING

This section describes the different steps applied to process stereo image pairs. It is similar to other works in the literature [19, 12]; the main novelty is in: a) the systematic application of second-order minimisation [1] for the pose initialisation and subpixel refinement, b) the use of quadtrees to ensure adequate spreading of the features in the image, and c) the use of true scale to build discriminative descriptors at a lower cost than the standard scale-space computation combined with a descriptor (as in SIFT [15]).

We begin with a top-level view. For each incoming frame, the following steps are made:

A. image pre-processing,
B. feature extraction,
C. pose initialisation through sum-of-squared-difference (SSD) inter-frame tracking,
D. matching in time: the features are matched with current 3-D landmark estimates (the map),
E. localisation: the position of the camera pair is estimated by minimising the landmark-to-image reprojection error,
F. left-right matching: new 3-D landmarks are initialised after matching the features in the left and right images, then triangulated.

We will now discuss the different processing steps in detail.

A. Pre-processing

Each incoming image is rectified using the known camera calibration parameters to ensure efficient scanline searches of corresponding left-right matches [11]. The image intensities of the left and right images are then corrected to obtain the same mean and variance. This step improves the left-right matching scores, enabling better detection of outliers. A scale-space pyramid of both images is then built using a box filter, with one image per octave for computational efficiency. This pyramid is used for extracting features with resilience to blur, for the pose initialisation, and for the calculation of SIFT descriptors with true scale.
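A minimal numpy sketch of the two cheap pre-processing operations just described, leaving out rectification; normalising both images to zero mean and unit variance is one simple way of giving them the same statistics, and the box-filter pyramid keeps one image per octave.

```python
import numpy as np

def normalise_pair(left, right):
    """Give both images the same mean and variance (helps left-right SAD scores)."""
    l = (left - left.mean()) / (left.std() + 1e-9)
    r = (right - right.mean()) / (right.std() + 1e-9)
    return l, r

def box_pyramid(img, levels):
    """Scale-space pyramid with a 2x2 box filter, one image per octave."""
    pyr = [img.astype(np.float32)]
    for _ in range(levels - 1):
        im = pyr[-1]
        h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2   # crop to even size
        im = im[:h, :w]
        pyr.append((im[0::2, 0::2] + im[1::2, 0::2] +
                    im[0::2, 1::2] + im[1::2, 1::2]) / 4.0)
    return pyr
```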

B. Feature extraction

The feature locations used in this work are provided by the FAST corner extractor [20]. This extractor provides good repeatability at a small computational cost. FAST corners are extracted at different pyramid levels. The pyramid provides robustness to motion blur and enables point matching over larger regions of the image. The number of pyramid levels used is related to the amount of motion blur expected. In indoor sequences, three pyramid levels are used, whereas in outdoor sequences two levels are sufficient. (This aspect is dependent on the camera used.) The corner extraction threshold is initially set at a value providing a good compromise between the number of points and robustness to noise. This threshold is then adapted at each timestep (bang-bang controller) to ensure a minimal number of points. This proved sufficient to adapt to the strong changes in illumination typically encountered in outdoor sequences.

C. Pose initialisation with image-based gradient descent

Strong inter-frame rotation is a typical failure mode for feature-based tracking when a search window is used. An initial estimate of the 3-D rotation is obtained using the SSD gradient descent algorithm described in [16]. The assumption of pure rotation is valid if the inter-frame translation is small with respect to the landmark depths, and at video frame rate this is indeed the case.

The cost function associated with the rotation estimate can be written in the following way. Let $I_c$ be the current image and $I_p$ the previous image. $K$ will denote the matrix of camera intrinsic parameters and $p = [p_x\ p_y]^\top$ pixel coordinates. $\mathrm{Proj}$ refers to the de-homogenisation or projection function, i.e. $\mathrm{Proj}([x\ y\ z]^\top) = [x/z\ y/z]^\top$. The inverse of this function (this is not a true inverse) will be denoted $\mathrm{Proj}^{-1}$, with $\mathrm{Proj}^{-1}([x\ y]^\top) = [x\ y\ 1]^\top$. Multiplication by $K$ followed by a projection will be written $\Pi = \mathrm{Proj} \circ K$.

Under Lambertian assumptions, we wish to find the optimal rotation $R$ such that the intensities in the current and previous frames match:

$$\forall p \in I_p, \quad I_c\big(\Pi(R\,\Pi^{-1}(p))\big) = I_p(p)$$

Let $x = [x_1, x_2, x_3]^\top$ be a parameterisation of the rotation; we use Lie algebra rotation parameters to simplify the computation of the Jacobians and denote the canonical generators of the group $G_i$:

$$G_1 = [(1,0,0)^\top]_\times, \quad G_2 = [(0,1,0)^\top]_\times, \quad G_3 = [(0,0,1)^\top]_\times,$$

with $[x]_\times$ denoting the skew-symmetric matrix of $x$. A rotation update parameterised by $x$ will be written:

$$R(x) = \exp\Big(\sum_{i=1}^{3} x_i G_i\Big), \qquad \left.\frac{\partial R(x)}{\partial x}\right|_{x=0} = [G_1\ G_2\ G_3]$$

Let $\bar{R}$ be the current estimate of the rotation between the camera positions corresponding to $I_c$ and $I_p$. The SSD cost function over $x$ can be written:

$$\frac{1}{2} \sum_{p \in I_p} \big\| I_c\big(\Pi(\bar{R}R(x)\Pi^{-1}(p))\big) - I_p(p) \big\|^2$$

This equation is solved by gradient descent using efficient second-order minimisation. It is a special case of [16].

D. Feature matching in time

The 3-D landmarks (the map), computed through the graph representing the relative poses (as detailed in Section II-B), are projected alternately into the left and right images and matched within a fixed-size window to the extracted FAST corners using mean SAD (sum of absolute differences with the mean removed, for better resilience to lighting changes) on image patches of size 9x9. A maximal accepted score is set to provide a first-pass robustness to outliers.
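A sketch of the mean-SAD scoring and fixed-window matching just described, assuming greyscale numpy images and FAST corners given as (u, v) pixel coordinates; the window radius and the maximal accepted score are illustrative values, not the paper's.

```python
import numpy as np

def mean_sad(patch_a, patch_b):
    """SAD with the mean removed, for resilience to lighting changes."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    return np.abs(a - b).sum()

def match_in_window(img, corners, pred_uv, ref_patch, radius=15, max_score=400.0):
    """Match a landmark's stored 9x9 patch against the FAST corners inside a
    fixed search window centred on the landmark's predicted image location."""
    best, best_score = None, max_score   # scores above max_score are rejected
    h = ref_patch.shape[0] // 2
    for (u, v) in corners:
        if abs(u - pred_uv[0]) > radius or abs(v - pred_uv[1]) > radius:
            continue
        if v - h < 0 or u - h < 0 or v + h + 1 > img.shape[0] or u + h + 1 > img.shape[1]:
            continue
        score = mean_sad(img[v - h:v + h + 1, u - h:u + h + 1], ref_patch)
        if score < best_score:
            best, best_score = (u, v), score
    return best
```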
E. Localisation

After the map points have been matched, 3-point-pose RANSAC [9] is applied to obtain an initial pose estimate and an outlier classification. A localisation step then minimises over the 6 degrees of freedom of the camera pose, using M-estimators for robustness (this minimisation can be found in standard textbooks [11]). After the minimisation, landmark measurements with strong reprojection errors are removed from the system. This step proved important for enabling early removal of outliers and the possibility of adding new, more stable, landmarks. The RANSAC step only proved useful in difficult cases (strong motion blur in particular); M-estimators were sufficient in most situations.

F. Left-right matching

To achieve a high frame rate with good accuracy, around 100-150 features are tracked at each time step. The feature selection process uses quadtrees, described in the following subsection, to ensure a uniform distribution of the features. Information theory could also be used to select features that constrain the motion estimation and to set the search region for tracking [4], but the quadtree approach proved sufficient in the tested sequences. Along with each feature track, an image patch (of size 9x9) is saved for subpixel matching in time, and a SIFT descriptor is computed using true scale to provide the distinctiveness required for a robust relocalisation mechanism and for loop closure (Section II).

1) Feature selection using quadtrees: The feature selection process follows the assumption that we desire distinctive features with a uniform distribution in the image (irrespective of the underlying tracking uncertainty). A quadtree is used to represent the distribution of the measurements at each time step. It contains the number of measurements in the different parts of the image and the maximal number of points allowed, to ensure a uniform distribution of features. It is used in the following way: a) in Step D, the matched map points increment the quadtree according to their measurement image locations; b) to add new features, FAST corners are extracted from the left and right images and ordered by a distinctiveness score (in this work we used Harris scores). To decide which features to add, the best features are taken in order and their image location is checked in the quadtree to ensure the maximal number of allowed points has not been exceeded. If a feature passes the test, the corresponding point in the other image of the stereo pair is searched for along the same scanline.
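One plausible realisation of such a counting quadtree is sketched below; the paper does not spell out its exact capacity policy, so the per-level caps used here are an assumption.

```python
import numpy as np

class CountQuadtree:
    """Fixed-depth quadtree over the image that counts accepted features and
    enforces a per-cell cap at every level, spreading features uniformly."""
    def __init__(self, width, height, depth=4, cap_per_leaf=2):
        self.w, self.h, self.depth = width, height, depth
        # Cap grows with cell area: each level may hold 4x the count of the one below.
        self.caps = [cap_per_leaf * 4 ** (depth - 1 - d) for d in range(depth)]
        self.counts = [np.zeros((2 ** d, 2 ** d), dtype=int) for d in range(depth)]

    def _cells(self, u, v):
        # Cells containing pixel (u, v), one per level.
        for d in range(self.depth):
            n = 2 ** d
            yield d, min(int(v * n / self.h), n - 1), min(int(u * n / self.w), n - 1)

    def try_add(self, u, v):
        """Accept the feature at pixel (u, v) unless some enclosing cell is full."""
        if any(self.counts[d][r, c] >= self.caps[d] for d, r, c in self._cells(u, v)):
            return False
        for d, r, c in self._cells(u, v):
            self.counts[d][r, c] += 1
        return True

# Usage: increment with the tracked measurements first (step a), then offer new
# corners in decreasing Harris-score order (step b); only corners for which
# try_add returns True are matched along the scanline and triangulated.
```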

Subpixel second-order minimisation [1] over the image coordinates and the average intensity (a three-parameter minimisation) is then applied to obtain accurate left-right correspondences and thus precise triangulation. The influence of subpixel refinement on the accuracy of the motion estimates is evaluated in Section IV-A.1.

2) True scale: Relocalisation and loop closing based on features, like the methods used in this work, rely on repeatable feature extraction and distinctive feature descriptors. In methods such as SIFT [15], the feature extraction is based on finding extrema over multiple scales (using difference-of-Gaussians, DoG). This provides invariance to scale. A descriptor is then built at this scale that provides invariance to rotation and robustness to small changes in viewpoint. The most expensive part of this algorithm is generally the computation of the image scale space. It has been shown to be feasible in real-time [8] (without transferring the processing to a graphics card) using a highly optimised recursive Gaussian filter and a downsampled image (320x240).

An alternative in the case of a stereo pair is to build landmark descriptors corresponding to regions of the world of the same physical size ("true scale"). This provides the property of matching only regions of the same 3-D size, which is not necessarily the case with DoG features. It can be achieved at no extra computational cost, as the left-right matching that provides the 3-D location is required in any case for the motion estimation.

We adopt a pinhole camera model with focal length $f$. The size in the image of the projection of a 3-D region of space is inversely proportional to its distance from the centre of projection: a region of physical size $S$ at distance $d$ projects to a template of size $s = fS/d$. Let $s_{max}$ and $s_{min}$ be the maximal and minimal sizes of square templates that can be matched reliably. $s_{max}$ corresponds to a certain landmark distance $d_{min}$. The maximal acceptable distance corresponding to $s_{min}$ (Fig. 4(a)) is then:

$$d_{max} = \frac{s_{max}}{s_{min}}\, d_{min}$$

For a typical minimal distance $d_{min} = 0.5$ m, $s_{max} = 32$ pixels and $s_{min} = 9$ pixels, we obtain $d_{max} \approx 1.8$ m, which is insufficient to cover the full depth range. A solution is to use different 3-D sizes in rings to cover the depth range. Figure 4(b) shows the different distance discontinuities in 2-D corresponding to $s_{min}$ and $s_{max}$, with the camera at the centre. Figure 4(c) illustrates the image template size according to depth. When relocalising or computing loop closures, only features belonging to the same ring are matched.

Fig. 4. True scale distance rings. (a) A fixed 3-D region size between distances $d_{min}$ and $d_{max}$ projects to template sizes ranging from $s_{max}$ to $s_{min}$, with size inversely proportional to distance. After $d_{max}$, a bigger fixed 3-D size can be used to produce image templates in the same size range. (b) The different bands for true scale computation; bands become bigger with distance. (c) Patch size (pixels) as a function of distance (m) for computing true scale descriptors.
Let $d_n$ be the smallest distance of ring $n$. The equation relating $d_{n+1}$ to $d_n$ can be written (geometric series):

$$d_1 = d_{min}, \qquad d_{n+1} = \frac{s_{max}}{s_{min}}\, d_n \;\Longrightarrow\; d_n = \left(\frac{s_{max}}{s_{min}}\right)^{n-1} d_{min}$$

The template size as a function of the landmark distance, $s(d)$, can thus be computed from:

$$n = \left\lfloor \frac{\log(d/d_{min})}{\log(s_{max}/s_{min})} \right\rfloor + 1, \qquad s(d) = s_{max}\,\frac{d_n}{d} = s_{max} \left(\frac{s_{max}}{s_{min}}\right)^{n-1} \frac{d_{min}}{d}$$

For a landmark, once the scale has been estimated, a SIFT descriptor is built². The pyramid provides only a discrete set of scale values, so scaling and bilinear interpolation are used to sample the points at the required scale.

² The SIFT implementation is based on the code provided by A. Vedaldi vedaldi/code/code.html.
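A small sketch of the ring lookup implied by these formulas, using the numeric values quoted above ($d_{min} = 0.5$ m, $s_{max} = 32$, $s_{min} = 9$):

```python
import numpy as np

def true_scale(d, d_min=0.5, s_max=32.0, s_min=9.0):
    """Ring index and template size (pixels) for a landmark at distance d (m).
    Within ring n the projected size runs from s_max down to s_min, and ring
    boundaries grow geometrically, following the formulas above."""
    d = max(d, d_min)                    # guard: closer landmarks use ring 1
    ratio = s_max / s_min
    n = int(np.floor(np.log(d / d_min) / np.log(ratio))) + 1   # ring index, n >= 1
    d_n = d_min * ratio ** (n - 1)                             # ring start distance
    return n, s_max * d_n / d

# Example: at d = 1.0 m the first ring applies and the template is 16 px; just
# past d_max = (32/9) * 0.5 = 1.78 m the second ring starts again at ~32 px.
```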

IV. EXPERIMENTAL RESULTS

A. Low-level vision

Fig. 5. The importance of subpixel refinement for the precision of the estimated trajectory (translation error in metres against frame number). The blue dashed line and the red solid line with crosses show the translation error without and with subpixel refinement, respectively, over a simulated sequence.

1) Sub-pixel refinement: Subpixel refinement was found to be essential to obtain precise trajectory estimates. A simulated sequence, shown in Fig. 5, illustrates this importance.

2) Quadtree: Figure 6 shows how quadtrees affect the spreading of features in images from outdoor sequences. Vegetation typically gives strong Harris corner responses but often poorly constrains the estimate.

Fig. 6. Quadtrees provide a way to spread features in the image. (a) Taking the strongest Harris scores might lead to poorly constrained estimates. (b) Using a quadtree, it is possible to obtain points that do not necessarily have high Harris scores but provide strong constraints.

B. Real data

A variety of sequences that show different aspects of the system can be found at cmei/rss.html.

1) Timings: The computation times for the different steps are summarised in Table III. FABMAP does not run in real-time; its computation depends on the number of frames. As an order of magnitude, a call to FABMAP took about 1 ms for an exploration of 1 frames taken at 2 Hz. Not all frames became FABMAP areas.

TABLE III. Typical processing time per frame with 100 landmarks and an active region size of 5.

Task | Time (ms)
Pre-processing (mean and variance correction, pyramid computation, feature extraction) | 10
Matching in time (with respect to the active region) | 5
RANSAC | 1.5
Localisation | 4
Match left-right (initialisation of new features) | 2
SIFT descriptor | 5
Total | 27.5 (~36 Hz)

2) Relocalisation using true scale: The use of true scale greatly improved the robustness in difficult conditions, such as the strong inter-frame motion shown in Fig. 7.

Fig. 7. The relocalisation mechanism used in the system. FAST corners, combined with the scale invariance provided by the left-right matching and a SIFT descriptor for distinctiveness, greatly improved robustness.

3) Loop closures using FABMAP and true scale: The probabilistic approach proposed in the FABMAP framework requires generating a vocabulary that can be used to evaluate how likely it is to see a particular feature in the environment. In this work, the training was performed on a separate sequence of 112 frames. A vocabulary of 1 words was generated. The full system was tested on a sequence of about 1.5 km. Figure 8 shows the results without loop closure (dotted red line) and with loop closure (solid blue line). Without loop closure the trajectory exhibits drift. Loop closure makes it possible to correct this drift without requiring a global minimisation.

V. CONCLUSION

This paper described a stereo system demonstrating that a relative representation combined with careful engineering can provide very precise estimates and good robustness. Future research will investigate the issues regarding bounding memory usage during exploration.

Fig. 8. The Begbroke sequence (Table I) was processed without loop closure (red trajectory with crosses) and with loop closure (blue continuous line). Loop closure reduced the amount of drift even though no global minimisation was applied to the trajectory.

VI. ACKNOWLEDGMENTS

The work reported in this paper undertaken by the Mobile Robotics Group was funded by the Systems Engineering for Autonomous Systems (SEAS) Defence Technology Centre established by the UK Ministry of Defence, by Guidance Ltd, and by the UK EPSRC (CNA and Platform Grant EP/D3777/1). Christopher Mei and Ian Reid of the Active Vision Lab acknowledge the support of EPSRC grant GR/T2468/1.

REFERENCES

[1] S. Benhimane and E. Malis. Real-time image-based tracking of planes using efficient second-order minimization. In IEEE International Conference on Intelligent Robots and Systems, 2004.
[2] M. Bosse, P. Newman, J. Leonard, M. Soika, W. Feiten, and S. Teller. An Atlas framework for scalable mapping. In IEEE International Conference on Robotics and Automation, 2003.
[3] J. Castellanos, J. Neira, and J. Tardós. Limits to the consistency of EKF-based SLAM. In IFAC, 2004.
[4] M. Chli and A. J. Davison. Active matching. In ECCV, 2008.
[5] M. Cummins and P. Newman. Probabilistic appearance based navigation and loop closing. In IEEE International Conference on Robotics and Automation, 2007.
[6] A. J. Davison, I. Reid, N. Molton, and O. Stasse. MonoSLAM: Real-time single camera SLAM. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2007.
[7] E. Eade and T. Drummond. Monocular SLAM as a graph of coalesced observations. In ICCV, 2007.
[8] E. Eade and T. Drummond. Unified loop closing and recovery for real time monocular SLAM. In British Machine Vision Conference, 2008.
[9] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM, 24(6):381-395, 1981.
[10] J. Guivant and E. Nebot. Optimization of the simultaneous localization and map building algorithm for real time implementation. IEEE Transactions on Robotics and Automation, 17:242-257, 2001.
[11] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2000.
[12] G. Klein and D. Murray. Parallel tracking and mapping for small AR workspaces. In ISMAR, 2007.
[13] K. Konolige and M. Agrawal. FrameSLAM: From bundle adjustment to real-time visual mapping. IEEE Transactions on Robotics, 24(5), Oct. 2008.
[14] V. Lepetit and P. Fua. Keypoint recognition using randomized trees. IEEE Trans. on Pattern Analysis and Machine Intelligence, 28(9):1465-1479, 2006.
[15] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
[16] C. Mei, S. Benhimane, E. Malis, and P. Rives. Efficient homography-based tracking and 3-D reconstruction for single-viewpoint sensors. IEEE Transactions on Robotics, 2008.
[17] E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, and P. Sayd. Real-time localization and 3D reconstruction. In IEEE Conference on Computer Vision and Pattern Recognition, 2006.
[18] D. Nistér and H. Stewénius. Scalable recognition with a vocabulary tree. In IEEE Conference on Computer Vision and Pattern Recognition, 2006.
[19] D. Nistér, O. Naroditsky, and J. Bergen. Visual odometry for ground vehicle applications. Journal of Field Robotics, 23(1), 2006.
[20] E. Rosten and T. Drummond. Fusing points and lines for high performance tracking. In ICCV, 2005.
[21] J. Sivic and A. Zisserman.
Video Google: A text retrieval approach to object matching in videos. In International Conference on Computer Vision, 2003.
[22] B. Williams, G. Klein, and I. Reid. Real-time SLAM relocalisation. In ICCV, 2007.


More information

Monocular SLAM with Inverse Scaling Parametrization

Monocular SLAM with Inverse Scaling Parametrization Monocular SLAM with Inverse Scaling Parametrization D. Marzorati 2, M. Matteucci 1, D. Migliore 1, and D. G. Sorrenti 2 1 Dept. Electronics and Information, Politecnico di Milano 2 Dept. Informatica, Sistem.

More information

ORB SLAM 2 : an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras

ORB SLAM 2 : an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras ORB SLAM 2 : an OpenSource SLAM System for Monocular, Stereo and RGBD Cameras Raul urartal and Juan D. Tardos Presented by: Xiaoyu Zhou Bolun Zhang Akshaya Purohit Lenord Melvix 1 Outline Background Introduction

More information

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit Augmented Reality VU Computer Vision 3D Registration (2) Prof. Vincent Lepetit Feature Point-Based 3D Tracking Feature Points for 3D Tracking Much less ambiguous than edges; Point-to-point reprojection

More information

Final Exam Study Guide

Final Exam Study Guide Final Exam Study Guide Exam Window: 28th April, 12:00am EST to 30th April, 11:59pm EST Description As indicated in class the goal of the exam is to encourage you to review the material from the course.

More information

EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT. Recap and Outline

EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT. Recap and Outline EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT Oct. 15, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof. John Wawrzynek)

More information

Video-rate Localization in Multiple Maps for Wearable Augmented Reality

Video-rate Localization in Multiple Maps for Wearable Augmented Reality Video-rate Localization in Multiple Maps for Wearable Augmented Reality Robert Castle, Georg Klein, David W Murray Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, UK [bob,gk,dwm]@robots.ox.ac.uk

More information

Hybrids Mixed Approaches

Hybrids Mixed Approaches Hybrids Mixed Approaches Stephan Weiss Computer Vision Group NASA-JPL / CalTech Stephan.Weiss@ieee.org (c) 2013. Government sponsorship acknowledged. Outline Why mixing? Parallel Tracking and Mapping Benefits

More information

Feature descriptors and matching

Feature descriptors and matching Feature descriptors and matching Detections at multiple scales Invariance of MOPS Intensity Scale Rotation Color and Lighting Out-of-plane rotation Out-of-plane rotation Better representation than color:

More information

WANGSIRIPITAK, MURRAY: REASONING ABOUT VISIBILITY AND OCCLUSION 1 Reducing mismatching under time-pressure by reasoning about visibility and occlusion

WANGSIRIPITAK, MURRAY: REASONING ABOUT VISIBILITY AND OCCLUSION 1 Reducing mismatching under time-pressure by reasoning about visibility and occlusion Reducing mismatching under time-pressure by reasoning about visibility and occlusion S Wangsiripitak D W Murray Department of Engineering Science University of Oxford Parks Road, Oxford, OX 3PJ www.robots.ox.ac.uk/activevision

More information

Target Tracking Using Mean-Shift And Affine Structure

Target Tracking Using Mean-Shift And Affine Structure Target Tracking Using Mean-Shift And Affine Structure Chuan Zhao, Andrew Knight and Ian Reid Department of Engineering Science, University of Oxford, Oxford, UK {zhao, ian}@robots.ox.ac.uk Abstract Inthispaper,wepresentanewapproachfortracking

More information

Monocular Visual Odometry

Monocular Visual Odometry Elective in Robotics coordinator: Prof. Giuseppe Oriolo Monocular Visual Odometry (slides prepared by Luca Ricci) Monocular vs. Stereo: eamples from Nature Predator Predators eyes face forward. The field

More information

Visual Tracking of Planes with an Uncalibrated Central Catadioptric Camera

Visual Tracking of Planes with an Uncalibrated Central Catadioptric Camera The 29 IEEE/RSJ International Conference on Intelligent Robots and Systems October 11-15, 29 St. Louis, USA Visual Tracking of Planes with an Uncalibrated Central Catadioptric Camera A. Salazar-Garibay,

More information

Optimization of the Simultaneous Localization and Map-Building Algorithm for Real-Time Implementation

Optimization of the Simultaneous Localization and Map-Building Algorithm for Real-Time Implementation 242 IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 17, NO. 3, JUNE 2001 Optimization of the Simultaneous Localization and Map-Building Algorithm for Real-Time Implementation José E. Guivant and Eduardo

More information

A Sparse Hybrid Map for Vision-Guided Mobile Robots

A Sparse Hybrid Map for Vision-Guided Mobile Robots A Sparse Hybrid Map for Vision-Guided Mobile Robots Feras Dayoub Grzegorz Cielniak Tom Duckett Department of Computing and Informatics, University of Lincoln, Lincoln, UK {fdayoub,gcielniak,tduckett}@lincoln.ac.uk

More information

Week 2: Two-View Geometry. Padua Summer 08 Frank Dellaert

Week 2: Two-View Geometry. Padua Summer 08 Frank Dellaert Week 2: Two-View Geometry Padua Summer 08 Frank Dellaert Mosaicking Outline 2D Transformation Hierarchy RANSAC Triangulation of 3D Points Cameras Triangulation via SVD Automatic Correspondence Essential

More information

Visual Bearing-Only Simultaneous Localization and Mapping with Improved Feature Matching

Visual Bearing-Only Simultaneous Localization and Mapping with Improved Feature Matching Visual Bearing-Only Simultaneous Localization and Mapping with Improved Feature Matching Hauke Strasdat, Cyrill Stachniss, Maren Bennewitz, and Wolfram Burgard Computer Science Institute, University of

More information

An Angle Estimation to Landmarks for Autonomous Satellite Navigation

An Angle Estimation to Landmarks for Autonomous Satellite Navigation 5th International Conference on Environment, Materials, Chemistry and Power Electronics (EMCPE 2016) An Angle Estimation to Landmarks for Autonomous Satellite Navigation Qing XUE a, Hongwen YANG, Jian

More information

Euclidean Reconstruction Independent on Camera Intrinsic Parameters

Euclidean Reconstruction Independent on Camera Intrinsic Parameters Euclidean Reconstruction Independent on Camera Intrinsic Parameters Ezio MALIS I.N.R.I.A. Sophia-Antipolis, FRANCE Adrien BARTOLI INRIA Rhone-Alpes, FRANCE Abstract bundle adjustment techniques for Euclidean

More information

Computing the relations among three views based on artificial neural network

Computing the relations among three views based on artificial neural network Computing the relations among three views based on artificial neural network Ying Kin Yu Kin Hong Wong Siu Hang Or Department of Computer Science and Engineering The Chinese University of Hong Kong E-mail:

More information

Revising Stereo Vision Maps in Particle Filter Based SLAM using Localisation Confidence and Sample History

Revising Stereo Vision Maps in Particle Filter Based SLAM using Localisation Confidence and Sample History Revising Stereo Vision Maps in Particle Filter Based SLAM using Localisation Confidence and Sample History Simon Thompson and Satoshi Kagami Digital Human Research Center National Institute of Advanced

More information

StereoScan: Dense 3D Reconstruction in Real-time

StereoScan: Dense 3D Reconstruction in Real-time STANFORD UNIVERSITY, COMPUTER SCIENCE, STANFORD CS231A SPRING 2016 StereoScan: Dense 3D Reconstruction in Real-time Peirong Ji, pji@stanford.edu June 7, 2016 1 INTRODUCTION In this project, I am trying

More information

Advanced Techniques for Mobile Robotics Bag-of-Words Models & Appearance-Based Mapping

Advanced Techniques for Mobile Robotics Bag-of-Words Models & Appearance-Based Mapping Advanced Techniques for Mobile Robotics Bag-of-Words Models & Appearance-Based Mapping Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Motivation: Analogy to Documents O f a l l t h e s e

More information