Estimating Pose and Motion using Bundle Adjustment and Digital Elevation Model Constraints. Gil Briskin



Estimating Pose and Motion using Bundle Adjustment and Digital Elevation Model Constraints

Research Thesis

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science

Gil Briskin

Submitted to the Senate of the Technion - Israel Institute of Technology
Sivan 5773, Haifa, May 2013


This research was carried out under the supervision of Prof. Ehud Rivlin and Dr. Hector P. Rotstein, in the Faculty of Computer Science.


Contents

List of Figures
Abstract
1 Introduction
  1.1 Related Work
  1.2 Current Work
  1.3 Thesis Organization
2 Bundle Adjustment
  2.1 Introduction
  2.2 Coordinates System
  2.3 BA equations
  2.4 Cost Function and Numeric Solver
    2.4.1 Gauss-Newton's Method
    2.4.2 Levenberg Marquardt
  2.5 Sparse Bundle Adjustment
3 Digital Terrain Map
  3.1 Introduction
  3.2 Definition and Properties
  3.3 Degrees of Freedom
4 Bundle Adjustment With Digital Terrain Model Constraints
  4.1 Introduction
  4.2 Coordinate System Conversion
  4.3 Surface Approximation
    4.3.1 Plane Approximation
    4.3.2 Second order Approximation
  4.4 Bundle Adjustment Framework
    4.4.1 Image Dilution
    4.4.2 Feature Extraction
    4.4.3 Outlier Detection
    4.4.4 Parameters Initial Guess
    4.4.5 Solving the Bundle Adjustment
  4.5 Influence on the Bundle Adjustment Structure and Computation Time
    4.5.1 Bundle Adjustment Structure
    4.5.2 Computation Time
  4.6 Handling Errors
5 Iterative Closest Point
  5.1 Introduction
  5.2 Iterative Closest Point
6 Experimental Results
  6.1 Introduction
  6.2 Synthetic Experiments
  6.3 Experiments on Small Scale Model
    6.3.1 Experiment Outline
    6.3.2 Evaluation methods
    6.3.3 Experiment Flow
    6.3.4 Results
7 Conclusion and Future Work
Hebrew Abstract

List of Figures

2.1 The Bundle Adjustment weight matrix
2.2 The Bundle Adjustment Jacobian matrix
2.3 The Bundle Adjustment Hessian matrix
3.1 Digital Terrain Model
3.2 Examples for degenerated scenes
The input generation in the synthetic experiments
The synthetic scenario
Camera's Location Error on Angular Drift
Camera's Location Error on Velocity Error
Camera's Location Error Norm
Slopes Comparison
Images from the first experiment
Images from the second experiment
The estimated and measured camera path
The estimated location error
The estimated angles error
The reconstructed DTM Heights
Ill-constrained point example


Abstract

Pose and motion estimation of a calibrated camera is a common application in the photogrammetric world. In most cases it is solved by combining motion estimation with connections between the images taken by the camera and external geographical data such as the Global Positioning System (GPS), orthophotos, or a Digital Terrain Model. While the motion estimation can be solved automatically using only the connections between the images, the connections to the geographical data are generated manually or by using geographical measurements, such as GPS readings along the camera's motion.

The contribution of this thesis is a new constraint added to a pose and motion estimation algorithm. We propose to integrate the Digital Terrain Model (DTM) into the Bundle Adjustment framework. To that end, the terrain is approximated by a differentiable second-order function, and new constraints are added to the Bundle Adjustment that minimize the distance between the 3D points and the terrain approximation. A framework that solves the pose and motion estimation with the new constraints from an inaccurate initial guess is presented. We show that under certain conditions the proposed method can replace other constraints based on geographical measurements, such as GPS readings of the camera's motion, and thereby become the first method that solves the pose and motion estimation of a sequence of images using only a DTM, without an additional geographical source.

Our method has several advantages. The generated 3D structure is more accurate when the DTM constraints are added to the Bundle Adjustment, specifically for ill-constrained points. In addition, a DTM is available worldwide, can be acquired offline, and is resistant to signal distortion and blocking, as opposed to other measurements such as GPS.


Chapter 1

Introduction

In spite of the effort placed in finding efficient and robust estimates for the pose and motion of a calibrated camera from multiple image views, the problem continues to attract extensive attention in the photogrammetric and computer vision communities. Perhaps the main reason for this continuous attention is the well-known fact that pose and motion cannot be uniquely solved from a series of images. Some of the limitations are obvious from the start: one cannot expect to obtain absolute information about pose with respect to an external coordinate system from a sequence of images alone; other limitations are more subtle and relate to the specifics of the visual information. One such limitation is pose estimation when the images are taken from a small baseline relative to the scene. The lack of uniqueness makes the problem hard and ill-conditioned, and hence additional assumptions, external information and specially designed algorithms are required to produce robust and reliable estimates. This thesis presents a new approach to the pose and motion problem based on adding external absolute information, namely information with respect to an external coordinate system, to the data obtained from the sequence of images.

Pose and motion estimation is sometimes referred to as the navigation problem, understood here as the problem of determining the position, orientation and velocity of an object (in the present case, a camera) with respect to an external reference system. In particular, if the scene is outdoors then the reference system will usually be attached to earth global coordinates, while indoors the reference will somehow be attached to the building of interest. Details are discussed below. In some applications, a solution to the motion estimation problem suffices. This is the case, for instance, when dealing with applications that require the transformation of information from one image to another, such as tracking and relative navigation. On the other hand, for applications that transform information from images to an external scene coordinate system and backwards, such as mapping and absolute navigation, pose estimation is required, possibly in addition to the relative solution. The pose and motion estimation problem can be specified as follows: given a sequence of images $I_t$ taken by a calibrated camera at several time instants and from different

locations and orientations, find an estimate for the pose of the camera at the time instants when the images were acquired, together with the relative translation and rotation of the camera between time instants. The approach followed in this work is to attempt to solve this navigation problem using optimization tools; hence one can conceptually consider an objective function $L(\text{pose}, \text{motion}, \text{structure}, \text{data})$ and solve:

$$\arg\min_{\text{pose},\,\text{motion},\,\text{structure}} L(\text{pose}, \text{motion}, \text{structure}, \text{data})$$

Here pose, motion and structure denote parametrizations of the location of the camera with respect to a global coordinate system, of the linear and rotational motion of the camera between time instants, and of the structure of the scene, respectively. Moreover, data denotes all the known data used for formulating the function, including information extracted from the images, calibration parameters of the camera, external information available and additional assumptions that are used. An example of the latter is the rigidity assumption of the scene used throughout. In some instances of the problem formulation the scene structure is eliminated, either totally or partially; however, as will be shown below, the scene structure and its estimation is an important component of the present work.

1.1 Related Work

Among the various approaches to the motion estimation problem, Bundle Adjustment (BA) [FZ98, TMHF99] is currently the method of choice for many applications that require a solution based on images taken under realistic conditions. To solve the pose estimation problem, an external source in addition to the images is needed. The common source is typically a GPS, an Inertial Navigation System (INS), or a combination of both [Ell06, McG04, Lhu11]. BA together with GPS and INS is also used in outdoor vehicle navigation [KA, SKLP06]. Alternatively, the location of one or more of the tracked features is sometimes assumed to be known, e.g., geo-referenced landmarks or manual control points. [DMH04] used a laser sensor, such as LIDAR, to produce automatic control points. Another type of external source is the Digital Terrain Model (DTM), also known as Digital Elevation Model (DEM). [LRR06, LR11] used a DTM to recover the orientation of a pair of images by approximating the DTM surface and using matches between the images to formulate a constraint on the pose parameters. The disadvantage of this method compared to Bundle Adjustment is that the latter solves all the images as a block and thereby achieves better epipolar geometry between the images. Another approach to solving the problem is a coordinate descent approach, namely solving sequentially:

1. $[\text{motion}^k, \text{structure}^k] = \arg\min_{\text{motion},\,\text{structure}} L(\text{pose}^{k-1}, \text{motion}^{k-1}, \text{structure}^{k-1}, \text{data})$
2. $[\text{pose}^k] = \arg\min_{\text{pose}} L(\text{pose}^{k-1}, \text{motion}^{k}, \text{structure}^{k}, \text{data})$

The first step can be solved by using, for example, the epipolar or trifocal tensor geometries [SZB95, YWCO06] or the Bundle Adjustment. The second step can be solved by using scene matching, the Iterated Closest Point (ICP) algorithm [BM92, Zha94], or any other approach that exploits the additional data available in the formulation [HY12]. The solutions to each of the steps are indexed to express the fact that the coordinate descent can be iterated in an attempt to refine the solution. Since the ICP solves for the rigid transformation between the point cloud generated by the BA and the DTM, it is very sensitive to errors in the point cloud structure. For example, velocity errors in the INS will produce a scale error in the estimated point cloud. [ZSN05] introduced a variation on the ICP method that also estimates the scale between the point sets. [SSH+08, JB11] also used star observations, INS measurements and Doppler measurements as absolute information about the camera location and orientation. The solution of the BA using only those geographical measurements produced a drift in the Z axis (height values); to solve that, they added a second step, in which simple constraints between the 3D points and the DTM were added to the BA. All the methods mentioned above have advantages and disadvantages: GPS may or may not be available [BPHR11], the accuracy of the INS solution diverges with time, and landmarks need to be in the field of view and then identified.

1.2 Current Work

The combination of BA with some sort of external information is the field of the current work; however, instead of assuming the availability of absolute information about position, motion or structure, it will be shown that the BA can be extended by using a Digital Elevation or Terrain Model (DEM/DTM), which is a digital representation of the ground surface topography and hence can be used as a source of absolute information on the scene. It can also be used to correct the INS drift by extracting the transformation between the 3D points generated by the BA and the DTM. Although plausible and intuitively appealing, it is well known that if the objective function does not meet some stringent assumptions then coordinate descent will not necessarily produce an optimal solution to the overall optimization problem, and may not even converge to a stationary point. Consequently, one is interested in finding a single-step solution to the problem, namely one that solves for motion, structure and pose simultaneously. The main purpose of our work is to present one such algorithm, and to compare its performance with respect to a more standard variation of coordinate descent. In this thesis, we propose to solve both the relative and absolute pose in the BA framework by adding the DTM to the BA constraints, using a second-order approximation of the terrain. By using the DTM information directly in the BA framework, we solve the pose and motion estimation problem without the need for additional measurements, such as GPS and INS, or an additional solver, such as the ICP.

1.3 Thesis Organization

The rest of the thesis is organized as follows: Chapter 2 gives a general overview of the Bundle Adjustment. Chapter 3 describes the DTM properties and its limitations. Chapter 4 describes our proposed method in detail. Chapter 5 briefly describes the ICP algorithm to which we compared our results, and in Chapter 6 the evaluation of our proposed method on synthetic and real images is presented. The conclusion and future work are detailed in Chapter 7.

Chapter 2

Bundle Adjustment

2.1 Introduction

Bundle Adjustment is an algorithm that uses information extracted from a set of images taken by a moving camera to compute the 3D structure of the observed scene together with the relative motion of the camera. BA has effectively become the algorithm of choice for reconstructing large and complex scenes, under the assumption that the relative motion between images is totally or partially measured using an additional sensor, e.g. a differential GPS. BA was originally formulated in the photogrammetric literature and later re-invented as a structure-from-motion method in [FZ98]; a comprehensive tutorial including pointers to the relevant literature is provided in [TMHF99].

The name Bundle Adjustment refers to the fact that the method considers the bundles of light rays emanating from a camera at several time instants. Each light ray represents the possible 3D locations of a feature seen in the image the light ray emanates from. The light ray originates from the location of the camera's pinhole at the time the image was taken, and passes through the projection of that feature on the image. Light rays from 2D observations of the same feature should ideally intersect at a unique 3D point, as they represent the same 3D object. The relative pose of the camera at each imaging location can be adjusted so as to meet the constraint that every set of rays representing a single feature intersects at a single 3D point.

In practice, BA can be formulated as a large sparse geometric parameter estimation problem that can be efficiently solved using modifications of standard optimization algorithms. The optimization problem is formulated using measurements, parameters and equations (constraints) that tie the measurements and parameters together. In classical BA, the measurements consist of a collection of 2D locations of points extracted from the images, where a set of 2D locations on different images that represent the same feature is called a track. The parameters are the camera poses at the times the images were taken and the 3D location representing the intersection point of the light rays of each track. The cost function to be minimized is the squared sum of the differences between the measured 2D locations and the projections

of the estimated 3D locations of those features using the estimated camera poses. The cost function is formed by weighting each constraint by a weight matrix that represents the predicted noise of the measurement. Alternatively, the weights can also be thought of as normalizing factors, so that different types of measurements can be combined into a single cost function.

In the standard BA the cost function consists only of constraints resulting from the 2D projections. As a consequence, only the relative solution can be obtained, namely a solution up to 7 degrees of freedom. In order to solve for the absolute pose, new information and constraints are required, and a number of options have been proposed in the literature. For example, a source of measurements can be the solution of an Inertial Navigation System (INS). The new constraint would be the Euclidean distance between the INS location of the camera at time t and its estimated location. Although those measurements resolve the remaining degrees of freedom, they are sensitive to drift and velocity errors in the INS system. Another type of measurement is the 3D location of a track, also known as a control point. The constraint in this case would be the distance between the estimated point and the measured 3D location. The disadvantage of control points is that the matching between the measured 3D location and the 2D locations is done manually.

In this chapter we briefly present the Bundle Adjustment algorithm without the new constraints proposed in this thesis, which are described in detail in Chapter 4. The rest of the chapter is organized as follows: In Section 2.2 the conversion between the different coordinate systems embedded in the equations is detailed. Section 2.3 describes the BA equations and parameters used in this work. Section 2.4 formulates the cost function and the numeric solver used to minimize it. Section 2.5 describes the sparse nature of the Bundle Adjustment and how it is used to significantly reduce the complexity in time and space.

2.2 Coordinates System

In order to solve for absolute pose, global coordinate systems with respect to the earth are required. For example, GPS measurements of the camera location are given in either Longitude-Latitude-Height or in Earth-Centered-Earth-Fixed (ECEF) coordinates, while a DTM is usually referenced to Universal Transverse Mercator (UTM) coordinates, possibly due to its defense-oriented origins. Except for ECEF, none of the above is a Cartesian coordinate system. On the other hand, the pinhole camera model [HZ04] used in the BA to forward-project an estimated 3D point onto the camera plane is Cartesian. Moreover, since each estimated 3D point is projected onto several images (otherwise it is rejected, since a single light ray does not provide enough information to determine its location), the ability to convert between the Cartesian systems of the images is needed in order to construct the BA cost function.

Another type of projection used in this thesis is the backward-projection from the

camera to the DTM surface. This projection is described in detail in Chapter 4, but it is clear that the ability to convert between the Cartesian system of the camera and the DTM is needed. In addition, when the DTM is referenced to UTM coordinates, the backward-projection function would suffer from the geographical distortions resulting from the UTM approximation.

In order to address all the issues raised above, we use a single Cartesian system in our Bundle Adjustment, and convert all the geographical information from the different coordinate systems to that single Cartesian system at the beginning of the process, before building the cost function. All the measurements, parameters and constraints are then defined and solved in that coordinate system. Obviously, the parameters can be transformed back to their original coordinate systems using the reverse transformation.

This method has several advantages. First, it simplifies all the constraints and equations, since there is no need for conversions due to coordinate system differences. For example, no conversion of the GPS measurements of the camera location is added to the forward-projection constraints between the estimated 3D point and the image plane. Secondly, if the DTM is referenced to UTM coordinates, it removes the geographical distortion in the backward-projection function resulting from the UTM approximation. Another advantage is the convenience of having all the parameters in the same coordinate system, so that, for example, calculating the distance between two points amounts to taking the difference between their values, with no additional conversion.

We use a Local-Level Local-North (LLLN) system, also known as a North-East-Down (NED) system, as our coordinate system. To avoid numeric overflow and to accurately represent the problem, we prefer the parameters to have small values. Therefore, we chose the origin to be the measured location of the first camera, so that all other parameters only represent the offset from the first camera. For the same reason we did not choose the ECEF coordinate system, since its origin is set to the center of the earth. In our experiments, there were three types of geographical measurements: camera locations in Longitude-Latitude-Height, camera angles in an LLLN coordinate system originated at the measured camera location, and the DTM in a UTM projection. The camera locations were converted to the main coordinate system by first converting them to ECEF and then transforming them to the main coordinate system. Since the angles are given in an LLLN system in the first place, we converted them to the main coordinate system by simply rotating them with the rotation matrix that transforms between the two LLLN coordinate systems. The conversion of the DTM to our coordinate system is detailed in Section 4.2.
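To make the conversion pipeline above concrete, the following Python sketch (not the thesis implementation) chains the geodetic-to-ECEF and ECEF-to-NED transformations using standard WGS84 constants. Angles are assumed to be in radians, and all function names are illustrative.

```python
import numpy as np

WGS84_A = 6378137.0                      # semi-major axis [m]
WGS84_F = 1.0 / 298.257223563            # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)     # first eccentricity squared

def geodetic_to_ecef(lat, lon, h):
    """Longitude-Latitude-Height (radians, meters) to ECEF coordinates."""
    n = WGS84_A / np.sqrt(1.0 - WGS84_E2 * np.sin(lat) ** 2)
    return np.array([(n + h) * np.cos(lat) * np.cos(lon),
                     (n + h) * np.cos(lat) * np.sin(lon),
                     (n * (1.0 - WGS84_E2) + h) * np.sin(lat)])

def ecef_to_ned_matrix(lat0, lon0):
    """Rotation from ECEF to a local North-East-Down (LLLN) frame at (lat0, lon0)."""
    sl, cl = np.sin(lat0), np.cos(lat0)
    so, co = np.sin(lon0), np.cos(lon0)
    return np.array([[-sl * co, -sl * so,  cl],
                     [     -so,       co, 0.0],
                     [-cl * co, -cl * so, -sl]])

def geodetic_to_local_ned(lat, lon, h, lat0, lon0, h0):
    """Convert a geodetic measurement into the main LLLN frame originated at the first camera."""
    origin = geodetic_to_ecef(lat0, lon0, h0)
    return ecef_to_ned_matrix(lat0, lon0) @ (geodetic_to_ecef(lat, lon, h) - origin)
```

With the first measured camera location as (lat0, lon0, h0), all other geographical measurements are expressed as small offsets in the same local frame, which is exactly the property used to avoid numeric overflow.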

2.3 BA equations

The measurements together with the parameters below define the set of constraints (equations) solved by the Bundle Adjustment. Since measurements are based on physical observations, they are not accurate and include errors and noise from different sources. In particular, the noise can be modeled as a white Gaussian process. Let $\tilde{m}_i \in \mathbb{R}^n$ be a noisy measurement vector, and $m_i \in \mathbb{R}^n$ the true, ideal measurement vector. Then:

$$\tilde{m}_i = m_i + n_i, \qquad n_i \sim \mathcal{N}(0, \Sigma_i) \tag{2.1}$$

where $n_i \in \mathbb{R}^n$ is modeled as a Gaussian noise vector with zero mean and covariance $\Sigma_i \in \mathbb{R}^{n \times n}$. A covariance matrix with large singular values indicates that the error vector $n_i$ may have large values and therefore the measurement might be noisy. In the same way, a covariance matrix with small singular values indicates that with high probability the error vector is small and the measurement is close to its true value. The covariance matrix is used to weight the different equations in the cost function defined in Section 2.4, so that errors on measurements with large singular values contribute less to the cost function than the same errors on measurements with small singular values.

The common measurements used in the Bundle Adjustment are the 2D observations of the 3D features extracted from the images. The 2D observations are usually generated by an image processing tool, such as the Scale Invariant Feature Transform [Low04], or can even be set manually. We denote such an observation on an image taken at time $t$ by $\tilde{p}^t_i \in \mathbb{R}^2$. That observation is then matched, manually or automatically, to 2D observations on other images. A set of such observations is called a track:

$$\text{track}_i = \{\tilde{p}^{t_1}_i, \tilde{p}^{t_2}_i, \ldots, \tilde{p}^{t_h}_i\}, \qquad [t_1, t_2, \ldots, t_h] \subseteq [1..n], \quad 3 \le h \le n$$

For each track $i$ we denote by $P_i$ the 3D point representing the location of the feature to which all of the track's 2D observations refer. For each observation $\tilde{p}^t_i$ a covariance matrix $\Sigma^t_i \in \mathbb{R}^{2 \times 2}$ is defined such that:

$$\tilde{p}^t_i = p^t_i + n^t_i \tag{2.2}$$

where $p^t_i$ is the true location of the feature and $n^t_i \sim \mathcal{N}(0, \Sigma^t_i)$ is the Gaussian noise vector. The desired situation is that each track contains 2D observations from as many images as possible. This has two main advantages:

- The estimation of the 3D point $P_i$ will be more robust to bad matches in the track.

- The longer the track, the wider the baseline angle between the extreme images, which increases the accuracy of the estimation of the 3D point. See the discussion of ill-constrained points in Chapter 7 for more details.

Another type of measurement is the camera poses, locations and orientations, generated by an Inertial Navigation System (INS) and a Global Positioning System (GPS) at different time stamps. An INS is based on accelerometers, gyroscopes, or other motion-sensing devices, and together with GPS is able to generate a location at any time stamp by integrating the motion-sensing measurements. As such, the location errors are not independent, since an error in the accelerometer or gyroscope at time $t$ will influence all further locations and angles due to the integration process. This type of error is also referred to as a drift error. For such errors there is a strong correlation between the poses at different time stamps, and the Gaussian noise error model used in this thesis is not suitable. Since we did not conduct experiments using real GPS and INS data (see Chapter 6), and since in Chapter 4 we present a method that solves the absolute pose problem using only the DTM, we preferred to use the simple representation of the error model. To formulate our pose error model, the pose measurements at the different time stamps are denoted by:

$$\tilde{T}^t, \tilde{\Psi}^t, \qquad t \in [1..n]$$

In addition, the covariance matrix $\Sigma^t_{T,\Psi}$ is given such that:

$$\begin{bmatrix} \tilde{T}^t \\ \tilde{\Psi}^t \end{bmatrix}_{6\times1} = \begin{bmatrix} T^t \\ \Psi^t \end{bmatrix}_{6\times1} + n^t_{T,\Psi}, \qquad n^t_{T,\Psi} \sim \mathcal{N}(0, \Sigma^t_{T,\Psi}) \tag{2.3}$$

where $n^t_{T,\Psi}$ is the Gaussian error vector and $\begin{bmatrix} T^t \\ \Psi^t \end{bmatrix}_{6\times1}$ represents the true location and orientation of the camera in the form of Euler angles.

While the measurements define the equations of the cost function, the parameters are the variables of the cost function. These are the types of parameters used in this thesis:

- $T^t_{3\times1}, \Psi^t_{3\times1}$ - 6 parameters representing the camera pose at time $t$: $T^t$ represents the camera location and $\Psi^t$ represents its Euler angles. Both $T^t$ and $\Psi^t$ are relative to the main coordinate system defined in Section 2.2. The transformation from the Euler angles to the rotation matrix used in Equation 2.4 can be found in [Sla99].
- $P_i$ - 3 parameters representing the 3D location of feature $i$.

To model the camera projection, we use a pinhole camera model [HZ04]. The 2D projection of feature $i$ on the image taken at time $t$, $p^t_i$, can be calculated using the

camera pose at time $t$, $T^t, \Psi^t$, and the 3D location of feature $i$, $P_i$, by:

$$p^t_i = \begin{bmatrix} V^t_i(0) / V^t_i(2) \\ V^t_i(1) / V^t_i(2) \end{bmatrix}_{2\times1}, \qquad V^t_i = K P^t_i = K R(\Psi^t)(P_i - T^t) \tag{2.4}$$

It is assumed that the camera internal parameters, represented by the matrix $K$, are known and accurate. In case this assumption is not valid, the internal parameters can be estimated by adding them as parameters to the BA.

2.4 Cost Function and Numeric Solver

As presented by [TMHF99], the BA problem can be expressed by looking at the measurements and their error model, and finding the parameters that maximize the likelihood of the solution:

$$\arg\max_{P_i, \Psi^t, T^t} \prod_{i,t} \Pr(n^t_i = \tilde{p}^t_i - p^t_i) \prod_t \Pr(n^t_\Psi = \tilde{\Psi}^t - \Psi^t) \prod_t \Pr(n^t_T = \tilde{T}^t - T^t)$$

Substituting the Gaussian densities of the error model and taking the negative logarithm of the product gives the equivalent minimization:

$$\arg\min_{P_i, \Psi^t, T^t} \sum_{i,t} (\tilde{p}^t_i - p^t_i)^T (\Sigma^t_i)^{-1} (\tilde{p}^t_i - p^t_i) + \sum_t (\tilde{T}^t - T^t)^T (\Sigma^t_T)^{-1} (\tilde{T}^t - T^t) + \sum_t (\tilde{\Psi}^t - \Psi^t)^T (\Sigma^t_\Psi)^{-1} (\tilde{\Psi}^t - \Psi^t)$$

In summary, the BA is the problem of minimizing the following cost function:

$$q(P_i, \Psi^t, T^t) = \sum_{i,t} (\tilde{p}^t_i - p^t_i)^T (\Sigma^t_i)^{-1} (\tilde{p}^t_i - p^t_i) + \sum_t (\tilde{T}^t - T^t)^T (\Sigma^t_T)^{-1} (\tilde{T}^t - T^t) + \sum_t (\tilde{\Psi}^t - \Psi^t)^T (\Sigma^t_\Psi)^{-1} (\tilde{\Psi}^t - \Psi^t)$$
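For illustration, here is a small Python sketch of the forward-projection of Equation (2.4) and of the reprojection part of the cost; the Euler-angle convention, data layout and function names are assumptions made for the example, not the thesis code.

```python
import numpy as np

def rotation_from_euler(psi):
    """Rotation matrix from Euler angles (roll, pitch, yaw); Z-Y-X composition is an illustrative choice."""
    r, p, y = psi
    cr, sr = np.cos(r), np.sin(r)
    cp, sp = np.cos(p), np.sin(p)
    cy, sy = np.cos(y), np.sin(y)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def project(K, T, psi, P):
    """Pinhole projection of a 3D point P for a camera at (T, psi), Eq. (2.4)."""
    V = K @ rotation_from_euler(psi) @ (P - T)
    return V[:2] / V[2]

def reprojection_cost(K, poses, points, observations):
    """Weighted SSE over 2D observations; each observation is (track_id, frame_id, p_tilde, Sigma_inv)."""
    cost = 0.0
    for i, t, p_obs, Sigma_inv in observations:
        r = p_obs - project(K, poses[t][0], poses[t][1], points[i])
        cost += r @ Sigma_inv @ r
    return cost
```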

The above cost function is a weighted Sum of Squared Errors (SSE) when the inverses of the covariance matrices, $(\Sigma^t_i)^{-1}, (\Sigma^t_T)^{-1}, (\Sigma^t_\Psi)^{-1}$, are considered as the weights. Each term in the cost function, such as $(\tilde{p}^t_i - p^t_i)^T (\Sigma^t_i)^{-1} (\tilde{p}^t_i - p^t_i)$, can be thought of as an equation that ties the measurement to the parameters. It can easily be seen that the cost function described above is not necessarily convex, and therefore a numeric solver is needed. We use the Levenberg Marquardt (LM) algorithm, which is a step-control algorithm that interpolates between the Gauss-Newton and gradient methods.

2.4.1 Gauss-Newton's Method

Given some non-convex function $h(x)$, Gauss-Newton's method [Bjo96, HZ04] attempts to construct a sequence $x_n$ from an initial guess $x_0$ that converges towards the minimum of the function. First, $h(x)$ is approximated by its quadratic Taylor expansion around $x_0$:

$$h(x_0 + \delta x) \approx h(x_0) + g^T \delta x + \tfrac12 \delta x^T H \delta x, \qquad g \equiv \frac{dh}{dx}(x_0), \quad H \equiv \frac{d^2h}{dx^2}(x_0) \tag{2.5}$$

This approximation is convex and has a simple global minimum, which can be found explicitly using linear algebra. Setting

$$\frac{d\,h(x_0 + \delta x)}{d\,\delta x} = g + H \delta x = 0 \tag{2.6}$$

gives the Gauss-Newton step:

$$\delta x = -H^{-1} g \tag{2.7}$$

The next value of the variable is $x_1 = x_0 + \delta x$. Iterating on the Gauss-Newton step (by recalculating the Taylor approximation around the point $x_n = x_{n-1} + \delta x$ and solving (2.6)) gives Gauss-Newton's method, which hopefully converges to the minimum.

2.4.2 Levenberg Marquardt

For large steps, the Taylor approximation (2.5) might be inaccurate, and the Gauss-Newton step may not decrease $h(x)$. On the other hand, the gradient descent method guarantees a decrease but may be slow.

A combination of the two methods is used:

$$\delta x = -(H + \lambda I)^{-1} g \tag{2.8}$$

where $\lambda$ is the parameter that weights between the two steps. When $\lambda$ is large, $H$ can be neglected and the step is a small gradient step: $\delta x \approx -\frac{1}{\lambda} g$. When $\lambda$ is small, $\lambda I$ can be neglected and the step is a Gauss-Newton step: $\delta x \approx -H^{-1} g$.

Levenberg Marquardt (LM) [PTVF92] is a step-control algorithm that interpolates between Gauss-Newton's method and the gradient method. The algorithm changes the $\lambda$ factor through the iterations in order to achieve fast and accurate convergence. There are many variants of the LM algorithm. The following one was used in this work:

1. $\lambda$ = initial value, iter = 1
2. while iter < maximum number of iterations && improve > minimum improve
   (a) while true
       i. $\delta x = -(H + \lambda I)^{-1} g$
       ii. if $h(x_i + \delta x) < h(x_i)$
           A. $\lambda = \lambda / 10$
           B. break
       iii. else
           A. $\lambda = \lambda \cdot 10$
   (b) $x_{i+1} = x_i + \delta x$
   (c) iter = iter + 1
   (d) improve = $h(x_i) - h(x_{i+1})$
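A minimal Python sketch of the step-control loop listed above, for a generic cost h(x) with user-supplied gradient and Hessian approximation; the naming and stopping thresholds are illustrative choices, not the thesis code.

```python
import numpy as np

def levenberg_marquardt(h, grad, hess, x0, lam=1e-3, max_iter=100, min_improve=1e-8):
    """Interpolate between Gauss-Newton and gradient steps by adapting lambda."""
    x = np.asarray(x0, dtype=float)
    improve = np.inf
    it = 1
    while it < max_iter and improve > min_improve:
        g, H = grad(x), hess(x)
        while True:
            dx = -np.linalg.solve(H + lam * np.eye(len(x)), g)
            if h(x + dx) < h(x):     # step accepted: trust the quadratic model more
                lam /= 10.0
                break
            lam *= 10.0              # step rejected: fall back towards a gradient step
        x_new = x + dx
        improve = h(x) - h(x_new)
        x = x_new
        it += 1
    return x
```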

To describe the numeric step in the Bundle Adjustment, let us first represent the cost function of Section 2.4 in matrix form:

$$q(x) = \tfrac12 (f(x) - b)^T W (f(x) - b) \tag{2.9}$$

where $f(x)$ is the projection function mapping the parameter vector $x$ into the space of the measurement vector $b$, and $W$ is the weight matrix composed from the measurement covariances. Therefore:

$$g = \frac{dq}{dx} = \frac{d}{dx}\,\tfrac12 (f(x) - b)^T W (f(x) - b) = J^T W (f(x) - b) \tag{2.10}$$

where $J = \frac{d(f(x) - b)}{dx} = \frac{df(x)}{dx}$. The Hessian is therefore:

$$H = \frac{d^2q}{dx^2} = \frac{dg}{dx} = J^T W \frac{d(f(x) - b)}{dx} + \sum_i \Big(\frac{d^2 f(x)_i}{dx^2}\Big)\big(W(f(x) - b)\big)_i = J^T W J + \sum_i \Big(\frac{d^2 f(x)_i}{dx^2}\Big)\big(W(f(x) - b)\big)_i$$

In practice, the term $\sum_i \big(\frac{d^2 f(x)_i}{dx^2}\big)\big(W(f(x) - b)\big)_i$ is relatively small in comparison to $J^T W J$ (since either the prediction error $f(x) - b$ is small or the model is nearly linear, $\frac{d^2 f(x)}{dx^2} \approx 0$). Dropping the second term gives the Gauss-Newton approximation to the least squares Hessian:

$$H \approx J^T W J \tag{2.11}$$

Now, the Levenberg Marquardt step becomes:

$$\delta x = -(H + \lambda I)^{-1} g = -(J^T W J + \lambda I)^{-1} J^T W (f(x) - b) \tag{2.12}$$

2.5 Sparse Bundle Adjustment

One of the main strengths of the Bundle Adjustment is its sparse structure, which enables it to solve large problems with a large number of constraints and parameters in real time. It boils down to the fact that the calculation time of a single step in the numeric solver is proportional to the number of pose parameters and not to the number of 3D point parameters. To show that, let us look at a simple problem with eight 3D points seen by 3 images. The parameter vector is then:

$$x = [P_1, \ldots, P_8, T^1, \Psi^1, \ldots, T^3, \Psi^3]_{42\times1}$$

and the measurement vector is:

$$b = [\{\tilde{p}^1_1, \ldots, \tilde{p}^3_1\}, \ldots, \{\tilde{p}^1_7, \tilde{p}^2_7\}, \{\tilde{p}^1_8, \ldots, \tilde{p}^3_8\}, \{\tilde{T}^1, \tilde{\Psi}^1\}, \ldots, \{\tilde{T}^3, \tilde{\Psi}^3\}]_{62\times1}$$

where in this case the 7th point is not seen by the 3rd camera. If all the measurements are independent, which is usually the case, then the weight matrix $W$ is block diagonal and its order is defined by the measurement vector $b$. Figure 2.1 shows the weight matrix for the example above.

Figure 2.1: The Bundle Adjustment weight matrix.

The Jacobian matrix is defined by $J = \frac{df(x)}{dx}$. The rows correspond to the measurement vector $b$, and the columns correspond to the parameter vector $x$. To evaluate the structure of the Jacobian, let us first examine the derivatives of the constraints. The derivatives of the 2D observation constraints are:

$$\frac{d(\tilde{p}^t_i - p^t_i)}{dP_j} = 0,\ j \ne i \qquad \frac{d(\tilde{p}^t_i - p^t_i)}{d\Psi^j} = 0,\ j \ne t \qquad \frac{d(\tilde{p}^t_i - p^t_i)}{dT^j} = 0,\ j \ne t$$

The derivatives of the camera constraints are:

$$\frac{d(\tilde{T}^t - T^t)}{dP_j} = \frac{d(\tilde{\Psi}^t - \Psi^t)}{dP_j} = 0,\ \forall j \qquad \frac{d(\tilde{T}^t - T^t)}{d\Psi^j} = \frac{d(\tilde{\Psi}^t - \Psi^t)}{dT^j} = 0,\ \forall j \qquad \frac{d(\tilde{T}^t - T^t)}{dT^j} = \frac{d(\tilde{\Psi}^t - \Psi^t)}{d\Psi^j} = 0,\ j \ne t$$

In Figure 2.2, the sparseness of the Jacobian can easily be seen for the example defined above.

Figure 2.2: The Bundle Adjustment Jacobian matrix.

As was seen in Equation 2.11, the Hessian can be approximated by $H = J^T W J$. Since $W$ is block diagonal it does not change the structure of $J$, and therefore the structure of $H$ is equivalent to the structure of $J^T J$. The entry $H_{i,j}$ is therefore non-empty if the inner product of column $J_i$ with column $J_j$ is not zero. The structure of $H$ is called arrowhead and is divided into 4 parts: $N_1$ represents the second derivatives of

points by points, $N_2$ represents the second derivatives of points by camera parameters, $N_3$ is the transpose of $N_2$, $N_3 = N_2^T$, and $N_4$ represents the second derivatives of camera parameters by camera parameters. Figure 2.3 shows the Hessian matrix for the example defined above.

Figure 2.3: The Bundle Adjustment Hessian matrix.

Note that the step control $\lambda I$ added to $H$ does not change the structure of the matrix, as the diagonal is already assumed to be full. To further analyze the calculation needed to solve the numeric step defined in

Equation 2.12, we denote the right-hand side of the step equation by $g = -J^T W (f(x) - b)$ (the negative gradient), so that the step solves $(H + \lambda I)\,\delta x = g$. Dividing $g$ together with $\delta x$ into two parts, $g = \begin{bmatrix} g_1 \\ g_2 \end{bmatrix}$ and $\delta x = \begin{bmatrix} \delta x_1 \\ \delta x_2 \end{bmatrix}$, Equation 2.12 can be written as a set of two equations:

$$\begin{pmatrix} N_1 & N_2 \\ N_3 & N_4 \end{pmatrix} \begin{pmatrix} \delta x_1 \\ \delta x_2 \end{pmatrix} = \begin{pmatrix} g_1 \\ g_2 \end{pmatrix} \tag{2.13}$$

From the first equation, $N_1 \delta x_1 + N_2 \delta x_2 = g_1$, we can write $\delta x_1$ in terms of $\delta x_2$:

$$\delta x_1 = N_1^{-1}(g_1 - N_2 \delta x_2) \tag{2.14}$$

By applying it to the second equation, $N_3 \delta x_1 + N_4 \delta x_2 = g_2$, we get:

$$N_3 N_1^{-1}(g_1 - N_2 \delta x_2) + N_4 \delta x_2 = g_2$$

and $\delta x_2$ can be extracted as:

$$\delta x_2 = (N_4 - N_2^T N_1^{-1} N_2)^{-1}(g_2 - N_2^T N_1^{-1} g_1) \tag{2.15}$$

and $\delta x_1$ can then be extracted from $\delta x_2$ using Equation 2.14.

Now, let us analyze the computational time needed for computing $\delta x_1$ and $\delta x_2$. For that, let us denote by $k$ the number of features and by $l$ the number of images. Then:

- Calculating $g$ - $O(l \cdot k)$, as the worst case is when all the features are seen by all the images.
- Calculating $N_1^{-1}$ - $O(k)$.
- Calculating $(g_2 - N_2^T N_1^{-1} g_1)$ - $O(l \cdot k)$.
- Calculating $(N_4 - N_2^T N_1^{-1} N_2)^{-1}$ - $O(l^3)$.

Under the assumption that $k < O(l^2)$, the overall computation is $O(l^3)$ and is proportional to the number of images and not to the number of features.
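A sketch of the reduced (Schur complement) solve of Equations (2.13)-(2.15) is shown below; the block layout (3x3 point blocks for N1, 6 parameters per camera) follows the text, while the dense NumPy representation and the function name are illustrative assumptions.

```python
import numpy as np

def schur_step(N1_blocks, N2, N4, g1, g2):
    """Solve the arrowhead system (2.13) via the Schur complement (2.14)-(2.15).

    N1_blocks: list of k 3x3 point blocks (N1 is block diagonal),
    N2: (3k x 6l) point-camera block, N4: (6l x 6l) camera block,
    g1, g2: point and camera parts of the right-hand side.
    """
    # Invert N1 block by block - O(k); this is what makes the step cheap.
    N1_inv = np.zeros((len(g1), len(g1)))
    for i, B in enumerate(N1_blocks):
        N1_inv[3 * i:3 * i + 3, 3 * i:3 * i + 3] = np.linalg.inv(B)

    # Reduced camera system (Schur complement), inverted in O(l^3).
    S = N4 - N2.T @ N1_inv @ N2
    rhs = g2 - N2.T @ N1_inv @ g1
    dx2 = np.linalg.solve(S, rhs)          # camera update, Eq. (2.15)
    dx1 = N1_inv @ (g1 - N2 @ dx2)         # point update, Eq. (2.14)
    return dx1, dx2
```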

Chapter 3

Digital Terrain Map

3.1 Introduction

As described in Chapter 2, a simple Bundle Adjustment based only on a sequence of images can be solved only up to 7 degrees of freedom. In order to resolve these degrees of freedom, additional constraints on geographical data are needed. One type of global information is the Digital Terrain Model (DTM). A DTM is a digital model, or 3D representation, of a terrain's surface. Before describing the new constraints added to the Bundle Adjustment, the properties and limitations of the DTM are described in this chapter. Section 3.2 describes the DTM and Section 3.3 describes some degenerate cases in which solving the absolute pose using a DTM is limited.

3.2 Definition and Properties

A Digital Terrain Map (DTM) is a model of the surface of the earth. More specifically, given a two-dimensional (2D) parametrization of the horizontal location on the earth surface, the DTM is the mapping $DTM : \mathbb{R}^2 \to \mathbb{R}$ providing the altitude of the earth surface at a particular horizontal location: $h(x, y) = DTM_{x,y}$. The parametrization $x, y$ may denote the geographical coordinates, with $x$ being the latitude and $y$ the longitude. Alternatively, the two variables can denote the northing and easting of a UTM projection. Likewise, $h$ may denote the altitude of the terrain above sea level or above a reference ellipsoid, typically WGS84. In practice, the earth surface cannot be modeled by a simple function, and instead the DTM is given by altitude values over a discrete grid:

$$h_{i,j} = DTM_{x_i, y_j} = DTM_{i,j}$$

The resolution of the DTM grid ranges from 5 meters to 90 meters. Dense DTMs (5 meter resolution) are usually available only to official authorities, but the sparser DTMs are freely available: the Shuttle Radar Topography Mission (SRTM) [12] has

global coverage with 90 m resolution, whereas the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) [AST] has coverage of 99% and a resolution of 30 m. The accuracy of the DTM height values ranges from one meter to 15 meters. The accuracy depends on the source of the elevation data, the terrain roughness and the sampling density. For example, [RMB06] states that 90% of the height errors in SRTM are below 5-9 meters, depending on the location. The errors might be correlated and obviously influence the quality of the registration to the DTM. While some of the errors can be handled, other errors, such as a constant drift in all the height values, cannot be resolved when the DTM is the only geographical source. The different error types and the methods used to handle them are detailed in Section 4.6.

Figure 3.1: Digital Terrain Model - a terrain's surface and its Digital Terrain Model.

In terms of memory, the DTM is very compact. For example, a region of 10x10 km with a grid resolution of 25 m has 160,000 sampled points, and assuming each point requires 4 bytes, the whole DTM can be represented with roughly 640 KB.

3.3 Degrees of Freedom

The basic idea behind solving the absolute pose estimation problem using Bundle Adjustment and a DTM is the ability to solve the pose problem by comparing the reconstructed scene with the DTM using the 3D structure of the scene. In some degenerate cases, the terrain is such that it is only possible to solve the absolute estimation problem up to some degrees of freedom. In those cases, additional geographical information is needed, such as GPS.

For example, when the surface is represented as a horizontal plane, there are 3 degrees of freedom: the matched solution can vary along the XY plane and can be scaled without affecting the matching to the DTM. Figure 3.2 describes some additional examples.

Figure 3.2: Examples of degenerate scenes. The top example has 3 degrees of freedom, as the solution can vary along the XY plane and can be scaled. In the first example in the second row, the solution has 2 degrees of freedom, as it can vary along the X axis and can be scaled. In the third example, the solution has a single degree of freedom, as the solution can only be scaled, and in the last example there are no degrees of freedom.


Chapter 4

Bundle Adjustment With Digital Terrain Model Constraints

4.1 Introduction

Our novel approach is to combine the DTM with the BA equations. To add the DTM to the BA, a set of parameters, measurements and constraints must be defined. The requirements imposed by the BA framework are that the new measurements must have covariance matrices and that the new constraints be differentiable; otherwise they could not fit into the numeric solver introduced in Chapter 2. Our method does not introduce new parameters beyond those described in Chapter 2, which is important since, as described earlier, the calculation time of the numeric step is proportional to the number of parameters. As for the constraints, besides adding new constraints that tie the reconstructed structure to the DTM, we also removed the constraints based on the INS measurements, for two reasons. First, we wanted to simplify the equation system so that the absolute solution is determined only by the DTM. Secondly, we wanted to show the strength of the proposed constraints, i.e. that the absolute solution can be obtained without INS and GPS signals. By that, we extend the ability to solve the absolute position problem to platforms without geographical sensors, such as INS and GPS, or to cases where those signals are not available. Having said that, it is obvious that in case other geographical measurements exist, they can be used together with the new DTM constraints.

Our basic idea is to add a constraint on the distance between the 3D point $P_i$ and the DTM. Ideally, we would like all the 3D points to lie on the DTM. In order to define a constraint between $P_i$ and the DTM, the DTM surface is approximated near the predicted location of $P_i$ by a simple differentiable function $h_i(x, y) : \mathbb{R}^2 \to \mathbb{R}$. Section 4.3 describes the two possible approximations we examined.

In contrast to the measurements described so far, the surface approximation does not have an error model. On the other hand, the BA scheme is based on the assumption that the measurements have a Gaussian error model. Therefore, in order to add the constraint to the BA, we assume that the distance between the point and the surface is modeled by a Gaussian error:

$$d_i = \text{dist}(P_i, h_i(x, y)), \qquad d_i \sim \mathcal{N}(0, var_i)$$

The variance of the constraint, $var_i$, is set a priori by the scenario properties and is changed according to the accuracy of the DTM. See Section 4.6 for more details.

4.2 Coordinate System Conversion

As described in Section 2.2, a single coordinate system is used in our framework and all the other geographical information is converted to it. We use an LLLN coordinate system originated at the first measured camera location. To add the DTM to the BA cost function, we first need to convert the DTM to the main coordinate system. As mentioned in Chapter 3, the DTM is given in a UTM projection and is constructed as a 2D grid of points with height values. Due to the geographical distortions in the UTM projection, a grid of points defined in the UTM projection is not converted to a grid in a true Cartesian system, such as ECEF or LLLN. Since a grid of points is used in the DTM backward-projection function, a simple conversion of the DTM grid is not enough. Instead, we first define the 2D grid in the main coordinate system to which we want the DTM to be converted. Then, we transform the new grid to UTM coordinates, and sample the height values at those points by interpolating the DTM values. Since the images are usually taken over a relatively small area compared to the earth surface, the transformed DTM is still a valid mapping $\mathbb{R}^2 \to \mathbb{R}$.

4.3 Surface Approximation

One of the basic ideas in the approach presented here is to approximate the true surface around a given location by a simple function of the horizontal parametrization. For each 3D point $P_i$ defined in the BA parameters, an approximated surface is calculated and used to define a new constraint. Since the surface approximation is local, we first need to define the location on the surface where the surface approximation is needed. To do that before an estimate of $P_i$ even exists, we use backward-projection onto the DTM, as sketched below. Given a 2D observation $\tilde{p}^t_i$ of point $P_i$ and the camera pose approximation $T^t, \Psi^t$, the estimated point on the surface can be calculated using ray tracing [App68]. Let us denote the intersection point by $P_i^{DTM}$.
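A possible Python sketch of this backward-projection, assuming a local frame in which z is altitude and a regularly gridded DTM. The bilinear height lookup, ray-marching step size and bisection refinement are illustrative choices, not necessarily those of [App68] or of the thesis implementation.

```python
import numpy as np

def dtm_height(dtm, x0, y0, step, x, y):
    """Bilinear interpolation of the terrain height h(x, y) from a regular grid.
    dtm[i, j] is the height at (x0 + i*step, y0 + j*step)."""
    fi, fj = (x - x0) / step, (y - y0) / step
    i, j = int(fi), int(fj)
    di, dj = fi - i, fj - j
    return ((1 - di) * (1 - dj) * dtm[i, j] + di * (1 - dj) * dtm[i + 1, j] +
            (1 - di) * dj * dtm[i, j + 1] + di * dj * dtm[i + 1, j + 1])

def backproject_to_dtm(cam_pos, ray_dir, dtm, x0, y0, step, ds=5.0, max_range=20000.0):
    """March along the viewing ray until it drops below the terrain, then refine by bisection."""
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    prev = cam_pos
    for s in np.arange(ds, max_range, ds):
        p = cam_pos + s * ray_dir
        if p[2] < dtm_height(dtm, x0, y0, step, p[0], p[1]):   # ray went below the surface
            lo, hi = prev, p
            for _ in range(20):                                 # bisection refinement
                mid = 0.5 * (lo + hi)
                if mid[2] < dtm_height(dtm, x0, y0, step, mid[0], mid[1]):
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        prev = p
    return None  # no intersection within max_range
```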

To approximate the true surface around $P_i^{DTM}$, we consider the DTM grid as a sample of points from the true surface. The grid of points is then used to construct the surface approximation.

There are several considerations when choosing the size of the grid to use in the surface approximation. First, we need to decide the size of the area over which we want to approximate the surface. On the one hand, a large area should be chosen so that the topography of that area is captured; on the other hand, our surface approximation can model only simple geometric shapes, such as a single mountain or valley in the case of the second-order approximation. In our experiments we found that choosing a large area produced a bad surface approximation. We chose the size of the area a priori, according to the surface topography. Since the topography usually does not change rapidly inside the region of interest (ROI) of the camera, this can also be done automatically before the estimation begins, by examining the derivatives of the height values inside the ROI. Given that the size of the area was defined, we used all the grid points in that area, so that our approximation is less sensitive to errors in the grid samples. Another element that influences the quality of the surface approximation is the DTM resolution. For very sparse DTM samples, small changes in the surface are not modeled by the DTM and therefore not modeled by the surface approximation. This can be handled by increasing the constraint variance, see Section 4.6.

The surface approximation depends on the point $P_i^{DTM}$, which is determined by ray tracing from the camera pose estimate $T^t, \Psi^t$. Since the camera pose is very inaccurate at the beginning of the process, and consequently so are $P_i^{DTM}$ and the surface approximation, we re-calculate $P_i^{DTM}$ and the surface approximation $h_i(x, y)$ throughout the BA convergence. A plane approximation is described in Subsection 4.3.1 and an extension of it to a second-order function can be found in Subsection 4.3.2. In our experiments we found the second-order approximation to be more precise and to speed up the convergence.

4.3.1 Plane Approximation

We approximate the surface around the point $P_i^{DTM}$ by a plane $(\tilde{n}_i, \tilde{d}_i)$ such that points on the plane satisfy:

$$\tilde{n}_i^T p + \tilde{d}_i = 0 \tag{4.1}$$

The plane parameters are calculated by taking the 3x3 height values from the DTM surrounding the approximated point $P_i^{DTM}$, and solving the least squares problem:

$$Ax = b \tag{4.2}$$

where

$$A = \begin{bmatrix} X_1^{DTM} & Y_1^{DTM} & Z_1^{DTM} \\ X_2^{DTM} & Y_2^{DTM} & Z_2^{DTM} \\ \vdots & \vdots & \vdots \\ X_9^{DTM} & Y_9^{DTM} & Z_9^{DTM} \end{bmatrix}_{9\times3}, \qquad b = \mathbf{1}_{9\times1}$$

and $(X_i^{DTM}, Y_i^{DTM}, Z_i^{DTM})$ are the nine DTM points around $P_i^{DTM}$, which in our experiments reflects an area of 75x75 meters. By multiplying both sides of the equation by $A^T$, we get:

$$(A^T A)_{3\times3}\, x = (A^T b)_{3\times1} \tag{4.3}$$

This set of equations can be solved explicitly, since it has only 3 variables. The normalized plane parameters are then:

$$\tilde{n}_i = \frac{x}{\|x\|}, \qquad \tilde{d}_i = -\frac{1}{\|x\|}$$

For each 3D point $P_i$ and its plane approximation $(\tilde{n}_i, \tilde{d}_i)$ we add the constraint:

$$\tilde{n}_i^T P_i + \tilde{d}_i = 0 \tag{4.4}$$

Now, the BA is the problem of minimizing the following cost function:

$$\underset{P_i, \Psi^t, T^t}{\text{minimize}} \;\sum_{i,t} (\tilde{p}^t_i - p^t_i)^T (\Sigma^t_i)^{-1} (\tilde{p}^t_i - p^t_i) + \sum_i \frac{(\tilde{n}_i^T P_i + \tilde{d}_i)^2}{var_i} \tag{4.5}$$

4.3.2 Second order Approximation

We approximate the surface around the point $P_i^{DTM}$ by a second-order surface such that all the points $p = (x, y, z)$ on the surface satisfy:

$$z = \tilde{a}x^2 + \tilde{b}y^2 + \tilde{c}xy + \tilde{d}x + \tilde{e}y + \tilde{f} \tag{4.6}$$

where $(\tilde{a}, \tilde{b}, \tilde{c}, \tilde{d}, \tilde{e}, \tilde{f})$ are the surface parameters. The surface parameters are calculated by taking the 7x7 height values from the DTM around the approximated point $P_i^{DTM}$, which in our experiments reflects an area of 175x175 meters, and solving the least squares problem $Ax = b$, where

$$A = \begin{bmatrix} (X_1^{DTM})^2 & (Y_1^{DTM})^2 & X_1^{DTM} Y_1^{DTM} & X_1^{DTM} & Y_1^{DTM} & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ (X_{49}^{DTM})^2 & (Y_{49}^{DTM})^2 & X_{49}^{DTM} Y_{49}^{DTM} & X_{49}^{DTM} & Y_{49}^{DTM} & 1 \end{bmatrix}_{49\times6}$$

$$x = \begin{bmatrix} \tilde{a} \\ \tilde{b} \\ \tilde{c} \\ \tilde{d} \\ \tilde{e} \\ \tilde{f} \end{bmatrix}_{6\times1}, \qquad b = \begin{bmatrix} Z_1^{DTM} \\ Z_2^{DTM} \\ \vdots \\ Z_{49}^{DTM} \end{bmatrix}_{49\times1}$$

and $(X_i^{DTM}, Y_i^{DTM}, Z_i^{DTM})$ are the 49 DTM points around $P_i^{DTM}$. By multiplying both sides of the equation by $A^T$, we get:

$$(A^T A)_{6\times6}\, x = (A^T b)_{6\times1} \tag{4.7}$$

Now that the matrix $(A^T A)_{6\times6}$ is square and positive semi-definite, we can solve the set of equations using LU factorization. To simplify the derivatives of the new constraint, we used as a distance function the difference between the height value of the estimated point $P_i = (X_i, Y_i, Z_i)$ and the surface at the point $(X_i, Y_i)$:

$$Z_i - (\tilde{a}X_i^2 + \tilde{b}Y_i^2 + \tilde{c}X_i Y_i + \tilde{d}X_i + \tilde{e}Y_i + \tilde{f}) = 0 \tag{4.8}$$

In our experiments we found that this distance function was a good approximation of the real distance between a 3D point and the surface, while significantly simplifying the derivatives of the constraint. For each 3D point $P_i$ estimated in the Bundle Adjustment and its surface approximation $(\tilde{a}, \tilde{b}, \tilde{c}, \tilde{d}, \tilde{e}, \tilde{f})$ we add the above constraint. The combined cost function is now:

$$\underset{P_i, \Psi^t, T^t}{\text{minimize}} \;\sum_{i,t} (\tilde{p}^t_i - p^t_i)^T (\Sigma^t_i)^{-1} (\tilde{p}^t_i - p^t_i) + \sum_i \frac{\big(Z_i - (\tilde{a}X_i^2 + \tilde{b}Y_i^2 + \tilde{c}X_i Y_i + \tilde{d}X_i + \tilde{e}Y_i + \tilde{f})\big)^2}{var_i} \tag{4.9}$$
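The second-order fit and the height residual can be sketched in Python as follows; this uses a direct normal-equations solve rather than the LU factorization mentioned above, and the function names are illustrative.

```python
import numpy as np

def fit_quadric_surface(pts):
    """Least-squares fit of z = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f, Eqs. (4.6)-(4.7).
    pts: (N, 3) array of DTM points around P_i^DTM (N = 49 for a 7x7 window)."""
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    A = np.column_stack([x**2, y**2, x * y, x, y, np.ones_like(x)])
    coeffs = np.linalg.solve(A.T @ A, A.T @ z)   # 6x6 normal equations
    return coeffs                                # (a, b, c, d, e, f)

def dtm_residual(coeffs, P):
    """Height residual of Eq. (4.8) for an estimated 3D point P = (X, Y, Z)."""
    a, b, c, d, e, f = coeffs
    X, Y, Z = P
    return Z - (a * X**2 + b * Y**2 + c * X * Y + d * X + e * Y + f)
```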

4.4 Bundle Adjustment Framework

In this section, the framework of the proposed method is described. As mentioned before, the input to our method contains a sequence of images, a DTM and an initial guess for the first image. Before we solve the optimization problem, we first need to extract 2D observations, i.e. tracks, and set an initial value for all the parameters: camera locations and orientations and 3D locations of the features.

4.4.1 Image Dilution

When images are taken from adjacent locations, as in a video stream, using all the images significantly increases the number of BA parameters, and specifically the number of image pose parameters, which, as described in Section 2.5, are critical to the time performance. In those cases, some dilution of the images to be estimated is needed. The dilution should be such that the tracks are long enough; as a rule of thumb, a 65% overlap between the images is sufficient. A simple method for dilution is marking an image for estimation every X frames. This method assumes that the camera is moving at a constant speed. A more precise method is choosing an image that has X degrees of baseline, e.g. 3 degrees, relative to the last estimated image. In both cases, all the images can be used to help matching tracks between the estimated images. In our experiments, the images were sparse enough and no dilution was needed.

4.4.2 Feature Extraction

There are various methods to extract 2D features; we used a tracking method that is based on the Scale Invariant Feature Transform (SIFT). In the first stage, we extract a predefined number of features, e.g. 200, from the first image and try to match them to 2D features that were extracted in the second image. Assuming that some features are not successfully matched, new features are extracted in the second image so that the total number of features does not change. In the third image, the existing features are first matched to extend the existing tracks, and new features are then extracted to compensate for the failed matches, and so on. Features that are not matched, i.e. have a track of length one, are rejected, as there is no baseline for them and their 3D location cannot be estimated.

4.4.3 Outlier Detection

Some of the tracks contain wrong matches, i.e. 2D observations of different 3D features. Since the BA is eventually an SSE algorithm, such outliers may drift the algorithm away from the desired estimate. Therefore, we apply several outlier detections after the features are extracted. We used outlier detection based on RANdom SAmple Consensus (RANSAC) [FB81] with Fundamental and Affine models on the 2D observations. In addition, we also used outlier detection based on the Root Mean Square (RMS) error of the 2D constraints of the Bundle Adjustment: after each iteration of the Bundle Adjustment, we rejected 2D observations whose error was more than 3σ above the average RMS.

4.4.4 Parameters Initial Guess

Before starting to solve the Bundle Adjustment, we first need to set initial values for the parameters $P_i, T^t, \Psi^t$. The initial values for $T^1, \Psi^1$ are set to the initial guess given

as an input. The initial guesses for the 3D locations of the tracks seen by the first image are calculated by ray tracing the DTM using $T^1, \Psi^1$ and the 2D observations of those features on the first image. Then, the location and orientation of the second image, $T^2, \Psi^2$, are calculated using features whose 3D location $P_i$ was calculated in the previous stage, i.e. features that are common to the first and second images. This can be done using a Bundle Adjustment in which the only parameters are the second image pose parameters, $T^2, \Psi^2$, the measurements are the 2D observations on that image, and the feature locations are fixed. This BA has only 6 parameters and is solved very quickly. The previous steps are repeated for all the images and features.

4.4.5 Solving the Bundle Adjustment

As described in Section 4.3, the surface approximation depends on the estimated point on the DTM, $P_i^{DTM}$, which itself depends on the camera estimate. Since at the beginning of the process the camera estimate is very poor, and so is $P_i^{DTM}$, the surface approximation and obviously the solution of the BA will be inaccurate as well. Therefore, we repeat the surface approximation several times as the camera pose becomes more accurate. After each time the Bundle Adjustment converges, we use the new camera poses to recalculate $P_i^{DTM}$ and the surface approximation, and repeat the Bundle Adjustment estimation. Since at the beginning the surface approximation is not good, there is no point in letting the BA fully converge, since we only want its coarse step; it can converge in a later iteration anyway. Therefore, we start with a small number of BA iterations, and as the surface approximation gets better, we increase the number of BA iterations. In our experiments, we started with 20 BA iterations, and increased them by 10 iterations for every surface approximation. We found that on average, after 7 surface approximations, the whole solution converges.
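The outer loop just described might look as follows; `ba` and `dtm` are hypothetical objects standing in for the tracker, optimizer and terrain model, so every method called on them is an assumed, illustrative API rather than the thesis implementation.

```python
def solve_with_dtm_constraints(ba, dtm, num_outer=7, iters0=20, iters_step=10):
    """Outer loop of Section 4.4.5: alternate surface re-approximation with partial BA runs."""
    max_iters = iters0
    for _ in range(num_outer):
        # Re-approximate the local surface h_i(x, y) around each point, using the current poses.
        for track in ba.tracks:
            p_dtm = dtm.backproject(ba.pose_of(track.first_view), track.first_observation)
            track.surface = dtm.fit_quadric_around(p_dtm)   # Eqs. (4.6)-(4.7)
        # Run a limited number of LM iterations; full convergence is only needed in later rounds.
        ba.optimize(max_iterations=max_iters)
        max_iters += iters_step
    return ba.solution()
```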

4.5 Influence on the Bundle Adjustment Structure and Computation Time

4.5.1 Bundle Adjustment Structure

First, let us evaluate the number of constraints added. For every 3D point we add a single constraint. On the other hand, for every 2D observation of that feature there are 2 constraints, one per image axis. For example, if a feature is seen by 10 images, it has 20 constraints from 2D observations, and therefore, in terms of matrix sizes, our new constraint is quite negligible. As mentioned before, no new parameters are added. Now let us examine the influence of our constraint on the sparseness of the Bundle Adjustment by examining the change in Equation 2.12, where we denote by $\text{constraint}^{DTM}_i$ the new constraint for $P_i$:

- Jacobian $J$ - We add a single row for every $\text{constraint}^{DTM}_i$. The number of columns does not change, since no new parameters were added. As for the sparseness, since the new constraint depends only on the 3D location of its feature, i.e. only $\frac{d\,\text{constraint}^{DTM}_i}{dP_i}$ is nonzero, the new row is very sparse.
- Weight matrix $W$ - For every new constraint a single row and column are added to the weight matrix. This is negligible in terms of the matrix size, for the same reason given in the previous item. Since the new constraint is independent of the other constraints, the only value filled in the new row and column is the value on the diagonal of the weight matrix, representing the variance of the constraint.
- Hessian $J^T W J$ - Since the number of columns of $J$ is unchanged, the size of the Hessian matrix is unchanged as well. As for the sparseness, the constraint depends only on its feature location, so the only second derivative that is not zero is $\frac{d^2\,\text{constraint}^{DTM}_i}{dP_i^2}$; equivalently, the only non-zero inner products involving the new row occur between the columns of $P_i$ and themselves. Therefore the new constraints only contribute values to the diagonal blocks of the Hessian (the $N_1$ part), which are already filled by other constraints.

To conclude, our new constraints do not influence the structure of the Bundle Adjustment.

4.5.2 Computation Time

There are three stages in our proposed method that influence the computational time. The first is the computation of the approximated point $P_i^{DTM}$ by ray tracing from an estimated camera. This stage is relatively time consuming, but since it is done only once for every feature and prior to the Bundle Adjustment estimation, it is negligible relative to the estimation of the Bundle Adjustment. This stage can also be accelerated using GPU computation. The second stage is the surface approximation given $P_i^{DTM}$. Each surface approximation takes in the worst case $O(mn^2)$, where $m$ is the number of DTM points used and $n$ is the number of parameters in the surface approximation. As mentioned in Chapter 2, the bottleneck of computing a single step in the BA is inverting a matrix of size $6k$, where $k$ is the number of cameras. So, for a scenario with 100 images and 1000 features, solving a single step in the BA optimization method will take at least 200M cycles, whereas the approximation of 1000 features will take about 1M cycles. Therefore, the additional computational time added by the approximation stage is quite negligible. The third stage where we might add computational time is in the numeric step calculation, due to the new constraints. There are two main stages in calculating the numeric step: calculating $J^T W J$ and inverting it. Since the new constraints do not change the size or the sparseness of the Hessian, they can only influence the first stage - calculating $J^T W J$. To evaluate that, let us look again at the example above, where

4.5.2 Computation Time

There are three stages in our proposed method that influence the computation time. The first is the computation of the approximated point $P_i^{DTM}$ by ray tracing from an estimated camera. This stage is relatively time consuming, but since it is performed only once per feature and prior to the Bundle Adjustment estimation, it is negligible relative to the Bundle Adjustment itself. It can also be accelerated using GPU computation.

The second stage is the surface approximation given $P_i^{DTM}$. Each surface approximation takes in the worst case $O(mn^2)$, where $m$ is the number of DTM points used and $n$ is the number of parameters in the surface approximation. As mentioned in Chapter 2, the bottleneck of computing a single step of the BA is inverting a matrix of size $6k$, where $k$ is the number of cameras. So, for a scenario with 100 images and 1000 features, solving a single step of the BA optimization takes at least 200M cycles, whereas approximating the surfaces of 1000 features takes about 1M cycles. Therefore, the additional computation time added by the approximation stage is negligible.

The third stage where we might add computation time is the numeric step calculation due to the new constraints. There are two main parts in calculating the numeric step: computing $J^T W J$ and inverting it. Since the new constraints change neither the size nor the sparseness of the Hessian, they can only influence the first part - computing $J^T W J$. To evaluate that, let us look again at the example above, where there are 100 images and 1000 features, and let us assume that every feature is seen by 10 images. There are therefore 1000 x 10 x 2 = 20k constraints from 2D observations and 600 constraints on GPS and INS measurements (6 per image - 3 for location and 3 for angles). Our proposed method adds a single constraint per feature, 1000 altogether. Even if we do not remove the GPS and INS constraints, we increase the number of constraints by less than 10%. Given that the main bottleneck is inverting the Hessian, our constraints add very little to the computation time, if anything. In our experiments, we did not see any increase in the computation time.

4.6 Handling Errors

This section describes the different methods we used to handle DTM inaccuracy and outliers. Our surface approximation might not be an accurate representation of the surface for several reasons:

- The DTM represents only the topographic surface, and in many scenarios, such as wooded areas, this does not represent the surface correctly.
- When the DTM has low resolution, subtle changes between two sampled points are not represented by the DTM.
- Inaccuracy of the height values of the DTM.
- The surface approximation is not good enough, due to the topography or due to the limitations of our approximation.

To handle the surface approximation inaccuracy, we change the covariance of the distance constraint: the more inaccurate the surface approximation, the higher the covariance, and vice versa. For example, in a desert area, where the DTM should be accurate and the surface approximation should suffice, the covariance will be low. In a wooded area, on the other hand, the covariance will be high, as the DTM does not model the trees. The value of the covariance can be decided offline, before the estimation begins, manually or automatically, by examining the region of interest (ROI). The covariance can also be updated during the estimation, as sketched below. For example, if after convergence most of the 3D points do not lie close to the surface (and the error cannot be explained by a constant drift, in which case the BA is expected to solve it), the covariance can be increased in the next iteration. On the other hand, if the distances between the surface and the points are small, the covariance can be reduced in the next iteration.
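One possible form of such an update rule is sketched below. This is an assumption made for illustration only; the thesis does not prescribe a specific formula. The idea is to rescale the constraint variance from the point-to-surface distances measured after convergence, while ignoring a constant offset that the BA itself can absorb.

    import numpy as np

    def update_dtm_variance(distances, current_var, min_var=0.25, max_var=100.0):
        """Hypothetical update of the DTM-constraint variance (squared meters)
        from the point-to-surface distances measured after the BA converged."""
        d = np.asarray(distances, dtype=float)
        # Ignore a constant offset (a uniform height bias can be absorbed by the
        # BA) and look only at the scatter of the residuals around their mean.
        spread = np.std(d - np.mean(d))
        # Blend the previous variance with the observed scatter and keep it
        # inside sensible bounds.
        new_var = 0.5 * current_var + 0.5 * spread ** 2
        return float(np.clip(new_var, min_var, max_var))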

Another type of error is outliers. An outlier is a 3D point whose distance from the approximated surface is significantly larger than that of the other 3D points. Outliers can occur for several reasons:

- 3D points that lie on edges in the scene. For example, consider a point on a cliff edge. A small error in the camera location will generate, at the ray tracing stage, an approximated DTM point far away from the correct point, for example in the valley beneath the cliff. Since the 2D observations force the point to be at the height of the cliff while the DTM constraint pulls the point towards the valley, the error on the constraint will be very large.
- Urban areas, where there are large buildings that are not modeled by the DTM and therefore produce large errors. In addition, these errors might not be decorrelated. For some urban areas there are data sources, such as Digital Surface Models (DSM), that map the urban surface; these models can be used to better approximate the surface in urban areas (see Chapter 7).

Since the error model we use for the new constraints cannot handle this kind of errors, we perform an outlier detection after each convergence of the Bundle Adjustment, as sketched below: we sort the 3D points by their error values and remove the DTM constraints of the top x% (we used 3% in our experiments). The other constraints on these points, such as the 2D observations, are not removed.

Another type of error is a bias in the DTM height values. This kind of error cannot be corrected by our new constraints, as they are insensitive to a bias common to all points. We further assume that there is no local drift in the DTM values, i.e. no areas where the DTM height values suddenly drift.
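A minimal sketch of this rejection step, assuming the per-point distances to the approximated surface are already collected in an array (illustrative code, not the thesis implementation):

    import numpy as np

    def keep_after_dtm_outlier_rejection(residuals, fraction=0.03):
        """Indices of the points whose DTM constraint is kept after removing the
        `fraction` of points with the largest point-to-surface residuals.
        Only the DTM constraints are dropped; the 2D observations remain."""
        residuals = np.asarray(residuals, dtype=float)
        n_remove = int(np.ceil(fraction * residuals.size))
        if n_remove == 0:
            return np.arange(residuals.size)
        order = np.argsort(residuals)        # ascending; largest residuals last
        return np.sort(order[:-n_remove])    # constraints that survive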

Chapter 5

Iterative Closest Point

5.1 Introduction

To compare against our proposed method, we used the Iterative Closest Point (ICP) algorithm, which is a common method for calculating the transformation between a DTM and a reconstructed scene. The Iterative Closest Point algorithm was presented independently by Besl and McKay [BM92] and by Chen and Medioni [CM92]. An overview of different extensions and an experimental evaluation can be found in [SMFF07]. All the different methods receive as input two sets of 3D points, not necessarily of the same size, and return the transformation between them, represented by a rotation matrix $R \in \mathbb{R}^{3\times 3}$ and a translation vector $t \in \mathbb{R}^{3\times 1}$. Some versions require additional information, such as Chen's method, which requires the points' normals. Chen and Besl assumed that the sets have the same scale. Zinber et al. [ZSN05] presented an ICP variant that calculates the scale in addition to the transformation between the sets. We found Zinber's method to be important since in some cases the reconstructed scene has scale errors, mainly due to velocity errors in the INS. The next section describes the ICP algorithm in some detail.

5.2 Iterative Closest Point

The basic structure of the algorithm repeats the following two steps until convergence: matching points and calculating the transformation. In the first step, every point in the first (source) set is matched to a point in the second (target) set by minimizing the Euclidean distance; outlier detection is used to reject badly matched pairs. In the second step, the best motion that aligns the matched points is calculated and applied to the source set. The algorithm stops when the registration change is below a specified threshold.

The simplest strategy for the matching stage is to find, for each point in the source set, the closest point in the target set by minimizing the Euclidean distance. Other methods select a mixture of source and target points by sampling the sets.

Since this operation of finding the closest point is time-consuming, an optimized data structure is used; most methods use an optimized KD-tree. At the end of this step, a set of matched pairs is generated. Denoting the source set by $A = \{a_1, \dots, a_n\}$ and the target set by $B = \{b_1, \dots, b_m\}$, the generated pairs set is:

$$S = \{(a_i, b_j) \mid a_i \in A,\ b_j \in B\}$$

Pairs whose distance is large compared to the others are rejected.

Besl and Chen differ in the transformation calculation. While Besl uses only the points, Chen's method requires the points' normals. Zinber extended Besl's method by calculating the scale as well. Here is a brief description of the methods.

Besl uses the sum of squared distances of the corresponding point pairs as the error measure, and therefore minimizes the following cost function:

$$\arg\min_{R,t} \sum_{(i,j)\in S} \| b_j - R a_i - t \|^2 \qquad (5.1)$$

A comparison of different methods for solving this problem can be found in [ELF97]. We chose the SVD method since it provides stability, a high level of accuracy and speed. First, the centers of mass of the two sets are found:

$$\bar{a} = \frac{1}{|S|} \sum_{(i,j)\in S} a_i, \qquad \bar{b} = \frac{1}{|S|} \sum_{(i,j)\in S} b_j \qquad (5.2)$$

Centering the two sets yields the following minimization problem:

$$R = \arg\min_{R} \sum_{(i,j)\in S} \left\| (b_j - \bar{b}) - R(a_i - \bar{a}) \right\|^2 \qquad (5.3)$$

and the problem is solved by computing the SVD:

$$U D V^T = \sum_{(i,j)\in S} (b_j - \bar{b})(a_i - \bar{a})^T \qquad (5.4)$$

The solution is set to be:

$$R = U V^T, \qquad t = \bar{b} - R\bar{a} \qquad (5.5)$$

In case $R$ is not a proper rotation matrix, i.e. it represents a reflection, its third column is multiplied by $-1$.
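A compact implementation of this SVD-based alignment step (Equations 5.2-5.5) is shown below for matched pairs arranged as two equally sized arrays. It is a minimal illustration of the update performed inside each ICP iteration, not the exact code used in this work; the reflection fix shown is one common way of enforcing a proper rotation.

    import numpy as np

    def rigid_alignment(src, dst):
        """Least-squares R, t mapping the source points onto the target points,
        for matched pairs given as (n, 3) arrays (Equations 5.1-5.5)."""
        src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
        src_c, dst_c = src - src_mean, dst - dst_mean
        H = dst_c.T @ src_c                   # cross-covariance of Eq. (5.4)
        U, D, Vt = np.linalg.svd(H)
        R = U @ Vt
        if np.linalg.det(R) < 0:              # reflection: enforce det(R) = +1
            U[:, -1] *= -1
            R = U @ Vt
        t = dst_mean - R @ src_mean           # Eq. (5.5)
        return R, t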

Zinber estimated the scale, $s$, in addition to $R, t$. The cost function is set to be:

$$\arg\min_{R,t,s} \sum_{(i,j)\in S} \| b_j - sRa_i - t \|^2 \qquad (5.6)$$

and equation (5.3) becomes:

$$R, s = \arg\min_{R,s} \sum_{(i,j)\in S} \left\| (b_j - \bar{b}) - sR(a_i - \bar{a}) \right\|^2 \qquad (5.7)$$

Adding the scale only changes the matrix $D$ in the SVD and therefore does not change the calculation of $R$. $s$ can then be calculated directly from:

$$s = \arg\min_{s} \sum_{(i,j)\in S} \left\| (b_j - \bar{b}) - sR(a_i - \bar{a}) \right\|^2 \qquad (5.8)$$

and is set to be:

$$s = \frac{\sum_{(i,j)\in S} (b_j - \bar{b})^T R (a_i - \bar{a})}{\sum_{(i,j)\in S} (a_i - \bar{a})^T (a_i - \bar{a})} \qquad (5.9)$$

The translation vector is updated by $t = \bar{b} - sR\bar{a}$.
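Continuing the previous sketch, the scale of Equation 5.9 can be computed from the same centered sets once the rotation is known (again an illustration only, reusing rigid_alignment and numpy from the block above):

    def similarity_alignment(src, dst):
        """R, t and scale s for matched (n, 3) point pairs, Eqs. (5.6)-(5.9)."""
        R, _ = rigid_alignment(src, dst)          # rotation is unaffected by the scale
        src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
        src_c, dst_c = src - src_mean, dst - dst_mean
        rotated = src_c @ R.T                     # R (a_i - a_mean) for every pair
        s = np.sum(dst_c * rotated) / np.sum(src_c * src_c)   # Eq. (5.9)
        t = dst_mean - s * (R @ src_mean)         # updated translation
        return R, t, s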

Chen's method differs in the cost function. Denoting $\tilde{a}_i = [a_i\ 1]^T$, $\tilde{b}_j = [b_j\ 1]^T$ and $n_j$ as the normal of point $b_j$, Chen's method minimizes the following cost function:

$$\arg\min_{M} \sum_{(i,j)\in S} \left( (M\tilde{a}_i - b_j) \cdot n_j \right)^2 \qquad (5.10)$$

where $M = [R\ \ t]_{3\times 4}$ is the transformation between the source set and the target set. By approximating $R$ for small angles, $M$ can be written as

$$M = \begin{bmatrix} 1 & -\gamma & \beta & t_x \\ \gamma & 1 & -\alpha & t_y \\ -\beta & \alpha & 1 & t_z \end{bmatrix}$$

The set of equations in 5.10 is now linear in $\alpha, \beta, \gamma, t$ and can easily be solved. Similarly to Zinber, we added the scale parameter, $s$, to the cost function:

$$\arg\min_{M} \sum_{(i,j)\in S} \left( (M\tilde{a}_i - b_j) \cdot n_j \right)^2, \qquad \text{where } M = [sR\ \ t]_{3\times 4} = \begin{bmatrix} s & -s\gamma & s\beta & t_x \\ s\gamma & s & -s\alpha & t_y \\ -s\beta & s\alpha & s & t_z \end{bmatrix}$$

Denoting $\gamma' = s\gamma$, $\beta' = s\beta$, $\alpha' = s\alpha$, $M$ can be rewritten as:

$$M = \begin{bmatrix} s & -\gamma' & \beta' & t_x \\ \gamma' & s & -\alpha' & t_y \\ -\beta' & \alpha' & s & t_z \end{bmatrix}$$

After calculating $M$, the angles can be extracted by normalizing $\alpha', \beta', \gamma'$ by $s$.
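Under the small-angle approximation, one step of this point-to-plane alignment reduces to an ordinary linear least-squares problem. The sketch below solves the unscaled form of Equation 5.10 for matched points and target normals; it illustrates the linearization and is not necessarily the implementation used here.

    import numpy as np

    def point_to_plane_step(src, dst, normals):
        """One linearized point-to-plane update: returns (alpha, beta, gamma, t)
        minimizing Eq. (5.10) for M = [R t] with the small-angle rotation."""
        # With R ~ I + [omega]x and omega = (alpha, beta, gamma), each pair gives
        # one linear equation:  (a x n) . omega + n . t = (b - a) . n
        A = np.hstack([np.cross(src, normals), normals])     # (n, 6) design matrix
        rhs = np.einsum('ij,ij->i', dst - src, normals)      # (n,) right-hand side
        x, *_ = np.linalg.lstsq(A, rhs, rcond=None)
        alpha, beta, gamma = x[:3]
        return alpha, beta, gamma, x[3:]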

Chapter 6

Experimental Results

6.1 Introduction

We evaluated our work in two types of experiments: synthetic and real images. The first was conducted to evaluate our work in a controlled environment, where different types of errors were added with known error models and magnitudes. We also compared it with a regular Bundle Adjustment followed by a transformation to the DTM using the Iterative Closest Point (ICP) algorithm. In the second experiment, we wanted to evaluate our proposed method on a real sequence of images. While we wanted to use images taken by an airplane, we could not find images with ground truth data as described in Section 6.3. Instead, we used a small scale model and images taken from a robotic arm with known pose and orientation. In Section 6.2 the synthetic experiment and its results are detailed, and in Section 6.3 the results of the image sequence experiment are described.

6.2 Synthetic Experiments

In this section, we evaluated our proposed method on synthetic data and compared it to a Bundle Adjustment with geographical measurements of the camera pose and orientation, as described in Chapter 2, followed by a transformation to the DTM calculated by ICP. We also evaluated a third method, in which we checked whether the remaining error in our proposed method could be further reduced using ICP as a follow-up step.

The input to our proposed method contained 2D observations, i.e. tracks, and an initial guess for the first image pose and orientation. The output is the final camera poses together with the 3D locations of the tracked features. While the output of the BA with GPS and INS is the same as in our proposed method, the input is different: in addition to the 2D observations, it also contained pose measurements for every image.

The input to the ICP algorithm, in both cases it was used, contained two sets of points: the 3D locations of the tracked features calculated by the BA, and the DTM grid points. Each 3D point in the DTM point set represents a point in the DTM grid together with its height value. The output of the ICP algorithm is the transformation between the BA and the DTM. The final output of the methods that use ICP is the BA's output transformed by the ICP's transformation.

The generation of the input data is described in Figure 6.1 and contained the following steps:

1. Generating the true trajectory of the camera over the given DTM.
2. Sampling the DTM. Each sample represents the true 3D location of a feature.
3. Adding Gaussian noise to the DTM samples. In real scenarios, the DTM does not represent the true surface and contains errors. Some of the errors are due to DTM inaccuracy and some are due to objects in the scene that do not lie on the DTM surface; see Section 4.6 for more details. These errors were modeled by adding Gaussian noise to the points sampled from the DTM.
4. Projecting the sampled points onto the cameras using the true trajectory of the camera to generate the 2D observations. These are the accurate projections of the features.
5. Adding Gaussian noise to the 2D observations to model the errors of tracking algorithms such as the Scale Invariant Feature Transform.
6. Adding Gaussian noise to the camera pose, location and orientation.
7. Adding drift errors to the camera pose, location and orientation. This models errors in the Inertial Navigation System.
8. Running all methods - BA followed by ICP, the proposed method, and the proposed method followed by ICP - on the noisy data.
9. Evaluating the performance by measuring the error between the estimated data (camera pose and 3D locations of the features) and the true values.

In all the experiments listed below, the specified errors were in addition to a basic set of errors:

Error Type              Value
Camera location STD     10 m
Camera angles STD       1 deg
Camera location drift   20 m
Point noise STD         1 m
Pixel STD               0.5 pixel
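Steps 3-7 amount to adding Gaussian perturbations and an accumulating drift, with the values in the table used as standard deviations. The snippet below is a schematic illustration with assumed array shapes and units (meters, degrees, pixels); it is not the actual experiment code.

    import numpy as np

    rng = np.random.default_rng(0)

    def perturb_inputs(dtm_samples, observations, cam_locations, cam_angles,
                       point_std=1.0, pixel_std=0.5, loc_std=10.0, ang_std=1.0,
                       drift_total=20.0):
        """Add Gaussian noise and a linear location drift to the synthetic inputs.
        Assumed shapes: dtm_samples (p, 3), observations (k, p, 2),
        cam_locations (k, 3), cam_angles (k, 3)."""
        noisy_points = dtm_samples + rng.normal(0.0, point_std, dtm_samples.shape)
        noisy_obs = observations + rng.normal(0.0, pixel_std, observations.shape)
        noisy_locs = cam_locations + rng.normal(0.0, loc_std, cam_locations.shape)
        noisy_angs = cam_angles + rng.normal(0.0, ang_std, cam_angles.shape)
        # Location drift growing linearly along the trajectory, mimicking an
        # INS velocity error that accumulates from image to image.
        k = cam_locations.shape[0]
        drift_dir = rng.normal(size=3)
        drift_dir /= np.linalg.norm(drift_dir)
        noisy_locs += np.outer(np.linspace(0.0, drift_total, k), drift_dir)
        return noisy_points, noisy_obs, noisy_locs, noisy_angs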

Figure 6.1: The input generation in the synthetic experiments.

Figure 6.2: The synthetic scenario. The DTM used in the synthetic experiment and the camera locations, marked with blue pluses.

We used a DTM from the Shuttle Radar Topography Mission (SRTM) [12] project in an area near Haifa, Israel. The camera was located 3000 meters above the ground, with 10 images 40 meters apart, see Figure 6.2. We increased the grid density from 90 meters to 25 meters so that it represents an average DTM. We ran Monte Carlo simulations for increasing velocity and angular drift errors. For every error configuration, 1000 samples of noisy input were generated and the output of all the algorithms was recalculated and evaluated. Figures 6.3 and 6.4 show the average camera location error of the three methods for the different axes and noise levels. Figure 6.5 shows the average norm error of the camera location. Some observations from the figures:

- The method of BA with GPS and INS followed by ICP did not converge for some errors. This can be seen in the large STD of the X axis error of that method in Figure 6.3. In contrast, the BA based on the DTM converged for all the errors.
- The error of our proposed method did not increase as the noise increased, in contrast to BA with GPS and INS followed by ICP.
- The best method is our proposed method followed by ICP, the second is our proposed method, and BA with GPS and INS followed by ICP is third. In addition, the improvement between the first and second methods is less dramatic than the improvement between the second and third methods.

The first two points can be explained by the fact that our method uses only the initial guess of the first image, and is therefore built to overcome large initial errors. On the other hand, the structure of the scene reconstructed by the regular BA is sensitive to the camera measurement errors, especially drift errors, and the ICP algorithm cannot fix errors in the scene structure. The improvement the ICP algorithm added to our proposed method can be explained by the inaccuracy of our surface approximation and the approximation we made to the distance function. This should be further examined as part of the future work.

6.3 Experiments on Small Scale Model

In addition to the synthetic experiments, we also wanted to test our proposed method on real images captured in a scenario as close as possible to common photogrammetric scenarios. One of our most important considerations in choosing the experiment framework was our ability to evaluate the performance of our proposed method, similarly to what we did in the synthetic experiment. We wanted to make sure we had true measurements of the pose of the camera at the time the images were taken. Therefore, we decided to perform the experiments using a small scale model of a surface and a camera assembled on a robotic arm able to record its pose in space.

Figure 6.3: Camera's Location Error vs. Angular Drift. The camera's location error in each axis. Three types of BA are compared: BA with GPS together with ICP (blue), BA with DTM (green) and BA with DTM and ICP (red). As can be seen in the second graph, BA with GPS (blue) did not always converge (hence the large STD values), which can indicate that the method is sensitive to noise in the initial starting point. Therefore, the median error (rather than the average) was calculated for that method, and in addition the STD graphs of that method were removed from the other graphs.

Figure 6.4: Camera's Location Error vs. Velocity Error. The camera's location error in each axis. Three types of BA are compared: BA with GPS together with ICP (blue), BA with DTM (green) and BA with DTM and ICP (red). As can be seen in the second graph, BA with GPS (blue) did not always converge (hence the large STD values), which can indicate that the method is sensitive to noise in the initial starting point. Therefore, the median error (rather than the average) was calculated for that method, and in addition the STD graphs of that method were removed from the other graphs.

Figure 6.5: Camera's Location Error Norm. The norm of the camera's location error for velocity and angular drift errors. Three types of BA are compared: BA with GPS together with ICP (blue), BA with DTM (green) and BA with DTM and ICP (red). One can see that for velocity errors, BA+GPS+ICP drifts in an unbounded manner while the two methods based on BA+DTM do not. This is due to the fact that the BA+DTM methods use only the location of the first image and are therefore less sensitive to velocity drift.

In Figure 6.6, we compare the slopes of the model with the slopes of a real scenario to show that the surface is realistic. By simultaneously capturing still images of the surface and the robotic arm orientation, we were able to record the true measurements of the camera orientation needed for the algorithm evaluation.

Figure 6.6: Slopes Comparison. To verify that the model is realistic, we compared the model's slope (blue) with the slope of an area in southern Israel (red). It can be seen that they have the same magnitude. The slopes were calculated from the directional derivatives of the model and of the SRTM of southern Israel.

6.3.1 Experiment Outline

Here is a short description of the different components used in the experiments:

Camera - We used a camera with a field of view of X degrees and an image size of 1280X1204.

Surface model - We used a 53x79x11 cm sandbox. The DTM of the model was generated using depth scans from a Kinect device. The scale of the model is one centimetre to 25 meters, i.e. 1:2500. It was calculated by comparing the average width of a mountain in the model (20 cm) with real mountains (500 m). The accuracy of the DTM height values is around 0.5 cm, which is equivalent to 12.5 m. To make the model a little more realistic, we added 3D objects (a car, a bridge, vegetation) to the model after it was scanned, so they are not modeled in the DTM.

Robotic arm - The camera was assembled on a robotic arm that was able to capture its 6 degrees of freedom with the following accuracy: 1 cm (10 m after scaling) in location and 0.5 degree in angle.

6.3.2 Evaluation methods

To evaluate the performance of our proposed method in the experiments, we used several comparison and evaluation methods:

- Comparing the estimated camera pose, location and angles, with the ground truth. The advantage of this method is that it directly compares the pose of the camera, as opposed to the other methods, which check the alignment of the Bundle Adjustment points to the surface. A good alignment of the Bundle Adjustment points does not necessarily indicate accurate camera estimates.
- Reconstructing the height values of the surface and comparing them to the DTM (a small sketch of this comparison is given below). The height values were reconstructed from a dense 3D point set that was generated by triangulating dense 2D features extracted from the images. The triangulation was done using the camera orientations estimated by the Bundle Adjustment.
- Re-projecting the images onto the surface. By coloring the surface with the original images using the estimated camera orientations, we were also able to verify the accuracy of the solution with the naked eye.

6.3.3 Experiment Flow

The initial guess used in the experiments was the orientation of the first image only, estimated manually by roughly guessing the distance and pitch angle of the first image. The inputs to the Bundle Adjustment are therefore: the images, the initial guess of the first image and the Digital Terrain Model. The output of the Bundle Adjustment was then evaluated using the methods described above.

6.3.4 Results

We ran two experiments that represent different flight patterns. The first experiment illustrated a photogrammetric flight, where the camera passed over the model in a straight line, see Figure 6.7. The second experiment illustrated a camera approaching the surface, similar to a landing flight or a missile approaching its target, see Figure 6.8. Both patterns started from a distance of 1 m from the surface (equivalent to 2.5 km). Figures 6.7 and 6.8 show several of the images used in the experiments, and Figure 6.9 shows the estimated and measured paths relative to the DTM. Figures 6.10 and 6.11 show the location and angle errors of the estimated cameras. Figure 6.12 shows the error between the reconstructed height values and the DTM. Some observations from the experiments:

- The errors in the experiment are of the magnitude of the robotic arm accuracy, below 1 cm in location and 1 degree in angles, which indicates that our method converged.

Figure 6.7: Images from the first experiment.

Figure 6.8: Images from the second experiment.

Figure 6.9: The estimated and measured camera path. The red and blue crosses represent the measured and estimated camera locations, respectively. The blue dots represent the locations of the Bundle Adjustment 3D points. Note that the Bundle Adjustment points lie on the DTM.
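A minimal sketch of the height-value comparison (the second evaluation method in Section 6.3.2), assuming the dense reconstructed points are available as an (n, 3) array and the DTM as a regular grid; scipy's grid interpolator looks up the DTM height under each reconstructed point:

    import numpy as np
    from scipy.interpolate import RegularGridInterpolator

    def height_errors(points, grid_x, grid_y, grid_z):
        """Difference between the reconstructed point heights and the DTM height
        interpolated at the same planimetric position.
        points: (n, 3) reconstructed [x, y, z]; grid_z: (len(grid_x), len(grid_y))."""
        dtm = RegularGridInterpolator((grid_x, grid_y), grid_z,
                                      bounds_error=False, fill_value=np.nan)
        dtm_heights = dtm(points[:, :2])      # DTM height under each point
        return points[:, 2] - dtm_heights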
