The Correspondence Problem in Perspective Images PhD Thesis Proposal


CENTER FOR MACHINE PERCEPTION, CZECH TECHNICAL UNIVERSITY

The Correspondence Problem in Perspective Images
PhD Thesis Proposal

Ondřej Chum

CTU CMP 2003 03, January 31, 2003
RESEARCH REPORT, ISSN

Supervisor: Dr. Jiří Matas

The author was supported by the Czech Ministry of Education under project MSM and by The Grant Agency of the Czech Republic under project GACR 102/02/1539.

Research Reports of CMP, Czech Technical University in Prague, No. 3, 2003. Published by Center for Machine Perception, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University, Technická 2, Prague 6, Czech Republic.


Contents

1 Introduction
2 RANSAC
3 Randomized RANSAC
  3.1 Algorithm
  3.2 The T_{d,d} Test
  3.3 Experiments
  3.4 Conclusion
4 Locally optimized RANSAC
  4.1 Algorithm
  4.2 Local Optimization Methods
  4.3 Experimental Results
  4.4 Conclusions
5 Wide baseline stereo
  5.1 Maximally Stable Extremal Regions
  5.2 The proposed robust wide-baseline algorithm
  5.3 Experiments
  5.4 Conclusions
6 Conclusions and Thesis Proposal

Abstract

This thesis proposal addresses the correspondence problem, especially the matching of two views of a scene taken with unknown cameras from unknown and arbitrary viewpoints. This task is known as Wide Baseline Stereo Matching. Our recent research related to this field is described and the thesis goals are proposed.

1 Introduction

This thesis proposal describes our work towards solving the correspondence problem. The focus is on matching two views of a scene taken with unknown cameras from unknown and arbitrary viewpoints, known as Wide Baseline Stereo Matching. A significant part of the work focuses on the robust estimator RANSAC, as many computer vision algorithms include a robust estimation step where model parameters are computed from a data set containing a significant proportion of outliers. The RANSAC¹ algorithm, introduced by Fischler and Bolles in 1981 [5], is possibly the most widely used robust estimator in the field of computer vision. RANSAC has been applied in the context of short baseline stereo [3, 33], wide baseline stereo matching [23, 35, 25, 15], motion segmentation [3], mosaicing [17], detection of geometric primitives [3], robust eigenimage matching [1] and elsewhere. An overview of the algorithm is given in Section 2. In Section 3 we show that under a broad range of conditions, RANSAC efficiency is significantly improved if its hypothesis evaluation step is randomized. A new randomized (hypothesis evaluation) version of the RANSAC algorithm, R-RANSAC, is introduced. Computational savings are achieved by typically evaluating only a fraction of the data points for models contaminated with outliers. The idea is implemented in a two-step evaluation procedure. A mathematically tractable class of statistical preverification tests based on small test samples is introduced. For this class of preverification tests we derive an approximate relation for the optimal setting of its single parameter.
The proposed pre-test is evaluated on both synthetic data and real-world problems and a significant increase in speed is shown. A new modification of RANSAC, the locally optimized RANSAC, is introduced in Section 4. It has been observed that, to find an optimal solution (with a given probability), the number of samples drawn in RANSAC is significantly higher than predicted from the mathematical model. This is due to the assumption that a model with parameters computed from an outlier-free sample is consistent with all inliers. The assumption rarely holds in practice. The locally optimized RANSAC

¹ RANdom SAmple Consensus

makes no new assumptions about the data; on the contrary, it makes the above-mentioned assumption valid by applying local optimization to the solution estimated from the random sample. Finally, in Section 5 a novel algorithm for wide baseline stereo matching is introduced. A new set of image elements that are put into correspondence, the so-called extremal regions, is introduced. Extremal regions possess highly desirable properties: the set is closed under 1. continuous (and thus projective) transformation of image coordinates and 2. monotonic transformation of image intensities. An efficient (near-linear complexity) and practically fast (near frame rate) detection algorithm is presented for an affinely-invariant stable subset of extremal regions, the maximally stable extremal regions (MSER). A new robust similarity measure for establishing tentative correspondences is proposed. The robustness ensures that invariants from multiple measurement regions (regions obtained by invariant constructions from extremal regions), some significantly larger (and hence more discriminative) than the MSERs, may be used to establish tentative correspondences. The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes. Significant change of scale (3.5×), illumination conditions, out-of-plane rotation, occlusion, locally anisotropic scale change and 3D translation of the viewpoint are all present in the test problems. Good estimates of epipolar geometry (average distance from corresponding points to the epipolar line below 0.09 of the inter-pixel distance) are obtained. In Section 6, conclusions and the thesis proposal are given.

2 RANSAC

The structure of the RANSAC algorithm is simple but powerful (see Algorithm 1). Repeatedly, subsets are randomly selected from the input data and model parameters fitting the sample are computed. The size of the random samples is the smallest sufficient for determining model parameters. In a second step, the quality of the model parameters is evaluated on the full data set. Different cost functions may be used [31] for the evaluation, the standard being the number of inliers, i.e. the number of data points consistent with the model. The process is terminated when the likelihood of finding a better model becomes low. The strength of the method stems from the fact that, to find a good solution, it is sufficient to select a single random sample not contaminated by outliers. Depending on the complexity of the model (the size of the random samples), RANSAC can handle contamination levels well above 50%, which is commonly assumed to be a practical limit in robust statistics [24].

In: U = {x_i}, a set of data points, |U| = N
    f : S → p, a function computing model parameters from a data point sample
    ρ(p, x), the cost function for a single data point (e.g. 1 if x is an inlier to the model with parameters p, 0 otherwise)
Out: p*, the parameters of the model maximizing the cost function

k := 0
Repeat until P{better solution exists} < η (a function of C* = max(C_i), i = 1..k, the cost (quality) of the best tested model, and the number of steps k):
  k := k + 1
  I. Hypothesis
  (1) select a random set S_k ⊂ U, |S_k| = m
  (2) compute parameters p_k = f(S_k)
  II. Evaluation
  (3) compute the cost (quality) C_k = Σ_{x∈U} ρ(p_k, x)
  (4) if C* < C_k then C* := C_k, p* := p_k

Algorithm 1: Summary of the standard version of the RANSAC algorithm.
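The hypothesize-and-verify loop of Algorithm 1 can be sketched as follows (a minimal illustration in Python, assuming a toy 2D line model with a two-point minimal sample; the function names and the line model are hypothetical simplifications, not the implementation used in the experiments):

```python
import math
import random

def ransac(points, threshold=0.1, eta=0.05, max_iter=10000):
    """Minimal RANSAC sketch for 2D line fitting (minimal sample m = 2).

    Returns the line (a, b, c), with a*x + b*y + c = 0 and a^2 + b^2 = 1,
    that maximizes the inlier count, together with that count."""
    n, m = len(points), 2
    best_line, best_count = None, 0
    k, k_max = 0, max_iter
    while k < k_max:
        k += 1
        # I. Hypothesis: draw a minimal sample and compute model parameters.
        (x1, y1), (x2, y2) = random.sample(points, m)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:
            continue  # degenerate sample
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # II. Evaluation: the cost is the number of inliers.
        count = sum(1 for x, y in points if abs(a * x + b * y + c) <= threshold)
        if count > best_count:
            best_count, best_line = count, (a, b, c)
            # Terminate when P{better solution exists} < eta.
            p_good = (best_count / n) ** m
            if p_good >= 1:
                break
            if p_good > 0:
                k_max = min(max_iter,
                            math.ceil(math.log(eta) / math.log(1 - p_good)))
    return best_line, best_count
```

For example, with an inlier fraction ε = 0.5 and m = 2, the confidence-driven bound log 0.05 / log(1 − 0.25) gives roughly 11 samples.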

3 Randomized RANSAC

The speed of RANSAC depends on two factors. Firstly, the level of contamination determines the number of random samples that have to be taken to guarantee a certain confidence in the optimality of the solution. Secondly, the time spent evaluating the quality of each hypothesized model is proportional to the size N of the data set. Typically, a very large number of erroneous models obtained from contaminated samples are evaluated. Such models are consistent with only a small fraction of the data. This observation can be exploited to significantly increase the speed of the RANSAC algorithm. As the main contribution of this work, we show that under a broad range of conditions, RANSAC efficiency is significantly improved if its hypothesis evaluation step is randomized. The core idea of the Randomized (hypothesis evaluation) RANSAC is that most model parameter hypotheses evaluated are influenced by outliers. For such erroneous models, it is sufficient to test only a small number d of data points out of the total of N points (d ≪ N) to conclude, with high confidence, that they do not correspond to the sought solution. The idea is implemented in a two-step evaluation procedure. First, a statistical test is performed on d randomly selected data points. The final evaluation on all N data points is carried out only if the pre-test is passed. The increase in the speed of the modified RANSAC depends on the likelihoods of the two types of errors made in the pre-test: 1. rejection of an uncontaminated model and 2. acceptance of a contaminated model. Since RANSAC is already a randomized algorithm, the randomization of model evaluation does not change the nature of the solution: it is only correct with a certain probability. However, the same confidence in the solution is obtained in, on average, a shorter time. Finding an optimal pre-test with the fastest average behaviour is naturally desirable, but very complex.
Instead, we introduce in Section 3.2 a mathematically tractable class of pre-tests based on small test samples. For this class we derive an approximate relation for the optimal setting of its single parameter. The proposed pre-tests are assessed on both synthetic data and real-world problems and performance improvements are demonstrated. The structure of this section is as follows. First, in Section 3.1, the concept of evaluation with pre-tests is introduced and formulae describing the total complexity of the algorithm are derived. Both the number of samples drawn and the amount of time spent on the evaluation of a hypothesized model are discussed in detail. In Section 3.2, the d-out-of-d class of pre-tests is introduced and analyzed. In Section 3.3, both simulated and real experiments are presented and their results discussed. The work is concluded in Section 3.4, where plans for future work are also discussed.

3.1 Algorithm

In this section, the time complexity of the RANSAC algorithm is expressed as a function of quantities that characterise the input data and the complexity of the model. We start by introducing the notation. The set of all data points is denoted U, the number of data points N = |U|, and ε represents the fraction of inliers in the data set. The size of the sample is m, i.e. the number of data points necessary to compute model parameters. Let us first express the total time spent in the R-RANSAC procedure. From the analysis of the algorithm (Algorithm 2), the average time spent in R-RANSAC, measured in the number of verified data points, is

J = k(t_M + t̄),  (1)

where k is the number of samples drawn, t̄ is the average number of data points verified within one model evaluation, and t_M is the time necessary to compute the parameters of the model from the selected sample. The time needed to verify the consistency of one data point with the hypothesized parameters was chosen as the unit of time. Note that t_M is a constant independent of both the number of data points N and the fraction of inliers ε. From (1) we see that the average time spent in R-RANSAC depends on both the number of samples drawn k and the average time required to process each sample. The analysis of these two components follows. The number of tested hypotheses, which is equal to the number of samples, depends (besides other factors) on the termination condition. Two different termination criteria may be adopted in RANSAC. The hypothesize-verify loop is either stopped after evaluating more samples than are on average needed to select a good (uncontaminated) sample. Alternatively, the number of samples is chosen to ensure that the probability that a better-than-currently-best sample is missed is lower than a predefined confidence level.
We show that the stopping times for the two cases, average-driven and confidence-driven, differ only by a multiplicative factor, and hence the optimal value in the proposed test is reached with the same parameters. Since the sample is selected without replacement, the probability of taking a good sample is

P_I = C(I, m) / C(N, m) = (I! (N − m)!) / ((I − m)! N!) = ∏_{j=0}^{m−1} (I − j)/(N − j),

where I = εN stands for the number of inliers. For N ≫ m a simple and accurate

In: U = {x_i}, a set of data points, |U| = N
    f : S → p, a function computing model parameters from a data point sample
    ρ(p, x), the cost function for a single data point (e.g. 1 if x is an inlier to the model with parameters p, 0 otherwise)
Out: p*, the parameters of the model maximizing the cost function

k := 0
Repeat until P{better solution exists} < η (a function of C* = max(C_i), i = 1..k, the cost (quality) of the best tested model, and the number of steps k):
  k := k + 1
  I. Hypothesis
  (1) select a random set S_k ⊂ U, |S_k| = m
  (2) compute parameters p_k = f(S_k)
  II. Preliminary test
  (3) perform the test based on d ≪ N data points
  (4) continue verification only if the test is passed
  III. Evaluation
  (5) compute the cost (quality) C_k = Σ_{x∈U} ρ(p_k, x)
  (6) if C* < C_k then C* := C_k, p* := p_k

Algorithm 2: Summary of the RANSAC and R-RANSAC algorithms. Step II is added to RANSAC to randomize its cost function evaluation.

approximation is obtained,

P_I ≈ ε^m,  (2)

which is exact for sampling with replacement and commonly used in the literature. Since P_I < ε^m, running RANSAC without replacement in fact requires on average slightly more samples than estimated with approximation (2); for N ≫ m the difference is negligible. The average number of samples taken before the first uncontaminated sample passes the preverification test is given by (from the properties of the geometric distribution)

k̄ = 1 / (ε^m α),  (3)

where α is the probability of a good sample passing the preverification test. Note that for the randomized version of RANSAC the number of samples is higher than

or equal to that of the standard version, because a valid solution may be rejected in the preliminary test with probability 1 − α. In confidence-driven sampling, at least k samples have to be taken to reduce the probability of missing a good sample below a predefined confidence level η. Thus we get, as in [3],

η = (1 − ε^m α)^k,  (4)

and solving for k leads to

k = log η / log(1 − ε^m α).  (5)

Since (1 − x) is the first-order Taylor expansion of e^{−x} at zero, and (1 − x) ≤ e^{−x}, we have

η = (1 − ε^m α)^k ≤ e^{−ε^m α k},

and therefore ln η ≤ −ε^m α k, i.e. k ≤ (−ln η) / (ε^m α) = k̄ (−ln η).

We see that k ≤ k̄ (−ln η), where −ln η is a predefined constant, so all formulae obtained for the η-confidence-driven case can be trivially modified to cover the average case.

The number of data points tested. So far we have seen that the introduction of a preliminary test has increased the number of samples drawn. For the pre-test to make sense, this effect must be more than offset by the reduction in the average number of data points tested per hypothesis. There are two cases to be considered. First, with probability P_I, an uncontaminated ("good") sample is drawn. Then the preverification test is passed with probability α and all N data points are verified; otherwise, with probability 1 − α, the good sample is rejected and only t̄_α data points are on average tested. In the second case, a contaminated ("bad") sample is drawn, which happens with probability 1 − P_I. Again, either the pre-verification step is passed, this time with a different probability β, and the full test on all N data points is carried out, or, with probability 1 − β, only t̄_β data points are tested in the preverification test. Here β stands for the probability that a bad sample passes the preverification test. Note that it is important that β < α, i.e. a bad (contaminated) sample is consistent with a smaller number of data points than a good sample.
Forming a weighted average of the four cases, the formula for the average number of point evaluations per sample is obtained:

t̄(d) = P_I (α N + (1 − α) t̄_α) + (1 − P_I)(β N + (1 − β) t̄_β).  (6)

The values of α, β, t̄_α and t̄_β depend on the type of preverification test.
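The quantities derived above, P_I for sampling without replacement, the average-case sample count of eq. (3) and the confidence-driven count of eq. (5), can be checked numerically (an illustrative Python sketch; the function names are ours, not from the thesis):

```python
import math

def p_good_sample(n_inliers, n_points, m):
    """P_I: probability that a size-m sample drawn without replacement
    contains only inliers, prod_{j=0}^{m-1} (I - j) / (N - j)."""
    p = 1.0
    for j in range(m):
        p *= (n_inliers - j) / (n_points - j)
    return p

def samples_average(eps, m, alpha=1.0):
    """Average number of samples before the first uncontaminated sample
    passes the pre-test, eq. (3): 1 / (eps^m * alpha)."""
    return 1.0 / (eps ** m * alpha)

def samples_confidence(eps, m, alpha=1.0, eta=0.05):
    """Number of samples keeping the probability of missing a good
    sample below eta, eq. (5): log(eta) / log(1 - eps^m * alpha)."""
    p = eps ** m * alpha
    return math.ceil(math.log(eta) / math.log(1.0 - p))
```

For ε = 0.5 and m = 7, the two stopping counts differ roughly by the factor −ln η, as derived above: 128 samples on average versus 382 for η = 0.05.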

3.2 The T_{d,d} Test

In this section we introduce a simple and thus mathematically tractable class of preverification tests. Despite its simplicity, we show its potential in the simulations and experiments of Section 3.3. The test we analyze is defined as follows:

Definition 1 (the T(d,d) test) The T(d,d) test is passed if all d data points out of the d randomly selected ones are consistent with the hypothesized model.

In the rest of this section we derive the optimal value of d. First of all, we express the constants introduced in the previous section as α = ε^d and β = δ^d, where δ is the probability that a data point is consistent with a random model. Since we do not always need to test all d points (a single failure means that the pre-test failed), the average time spent in the preverification test is

t̄_α = Σ_{i=1}^{d} i (1 − ε) ε^{i−1}  and  t̄_β = Σ_{i=1}^{d} i (1 − δ) δ^{i−1}.

Since

Σ_{i=1}^{∞} i (1 − x) x^{i−1} = 1 / (1 − x),  (7)

we have t̄_α ≤ 1/(1 − ε) and t̄_β ≤ 1/(1 − δ). The approximation obtained after substituting (7) into (6),

t̄(d) ≈ ε^m (ε^d N + (1 − ε^d)/(1 − ε)) + (1 − ε^m) (δ^d N + (1 − δ^d)/(1 − δ)),

is too complicated for finding the optimal d. Therefore, we incorporate the following approximations: (1 − ε^m)(1 − δ^d)/(1 − δ) ≈ 1, (1 − ε^m) δ^d N ≈ δ^d N, and ε^d N ≫ (1 − ε^d)/(1 − ε), which are sufficiently accurate for commonly encountered values of ε, δ and N. After applying these approximations, we have

t̄(d) ≈ N δ^d + ε^{m+d} N.  (8)
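A sketch of the T(d,d) pre-test and of the approximate per-sample cost t̄(d) of eq. (8) (illustrative Python; `model_fits` is a hypothetical consistency predicate, not part of the thesis):

```python
import random

def t_dd_passed(model_fits, data, d):
    """T(d,d) pre-test: passed iff all d randomly selected data points
    are consistent with the hypothesized model; a single failure rejects,
    so on average fewer than d consistency checks are performed."""
    for x in random.sample(data, d):
        if not model_fits(x):
            return False
    return True

def avg_cost_per_sample(d, n, eps, delta, m):
    """Approximate average number of point evaluations per sample,
    t(d) ~ N * delta**d + eps**(m + d) * N, eq. (8)."""
    return n * delta ** d + eps ** (m + d) * n
```

For N = 1500, ε = 0.4, δ = 0.05 and m = 7, the d = 1 test cuts the approximate per-sample cost from about 1502 point evaluations to about 76.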

The average time spent in R-RANSAC, in the number of verified data points, is then approximately

J(T_{d,d}) ≈ (1 / (ε^m ε^d)) (N δ^d + ε^{m+d} N + t_M).  (9)

We are looking for the minimum of J(T_{d,d}), which is found by solving ∂J(T_{d,d})/∂d = 0 for d. The optimal length of the T_{d,d} test is

d* ≈ ln( (ln ε (t_M + 1)) / (N (ln δ − ln ε)) ) / ln δ.  (10)

The value of d_opt must be an integer greater than or equal to zero, so it can be expressed as

d_opt = max(0, argmin_{d ∈ {⌊d*⌋, ⌈d*⌉}} J(T_{d,d})).  (11)

Since the cost function J(T_{d,d}) has only one extremum, and J(T_{d,d}) → ∞ for d → ±∞, R-RANSAC is faster than the standard RANSAC if J(T_{0,0}) > J(T_{1,1}). From this inequality we get

N > (t_M + 1) (1 − ε) / (ε − δ).  (12)

3.3 Experiments

This section presents experiments that show the usefulness of the new randomized RANSAC algorithm with preverification tests. The speed-up is demonstrated on the problem of epipolar geometry estimation. Three experiments are conducted on data from a synthetic, a short (standard) baseline, and a wide-baseline stereo matching problem. The results of these experiments are summarized in Tables 1, 2, and 3 respectively. The structure of the tables is the following. The first column shows the length d of the T_{d,d} test, where d = 0 means standard RANSAC. The number of samples, each consisting of 7 point-to-point correspondences used for model parameter estimation, is given in the second column. Since the seven-point algorithm [8] for the computation of the fundamental matrix may lead to one or three solutions, the next column, labeled "models", shows the number of hypothesized fundamental matrices. The "tests" column displays the number of point-to-point correspondences evaluated during the procedure. In the penultimate column, the average number of inliers detected is given. The last column is rather informative

d samples models tests inliers time
Table 1: Synthetic experiment on 1500 correspondences, 40% of inliers, 3 repetitions.

d samples models tests inliers time
Table 2: Short baseline experiment on 676 tentative correspondences.

and shows the time in seconds taken by the algorithm. This is strongly dependent on the implementation.

Synthetic experiment. 1500 correspondences were generated, 900 outliers and 600 inliers. Since the run-time of both RANSAC and R-RANSAC is a random variable, the programs were executed 3 times and averages were taken. The results are shown in Table 1. Since the number of correspondences is large, the standard RANSAC algorithm spends a long time verifying all correspondences, as can be seen in the "tests" column.

The short baseline experiment was conducted on images from the standard Leuven castle dataset [22]. There were 676 tentative correspondences of Harris interest points, selected on the basis of the cross-correlation of neighbourhoods. The tentative correspondences contained approximately 60% of inliers. Looking at Table 2, we see that approximately twice as many fundamental matrices were hypothesized in R-RANSAC, but more than nine times fewer correspondences were evaluated.

Wide baseline experiment on the BOOKSHELF dataset. The tentative correspondences were formed as follows. Discriminative regions (MSERs, SECs) [14] were detected. Robust similarity functions on the affine-invariant description were used to establish mutually nearest pairs of regions. Point correspondences were obtained as the centres of gravity of those regions. There were less than 40% of inliers among the correspondences. The speed-up in this experiment, shown in Table 3, is approximately 50%.

3.4 Conclusion

We presented a new algorithm called R-RANSAC, which increases the speed of model parameter estimation under a broad range of conditions, due to randomiza-

d samples models tests inliers time
Table 3: Wide baseline experiment on 413 tentative correspondences.

Figure 1: Short baseline image set.

Figure 2: Wide baseline image set.

tion of the hypothesis evaluation step. For samples contaminated by outliers, it was shown that it is sufficient to test only a small number of data points d ≪ N to conclude with high confidence that they do not correspond to the sought solution. The idea was implemented in a two-step evaluation procedure (Algorithm 2). We introduced a mathematically tractable class of pre-tests based on small test samples. For this class, an approximate relation for the optimal setting of its single parameter was derived. The proposed pre-test was evaluated on both synthetic data and real-world problems and a significant increase in speed was observed. A task for the future is to design an optimal preverification test in a class broader than the T_{d,d}.

4 Locally optimized RANSAC

In the classical formulation of RANSAC, the problem is to find all inliers in a set of data points. The number of inliers I is typically not known a priori. Inliers are data points consistent with the best model, e.g. epipolar geometry or homography in a two-view correspondence problem, or line or ellipse parameters in the case of detection of geometric primitives. The RANSAC procedure finds, with a certain probability, the inliers and the corresponding model by repeatedly drawing random samples from the input set of data points. RANSAC is popular because it is simple and works well in practice. The reason is that almost no assumptions are made about the data and no (unrealistic) conditions have to be satisfied for RANSAC to succeed. However, it has been observed experimentally that RANSAC runs much longer (even by an order of magnitude) than theoretically predicted [29]. The discrepancy is due to one assumption of RANSAC that is rarely true in practice: it is assumed that a model with parameters computed from an uncontaminated sample is consistent with all inliers.² In this section we propose a novel, easy-to-implement modification of RANSAC exploiting the fact that a model hypothesized from an uncontaminated minimal sample is almost always sufficiently near the optimal solution, and that a local optimization step applied to selected models produces an algorithm with near-perfect agreement with the theoretical (i.e. optimal) performance. This approach not only increases the number of inliers found, and consequently speeds up the RANSAC procedure by allowing its earlier termination, but also returns models of higher precision. The increase of the average time spent in a single RANSAC verification step is minimal. The proposed optimization strategy guarantees that the number of samples to which the optimization is applied is insignificant.
The main contributions of this work are (a) a modification of RANSAC that simultaneously improves the speed of the algorithm and the quality of the solution (which is near-optimal), (b) the introduction of two local optimization methods, and (c) a rule for the application of the local optimization, together with a theoretical analysis showing that the local optimization is applied at most log k times, where k is the number of samples drawn. In experiments on two-view geometry estimation (epipolar geometry and homography), the speed-up achieved is two to three fold. The problem described above was noticed by Tordoff and Murray [11], who required real-time performance. The necessary speed-up was achieved by providing the estimation process with additional information in the form of a probability of correctness for each data point. However, such information is not always available and it is in general difficult to estimate the probabilities reliably.

² Experiments reported in Section 4.3 confirm that the assumption does not hold.

In contrast, the modification proposed in this work is internal only (it requires no extra input information) and does not interfere with other modifications of the algorithm, such as MLESAC [31], R-RANSAC [2] and NAPSAC [21]. MLESAC, proposed by Torr and Zisserman, defines a cost function in the maximum likelihood framework. The R-RANSAC algorithm increases the speed of the algorithm by randomizing its verification part. NAPSAC focuses on the selection of samples. In fact, all these modifications can be used in conjunction. The structure of this section is as follows. First, in Section 4.1, the motivation of this work is discussed in detail and the general algorithm of locally optimized RANSAC is described. Four different methods of local optimization are proposed in Section 4.2. All methods are experimentally tested and evaluated through epipolar geometry and homography estimation. The results are shown and discussed in Section 4.3. The work is concluded in Section 4.4.

4.1 Algorithm

The structure of the RANSAC algorithm is simple but powerful. Repeatedly, subsets are randomly selected from the input data and model parameters fitting the sample are computed. The size of the random samples is the smallest sufficient for determining model parameters. In a second step, the quality of the model parameters is evaluated on the full data set. Different cost functions may be used [31] for the evaluation, the standard being the number of inliers, i.e. the number of data points consistent with the model. The process is terminated [5, 33] when the likelihood of finding a better model becomes low, i.e. the probability η of missing a set of inliers of size I within k samples falls under a predefined threshold:

η = (1 − P_I)^k.  (13)

The symbol P_I stands for the probability that an uncontaminated sample of size m is randomly selected from N data points:

P_I = C(I, m) / C(N, m) = ∏_{j=0}^{m−1} (I − j)/(N − j) ≈ ε^m,  (14)

where ε is the fraction of inliers, ε = I/N.
The number of samples that has to be drawn to ensure a given η is k = log(η) / log(1 − P_I). From equations (13) and (14) it can be seen that the termination criterion based on the probability η expects that the selection of a single random sample not contaminated by outliers is followed by the discovery of the whole set of I inliers. However, this

assumption is often not valid, since the inliers are perturbed by noise. Since RANSAC generates hypotheses from minimal sets, the influence of the noise is not negligible, and a set of correspondences smaller than I is found. The consequence is an increase in the number of samples before the termination of the algorithm. The effect is clearly visible in the histograms of the number of inliers found by standard RANSAC. The first column of Figure 4 shows the histograms for five matching experiments. The number of inliers varies by about 2-3%. We propose a modification that increases the number of inliers found to near the optimum I. This is achieved via a local optimization of promising samples. For a summary of the locally optimized RANSAC, see Algorithm 3.

Repeat until the probability of finding a better solution falls under a predefined threshold, as in equation (13):
1. Select a random sample of the minimum number of data points, S_m.
2. Estimate the model parameters consistent with this minimal set.
3. Calculate the number of inliers I_k, i.e. the data points whose error is smaller than a predefined threshold θ.
4. If a new maximum has occurred (I_k > I_j for all j < k), run local optimization. Store the best model.

Algorithm 3: A brief summary of LO-RANSAC.

The local optimization step is carried out only if a new maximum in the number of inliers from the current sample has occurred, i.e. when standard RANSAC stores its best result. The number of data points consistent with a model from a randomly selected sample can be thought of as a random variable with an unknown (or very complicated) density function. This density function is the same for all samples, so the probability that the k-th sample will be the best so far is 1/k. The average number of times the maximum is reached within k samples is then

Σ_{j=1}^{k} 1/j ≤ ∫_1^k (1/x) dx + 1 = log k + 1.

Note that this is an upper bound, as the number of correspondences is finite and discrete, and so the same number of inliers will occur often.
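The log k + 1 bound on the number of new maxima can be checked with a quick simulation (illustrative Python; continuous scores are used so that the ties mentioned above, which only tighten the bound, do not arise):

```python
import math
import random

def count_new_maxima(k, rng):
    """Count how often the running maximum of k i.i.d. continuous scores
    improves; the expectation is the harmonic number H_k <= log(k) + 1."""
    best, records = float("-inf"), 0
    for _ in range(k):
        score = rng.random()
        if score > best:
            best, records = score, records + 1
    return records

rng = random.Random(1)
k, runs = 1000, 2000
avg = sum(count_new_maxima(k, rng) for _ in range(runs)) / runs
# H_1000 is about 7.49, safely below log(1000) + 1, which is about 7.91
```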
This theoretical bound was confirmed experimentally; the average numbers of local optimizations over an execution of (locally optimized) RANSAC can be found in Table 6. For more details about the experiments, see Section 4.3.

Figure 3: The average error (left) and the standard deviation of the error (right) for samples of 7, 8, 9, 14 and all 100 points, respectively, with respect to the noise level.

4.2 Local Optimization Methods

The following methods of local optimization have been tested. The choice is motivated by two observations that are given later in this section.

1. Standard. The standard implementation of RANSAC without any local optimization.

2. Simple. Take all data points with error smaller than θ and use a linear algorithm to hypothesize new model parameters.

3. Iterative. Take all data points with error smaller than K·θ and use the linear algorithm to compute new model parameters. Reduce the threshold and iterate until the threshold is θ.

4. Inner RANSAC. A new sampling procedure is run only on the I_k data points consistent with the hypothesised model. As the sampling runs on inlier data, there is no need for the size of the sample to be minimal. On the contrary, the size of the sample is selected to minimize the error of the model parameter estimation. In our experiments the sample sizes are set to min(I_k/2, 14) for epipolar geometry (see results in Section 4.2) and to min(I_k/2, 12) for homography estimation. The number of repetitions is set to ten in the experiments presented.

5. Inner RANSAC with iteration. This method is similar to the previous one, the difference being that each sample of the inner RANSAC is processed by method 3.

The local optimization methods are based on the two following observations.

Observation 1: The Size of Sample

The less information (data points) is used to estimate the model parameters in the presence of noise, the less accurate the model is. The reason for RANSAC to draw

minimal samples is that every extra point exponentially decreases the probability of selecting an outlier-free sample, which is approximately³ ε^m, where m is the size of the sample (i.e. the number of data points included in the sample). It has been shown in [3] that the fundamental matrix estimated from a seven-point sample is more precise than one estimated from eight points using a linear algorithm [7]. This is due to the singularity enforcement in the eight-point algorithm. However, the following experiment shows that this holds only for eight-point samples; taking nine or more points gives more stable results than those obtained when the fundamental matrix is computed from seven points only.

Experiment: This experiment shows how the quality of a hypothesis depends on the number of correspondences used to calculate the fundamental matrix. For seven points, the seven-point algorithm was used [3], and for eight and more points the linear algorithm [7] was used. The course of the experiment was as follows. Noise of different levels was added to noise-free image point correspondences divided into two sets of a hundred correspondences. Samples of different sizes were drawn from the first set and the average error over the second set was computed. This was repeated 100 times for each noise level. The results are displayed in Figure 3. This experiment demonstrates that the more points are used to estimate the model (in this case the fundamental matrix), the more precise the solution obtained (with the exception of eight points). The experiment also shows that the minimal sample gives hypotheses of rather poor quality. One can use cost functions that are more complicated than simply the number of inliers, but evaluating such a function only at parameters arising from the minimal sample will give results at best equal to the proposed method of local optimization.
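The first observation, that larger samples give more precise estimates in the presence of noise, can be illustrated on a much simpler model than the fundamental matrix, e.g. a 1D location parameter estimated by least squares (a hypothetical stand-in, not the thesis experiment):

```python
import random

def avg_abs_error(sample_size, noise, trials, rng):
    """Estimate a 1D location parameter (true value 0) by the least-squares
    estimate (the mean) of noisy samples of the given size; return the
    average absolute error over many trials."""
    total = 0.0
    for _ in range(trials):
        pts = [rng.gauss(0.0, noise) for _ in range(sample_size)]
        total += abs(sum(pts) / len(pts))
    return total / trials

rng = random.Random(0)
e_minimal = avg_abs_error(7, 1.0, 3000, rng)   # "minimal sample" size
e_larger = avg_abs_error(50, 1.0, 3000, rng)   # larger sample
# the error shrinks roughly as 1/sqrt(sample size)
```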
Observation 2: Iterative Scheme

It is well known from the robust statistics literature that pseudo-robust algorithms which first estimate model parameters from all data by least-squares minimization, then remove the data points with the biggest error (or residual), and iteratively repeat this procedure, do not lead to correct estimates. It can easily be shown that a single far-outlying data point, i.e. a leverage point, causes a total breakdown of the estimated model parameters. That is because such a leverage point outweighs even the majority of inliers in least-squares minimization. The algorithm works well only when the outliers are not too far off, so that the majority of inliers have a bigger influence on the least squares. In local optimization method 3 there are no leverage points, as each data point has error below K·θ with respect to the sampled model.

³ This is exact for the sampling with replacement.
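The breakdown caused by a leverage point can be reproduced in a few lines (a toy illustration with hypothetical data, not the thesis' experiments):

```python
def fit_line_lsq(pts):
    # Least-squares fit of y = a*x + b.
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

inliers = [(float(x), float(x)) for x in range(10)]   # exact points on y = x
a0, b0 = fit_line_lsq(inliers)
assert abs(a0 - 1.0) < 1e-9                           # perfect fit

data = inliers + [(100.0, -100.0)]                    # one far leverage point
a1, b1 = fit_line_lsq(data)
# The leverage point dominates the quadratic cost and ruins the slope.
assert abs(a1 - 1.0) > 1.0

# Worse: the point with the LARGEST residual is a true inlier, so the
# "fit, remove the worst point, repeat" scheme discards inliers first.
residuals = [abs(a1 * x + b1 - y) for x, y in data]
assert residuals.index(max(residuals)) != len(data) - 1
```

Note that under the corrupted fit the leverage point itself has a small residual, which is exactly why the iterative-removal scheme cannot recover.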

4.3 Experimental Results

The proposed algorithm was extensively tested on the problem of estimating two-view relations (epipolar geometry and homography) from image point correspondences. Five experiments are presented in this section, all of them on publicly available data, depicted in Figures 5 and 6. In experiments A and B, the epipolar geometry is estimated in a wide-baseline setting. In experiment C, the epipolar geometry was estimated too, this time from short-baseline stereo images. From the point of view of RANSAC use, the narrow- and wide-baseline problems differ in the number of correspondences and inliers (see Table 4), and also in the distribution of the errors of outliers. Experiments D and E recover a homography. The scene in experiment E is the same as in experiment A, and this experiment could be seen as a plane segmentation. All tentative correspondences were detected and matched automatically.

The algorithms were implemented in C and the experiments were run on an AMD K7 1800+ processor. The terminating criterion based on equation (13) was set to η < 0.05. The threshold θ was set to θ = 3.84 σ² for the epipolar geometry and θ = 5.99 σ² for the homography. In both cases the expected noise level was set to σ = 0.3. The characterization of the matching problems, such as the number of correspondences, the total number of inliers and the expected number of samples, is summarized in Table 4. The total number of inliers was set to the maximal number of inliers obtained over all methods and all repetitions. The expected number of samples was calculated according to the termination criterion mentioned above.

The performance of local optimization methods 1 to 5 was evaluated on problems A to E. The results for 100 runs are summarized in Table 5. For each experiment, a table containing the average number of inliers, the average number of samples drawn, the average time spent in RANSAC (in seconds) and the efficiency (the ratio of the number of samples drawn and expected) is shown.
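The expected sample counts in Table 4 follow the standard termination criterion; assuming equation (13) is the usual bound N ≥ log η / log(1 − ε^m), the count for an inlier fraction ε, sample size m and confidence η can be computed as:

```python
import math

def ransac_samples(epsilon, m, eta=0.05):
    """Number of samples needed so that the probability of never drawing
    an all-inlier sample of size m drops below eta (sampling with
    replacement approximation)."""
    return math.ceil(math.log(eta) / math.log(1.0 - epsilon ** m))

# Epipolar geometry (7-point samples), inlier fractions from Table 4:
print(ransac_samples(0.61, 7))   # experiment A: tens of samples suffice
print(ransac_samples(0.29, 7))   # experiment B: tens of thousands
# Homography (4-point samples):
print(ransac_samples(0.19, 4))
```

The steep growth with decreasing ε is exactly why enlarging the inlier set via local optimization shortens the run so dramatically.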
Table 6 shows how many times the local optimization has been applied, together with the theoretical upper bound derived in Section 4.1. Method 5 achieved the best results of all the methods in the number of samples in all experiments, and differs only slightly from the theoretically expected number. Standard RANSAC, on the other hand, far exceeds this limit. In Figure 4 the histograms of the sizes of the resulting inlier sets are shown. Each column shows results for one method, each row for one experiment. One can observe that the peaks shift to higher values with the increasing identification number of the method. Method 5 reaches the best results in terms of the sizes of the inlier sets and consequently in the number of samples before termination. This method should be used when the fraction of inliers is low. Resampling, on the other hand, might be quite costly in the case of a high number of inliers, especially if accompanied by a small

        A      B      C      D      E
ε      61%    29%    32%    19%    18%

Table 4: Characteristics of experiments A-E. Total number of correspondences (# corr), maximal number of inliers found within all tests (# inl), fraction of inliers ε and theoretically expected number of samples (# sam).

Table 5: The summary of the local optimization experiments: average number of inliers (inl) and samples taken (sam), average time in seconds and efficiency (eff). The best values for each row are highlighted in bold. For more details see the description in the text in Section 4.3.

Figure 4: Histograms of the number of inliers. Methods 1 to 5 (1 stands for standard RANSAC) are stored in columns and the different datasets are shown in rows (A to E). On each graph, the number of inliers is on the x-axis and the number of times this count was reached within one hundred repetitions is on the y-axis.

Table 6: The average number of local optimizations run during one execution of RANSAC and the logarithm of the average number of samples for comparison.

number of correspondences in total, as can be seen in experiment A (61% of inliers out of 94 correspondences). In this case, method 3 was the fastest. Method 3 obtained significantly better results than standard RANSAC in all experiments, with a speed-up of about 100%, and only slightly worse results than method 5. We suggest using method 3 in real-time procedures when a high number of inliers is expected. Methods 2 and 4 are inferior to the methods with iteration (3 and 5 respectively) without offering any time-saving advantage.

4.4 Conclusions

This section has introduced a simple modification of the RANSAC algorithm that increases the number of detected inliers. Consequently, the number of samples drawn decreases. In all experiments, the run-time is reduced by a factor of at

Figure 5: Image pairs and detected points used in the epipolar geometry experiments (A-C). Inliers are marked as dots in the left images and outliers as crosses in the right images.

Figure 6: Image pairs and detected points used in the homography experiments (D and E). Inliers are marked as dots in the left images and outliers as crosses in the right images.

least two, which may be very important in real-time applications incorporating a RANSAC step. Two methods of local optimization were proposed: method 3 is recommended for problems with a large fraction of inliers and a small number of data points (which is typical for real-time applications), and method 5 reaches almost optimal results. It has been shown and experimentally confirmed that the number of times local optimization is applied is lower than the logarithm of the number of samples drawn. The proposed improvement allows making precise quantitative statements about the number of samples drawn in RANSAC. The behavior of the modified RANSAC is in much closer agreement with the mathematical model than that of a straightforward implementation.

5 Wide baseline stereo

Finding reliable correspondences in two images of a scene taken from arbitrary viewpoints, possibly with different cameras and under different illumination conditions, is a difficult and critical step towards fully automatic reconstruction of 3D scenes [8]. A crucial issue is the choice of elements whose correspondence is sought. In the wide-baseline set-up, local image deformations cannot be realistically approximated by translation or translation with rotation, and a full affine model is required. Correspondence therefore cannot be established by comparing regions of a fixed (Euclidean) shape like rectangles or circles, since their shape is not preserved under affine transformation. In most images there are regions that can be detected with high repeatability, since they possess some distinguishing, invariant and stable property. We argue that such regions of, in general, data-dependent shape, called distinguished regions (DRs), may serve as the elements to be put into correspondence, either in stereo matching or in object recognition.

The first contribution of the work is the introduction of a new set of distinguished regions, the so-called extremal regions. Extremal regions have two desirable properties: the set is closed under continuous (and thus perspective) transformation of image coordinates, and it is closed under monotonic transformation of image intensities. An efficient (near-linear complexity) and practically fast detection algorithm is presented for an affinely-invariant stable subset of extremal regions, the maximally stable extremal regions (MSER). Robustness of a particular type of DR depends on the image data and must be tested experimentally. Successful wide-baseline experiments on indoor and outdoor datasets presented in Section 5.3 demonstrate the potential of MSERs.
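The closure under monotonic intensity transformations follows directly from the definition and can be checked mechanically. A minimal sketch on a hypothetical 3×3 image (the image and the transformation are arbitrary illustrations):

```python
def threshold_sets(img, levels):
    # For each threshold t, the set of pixels with intensity below t.
    return [frozenset((r, c) for r, row in enumerate(img)
                      for c, v in enumerate(row) if v < t) for t in levels]

img = [[10, 10, 200],
       [10, 90, 200],
       [50, 90, 250]]

# A monotonic (strictly order-preserving) intensity transformation.
g = lambda v: v * v + 7
img_g = [[g(v) for v in row] for row in img]

values = sorted({v for row in img for v in row})
orig = threshold_sets(img, values)
trans = threshold_sets(img_g, [g(v) for v in values])
# The family of thresholded pixel sets is identical, hence so are the
# extremal regions (their connected components).
assert orig == trans
```

Since v < t holds exactly when g(v) < g(t) for a strictly increasing g, every thresholded set, and therefore every extremal region, survives the transformation unchanged.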
Reliable extraction of a manageable number of potentially corresponding image elements is a necessary but certainly not a sufficient prerequisite for successful wide-baseline matching. With two sets of distinguished regions, the matching problem can be posed as a search in the correspondence space [6]. Forming a complete bipartite graph on the two sets of DRs and searching for a globally consistent subset of correspondences is clearly out of the question for computational reasons. Recently, a whole class of stereo matching and object recognition algorithms with a common structure has emerged [23, 34, 1, 35, 4, 28, 18, 11]. These methods exploit local invariant descriptors to limit the number of tentative correspondences. Important design decisions at this stage include: 1. the choice of measurement regions, i.e. the parts of the image on which invariants are computed, 2. the method of selecting tentative correspondences given the invariant description, and 3. the choice of invariants. Typically, distinguished regions or their scaled versions serve as measurement

regions, and tentative correspondences are established by comparing invariants using the Mahalanobis distance [25, 35, 26]. As a second novelty of the presented approach, a robust similarity measure for establishing tentative correspondences is proposed to replace the Mahalanobis distance. The robustness of the proposed similarity measure allows us to use invariants from a collection of measurement regions, even some that are much larger than the associated distinguished region. Measurements from large regions are either very discriminative (it is very unlikely that two large parts of the image are identical) or completely wrong (e.g. if an orientation or depth discontinuity becomes part of the region). The former helps establish reliable tentative (local) correspondences; the influence of the latter is limited due to the robustness of the approach.

Finding the epipolar geometry consistent with the largest number of tentative (local) correspondences is the final step of all wide-baseline algorithms. RANSAC has been by far the most widely adopted method since [32]. The presented algorithm takes novel steps to increase the number of matched regions and the precision of the epipolar geometry. The rough epipolar geometry estimated from tentative correspondences is used to guide the search for further region matches. It restricts the location to epipolar lines and provides an estimate of the affine mapping between corresponding regions. This mapping allows the use of correlation to filter out mismatches. The process significantly increases the precision of the EG estimate; the final average inlier distance from the epipolar line is below 0.1 pixel. For details see Section 5.2.

Related work. Since the influential paper by Schmid and Mohr [26], many image matching and wide-baseline stereo algorithms have been proposed, most commonly using Harris interest points as distinguished regions.
Tell and Carlsson [28] proposed a method in which line segments connecting Harris interest points form measurement regions. The measurements are characterised by scale-invariant Fourier coefficients. The Harris interest point detector is stable over a range of scales, but defines no scale- or affine-invariant measurement region. Baumberg [1] applied an iterative scheme originally proposed by Lindeberg and Garding to associate affine-invariant measurement regions with Harris interest points. In [18], Mikolajczyk and Schmid show that a scale-invariant MR can be found around Harris interest points. In [23], Pritchett and Zisserman form groups of line segments and estimate local homographies using parallelograms as measurement regions. Tuytelaars and Van Gool introduced two new classes of affine-invariant distinguished regions, one based on local intensity extrema [35], the other using point and curve features [34]. In the latter approach, DRs are characterised by measurements from inside an ellipse, constructed in an affine-invariant manner. Lowe [11] describes the Scale Invariant Feature Transform approach, which produces a scale- and orientation-invariant characterisation of interest points.

The rest of this section is structured as follows. Maximally Stable Extremal

Image I is a mapping I : D ⊂ Z² → S. Extremal regions are well defined on images if:

1. S is totally ordered, i.e. a reflexive, antisymmetric, transitive and total binary relation ≤ exists. In this work only S = {0, 1, ..., 255} is considered, but extremal regions can be defined on e.g. real-valued images (S = R).

2. An adjacency (neighbourhood) relation A ⊂ D × D is defined. In this work 4-neighbourhoods are used, i.e. p, q ∈ D are adjacent (pAq) iff Σ_{i=1}^{d} |p_i − q_i| ≤ 1.

Region Q is a contiguous subset of D, i.e. for each p, q ∈ Q there is a sequence p, a_1, a_2, ..., a_n, q with pAa_1, a_i A a_{i+1}, a_n A q.

(Outer) Region Boundary ∂Q = {q ∈ D \ Q : ∃ p ∈ Q : qAp}, i.e. the boundary ∂Q of Q is the set of pixels adjacent to at least one pixel of Q but not belonging to Q.

Extremal Region Q ⊂ D is a region such that for all p ∈ Q, q ∈ ∂Q : I(p) > I(q) (maximum intensity region) or I(p) < I(q) (minimum intensity region).

Maximally Stable Extremal Region (MSER). Let Q_1, ..., Q_{i−1}, Q_i, ... be a sequence of nested extremal regions, i.e. Q_i ⊂ Q_{i+1}. Extremal region Q_{i*} is maximally stable iff q(i) = |Q_{i+Δ} \ Q_{i−Δ}| / |Q_i| has a local minimum at i* (|·| denotes cardinality). Δ ∈ S is a parameter of the method.

Table 7: Definitions used in Section 5.1.

Regions are defined and their detection algorithm is described in Section 5.1. In Section 5.2, details of a novel robust matching algorithm are given. Experimental results on outdoor and indoor images taken with an uncalibrated camera are presented in Section 5.3. The presented experiments are summarized and the contributions of the work are reviewed in Section 5.4.

5.1 Maximally Stable Extremal Regions

In this section, we introduce a new type of image element useful in wide-baseline matching, the Maximally Stable Extremal Regions. The regions are defined solely by an extremal property of the intensity function in the region and on its outer boundary. The concept can be explained informally as follows. Imagine all possible thresholdings of a gray-level image I.
We will refer to the pixels below a threshold as black and to those above or equal as white. If we were shown a movie of thresholded images I_t, with frame t corresponding to threshold t, we would see first a white image. Subsequently black spots corresponding to local intensity minima would appear and grow. At some point regions corresponding to two local

minima will merge. Finally, the last image will be black. The set of all connected components of all frames of the movie is the set of all maximal regions; minimal regions can be obtained by inverting the intensity of I and running the same process. The formal definition of the MSER concept and the necessary auxiliary definitions are given in Table 7.

In many images, local binarization is stable over a large range of thresholds in certain regions. Such regions are of interest since they possess the following properties:

Invariance to affine transformation of image intensities.

Covariance to adjacency-preserving (continuous) transformations T : D → D of the image domain.

Stability, since only extremal regions whose support is virtually unchanged over a range of thresholds are selected.

Multi-scale detection. Since no smoothing is involved, both very fine and very large structures are detected.

The set of all extremal regions can be enumerated in O(n log log n), where n is the number of pixels in the image. The enumeration of extremal regions proceeds as follows. First, pixels are sorted by intensity. The computational complexity of this step is O(n) if the range of image values S is small, e.g. the typical {0, ..., 255}, since the sort can be implemented as BINSORT [27]. After sorting, pixels are placed in the image (in either decreasing or increasing order) and the list of connected components and their areas is maintained using the efficient union-find algorithm [27]. The complexity of our union-find implementation is O(n log log n), i.e. almost linear⁴. Importantly, the algorithm is very fast in practice: the MSER detection takes only 0.14 seconds on a Linux PC with an Athlon XP 1600+ processor for a 530x350 image (n = 185500). The process produces a data structure storing the area of each connected component as a function of intensity.
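The enumeration step can be sketched as follows. This is a simplified illustration of the BINSORT-plus-union-find idea, recording per intensity level the number of components and the largest component area, not the thesis' C implementation:

```python
def enumerate_components(img):
    """Insert pixels in increasing order of intensity and maintain
    connected components with union-find; return, for each threshold
    level, (number of components, largest component area)."""
    h, w = len(img), len(img[0])
    parent, area = {}, {}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]   # path halving
            p = parent[p]
        return p

    # BINSORT: bucket pixels by intensity (S = {0, ..., 255}).
    buckets = [[] for _ in range(256)]
    for r in range(h):
        for c in range(w):
            buckets[img[r][c]].append((r, c))

    history = {}
    for level in range(256):
        for p in buckets[level]:
            parent[p], area[p] = p, 1
            r, c = p
            for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if q in parent:                     # 4-neighbour already placed
                    rp, rq = find(p), find(q)
                    if rp != rq:                    # merge the two components
                        parent[rq] = rp
                        area[rp] += area[rq]
        roots = {find(p) for p in parent}
        if roots:
            history[level] = (len(roots), max(area[r] for r in roots))
    return history

img = [[0, 0, 9],
       [0, 9, 9],
       [9, 9, 1]]
hist = enumerate_components(img)
assert hist[0] == (1, 3)    # the three darkest pixels form one component
assert hist[1] == (2, 3)    # the pixel of intensity 1 starts a new one
assert hist[9] == (1, 9)    # everything merges at the top level
```

A real MSER detector keeps the full area-versus-intensity history of each component and selects the levels where the relative area change q(i) has a local minimum.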
A merge of two components is viewed as the termination of the existence of the smaller component and the insertion of all pixels of the smaller component into the larger one. Finally, intensity levels that are local minima of the rate of change of the area function are selected as thresholds producing maximally stable extremal regions. In the output, each MSER is represented by the position of a local intensity minimum (or maximum) and a threshold.

Notes. The structure of the above algorithm and of an efficient watershed algorithm [36] is essentially identical. However, the structure of the output of

⁴ Even faster (but more complex) connected component algorithms exist, with O(nα(n)) complexity, where α is the inverse Ackermann function; α(n) ≤ 4 for all practical n.

the two algorithms is different. The watershed is a partitioning of D, i.e. a set of regions R_i such that ∪ R_i = D and R_j ∩ R_k = ∅ for j ≠ k. In watershed computation, the focus is on the thresholds where regions merge (and two watersheds touch). Such thresholds are of little interest here, since they are highly unstable: after a merge, the region area jumps. In MSER detection, we instead seek a range of thresholds that leaves the watershed basin effectively unchanged.

Detection of MSERs is also related to thresholding. Every extremal region is a connected component of a thresholded image. However, no global or optimal threshold is sought; all thresholds are tested and the stability of the connected components is evaluated. The output of the MSER detector is not a binarized image. For some parts of the image, multiple stable thresholds exist, and a system of nested subsets is output in this case. Finally, we remark that MSERs can be defined on any image (even high-dimensional) whose pixel values are from a totally ordered set.

5.2 The proposed robust wide-baseline algorithm

Distinguished region detection. As a first step, the DRs are detected: the MSERs computed on the intensity image (MSER+) and on the inverted image (MSER-).

Measurement regions. A measurement region of arbitrary size may be associated with each DR, provided the construction is affine-covariant. Smaller measurement regions are both more likely to satisfy the planarity condition and less likely to cross a discontinuity in depth or orientation. On the other hand, small regions are less discriminative, i.e. they are much less likely to be unique. Increasing the size of a measurement region carries the risk of including parts of the background that are completely different in the two images considered. Clearly, the optimal size of an MR depends on the scene content and is different for each DR. In [35], Tuytelaars et al.
double the elliptical DR to increase discriminability, while keeping the probability of crossing object boundaries at an acceptable level. In the proposed algorithm, measurement regions are selected at multiple scales: the DR itself and the convex hull of the DR scaled by factors of 1.5, 2 and 3. Since matching is accomplished in a robust manner, we benefit from the increased distinctiveness of large regions without being severely affected by clutter or non-planarity of the DR's pre-image. This is a novelty of our approach. Commonly, the Mahalanobis distance has been used in MR matching. However, the non-robustness of this metric means that matching may fail because of a single corrupted measurement (this happened in the experiments reported below).

Invariant description. In all experiments, rotational invariants (based on complex moments) were used after applying a transformation that diagonalises the region's covariance matrix. In combination, this is an affinely-invariant procedure. A combination of rotational and affinely invariant generalised colour moments [19] gave a similar result. On their own, these affine invariants failed on

problems with a large scale change.

Robust matching. A measurement taken from an almost planar patch of the scene with a stable invariant description will be referred to as a good measurement. Unstable measurements, or those computed on non-planar surfaces or at discontinuities in depth or orientation, will be referred to as corrupted measurements. The robust similarity is computed as follows. For each measurement M_A^i on region A, the k regions B_1, ..., B_k from the other image whose corresponding i-th measurements M_{B_1}^i, ..., M_{B_k}^i are nearest to M_A^i are found, and a vote is cast suggesting correspondence of A with each of B_1, ..., B_k. Votes are summed over all measurements. In the current implementation, 216 invariants are used at each scale, i.e. a total of 864 measurements (i ∈ [1, 864]). The DRs with the largest number of votes become the candidates for tentative correspondences. Experimentally, we found that setting k to 1% of the number of regions gives good results. A probabilistic analysis of the likelihood of success of the procedure is not simple, since the distribution of the invariants and of their noise is image-dependent. We therefore only suppose that corrupted measurements spread their votes randomly, not conspiring to create a high score, and that good measurements are more likely to vote for correct matches.

Tentative correspondences using correlation. The invariant description is used as a preliminary test. The final selection of tentative correspondences is based on correlation. First, transformations that diagonalise the covariance matrices of the DRs are applied. The resulting circular regions are correlated (for all relative rotations). This procedure is carried out efficiently in polar coordinates for different sizes of circles.

Rough epipolar geometry (EG) is estimated by applying RANSAC to the centers of gravity of the DRs. Subsequently, the precision of the EG estimate is significantly improved by the following process.
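The voting scheme above can be sketched as follows (hypothetical toy data; k and the number of measurements are illustrative, unlike the 864 invariants and k = 1% of the regions used in the implementation):

```python
def tentative_correspondences(meas_a, meas_b, k=1):
    """meas_a[A][i], meas_b[B][i]: the i-th invariant measurement of
    region A (resp. B). For every region A and every measurement index i,
    the k regions B whose i-th measurement is nearest receive one vote;
    votes are summed over all measurements."""
    votes = {(a, b): 0 for a in meas_a for b in meas_b}
    n_meas = len(next(iter(meas_a.values())))
    for a, ma in meas_a.items():
        for i in range(n_meas):
            nearest = sorted(meas_b, key=lambda b: abs(meas_b[b][i] - ma[i]))[:k]
            for b in nearest:
                votes[(a, b)] += 1
    return votes

# Two regions in image 1, three in image 2. Region 'a0' matches 'b0' on
# most measurements, but its measurement 2 is corrupted (e.g. a depth
# discontinuity entered the measurement region).
meas_a = {'a0': [1.0, 5.0, 99.0], 'a1': [3.0, 8.0, 2.0]}
meas_b = {'b0': [1.1, 5.2, 7.0], 'b1': [3.2, 7.9, 2.1], 'b2': [9.0, 0.0, 99.5]}
v = tentative_correspondences(meas_a, meas_b)
# The single corrupted measurement cannot destroy the match: 'a0'-'b0'
# still collects the majority of 'a0's votes.
assert max(('b0', 'b1', 'b2'), key=lambda b: v[('a0', b)]) == 'b0'
```

A single nearest-neighbour test on the full Mahalanobis distance would be swayed by the one corrupted coordinate; summing per-measurement votes is what makes the similarity robust.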
First, an affine transformation between pairs of potentially corresponding DRs, i.e. the DRs consistent with the rough EG, is computed. A correspondence of covariance matrices defines an affine transformation up to a rotation; the rotation is determined from the epipolar lines. Next, the DR correspondences are pruned: only those whose transformed images have correlation above a threshold are retained. In the next step, RANSAC is applied again, but this time with a very narrow threshold. The final improvement of the EG is achieved by adding to the RANSAC inliers those DR pairs whose convex hull centres are EG-consistent. Commonly, DRs differ in minute details that render their centres of gravity inconsistent with the fine EG, while the centres of their convex hulls are precise enough. The precision of the final EG, estimated linearly by the eight-point algorithm (without bundle adjustment or radial distortion correction), is surprisingly high. The average distance of inliers from the epipolar line is below 0.1

Figure 7: BOOKSHELF: Estimated epipolar geometry on an indoor scene with a significant scale change. In the cutouts, the change in the resolution of the detected DRs is clearly visible.

pixel (see Table 9).

5.3 Experiments

The following experiments were conducted:

Bookshelf (Fig. 7). The BOOKSHELF scene tests performance under a very large scale change. The corresponding DRs in the left view are confined to a small part of the image, since the rest of the scene is not visible in the second view. The different resolution of the detected features is evident in the close-up.

Valbonne (Fig. 8). This outdoor scene has been analysed in the literature [25, 23]. Repetitive patterns such as bricks are present. The part of the scene visible in both views covers a small fraction of the image.

Wash (Fig. 9). Results on this image set have been presented in [35]. The camera undergoes significant translation and rotation. The ordering constraint is notably violated; objects appear on different backgrounds.

Kampa (Fig. 10) is an example of an urban outdoor scene. A relatively large fraction of the images is covered by changing sky. Repeating windows made matching difficult.

Cylindrical Box (Fig. 11, top and bottom left) shows a metal box on a textured

Figure 8: VALBONNE: The estimated epipolar geometry and the points associated with the matched regions are shown in the first row. The cutouts in the second row show matched bricks.

floor. The regions matched on the box demonstrate performance on a non-planar surface. A significant change of illumination and a strong specular reflection are present in the second image, which was taken with a flash (this strongly decreases the number of MSER+ regions).

Shout (Fig. 11, bottom right). This scene has been used in [35]. Since the spectral power distribution of the illumination and the positions of the light sources are significantly different, we included this test to demonstrate performance under variable illumination conditions.

Results are summarized in Tables 8 and 9. Table 8 shows the number of detected DRs in the left and right images for both types of DRs (MSER- and

Table 8: Number of DRs detected in the images (MSER-, MSER+) for the scenes Bookshelf, Valbonne, Wash, Kampa, Cyl. Box and Shout. The number of tentative correspondences is given in the TC column.

Figure 9: WASH: Epipolar geometry and densely matched regions with full affine distortion.

MSER+). The number of tentative correspondences is given in the last column of Table 8.

Table 9 shows the number of correspondences established in the different stages of the algorithm. Column TC repeats the number of tentative correspondences. Column "rough EG" displays the number of tentative correspondences consistent with the rough estimate of the epipolar geometry. The ratio of TC to "rough EG" determines the speed of the RANSAC algorithm. The column headed "EG + corr" gives the number of correspondences consistent with the rough EG that passed the correlation test. Notice that these numbers are much higher than those in the "rough EG" column. The final number of correspondences is given in the penultimate column, "fine EG". The average distances from the epipolar lines are presented in columns "rough d" and "fine d". We can see that the precision of the estimated epipolar geometry is very high, much higher than the precision of the rough EG. The last column shows the number of mismatches (found manually).

Figure 10: Estimated EG on an outdoor scene.

Table 9: Experimental results (columns TC, rough EG, rough d, EG + corr, fine EG, fine d, miss for the scenes Bookshelf, Valbonne, Wash, Kampa, Cyl. Box and Shout). For details see the text at the beginning of Section 5.3.

5.4 Conclusions

A new method for wide-baseline matching was proposed. The three main novelties are: the introduction of MSERs, robust matching of local features, and the use of multiple scaled measurement regions.

The MSERs are sets of image elements, closed under the affine transformation of image coordinates and invariant to affine transformation of intensity. An efficient (near-linear complexity) and practically fast detection algorithm was presented. The stability and high utility of MSERs were demonstrated experimentally.

Another novelty of the approach is the use of a robust similarity measure for establishing tentative correspondences. Due to the robustness, we were able to consider invariants from multiple measurement regions, even some that were significantly larger (and hence probably more discriminative) than the associated MSER.

Good estimates of epipolar geometry were obtained on challenging wide-baseline problems with the robustified matching algorithm operating on the output produced by the MSER detector. The average distance from corresponding points to the epipolar line was below 0.09 of the inter-pixel distance. A significant change of scale (3.5×), changed illumination conditions, out-of-plane rotation, occlusion, locally anisotropic scale change and 3D translation of the viewpoint are all present in the test problems. The test images included both outdoor and indoor scenes, some already used in published work.

In future work, we intend to proceed towards fully automatic projective reconstruction of the 3D scene, which requires computing a projective reconstruction and dense matching. Secondly, we will investigate the properties of robust similarity measures and their selection based on statistical properties of the data.

Figure 11: CYLINDRICAL BOX: Epipolar geometry (top) and matched regions (bottom left). Full affine distortion, a non-planar object, a textured surface and a strong specular reflection are present in the scene. SHOUT (bottom right), a scene with a change of the spectral power distribution of the illumination.

6 Conclusions and Thesis Proposal

This work presented two improvements to the RANSAC algorithm and a novel algorithm for the wide-baseline stereo correspondence problem. Parts of the work have already been presented in [2, 15, 13, 16, 12]. There remain open issues for further research. We would like to address some of the following topics:

Degenerate configurations in RANSAC. A degenerate configuration (DC) is a set of data points⁵ that is consistent with a whole family of model parameters. Examples of DCs are identical points for a line, or coplanar points for a fundamental matrix. In the presence of a significant DC, any model from the family defined by such a DC may have large support (set of inliers) and hence RANSAC may return an incorrect solution. The automatic detection of DCs and ways to deal with them can be studied.

Feedback in the correspondence problem. Once the model is hypothesized,

⁵ Interesting DCs are those with a number of data points greater than the minimal number of data points needed to calculate the model parameters uniquely.


More information

Structure from Motion. Introduction to Computer Vision CSE 152 Lecture 10

Structure from Motion. Introduction to Computer Vision CSE 152 Lecture 10 Structure from Motion CSE 152 Lecture 10 Announcements Homework 3 is due May 9, 11:59 PM Reading: Chapter 8: Structure from Motion Optional: Multiple View Geometry in Computer Vision, 2nd edition, Hartley

More information

Instance-level recognition part 2

Instance-level recognition part 2 Visual Recognition and Machine Learning Summer School Paris 2011 Instance-level recognition part 2 Josef Sivic http://www.di.ens.fr/~josef INRIA, WILLOW, ENS/INRIA/CNRS UMR 8548 Laboratoire d Informatique,

More information

Video Google: A Text Retrieval Approach to Object Matching in Videos

Video Google: A Text Retrieval Approach to Object Matching in Videos Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic, Frederik Schaffalitzky, Andrew Zisserman Visual Geometry Group University of Oxford The vision Enable video, e.g. a feature

More information

Chapter 3 Image Registration. Chapter 3 Image Registration

Chapter 3 Image Registration. Chapter 3 Image Registration Chapter 3 Image Registration Distributed Algorithms for Introduction (1) Definition: Image Registration Input: 2 images of the same scene but taken from different perspectives Goal: Identify transformation

More information

RANSAC RANdom SAmple Consensus

RANSAC RANdom SAmple Consensus Talk Outline importance for computer vision principle line fitting epipolar geometry estimation RANSAC RANdom SAmple Consensus Tomáš Svoboda, svoboda@cmp.felk.cvut.cz courtesy of Ondřej Chum, Jiří Matas

More information

Local Feature Detectors

Local Feature Detectors Local Feature Detectors Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Slides adapted from Cordelia Schmid and David Lowe, CVPR 2003 Tutorial, Matthew Brown,

More information

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy BSB663 Image Processing Pinar Duygulu Slides are adapted from Selim Aksoy Image matching Image matching is a fundamental aspect of many problems in computer vision. Object or scene recognition Solving

More information

Outline 7/2/201011/6/

Outline 7/2/201011/6/ Outline Pattern recognition in computer vision Background on the development of SIFT SIFT algorithm and some of its variations Computational considerations (SURF) Potential improvement Summary 01 2 Pattern

More information

Maximally Stable Extremal Regions and Local Geometry for Visual Correspondences

Maximally Stable Extremal Regions and Local Geometry for Visual Correspondences Maximally Stable Extremal Regions and Local Geometry for Visual Correspondences Michal Perďoch Supervisor: Jiří Matas Center for Machine Perception, Department of Cb Cybernetics Faculty of Electrical Engineering

More information

Segmentation and Tracking of Partial Planar Templates

Segmentation and Tracking of Partial Planar Templates Segmentation and Tracking of Partial Planar Templates Abdelsalam Masoud William Hoff Colorado School of Mines Colorado School of Mines Golden, CO 800 Golden, CO 800 amasoud@mines.edu whoff@mines.edu Abstract

More information

COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION

COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION Mr.V.SRINIVASA RAO 1 Prof.A.SATYA KALYAN 2 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING PRASAD V POTLURI SIDDHARTHA

More information

Instance-level recognition II.

Instance-level recognition II. Reconnaissance d objets et vision artificielle 2010 Instance-level recognition II. Josef Sivic http://www.di.ens.fr/~josef INRIA, WILLOW, ENS/INRIA/CNRS UMR 8548 Laboratoire d Informatique, Ecole Normale

More information

arxiv: v1 [cs.cv] 28 Sep 2018

arxiv: v1 [cs.cv] 28 Sep 2018 Camera Pose Estimation from Sequence of Calibrated Images arxiv:1809.11066v1 [cs.cv] 28 Sep 2018 Jacek Komorowski 1 and Przemyslaw Rokita 2 1 Maria Curie-Sklodowska University, Institute of Computer Science,

More information

Automatic Image Alignment

Automatic Image Alignment Automatic Image Alignment Mike Nese with a lot of slides stolen from Steve Seitz and Rick Szeliski 15-463: Computational Photography Alexei Efros, CMU, Fall 2010 Live Homography DEMO Check out panoramio.com

More information

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT SIFT: Scale Invariant Feature Transform; transform image

More information

Multi-modal Registration of Visual Data. Massimiliano Corsini Visual Computing Lab, ISTI - CNR - Italy

Multi-modal Registration of Visual Data. Massimiliano Corsini Visual Computing Lab, ISTI - CNR - Italy Multi-modal Registration of Visual Data Massimiliano Corsini Visual Computing Lab, ISTI - CNR - Italy Overview Introduction and Background Features Detection and Description (2D case) Features Detection

More information

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit Augmented Reality VU Computer Vision 3D Registration (2) Prof. Vincent Lepetit Feature Point-Based 3D Tracking Feature Points for 3D Tracking Much less ambiguous than edges; Point-to-point reprojection

More information

Automatic Image Alignment

Automatic Image Alignment Automatic Image Alignment with a lot of slides stolen from Steve Seitz and Rick Szeliski Mike Nese CS194: Image Manipulation & Computational Photography Alexei Efros, UC Berkeley, Fall 2018 Live Homography

More information

Global localization from a single feature correspondence

Global localization from a single feature correspondence Global localization from a single feature correspondence Friedrich Fraundorfer and Horst Bischof Institute for Computer Graphics and Vision Graz University of Technology {fraunfri,bischof}@icg.tu-graz.ac.at

More information

Tracking in image sequences

Tracking in image sequences CENTER FOR MACHINE PERCEPTION CZECH TECHNICAL UNIVERSITY Tracking in image sequences Lecture notes for the course Computer Vision Methods Tomáš Svoboda svobodat@fel.cvut.cz March 23, 2011 Lecture notes

More information

The SIFT (Scale Invariant Feature

The SIFT (Scale Invariant Feature The SIFT (Scale Invariant Feature Transform) Detector and Descriptor developed by David Lowe University of British Columbia Initial paper ICCV 1999 Newer journal paper IJCV 2004 Review: Matt Brown s Canonical

More information

RANSAC: RANdom Sampling And Consensus

RANSAC: RANdom Sampling And Consensus CS231-M RANSAC: RANdom Sampling And Consensus Roland Angst rangst@stanford.edu www.stanford.edu/~rangst CS231-M 2014-04-30 1 The Need for RANSAC Why do I need RANSAC? I know robust statistics! Robust Statistics

More information

Stereo Vision. MAN-522 Computer Vision

Stereo Vision. MAN-522 Computer Vision Stereo Vision MAN-522 Computer Vision What is the goal of stereo vision? The recovery of the 3D structure of a scene using two or more images of the 3D scene, each acquired from a different viewpoint in

More information

Accelerating Pattern Matching or HowMuchCanYouSlide?

Accelerating Pattern Matching or HowMuchCanYouSlide? Accelerating Pattern Matching or HowMuchCanYouSlide? Ofir Pele and Michael Werman School of Computer Science and Engineering The Hebrew University of Jerusalem {ofirpele,werman}@cs.huji.ac.il Abstract.

More information

Image Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58

Image Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58 Image Features: Local Descriptors Sanja Fidler CSC420: Intro to Image Understanding 1/ 58 [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 2/ 58 Local Features Detection: Identify

More information

School of Computing University of Utah

School of Computing University of Utah School of Computing University of Utah Presentation Outline 1 2 3 4 Main paper to be discussed David G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, IJCV, 2004. How to find useful keypoints?

More information

Fast Outlier Rejection by Using Parallax-Based Rigidity Constraint for Epipolar Geometry Estimation

Fast Outlier Rejection by Using Parallax-Based Rigidity Constraint for Epipolar Geometry Estimation Fast Outlier Rejection by Using Parallax-Based Rigidity Constraint for Epipolar Geometry Estimation Engin Tola 1 and A. Aydın Alatan 2 1 Computer Vision Laboratory, Ecóle Polytechnique Fédéral de Lausanne

More information

Specular 3D Object Tracking by View Generative Learning

Specular 3D Object Tracking by View Generative Learning Specular 3D Object Tracking by View Generative Learning Yukiko Shinozuka, Francois de Sorbier and Hideo Saito Keio University 3-14-1 Hiyoshi, Kohoku-ku 223-8522 Yokohama, Japan shinozuka@hvrl.ics.keio.ac.jp

More information

Robust Geometry Estimation from two Images

Robust Geometry Estimation from two Images Robust Geometry Estimation from two Images Carsten Rother 09/12/2016 Computer Vision I: Image Formation Process Roadmap for next four lectures Computer Vision I: Image Formation Process 09/12/2016 2 Appearance-based

More information

Nonparametric estimation of multiple structures with outliers

Nonparametric estimation of multiple structures with outliers Nonparametric estimation of multiple structures with outliers Wei Zhang and Jana Kosecka Department of Computer Science, George Mason University, 44 University Dr. Fairfax, VA 223 USA {wzhang2,kosecka}@cs.gmu.edu

More information

Feature Detectors and Descriptors: Corners, Lines, etc.

Feature Detectors and Descriptors: Corners, Lines, etc. Feature Detectors and Descriptors: Corners, Lines, etc. Edges vs. Corners Edges = maxima in intensity gradient Edges vs. Corners Corners = lots of variation in direction of gradient in a small neighborhood

More information

Structure Guided Salient Region Detector

Structure Guided Salient Region Detector Structure Guided Salient Region Detector Shufei Fan, Frank Ferrie Center for Intelligent Machines McGill University Montréal H3A2A7, Canada Abstract This paper presents a novel method for detection of

More information

THE image based localization problem as considered in

THE image based localization problem as considered in JOURNAL OF L A TEX CLASS FILES, VOL. 6, NO. 1, JANUARY 2007 1 Image Based Localization Wei Zhang, Member, IEEE, and Jana Košecká Member, IEEE, Abstract In this paper we present an approach for image based

More information

A Comparison of SIFT, PCA-SIFT and SURF

A Comparison of SIFT, PCA-SIFT and SURF A Comparison of SIFT, PCA-SIFT and SURF Luo Juan Computer Graphics Lab, Chonbuk National University, Jeonju 561-756, South Korea qiuhehappy@hotmail.com Oubong Gwun Computer Graphics Lab, Chonbuk National

More information

Computer Vision I - Filtering and Feature detection

Computer Vision I - Filtering and Feature detection Computer Vision I - Filtering and Feature detection Carsten Rother 30/10/2015 Computer Vision I: Basics of Image Processing Roadmap: Basics of Digital Image Processing Computer Vision I: Basics of Image

More information

Accurate Image Registration from Local Phase Information

Accurate Image Registration from Local Phase Information Accurate Image Registration from Local Phase Information Himanshu Arora, Anoop M. Namboodiri, and C.V. Jawahar Center for Visual Information Technology, IIIT, Hyderabad, India { himanshu@research., anoop@,

More information

Nonparametric estimation of multiple structures with outliers

Nonparametric estimation of multiple structures with outliers Nonparametric estimation of multiple structures with outliers Wei Zhang and Jana Kosecka George Mason University, 44 University Dr. Fairfax, VA 223 USA Abstract. Common problem encountered in the analysis

More information

calibrated coordinates Linear transformation pixel coordinates

calibrated coordinates Linear transformation pixel coordinates 1 calibrated coordinates Linear transformation pixel coordinates 2 Calibration with a rig Uncalibrated epipolar geometry Ambiguities in image formation Stratified reconstruction Autocalibration with partial

More information

Automatic Image Alignment (feature-based)

Automatic Image Alignment (feature-based) Automatic Image Alignment (feature-based) Mike Nese with a lot of slides stolen from Steve Seitz and Rick Szeliski 15-463: Computational Photography Alexei Efros, CMU, Fall 2006 Today s lecture Feature

More information

CS 664 Image Matching and Robust Fitting. Daniel Huttenlocher

CS 664 Image Matching and Robust Fitting. Daniel Huttenlocher CS 664 Image Matching and Robust Fitting Daniel Huttenlocher Matching and Fitting Recognition and matching are closely related to fitting problems Parametric fitting can serve as more restricted domain

More information

Edge and corner detection

Edge and corner detection Edge and corner detection Prof. Stricker Doz. G. Bleser Computer Vision: Object and People Tracking Goals Where is the information in an image? How is an object characterized? How can I find measurements

More information

Automatic estimation of the inlier threshold in robust multiple structures fitting.

Automatic estimation of the inlier threshold in robust multiple structures fitting. Automatic estimation of the inlier threshold in robust multiple structures fitting. Roberto Toldo and Andrea Fusiello Dipartimento di Informatica, Università di Verona Strada Le Grazie, 3734 Verona, Italy

More information

Feature Transfer and Matching in Disparate Stereo Views through the use of Plane Homographies

Feature Transfer and Matching in Disparate Stereo Views through the use of Plane Homographies Feature Transfer and Matching in Disparate Stereo Views through the use of Plane Homographies M. Lourakis, S. Tzurbakis, A. Argyros, S. Orphanoudakis Computer Vision and Robotics Lab (CVRL) Institute of

More information

Determinant of homography-matrix-based multiple-object recognition

Determinant of homography-matrix-based multiple-object recognition Determinant of homography-matrix-based multiple-object recognition 1 Nagachetan Bangalore, Madhu Kiran, Anil Suryaprakash Visio Ingenii Limited F2-F3 Maxet House Liverpool Road Luton, LU1 1RS United Kingdom

More information

Factorization with Missing and Noisy Data

Factorization with Missing and Noisy Data Factorization with Missing and Noisy Data Carme Julià, Angel Sappa, Felipe Lumbreras, Joan Serrat, and Antonio López Computer Vision Center and Computer Science Department, Universitat Autònoma de Barcelona,

More information

Fitting. Fitting. Slides S. Lazebnik Harris Corners Pkwy, Charlotte, NC

Fitting. Fitting. Slides S. Lazebnik Harris Corners Pkwy, Charlotte, NC Fitting We ve learned how to detect edges, corners, blobs. Now what? We would like to form a higher-level, more compact representation of the features in the image by grouping multiple features according

More information

Homographies and RANSAC

Homographies and RANSAC Homographies and RANSAC Computer vision 6.869 Bill Freeman and Antonio Torralba March 30, 2011 Homographies and RANSAC Homographies RANSAC Building panoramas Phototourism 2 Depth-based ambiguity of position

More information

Vision par ordinateur

Vision par ordinateur Epipolar geometry π Vision par ordinateur Underlying structure in set of matches for rigid scenes l T 1 l 2 C1 m1 l1 e1 M L2 L1 e2 Géométrie épipolaire Fundamental matrix (x rank 2 matrix) m2 C2 l2 Frédéric

More information

CAP 5415 Computer Vision Fall 2012

CAP 5415 Computer Vision Fall 2012 CAP 5415 Computer Vision Fall 01 Dr. Mubarak Shah Univ. of Central Florida Office 47-F HEC Lecture-5 SIFT: David Lowe, UBC SIFT - Key Point Extraction Stands for scale invariant feature transform Patented

More information

Wide Baseline Matching using Triplet Vector Descriptor

Wide Baseline Matching using Triplet Vector Descriptor 1 Wide Baseline Matching using Triplet Vector Descriptor Yasushi Kanazawa Koki Uemura Department of Knowledge-based Information Engineering Toyohashi University of Technology, Toyohashi 441-8580, JAPAN

More information

Step-by-Step Model Buidling

Step-by-Step Model Buidling Step-by-Step Model Buidling Review Feature selection Feature selection Feature correspondence Camera Calibration Euclidean Reconstruction Landing Augmented Reality Vision Based Control Sparse Structure

More information

A NEW FEATURE BASED IMAGE REGISTRATION ALGORITHM INTRODUCTION

A NEW FEATURE BASED IMAGE REGISTRATION ALGORITHM INTRODUCTION A NEW FEATURE BASED IMAGE REGISTRATION ALGORITHM Karthik Krish Stuart Heinrich Wesley E. Snyder Halil Cakir Siamak Khorram North Carolina State University Raleigh, 27695 kkrish@ncsu.edu sbheinri@ncsu.edu

More information

Instance-level recognition

Instance-level recognition Instance-level recognition 1) Local invariant features 2) Matching and recognition with local features 3) Efficient visual search 4) Very large scale indexing Matching of descriptors Matching and 3D reconstruction

More information

Feature Detection. Raul Queiroz Feitosa. 3/30/2017 Feature Detection 1

Feature Detection. Raul Queiroz Feitosa. 3/30/2017 Feature Detection 1 Feature Detection Raul Queiroz Feitosa 3/30/2017 Feature Detection 1 Objetive This chapter discusses the correspondence problem and presents approaches to solve it. 3/30/2017 Feature Detection 2 Outline

More information

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Structured Light II Johannes Köhler Johannes.koehler@dfki.de Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Introduction Previous lecture: Structured Light I Active Scanning Camera/emitter

More information

arxiv: v1 [cs.cv] 28 Sep 2018

arxiv: v1 [cs.cv] 28 Sep 2018 Extrinsic camera calibration method and its performance evaluation Jacek Komorowski 1 and Przemyslaw Rokita 2 arxiv:1809.11073v1 [cs.cv] 28 Sep 2018 1 Maria Curie Sklodowska University Lublin, Poland jacek.komorowski@gmail.com

More information

10/03/11. Model Fitting. Computer Vision CS 143, Brown. James Hays. Slides from Silvio Savarese, Svetlana Lazebnik, and Derek Hoiem

10/03/11. Model Fitting. Computer Vision CS 143, Brown. James Hays. Slides from Silvio Savarese, Svetlana Lazebnik, and Derek Hoiem 10/03/11 Model Fitting Computer Vision CS 143, Brown James Hays Slides from Silvio Savarese, Svetlana Lazebnik, and Derek Hoiem Fitting: find the parameters of a model that best fit the data Alignment:

More information

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm Group 1: Mina A. Makar Stanford University mamakar@stanford.edu Abstract In this report, we investigate the application of the Scale-Invariant

More information

Instance-level recognition

Instance-level recognition Instance-level recognition 1) Local invariant features 2) Matching and recognition with local features 3) Efficient visual search 4) Very large scale indexing Matching of descriptors Matching and 3D reconstruction

More information

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi hrazvi@stanford.edu 1 Introduction: We present a method for discovering visual hierarchy in a set of images. Automatically grouping

More information

Agenda. Rotations. Camera calibration. Homography. Ransac

Agenda. Rotations. Camera calibration. Homography. Ransac Agenda Rotations Camera calibration Homography Ransac Geometric Transformations y x Transformation Matrix # DoF Preserves Icon translation rigid (Euclidean) similarity affine projective h I t h R t h sr

More information

CS 558: Computer Vision 4 th Set of Notes

CS 558: Computer Vision 4 th Set of Notes 1 CS 558: Computer Vision 4 th Set of Notes Instructor: Philippos Mordohai Webpage: www.cs.stevens.edu/~mordohai E-mail: Philippos.Mordohai@stevens.edu Office: Lieb 215 Overview Keypoint matching Hessian

More information

Motion Tracking and Event Understanding in Video Sequences

Motion Tracking and Event Understanding in Video Sequences Motion Tracking and Event Understanding in Video Sequences Isaac Cohen Elaine Kang, Jinman Kang Institute for Robotics and Intelligent Systems University of Southern California Los Angeles, CA Objectives!

More information

Image matching. Announcements. Harder case. Even harder case. Project 1 Out today Help session at the end of class. by Diva Sian.

Image matching. Announcements. Harder case. Even harder case. Project 1 Out today Help session at the end of class. by Diva Sian. Announcements Project 1 Out today Help session at the end of class Image matching by Diva Sian by swashford Harder case Even harder case How the Afghan Girl was Identified by Her Iris Patterns Read the

More information

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale.

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale. Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe presented by, Sudheendra Invariance Intensity Scale Rotation Affine View point Introduction Introduction SIFT (Scale Invariant Feature

More information

1 (5 max) 2 (10 max) 3 (20 max) 4 (30 max) 5 (10 max) 6 (15 extra max) total (75 max + 15 extra)

1 (5 max) 2 (10 max) 3 (20 max) 4 (30 max) 5 (10 max) 6 (15 extra max) total (75 max + 15 extra) Mierm Exam CS223b Stanford CS223b Computer Vision, Winter 2004 Feb. 18, 2004 Full Name: Email: This exam has 7 pages. Make sure your exam is not missing any sheets, and write your name on every page. The

More information

CS 231A Computer Vision (Winter 2014) Problem Set 3

CS 231A Computer Vision (Winter 2014) Problem Set 3 CS 231A Computer Vision (Winter 2014) Problem Set 3 Due: Feb. 18 th, 2015 (11:59pm) 1 Single Object Recognition Via SIFT (45 points) In his 2004 SIFT paper, David Lowe demonstrates impressive object recognition

More information

3D Modeling using multiple images Exam January 2008

3D Modeling using multiple images Exam January 2008 3D Modeling using multiple images Exam January 2008 All documents are allowed. Answers should be justified. The different sections below are independant. 1 3D Reconstruction A Robust Approche Consider

More information

CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION

CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION 6.1 INTRODUCTION Fuzzy logic based computational techniques are becoming increasingly important in the medical image analysis arena. The significant

More information

Harder case. Image matching. Even harder case. Harder still? by Diva Sian. by swashford

Harder case. Image matching. Even harder case. Harder still? by Diva Sian. by swashford Image matching Harder case by Diva Sian by Diva Sian by scgbt by swashford Even harder case Harder still? How the Afghan Girl was Identified by Her Iris Patterns Read the story NASA Mars Rover images Answer

More information

Implementing the Scale Invariant Feature Transform(SIFT) Method

Implementing the Scale Invariant Feature Transform(SIFT) Method Implementing the Scale Invariant Feature Transform(SIFT) Method YU MENG and Dr. Bernard Tiddeman(supervisor) Department of Computer Science University of St. Andrews yumeng@dcs.st-and.ac.uk Abstract The

More information

Occluded Facial Expression Tracking

Occluded Facial Expression Tracking Occluded Facial Expression Tracking Hugo Mercier 1, Julien Peyras 2, and Patrice Dalle 1 1 Institut de Recherche en Informatique de Toulouse 118, route de Narbonne, F-31062 Toulouse Cedex 9 2 Dipartimento

More information

Lecture 3.3 Robust estimation with RANSAC. Thomas Opsahl

Lecture 3.3 Robust estimation with RANSAC. Thomas Opsahl Lecture 3.3 Robust estimation with RANSAC Thomas Opsahl Motivation If two perspective cameras captures an image of a planar scene, their images are related by a homography HH 2 Motivation If two perspective

More information

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University. 3D Computer Vision Structured Light II Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Introduction

More information

ROBUST LINE-BASED CALIBRATION OF LENS DISTORTION FROM A SINGLE VIEW

ROBUST LINE-BASED CALIBRATION OF LENS DISTORTION FROM A SINGLE VIEW ROBUST LINE-BASED CALIBRATION OF LENS DISTORTION FROM A SINGLE VIEW Thorsten Thormählen, Hellward Broszio, Ingolf Wassermann thormae@tnt.uni-hannover.de University of Hannover, Information Technology Laboratory,

More information

A Novel Algorithm for Color Image matching using Wavelet-SIFT

A Novel Algorithm for Color Image matching using Wavelet-SIFT International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 A Novel Algorithm for Color Image matching using Wavelet-SIFT Mupuri Prasanth Babu *, P. Ravi Shankar **

More information

Automated Scene Matching in Movies

Automated Scene Matching in Movies Automated Scene Matching in Movies F. Schaffalitzky and A. Zisserman Robotics Research Group Department of Engineering Science University of Oxford Oxford, OX1 3PJ fsm,az @robots.ox.ac.uk Abstract. We

More information

Bagging for One-Class Learning

Bagging for One-Class Learning Bagging for One-Class Learning David Kamm December 13, 2008 1 Introduction Consider the following outlier detection problem: suppose you are given an unlabeled data set and make the assumptions that one

More information

Multiple Model Estimation : The EM Algorithm & Applications

Multiple Model Estimation : The EM Algorithm & Applications Multiple Model Estimation : The EM Algorithm & Applications Princeton University COS 429 Lecture Nov. 13, 2007 Harpreet S. Sawhney hsawhney@sarnoff.com Recapitulation Problem of motion estimation Parametric

More information

III. VERVIEW OF THE METHODS

III. VERVIEW OF THE METHODS An Analytical Study of SIFT and SURF in Image Registration Vivek Kumar Gupta, Kanchan Cecil Department of Electronics & Telecommunication, Jabalpur engineering college, Jabalpur, India comparing the distance

More information

IMPACT OF SUBPIXEL PARADIGM ON DETERMINATION OF 3D POSITION FROM 2D IMAGE PAIR Lukas Sroba, Rudolf Ravas

IMPACT OF SUBPIXEL PARADIGM ON DETERMINATION OF 3D POSITION FROM 2D IMAGE PAIR Lukas Sroba, Rudolf Ravas 162 International Journal "Information Content and Processing", Volume 1, Number 2, 2014 IMPACT OF SUBPIXEL PARADIGM ON DETERMINATION OF 3D POSITION FROM 2D IMAGE PAIR Lukas Sroba, Rudolf Ravas Abstract:

More information

Final Exam Study Guide

Final Exam Study Guide Final Exam Study Guide Exam Window: 28th April, 12:00am EST to 30th April, 11:59pm EST Description As indicated in class the goal of the exam is to encourage you to review the material from the course.

More information

Week 2: Two-View Geometry. Padua Summer 08 Frank Dellaert

Week 2: Two-View Geometry. Padua Summer 08 Frank Dellaert Week 2: Two-View Geometry Padua Summer 08 Frank Dellaert Mosaicking Outline 2D Transformation Hierarchy RANSAC Triangulation of 3D Points Cameras Triangulation via SVD Automatic Correspondence Essential

More information

CHAPTER 9. Classification Scheme Using Modified Photometric. Stereo and 2D Spectra Comparison

CHAPTER 9. Classification Scheme Using Modified Photometric. Stereo and 2D Spectra Comparison CHAPTER 9 Classification Scheme Using Modified Photometric Stereo and 2D Spectra Comparison 9.1. Introduction In Chapter 8, even we combine more feature spaces and more feature generators, we note that

More information

Using Edge Detection in Machine Vision Gauging Applications

Using Edge Detection in Machine Vision Gauging Applications Application Note 125 Using Edge Detection in Machine Vision Gauging Applications John Hanks Introduction This application note introduces common edge-detection software strategies for applications such

More information

Chapter 9 Object Tracking an Overview

Chapter 9 Object Tracking an Overview Chapter 9 Object Tracking an Overview The output of the background subtraction algorithm, described in the previous chapter, is a classification (segmentation) of pixels into foreground pixels (those belonging

More information

This chapter explains two techniques which are frequently used throughout

This chapter explains two techniques which are frequently used throughout Chapter 2 Basic Techniques This chapter explains two techniques which are frequently used throughout this thesis. First, we will introduce the concept of particle filters. A particle filter is a recursive

More information

Two-view geometry Computer Vision Spring 2018, Lecture 10

Two-view geometry Computer Vision Spring 2018, Lecture 10 Two-view geometry http://www.cs.cmu.edu/~16385/ 16-385 Computer Vision Spring 2018, Lecture 10 Course announcements Homework 2 is due on February 23 rd. - Any questions about the homework? - How many of

More information