Perception IV: Place Recognition, Line Extraction

1 Perception IV: Place Recognition, Line Extraction. Davide Scaramuzza, University of Zurich. Margarita Chli, Paul Furgale, Marco Hutter, Roland Siegwart

2 Outline of Today's lecture: Place recognition using Vocabulary Tree; Line extraction from images; Line extraction from laser data

3 K-means clustering - Review Minimizes the sum of squared Euclidean distances between points $x_i$ and their nearest cluster centers $m_k$: $D(X, M) = \sum_{\text{cluster } k} \sum_{\text{point } i \text{ in cluster } k} (x_i - m_k)^2$ Algorithm: Randomly initialize K cluster centers. Iterate until convergence: assign each data point to the nearest center; recompute each cluster center as the mean of all points assigned to it.
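The two alternating steps above can be sketched in a few lines of Python. This is a minimal illustration for 2D points, not the lecture's own code: the function name `kmeans`, the `n_iters` cap, the fixed `seed`, and the choice to keep the center of an empty cluster unchanged are all assumptions made for the example.

```python
import random

def kmeans(points, k, n_iters=100, seed=0):
    """Cluster 2D points (list of (x, y) tuples) into k groups."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialization from the data

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance to p.
        return min(range(k),
                   key=lambda j: (p[0] - centers[j][0]) ** 2
                               + (p[1] - centers[j][1]) ** 2)

    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest center.
        labels = [nearest(p) for p in points]
        # Update step: each center becomes the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[j])  # assumption: keep empty cluster's center
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    labels = [nearest(p) for p in points]
    return centers, labels

blob_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
blob_b = [(10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = kmeans(blob_a + blob_b, k=2)
```

In the vocabulary-building context of this lecture the points would be high-dimensional descriptors rather than 2D tuples; the structure of the loop stays the same.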

4 K-means clustering - Demo Source:

5 Feature-based object recognition - Review Q: Is this Book present in the Scene? Look for corresponding matches. Most of the Book's keypoints are present in the Scene. A: The Book is present in the Scene

6 Taking this a step further: Find an object in an image? Find an object in multiple images? Find multiple objects in multiple images? As the number of images increases, feature-based object recognition becomes computationally infeasible.

7 Fast visual search Query in a database of 110 million images in 5.8 seconds Video Google, Sivic and Zisserman, ICCV 2003 Scalable Recognition with a Vocabulary Tree, Nister and Stewenius, CVPR 2006.

8 How much is 110 million images? Slide

11 Bag of Words Extension to scene/place recognition: Is this image in my database? Robot: Have I been to this place before? Use analogies from text retrieval: Visual Words Vocabulary of Visual Words Bag of Words (BOW) approach

12 Indexing local features With potentially thousands of features per image, and hundreds to millions of images to search, how can we efficiently find those that are relevant to a new image? Quantize/cluster the descriptors into "visual words". Use inverted file indexing schemes.

13 Indexing local features: inverted file text For text documents, an efficient way to find all pages on which a word occurs is to use an index. We want to find all images in which a feature occurs. To use this idea, we'll need to map our features to visual words.

14 Building the Visual Vocabulary Extract Features Image Collection Cluster Descriptors Descriptors space Examples of Visual Words:

15 Video Google These features map to the same visual word Video Google [J.Sivic and A. Zisserman, ICCV 2003] Demo:

16 Video Google System 1. Collect all words within query region 2. Inverted file index to find relevant frames 3. Compare word counts 4. Spatial verification Query region Sivic & Zisserman, ICCV 2003 Demo online at : Retrieved frames

17 Efficient Place/Object Recognition We can describe a scene as a collection of words and look up in the database for images with a similar collection of words What if we need to find an object/scene in a database of millions of images? Build Vocabulary Tree via hierarchical clustering Use the Inverted File system for efficient indexing [Nister and Stewénius, CVPR 2006]

18 Recognition with K-tree Populate the descriptor space

36 Building the inverted file index

41 Inverted File index The inverted file DB lists all possible visual words. Each word points to a list of the images in which this word occurs. Voting array: has as many cells as there are images in the DB; each word in the query image votes for the images in which it occurs. [Figure: query image Q, the visual words in Q, the inverted file DB (visual word and the list of images that this word appears in), and the voting array for Q]
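The inverted file and voting array described above can be sketched as follows. This is a toy illustration under stated assumptions: visual words are plain integer ids, the database is a dictionary from image id to its word list, each distinct query word casts one vote per image, and the names `build_inverted_index` and `vote` are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(database):
    """database: dict mapping image_id -> iterable of visual-word ids."""
    index = defaultdict(set)
    for image_id, words in database.items():
        for word in words:
            index[word].add(image_id)  # word -> images in which it occurs
    return index

def vote(index, query_words, image_ids):
    """Return the voting array: one cell per database image."""
    votes = {image_id: 0 for image_id in image_ids}
    for word in set(query_words):           # assumption: one vote per distinct word
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    return votes

database = {"img0": [1, 2, 3], "img1": [3, 4], "img2": [5]}
index = build_inverted_index(database)
votes = vote(index, query_words=[3, 4, 9], image_ids=database.keys())
best_match = max(votes, key=votes.get)
```

Only the images listed under the query's words are ever touched, which is what keeps the lookup cheap when the database is large. Real systems typically also weight the votes (for example by word frequency), which this sketch leaves out.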

42 FABMAP [Cummins and Newman IJRR 2011] Place recognition for robot localization Use training images to build the BoW database Probabilistic model of the world At a new frame, compute: P(being at a known place) P(being at a new place) Captures the dependencies of words to distinguish the most characteristic structure of each scene (using the Chow-Liu tree) Binaries available online

43 FABMAP example robots.ox.ac.uk/~mjc/appearance_based_results.htm p = probability of images coming from the same place

45 Robust object/scene recognition Visual Vocabulary holds appearance information but discards the spatial relationships between features Two images with the same features shuffled around in the image will be a 100% match when using only appearance information. If different arrangements of the same features are expected then one might use geometric verification Test the k most similar images to the query image for geometric consistency (e.g. using RANSAC) Further reading (out of the scope of this course): [Cummins and Newman, IJRR 2011] [Stewénius et al, ECCV 2012]

46 Outline of Today's lecture: Place recognition using Vocabulary Tree; Line extraction from images; Line extraction from laser data

47 Line extraction Suppose that you have been commissioned to implement lane detection for a car driving-assistance system. How would you proceed?

48 Line extraction How do we extract lines from edges?

49 Two popular line extraction algorithms Hough transform (used also to detect circles, ellipses, and any sort of shape) RANSAC (Random Sample Consensus)

50 Hough-Transform Finds lines from a binary edge image using a voting procedure. The voting space (or accumulator) is called Hough space. References: P. Hough, Machine Analysis of Bubble Chamber Pictures, Proc. Int. Conf. High Energy Accelerators and Instrumentation. R. O. Duda and P. E. Hart (April 1971), "Use of the Hough Transformation to Detect Lines and Curves in Pictures", Artificial Intelligence Center (SRI International).

51 Hough-Transform Let $(x_0, y_0)$ be an image point. We can represent all the lines passing through it by $y_0 = m x_0 + b$. The Hough transform works by parameterizing this expression in terms of m and b: $b = -x_0 m + y_0$. This is represented by a line in the Hough space: every point votes for a line in the Hough space.

52 Hough-Transform A second image point $(x_1, y_1)$ likewise votes for the line $b = -x_1 m + y_1$ in the Hough space.

53 Hough-Transform How do we determine the line (b, m) that contains both $(x_0, y_0)$ and $(x_1, y_1)$? It is the intersection of the lines $b = -x_0 m + y_0$ and $b = -x_1 m + y_1$ in the Hough space.

54 Hough-Transform Each point in image space votes for line parameters in the Hough parameter space.

55 Hough-Transform Problems with the (m, b) space: Unbounded parameter domain: m, b can assume any value in $[-\infty, +\infty]$

56 Hough-Transform Alternative line representation: polar representation $\rho = x \cos\theta + y \sin\theta$

57 Hough-Transform Each point in image space maps to a sinusoid in the parameter space $(\rho, \theta)$: $\rho = x \cos\theta + y \sin\theta$ [Figure: an image point and the sinusoid it traces in $(\rho, \theta)$ space]

59 Hough-Transform
1. Initialize: set all accumulator cells to zero
2. for each edge point (x, y) in the image
       for all θ in [0 : step : 180]
           Compute ρ = x cos θ + y sin θ
           H(θ, ρ) = H(θ, ρ) + 1
       end
   end
3. Find the values of (θ, ρ) where H(θ, ρ) is a local maximum
4. The detected line in the image is given by: ρ = x cos θ + y sin θ
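A minimal pure-Python sketch of this voting procedure is given below. Several details are assumptions made for the example rather than part of the slide: θ runs over [0, 180) in whole degrees (θ = 180 describes the same lines as θ = 0 with ρ negated), ρ is quantized to bins of width `rho_step` over [-rho_max, rho_max], and only the single global maximum of the accumulator is returned instead of all local maxima. The names `hough_lines` and `strongest_line` are hypothetical.

```python
import math

def hough_lines(edge_points, rho_max, theta_step_deg=1, rho_step=1.0):
    """Fill the accumulator H(theta, rho) from a list of (x, y) edge points."""
    thetas = [math.radians(t) for t in range(0, 180, theta_step_deg)]
    n_rho = int(2 * rho_max / rho_step) + 1
    acc = [[0] * n_rho for _ in thetas]          # step 1: all cells zero
    for x, y in edge_points:                     # step 2: every point votes
        for i, theta in enumerate(thetas):
            rho = x * math.cos(theta) + y * math.sin(theta)
            j = int(round((rho + rho_max) / rho_step))  # shift so the index is >= 0
            if 0 <= j < n_rho:
                acc[i][j] += 1
    return acc, thetas

def strongest_line(acc, thetas, rho_max, rho_step=1.0):
    """Return (theta in degrees, rho) of the cell with the most votes."""
    cells = ((i, j) for i in range(len(thetas)) for j in range(len(acc[0])))
    best_i, best_j = max(cells, key=lambda ij: acc[ij[0]][ij[1]])
    return math.degrees(thetas[best_i]), best_j * rho_step - rho_max

# Edge points on the horizontal line y = 5, i.e. theta = 90 degrees, rho = 5.
points = [(x, 5) for x in range(100)]
acc, thetas = hough_lines(points, rho_max=150)
theta_deg, rho = strongest_line(acc, thetas, rho_max=150)
```

For images with several lines, step 3 would be replaced by a search for local maxima above a vote threshold.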

60 Examples

61 Examples Hough Transform Notice, however, that the Hough transform only finds the parameters of the line, not its endpoints. How do you find them?

64 Problems Effects of noise: peaks get fuzzy and hard to locate. How to overcome this? Increase the bin size (i.e., decrease the resolution of the Hough space); however, this reduces the accuracy of the line parameters. Convolve the output with a box filter; why?

65 RANSAC (RAndom SAmple Consensus) It has become the standard method for model fitting in the presence of outliers (very noisy points or wrong data). It can be applied to line fitting but also to thousands of different problems where the goal is to estimate the parameters of a model from noisy data (e.g., camera calibration, structure from motion, DLT, homography, etc.). Let's now focus on line extraction. M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Graphics and Image Processing, 24(6): , 1981.

66 RANSAC 1. Select a sample of 2 points at random 2. Calculate the model parameters that fit the data in the sample 3. Calculate the error function for each data point 4. Select the data that supports the current hypothesis 5. Repeat sampling

73 RANSAC The solution is the set with the maximum number of inliers obtained after k iterations

74 RANSAC How many iterations does RANSAC need? Ideally: check all possible combinations of 2 points in a dataset of N points. The number of all pairwise combinations is N(N-1)/2, which is computationally infeasible if N is too large. Example: with 10000 edge points we would need to check all 10000*9999/2 ≈ 50 million combinations! Do we really need to check all combinations or can we stop after some iterations? Checking a subset of combinations is enough if we have a rough estimate of the percentage of inliers in our dataset. This can be done in a probabilistic way.

75 RANSAC How many iterations does RANSAC need? Let N := total number of data points and w := fraction of inliers in the dataset = (number of inliers)/N, so that w = P(selecting an inlier point out of the dataset). Let p := P(selecting a set of points free of outliers). Assumption: the 2 points necessary to estimate a line are selected independently. Then $w^2$ = P(both selected points are inliers) and $1 - w^2$ = P(at least one of these two points is an outlier). Let k := number of RANSAC iterations executed so far. Then $(1 - w^2)^k$ = P(RANSAC never selects two points that are both inliers), so $1 - p = (1 - w^2)^k$ and therefore $k = \frac{\log(1 - p)}{\log(1 - w^2)}$

76 RANSAC How many iterations does RANSAC need? The number of iterations k is $k = \frac{\log(1 - p)}{\log(1 - w^2)}$. Knowing the fraction of inliers w, after k RANSAC iterations we will have a probability p of finding a set of points free of outliers. Example: if we want a probability of success p = 99% and we know that w = 50%, then k ≈ 16 iterations, which is dramatically fewer than the number of all possible combinations! Notice that the number of points does not influence the estimated number of iterations, only w does! In practice we need only a rough estimate of w. More advanced variants of RANSAC estimate the fraction of inliers and adaptively change it on every iteration.
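The formula is easy to evaluate numerically. The helper below is a small sketch; the name `ransac_iterations` and the `sample_size` parameter (fixed to 2 in the slides, since two points define a line) are assumptions introduced for illustration.

```python
import math

def ransac_iterations(p, w, sample_size=2):
    """Iterations k such that, with probability p, at least one sample is outlier-free.

    p: desired probability of success, w: fraction of inliers.
    sample_size is 2 for line fitting (assumption: other models would change it).
    """
    return math.log(1 - p) / math.log(1 - w ** sample_size)

k = ransac_iterations(p=0.99, w=0.5)   # roughly 16, as in the example above
k_rounded = math.ceil(k)               # round up to a whole number of iterations
```

Note that N does not appear anywhere in the function, which is the point made on the slide: only the inlier fraction w matters.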

77 RANSAC - Algorithm Let A be a set of N points
1. repeat
2.     Randomly select a sample of 2 points from A
3.     Fit a line through the 2 points
4.     Compute the distances of all other points to this line
5.     Construct the inlier set (i.e. count the number of points whose distance < d)
6.     Store these inliers
7. until maximum number of iterations k reached
8. The set with the maximum number of inliers is chosen as a solution to the problem
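The loop above can be sketched in pure Python as follows. This is an illustrative sketch, not a reference implementation: the function name `ransac_line`, the fixed `seed`, and the representation of the line as normalized coefficients (a, b, c) of a*x + b*y + c = 0 are assumptions. For simplicity it keeps only the best inlier set seen so far instead of storing every set.

```python
import math
import random

def ransac_line(points, k, d, seed=0):
    """Fit a line to 2D points with RANSAC.

    k: number of iterations, d: inlier distance threshold.
    Returns ((a, b, c), inliers) with a*x + b*y + c = 0 and a^2 + b^2 = 1.
    """
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(k):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # random sample of 2 points
        a, b = y2 - y1, x1 - x2                      # normal of the line through them
        norm = math.hypot(a, b)
        if norm == 0:
            continue                                 # the two points coincide
        c = -(a * x1 + b * y1)
        # Inliers: points whose perpendicular distance to the line is below d.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) / norm < d]
        if len(inliers) > len(best_inliers):
            best_line = (a / norm, b / norm, c / norm)
            best_inliers = inliers
    return best_line, best_inliers

line_points = [(x, 2 * x + 1) for x in range(20)]            # points on y = 2x + 1
outliers = [(0, 50), (5, -30), (10, 80), (15, -10), (3, 20)]
line, inliers = ransac_line(line_points + outliers, k=50, d=0.1)
```

A common follow-up, not shown here, is to refit the line to all inliers of the winning set with least squares.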

78 RANSAC - Applications RANSAC = RANdom SAmple Consensus. A generic and robust algorithm for fitting models in the presence of outliers (i.e. points which do not satisfy a model). It can be applied in general to any problem where the goal is to identify the inliers which satisfy a predefined model. Typical applications in robotics are: line extraction from 2D range data, plane extraction from 3D data, feature matching, structure from motion, camera calibration, homography estimation, etc. RANSAC is iterative and non-deterministic: the probability of finding a set free of outliers increases as more iterations are used. Drawback: as a non-deterministic method, its results differ between runs.

79 Outline of Today's lecture: Place recognition using Vocabulary Tree; Line extraction from images; Line extraction from laser data: RANSAC (same as for line detection from images), Hough (same as for line detection from images), Split and Merge (only laser scans: uses sequential order of scan data), Line regression (only laser scans: uses sequential order of scan data)

80 Algorithm 1: Split-and-Merge (standard) Popular algorithm, originates from Computer Vision. A recursive procedure of fitting and splitting. A slightly different version, called Iterative end-point-fit, simply connects the end points for line fitting.
Let S be the set of all data points
Split
    Fit a line to points in current set S
    Find the most distant point to the line
    If distance > threshold, split set and repeat with left and right point sets
Merge
    If two consecutive segments are collinear enough, obtain the common line and find the most distant point
    If distance <= threshold, merge both segments

81 Algorithm 1: Split-and-Merge (iterative end-point-fit) Iterative end-point-fit: simply connects the end points for line fitting. [Figure sequence: the point set is split repeatedly until no more splits are needed, then collinear segments are merged]
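The split phase of the iterative end-point-fit variant can be sketched recursively. This is a minimal sketch under stated assumptions: the scan is an ordered, non-empty list of (x, y) points, the line for each set is simply the chord between its first and last point, and the merge phase is left out. The names `point_line_distance` and `split` are hypothetical.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        return math.hypot(x - x1, y - y1)   # a and b coincide
    return abs((x2 - x1) * (y1 - y) - (x1 - x) * (y2 - y1)) / length

def split(points, threshold):
    """Return a list of segments (start_point, end_point) for an ordered scan."""
    first, last = points[0], points[-1]
    if len(points) <= 2:
        return [(first, last)]
    # Distance of every interior point to the chord connecting the end points.
    dists = [point_line_distance(p, first, last) for p in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1   # index into points
    if dists[i - 1] > threshold:
        # Split at the most distant point and recurse on both halves.
        return split(points[:i + 1], threshold) + split(points[i:], threshold)
    return [(first, last)]

# An L-shaped scan: a wall along y = 0 followed by a wall along x = 5.
scan = [(x, 0) for x in range(6)] + [(5, y) for y in range(1, 6)]
segments = split(scan, threshold=0.1)
```

Because the recursion always cuts the scan at a point index, it relies on the sequential ordering of laser data, which is why the method is listed as applicable to laser scans only.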

86 Algorithm 2: Line-Regression Uses a sliding window of size $N_f$ points. Fit a line segment to all points in each window; then adjacent segments are merged if their line parameters are close. Line-Regression: Initialize sliding window size $N_f$. Fit a line to every $N_f$ consecutive points (i.e. in each window). Merge overlapping line segments and recompute line parameters for each segment. [Figure: example with $N_f = 3$]
