CPSC 425: Computer Vision - PDF Free Download

1 / 45 CPSC 425: Computer Vision Instructor: Fred Tung ftung@cs.ubc.ca Department of Computer Science University of British Columbia Lecture Notes 2015/2016 Term 2

2 / 45 Menu March 3, 2016 Topics: Hough transform (cont.) Stereo Reading: Today: Forsyth & Ponce (2nd ed.) 7.1.1, 7.2.1, 7.4, 7.6 Next: None Reminders: Assignment 5 due Tuesday, March 8 www: http://www.cs.ubc.ca/~ftung/cpsc425/ piazza: https://piazza.com/ubc.ca/winterterm22015/cpsc425/

3 / 45 Today s Fun Example Today, we look at an octopus. Point your web browser to http://www.youtube.com/watch?v=pmdttkzlmwm The YouTube clip is the last minute of a five minute TED talk by David Gallo, recorded March, 2007. The entire video can be found at http://www.ted.com/talks/lang/eng/david_gallo_shows_underwater_ astonishments.html The first portion of the TED talk shows examples of acquatic bio-luminescence. The second portion, which is more relevant to us, shows cephalopods (cuttlefish, squid, octopus) in action

4 / 45 Lecture 14: Re-cap RANSAC is a technique to fit data to a model randomly draw minimal set of data points to fit a model improve model estimate using all inliers choose model with largest consensus set (support) RANSAC is a general method suited for a wide range of model fitting problems easy to implement easy to estimate/control failure rate RANSAC handles only a moderate percentage of outliers

5 / 45 The Hough Transform for Lines Idea: Each point votes for the lines that pass through it A line is the set of points, (x, y), such that x sin θ y cos θ + r = 0 Different choices of θ, r give different lines

The Hough Transform for Lines 6 / 45

7 / 45 Example 1: Tokens Votes Horizontal axis is θ Vertical Axis is r Forsyth & Ponce (2nd ed.) Figure 10.1 (Top)

8 / 45 Example 1 (cont d): Tokens Votes Horizontal axis is θ Vertical Axis is r Forsyth & Ponce (2nd ed.) Figure 10.1 (Bottom)

9 / 45 Example 2: Tokens Votes Horizontal axis is θ Vertical Axis is r Forsyth & Ponce (2nd ed.) Figure 10.2

10 / 45 Mechanics of the Hough Transform Construct a quantized array to represent θ and r For each point, render the curve (θ, r) into this array adding one vote at each cell Difficulties: How big should the cells be? (too big, and we merge quite different lines; too small, and noise causes lines to be missed) How many lines? Count the peaks in the Hough array Treat adjacent peaks as a single peak

11 / 45 Some Practical Details of Hough Transform It is best to vote for the two closest bins in each dimension, as the locations of the bin boundaries are arbitrary This means that peaks are blurred and noise will not cause similar votes to fall into separate bins Can use a hash table rather than an array to store the votes This means that no effort is wasted on initializing and checking empty bins It avoids the need to predict the maximum size of the array, which can be non-rectangular

12 / 45 Some Practical Details of Hough Transform A key is to have each feature (token) determine as many parameters as possible Lines are detected more effectively from edge elements with both position and orientation For object recognition, each token should predict position, orientation, and scale The Hough transform can extract feature groupings from clutter in linear time

13 / 45 Example 3: Object recognition Combined object detection and segmentation using an implicit shape model. Image patches cast weighted votes for the object centroid. Image credit: B. Leibe et al., 2004.

14 / 45 Example 4: Object recognition Boundary fragments cast weighted votes for the object centroid. Also obtains an estimate of the object s contour. Image credit: Opelt et al., 2006.

15 / 45 Example 4: Object recognition Image credit: Opelt et al., 2006.

16 / 45 Example 5: Object recognition Poselets are image patches that have distinctive appearance and can be used to infer some of the configuration of a parts-based object. Detected poselets vote for the object configuration. Image credit: Bourdev and Malik, 2009.

17 / 45 Example 5: Object recognition Image credit: Bourdev and Malik, 2009.

18 / 45 Discussion of Hough Transform Advantages: Can handle high percentage of outliers: each point votes separately Can detect multiple instances of a model in a single pass Disadvantages: Complexity of search time increases exponentially with the number of model parameters Can be tricky to pick a good bin size

19 / 45 Summary of Hough Transform The Hough transform is another technique for fitting data to a model a voting procedure possible model parameters define a quantized accumulator array data points vote" for compatible entries in the accumulator array A key is to have each data point (token) constrain model parameters as tightly as possible

20 / 45 Clicker Quiz (2 questions) Please turn on clickers

25 / 45 Framework for Next Topic Problem: Determine depth using two images acquired from (slightly) different viewpoints Key Idea(s): The 3D coordinates of each point imaged are constrained to lie along a ray. This is true also for a second image obtained from a (slightly) different viewpoint. Rays for the same point in the world intersect at the actual 3D location of that point Theory: Geometry Gotchas: good matches are not necessarily correct not all points are visible in both images

26 / 45 Stereo Vision With two eyes, we acquire images of the world from slightly different viewpoints We perceive depth based on differences in the relative position of points in the left image and in the right image

27 / 45 Binoculars Figure credit: http://en.wikipedia.org/wiki/binoculars Binoculars enhance binocular depth perception in two distinct ways: 1 magnification 2 longer baseline (i.e., distance between entering light paths) compared to the normal human inter-pupillary distance

28 / 45 Stereo Vision Task: Compute depth from two images acquired from (slightly) different viewpoints Approach: Match locations in one image to those in another Sub-tasks: Calibrate cameras and camera positions Find all corresponding points (the hardest part) Compute depth and surfaces

29 / 45 Stereo Vision (cont d) Slide credit: Trevor Darrell

30 / 45 Stereo Vision (cont d) Triangulate on two images of the same point Match correlation windows across scan lines Image credit: Point Grey Research Slide credit: Trevor Darrell

31 / 45 Point Grey Research Digiclops Image credit: Point Grey Research

32 / 45 Correspondence Forsyth & Ponce (2nd ed.) Figure 7.2

The Epipolar Constraint 33 / 45

45 / 45 Summary Hough transform possible model parameters define a quantized accumulator array tokens vote for compatible entries in the accumulator array Stereo: task is to compute depth from two images acquired from (slightly) different viewpoints Stereo is formulated as a correspondence problem determine match between location of a scene point in one image and its location in another matching points lie along corresponding epipolar lines