Binocular Stereo Vision - System 6
AV: 3D recognition from binocular stereo (Fisher, lecture 6)


System 6 Introduction: Is there a Wedge in this 3D scene?

Data: a stereo pair of images. Given two 2D images of an object, how can we reconstruct a 3D description of it?

Stereo vision - a solution
1) Feature extraction
2) Feature matching
3) Triangulation

System 6 Overview: 3D part recognition using geometric stereo
1. Feature extraction: Canny edge detection, RANSAC straight-line finding
2. Feature matching: stereo correspondence matching of lines
3. Triangulation: 3D feature position estimation
4. 3D object recognition: 3D geometric model, model-data matching using lines, 3D pose estimation using lines

Possible Image Features
1) Edge fragments
2) Edge structures (lines)
3) General interest points

Edge Detector Introduction

Edge detection finds pixels at large changes in intensity. There is much historical work on this topic in computer vision (Roberts, Sobel). The Canny edge detector was the first modern edge detector and is still commonly used today. Edge detection is never a very accurate process: image noise, areas of low contrast and the question of scale all interfere, and humans see edges where none exist.

Canny Edge Detector

Four stages:
1. Gaussian smoothing: to reduce noise and smooth away small edges
2. Gradient calculation: to locate potential edge areas
3. Non-maximal suppression: to locate the best edge positions
4. Hysteresis edge tracking: to locate reliable but weak edges

Canny: Gaussian Smoothing

Convolve with a 2D Gaussian. This averages pixels with a preference for those near the centre, smoothing noise without too much blurring of edges.

The σ of the Gaussian controls the amount of smoothing explicitly:

  mask(r, c) = 1/(2πσ^2) exp( -(r^2 + c^2) / (2σ^2) )

(see the MATLAB sketch at the end of this section).

Gaussian Smoothing Examples

Smoothed images for σ = 2 and σ = 4: larger σ gives more smoothing - a low-pass filter.

Conservative Smoothing

Gaussian smoothing is inappropriate for salt & pepper/spot noise (compare the noisy image, the Gaussian-smoothed result and the conservatively smoothed result).

Canny: Gradient Magnitude Calculation

G(r, c) is the smoothed image. Compute the local derivatives in the r and c directions as G_r(r, c), G_c(r, c). The edge gradient is ∇G(r, c) = (G_r(r, c), G_c(r, c)).
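As an illustration of the smoothing stage, here is a minimal MATLAB sketch that builds the mask from the formula above and smooths an image; the file name, the assumption of a greyscale image and the choice sigma = 2 are illustrative, not from the slides.

% Gaussian smoothing sketch: build the 2D mask from the formula above and
% convolve it with a greyscale image (file name is illustrative).
img   = double(imread('left.png'));      % assumed greyscale input image
sigma = 2;                               % smoothing scale
hw    = ceil(3*sigma);                   % mask half-width: +/- 3 sigma
[c, r] = meshgrid(-hw:hw, -hw:hw);       % (r, c) offsets from the mask centre
mask  = exp(-(r.^2 + c.^2)/(2*sigma^2)) / (2*pi*sigma^2);
mask  = mask / sum(mask(:));             % renormalise so brightness is preserved
G     = conv2(img, mask, 'same');        % smoothed image G(r, c)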

Gradient magnitude:

  H(r, c) = sqrt( G_r(r, c)^2 + G_c(r, c)^2 ) ≈ |G_r(r, c)| + |G_c(r, c)|

Gradient direction:

  θ(r, c) = arctan( G_r(r, c), G_c(r, c) )

where each derivative is estimated by a finite difference, e.g.

  G_r(r, c) = ∂G/∂r = lim_{h→0} [ G(r + h, c) - G(r, c) ] / h ≈ G(r + 1, c) - G(r, c)

Gradient Magnitude Examples

σ controls the amount of smoothing (examples for σ = 2 and σ = 4): smaller σ gives more detail and more noise; larger σ gives less detail and less noise.

Canny: Non-maximal Suppression

Where exactly is the edge? At the peak of the gradient. Suppress pixels whose gradient magnitude is not a local maximum; the check is made ACROSS the gradient direction. For example, in this array of gradient magnitudes the peak in each row marks the edge:

  0  0  3 12  4  0
  0  0  6 10  2  0
  0  2  8  7  1  0
  0  3 11  4  0  0

Estimate the gradient magnitudes H_α and H_β one step along the gradient direction on either side of the pixel by interpolating between the neighbouring real pixel values H_1 ... H_4:

  A = G_r / G_c
  H_α = A H_1 + (1 - A) H_2
  H_β = A H_4 + (1 - A) H_3

Suppress (set to 0) if H < H_α OR H < H_β.
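Continuing the sketch above (G is the smoothed image), the gradient and a simplified non-maximal suppression can be written as follows; note that this version quantises the gradient direction to the nearest of four directions rather than interpolating H_α and H_β as on the slide.

% Gradient + non-maximal suppression sketch (simplified direction handling).
[Gc, Gr] = gradient(G);                  % derivatives along columns and rows
H     = sqrt(Gr.^2 + Gc.^2);             % gradient magnitude
theta = atan2(Gr, Gc);                   % gradient direction
[nr, nc] = size(H);
E = zeros(nr, nc);                       % non-maximally suppressed magnitude
for r = 2:nr-1
  for c = 2:nc-1
    ang = mod(round(theta(r,c)/(pi/4)), 4);   % nearest of 4 directions
    switch ang
      case 0, nbrs = [H(r, c-1),   H(r, c+1)];    % gradient along columns
      case 1, nbrs = [H(r-1, c-1), H(r+1, c+1)];  % gradient along one diagonal
      case 2, nbrs = [H(r-1, c),   H(r+1, c)];    % gradient along rows
      case 3, nbrs = [H(r-1, c+1), H(r+1, c-1)];  % gradient along other diagonal
    end
    if H(r,c) >= nbrs(1) && H(r,c) >= nbrs(2)
      E(r,c) = H(r,c);                   % keep only the peak across the gradient
    end
  end
end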

Canny: Hysteresis Tracking

Start edges where they are certain: H > τ_start. Then reduce the requirement along connected edges to pick up weaker edges: H > τ_continue (see the sketch at the end of this section).

Stereo Canny Edges

Edge maps for the left and right images. Matlab has Canny built in: edge(leftr, 'canny', [0.08, 0.2], 3);

Midlecture Problem

Where might the Canny edge detector find edges in this image?

RANSAC: Random Sample and Consensus

Model-based feature detection: features are based on some a priori model. It assumes that the shape of the feature is determined by T true data points, and that a hypothesised feature is valid if V data points lie nearby. It works even with much noise and clutter, and has a tunable failure rate.
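Continuing the Canny sketch above, hysteresis tracking can be expressed compactly with morphological reconstruction (imreconstruct, Image Processing Toolbox); E is the suppressed gradient magnitude from the previous sketch and the two threshold fractions are illustrative.

% Hysteresis tracking sketch: seed at strong edges, then keep weaker edge
% pixels that are connected to a seed (thresholds are illustrative).
tau_start    = 0.2 * max(E(:));          % certainty threshold
tau_continue = 0.08 * max(E(:));         % relaxed threshold along connected edges
strong = E > tau_start;
weak   = E > tau_continue;
edges  = imreconstruct(strong, weak);    % weak pixels reachable from strong ones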

RANSAC Pseudocode

  for i = 1 : Trials
    Select T data points randomly
    Estimate feature parameters
    if number of nearby data points > V
      return success
    end
  end
  return failure

RANSAC Termination Limit

  p_all_f : probability of the algorithm failing to detect a feature
  p_1     : probability of a data point belonging to a valid feature
  p_d     : probability of a data point belonging to the same feature

The algorithm fails if there are Trials consecutive failures:

  p_all_f = (p_one_f)^Trials

A trial succeeds if all T randomly chosen data items are valid:

  p_one_f = 1 - p_1 (p_d)^(T-1)

Solving for the expected number of trials:

  Trials = log(p_all_f) / log( 1 - p_1 (p_d)^(T-1) )

RANSAC Line Detection

Line model: the infinite line through 2 points (T = 2). T = 2 edge points are chosen randomly, and the line is accepted if V = 80 edge points lie within 3 pixels. With p_all_f = 0.001, p_1 = 0.1 and p_d = 0.1, this gives Trials = 688 (a MATLAB sketch follows at the end of this section). The detected lines are mostly accurate, but some arise from random data crossings and their endpoints are unknown.

Finding Line Segments

We want the approximate observed start and end of the true segment:
1. Project the points { x_i } onto the ideal line through point p with direction a: λ_i = (x_i - p)·a. The projected point is p + λ_i a.
2. Remove points not having 43 neighbour points within 45 pixels distance.
3. The endpoints are given by the smallest and largest remaining λ_i.
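A minimal MATLAB sketch of this line detector follows; it assumes the edge pixels are supplied as an M x 2 array pts of (row, col) coordinates, the parameter values are taken from the slide, and the code itself is only illustrative.

% RANSAC line-finding sketch: hypothesise lines through 2 random edge points
% (T = 2), accept if at least V edge points lie within distTol pixels.
T = 2;  V = 80;  distTol = 3;
Trials = ceil(log(0.001) / log(1 - 0.1 * 0.1^(T-1)));   % about 688, as on the slide
M = size(pts, 1);
inliers = [];
for i = 1:Trials
  sample = pts(randperm(M, T), :);                 % two random edge points
  p = sample(1, :);
  a = (sample(2, :) - p) / norm(sample(2, :) - p); % unit line direction
  n = [-a(2), a(1)];                               % unit normal to the line
  d = abs((pts - p) * n');                         % perpendicular distances
  candidates = find(d < distTol);
  if numel(candidates) >= V
    inliers = candidates;                          % success: consensus set found
    break
  end
end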

Found Line Segments

Stereo Overview

Line Segment Description

Given line segment endpoints b and c with direction a = (u, v):
- Orientation: compute the perpendicular n = (-v, u)
- Position: estimate the midpoint m = (b + c)/2
- Shape: estimate the length |b - c|
- Contrast: estimate the edge contrast from the pixels within distance D of the line, in the regions L and R on either side. For any pixel x with 0 <= (x - b)·a <= |b - c|:
    if 0 < (x - b)·n <= D, put x in L
    if -D <= (x - b)·n < 0, put x in R
  The contrast is g = (average brightness of pixels in L) - (average brightness of pixels in R).

The description of a segment is its direction a, midpoint m, length |b - c| and contrast g.
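A sketch of this description for one segment, assuming img is a greyscale image of class double and b, c are 1 x 2 (row, col) endpoint vectors; the band half-width D = 3 is illustrative.

% Line segment description sketch: direction, normal, midpoint, length and
% contrast from pixels within D of the segment on either side.
D   = 3;                                   % illustrative band half-width
a   = (c - b) / norm(c - b);               % unit direction (u, v)
n   = [-a(2), a(1)];                       % perpendicular (-v, u)
m   = (b + c) / 2;                         % midpoint
len = norm(c - b);                         % length
[rr, cc] = ndgrid(1:size(img,1), 1:size(img,2));
x = [rr(:), cc(:)];                        % all pixel positions as (row, col)
s = (x - b) * a';                          % position along the segment
t = (x - b) * n';                          % signed distance across the segment
inSpan = s >= 0 & s <= len;
Lpix = img(inSpan & t > 0 & t <= D);       % band on one side of the line
Rpix = img(inSpan & t < 0 & t >= -D);      % band on the other side
g = mean(Lpix) - mean(Rpix);               % contrast across the edge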

Corresponding Line Properties

Edges    L dir          R dir          L len  R len
L1:R4    (0.00,-1.00)   (0.00,-1.00)   137    125
L5:R11   (0.45,0.89)    (0.38,0.93)     90    119
L6:R7    (0.96,-0.28)   (0.96,-0.29)   291    251
L7:R2    (0.95,0.32)    (1.00,0.00)     93    140
L8:R5    (0.70,-0.71)   (0.86,-0.50)    49     61
L10:R12  (0.81,-0.59)   (0.86,-0.52)    87     82
L11:R10  (-0.33,-0.94)  (-0.29,-0.96)   72    101
L12:R3   (0.89,0.45)    (0.96,0.28)    129    115

Edges    L mid      R mid      L con  R con
L1:R4    (370,369)  (364,242)   -91    -37
L5:R11   (55,477)   (40,309)     28     18
L6:R7    (158,355)  (181,194)    44     55
L7:R2    (74,416)   (90,246)   -117    -44
L8:R5    (250,567)  (278,371)   -66    -31
L10:R12  (208,402)  (205,221)    38     55
L11:R10  (199,536)  (195,358)   141    113
L12:R3   (140,554)  (132,400)    47     23

Binocular Stereo

Goal: build a 3D scene description (e.g. depth) given two 2D image descriptions. Useful for obstacle avoidance, grasping and object location. Key principle: triangulation - the left and right sensors are a known baseline D apart, and the viewing angles α and β to a matched feature determine its depth.

Image Features

What to match? Edge fragments from the edge detector (usually many), or larger structures such as lines (from RANSAC; vertical lines for mobile robots). Larger features are easier to match but harder to extract and less dependable. The human visual system is thought to work at the edge-fragment level.

Stereo Correspondence Problem

Which feature in the left image matches a given feature in the right? Different pairings give different depth results. This is often considered the key problem of stereo.

Constraining Matches: Edge Direction

Match features with nearly the same orientation.

Constraining Matches: Edge Contrast

Match features with nearly the same contrast across the edge.

Constraining Matches: Feature Shape

Match features with nearly the same length.

Constraining Matches: Uniqueness and Smoothness

Smoothness: match features giving nearly the same depth as their neighbours.
Uniqueness: a feature in one image can match 0, 1 or 2+ features from the other image:
  0  - occlusion
  1  - the normal case
  2+ - transparencies, wires, vines, etc., from coincidental alignments

Midlecture Problem

Which stereo correspondence constraint would you use to reject these matches?

Constraining Matches: Epipolar Geometry

A feature p_l in the left image lies on a ray r through space. r projects to an epipolar line e in the right image, along which the matching image feature must lie. If the images are rectified, the epipolar line is an image row, which reduces the 2D search to a 1D search. The images are linked by the fundamental matrix F: matched points satisfy p_l' F p_r = 0 (the points are in homogeneous coordinates and F is 3 x 3).
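As a small illustration (not from the slides), the epipolar constraint can be checked in MATLAB as follows, assuming a 3 x 3 fundamental matrix F and homogeneous image points pl, pr given as 3 x 1 column vectors.

% Epipolar constraint sketch: pl' * F * pr should be close to 0 for a true
% match, and F' * pl is the epipolar line in the right image.
err  = pl' * F * pr;                      % algebraic epipolar error
er   = F' * pl;                           % line [a; b; c] with a*x + b*y + c = 0
dist = abs(er' * pr) / norm(er(1:2));     % point-to-epipolar-line distance (pixels)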

Stereo Matching Results

Matched segments in the left and right images: the maximally consistent set of matches, based on the local and epipolar constraints.

Stereo Matching Code

% find lines with matching properties
linepairs = zeros(100,2);
numpairs = 0;
for l = 1 : llinecount
  for r = 1 : rlinecount
    if abs(llinem(l,1) - rlinem(r,1)) < Tep ...                          % epipolar
        & abs(llinem(l,2) - rlinem(r,2)) > Tdisl ...                     % min disp
        & abs(llinem(l,2) - rlinem(r,2)) < Tdish ...                     % max disp
        & abs(llinel(l) - rlinel(r)) < Tlen*(llinel(l) + rlinel(r)) ...  % length
        & llinea(l,:)*rlinea(r,:)' > Tort ...                            % direction
        & abs(llineg(l) - rlineg(r)) < Tcon                              % contrast
      numpairs = numpairs + 1;
      linepairs(numpairs,1) = l;
      linepairs(numpairs,2) = r;
    end
  end
end

% Enforce uniqueness: keep only left lines with a single candidate match
pairs = 0;
pairlist = zeros(100,2);
changes = 1;
while changes
  changes = 0;
  for l = 1 : numpairs
    if linepairs(l,1) > 0
      testset = find(linepairs(:,1) == linepairs(l,1));
      if length(testset) == 1
        changes = 1;
        pairs = pairs + 1;
        pairlist(pairs,1) = linepairs(l,1);
        pairlist(pairs,2) = linepairs(l,2);
        linepairs(testset(1),:) = zeros(1,2);   % clear taken
      end
    end
  end
end

Image Projection Geometry

Pinhole camera model: projects the 3D point v = (x, y, z, 1) onto the image point u_i = (r_i, c_i, 1):

  λ u_i = P_i v,   i = L, R

Notice the use of homogeneous coordinates in both 2D and 3D. P_i decomposes as P_i = K_i R_i [ I | -e_i ], where:

  R_i : orientation of the camera (3 degrees of freedom)
  e_i = (e_xi, e_yi, e_zi) : camera centre in the world (3 DoF)
  K_i : camera intrinsic calibration matrix

        [ f_i m_ri   s_i        r_0i ]
  K_i = [ 0          f_i m_ci   c_0i ]
        [ 0          0          1    ]

  f_i : camera focal length in mm
  m_ri, m_ci : row and column pixels/mm conversion on the image plane
  r_0i, c_0i : where the optical axis intersects the image plane
  s_i : skew factor

That makes 12 degrees of freedom per camera.

Calibration

  R_L = R_R = I (by definition and controlled camera motions)
  e_L = (0, 0, 0) by definition and e_R = (55, 0, 0) by calibrated motion
  s_L = s_R = 0 (usual for CCD cameras)
  (r_0L, c_0L) = (r_0R, c_0R) = (HEIGHT/2, WIDTH/2) (approximate)
  f_L m_rL = f_L m_cL = f_R m_rR = f_R m_cR = 832.5 (same camera + calibration)
  K_L = K_R, as the same camera was moved 55 mm to the right

The actual intrinsic parameter matrix is slightly permuted because of the different axis labelling and image axis directions:

  [ 0       f m_r   r_0 ]
  [ f m_c   0       c_0 ]
  [ 0       0       1   ]
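A minimal sketch of this projection with the calibration values above; the 480 x 640 image size and the test point are assumptions for illustration.

% Pinhole projection sketch: build K and P for the right camera and project
% a 3D world point (image size and the point itself are assumed).
HEIGHT = 480;  WIDTH = 640;               % assumed image dimensions
fm = 832.5;                               % f*m_r = f*m_c from the calibration
K  = [fm, 0, HEIGHT/2;                    % intrinsics (standard, unpermuted form)
      0, fm, WIDTH/2;
      0,  0, 1];
R  = eye(3);                              % R_L = R_R = I
eR = [55; 0; 0];                          % right camera centre (mm)
P  = K * R * [eye(3), -eR];               % 3 x 4 projection matrix
v  = [100; 50; 800; 1];                   % an arbitrary 3D point (homogeneous)
u  = P * v;
rc = u(1:2) / u(3);                       % image position (r, c) after dividing by lambda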

3D Line Calculation

Aim: recover the 3D line positions. Assume the line has been successfully matched in the L & R images.
1. Compute the 3D plane that passes through the image line and the camera centre.
2. Compute the intersection of the 3D planes from the two cameras, which gives a line.

The 2D image line is

  r cos(θ) + c sin(θ) + b = 0

From the projection relations (λr, λc, λ) = P (x, y, z, 1), solving for r and c gives

  r = f m_r (y - e_y)/(z - e_z) + r_0
  c = f m_c (x - e_x)/(z - e_z) + c_0

Substituting r and c into the line equation gives the equation of the plane containing the coplanar points (x, y, z):

  n · (x, y, z) + d = 0
  n = ( f m_c sin(θ), f m_r cos(θ), r_0 cos(θ) + c_0 sin(θ) + b )
  d = -n · (e_x, e_y, e_z)

Doing this for both the left and right cameras gives two planes that intersect at the 3D edge. Both geometric and algebraic analysis give the same result.

3D Plane Intersection

Planes: x · n_L + d_L = 0 and x · n_R + d_R = 0. The 3D line is given by a vector direction v through a point a:
- Direction: v = n_L x n_R
- Choose the point a on the line closest to the origin, so that a · v = 0. Let M be the matrix with rows v, n_L, n_R and let b = (0, -d_L, -d_R); then a = M^(-1) b.
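A sketch of these two steps in MATLAB, assuming each matched image line is given by its (theta, b) parameters (thL, bL and thR, bR), and reusing fm, r0, c0 and the camera centres from the calibration above; variable names are illustrative and R = I is assumed, as calibrated.

% Back-project each image line r*cos(theta) + c*sin(theta) + b = 0 into a 3D
% plane n.x + d = 0 through its camera centre, then intersect the two planes.
eL = [0; 0; 0];  eR = [55; 0; 0];         % camera centres from the calibration
nL = [fm*sin(thL); fm*cos(thL); r0*cos(thL) + c0*sin(thL) + bL];
dL = -nL' * eL;                           % left plane passes through eL
nR = [fm*sin(thR); fm*cos(thR); r0*cos(thR) + c0*sin(thR) + bR];
dR = -nR' * eR;                           % right plane passes through eR
v  = cross(nL, nR);  v = v / norm(v);     % direction of the 3D line
M  = [v'; nL'; nR'];                      % rows: v, n_L, n_R
a  = M \ [0; -dL; -dR];                   % point on the line closest to the origin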

Matched Lines: 3D Line Equations

Number  Pairs    direction               point
1       L1:R4    (-0.82,  0.08, -0.56)   (9.1, 2.0, -13.0)
2       L5:R11   ( 0.61, -0.06,  0.78)   (-125.3, 98.6, 107.1)
3       L6:R7    (-0.28, -0.95, -0.03)   (0.9, -10.6, 294.4)
4       L7:R2    ( 0.07, -0.62, -0.77)   (48.3, -97.0, 82.9)
5       L8:R5    (-0.18, -0.45,  0.87)   (114.8, 91.8, 72.1)
6       L10:R12  (-0.50, -0.73,  0.44)   (71.5, 77.0, 208.8)
7       L11:R10  ( 0.79, -0.20,  0.57)   (-98.4, 57.2, 154.6)
8       L12:R3   ( 0.11, -0.69, -0.70)   (110.4, -123.6, 140.1)

Projected Stereo Line Positions

A little off, but typical.

Angles Between Lines (radians)

3D line 1  3D line 2  Angle    True
2          3          1.4347   1.57
2          4          1.0221   1.57
2          5          0.9281   1.57
2          6          1.4863   1.57
2          7          0.3023   0.00
2          8          1.1186   1.57
3          4          0.9180   0.71
3          5          1.0964   0.71
3          6          0.5848   0.71
3          7          1.5221   1.57
3          8          0.8457   0.71
4          5          1.1527   1.57
4          6          1.4920   1.57
4          7          1.3026   1.57
4          8          0.1085   0.00
5          6          0.6152   0.00
5          7          1.1060   1.57
5          8          1.2453   1.57
6          7          1.5679   1.57
6          8          1.4276   1.57
7          8          1.3918   1.57

3D Edge Based Recognition Pipeline

Match the 3D data edges to the 3D wireframe model edges. The wireframe model (a triangular wedge) is defined by nine edges, labelled M1-M9, between the vertices (0,0,0), (67,0,0), (0,67,0), (0,0,46), (67,0,46) and (0,67,46):

Model = (0,0,0)-(67,0,0)    (0,0,0)-(0,67,0)    (0,0,0)-(0,0,46)
        (67,0,0)-(67,0,46)  (0,67,0)-(0,67,46)  (0,0,46)-(67,0,46)
        (0,0,46)-(0,67,46)  (67,0,0)-(0,67,0)   (67,0,46)-(0,67,46)

3D Model Matching and Matching Performance

Use the Interpretation Tree algorithm to match edges, with Limit = 5 (the two pruning tests are sketched below).
- Unary test (similar length): |l_m - l_d| < τ_l (l_m + l_d)
- Binary test (similar angle between pairs): |θ_m - θ_d| < τ_a

144 interpretation-tree matches went through to pose estimation and verification. Two valid solutions were found (the symmetric model rotated 180 degrees):

Data Segment   Model 1 Segment   Model 2 Segment
3              M8                M9
4              M2                M6
6              M1                M7
7              M3                M3
8              M7                M1
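A small sketch of the two pruning tests used inside the interpretation tree; lm, ld are a model and a data segment length, dirm1, dirm2, dird1, dird2 are 3 x 1 unit direction vectors, and the thresholds are illustrative.

% Interpretation-tree pruning tests (thresholds are illustrative).
tau_l = 0.3;  tau_a = 0.3;                      % length and angle tolerances
% Unary test: the model and data segments have similar length.
unaryOK = abs(lm - ld) < tau_l * (lm + ld);
% Binary test: the model pair and data pair subtend similar angles
% (abs() handles the +/- ambiguity of edge directions).
theta_m = acos(min(1, abs(dirm1' * dirm2)));
theta_d = acos(min(1, abs(dird1' * dird2)));
binaryOK = abs(theta_m - theta_d) < tau_a;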

3D Pose Estimation

Given: matched line directions {(m_i, d_i)} and points on corresponding lines (but not necessarily the same point positions) {(a_i, b_i)}.

Rotation (matrix R): estimate the rotation from matched vectors (same as the previous task), except:
1) use line directions instead of surface normals
2) the +/- direction of an edge correspondence is unknown: try both for each matched segment
3) if det(R) = -1 then the symmetry needs to be flipped
4) verify the rotation by comparing the rotated model and data line orientations

3D Translation Estimation

Given N paired model and data segments, with point a_i on model segment i, point b_i on data segment i, direction d_i of data segment i, and the previously estimated rotation R, the translation error for segment i is the component of the residual perpendicular to the data line:

  λ_i = (R a_i + t - b_i) - d_i ( d_i · (R a_i + t - b_i) )

Goal: find the t that minimises Σ_i λ_i · λ_i. How (sketched below):

  L = Σ_i (I - d_i d_i')' (I - d_i d_i')
  n = Σ_i (I - d_i d_i')' (I - d_i d_i') (b_i - R a_i)
  t = L^(-1) n

Verify the translation by comparing the perpendicular distances of the transformed model endpoints to the data lines.

Matching Overlay

Two solutions for the symmetric model. Left and right images with the matched portion of the model overlaid: the calibration is a bit off. 287 verification attempts (2 successes).
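A sketch of this least-squares translation estimate, assuming R is the estimated 3 x 3 rotation and A, B, Dd are 3 x N arrays whose columns are the model points a_i, data points b_i and unit data directions d_i (the array names are illustrative).

% Least-squares translation: accumulate L and n over the matched segments and
% solve L * t = n; each term projects the residual perpendicular to d_i.
Nseg = size(A, 2);
L = zeros(3, 3);
n = zeros(3, 1);
for i = 1:Nseg
  d  = Dd(:, i);
  Pp = eye(3) - d * d';                   % projector perpendicular to d_i
  W  = Pp' * Pp;                          % (I - d d')' (I - d d')
  L  = L + W;
  n  = n + W * (B(:, i) - R * A(:, i));
end
t = L \ n;                                % estimated translation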

Discussion

- It is hard to find reliable edges/lines, but Canny finds most reasonable edges and RANSAC can group them into lines.
- Given enough stereo correspondence constraints, we can get reasonably correct correspondences.
- Large features help stereo matching but require more preprocessing.
- The stereo geometry is easy but needs accurate calibration: not always easy, although it is now possible to autocalibrate using 8 matched points.

Dense Depth Data

Problem: we have depth only at the triangulated feature locations. Binocular feature-matching stereo gives good 3D at corresponding features, but nothing in between.
- Solution 1: linearly interpolate the known values at all other pixels.
- Solution 2: correlation-based stereo: use pixel neighbourhoods as features and triangulate depth at every pixel. But it needs to find the matching pixel, which is not easy: use scan-line stereo?

Correlation Based Stereo

Use a stereo image pair. The features are the neighbourhoods of each pixel, matched using a similarity metric: SSD, the Sum of Squared Differences of the pixel values of the left image at (u, v) and the right image at (r, s):

  SSD(u, v, r, s) = Σ_{i=-N/2}^{N/2} Σ_{j=-N/2}^{N/2} ( L(u+i, v+j) - R(r+i, s+j) )^2

Finding the best match: for each scanline of the rectified image pair,
1. Build an array of all possible matching scores.
2. Dynamic programming finds the lowest-cost path through the array (the bright line through the middle of the score array) - an optimisation problem.
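A minimal sketch of the SSD matching scores along one scanline, assuming L and R are rectified greyscale images of class double; the window size, disparity range and scanline are illustrative, and the dynamic-programming pass over the cost array is omitted.

% SSD block matching along one scanline of a rectified pair: for each left
% pixel and candidate disparity, sum squared differences over an N x N window.
N = 7;  h = floor(N/2);                   % window size and half-width
maxDisp = 60;                             % largest disparity considered
row = 240;                                % scanline to process (assumed in range)
cols = (1+h) : (size(L,2)-h);
cost = inf(numel(cols), maxDisp+1);       % matching-score array for this scanline
for ci = 1:numel(cols)
  c = cols(ci);
  winL = L(row-h:row+h, c-h:c+h);
  for dsp = 0:maxDisp
    cr = c - dsp;                         % candidate column in the right image
    if cr - h >= 1
      winR = R(row-h:row+h, cr-h:cr+h);
      cost(ci, dsp+1) = sum((winL(:) - winR(:)).^2);   % SSD score
    end
  end
end
[~, best] = min(cost, [], 2);             % winner-take-all disparity per pixel
disparity = best - 1;                     % dynamic programming would refine this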

Dense Stereo Results

Commercial Dense Stereo Results

What We Have Learned

Feature-based stereo:
- Feature detection: Canny & RANSAC
- Stereo matching: local & epipolar constraints
- Triangulation & 3D: epipolar geometry
- Recognition: the Interpretation Tree algorithm & verification

Correlation-based stereo: similar, but using pixel neighbourhoods.