Colorado School of Mines. Computer Vision. Professor William Hoff, Dept. of Electrical Engineering & Computer Science.


Professor William Hoff, Dept. of Electrical Engineering & Computer Science, http://inside.mines.edu/~whoff/

Stereo Vision

Inferring 3D from 2D
- Model-based pose estimation: a single (calibrated) camera and a known model -> can determine the pose of the model
- Stereo vision: two (calibrated) cameras with known relative pose, viewing an arbitrary scene -> can determine the positions of points in the scene

Stereo Vision
A way of getting depth (3D) information about a scene from two (or more) 2D images. Used by humans and animals, and now by computers.
Computational stereo vision:
- Studied extensively over the last 25 years
- Difficult; still being researched
- Some commercial systems available
Good references:
- Scharstein, D. and Szeliski, R., 2002. A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms. International Journal of Computer Vision, 47(1-3), 7-42.
- http://vision.middlebury.edu/stereo - extensive website with evaluations of algorithms, test data, and code

Example
[Figure: left and right input images (Davi Geiger) and the reconstructed surface with image texture]

Example
Notice how different parts of the two images align for different values of the horizontal shift (disparity).

Iright = im2double(imread('pentagonright.png'));
Ileft = im2double(imread('pentagonleft.png'));
% Disparity is d = xleft - xright,
% so Ileft(x,y) = Iright(x+d,y)
for d = -20:20
    d    % display the current disparity
    Idiff = abs(Ileft(:, 21:end-20) - Iright(:, d+21:d+end-20));
    imshow(Idiff, []);
    pause
end

Stereo Displays
Stereograms were popular in the early 1900s. A special viewer was needed to display two different images to the left and right eyes.
http://www.columbia.edu/itc/mealac/pritchett/00routesdata/1700_1799/jaipur/jaipurcity/jaipurcity.html

Stereo Displays
3D movies were popular in the 1950s. The left and right images were displayed in red and blue.
http://j-walkblog.com/index.php?/weblog/posts/swimmers/

Stereo Displays
The current technology for 3D movies and computer displays is to use polarized glasses: the viewer wears eyeglasses containing circular polarizers of opposite handedness.
http://www.3dsgamenews.com/2011/01/3ds-to-feature-3d-movies/

Stereo Principle
If you know:
- the intrinsic parameters of each camera
- the relative pose between the cameras
and you measure:
- an image point in the left camera
- the corresponding point in the right camera
then each image point corresponds to a ray emanating from that camera, and you can intersect the rays (triangulate) to find the absolute point position.

Stereo Geometry - Simple Case
Assume the image planes are coplanar and there is only a translation in the X direction between the two coordinate frames; b is the baseline distance between the cameras.
For a point P = (X_L, Y_L, Z_L), perspective projection in each camera gives
  x_L / f = X_L / Z,   x_R / f = X_R / Z,   with X_R = X_L - b
The disparity is defined as d = x_L - x_R, so
  d = (f / Z)(X_L - X_R) = f b / Z,   and therefore   Z = f b / d
[Figure: left and right cameras separated by baseline b, viewing point P(X_L, Y_L, Z_L)]
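The Z = f b / d relation is simple enough to sketch in code. This is a Python illustration (the course examples use MATLAB); the function name is mine:

```python
def depth_from_disparity(f_pixels, baseline, disparity):
    """Depth Z = f*b/d for the coplanar (rectified) stereo geometry.

    f_pixels  : focal length in pixels
    baseline  : camera separation b (Z comes out in the same length unit)
    disparity : d = x_L - x_R in pixels
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return f_pixels * baseline / disparity

# With f = 500 px and b = 0.1 m, a disparity of 10 px gives Z = 5 m.
print(depth_from_disparity(500, 0.1, 10))
```

Note that depth is inversely proportional to disparity: distant points produce small disparities, which is why depth resolution degrades with range.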

Goal: a complete disparity map
Disparity is the difference in position of corresponding points between the left and right images.
http://vision.middlebury.edu/stereo

Reconstruction Error
Given the uncertainty in the pixel projection of the point, what is the error in depth?
Obviously the error in depth (ΔZ) will depend on:
- Z, b, f
- Δx_L, Δx_R
Let's find the expected value of the error, and the variance of the error.
From http://www.danet.dk/sensor_fusion

Reconstruction Error
First, find the error in disparity, Δd, from the error of locating the feature in each image, Δx_L and Δx_R. Since
  d = x_L - x_R
taking the total derivative of each side gives
  Δd = Δx_L - Δx_R
Assuming Δx_L and Δx_R are independent and zero mean, E[Δx_L] = E[Δx_R] = 0, so E[Δd] = 0 and
  Var[Δd] = E[Δd²] = E[(Δx_L - Δx_R)²]
          = E[Δx_L²] - 2 E[Δx_L Δx_R] + E[Δx_R²]
          = E[Δx_L²] + E[Δx_R²]
So σ_d² = σ_L² + σ_R².
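The variance result σ_d² = σ_L² + σ_R² is easy to check numerically. A NumPy sketch (the sample size, seed, and chosen sigmas are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
sigma_L, sigma_R = 1.0, 2.0

# Independent, zero-mean feature localization errors in each image
dx_L = rng.normal(0.0, sigma_L, n)
dx_R = rng.normal(0.0, sigma_R, n)

# Disparity error: Dd = Dx_L - Dx_R
dd = dx_L - dx_R

print(dd.var())  # close to sigma_L**2 + sigma_R**2 = 5.0
```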

Reconstruction Error
Next, we take the total derivative of Z = f b / d. If the only uncertainty is in the disparity,
  ΔZ = -(f b / d²) Δd
The mean error is Z̄ = E[ΔZ], and the variance of the error is σ_Z² = E[(ΔZ - Z̄)²].

Example
A stereo vision system estimates the disparity of a point as d = 10 pixels.
- What is the depth (Z) of the point, if f = 500 pixels and b = 10 cm?
- What is the uncertainty (standard deviation) of the depth, if the standard deviation of locating a feature in each image is 1 pixel?
- How would you handle uncertainty in both disparity and focal length?
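The first two questions can be worked through directly from Z = f b / d and the error propagation above, with σ_d = sqrt(σ_L² + σ_R²). A Python sketch (variable names are mine):

```python
import math

f = 500.0         # focal length, pixels
b = 10.0          # baseline, cm
d = 10.0          # disparity, pixels
sigma_feat = 1.0  # feature localization std dev in each image, pixels

Z = f * b / d                                        # depth, cm
sigma_d = math.sqrt(sigma_feat**2 + sigma_feat**2)   # disparity std dev
sigma_Z = (f * b / d**2) * sigma_d                   # |dZ/dd| * sigma_d

print(Z)        # 500 cm, i.e. 5 m
print(sigma_Z)  # about 70.7 cm
```

Note how large the depth uncertainty is relative to the depth itself: at d = 10 px, a one-pixel localization error in each image already produces roughly 14% depth uncertainty.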

Geometry - General Case
The cameras are not aligned, but we still know the relative pose. Assuming f = 1, we have
  p_L = (x_L, y_L, 1)^T,   p_R = (x_R, y_R, 1)^T
In principle, you can find P by intersecting the rays O_L p_L and O_R p_R. However, they may not intersect; instead, find the midpoint of the segment perpendicular to the two rays.
[Figure: left and right camera frames viewing point P(X_L, Y_L, Z_L)]

Triangulation (continued)
The projection of P onto the left image is Z_L p_L = M_L P, and the projection of P onto the right image is Z_R p_R = M_R P, where
  M_L = [ 1 0 0 0
          0 1 0 0
          0 0 1 0 ]
  M_R = [ r11 r12 r13 tx
          r21 r22 r23 ty
          r31 r32 r33 tz ]
and R, t give the pose of the left camera frame with respect to the right.

Triangulation (continued)
Note that p_L and M_L P are parallel, so their cross product should be zero; similarly for p_R and M_R P. Point P should satisfy both
  p_L × (M_L P) = 0
  p_R × (M_R P) = 0
This is a system of four (independent) equations; we can solve for the three unknowns (X_L, Y_L, Z_L) using least squares. The method also works for more than two cameras.
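The cross-product constraints reduce to two linear rows per camera, which can be stacked and solved by least squares. A NumPy sketch (in Python rather than the course's MATLAB; the two-camera geometry at the bottom is made up for illustration):

```python
import numpy as np

def triangulate(p_list, M_list):
    """Linear triangulation from p x (M P) = 0 constraints.

    p_list : homogeneous image points (x, y, 1), one per camera
    M_list : 3x4 projection matrices, one per camera
    Returns the 3D point in the reference (left) camera frame.
    """
    rows = []
    for p, M in zip(p_list, M_list):
        x, y, _ = p
        # The two independent rows of the cross-product constraint
        rows.append(x * M[2] - M[0])
        rows.append(y * M[2] - M[1])
    A = np.asarray(rows)  # (2 * n_cameras) x 4
    # A @ [X, Y, Z, 1]^T = 0  =>  A[:, :3] @ [X, Y, Z]^T = -A[:, 3]
    P, *_ = np.linalg.lstsq(A[:, :3], -A[:, 3], rcond=None)
    return P

# Example geometry: left camera at the origin, right camera translated
# by b = 0.2 along X, both with f = 1 (so X_R = X_L - b).
b = 0.2
M_L = np.hstack([np.eye(3), np.zeros((3, 1))])
M_R = np.hstack([np.eye(3), np.array([[-b], [0.0], [0.0]])])
P_true = np.array([0.3, -0.1, 2.0])
p_L = np.append(P_true[:2] / P_true[2], 1.0)
p_R = np.append((P_true[:2] - [b, 0.0]) / P_true[2], 1.0)
print(triangulate([p_L, p_R], [M_L, M_R]))  # recovers ~ [0.3, -0.1, 2.0]
```

Because the function simply stacks two rows per camera, extending it to more than two views needs no changes, matching the remark on the slide.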

Stereo Process
1. Extract features from the left and right images
2. Match the left and right image features, to get their disparity in position (the "correspondence problem")
3. Use stereo disparity to compute depth (the "reconstruction problem")
The correspondence problem is the most difficult.
http://vision.middlebury.edu/stereo/data/scenes2003/

Characteristics of Human Stereo Vision
Matching features must appear similar in the left and right images. For example, we can't fuse a left stereo image with a negative of the right image.
http://cs.wellesley.edu/~cs332/

Characteristics of Human Stereo Vision
We can only fuse objects within a limited range of depth around the fixation distance. Vergence eye movements are needed to fuse objects over a larger range of depths.
http://cs.wellesley.edu/~cs332/

Panum's Fusional Area
Panum's fusional area is the range of depths for which binocular fusion can occur (without changing vergence angles). It's actually quite small; we are able to perceive a wide range of depths because we are constantly changing vergence angles.
http://webvision.med.utah.edu/imageswv/kalldepth7.jpg

Characteristics of Human Stereo Vision
Cells in the visual cortex are selective for stereo disparity:
- zero disparity: at the fixation distance
- near: in front of the point of fixation
- far: behind the point of fixation
Neurons that are selective for a larger disparity range have larger receptive fields.
http://cs.wellesley.edu/~cs332/

Characteristics of Human Stereo Vision
We can fuse random-dot stereograms (Bela Julesz, 1971). This shows:
- The stereo system can function independently
- We can match simple features
- It highlights the ambiguity of the matching process
http://cs.wellesley.edu/~cs332/

Example
Make a random dot stereogram:

L = rand(400,400);
R = L;
% Shift center portion by 50 pixels
R(100:300, 150:350) = L(100:300, 100:300);
% Fill in the part that moved
R(100:300, 100:149) = rand(201, 50);

Correspondence Problem
The most difficult part of stereo vision. For every point in the left image, there are many possible matches in the right image: locally, many points look similar -> matches are ambiguous.
We can use the (known) geometry of the cameras to help limit the search for matches. The most important constraint is the epipolar constraint: we can limit the search for a match to a certain line in the other image.

Epipolar Constraint
With aligned cameras, the search for a corresponding point is 1D, along the corresponding row of the other camera.

Epipolar Constraint for Non-Aligned Cameras
If the cameras are not aligned, a 1D search can still be determined for the corresponding point: P1, C1, and C2 determine a plane that cuts image I2 in a line, and P2 will lie on that line.

Rectification
If the relative camera pose is known, it is possible to rectify the images: effectively rotate both cameras so that they are looking perpendicular to the line joining the camera centers. This means that epipolar lines will be horizontal, and matching algorithms will be more efficient.
[Figures: the original image pair overlaid with several epipolar lines; the images rectified so that epipolar lines are horizontal and in vertical correspondence]
From Richard Szeliski, Computer Vision: Algorithms and Applications, Springer, 2010

Correspondence Problem
Even using the epipolar constraint, there are many possible matches. Worst-case scenarios:
- A white board (no features)
- A checkered wallpaper (ambiguous matches)
The problem is under-constrained. To solve it, we need to impose assumptions about the real world:
- Disparity limits
- Appearance
- Uniqueness
- Ordering
- Smoothness

Disparity Limits
Assume that valid disparities are within certain limits; this constrains the search.
Why usually true? When is it violated?

Appearance
Assume features should have a similar appearance in the left and right images.
Why usually true? When is it violated?
http://vision.middlebury.edu/stereo/data/scenes2003/

Uniqueness
Assume that a point in the left image can have at most one match in the right image.
Why usually true? When is it violated?
[Figure: left and right cameras separated by baseline b, with image points x_L and x_R]

Ordering
Assume features should be in the same left-to-right order in each image.
Why usually true? When is it violated?

Smoothness
Assume objects have mostly smooth surfaces, meaning that disparities should vary smoothly (e.g., have a low second derivative).
Why usually true? When is it violated?

Methods for Correspondence
Match points based on local similarity between the images. Two general approaches:
- Correlation-based approaches: match image patches using correlation. Assumes only a translational difference between the two local patches (no rotation, or differences in appearance due to perspective) - a good assumption if the patch covers a single surface, and the surface is far away compared to the baseline between the cameras. Works well for scenes with lots of texture.
- Feature-based approaches: match edges, lines, or corners. Gives a sparse reconstruction. May be better for scenes with little texture.

Correlation Approach
- Select a range of disparities to search
- For each patch in the left image, compute a cross-correlation score at every point along the epipolar line
- Find the maximum correlation score along that line

Matlab Demo
Parameters:
- Size of template patch
- Horizontal disparity search window
- Vertical disparity search window
[Figure: template patch from the left image, search region in the right image, and correlation scores (peak in red)]

% Simple stereo system using cross correlation
clear all
close all

% Constants
W = 16;    % size of cross-correlation template is (2W+1 x 2W+1)
DH = 50;   % disparity horizontal search limit is -DH..+DH
DV = 8;    % disparity vertical search limit is -DV..+DV

Ileft = imread('left.png');
Iright = imread('right.png');
figure(1), imshow(Ileft, []), title('Left image');
figure(2), imshow(Iright, []), title('Right image');
pause;

% Calculate disparity at a set of discrete points
xBorder = W+DH+1;
yBorder = W+DV+1;
xTsize = W+DH;   % horizontal search region size is 2*xTsize+1
yTsize = W+DV;   % vertical search region size is 2*yTsize+1

Matlab Demo (continued)
- Scan through the left image
- Extract a template patch from the left
- Do normalized cross-correlation to match to the right
- Accept a match if the score is greater than a threshold

nPts = 0;   % number of found disparity points
for x = xBorder:W:size(Ileft,2)-xBorder
    for y = yBorder:W:size(Ileft,1)-yBorder
        % Extract a template from the left image centered at x,y
        figure(1), hold on, plot(x, y, 'rd'), hold off;
        T = imcrop(Ileft, [x-W y-W 2*W 2*W]);
        %figure(3), imshow(T, []), title('Template');

        % Search for a match in the right image, in a region centered at
        % x,y of dimensions 2*xTsize+1 wide by 2*yTsize+1 high
        IR = imcrop(Iright, [x-xTsize y-yTsize 2*xTsize 2*yTsize]);
        %figure(4), imshow(IR, []), title('Search area');

        % The correlation score image is the size of IR, expanded by W in
        % each direction
        ccScores = normxcorr2(T, IR);
        %figure(5), imshow(ccScores, []), title('Correlation scores');

        % Get the location of the peak in the correlation score image
        [max_score, maxIndex] = max(ccScores(:));
        [yPeak, xPeak] = ind2sub(size(ccScores), maxIndex);
        hold on, plot(xPeak, yPeak, 'rd'), hold off;

        % If the score is too low, ignore this point
        if max_score < 0.85
            continue;
        end

Matlab Demo (continued)
- Extract the peak location and save the disparity value
- Plot all points when done

        % These are the coordinates of the peak in the search image
        yPeak = yPeak - W;
        xPeak = xPeak - W;
        %figure(4), hold on, plot(xPeak, yPeak, 'rd'), hold off;

        % These are the coordinates in the full sized right image
        xPeak = xPeak + (x-xTsize);
        yPeak = yPeak + (y-yTsize);
        figure(2), hold on, plot(xPeak, yPeak, 'rd'), hold off;

        % Save the point in a list, along with its disparity
        nPts = nPts+1;
        xPt(nPts) = x;
        yPt(nPts) = y;
        dPt(nPts) = xPeak - x;   % disparity is xright-xleft
    end
end
%pause
figure, plot3(xPt, yPt, dPt, 'd');

Area-Based Matching
Window size tradeoff:
- Larger windows are more unique
- Smaller windows are less likely to cross discontinuities
Similarity measures:
- CC (cross-correlation)
- SSD (sum of squared differences) - SSD is equivalent to CC
- SAD (sum of absolute differences)
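The core of area-based matching is a loop over candidate disparities, scoring a fixed-size window at each one. A minimal NumPy sketch of SSD matching along a single scanline (in Python rather than the course's MATLAB; function and variable names are mine, and the synthetic images at the bottom are made up for illustration):

```python
import numpy as np

def match_scanline(left, right, y, x, w, d_max):
    """Disparity of left-image pixel (x, y) by SSD window matching.

    Compares a (2w+1) x (2w+1) patch of the left image against
    horizontally shifted patches on the same row of the right image.
    """
    T = left[y-w:y+w+1, x-w:x+w+1].astype(float)
    best_d, best_ssd = 0, np.inf
    for d in range(0, d_max + 1):        # disparity d = x_left - x_right >= 0
        xr = x - d
        if xr - w < 0:                   # candidate window falls off the image
            break
        patch = right[y-w:y+w+1, xr-w:xr+w+1].astype(float)
        ssd = np.sum((T - patch) ** 2)   # use np.abs(...).sum() for SAD instead
        if ssd < best_ssd:
            best_ssd, best_d = ssd, d
    return best_d

# Tiny synthetic test: the right image is the left shifted 3 px to the
# left, so the true disparity is 3 wherever the window fits.
rng = np.random.default_rng(1)
left = rng.random((20, 40))
right = np.roll(left, -3, axis=1)
print(match_scanline(left, right, y=10, x=20, w=4, d_max=8))
```

The window size w is exactly the tradeoff the slide describes: a larger w makes the SSD minimum more distinctive on textured regions, but a window that straddles a depth discontinuity mixes two disparities and biases the match.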

Additional Notes
- Stereo vision website: http://vision.middlebury.edu/stereo
- Example commercial system: http://www.ptgrey.com