CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

Stereo Matching

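Fundamental matrix. Let p be a point in the left image and p' the corresponding point in the right image. Epipolar relation: p maps to the epipolar line l' in the right image, and p' maps to the epipolar line l in the left image. The epipolar mapping is described by a 3x3 matrix F: $l' = F p$ and $l = F^\top p'$. It follows that $p'^\top F p = 0$.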
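
As a minimal sketch of the epipolar relation, the snippet below maps a left-image point to its epipolar line and evaluates the constraint p'^T F p. The matrix `F_RECTIFIED` is an assumption chosen for illustration (an ideal rectified pair with purely horizontal camera translation); it is not a matrix from the lecture, and the helper names are hypothetical.

```python
def mat_vec(F, p):
    """Multiply a 3x3 matrix by a homogeneous 3-vector."""
    return [sum(F[r][c] * p[c] for c in range(3)) for r in range(3)]


def epipolar_line(F, p_left):
    """Line l' = F p in the right image, as coefficients (a, b, c) of ax + by + c = 0."""
    return mat_vec(F, p_left)


def epipolar_residual(F, p_left, p_right):
    """Value of p'^T F p; it is zero when the pair satisfies the epipolar constraint."""
    line = epipolar_line(F, p_left)
    return sum(p_right[i] * line[i] for i in range(3))


# Assumed F for an ideal rectified pair (camera translated along the x axis only).
F_RECTIFIED = [[0, 0, 0],
               [0, 0, -1],
               [0, 1, 0]]

p = [120, 45, 1]                                      # left-image point (x, y, 1)
print(epipolar_line(F_RECTIFIED, p))                  # [0, -1, 45], i.e. the line y = 45
print(epipolar_residual(F_RECTIFIED, p, [80, 45, 1])) # 0: same row, constraint holds
print(epipolar_residual(F_RECTIFIED, p, [80, 50, 1])) # -5: different row, constraint violated
```

For this particular F every epipolar line is a horizontal scanline, which is the situation rectification aims to produce.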

Fundamental matrix. This matrix F is called the Essential Matrix when the image intrinsic parameters are known, and the Fundamental Matrix more generally (the uncalibrated case). We can solve for F from point correspondences: each (p, p') pair gives one linear equation in the entries of F. F has 9 entries, but really only 7 or 8 degrees of freedom (8 up to scale; 7 once the rank-2 constraint is enforced). With 8 points it is simple to solve for F, but it is also possible with 7. See Marc Pollefeys's notes for a nice tutorial.
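
To make "one linear equation per correspondence" concrete, the sketch below expands p'^T F p = 0 into a row of nine coefficients multiplying the row-major entries of F. Stacking eight such rows gives the homogeneous system usually solved in the eight-point method; the solve itself (typically an SVD) is not shown. The function names and the test matrix are assumptions made for illustration.

```python
def constraint_row(p_left, p_right):
    """Coefficients of (F11, F12, F13, F21, ..., F33) in the equation p'^T F p = 0.

    p_left = (x, y) and p_right = (x', y') are pixel coordinates; the
    homogeneous coordinate is taken to be 1 for both points.
    """
    x, y = p_left
    xr, yr = p_right
    return [xr * x, xr * y, xr,
            yr * x, yr * y, yr,
            x, y, 1]


def row_residual(row, F):
    """Dot product of one constraint row with the row-major entries of F."""
    f = [F[r][c] for r in range(3) for c in range(3)]
    return sum(a * b for a, b in zip(row, f))


# Assumed F of an ideal rectified pair, used only to check the row.
F_RECTIFIED = [[0, 0, 0],
               [0, 0, -1],
               [0, 1, 0]]

row = constraint_row((120, 45), (80, 45))
print(row_residual(row, F_RECTIFIED))   # 0: this pair is consistent with F_RECTIFIED
```

Because the system is homogeneous, F can only be recovered up to scale, which is why eight equations suffice for nine unknowns.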

Stereo image rectification

Stereo image rectification. Reproject the image planes onto a common plane parallel to the line between the camera centers. Pixel motion is horizontal after this transformation. This requires two homographies (3x3 transforms), one for each input image reprojection. C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.

Rectification example

The correspondence problem Epipolar geometry constrains our search, but we still have a difficult correspondence problem.

Fundamental Matrix + Sparse correspondence

Fundamental Matrix + Dense correspondence

SIFT + Fundamental Matrix + RANSAC. Building Rome in a Day, by Sameer Agarwal, Yasutaka Furukawa, Noah Snavely, Ian Simon, Brian Curless, Steven M. Seitz, Richard Szeliski. Communications of the ACM, Vol. 54 No. 10, Pages 105-112.

Sparse to Dense Correspondence. Building Rome in a Day, by Sameer Agarwal, Yasutaka Furukawa, Noah Snavely, Ian Simon, Brian Curless, Steven M. Seitz, Richard Szeliski. Communications of the ACM, Vol. 54 No. 10, Pages 105-112.

Correspondence problem. Multiple match hypotheses satisfy the epipolar constraint, but which is correct? Figure from Gee & Cipolla 1999.

Correspondence problem. Beyond the hard constraint of epipolar geometry, there are soft constraints to help identify corresponding points: similarity, uniqueness, ordering, and disparity gradient. To find matches in the image pair, we will assume that most scene points are visible from both views and that image regions for the matches are similar in appearance.

Dense correspondence search. For each epipolar line, and for each pixel / window in the left image: compare with every pixel / window on the same epipolar line in the right image, and pick the position with minimum match cost (e.g., SSD, normalized correlation). Adapted from Li Zhang.
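
The search loop above can be sketched on a single rectified scanline as follows. This is a minimal illustration, assuming 1D windows, an SSD cost, and the convention that left pixel x matches right pixel x - d for a non-negative disparity d; the names `half_window` and `max_disparity` are hypothetical parameters, not values from the lecture.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_scanline(left, right, half_window, max_disparity):
    """Winner-take-all disparity per left pixel along one scanline.

    Returns a list with one disparity per pixel, or None where the
    reference window does not fit inside the scanline.
    """
    n = len(left)
    disparities = []
    for x in range(n):
        if x < half_window or x >= n - half_window:
            disparities.append(None)
            continue
        ref = left[x - half_window:x + half_window + 1]
        best_d, best_cost = None, float("inf")
        for d in range(max_disparity + 1):
            xr = x - d                      # candidate position in the right scanline
            if xr - half_window < 0:
                break                       # candidate window would leave the image
            cand = right[xr - half_window:xr + half_window + 1]
            cost = ssd(ref, cand)
            if cost < best_cost:
                best_cost, best_d = cost, d
        disparities.append(best_d)
    return disparities


right = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
left = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0]      # the same bump, shifted right by 2 pixels
print(match_scanline(left, right, half_window=1, max_disparity=3)[4:7])  # [2, 2, 2]
```

Pixels in the flat parts of the scanline match equally well at several disparities, so the value returned there is arbitrary.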

Correspondence search with similarity constraint. Slide a window along the right scanline and compare the contents of that window with the reference window in the left image. Matching cost: SSD or normalized correlation. (Figure: left and right scanlines, with matching cost plotted against disparity.)
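
As a small comparison of the two matching costs, the sketch below implements a normalized correlation score next to SSD. It assumes the zero-mean variant of normalized correlation (the slide does not say which variant is meant), and the function names are illustrative.

```python
import math


def ssd(a, b):
    """Sum of squared differences; 0 means identical windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Zero-mean normalized cross-correlation in [-1, 1]; 1 is a perfect match."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0          # a flat window carries no pattern to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


window = [1, 3, 2]
brighter = [2 * v + 10 for v in window]   # same pattern under a gain and offset change
print(ssd(window, brighter))   # 434: SSD treats the windows as very different
print(ncc(window, brighter))   # 1.0: normalized correlation still sees a perfect match
```

The design trade-off is that normalization buys robustness to brightness and contrast changes between the two cameras at the price of extra computation per window.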

Correspondence search with similarity constraint. (Figure: left and right scanlines with the SSD matching cost.)

Correspondence search with similarity constraint. (Figure: left and right scanlines with the normalized correlation matching cost.)

Correspondence problem. Intensity profiles. Source: Andrew Zisserman

Correspondence problem. Neighborhoods of corresponding points are similar in intensity patterns. Source: Andrew Zisserman

Correlation-based window matching

Correlation-based window matching. Textureless regions are non-distinct, so matches there are highly ambiguous.

Effect of window size. Source: Andrew Zisserman

Effect of window size. (Figures: results for W = 3 and W = 20.) We want a window large enough to have sufficient intensity variation, yet small enough to contain only pixels with about the same disparity. Figures from Li Zhang.

Results with window search. (Figures: window-based matching with the best window size, and ground truth.)

Better solutions. Go beyond individual correspondences to estimate disparities: optimize correspondence assignments jointly, either a scanline at a time (DP) or over the full 2D grid (graph cuts).

Scanline stereo. Try to coherently match pixels on the entire scanline. Different scanlines are still optimized independently. (Figure: intensity profiles along the left image and right image scanlines.)

Shortest paths for scan-line stereo. (Figure: a path from s to t through the grid of left-scanline positions S_left against right-scanline positions S_right, passing through nodes p and q; each step is a one-to-one match, a left occlusion, or a right occlusion.) Can be implemented with dynamic programming: Ohta & Kanade 85, Cox et al. 96, Intille & Bobick, 01. Slide credit: Y. Boykov
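
A minimal sketch of the shortest-path idea, under stated assumptions: each grid step is either a one-to-one match (cost: squared intensity difference) or an occlusion that skips one pixel in one scanline (cost: a constant `occlusion_cost`). Both cost choices and all names are hypothetical simplifications and do not reproduce the exact formulations of the cited papers.

```python
def scanline_dp(left, right, occlusion_cost):
    """Minimum-cost alignment of two scanlines with matches and occlusions.

    Returns (total_cost, matches), where matches is a list of
    (left_index, right_index) pairs in increasing order.
    """
    n, m = len(left), len(right)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], move[i][0] = i * occlusion_cost, "skip_left"
    for j in range(1, m + 1):
        cost[0][j], move[0][j] = j * occlusion_cost, "skip_right"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (cost[i - 1][j - 1] + (left[i - 1] - right[j - 1]) ** 2, "match"),
                (cost[i - 1][j] + occlusion_cost, "skip_left"),    # left pixel unmatched
                (cost[i][j - 1] + occlusion_cost, "skip_right"),   # right pixel unmatched
            ]
            cost[i][j], move[i][j] = min(options)

    # Walk back from the end of both scanlines to recover the chosen matches.
    matches, i, j = [], n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "match":
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "skip_left":
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return cost[n][m], matches


total, matches = scanline_dp([10, 20, 30, 40], [20, 30, 40, 50], occlusion_cost=5)
print(total, matches)   # total cost 10; matches [(1, 0), (2, 1), (3, 2)]
```

Because the path can only move forward in both scanlines, the recovered matches automatically respect the ordering constraint.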

Coherent stereo on 2D grid. Scanline stereo generates streaking artifacts. Can't use dynamic programming to find spatially coherent disparities/correspondences on a 2D grid.

Stereo as energy minimization. What defines a good stereo correspondence? 1. Match quality: want each pixel to find a good match in the other image. 2. Smoothness: if two pixels are adjacent, they should (usually) move about the same amount.

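Stereo matching as energy minimization. (Figure: images $I_1$ and $I_2$ and disparity map $D$, showing window $W_1(i)$ in $I_1$, window $W_2(i + D(i))$ in $I_2$, and disparity $D(i)$.)

$$E = \alpha\, E_{\text{data}}(I_1, I_2, D) + \beta\, E_{\text{smooth}}(D)$$

$$E_{\text{data}} = \sum_i \big( W_1(i) - W_2(i + D(i)) \big)^2$$

$$E_{\text{smooth}} = \sum_{\text{neighbors } i,j} \rho\big( D(i) - D(j) \big)$$

Here $\alpha$ and $\beta$ weight the two terms and $\rho$ is a penalty on the disparity difference between neighbors. Energy functions of this form can be minimized using graph cuts. Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001. Source: Steve Seitz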
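
To show how such an energy scores a candidate disparity map, here is a minimal 1D sketch. It assumes single-pixel windows, an absolute-difference smoothness penalty between adjacent pixels, and weights `alpha` and `beta`; these are illustrative choices, and the snippet only evaluates the energy rather than minimizing it with graph cuts.

```python
def stereo_energy(row1, row2, disparity, alpha=1.0, beta=1.0):
    """Energy alpha * E_data + beta * E_smooth of a disparity map on one scanline.

    E_data   = sum_i (row1[i] - row2[i + D(i)])^2
    E_smooth = sum over adjacent pixels of |D(i) - D(i + 1)|
    """
    data = 0.0
    for i, d in enumerate(disparity):
        j = i + d
        if not 0 <= j < len(row2):
            raise ValueError(f"disparity {d} at pixel {i} points outside the image")
        data += (row1[i] - row2[j]) ** 2
    smooth = sum(abs(disparity[i] - disparity[i + 1])
                 for i in range(len(disparity) - 1))
    return alpha * data + beta * smooth


row1 = [1, 2, 3, 4]
row2 = [0, 1, 2, 3]                                  # row1[i] equals row2[i + 1]
print(stereo_energy(row1, row2, [1, 1, 1, 0]))       # 2.0: good matches, one disparity jump
print(stereo_energy(row1, row2, [0, 0, 0, 0]))       # 4.0: perfectly smooth but poor matches
```

Raising `beta` relative to `alpha` makes the smooth but poorly matching map more competitive, which is the trade-off the two terms encode.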

Better results. Graph cut method: Boykov et al., Fast Approximate Energy Minimization via Graph Cuts, International Conference on Computer Vision, September 1999. (Figures: graph cut result and ground truth.) For the latest and greatest: http://www.middlebury.edu/stereo/

Challenges: low-contrast, textureless image regions; occlusions; violations of brightness constancy (e.g., specular reflections); really large baselines (foreshortening and appearance change); camera calibration errors.

Active stereo with structured light. Project structured light patterns onto the object. This simplifies the correspondence problem and allows us to use only one camera. (Figure: camera and projector.) L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002

Kinect: Structured infrared light http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/

Summary. Epipolar geometry: epipoles are the intersections of the baseline with the image planes; the matching point in the second image lies on a line passing through that image's epipole; the fundamental matrix maps a point in one image to a line (its epipolar line) in the other; we can solve for F given corresponding points (e.g., interest points). Stereo depth estimation: estimate disparity by finding corresponding points along scanlines; depth is inversely proportional to disparity.
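
The inverse relation between depth and disparity can be made concrete with the standard rectified-stereo formula Z = f * B / d. The sketch below assumes a rectified pair with focal length in pixels and baseline in meters; the parameter names and numbers are illustrative, not values from the lecture.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair.

    disparity_px and focal_length_px are in pixels, baseline_m in meters,
    so the returned depth is in meters.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline_m / disparity_px


print(depth_from_disparity(40, focal_length_px=800, baseline_m=0.5))   # 10.0
print(depth_from_disparity(80, focal_length_px=800, baseline_m=0.5))   # 5.0
```

Doubling the disparity halves the depth, so a fixed one-pixel matching error costs far more depth accuracy for distant points than for near ones.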

The scale of algorithm name quality. (Figure: algorithm names placed on a scale from better to worse. Names shown: RANSAC, SIFT, Deep Learning, Optical Flow, Hough Transform, Neural Networks, Essential and Fundamental Matrix, Dynamic Programming.)

Computer Vision: Motion and Optical Flow. Many slides adapted from S. Seitz, R. Szeliski, M. Pollefeys, K. Grauman and others

Video. A video is a sequence of frames captured over time. Now our image data is a function of space (x, y) and time (t).

Motion applications: segmentation of video. Background subtraction: a static camera is observing a scene. Goal: separate the static background from the moving foreground.
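
One simple way to realize background subtraction for a static camera is sketched below: the background is estimated as the per-pixel temporal median, and pixels that differ from it by more than a threshold are marked as foreground. The median model and the `threshold` value are assumptions for illustration; the lecture does not prescribe a particular method.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel temporal median over a list of grayscale frames (lists of rows)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(frame[y][x] for frame in frames) for x in range(width)]
            for y in range(height)]


def foreground_mask(frame, background, threshold):
    """True where the frame differs from the background by more than threshold."""
    return [[abs(pixel - bg) > threshold for pixel, bg in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


static = [[10, 10, 10], [10, 10, 10]]
with_object = [[10, 200, 10], [10, 10, 10]]      # a bright object passes through one pixel
frames = [static, with_object, static]

background = estimate_background(frames)
print(foreground_mask(with_object, background, threshold=30))
# [[False, True, False], [False, False, False]]
```

The median is used here because a pixel that is covered by a moving object in only a minority of frames still recovers its background value.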

Motion applications: segmentation of video. Shot boundary detection: commercial video is usually composed of shots, or sequences showing the same objects or scene. Goal: segment video into shots for summarization and browsing (each shot can be represented by a single keyframe in a user interface). Difference from background subtraction: the camera is not necessarily stationary.
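
Since the camera may move within a shot, a per-pixel difference is a poor cue for shot boundaries. A commonly used alternative, sketched here as an assumption rather than as the lecture's method, compares intensity histograms of consecutive frames and reports a boundary when they differ strongly. The bin count and `threshold` are hypothetical parameters.

```python
def histogram(frame, bins=4, max_value=256):
    """Intensity histogram of a grayscale frame given as a list of rows."""
    counts = [0] * bins
    for row in frame:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
    return counts


def shot_boundaries(frames, threshold):
    """Indices t at which frame t is judged to start a new shot.

    The distance is the L1 histogram difference scaled to [0, 1],
    assuming all frames have the same number of pixels.
    """
    boundaries = []
    previous = histogram(frames[0])
    for t in range(1, len(frames)):
        current = histogram(frames[t])
        distance = sum(abs(a - b) for a, b in zip(previous, current)) / (2 * sum(current))
        if distance > threshold:
            boundaries.append(t)
        previous = current
    return boundaries


dark_a = [[10, 20], [30, 40]]
dark_b = [[12, 18], [33, 41]]          # same shot, slightly different pixels
bright = [[200, 210], [220, 230]]      # a cut to a much brighter scene
print(shot_boundaries([dark_a, dark_b, bright, bright], threshold=0.5))   # [2]
```

The first frame of each detected shot could then serve as the keyframe mentioned above.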

Motion applications: segmentation of video. Motion segmentation: segment the video into multiple coherently moving objects.

Motion and perceptual organization. Sometimes, motion is the only cue. Gestalt psychology (Max Wertheimer, 1880-1943).

Motion and perceptual organization. Even impoverished motion data can evoke a strong percept.

Motion and perceptual organization Experimental study of apparent behavior. Fritz Heider & Marianne Simmel. 1944

More applications of motion: segmentation of objects in space or time; estimating 3D structure; learning dynamical models (how things move); recognizing events and activities; improving video quality (motion stabilization).

Motion estimation techniques. Feature-based methods: extract visual features (corners, textured areas) and track them over multiple frames. They give sparse motion fields, but more robust tracking, and are suitable when image motion is large (10s of pixels). Direct, dense methods: directly recover image motion at each pixel from spatio-temporal image brightness variations. They give dense motion fields, but are sensitive to appearance variations, and are suitable for video and when image motion is small.
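
As a hedged illustration of a direct method, the sketch below estimates a single small shift between two 1D signals from their spatial and temporal brightness differences. It assumes brightness constancy and a linearization of the signal, which give I_t = -u I_x and the least-squares estimate u = -sum(I_x I_t) / sum(I_x^2); the function name and the test signal are hypothetical, and this is not presented as the method developed later in the course.

```python
def estimate_shift_1d(frame0, frame1):
    """Least-squares estimate of a small shift u (in pixels) between two 1D frames.

    Uses central differences for the spatial derivative I_x and the frame
    difference for the temporal derivative I_t. Positive u means the
    pattern moved toward larger x.
    """
    numerator = 0.0
    denominator = 0.0
    for x in range(1, len(frame0) - 1):
        ix = (frame0[x + 1] - frame0[x - 1]) / 2.0
        it = frame1[x] - frame0[x]
        numerator += ix * it
        denominator += ix * ix
    if denominator == 0:
        raise ValueError("no spatial gradient: the motion cannot be observed")
    return -numerator / denominator


frame0 = [2 * x + 1 for x in range(8)]            # a brightness ramp
frame1 = [2 * (x - 0.5) + 1 for x in range(8)]    # the same ramp shifted by 0.5 pixels
print(estimate_shift_1d(frame0, frame1))          # 0.5
```

The error raised for a flat signal mirrors the textureless-region problem in stereo: without intensity variation there is nothing to measure motion against.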

Motion estimation: optical flow. Optic flow is the apparent motion of objects or surfaces. We will start by estimating the motion of each pixel separately, and then consider the motion of the entire image.

To be continued