CEng Computational Vision

Similar documents
Stereo. Outline. Multiple views 3/29/2017. Thurs Mar 30 Kristen Grauman UT Austin. Multi-view geometry, matching, invariant features, stereo vision

CS4495/6495 Introduction to Computer Vision. 3B-L3 Stereo correspondence

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

Stereo: Disparity and Matching

Computer Vision Lecture 17

Computer Vision Lecture 17

Chaplin, Modern Times, 1936

Single-view 3D reasoning

Recap: Features and filters. Recap: Grouping & fitting. Now: Multiple views 10/29/2008. Epipolar geometry & stereo vision. Why multiple views?

Stereo. 11/02/2012 CS129, Brown James Hays. Slides by Kristen Grauman

Final project bits and pieces

Fundamental matrix. Let p be a point in left image, p in right image. Epipolar relation. Epipolar mapping described by a 3x3 matrix F

EECS 442 Computer vision. Stereo systems. Stereo vision Rectification Correspondence problem Active stereo vision systems

Epipolar Geometry and Stereo Vision

Lecture 10: Multi view geometry

Multiple View Geometry

BIL Computer Vision Apr 16, 2014

Lecture 10: Multi-view geometry

Stereo vision. Many slides adapted from Steve Seitz

Epipolar Geometry and Stereo Vision

Dense 3D Reconstruction. Christiano Gava

Mosaics wrapup & Stereo

Dense 3D Reconstruction. Christiano Gava

There are many cues in monocular vision which suggests that vision in stereo starts very early from two similar 2D images. Lets see a few...

Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision

Depth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth

Stereo CSE 576. Ali Farhadi. Several slides from Larry Zitnick and Steve Seitz

CS 2770: Intro to Computer Vision. Multiple Views. Prof. Adriana Kovashka University of Pittsburgh March 14, 2017

Camera Geometry II. COS 429 Princeton University

Spatial track: range acquisition modeling

Shadows in the graphics pipeline

Colorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.

Multi-view stereo. Many slides adapted from S. Seitz

Contexts and 3D Scenes

Lecture 14: Computer Vision

Passive 3D Photography

Lecture 9 & 10: Stereo Vision

Epipolar Geometry and Stereo Vision

Contexts and 3D Scenes

Image Rectification (Stereo) (New book: 7.2.1, old book: 11.1)

More Single View Geometry

Multiple View Geometry

Binocular stereo. Given a calibrated binocular stereo pair, fuse it to produce a depth image. Where does the depth information come from?

Automatic Photo Popup

Perspective projection. A. Mantegna, Martyrdom of St. Christopher, c. 1450

Correspondence and Stereopsis. Original notes by W. Correa. Figures from [Forsyth & Ponce] and [Trucco & Verri]

Epipolar Geometry and Stereo Vision

More Single View Geometry

Stereo. Many slides adapted from Steve Seitz

A virtual tour of free viewpoint rendering

Lecture 6 Stereo Systems Multi-view geometry

Image Based Reconstruction II

Project 4 Results. Representation. Data. Learning. Zachary, Hung-I, Paul, Emanuel. SIFT and HoG are popular and successful.

Recap from Previous Lecture

Lecture 14: Basic Multi-View Geometry

EE795: Computer Vision and Intelligent Systems

CS 558: Computer Vision 13 th Set of Notes

Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923

Stereo Vision. MAN-522 Computer Vision

What have we leaned so far?

Announcements. Stereo Vision Wrapup & Intro Recognition

Step-by-Step Model Buidling

More Single View Geometry

Computer Vision CS 776 Fall 2018

Optimizing Monocular Cues for Depth Estimation from Indoor Images

Stereo imaging ideal geometry

6.819 / 6.869: Advances in Computer Vision Antonio Torralba and Bill Freeman. Lecture 11 Geometry, Camera Calibration, and Stereo.

Single-view 3D Reconstruction

Lecture 6 Stereo Systems Multi- view geometry Professor Silvio Savarese Computational Vision and Geometry Lab Silvio Savarese Lecture 6-24-Jan-15

Stereo II CSE 576. Ali Farhadi. Several slides from Larry Zitnick and Steve Seitz

Perception, Part 2 Gleitman et al. (2011), Chapter 5

Correcting User Guided Image Segmentation

Project 2 due today Project 3 out today. Readings Szeliski, Chapter 10 (through 10.5)

Project 3 code & artifact due Tuesday Final project proposals due noon Wed (by ) Readings Szeliski, Chapter 10 (through 10.5)

Finally: Motion and tracking. Motion 4/20/2011. CS 376 Lecture 24 Motion 1. Video. Uses of motion. Motion parallax. Motion field

Multi-stable Perception. Necker Cube

Structure from motion

CMPSCI 670: Computer Vision! Image formation. University of Massachusetts, Amherst September 8, 2014 Instructor: Subhransu Maji

Stereo Epipolar Geometry for General Cameras. Sanja Fidler CSC420: Intro to Image Understanding 1 / 33

On-line and Off-line 3D Reconstruction for Crisis Management Applications

Camera Calibration. Schedule. Jesus J Caban. Note: You have until next Monday to let me know. ! Today:! Camera calibration

Camera Calibration. COS 429 Princeton University

Basic distinctions. Definitions. Epstein (1965) familiar size experiment. Distance, depth, and 3D shape cues. Distance, depth, and 3D shape cues

Structure from motion

Multi-View Stereo for Static and Dynamic Scenes

Structure from Motion

1 Projective Geometry

Think-Pair-Share. What visual or physiological cues help us to perceive 3D shape and depth?

CS5670: Computer Vision

Scene Modeling for a Single View

DEPTH AND GEOMETRY FROM A SINGLE 2D IMAGE USING TRIANGULATION

Multi-View Geometry (Ch7 New book. Ch 10/11 old book)

EE795: Computer Vision and Intelligent Systems

Ninio, J. and Stevens, K. A. (2000) Variations on the Hermann grid: an extinction illusion. Perception, 29,

Stereo Wrap + Motion. Computer Vision I. CSE252A Lecture 17

Visualization 2D-to-3D Photo Rendering for 3D Displays

Today. Stereo (two view) reconstruction. Multiview geometry. Today. Multiview geometry. Computational Photography

Why is computer vision difficult?

More Single View Geometry

Transcription:

CEng 583 - Computational Vision 2011-2012 Spring Week 4 18 th of March, 2011

Today 3D Vision Binocular (Multi-view) cues: Stereopsis Motion Monocular cues Shading Texture Familiar size etc. "God must have loved depth cues, for He made so many of them. -- (Yonas & Ganrud, 1985)

Binocular Cues: Stereopsis

Depth with stereo: basic idea scene point optical center image plane Source: Steve Seitz

Depth with stereo: basic idea Basic Principle: Triangulation Gives reconstruction as intersection of two rays Requires camera pose (calibration) point correspondence Source: Steve Seitz

The Problem Picture: http://www.imec.be/scientificreport/sr2007/html/1384302.html

The Problem Left View Right View Disparity Map

The Problem Calibration If you are interested in 3D reconstruction or utilizing the epipolar line Matching Computing Similarities Finding the best match for each pixel/feature Gives us the disparities 3D Reconstruction

Correspondence Problem How can we match pixels? Local versus Global Matching Especially homogeneous ones? What if we cannot find a match? Interpolation, Filling-in (Barrow&Tenenbaum, 1981)

Stereo correspondence constraints Trevor Darrell

Stereo correspondence constraints Geometry of two views allows us to constrain where the corresponding pixel for some image point in the first view must occur in the second view. epipolar line epipolar plane epipolar line Epipolar constraint: Why is this useful? Reduces correspondence problem to 1D search along conjugate epipolar lines Adapted from Steve Seitz

Stereo image rectification http://homepages.inf.ed.ac.uk/rbf/cvonline/local_copies/fusiello/tutorial.html

Stereo image rectification: example Source: Alyosha Efros

Correspondence problem Beyond the hard constraint of epipolar geometry, there are soft constraints to help identify corresponding points Similarity Uniqueness Ordering Disparity gradient To find matches in the image pair, we will assume Most scene points visible from both views Image regions for the matches are similar in appearance Grauman

Correspondence problem Neighborhood of corresponding points are similar in intensity patterns. Source: Andrew Zisserman

Computing Similarity

Correlation-based window matching Source: Andrew Zisserman

Dense correspondence search For each epipolar line For each pixel / window in the left image compare with every pixel / window on same epipolar line in right image pick position with minimum match cost (e.g., SSD, correlation) Adapted from Li Zhang Grauman

Effect of window size Source: Andrew Zisserman Grauman

Effect of window size W = 3 W = 20 Want window large enough to have sufficient intensity variation, yet small enough to contain only pixels with about the same disparity. Figures from Li Zhang Grauman

Uniqueness For opaque objects, up to one match in right image for every point in left image Figure from Gee & Cipolla 1999 Grauman

Ordering constraint Points on same surface (opaque object) will be in same order in both views Figure from Gee & Cipolla 1999 Grauman

Ordering constraint Won t always hold, e.g. consider transparent object, or an occluding surface Figures from Forsyth & Ponce Grauman

Grouping Constraint Pugeault et al., 2006; 2008.

Disparity gradient Assume piecewise continuous surface, so want disparity estimates to be locally smooth Figure from Gee & Cipolla 1999 Grauman

intensity Scanline stereo Try to coherently match pixels on the entire scanline Different scanlines are still optimized independently Left image Right image Grauman

Coherent stereo on 2D grid Scanline stereo generates streaking artifacts Can t use dynamic programming to find spatially coherent disparities/ correspondences on a 2D grid Grauman

As energy minimization I 1 I 2 D W 1 (i ) W 2 (i+d(i )) D(i ) E Edata ( I1, I2, D) Esmooth ( D) E W ( i) W ( i D( i 2 data 1 2 )) i E smooth ) neighborsi, j D( i) D( j Grauman

Examples

Grauman

Stereo vision ~6cm ~50cm After 30 feet (10 meters) disparity is quite small and depth from stereo is unreliable Slide: A. Torralba

Multibaseline Stereo Basic Approach Choose a reference view Use your favorite stereo algorithm BUT replace two-view SSD with SSD over all baselines Limitations Must choose a reference view Visibility: select which frames to match [Kang, Szeliski, Chai, CVPR 01] Szeliski

Active stereo with structured light Li Zhang s one-shot stereo camera 1 camera 1 projector projector camera 2 Project structured light patterns onto the object simplifies the correspondence problem Szeliski

http://vision.middlebury.edu/stereo/

Problems with Stereo Calibration Matching is difficult. Deciding on what to match: Pixels vs. features. How to match: Local vs. global. Accuracy of depth is limited by the baseline.

Further Reading

Human Stereo Vision: Fixation, convergence Grauman

Human stereopsis: disparity d=0 Disparity: d = r-l = D-F. Adapted from M. Pollefeys

Do you have stereo vision? http://www.vision3d.com/frame.html

Binocular Cues: Motion

Depth from optical flow http://cns.bu.edu/vislab/projects/buk/ http://www.pages.drexel.edu/~weg22/opticflow.html

Structure from motion Given: m images of n fixed 3D points x ij = P i X j, i = 1,, m, j = 1,, n Problem: estimate m projection matrices P i and n 3D points X j from the mn correspondences x ij X j x 1j x 3j P 1 x 2j Lazebnik P 2 P 3

Bundle adjustment Non-linear method for refining structure and motion Minimizing reprojection error m E( P, X) D n i 1 j 1 2 x, P X ij i j X j P 1 X j x 1j x 3j P 1 P 2 X j x 2j P 3 X j Lazebnik P 2 P 3

http://grail.cs.washington.edu/rome/

Problems with motion Structure from optic flow: Estimation of optic flow is not easy: Flow field is usually over-smooth, noisy and incomplete. Gives a rough estimate only. Structure from Motion: Requires too many views/frames Matching is now more difficult due to many views Illumination becomes a bigger problem

Monocular Cues An important fraction of people don t use stereo vision.

Monocular cues

No news is good news [W.E.L. Grimson] No contrast in 2D means continuity in 3D Utilized a lot in surface interpolation & dense stereo methods. Quantified & extended in (Kalkan et al., 2006)

Examples for monocular cues

Monocular cues to depth Relative depth cues: provide relative information about depth between elements in the scene Absolute depth cues: (assuming known camera parameters) these cues provide information about the absolute depth between the observer and elements of the scene Slide: A. Torralba

Relative depth cues Slide: A. Torralba Simple and powerful cue, but hard to make it work in practice

Slide: A. Torralba Interposition / occlusion

Texture Gradient A Witkin. Recovering Surface Shape and Orientation from Texture (1981) Slide: A. Torralba

Illumination Shading Shadows Inter-reflections Slide: A. Torralba

Shading Based on 3 dimensional modeling of objects in light, shade and shadows. Source: A. Torralba

Larry Davis, Ramani Duraiswami, Daniel DeMenthon, and Cornelia Fermüller

Shadows Slide by Steve Marschner http://www.cs.cornell.edu/courses/cs569/2008sp/schedule.stm

Atmospheric perspective Far objects: Bluish Lower contrast http://encarta.msn.com/medias_761571997/perception_(psychology).html

Predicting Depth from Existing Depth Combination of different depth cues.

Depth Prediction from Edges Kalkan et al., 2008.

Depth Prediction from Edges Kalkan et al., 2008.

Depth Prediction from Edges Kalkan et al., 2008.

Depth Prediction from Edges Kalkan et al., 2008.

Learning Monocular Cues from Labeled Data

Learn to Estimate Surface Orientations Learn structure of the world from labeled examples Slides by Efros

Label Geometric Classes Goal: learn labeling of image into 7 Geometric Classes: Support (ground) Vertical Planar: facing Left ( ), Center ( ), Right ( ) Non-planar: Solid (X), Porous or wiry (O) Sky Slides by Efros

What cues to use? Color, texture, image location Vanishing points, lines Slides by Efros Texture gradient

The General Case (outdoors) Typical outdoor photograph off the Web Got 300 images using Google Image Search keyboards: outdoor, scenery, urban, etc. Certainly not random samples from world 100% horizontal horizon 97% pixels belong to 3 classes -- ground, sky, vertical (gravity) Camera axis usually parallel to ground plane Still very general dataset! Slides by Efros

Let s use many weak cues Material Image Location Perspective Slides by Efros

Image Segmentation Naïve Idea #1: segment the image Chicken & Egg problem Naïve Idea #2: multiple segmentations Decide later which segments are good Slides by Efros

Estimating surfaces from segments We want to know: Is this a good (coherent) segment? P(good segment data) If so, what is the surface label? P(label good segment, data) Learn these likelihoods from training images Slides by Efros

Labeling Segments For each segment: - Get P(good segment data) P(label good segment, data) Slides by Efros

Image Labeling Labeled Segmentations Labeled Pixels Slides by Efros

No Hard Decisions Support Vertical Sky V-Left V-Center V-Right V-Porous V-Solid

Labeling Results Input image Ground Truth Our Result Slides by Efros

Reasoning about spatial relationships between objects Ballard & Brown, 1982 Freeman, 1974 Guzman, 1969 A. Torralba

Scene layout assumptions Assumption: objects stand on ground plane A. Torralba

Recovering scene geometry Polygon types Ground Standing Attached Edge types Contact Attached Occluded Camera parameters A. Torralba

Recovering scene geometry Polygon types Ground Standing Attached Edge types Contact Attached Occluded Camera parameters A. Torralba

Relationships between polygons Part-of Attached Standing / Ground / Attached Supported-by Standing Ground A. Torralba

Recovering scene geometry Polygon types Ground Standing Attached Edge types Contact Attached Occluded Camera parameters A. Torralba

Edge types Ground and attached objects have attached edges Standing objects can have contact or occluding edges Cues for contact edges: Orientation Proximity to ground Length A. Torralba

Tool went online July 1st, 2005 250,000 object annotations A. Torralba Labelme.csail.mit.edu B. Russell, A. Torralba, W.T. Freeman. IJCV 2008

Polygon quality A. Torralba

Online Hooligans Do not try this at home A. Torralba

Absolute (monocular) depth cues Are there any monocular cues that can give us absolute depth from a single image? Source: A. Torralba

Familiar size Which object is closer to the camera? How close? Source: A. Torralba

Familiar size Apparent reduction in size of objects at a greater distance from the observer Size perspective is thought to be conditional, requiring knowledge of the objects. Source: A. Torralba

Distance from the horizon line Based on the tendency of objects to appear nearer the horizon line with greater distance to the horizon. Objects approach the horizon line with greater distance from the viewer. Source: A. Torralba

Moon illusion Ebbinghaus illusion http://en.wikipedia.org/wiki/moon_illusion Adapted from: A. Torralba

Relative height The object closer to the horizon is perceived as farther away, and the object further from the horizon is perceived as closer If you know camera parameters: height of the camera, then we know real depth Source: A. Torralba

Object Size in the Image Image World Slide by Derek Hoiem

A. Torralba

A. Torralba

A. Torralba

Camera parameters Assume flat ground plane camera roll is negligible (consider pitch only) Camera parameters: height and orientation Slide from J-F Lalonde

Camera parameters t v t-b X = v-b C b X World object height (in meters) C World camera height (in meters) A. Torralba

Camera parameters Human height distribution 1.7 +/- 0.085 m (National Center for Health Statistics) Car height distribution 1.5 +/- 0.19 m (automatically learned) Slide from J-F Lalonde

Object heights Database image Pixel heights Real heights Slide from J-F Lalonde

Depth from Vanishing Lines

Three-dimensional reconstruction from single and multiple images. Antonio Criminisi Microsoft Research, Cambridge, UK

Visual cues Vanishing line Vertical vanishing point (at infinity) Vanishing point Vanishing point Source: A. Criminisi

Visual cues Masaccio s Trinity Source: A. Criminisi vanishing line (horizon) vanishing point

Source: A. Criminisi

Measuring heights in real photos Problem: How tall is this person? 185.3 cm reference Source: A. Criminisi

Assessing geometric accuracy Problem: Are the heights of the two groups of people consistent with each other? Piero della Francesca, Flagellazione di Cristo, c.1460, Urbino Measuring relative heights Source: A. Criminisi

Problems with Monocular Depth Cues They provide relative information. The ones that provide absolute information require a reference. What features/visual-information to investigate? Usually hand-designed. How can we also learn the features that lead to monocular cues? One cue is not sufficient. Different cues should be combined.

What did I skip? Shape from silhouette. The details of most of the monocular cues (i.e., shading, shadow, occlusion, etc.). Reconstruction from disparity, especially for features like edges and corners.

Reading I will supply material for: Stereo Depth from motion Monocular cues