
Multi-stable Perception Necker Cube

Spinning dancer illusion, Nobuyuki Kayahara

Multiple view geometry: stereo vision, epipolar geometry, depth map extraction. Figures: Lowe; Hartley and Zisserman.

Essential matrix (Longuet-Higgins, 1981). A 3D point $X$ projects to $\hat{x}$ and $\hat{x}'$ in the two views, and $\hat{x}' \cdot [t \times (R\hat{x})] = 0$, i.e., $\hat{x}'^T E \hat{x} = 0$ with $E = [t]_\times R$. $E$ is a 3x3 matrix which relates corresponding pairs of normalized homogeneous image points across pairs of images for calibrated cameras (known $K$). Estimates relative position/orientation. Note: $[t]_\times$ is the matrix representation of the cross product.

Fundamental matrix for the uncalibrated case: for a 3D point $X$ projecting to $x$ and $x'$, $x'^T F x = 0$ with $F = K'^{-T} E K^{-1}$. $Fx$ is the epipolar line $l'$ associated with $x$; $F^T x'$ is the epipolar line $l$ associated with $x'$. $F$ is singular (rank two): $\det(F) = 0$. $Fe = 0$ and $F^T e' = 0$ (the nullspace of $F$ is the epipole $e$; the nullspace of $F^T$ is $e'$). $F$ has seven degrees of freedom: 9 entries, but defined only up to scale and constrained by $\det(F) = 0$.
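As a concrete check of these relations, here is a small NumPy sketch (the intrinsics K, rotation R, and translation t are made-up values) that builds $E = [t]_\times R$, derives $F = K^{-T} E K^{-1}$ for identical intrinsics in both views, and verifies the epipolar constraint for one synthetic point pair:

```python
import numpy as np

def skew(t):
    """Matrix representation [t]_x of the cross product with t."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Illustrative two-view setup (all numbers are made up for the demo).
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])          # shared intrinsics for both cameras
R = np.eye(3)                            # relative rotation
t = np.array([1.0, 0.0, 0.0])            # relative translation (baseline)

E = skew(t) @ R                                   # essential matrix E = [t]_x R
F = np.linalg.inv(K).T @ E @ np.linalg.inv(K)     # F = K^{-T} E K^{-1}

# Project one 3D point into both views (homogeneous pixel coordinates).
X = np.array([0.5, 0.2, 4.0])
x1 = K @ X;           x1 /= x1[2]
x2 = K @ (R @ X + t); x2 /= x2[2]

print(x2 @ F @ x1)                 # epipolar constraint x2^T F x1 ~ 0 (round-off)
l2 = F @ x1                        # epipolar line in image 2 associated with x1
print(x2 @ l2)                     # the corresponding point lies on that line
print(np.linalg.matrix_rank(F))    # F is singular: rank 2
```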

VLFeat's 800 most confident matches among 10,000+ local features.

RANSAC algorithm: 1. Sample (randomly) the minimal number of points required to fit the model (s = 2). 2. Solve for model parameters using the sample. 3. Score by the fraction of inliers within a preset threshold of the model. Repeat 1-3 until the best model is found with high confidence.
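A minimal sketch of this loop for the simplest case on the slide, fitting a 2D line from s = 2 samples (for the fundamental matrix the same loop would sample 7 or 8 correspondences instead); all names and numbers below are illustrative:

```python
import numpy as np

def ransac_line(points, n_iters=500, thresh=1.0, seed=None):
    """Minimal RANSAC sketch for 2D line fitting (s = 2 points per sample)."""
    rng = np.random.default_rng(seed)
    best_inliers, best_model = np.zeros(len(points), bool), None
    for _ in range(n_iters):
        # 1. Sample the minimal number of points needed to fit the model.
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        # 2. Solve for the model: line a*x + b*y + c = 0 with unit normal (a, b).
        a, b = q[1] - p[1], p[0] - q[0]
        norm = np.hypot(a, b)
        if norm < 1e-12:
            continue
        a, b = a / norm, b / norm
        c = -(a * p[0] + b * p[1])
        # 3. Score by the fraction of inliers within the distance threshold.
        dist = np.abs(points @ np.array([a, b]) + c)
        inliers = dist < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (a, b, c)
    return best_model, best_inliers

# Toy data: noisy points on y = 0.5x + 2, plus a few outliers.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
pts = np.column_stack([x, 0.5 * x + 2 + rng.normal(0, 0.2, 50)])
pts[:5] += rng.uniform(-8, 8, (5, 2))
model, inliers = ransac_line(pts)
print(model, inliers.sum(), "inliers")
```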

Epipolar lines

Keep only the matches that are inliers with respect to the best fundamental matrix.

Dense correspondence problem: multiple match hypotheses satisfy the epipolar constraint, but which is correct? Figure from Gee & Cipolla 1999.

Shortest paths for scan-line stereo: match the left and right scanlines one-to-one while allowing for left and right occlusions. Can be implemented with dynamic programming (Ohta & Kanade '85; Cox et al. '96; Intille & Bobick '01). Slide credit: Y. Boykov.

Effect of window size (W = 3 vs. W = 20): want a window large enough to have sufficient intensity variation, yet small enough to contain only pixels with about the same disparity. Figures from Li Zhang.
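To make the window trade-off concrete, here is a naive SSD block-matching sketch for rectified grayscale images (function and parameter names are my own; real systems add the optimizations discussed on these slides):

```python
import numpy as np

def block_matching_disparity(left, right, max_disp=32, win=7):
    """Naive SSD block matching along scanlines of a rectified stereo pair."""
    half = win // 2
    H, W = left.shape
    disparity = np.zeros((H, W), dtype=np.float32)
    for y in range(half, H - half):
        for x in range(half, W - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
            best_d, best_ssd = 0, np.inf
            # Search along the same scanline in the right image (epipolar constraint).
            for d in range(0, min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1].astype(np.float32)
                ssd = np.sum((patch - cand) ** 2)
                if ssd < best_ssd:
                    best_ssd, best_d = ssd, d
            disparity[y, x] = best_d
    return disparity
```

Increasing `win` smooths the disparity map (more intensity variation per window) but blurs depth discontinuities, which is exactly the W = 3 vs. W = 20 trade-off shown in the figures.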

Motion and Gestalt laws of grouping Gestalt psychology (Max Wertheimer, 1880-1943)

Motion and perceptual organization plus closure, continuation, good form Gestalt psychology (Max Wertheimer, 1880-1943)

Motion and perceptual organization

Motion and perceptual organization Sometimes motion is the only cue

Motion and perceptual organization Even impoverished motion data can evoke a strong percept

Motion and perceptual organization Even impoverished motion data can evoke a strong percept

Video: a sequence of frames captured over time; a function of space (x, y) and time (t).

Motion applications: background subtraction, shot boundary detection, motion segmentation (segment the video into multiple coherently moving objects).
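As a concrete illustration of the first application, a minimal background-subtraction sketch based on a running-average background model (the function name, update rate, and threshold are my own illustrative choices):

```python
import numpy as np

def background_subtraction(frames, alpha=0.05, thresh=25):
    """Running-average background model; yields a foreground mask per frame.

    frames: iterable of grayscale frames as 2D arrays.
    alpha:  background update rate; thresh: foreground intensity threshold.
    """
    background = None
    for frame in frames:
        f = frame.astype(np.float32)
        if background is None:
            background = f.copy()
        mask = np.abs(f - background) > thresh              # moving pixels
        background = (1 - alpha) * background + alpha * f   # slowly adapt
        yield mask
```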

Mosaicing (Michal Irani, Weizmann)

Mosaicing (Michal Irani, Weizmann)

Mosaicing for panoramas on smartphones: left-to-right sweep of the video camera; compare frames t, t+1, t+3, t+5, using only a small overlap for efficiency.

Mosaicing for Panoramas on Smartphones

Mosaicing for Panoramas on Smartphones

Mosaicing for Panoramas on Smartphones

Mosaicing for Panoramas on Smartphones
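The alignment step underlying such a sweep can be sketched with off-the-shelf feature matching and a RANSAC homography; the following uses OpenCV and is only an illustration of the idea, not the smartphone pipeline shown on the slides:

```python
import cv2
import numpy as np

def stitch_pair(frame_a, frame_b):
    """Warp frame_b into frame_a's coordinate frame via a RANSAC homography."""
    orb = cv2.ORB_create(2000)
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])
    H, inliers = cv2.findHomography(pts_b, pts_a, cv2.RANSAC, 3.0)
    h, w = frame_a.shape[:2]
    canvas = cv2.warpPerspective(frame_b, H, (2 * w, h))  # room for new content
    canvas[:, :w] = frame_a                               # paste the reference frame
    return canvas
```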

Motion estimation techniques.
Feature-based methods: extract visual features (corners, textured areas) and track them over multiple frames. Sparse motion fields, but more robust tracking. Suitable when image motion is large (tens of pixels).
Direct, dense methods: directly recover image motion at each pixel from spatio-temporal image brightness variations. Dense motion fields, but sensitive to appearance variations. Suitable for video and when image motion is small. Optical flow!

Computer Vision Motion and Optical Flow Many slides adapted from J. Hays, S. Seitz, R. Szeliski, M. Pollefeys, K. Grauman and others

Motion estimation: optical flow. Optic flow is the apparent motion of objects or surfaces. We will start by estimating the motion of each pixel separately, then consider the motion of the entire image.

Problem definition: optical flow. How do we estimate pixel motion from image I(x, y, t) to I(x, y, t+1)? Solve the pixel correspondence problem: given a pixel in I(x, y, t), look for nearby pixels of the same color in I(x, y, t+1).
Key assumptions:
Small motion: points do not move very far.
Color constancy: a point in I(x, y, t) looks the same in I(x, y, t+1); for grayscale images, this is brightness constancy.

Optical flow constraints (grayscale images). Let's look at these constraints more closely.
Brightness constancy constraint: $I(x, y, t) = I(x+u, y+v, t+1)$.
Small motion: $u$ and $v$ are less than 1 pixel, or smoothly varying.
Taylor series expansion of $I$:
$I(x+u, y+v) = I(x, y) + \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + [\text{higher order terms}] \approx I(x, y) + I_x u + I_y v$

Optical flow equation. Combining these two equations:
$0 = I(x+u, y+v, t+1) - I(x, y, t)$
$\approx I(x, y, t+1) + I_x u + I_y v - I(x, y, t)$
$= [I(x, y, t+1) - I(x, y, t)] + I_x u + I_y v$
$= I_t + I_x u + I_y v$
$\approx I_t + \nabla I \cdot [u\ v]^T$
(Shorthand: $I_x = \frac{\partial I}{\partial x}$, evaluated at $t$ or $t+1$.)
In the limit as $u$ and $v$ go to zero, this becomes exact: $0 = I_t + \nabla I \cdot [u\ v]^T$.
Brightness constancy constraint equation: $I_x u + I_y v + I_t = 0$
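In practice, the derivatives $I_x$, $I_y$, $I_t$ in this equation are estimated from the two frames with simple finite differences; a minimal NumPy sketch (the filter choice is an illustrative assumption, not prescribed by the slides):

```python
import numpy as np

def image_derivatives(frame0, frame1):
    """Estimate I_x, I_y, I_t from two consecutive grayscale frames."""
    f0 = frame0.astype(np.float32)
    f1 = frame1.astype(np.float32)
    # Central differences for the spatial gradients, averaged over both frames.
    Ix = 0.5 * (np.gradient(f0, axis=1) + np.gradient(f1, axis=1))
    Iy = 0.5 * (np.gradient(f0, axis=0) + np.gradient(f1, axis=0))
    # Forward difference in time: I(x, y, t+1) - I(x, y, t).
    It = f1 - f0
    return Ix, Iy, It
```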

How does this make sense? Brightness constancy constraint equation: $I_x u + I_y v + I_t = 0$. What do the static image gradients have to do with motion estimation?

The brightness constancy constraint. Can we use this equation to recover image motion $(u, v)$ at each pixel? $\nabla I \cdot [u\ v]^T + I_t = 0$, or $I_x u + I_y v = -I_t$.
How many equations and unknowns per pixel? One equation (this is a scalar equation!), two unknowns $(u, v)$.
The component of the motion perpendicular to the gradient (i.e., parallel to the edge) cannot be measured: if $(u, v)$ satisfies the equation, so does $(u+u', v+v')$ whenever $\nabla I \cdot [u'\ v']^T = 0$.
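A tiny worked instance of this under-determination, with made-up numbers: suppose at some pixel

$$\nabla I = (I_x, I_y) = (2, 0), \quad I_t = -4 \;\Rightarrow\; 2u + 0 \cdot v - 4 = 0 \;\Rightarrow\; u = 2,\; v \text{ arbitrary.}$$

Only the normal flow component $-\frac{I_t}{\|\nabla I\|^2}\nabla I = (2, 0)$ is determined by this single constraint; the component along the edge is invisible, which is exactly the aperture problem illustrated next.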

Aperture problem

Aperture problem

Aperture problem

The barber pole illusion http://en.wikipedia.org/wiki/barberpole_illusion

The barber pole illusion http://en.wikipedia.org/wiki/barberpole_illusion

Solving the ambiguity. B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674-679, 1981. How do we get more equations for a pixel? Spatial coherence constraint: assume the pixel's neighbors have the same (u, v). If we use a 5x5 window, that gives us 25 equations per pixel.

Solving the ambiguity. Least squares problem: stack the 25 constraints $I_x(p_i)\, u + I_y(p_i)\, v = -I_t(p_i)$ into an overconstrained linear system $A\, d = b$ with $d = [u\ v]^T$.

Matching patches across images: an overconstrained linear system $A\, d = b$. The least squares solution for $d$ is given by the normal equations $(A^T A)\, d = A^T b$, i.e.
$\begin{bmatrix} \sum I_x I_x & \sum I_x I_y \\ \sum I_x I_y & \sum I_y I_y \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -\begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix}$
The summations are over all pixels in the K x K window.
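A minimal sketch of this least squares solve for a single window, assuming precomputed derivative images $I_x$, $I_y$, $I_t$ (the window size and function name are my own):

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It, x, y, win=5):
    """Solve for (u, v) in one K x K window centered at (x, y).

    Stacks I_x u + I_y v = -I_t for every pixel in the window and solves
    the overconstrained system in the least squares sense: (A^T A) d = A^T b.
    """
    half = win // 2
    ix = Ix[y - half:y + half + 1, x - half:x + half + 1].ravel()
    iy = Iy[y - half:y + half + 1, x - half:x + half + 1].ravel()
    it = It[y - half:y + half + 1, x - half:x + half + 1].ravel()
    A = np.column_stack([ix, iy])               # 25 x 2 for a 5x5 window
    b = -it                                     # 25 x 1
    AtA = A.T @ A                               # 2 x 2 structure matrix
    d, *_ = np.linalg.lstsq(A, b, rcond=None)   # least squares solution d = (u, v)
    return d, AtA                               # AtA is used to judge solvability
```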

Conditions for solvability: the optimal $(u, v)$ satisfies the Lucas-Kanade equation. When is this solvable? What are good points to track?
$A^T A$ should be invertible.
$A^T A$ should not be too small due to noise: the eigenvalues $\lambda_1$ and $\lambda_2$ of $A^T A$ should not be too small.
$A^T A$ should be well-conditioned: $\lambda_1 / \lambda_2$ should not be too large ($\lambda_1$ = larger eigenvalue).
Does this remind you of anything? These are the criteria for the Harris corner detector.

Low-texture region: gradients have small magnitude, so small $\lambda_1$ and small $\lambda_2$.

Edge: large gradients, all in the same direction, so large $\lambda_1$ and small $\lambda_2$.

Highly textured region: gradients point in different directions and have large magnitudes, so large $\lambda_1$ and large $\lambda_2$.
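The three cases above can be turned into a simple trackability check on the eigenvalues of $A^T A$; a hedged sketch with arbitrary illustrative thresholds:

```python
import numpy as np

def trackability(AtA, min_eig=1e-2, max_cond=100.0):
    """Classify a window from the eigenvalues of its 2x2 structure matrix A^T A."""
    lam2, lam1 = np.sort(np.linalg.eigvalsh(AtA))   # lam1 = larger eigenvalue
    if lam1 < min_eig:
        return "flat: both eigenvalues small, flow not recoverable"
    if lam2 < min_eig or lam1 / lam2 > max_cond:
        return "edge: one dominant gradient direction, aperture problem"
    return "corner-like: well-conditioned, good point to track"
```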

The aperture problem resolved Actual motion

The aperture problem resolved Perceived motion

Errors in assumptions:
A point does not move like its neighbors: motion segmentation.
Brightness constancy does not hold: do an exhaustive neighborhood search with normalized correlation, or track features (maybe SIFT; more later).
The motion is large (larger than a pixel): 1. Non-linear: iterative refinement. 2. Local minima: coarse-to-fine estimation.

Revisiting the small motion assumption. Is this motion small enough? Probably not; it's much larger than one pixel. How might we solve this problem?

Optical flow: aliasing. Temporal aliasing causes ambiguities in optical flow because images can have many pixels with the same intensity, i.e., how do we know which correspondence is correct? For small displacements the nearest match is correct (no aliasing); for larger displacements the nearest match can be incorrect (aliasing). To overcome aliasing: coarse-to-fine estimation.

Reduce the resolution!

Coarse-to-fine optical flow estimation: in the Gaussian pyramids of image 1 and image 2, a displacement of u = 10 pixels at full resolution becomes u = 5, 2.5, and 1.25 pixels at successively coarser levels.

Coarse-to-fine optical flow estimation: run iterative Lucas-Kanade at the coarsest level of the Gaussian pyramids of image 1 and image 2, then repeatedly warp & upsample and run iterative L-K again at the next finer level.
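In practice, the warp-and-upsample loop is usually delegated to a library. A sketch using OpenCV's pyramidal iterative Lucas-Kanade tracker (this is OpenCV's implementation, not the exact pipeline drawn on the slide; the file names are placeholders):

```python
import cv2
import numpy as np

# Placeholder file names; any pair of consecutive grayscale frames works.
frame0 = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
frame1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Pick corner-like points (well-conditioned A^T A) in the first frame.
p0 = cv2.goodFeaturesToTrack(frame0, maxCorners=500, qualityLevel=0.01, minDistance=7)

# Iterative Lucas-Kanade with a 4-level Gaussian pyramid (coarse-to-fine).
p1, status, err = cv2.calcOpticalFlowPyrLK(
    frame0, frame1, p0, None,
    winSize=(21, 21), maxLevel=3,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

flow = (p1 - p0)[status.ravel() == 1]   # per-feature displacement vectors
print(flow.mean(axis=0))                # e.g. the dominant image motion
```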

Optical Flow Results * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Optical Flow Results * From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

State-of-the-art optical flow, 2009: Large displacement optical flow, Brox et al., CVPR 2009. Start with something similar to Lucas-Kanade, + gradient constancy, + energy minimization with a smoothing term, + region matching, + keypoint matching (long-range). Region-based + pixel-based + keypoint-based.

State-of-the-art optical flow, 2015: a CNN takes a pair of input frames, and the estimated flow is upsampled back to the input resolution; near state-of-the-art in terms of end-point error. Fischer et al. 2015. https://arxiv.org/abs/1504.06852

State-of-the-art optical flow, 2015: synthetic training data. Fischer et al. 2015. https://arxiv.org/abs/1504.06852

State-of-the-art optical flow, 2015 Results on Sintel Fischer et al. 2015. https://arxiv.org/abs/1504.06852

Optical flow. Definition: the apparent motion of brightness patterns in the image. Ideally, the same as the projected motion field. Take care: apparent motion can be caused by lighting changes without any actual motion. Imagine a uniform rotating sphere under fixed lighting vs. a stationary sphere under moving illumination.

Can we do more? Scene flow: combine spatial stereo & temporal constraints to recover 3D vectors of world motion, i.e., a 3D world motion vector per pixel from the stereo views at times t-1 and t.

Scene flow example for human motion Estimating 3D Scene Flow from Multiple 2D Optical Flows, Ruttle et al., 2009

Scene Flow https://www.youtube.com/watch?v=rl_tk_be6_4 https://vision.in.tum.de/research/sceneflow [Estimation of Dense Depth Maps and 3D Scene Flow from Stereo Sequences, M. Jaimez et al., TU München]