C18 Computer Vision
Lecture 7, MT 2015, Andrea Vedaldi
For slides and up-to-date information: http://www.robots.ox.ac.uk/~vedaldi/teach.html

This time...

1. Introduction; imaging geometry; camera calibration.
2. Salient feature detection: edges, lines and corners.
3. Recovering 3D from two images I: epipolar geometry.
4. Recovering 3D from two images II: stereo correspondence algorithms; triangulation.
5. Structure from Motion I: computing the fundamental matrix by matching corners; robust matching with RANSAC.
6. Structure from Motion II: SIFT; recovering egomotion.
7. Image motion: optic flow; the aperture problem; visual tracking.
8. Object recognition and detection.

Introduction

So far we have considered estimating structure and motion from a set of n images {I_1, ..., I_n}, not taken in any particular order or at any particular time. Here we analyse video data, meaning a sequence of images I(t), t ≥ 0, captured by a moving camera. In particular, the goals of this lecture are to:

1. study the properties of the projected image motion;
2. compute the optical flow as an approximation of the image motion;
3. design trackers to track objects in videos.

Outline

- Optical flow and the aperture problem
- Lucas-Kanade tracking
- Tracking with models: RaPiD

Projected motion

[Figure: camera geometry: optic centre, focal length f, image plane, axis ẑ; scene point X moving with translation V and rotation Ω.]

A scene point X = (X, Y, Z) projects to the image point

    x = (f/Z) X.                                                        (1)

Differentiating w.r.t. time gives the projected motion

    ẋ = (f/Z) Ẋ − (f Ż/Z²) X.

But the scene motion consists of a translation V and a rotation Ω relative to the camera, so

    Ẋ = (Ẋ, Ẏ, Ż) = V + Ω × X.

Hence, writing Ż = Ẋ·ẑ and using (1):

    ẋ = f (V + Ω × X)/Z − (ẑ·(V + Ω × X)/Z²) f X
      = (f/Z) V + Ω × x − (V·ẑ/Z) x − ((Ω × x)·ẑ/f) x.

(continued) Writing out the last term, with x treated as the 3-vector (x, y, f):

    ẋ = (f/Z) V + Ω × x − (V_z/Z) x − ((Ω_x y − Ω_y x)/f) x.

Note that:

- Rotation: the depth Z does not appear with the rotation Ω. The rotational component of motion is independent of the scene depth.
- Translation: the depth Z appears with the translation V as V/Z, so there is a depth/speed scaling ambiguity. This means that one cannot tell whether something is large, far away and moving quickly, or small, near to and moving slowly.

Example: a simple motion alarm

In this example, the camera is rotating around a fixed position, hence Ω ≠ 0 but V = 0. There is also an independently moving object in the scene. The odometry from the camera axes provides Ω. Subtracting the motion due to the rotation (which is independent of the scene depth Z) reveals the moving object:

    ẋ_rotational = Ω × x − ((Ω × x)·ẑ/f) x.

[Figure: still from the sequence; measured optical flow; flow after subtracting the rotational component.]
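To make this concrete, here is a minimal NumPy sketch (not from the lecture; the function name, image size, focal length and Ω values are illustrative) that evaluates the depth-independent rotational flow over a pixel grid, ready to be subtracted from a measured flow field as the motion alarm does:

```python
import numpy as np

def rotational_flow(x, y, omega, f):
    """Depth-independent rotational motion field at image points (x, y):
    the 2D components of  Omega x x  -  ((Omega_x*y - Omega_y*x)/f) x,
    with x treated as the 3-vector (x, y, f)."""
    ox, oy, oz = omega
    u = -ox * x * y / f + oy * (f + x**2 / f) - oz * y
    v = -ox * (f + y**2 / f) + oy * x * y / f + oz * x
    return u, v

# Motion alarm: with V = 0, any flow left after subtracting the rotational
# component (Omega taken from odometry) must come from independent motion.
f = 500.0                                        # focal length in pixels (illustrative)
y, x = np.mgrid[-240:240, -320:320].astype(float)
u_rot, v_rot = rotational_flow(x, y, omega=(0.01, 0.02, 0.0), f=f)
# residual = np.hypot(u_meas - u_rot, v_meas - v_rot); alarm = residual > tau
```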

Approaching and receding motion

In this example, the camera is approaching or receding but not rotating, hence V_x = V_y = 0, V_z ≠ 0 and Ω = 0. The projected motion is

    ẋ = (f/Z) V − (V_z/Z) x   ⟹   (ẋ, ẏ)ᵀ = −(V_z/Z) (x, y)ᵀ.

The projected motion is a divergent/convergent vector field with a stationary point in the middle of the image: at x = 0, ẋ = 0. The field diverges when approaching (V_z < 0) and converges when receding (V_z > 0).

Ego-motion and the focus of expansion

Generalising the previous example, if V ≠ 0 but Ω = 0, the motion field diverges from (or converges to) a point called the focus of expansion. To find it, set ẋ = 0:

    0 = ẋ = (f/Z) V − (V_z/Z) x   ⟹   x = (f/V_z) V.

Note: this equation says that you can treat the translational velocity V as a ray and find where it hits the image plane. This point in the image is called the focus of expansion (cf. the epipole).

Divergence and time to contact

Suppose again that Ω = 0 and that the scene depth Z varies smoothly as a function of x and y. The projected motion is

    (ẋ, ẏ)ᵀ = (1/Z) (f V_x − x V_z, f V_y − y V_z)ᵀ.

The divergence of the motion field ẋ(x, y) is

    ∇·ẋ = ∂ẋ/∂x + ∂ẏ/∂y
        = −(1/Z²)(∂Z/∂x)(f V_x − x V_z) − (1/Z²)(∂Z/∂y)(f V_y − y V_z) − 2 V_z/Z.

If the camera is purely approaching/receding, V_x = V_y = 0, and at x = y = 0

    ∇·ẋ = −2 V_z/Z = −2/t_c,

where t_c = Z/V_z is the time to contact. If the scene is a fronto-parallel plane, then ∂Z/∂x = ∂Z/∂y = 0 anywhere in the image, and so ∇·ẋ = −2/t_c everywhere.

Example: segmentation and motion alarm

[Figure: example sequence showing motion-based segmentation and the motion alarm.]
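A tiny numeric illustration with made-up values (not from the slides), assuming the conventions above: the focus of expansion is where the translation ray pierces the image plane, and the time to contact comes straight from the measured divergence.

```python
import numpy as np

f = 1.0
V = np.array([0.2, 0.0, -1.0])     # camera translation; V_z < 0: approaching

# Focus of expansion: x = (f/V_z) V, the translation ray hitting the image plane.
foe = f * V[:2] / V[2]             # -> array([-0.2, -0.])

# For pure approach the divergence at the image centre is -2 V_z / Z = -2/t_c,
# so a measured divergence of 0.4 per second gives ~5 s to contact (the sign
# convention depends on the chosen camera axis orientation).
div = 0.4
t_c = 2.0 / abs(div)
```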

Computing the visual motion

We have considered some uses of projected motion, but not yet discussed how to obtain the motion field from imagery. There are three ways:

1. Token based: compute corners in successive frames and match them using proximity and/or similarity scores.
2. Gradient based: work directly from the brightness values; useful if the corner detector or matcher is unreliable.
3. Phase based: see the work of (e.g.) Heeger, Fleet and others.

We will look at (2) in some detail.

Observable visual motion and projected motion

A word of warning. Compare a rotating snooker ball under a stationary light source with a stationary ball under a moving light source:

- Highlight stationary: the scene moves, but the visual motion is zero.
- Highlight moves: the scene is still, but the highlight moves, so the visual motion is non-zero.

In general, observable motion ≠ projected motion. But often it is the best we can do (and a reasonable approximation).

Gradient-based visual motion

Regard the image as a sampling of a continuous irradiance function I = I(x, y, t), and follow a particular image patch over time. In this case x = x(t) and y = y(t), and the total derivative dI/dt must exist. Using the chain rule:

    dI/dt = ∂I/∂t + (∂I/∂x)(dx/dt) + (∂I/∂y)(dy/dt).

But if I is constant in the image patch, dI/dt = 0, so that

    0 = ∂I/∂t + (∂I/∂x)(dx/dt) + (∂I/∂y)(dy/dt).
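As a concrete starting point, a minimal sketch (illustrative; plain finite differences with wrapping borders, where real implementations smooth the images first) of the three derivative estimates that enter the brightness constraint:

```python
import numpy as np

def brightness_derivatives(I0, I1):
    """Estimate I_x, I_y by central differences on frame I0 and I_t by a
    frame difference: the three terms of  I_x u + I_y v + I_t = 0."""
    Ix = 0.5 * (np.roll(I0, -1, axis=1) - np.roll(I0, 1, axis=1))
    Iy = 0.5 * (np.roll(I0, -1, axis=0) - np.roll(I0, 1, axis=0))
    It = I1.astype(float) - I0.astype(float)
    return Ix, Iy, It
```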

Gradient-based visual motion (continued): the aperture problem

Hence, if we write

    ∇I = (∂I/∂x, ∂I/∂y)ᵀ,   µ = (dx/dt, dy/dt)ᵀ,

we arrive at the motion or brightness constraint equation

    0 = ∂I/∂t + ∇I·µ,

where µ is called the optical flow. We have not recovered µ, but only the component of µ along the direction of ∇I. This edge-normal flow is

    µ⊥ = −(∂I/∂t / |∇I|) (∇I/|∇I|) = −(∂I/∂t) ∇I / |∇I|².

This failure to recover both components is called the aperture problem. Taking gradients means the information about the intensity is local, as if seen through an aperture.

Overcoming the aperture problem

Intuition tells us that there should be no aperture problem if there is a brightness gradient in two directions, i.e. at corners. Drawing inspiration from the Plessey corner detector, proceed as follows. Consider tracking a small image patch assumed to have translational image motion, and let

    E(u, v) = Σ_{i,j} [I(x+i+u, y+j+v, t+δt) − I(x+i, y+j, t)]²
            ≈ Σ_{i,j} [I_x(x+i, y+j) u + I_y(x+i, y+j) v + I_t(x+i, y+j) δt]²

to first order. Differentiating w.r.t. u, v yields

    ∂E/∂u = 2 Σ_{i,j} I_x [I_x u + I_y v + I_t δt] = 0,
    ∂E/∂v = 2 Σ_{i,j} I_y [I_x u + I_y v + I_t δt] = 0.

Rewrite as

    [ Σ I_x²     Σ I_x I_y ] (u)        ( Σ I_x I_t )
    [ Σ I_x I_y  Σ I_y²    ] (v) = −δt ( Σ I_y I_t ),

i.e. H µ = b, where µ = (u, v)ᵀ.
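In code, the per-patch solve might look like the following sketch (the window radius and rank tolerance are illustrative choices; the rank test anticipates the discussion of H below, and δt is taken as one frame):

```python
import numpy as np

def lk_patch_flow(Ix, Iy, It, cx, cy, r=7):
    """Solve H mu = b over a (2r+1)x(2r+1) window centred at (cx, cy)."""
    win = (slice(cy - r, cy + r + 1), slice(cx - r, cx + r + 1))
    ix, iy, it = Ix[win].ravel(), Iy[win].ravel(), It[win].ravel()
    H = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])            # local autocorrelation matrix
    b = -np.array([ix @ it, iy @ it])
    if np.linalg.matrix_rank(H, tol=1e-6 * H.trace()) < 2:
        return None                               # uniform region or edge: aperture problem
    return np.linalg.solve(H, b)                  # optic flow mu = (u, v) in pixels/frame
```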

Computing the optic flow

H is the matrix describing the local autocorrelation. Thus:

- In a region of uniform brightness, H has rank 0 and we cannot solve for the optic flow.
- On an edge, H has rank 1, and we can find only the normal component of the optic flow.
- At a corner, H has full rank (i.e. 2), and we can solve the equations to find both components of the flow.

Notes: the first-order approximation used only holds for very small values of µ, |µ| < 1 pixel. To recover bigger flows, use the idea of a Gaussian pyramid: successively smooth and subsample the images, estimate the flow at the coarsest level (where it is small enough for the approximation to hold), then propagate and refine it through the finer levels, as sketched below.
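A sketch of the coarse-to-fine idea; refine_flow is a hypothetical helper standing in for the per-patch solve above, and the naive upsampling assumes even image dimensions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(img, levels=4):
    """Successively smooth and subsample by 2."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        pyr.append(gaussian_filter(pyr[-1], sigma=1.0)[::2, ::2])
    return pyr

def coarse_to_fine_flow(I0, I1, levels=4):
    """Estimate flow where |mu| < 1 px holds (coarsest level), then
    double and refine it level by level."""
    p0, p1 = gaussian_pyramid(I0, levels), gaussian_pyramid(I1, levels)
    flow = np.zeros(p0[-1].shape + (2,))
    for lvl in range(levels - 1, -1, -1):
        flow = refine_flow(p0[lvl], p1[lvl], flow)   # hypothetical LK refinement
        if lvl > 0:
            flow = 2.0 * flow.repeat(2, axis=0).repeat(2, axis=1)
    return flow
```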

Visual tracking

The aim of visual tracking is to locate the position/pose of a target in an image successively over an extended sequence of images.

Target: requires some appearance model:
- image patch, colour histogram
- deformable contours (i.e. edges)
- 3D CAD model

Position:
- image position
- image position plus deformation parameters
- 3D translation and rotation
- full body pose of an articulated object
- ...

Let's start simply, by tracking a planar image patch.

Tracking as optimisation I: the case of pure translation

[Figure: template T mapped into the image by the warp f_p.]

We seek the image position p = (t_x, t_y)ᵀ which maximises the similarity of the image to the template T. Formulate the search as an optimisation problem using the brightness constraint as the objective function. We are optimising a non-convex function, so we need to start near the solution: exhaustive search is possible but undesirable; better to do gradient descent. Suppose the starting point is (t_x, t_y)ᵀ, and minimise

    E(δt_x, δt_y) = Σ_{x,y∈W} [I(x + t_x + δt_x, y + t_y + δt_y) − T(x, y)]²

w.r.t. δt_x, δt_y, where W is the patch window.

Pure translation

Expanding to first order we obtain

    E(δt_x, δt_y) ≈ Σ_{x,y∈W} [I(x+t_x, y+t_y) + ∇Iᵀ (δt_x, δt_y)ᵀ − T(x, y)]².

Taking partials w.r.t. δt_x, δt_y yields

    2 Σ_{x,y∈W} ∇I [I(x+t_x, y+t_y) + ∇Iᵀ (δt_x, δt_y)ᵀ − T(x, y)] = 0.

Hence

    Σ_{x,y∈W} ∇I ∇Iᵀ (δt_x, δt_y)ᵀ = − Σ_{x,y∈W} ∇I [I(x+t_x, y+t_y) − T(x, y)].

Note the (unsurprising) similarity to the optic flow solution. Set t_x ← t_x + δt_x, t_y ← t_y + δt_y and iterate until the change is negligible (a code sketch follows below).

Generalisation

Pure translation can be overly restrictive, but the algorithm generalises straightforwardly. Suppose x represents a location in a template. We seek a set of parameters p such that the warp function f_p(x) aligns the template with the image I(t).
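The translational update above, as a sketch (bilinear warping via scipy.ndimage.map_coordinates; the iteration cap and tolerance are illustrative):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def track_translation(I, T, t, iters=20, eps=1e-3):
    """Iterative translational alignment: warp I by the current estimate t,
    solve  (sum grad_I grad_I^T) dt = -sum grad_I (I_w - T),  update t += dt."""
    h, w = T.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    t = np.asarray(t, float).copy()
    for _ in range(iters):
        Iw = map_coordinates(I, [ys + t[1], xs + t[0]], order=1)  # I(x+t_x, y+t_y)
        Iy, Ix = np.gradient(Iw)                                  # grad I at the warped patch
        g = np.stack([Ix.ravel(), Iy.ravel()])                    # 2 x N
        H = g @ g.T
        dt = np.linalg.solve(H, -g @ (Iw - T).ravel())
        t += dt
        if np.linalg.norm(dt) < eps:
            break
    return t
```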

More general transformations

The function f could be quite general. In homogeneous coordinates, with [a b c; d e f; 0 0 1] denoting the rows of a 3×3 matrix acting on (x, y, 1)ᵀ:

- translation: p = (t_x, t_y)ᵀ,
      f_p(x) = [1 0 t_x; 0 1 t_y; 0 0 1] (x, y, 1)ᵀ
- translation & scale: p = (s, t_x, t_y)ᵀ,
      f_p(x) = [s 0 t_x; 0 s t_y; 0 0 1] (x, y, 1)ᵀ
- translation & rotation: p = (θ, t_x, t_y)ᵀ,
      f_p(x) = [cos θ  −sin θ  t_x; sin θ  cos θ  t_y; 0 0 1] (x, y, 1)ᵀ
- a general affinity:
      f_p(x) = [p₁₁ p₁₂ p₁₃; p₂₁ p₂₂ p₂₃; 0 0 1] (x, y, 1)ᵀ

Generalisation

Returning to the brightness constraint again, we need to minimise

    E(Δp) = Σ_{x∈W} [I(f_{p+Δp}(x), t) − T(x)]²

w.r.t. Δp. Expanding to first order we obtain

    E(Δp) ≈ Σ_{x∈W} [I(f_p(x), t) + ∇Iᵀ (∂f_p/∂p) Δp − T(x)]².

Taking partials w.r.t. Δp and setting them to zero yields

    Σ_W [∇Iᵀ ∂f/∂p]ᵀ [∇Iᵀ ∂f/∂p] Δp = Σ_W [∇Iᵀ ∂f/∂p]ᵀ [T(x) − I(f_p(x), t)],

which we abbreviate as

    H Δp = b.

Summary

Given the template T, the image I(t) and the current tracker parameters p(t−1):

    p ← p(t−1)
    repeat until Δp ≈ 0:
        H = Σ_W [∇Iᵀ ∂f/∂p]ᵀ [∇Iᵀ ∂f/∂p]
        b = Σ_W [∇Iᵀ ∂f/∂p]ᵀ [T(x) − I(f_p(x), t)]
        Δp = H⁻¹ b
        p ← p + Δp
    p(t) ← p

Note that the gradients ∇I are computed at f_p(x). (A sketch of the affine case follows below.)
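For a general warp, the only new ingredient is the Jacobian ∂f/∂p. A sketch for the affine case, where each pixel contributes the row ∇Iᵀ ∂f/∂p to H and b (names and the point bookkeeping are illustrative):

```python
import numpy as np

def affine_jacobian(x, y):
    """df_p/dp for f_p(x) = [[p11, p12, p13], [p21, p22, p23]] (x, y, 1)^T,
    with parameters ordered p = (p11, p12, p13, p21, p22, p23)."""
    return np.array([[x, y, 1, 0, 0, 0],
                     [0, 0, 0, x, y, 1]], float)

def gauss_newton_step(pts, grads, errs):
    """One update: solve H dp = b with H = sum J^T J and b = sum J^T e,
    where grads[k] is (I_x, I_y) sampled at f_p(x_k) and
    errs[k] = T(x_k) - I(f_p(x_k), t)."""
    H, b = np.zeros((6, 6)), np.zeros(6)
    for (x, y), g, e in zip(pts, grads, errs):
        J = np.asarray(g) @ affine_jacobian(x, y)   # row vector grad I^T df/dp
        H += np.outer(J, J)
        b += J * e
    return np.linalg.solve(H, b)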

Remarks

The expressions look nastier than they are. It is easy to go back to the translational case:

    f_p(x) = (x + t_x, y + t_y)ᵀ,   ∂f/∂p = [1 0; 0 1],

and the update composes two translations:

    [1 0 δt_x; 0 1 δt_y; 0 0 1][1 0 t_x; 0 1 t_y; 0 0 1] = [1 0 t_x+δt_x; 0 1 t_y+δt_y; 0 0 1].

We are actually doing a lot more work than we need to, calculating gradients in the image at each iteration. Instead, pre-compute the template gradients, the template Hessian, etc., and warp "the other way"; see the tutorial sheet.

Example

[Figure: example of Lucas-Kanade patch tracking.]
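The "warp the other way" remark is the idea behind inverse-compositional alignment (the naming is an aside, not from the slides); a sketch, for the translational case, of what can be hoisted out of the loop:

```python
import numpy as np

def precompute_template_terms(T):
    """Gradients and Hessian computed once on the template T; per-frame
    work then reduces to warping I, forming the residual, and H^{-1} b."""
    Ty, Tx = np.gradient(T.astype(float))
    J = np.stack([Tx.ravel(), Ty.ravel()], axis=1)   # df/dp = identity for translation
    H = J.T @ J                                      # constant 2x2 Hessian
    return J, H
```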

Tracking with rigid models: the RaPiD tracker (Harris, '90)

RaPiD assumes a calibrated camera and a rigid 3D polyhedral model consisting of:
- 3D edges
- control points on the model edges

[Figure: camera with optic centre C and optic axis z; model control point P at offset T from the model origin, moving with translational velocity v and angular velocity ω.]

The coordinates of a control point are given (in the camera reference frame) by

    X = T + P,   Ẋ = V + Ω × P.

Taking f = 1, the projected motion is then

    ẋ = d/dt (X/Z) = (1/Z) Ẋ − (Ż/Z²) X
      = (1/Z) ( V_x + Ω_y P_z − Ω_z P_y − x (V_z + Ω_x P_y − Ω_y P_x),
                V_y + Ω_z P_x − Ω_x P_z − y (V_z + Ω_x P_y − Ω_y P_x),
                0 )ᵀ.

Hence we can write the projection at time t + δt as

    x(t+δt) = x + ẋ δt = x + G (V, Ω)ᵀ δt,

where G is the 2×6 matrix

    G = (1/Z) [ 1  0  −x   −x P_y        P_z + x P_x   −P_y ]
              [ 0  1  −y   −P_z − y P_y  y P_x          P_x ].

For each control point, measure its distance l to the nearest image edge along the predicted edge normal n̂:

    l = n̂·(x(t+δt) − x) = n̂ᵀ G (V, Ω)ᵀ δt.

[Figure: control point on the predicted model edge; 1D search along the normal n̂ to the located image edge.]

With n control points, minimise

    E(V, Ω) = Σᵢ [lᵢ − n̂ᵢᵀ Gᵢ (V, Ω)ᵀ δt]²

w.r.t. V, Ω to obtain the pose update. Wrap this up in a Kalman filter.

RaPiD tracker: summary

- Rigid 3D polyhedral model.
- 1D search at various points normal to the projected model edges.
- Filter the state (6 dof) using a constant-velocity Kalman filter: predict, measure, update.

RaPiD tracker: example

[Figure: tracking rigid objects for teleoperation.]
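A least-squares sketch of the pose update (NumPy; the edge search, point bookkeeping and the Kalman filter wrapper are omitted, and names are illustrative):

```python
import numpy as np

def rapid_G(x, y, P, Z):
    """2x6 interaction matrix G for a control point: image motion
    x_dot = G (V, Omega)^T, with f = 1 as on the slides."""
    Px, Py, Pz = P
    return np.array([[1, 0, -x, -x * Py,      Pz + x * Px, -Py],
                     [0, 1, -y, -Pz - y * Py, y * Px,       Px]]) / Z

def rapid_update(points, normals, dists, dt):
    """Least-squares pose update: one row  n_hat^T G dt  per control point,
    solved for the 6-vector (V, Omega) that best explains the measured
    normal distances to the image edges."""
    A = np.array([np.asarray(n) @ rapid_G(x, y, P, Z) * dt
                  for (x, y, P, Z), n in zip(points, normals)])
    sol, *_ = np.linalg.lstsq(A, np.asarray(dists, float), rcond=None)
    return sol    # (V, Omega); feed into the Kalman filter update step
```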