Lecture 14: Tracking motion features: optical flow

Lecture 14: Tracking motion features: optical flow. Dr. Juan Carlos Niebles, Stanford AI Lab. Professor Fei-Fei Li, Stanford Vision Lab. Lecture 14-1

What we will learn today? Introduction. Optical flow. Feature tracking. Applications. (Supplementary) Technical note. Reading: [Szeliski] Chapters 8.4, 8.5; [Fleet & Weiss, 2005] http://www.cs.toronto.edu/pub/jepson/teaching/vision/2503/opticalflow.pdf

From images to videos. A video is a sequence of frames captured over time. Now our image data is a function of space (x, y) and time (t).

Motion estimation techniques. Optical flow: recover image motion at each pixel from spatio-temporal image brightness variations. Feature tracking: extract visual features (corners, textured areas) and track them over multiple frames.

Optical flow. Vector field function of the spatio-temporal image brightness variations. Picture courtesy of Selim Temizer, Learning and Intelligent Systems (LIS) Group, MIT.

Feature tracking. Courtesy of Jean-Yves Bouguet, Vision Lab, California Institute of Technology.


Optical flow. Definition: optical flow is the apparent motion of brightness patterns in the image. Note: apparent motion can be caused by lighting changes without any actual motion. Think of a uniformly rotating sphere under fixed lighting vs. a stationary sphere under moving illumination. Goal: recover image motion at each pixel from optical flow.

Technical note: Estimating optical flow. Given two subsequent frames I(x, y, t-1) and I(x, y, t), estimate the apparent motion field u(x, y), v(x, y) between them. Key assumptions: Brightness constancy: the projection of the same point looks the same in every frame. Small motion: points do not move very far. Spatial coherence: points move like their neighbors.

Technical note: The brightness constancy constraint. Given frames I(x, y, t-1) and I(x, y, t). Brightness Constancy Equation: I(x, y, t-1) = I(x + u(x, y), y + v(x, y), t). Linearizing the right side using a Taylor expansion: I(x + u, y + v, t) ≈ I(x, y, t-1) + I_x u(x, y) + I_y v(x, y) + I_t, where I_x and I_y are the image derivatives along x and y and I_t is the temporal derivative. Hence I(x + u, y + v, t) - I(x, y, t-1) ≈ I_x u(x, y) + I_y v(x, y) + I_t, and so I_x u + I_y v + I_t ≈ 0, i.e. ∇I · [u v]^T + I_t = 0.
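
The linearized constraint can be sanity-checked numerically. Below is a minimal NumPy sketch (not from the lecture): a ramp image shifted right by one pixel, for which I_x u + I_y v + I_t should vanish with the true flow (u, v) = (1, 0).

```python
import numpy as np

# Ramp image I(x, y) = x, then the same ramp shifted right by one pixel.
x = np.arange(32, dtype=float)
I0 = np.tile(x, (32, 1))
I1 = np.tile(x - 1.0, (32, 1))      # true flow: (u, v) = (1, 0)

# Derivatives: spatial by central differences, temporal by frame difference.
Ix = np.gradient(I0, axis=1)
Iy = np.gradient(I0, axis=0)
It = I1 - I0

# Brightness constancy residual Ix*u + Iy*v + It, evaluated at the true flow.
residual = Ix * 1.0 + Iy * 0.0 + It
print(np.abs(residual).max())        # 0.0 for this linear image
```

Because the ramp is linear in x, the Taylor linearization is exact here; for real images the residual is only approximately zero.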

Technical note: The brightness constancy constraint. Can we use the equation ∇I · [u v]^T + I_t = 0 to recover image motion (u, v) at each pixel? How many equations and unknowns per pixel? One equation (it is a scalar equation!) and two unknowns (u, v). The component of the flow perpendicular to the gradient (i.e., parallel to the edge) cannot be measured: if (u, v) satisfies the equation, so does (u + u', v + v') whenever ∇I · [u' v']^T = 0.

The aperture problem. Actual motion.

The aperture problem. Perceived motion.

The barber pole illusion. http://en.wikipedia.org/wiki/barberpole_illusion


Aperture problem, cont'd. * From Marc Pollefeys, COMP 256, 2003.

Technical note: Solving the ambiguity. B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 674-679, 1981. How do we get more equations for a pixel? Spatial coherence constraint: assume the pixel's neighbors have the same (u, v). If we use a 5x5 window, that gives us 25 equations per pixel.

Technical note: Lucas-Kanade flow. Overconstrained linear system A d = b, where each row of A is [I_x(p_i) I_y(p_i)] for a pixel p_i in the window, d = [u v]^T, and b stacks -I_t(p_i).

Technical note: Lucas-Kanade flow. Overconstrained linear system A d = b; the least squares solution for d is given by (A^T A) d = A^T b. The summations are over all pixels in the K x K window.
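
As a concrete illustration, here is a small NumPy sketch (an illustrative reconstruction, not the lecture's code) that forms the normal equations (A^T A) d = A^T b for one window and solves for d = (u, v) on a synthetic quadratic image, where central differences are exact:

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It):
    """Least-squares solution of the overconstrained system A d = b for one window:
    each row of A is [Ix, Iy] at a pixel, b stacks -It, and d = (u, v)."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    ATA, ATb = A.T @ A, A.T @ b      # 2x2 normal equations: sums over the window
    return np.linalg.solve(ATA, ATb)

# Quadratic test image translated by the small motion (u, v) = (0.2, -0.1).
X, Y = np.meshgrid(np.arange(15.0), np.arange(15.0))
f = lambda x, y: 0.5 * x**2 + 0.3 * y**2 + x * y
I0, I1 = f(X, Y), f(X - 0.2, Y + 0.1)
Ix, Iy = np.gradient(I0, axis=1), np.gradient(I0, axis=0)
It = I1 - I0

w = (slice(5, 10), slice(5, 10))     # one 5x5 interior window
u, v = lucas_kanade_window(Ix[w], Iy[w], It[w])
print(u, v)                          # close to (0.2, -0.1)
```

For this image the gradients are non-degenerate across the window, so A^T A is invertible and the small-motion linearization error is negligible.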

Technical note: Conditions for solvability. The optimal (u, v) satisfies the Lucas-Kanade equation (A^T A) d = A^T b. When is this solvable? A^T A should be invertible. A^T A should not be too small due to noise: the eigenvalues λ1 and λ2 of A^T A should not be too small. A^T A should be well-conditioned: λ1/λ2 should not be too large (λ1 = larger eigenvalue). Does this remind you of anything?

Technical note: M = A^T A is the second moment matrix! (Harris corner detector.) The eigenvectors and eigenvalues of A^T A relate to edge direction and magnitude: the eigenvector associated with the larger eigenvalue points in the direction of fastest intensity change, and the other eigenvector is orthogonal to it.

Technical note: Interpreting the eigenvalues. Classification of image points using the eigenvalues of the second moment matrix: Corner: λ1 and λ2 are both large, λ1 ~ λ2. Edge: λ1 >> λ2 (or λ2 >> λ1). Flat region: λ1 and λ2 are both small.

Technical note: Edge: gradients very large or very small; large λ1, small λ2.

Technical note: Low-texture region: gradients have small magnitude; small λ1, small λ2.

Technical note: High-texture region: gradients are different, with large magnitudes; large λ1, large λ2.

What we will learn today? Introduction. Optical flow. Feature tracking. Applications. (Supplementary) Technical note.

What are good features to track? We can measure the quality of features from just a single image. Hence, tracking Harris corners (or an equivalent detector) guarantees small error sensitivity! Implemented in OpenCV.
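
A slow reference sketch of this feature-quality measure, using the smaller eigenvalue of the second moment matrix as the score (the Shi-Tomasi "good features to track" criterion; OpenCV's goodFeaturesToTrack implements the same idea efficiently). The names and window radius here are illustrative:

```python
import numpy as np

def min_eig_score(I, r=2):
    """Score each interior pixel by the smaller eigenvalue of M = A^T A
    over an (2r+1)x(2r+1) window. Deliberately slow reference code."""
    Ix, Iy = np.gradient(I, axis=1), np.gradient(I, axis=0)
    H, W = I.shape
    score = np.zeros_like(I)
    for y in range(r, H - r):
        for x in range(r, W - r):
            wy, wx = slice(y - r, y + r + 1), slice(x - r, x + r + 1)
            gx, gy = Ix[wy, wx], Iy[wy, wx]
            M = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                          [np.sum(gx * gy), np.sum(gy * gy)]])
            score[y, x] = np.linalg.eigvalsh(M)[0]   # smaller eigenvalue
    return score

# The score peaks at the corner of a step image, not along its edges.
I = np.zeros((20, 20))
I[10:, 10:] = 1.0
score = min_eig_score(I)
yx = np.unravel_index(score.argmax(), score.shape)
print(yx)   # near the step corner at (10, 10)
```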

Recap: key assumptions (errors in Lucas-Kanade). Small motion: points do not move very far. Brightness constancy: the projection of the same point looks the same in every frame. Spatial coherence: points move like their neighbors.

Revisiting the small motion assumption. Is this motion small enough? Probably not: it is much larger than one pixel (2nd-order terms dominate). How might we solve this problem? * From Khurram Hassan-Shafique, CAP5415 Computer Vision 2003.

Reduce the resolution! * From Khurram Hassan-Shafique, CAP5415 Computer Vision 2003.

Coarse-to-fine optical flow estimation. A motion of u = 10 pixels between image 1 (H) and image 2 (I) shrinks to u = 5, 2.5, and 1.25 pixels at successive levels of the two Gaussian pyramids.

Coarse-to-fine optical flow estimation. Run iterative L-K at the coarsest level of the Gaussian pyramids of image 1 (t) and image 2 (t+1); then repeatedly warp & upsample and run iterative L-K again at the next finer level.
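
The warp-and-upsample loop can be sketched end to end in NumPy. This is a simplified reconstruction under stated assumptions: 2x2 block averaging stands in for Gaussian smoothing, and a single global translation is estimated rather than a dense flow field:

```python
import numpy as np

def downsample(I):
    """One pyramid level, approximated by 2x2 block averaging."""
    H, W = I.shape
    return I[:H // 2 * 2, :W // 2 * 2].reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

def warp(I, u, v):
    """Bilinear sampling of I at (x + u, y + v), clamped at the borders."""
    H, W = I.shape
    X, Y = np.meshgrid(np.arange(W), np.arange(H))
    xs = np.clip(X + u, 0, W - 1.0)
    ys = np.clip(Y + v, 0, H - 1.0)
    x0 = np.floor(xs).astype(int); y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, W - 1); y1 = np.minimum(y0 + 1, H - 1)
    fx, fy = xs - x0, ys - y0
    return ((1 - fx) * (1 - fy) * I[y0, x0] + fx * (1 - fy) * I[y0, x1] +
            (1 - fx) * fy * I[y1, x0] + fx * fy * I[y1, x1])

def lk_step(I0, I1w):
    """One least-squares L-K step for a single global translation."""
    Ix, Iy = np.gradient(I0, axis=1), np.gradient(I0, axis=0)
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    return np.linalg.lstsq(A, -(I1w - I0).ravel(), rcond=None)[0]

def coarse_to_fine_lk(I0, I1, levels=3, iters=5):
    """Iterative L-K at the coarsest level, then warp & upsample down the pyramid."""
    p0, p1 = [I0], [I1]
    for _ in range(levels - 1):
        p0.append(downsample(p0[-1])); p1.append(downsample(p1[-1]))
    u = v = 0.0
    for l in range(levels - 1, -1, -1):
        if l < levels - 1:
            u, v = 2 * u, 2 * v              # upsample the flow estimate
        for _ in range(iters):
            du, dv = lk_step(p0[l], warp(p1[l], u, v))
            u, v = u + du, v + dv
    return u, v

# A Gaussian blob shifted by (6, -4) pixels: far too large for single-scale
# L-K, but recoverable coarse-to-fine.
X, Y = np.meshgrid(np.arange(64.0), np.arange(64.0))
blob = lambda cx, cy: np.exp(-((X - cx) ** 2 + (Y - cy) ** 2) / (2 * 8.0 ** 2))
u, v = coarse_to_fine_lk(blob(32, 32), blob(38, 28))
print(u, v)   # close to (6, -4)
```

At the coarsest level the 6-pixel shift has shrunk to 1.5 pixels, which is back inside the small-motion regime, and each finer level only has to correct a sub-pixel residual.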

Optical flow results. * From Khurram Hassan-Shafique, CAP5415 Computer Vision 2003.

Optical flow results. http://www.ces.clemson.edu/~stb/klt/ OpenCV. * From Khurram Hassan-Shafique, CAP5415 Computer Vision 2003.

Recap: key assumptions (errors in Lucas-Kanade). Small motion: points do not move very far. Brightness constancy: the projection of the same point looks the same in every frame. Spatial coherence: points move like their neighbors.

Reminder: Gestalt "common fate".

Motion segmentation. How do we represent the motion in this scene?

Motion segmentation. J. Wang and E. Adelson. Layered Representation for Motion Analysis. CVPR 1993. Break the image sequence into layers, each of which has a coherent (affine) motion.

Technical note: Affine motion. u(x, y) = a1 + a2 x + a3 y; v(x, y) = a4 + a5 x + a6 y. Substituting into the brightness constancy equation: I_x u + I_y v + I_t ≈ 0.

Technical note: Affine motion. Substituting u(x, y) = a1 + a2 x + a3 y and v(x, y) = a4 + a5 x + a6 y into the brightness constancy equation: I_x (a1 + a2 x + a3 y) + I_y (a4 + a5 x + a6 y) + I_t ≈ 0. Each pixel provides one linear constraint in six unknowns. Least squares minimization: Err(a) = Σ [I_x (a1 + a2 x + a3 y) + I_y (a4 + a5 x + a6 y) + I_t]^2.
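
This least squares problem is linear in a1..a6, so it can be solved directly. A NumPy sketch of the solver follows; the synthetic check feeds it an exactly consistent system built from a known affine field, so recovery should be exact up to numerical precision:

```python
import numpy as np

def affine_flow_ls(Ix, Iy, It, X, Y):
    """Least-squares estimate of the affine parameters a1..a6 from one linearized
    brightness constancy constraint per pixel:
    Ix*(a1 + a2*x + a3*y) + Iy*(a4 + a5*x + a6*y) + It = 0."""
    A = np.stack([Ix, Ix * X, Ix * Y, Iy, Iy * X, Iy * Y], axis=-1).reshape(-1, 6)
    a, *_ = np.linalg.lstsq(A, -It.ravel(), rcond=None)
    return a

# Synthetic check: random gradients, It generated exactly from a known affine field.
rng = np.random.default_rng(0)
X, Y = np.meshgrid(np.arange(16.0), np.arange(16.0))
Ix, Iy = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
a_true = np.array([0.5, 0.01, -0.02, -0.3, 0.03, 0.005])
u = a_true[0] + a_true[1] * X + a_true[2] * Y
v = a_true[3] + a_true[4] * X + a_true[5] * Y
It = -(Ix * u + Iy * v)
a = affine_flow_ls(Ix, Iy, It, X, Y)
print(a)   # recovers a_true
```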

Technical note: How do we estimate the layers? 1. Obtain a set of initial affine motion hypotheses: divide the image into blocks and estimate affine motion parameters in each block by least squares; eliminate hypotheses with high residual error; map into motion parameter space; perform k-means clustering on the affine motion parameters; merge clusters that are close, and retain the largest clusters to obtain a smaller set of hypotheses describing all the motions in the scene.


Technical note: How do we estimate the layers? 1. Obtain a set of initial affine motion hypotheses: divide the image into blocks and estimate affine motion parameters in each block by least squares; eliminate hypotheses with high residual error; map into motion parameter space; perform k-means clustering on the affine motion parameters; merge clusters that are close, and retain the largest clusters. 2. Iterate until convergence: assign each pixel to the best hypothesis (pixels with high residual error remain unassigned); perform region filtering to enforce spatial constraints; re-estimate the affine motions in each region.
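
The clustering stage of step 1 can be sketched with a plain Lloyd's k-means over 6-D affine parameter vectors. All numerical values below are hypothetical block hypotheses, not data from the paper:

```python
import numpy as np

def kmeans(P, k, iters=20):
    """Plain Lloyd's k-means on the rows of P (here: 6-D affine parameter
    vectors). Deterministic init: k evenly spaced input points."""
    C = P[np.linspace(0, len(P) - 1, k).astype(int)].copy()
    for _ in range(iters):
        labels = ((P[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = P[labels == j].mean(axis=0)
    return C, labels

# Two groups of block-wise affine hypotheses that should merge into two
# motion clusters (one layer moving right, one moving left and down).
rng = np.random.default_rng(1)
group_a = np.array([1.0, 0, 0, 0.0, 0, 0]) + 0.02 * rng.standard_normal((8, 6))
group_b = np.array([-1.0, 0, 0, 0.5, 0, 0]) + 0.02 * rng.standard_normal((8, 6))
P = np.vstack([group_a, group_b])
centers, labels = kmeans(P, k=2)
print(labels)   # first 8 points share one label, last 8 the other
```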

Example result. J. Wang and E. Adelson. Layered Representation for Motion Analysis. CVPR 1993.

What we will learn today? Introduction. Optical flow. Feature tracking. Applications. (Supplementary) Technical note.

Uses of motion: tracking features; segmenting objects based on motion cues; learning dynamical models; improving video quality (motion stabilization, super-resolution); tracking objects; recognizing events and activities.

Estimating 3D structure.

Segmenting objects based on motion cues. Background subtraction: a static camera is observing a scene; the goal is to separate the static background from the moving foreground.
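
A minimal background-subtraction sketch, assuming a static camera and a per-pixel median over time as the background model (the threshold and the toy scene are illustrative):

```python
import numpy as np

def background_subtraction(frames, thresh=0.5):
    """Median background model for a static camera: a pixel is foreground in a
    frame when it deviates from the per-pixel median over the frame stack."""
    bg = np.median(frames, axis=0)                  # static background estimate
    masks = [np.abs(f - bg) > thresh for f in frames]
    return masks, bg

# A bright 3x3 'object' sweeping left to right across a dark static scene.
frames = np.zeros((5, 12, 12))
for t in range(5):
    frames[t, 4:7, 2 * t:2 * t + 3] = 1.0
masks, bg = background_subtraction(frames)
```

Because the object covers any given pixel in at most two of the five frames, the per-pixel median recovers the empty background exactly, and each mask isolates the object in its frame.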

Segmenting objects based on motion cues. Motion segmentation: segment the video into multiple coherently moving objects. S. J. Pundlik and S. T. Birchfield, Motion Segmentation at Any Speed, Proceedings of the British Machine Vision Conference (BMVC) 2006.

Tracking objects. Z. Yin and R. Collins, "On-the-fly Object Modeling while Tracking," IEEE Computer Vision and Pattern Recognition (CVPR '07), Minneapolis, MN, June 2007.

Synthesizing dynamic textures.

Super-resolution. Example: a set of low-quality images.

Super-resolution. Each of these images looks like this:

Super-resolution. The recovery result:

Recognizing events and activities. Tracker: D. Ramanan, D. Forsyth, and A. Zisserman. Tracking People by Learning their Appearance. PAMI 2007.

Recognizing events and activities. Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei, Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, (BMVC), Edinburgh, 2006.

Recognizing events and activities: crossing, talking, queuing, dancing, jogging. W. Choi, K. Shahid, S. Savarese, WMC 2010.

W. Choi, K. Shahid, S. Savarese, "What are they doing?: Collective Activity Classification Using Spatio-Temporal Relationship Among People", 9th International Workshop on Visual Surveillance (VSWS09) in conjunction with ICCV '09.

Optical flow without motion!

What we have learned today? Introduction. Optical flow. Feature tracking. Applications. (Supplementary) Technical note. Reading: [Szeliski] Chapters 8.4, 8.5; [Fleet & Weiss, 2005] http://www.cs.toronto.edu/pub/jepson/teaching/vision/2503/opticalflow.pdf

Technical note: Fleet & Weiss, 2005 (supplementary derivation slides).
