CS201: Computer Vision Introduction to Tracking

John Magee
18 November 2014
Slides courtesy of: Diane H. Theriault

Question of the Day
  How can we represent and use motion in images?

What is Motion?
  Change over time
    Position
    Orientation
    Pose
    Shape

Motion in Images
  (No such thing, really, since an image is just a picture captured at one moment in time)
  Example: Video
    Camera captures a series of images as the scene changes
    Video: Ordered sequence of images captured in rapid succession
    (Fixed Camera)

Motion in Images
  Example: Video (Moving Camera)

Motion in Images
  Example: Stereo Cameras
    Capture the same scene from different viewpoints
    Set of images where the cameras are separated in space
    Stereoscopic pairs

Motion in Images
  What happens to the color / brightness values captured in successive images as the scene or camera moves?
  What does it mean for an image feature to move?
  How do we use the movement of image features to infer things about the scene or the camera?
  (Usually need to assume that the difference between images is reasonably small)

Types of Tracking
  Tracking by Detection
  Feature Tracking

Types of Tracking
  Optical Flow / Dense scene motion (After Spring Break)
  Contour Tracking

Types of Tracking
  Multi-target Tracking

Discussion Questions
  What are different types of tracking that we could do in video of sports? Surveillance? Videos of daily life?
  What other types of videos are you interested in?
  What types of information might we want to obtain by understanding motion in images?

Feature Tracking
  What is an image feature?
    Distinctive
    Repeatable
    Uniquely localizable
  What does it mean for an image feature to move?

Template Tracking
  Simple feature: small image patch
  Motion: The same pattern of brightness values appears in a (slightly) different place in the next image

Template Tracking
  Given: small image patch of something we're looking for
  Goal: Find the best-match location in the new image
  How: Search in a small window around its previous location
  How to compute a matching score?
    Normalized Correlation Coefficient (NCC)
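The matching score named above can be sketched in a few lines. This is a minimal illustration, assuming grayscale images stored as NumPy arrays; the function name ncc and the choice to return 0 for a flat patch are assumptions made for this example, not something the slides specify.

```python
import numpy as np

def ncc(patch, template):
    """Normalized correlation coefficient of two equal-sized grayscale arrays.

    Both arrays are mean-subtracted and the result is divided by the product
    of their norms, so the score lies in [-1, 1] and does not change when the
    brightness (offset) or contrast (positive gain) of either array changes.
    """
    p = patch.astype(float) - patch.mean()
    t = template.astype(float) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    if denom == 0.0:
        # A perfectly flat patch has no pattern to correlate with.
        # Returning 0 ("no match") is a convention chosen for this sketch.
        return 0.0
    return float((p * t).sum() / denom)

# Usage: a brighter, higher-contrast copy of the template still scores about 1.
template = np.arange(9.0).reshape(3, 3)
print(ncc(2.0 * template + 10.0, template))
```

Because of the mean subtraction and normalization, this score tolerates global brightness and contrast changes, which a plain sum of squared differences would not.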

Template Tracking
  Goal: Find the best-match location in the new image
  How to find the best location?
    Exhaustive search (convolution style)

Template Tracking
  Given: template, initial location x_0
  For each image, t = 1:n
    Search in a small window around x_{t-1}
    x_t is the location with the highest NCC score
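The tracking loop above could look roughly like the following. This is a sketch under stated assumptions: locations are (x, y) top-left corners of the template, the search window is a square of hypothetical half-width radius, and the helper names best_match and track are made up for the example.

```python
import numpy as np

def ncc(patch, template):
    """Normalized correlation coefficient of two equal-sized arrays."""
    p = patch.astype(float) - patch.mean()
    t = template.astype(float) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return 0.0 if denom == 0.0 else float((p * t).sum() / denom)

def best_match(image, template, prev_xy, radius):
    """Exhaustively score every top-left position within `radius` of prev_xy."""
    th, tw = template.shape
    H, W = image.shape
    px, py = prev_xy
    best_score, best_xy = -np.inf, prev_xy
    # Clamp the search window so the template always fits inside the image.
    for y in range(max(0, py - radius), min(H - th, py + radius) + 1):
        for x in range(max(0, px - radius), min(W - tw, px + radius) + 1):
            score = ncc(image[y:y + th, x:x + tw], template)
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy, best_score

def track(frames, template, x0, radius=8):
    """Return [x_0, x_1, ..., x_n]; x_t is the highest-NCC location near x_{t-1}."""
    locations = [x0]
    for frame in frames:
        xy, _ = best_match(frame, template, locations[-1], radius)
        locations.append(xy)
    return locations

# Usage on synthetic data: a random texture that shifts a little each frame.
rng = np.random.default_rng(0)
base = rng.random((40, 40))
template = base[12:20, 10:18]                      # top-left corner at x=10, y=12
frames = [np.roll(base, (1, 2), axis=(0, 1)),      # moved down 1, right 2
          np.roll(base, (2, 4), axis=(0, 1))]      # moved down 2, right 4
print(track(frames, template, (10, 12), radius=4))
```

The cost per frame grows with the square of the search radius times the template area, which is the computational cost concern raised on the next slide.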

Template Tracking
  Challenges:
    Computational cost
    Getting lost / drift
    Non-translational motion (e.g. rotation)
    Non-rigid motion (articulation of hand)
    3D motion
    Changing appearance of real object

Discussion Questions
  What are the benefits / downsides of using larger templates / search windows?
  Why are rotation and scaling problematic for a template tracker?
  If we update the template as we track, what problems do we solve? What problems do we create?
  How could we benefit from using a constellation of smaller templates instead of one big one?

Discussion Questions:
  Motion blur
  Image brightness / contrast changes
  Computational cost
  What is the location of an object?
  What are different types of tracking you could do on XXX video?
  Assume: Small changes

Discussion Questions
  Assumptions needed for template tracking?
    Changes in brightness / contrast?
    How can image feature change? Translation only. No rotation. No scaling
  How to choose search window?
  How to choose size of template?
    Collection of templates to break up one big template?
  Expensive: Coarse-to-fine
  How does change accumulate over time?
    Update template. What might happen if you do that?

Question of the Day
  How can we track features and do better than brute force search?

Lucas-Kanade
  Goal: Find the location in the new image with the best match
  What if we could do better than exhaustive search?
  How could we direct our search for the best match using the difference between the two images and the image gradient?

Background
  Taylor Series
    Any sufficiently smooth function can be approximated near a point with a polynomial
  Truncated Taylor Series
  First order approximation
  Taylor series example
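As a small numerical illustration of the first-order approximation, the sketch below compares f(x + h) with f(x) + h f'(x) for f = sin. The choice of function, the expansion point x = 1 and the step sizes are arbitrary assumptions for this example and are not taken from the slides.

```python
import math

def first_order(f, df, x, h):
    """Truncated Taylor series kept to first order: f(x + h) ~ f(x) + h * f'(x)."""
    return f(x) + h * df(x)

x = 1.0
errors = []
for h in (0.5, 0.1, 0.01):
    approx = first_order(math.sin, math.cos, x, h)
    exact = math.sin(x + h)
    errors.append(abs(exact - approx))
    print(f"h={h}: exact={exact:.6f} approx={approx:.6f} error={errors[-1]:.2e}")
```

The error shrinks roughly with the square of h, which is why the methods that follow assume a small displacement.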

Background
  Newton-Raphson method for finding roots (zeros) of a function
    Assume:
    Algorithm
    Want to find roots:

Finding Best Match in 1D
  Two curves: F(x), G(x)
  Displacement: h
  Goal: Find the displacement
  (Derivation on board)

Lucas-Kanade
  Assume: G is a translated version of F: G(x) = F(x+h)
  Assume: First-order approximation is sufficient: F(x+h) ≈ F(x) + h F'(x)
  Assume: Displacement is small
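Since the derivation was done on the board, the following is only a sketch of where it plausibly lands. Newton-Raphson repeats x <- x - f(x) / f'(x). Substituting the first-order approximation into G(x) = F(x+h) and solving for h in the least-squares sense over sample points gives h ≈ sum F'(x) [G(x) - F(x)] / sum F'(x)^2, which is then re-applied around the current estimate in the same iterative spirit. The test functions, sample grid and iteration counts below are assumptions chosen for the example.

```python
import math

def newton_raphson(f, df, x, iterations=20, tol=1e-12):
    """Find a root of f by repeating x <- x - f(x) / f'(x).

    Assumes f'(x) is nonzero along the way and that the start is close enough.
    """
    for _ in range(iterations):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def estimate_shift(F, dF, G, xs, iterations=10):
    """Estimate h such that G(x) = F(x + h), starting from h = 0.

    Each pass linearizes F around the current estimate and adds the
    least-squares correction
        sum F'(x+h) [G(x) - F(x+h)] / sum F'(x+h)^2.
    """
    h = 0.0
    for _ in range(iterations):
        num = sum(dF(x + h) * (G(x) - F(x + h)) for x in xs)
        den = sum(dF(x + h) ** 2 for x in xs)
        if den == 0.0:
            break  # no gradient anywhere: the shift cannot be determined
        h += num / den
    return h

# Usage: square root of 2 as the positive root of x^2 - 2.
print(newton_raphson(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0))

# Usage: recover the shift between a Gaussian bump and its translated copy.
F = lambda x: math.exp(-x * x)
dF = lambda x: -2.0 * x * math.exp(-x * x)
h_true = 0.3
G = lambda x: F(x + h_true)
xs = [i * 0.1 - 3.0 for i in range(61)]
print(estimate_shift(F, dF, G, xs))
```

Here F is an analytic function so it can be evaluated at x + h directly; with sampled images the same step needs interpolation.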

Background
  Multivariate first order approximation

Finding Best Match in 2D
  Two surfaces: F(x), G(x)
  Displacement: h
  Goal: Find the displacement

Lucas-Kanade
  Assume: G is a translated version of F: G(x) = F(x+h)
  Assume: First-order approximation is sufficient
  Assume: Displacement is small
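The equations for the 2D case did not survive in this transcript. The block below is a reconstruction from the three stated assumptions, written in the standard least-squares form; treat it as a sketch of the likely slide content rather than a copy of it. Bold x is a pixel location, bold h the 2D displacement, and sums run over the pixels of the patch.

```latex
% Multivariate first-order approximation
F(\mathbf{x} + \mathbf{h}) \approx F(\mathbf{x}) + \mathbf{h}^{\top} \nabla F(\mathbf{x})

% Squared matching error over the patch, after substituting the approximation
E(\mathbf{h}) = \sum_{\mathbf{x}} \bigl[ F(\mathbf{x} + \mathbf{h}) - G(\mathbf{x}) \bigr]^{2}
\approx \sum_{\mathbf{x}} \bigl[ F(\mathbf{x}) + \mathbf{h}^{\top} \nabla F(\mathbf{x}) - G(\mathbf{x}) \bigr]^{2}

% Setting the derivative with respect to h to zero gives a 2x2 linear system
\Bigl( \sum_{\mathbf{x}} \nabla F(\mathbf{x}) \, \nabla F(\mathbf{x})^{\top} \Bigr) \mathbf{h}
= \sum_{\mathbf{x}} \nabla F(\mathbf{x}) \, \bigl[ G(\mathbf{x}) - F(\mathbf{x}) \bigr]

% Solved for the displacement
\mathbf{h} \approx \Bigl( \sum_{\mathbf{x}} \nabla F(\mathbf{x}) \, \nabla F(\mathbf{x})^{\top} \Bigr)^{-1}
\sum_{\mathbf{x}} \nabla F(\mathbf{x}) \, \bigl[ G(\mathbf{x}) - F(\mathbf{x}) \bigr]
```

In 1D the matrix reduces to the scalar sum of F'(x)^2 and the expression becomes a simple ratio of two sums.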

Lucas-Kanade
  Algorithm:
    For each patch, use previous location as initial guess
    Until error (F(x+h) - G(x)) is sufficiently small
      Compute the summations over the gradient
      Compute the summation over the image values
      Find displacement, h, by:
    Use equations to guide search over several iterations

Lucas-Kanade
  Details: Compute sums with weights
    Distance to center
    Mitigate regions where image values match but gradients do not
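The iteration above might be sketched as follows for a single patch. Several details here are assumptions made for the example and are not stated on the slides: the update solves the weighted 2x2 system (sum w grad F grad F^T) dh = sum w grad F [G - F], gradients come from np.gradient, sub-pixel values come from a small bilinear helper named sample, the weights are a Gaussian in distance to the patch center, and the loop stops when the update falls below a tolerance rather than testing the error directly.

```python
import numpy as np

def sample(img, ys, xs):
    """Bilinear interpolation of img at float (row, col) coordinates."""
    H, W = img.shape
    ys = np.clip(ys, 0, H - 1.001)   # keep the 2x2 neighborhood inside the image
    xs = np.clip(xs, 0, W - 1.001)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    ay, ax = ys - y0, xs - x0
    return ((1 - ay) * (1 - ax) * img[y0, x0] + (1 - ay) * ax * img[y0, x0 + 1]
            + ay * (1 - ax) * img[y0 + 1, x0] + ay * ax * img[y0 + 1, x0 + 1])

def lucas_kanade(F, G, center, half=7, iterations=20, tol=1e-3):
    """Estimate h = (hx, hy) with G(x) ~ F(x + h) on a patch around center=(row, col)."""
    cy, cx = center
    ys, xs = np.mgrid[cy - half:cy + half + 1, cx - half:cx + half + 1].astype(float)
    # Weights fall off with distance to the patch center (assumed Gaussian).
    w = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * (half / 2.0) ** 2))
    F = F.astype(float)
    Fy, Fx = np.gradient(F)                      # derivatives along rows, columns
    g = sample(G.astype(float), ys, xs)          # the patch of G stays fixed
    h = np.zeros(2)                              # initial guess: no displacement
    for _ in range(iterations):
        f = sample(F, ys + h[1], xs + h[0])      # F at the displaced locations
        fx = sample(Fx, ys + h[1], xs + h[0])
        fy = sample(Fy, ys + h[1], xs + h[0])
        err = g - f
        # Summations over the gradient (A) and over the image values (b).
        A = np.array([[(w * fx * fx).sum(), (w * fx * fy).sum()],
                      [(w * fx * fy).sum(), (w * fy * fy).sum()]])
        b = np.array([(w * fx * err).sum(), (w * fy * err).sum()])
        dh = np.linalg.solve(A, b)               # raises LinAlgError if A is singular
        h += dh
        if np.linalg.norm(dh) < tol:
            break
    return h

# Usage: a smooth blob and a copy displaced by a known sub-pixel amount.
yy, xx = np.mgrid[0:64, 0:64].astype(float)
blob = lambda cx, cy: np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * 6.0 ** 2))
F_img = blob(32.0, 32.0)
G_img = blob(32.0 - 1.5, 32.0 + 0.8)             # G(x) = F(x + h), h = (1.5, -0.8)
print(lucas_kanade(F_img, G_img, (32, 32)))
```

Each iteration costs a handful of sums over the patch, compared with one NCC evaluation per candidate location in the exhaustive search.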

Lucas-Kanade
  Details: When misregistration might be large with respect to the image patch
    Smooth image
    Coarse-to-fine strategy (search in low resolution image for approximate match, then refine in high resolution image)

Kanade-Tomasi
  How to choose features to track?
    Manual annotation
    Large gradient
    Zero-crossings of Laplacian
    Corners
  No!

Kanade-Tomasi
  How to choose features to track?
    Look to tracking equation:

Kanade-Tomasi
  What properties of this thing can we use?
  If the matrix is not invertible, we can't track this patch
  How do we know if it is invertible?
    Its determinant
    Slightly more information: eigenvalues
  Find regions of the image where
    Eigenvalues are sufficiently large (larger than in a patch that is just noise)
    Ratio of eigenvalues is reasonable (matrix is well-conditioned)
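The eigenvalue test can be sketched as below, assuming "the matrix" is the 2x2 sum of gradient outer products over the patch (the matrix that has to be inverted in the tracking equation). The threshold values min_eig and max_ratio and the function names are hypothetical, picked only so the toy example behaves sensibly.

```python
import numpy as np

def gradient_matrix_eigenvalues(image, center, half=3):
    """Eigenvalues (ascending) of sum [gx^2, gx*gy; gx*gy, gy^2] over a patch."""
    gy, gx = np.gradient(image.astype(float))
    cy, cx = center
    win = (slice(cy - half, cy + half + 1), slice(cx - half, cx + half + 1))
    gxx = (gx[win] ** 2).sum()
    gyy = (gy[win] ** 2).sum()
    gxy = (gx[win] * gy[win]).sum()
    return np.linalg.eigvalsh(np.array([[gxx, gxy], [gxy, gyy]]))

def is_trackable(image, center, half=3, min_eig=1.0, max_ratio=10.0):
    """Accept a patch when both eigenvalues are large and their ratio is moderate."""
    lam_min, lam_max = gradient_matrix_eigenvalues(image, center, half)
    return bool(lam_min > min_eig and lam_max <= max_ratio * lam_min)

# Usage: a bright quadrant gives one corner, two straight edges and flat regions.
img = np.zeros((32, 32))
img[16:, 16:] = 1.0
print("corner:", is_trackable(img, (16, 16)))   # gradients in both directions
print("edge:  ", is_trackable(img, (24, 16)))   # gradient in one direction only
print("flat:  ", is_trackable(img, (8, 8)))     # no gradient at all
```

On an edge one eigenvalue is near zero, so the displacement along the edge cannot be determined; this is the aperture problem expressed in terms of the matrix.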

Shi-Tomasi
  Includes the work from Tomasi-Kanade
  Additionally:
    Extension from G(x) = F(x+h) to G(x) = F(Ax+h) (affine model of motion)
    When to give up on patches that you are tracking?
      When dissimilarity score is bad (large difference between G(x) and F(Ax+h) after solving for A and h)
    Use translation for tracking and affine model for deciding when to give up

Discussion Questions:
  Why would we want to bother with this approach?
  Why is it important for the displacement to be small with respect to the window size?
  What can we do if our assumption about the displacement is not true?