Motion

Sohaib A Khan

Adapted from Fundamentals of Computer Vision by Mubarak Shah, © 1992, and other sources.


1 Introduction

So far, we have been dealing with single images of a static scene taken by a fixed camera. Here we will deal with sequences of images taken at different time instants. Motion of an object in 3D, or motion of the camera, induces 2D motion of pixels on the image plane. This motion is called optical flow. We will first present a method to compute local optical flow at each pixel, followed by a method to compute global flow, which fits an affine model.

2 Optical Flow

Optical flow may be defined as a flow field $(u(x, y), v(x, y))$, where $u(x, y)$ is the velocity of pixel $(x, y)$ in the $x$-direction and $v(x, y)$ is its velocity in the $y$-direction. Example optical flow fields are shown in Figure 1. It can be seen from this figure that optical flow is a powerful feature for object segmentation. In addition, several applications stem from optical flow computation, such as object-based compression (MPEG-4), image stabilization and gesture recognition.

2.1 Brightness Constancy Equation

Let a 3D function $f(x, y, t)$, where $x$ and $y$ are the spatial coordinates and $t$ is the time coordinate, denote the image sequence. Then $f(x_1, y_1, t_1)$ is the gray level at pixel $(x_1, y_1)$ at time $t_1$. We assume that if there is a small change $dx$, $dy$ and $dt$ in $x$, $y$ and $t$, there is no change in the gray levels, that is:

$$f(x, y, t) = f(x + dx, y + dy, t + dt) \tag{1}$$

This equation represents the brightness constancy assumption: over small time intervals, pixels can experience small displacements but no change in intensity. This assumption is not always true in real-world sequences, because of non-Lambertian object surfaces, changes in distance from light sources, and camera noise. However, it is a reasonable simplifying assumption, based on which we can easily derive equations for computing $u$ and $v$.

Recall that the Taylor series expansion of a function $f(x)$ about a point $x = a$ is given by:

$$f(x) = f(a) + (x - a) f'(a) + \frac{1}{2!}(x - a)^2 f''(a) + \dots \tag{2}$$

Thus, the Taylor series expansion of the right-hand side of Equation 1 around $(x, y, t)$ is:

$$f(x + dx, y + dy, t + dt) = f(x, y, t) + dx \frac{\partial f}{\partial x} + dy \frac{\partial f}{\partial y} + dt \frac{\partial f}{\partial t} + \dots \tag{3}$$

Figure 1: Examples of optical flow fields: (a) translation, (b) rotation, (c) zoom, (d) unzoom are flow fields that fit a global model such as affine or projective. (e) shows an image from a sequence whose optical flow is shown in (f); here different image regions move at different velocities that do not fit a global 2D displacement model.

Ignoring higher-order terms and substituting in Equation 1, we get:

$$f(x, y, t) = f(x, y, t) + dx \frac{\partial f}{\partial x} + dy \frac{\partial f}{\partial y} + dt \frac{\partial f}{\partial t}. \tag{4}$$

The above equation can be simplified to:

$$f_x\, dx + f_y\, dy + f_t\, dt = 0, \tag{5}$$

where $f_x = \frac{\partial f}{\partial x}$, $f_y = \frac{\partial f}{\partial y}$ and $f_t = \frac{\partial f}{\partial t}$ are the $x$-, $y$- and $t$-derivatives respectively. These derivatives can be computed by convolving the sequence $f(x, y, t)$ with the masks shown in Figure 2. Dividing each term in the above equation by $dt$, we get:

$$f_x u + f_y v + f_t = 0, \tag{6}$$

where $u = \frac{dx}{dt}$ and $v = \frac{dy}{dt}$ form the optical flow. This equation is called the brightness constancy equation. It contains two unknowns, $u$ and $v$, and therefore a unique solution for these two unknowns does not exist based on the information available at a single pixel. In fact, the equation is a linear constraint on the possible solutions of $u$ and $v$, which can be seen if it is rewritten as:

$$v = -\frac{f_x}{f_y} u - \frac{f_t}{f_y}. \tag{7}$$

This is the equation of a straight line in $uv$-space. There are several possible solutions for $(u, v)$, all of which lie along this line, as shown in Figure 3.
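The masks of Figure 2 are not reproduced in this transcription, so as a stand-in, here is a minimal NumPy sketch of the derivative computation using the common 2x2 difference/averaging estimates (in the style of Horn-Schunck); the actual masks and sign conventions of Figure 2 may differ.

```python
import numpy as np

def spatiotemporal_derivatives(f1, f2):
    """Estimate f_x, f_y, f_t from two consecutive float frames f1, f2.

    Uses 2x2 difference masks averaged over the other spatial axis and
    over the two frames (an assumption; see Figure 2 for the masks the
    notes actually use). Output arrays have shape (H-1, W-1).
    """
    # Difference along x (columns), averaged over y and the two frames.
    fx = 0.25 * ((f1[:-1, 1:] - f1[:-1, :-1]) + (f1[1:, 1:] - f1[1:, :-1])
                 + (f2[:-1, 1:] - f2[:-1, :-1]) + (f2[1:, 1:] - f2[1:, :-1]))
    # Difference along y (rows), averaged over x and the two frames.
    fy = 0.25 * ((f1[1:, :-1] - f1[:-1, :-1]) + (f1[1:, 1:] - f1[:-1, 1:])
                 + (f2[1:, :-1] - f2[:-1, :-1]) + (f2[1:, 1:] - f2[:-1, 1:]))
    # Difference along t, averaged over the 2x2 spatial window.
    ft = 0.25 * ((f2[:-1, :-1] - f1[:-1, :-1]) + (f2[:-1, 1:] - f1[:-1, 1:])
                 + (f2[1:, :-1] - f1[1:, :-1]) + (f2[1:, 1:] - f1[1:, 1:]))
    return fx, fy, ft
```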

Figure 2: Derivative masks: the axis convention (left) and the derivative masks that conform to this convention. Note that in this convention, optical flow vectors go from $I_t$ to $I_{t+1}$.

Let $(\hat{u}, \hat{v})$ be the correct solution. This vector can be divided into two components, one along the constraint line, denoted by $p$, and the other along the perpendicular to the line, denoted by $d$. Since $\cos\alpha = \frac{d}{-f_t/f_x}$ and $\cos(90^\circ - \alpha) = \frac{d}{-f_t/f_y}$, where the denominators are the two intercepts of the constraint line, squaring and adding the two relations gives

$$d = \frac{-f_t}{\sqrt{f_x^2 + f_y^2}}.$$

Therefore, knowing the derivatives $f_x$, $f_y$ and $f_t$, we can only compute the normal component, $d$, of the optical flow. The parallel component $p$ cannot be computed directly from the derivatives.

2.1.1 Example

It is instructive to work through a simple example to clearly understand the concept of the normal and parallel components of optical flow. Consider a foreground object which has translated by $\hat{u} = 1$, $\hat{v} = 1$ between two frames, as shown in Figure 4. We are interested in finding the optical flow at the point marked x. Applying the masks in Figure 2 at x (taking the origin of the mask to be its bottom-right corner), we get $f_x(\mathrm{x}) = 0$, $f_y(\mathrm{x}) = 2$ and $f_t(\mathrm{x}) = -2$. This means that the possible solutions of $(u, v)$ lie along the line $v = 1$ in $uv$-space. This makes sense, because if we look at a localized neighborhood around the point x, we can only determine that the edge has moved by one pixel in the horizontal direction. All variations in the vertical direction, for example $(u = 0, v = 1)$, $(u = -1, v = 1)$, $(u = 3, v = 1)$, generate exactly the same local variations in the image, and hence the same derivatives.

This example illustrates an important problem with the brightness constancy equation, known as the aperture problem. It is illustrated in Figure 5 and can be stated as follows: the component of the motion field in the direction orthogonal to the spatial image gradient is not constrained by the brightness constancy equation.

3 Lucas-Kanade Method

The Lucas-Kanade method of finding optical flow relies on a least-squares solution for $(u, v)$ over a small neighborhood. The idea is very simple. We know that a single point yields one equation, from which two unknowns cannot be recovered. However, if we assume that the brightness constancy assumption holds for a small neighborhood around the point (typically a 3x3 or 5x5 neighborhood), then each point yields one equation, while the set of equations still has only two unknowns. This gives an over-constrained linear system, which can be solved by the least-squares method.

Figure 3: Optical flow constraint line in $uv$-space. $d$ is the length of the perpendicular from the origin to the line, and $\alpha$ is the angle the perpendicular makes with the $u$-axis. $(\hat{u}, \hat{v})$ is one possible solution, which can be divided into two components: $p$ along the constraint line, and $d$ perpendicular to the constraint line.

Figure 4: Example of an image sequence with $(1, 1)$ translation at all points between two frames. Derivative masks are applied at x (affecting the pixels enclosed in the blue square). If white pixels are 1 and black are 0, then $f_x = 0$, $f_y = 2$, $f_t = -2$. The possible solutions of $(u, v)$ lie along the line $v = 1$ in $uv$-space (right).
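To make the example concrete, here is a small sketch that evaluates the normal component derived above, with the derivative values taken from Figure 4:

```python
import numpy as np

def normal_component(fx, fy, ft):
    """Normal flow magnitude d = -f_t / sqrt(f_x^2 + f_y^2) (Section 2.1)."""
    return -ft / np.hypot(fx, fy)

# Derivative values at the point marked x in Figure 4.
fx, fy, ft = 0.0, 2.0, -2.0
d = normal_component(fx, fy, ft)  # 1.0: only this component is recoverable
# The constraint line f_x*u + f_y*v + f_t = 0 reduces to v = 1 here;
# u is left unconstrained -- the aperture problem.
print(d)
```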

Figure 5: The aperture problem: the black and gray lines represent two positions of the same image line in two consecutive frames. The image velocity perceived through a small aperture, shown on the left, is only the normal component $d$. The actual image velocity $u$ is shown on the right.

Formally, consider a 3x3 neighborhood for which the brightness constancy assumption holds, i.e., we assume that the entire neighborhood has moved over the interval $dt$ with velocity $(u, v)$. Then:

$$\begin{aligned} f_{x_1} u + f_{y_1} v &= -f_{t_1} \\ f_{x_2} u + f_{y_2} v &= -f_{t_2} \\ &\ \vdots \\ f_{x_9} u + f_{y_9} v &= -f_{t_9} \end{aligned} \tag{8}$$

If we define

$$A = \begin{bmatrix} f_{x_1} & f_{y_1} \\ f_{x_2} & f_{y_2} \\ \vdots & \vdots \\ f_{x_9} & f_{y_9} \end{bmatrix}, \quad B = \begin{bmatrix} -f_{t_1} \\ -f_{t_2} \\ \vdots \\ -f_{t_9} \end{bmatrix}, \quad \mathbf{u} = \begin{bmatrix} u \\ v \end{bmatrix}, \tag{9}$$

then this gives us a linear system of the form

$$A\mathbf{u} = B, \tag{10}$$

which can be solved by taking the pseudo-inverse:

$$A^T A \mathbf{u} = A^T B \tag{11}$$

$$\mathbf{u} = \left(A^T A\right)^{-1} A^T B \tag{12}$$

This is the least-squares solution: it finds the values of $(u, v)$ for which the squared error is minimum. This can be seen by deriving Eq. 11 in an alternate manner.
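As a concrete illustration, here is a minimal NumPy sketch of Eqs. 9-12 at a single pixel; the derivative arrays fx, fy, ft are assumed to have been computed in advance (e.g., with the masks of Figure 2).

```python
import numpy as np

def lucas_kanade_at(fx, fy, ft, row, col, half=1):
    """Least-squares flow (Eq. 12) at one pixel, from a (2*half+1)^2
    neighborhood (half=1 gives the 3x3 case used in the text)."""
    win = np.s_[row - half:row + half + 1, col - half:col + half + 1]
    A = np.stack([fx[win].ravel(), fy[win].ravel()], axis=1)  # Eq. 9
    B = -ft[win].ravel()
    # u = (A^T A)^{-1} A^T B; lstsq computes the pseudo-inverse solution
    # and stays stable when A^T A is near-singular (flat neighborhoods).
    uv, *_ = np.linalg.lstsq(A, B, rcond=None)
    return uv  # array([u, v])
```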

Consider the error term

$$e = \sum_i \left(f_{x_i} u + f_{y_i} v + f_{t_i}\right)^2. \tag{13}$$

Ideally this error would be zero over all points in the neighborhood, so we can find optimal values of $(u, v)$ by minimizing it:

$$\frac{\partial e}{\partial u} = 2 \sum_i \left(f_{x_i} u + f_{y_i} v + f_{t_i}\right) f_{x_i} = 0 \tag{14}$$

$$\frac{\partial e}{\partial v} = 2 \sum_i \left(f_{x_i} u + f_{y_i} v + f_{t_i}\right) f_{y_i} = 0$$

We can simplify these two equations and write them as:

$$\begin{bmatrix} \sum_i f_{x_i}^2 & \sum_i f_{x_i} f_{y_i} \\ \sum_i f_{x_i} f_{y_i} & \sum_i f_{y_i}^2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} -\sum_i f_{x_i} f_{t_i} \\ -\sum_i f_{y_i} f_{t_i} \end{bmatrix}, \tag{15}$$

which is simply an expanded form of Eq. 11.

3.1 Lucas-Kanade with Pyramids

Typically the derivative masks are small and therefore cannot capture fast-moving objects in the scene. There is thus a need either to use larger derivative masks or to use smaller images. One technique for the latter is to use pyramids. At the highest (coarsest) level of the pyramid, standard Lucas-Kanade is applied. The resulting flow vectors from each level are propagated to the next level, using interpolation for the intermediate values. They are then multiplied by two to compensate for the increased resolution at that level. A correction to the flow vectors is then computed by applying Lucas-Kanade again, with the additional step that $f_t$ is computed after compensating for the known estimate of flow. Finally, the correction is added to the initial estimate to obtain the optical flow at the current level. The algorithm is illustrated in Figure 6.
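The coarse-to-fine loop can be sketched as follows. This is only a structural outline: lk_dense is a placeholder for a dense Lucas-Kanade step (Eq. 15 at every pixel) whose $f_t$ is computed after warping the first frame by the current flow estimate; it is not a function defined in these notes. The simple block-mean downsampling and nearest-neighbor upsampling below are likewise stand-ins (the notes call for interpolation).

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (a simple pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_flow(field, shape):
    """Upsample a flow component to `shape` (bilinear in practice)."""
    out = np.repeat(np.repeat(field, 2, axis=0), 2, axis=1)
    pad = ((0, shape[0] - out.shape[0]), (0, shape[1] - out.shape[1]))
    return np.pad(out, pad, mode="edge")

def pyramidal_lk(frame1, frame2, levels=3):
    """Coarse-to-fine Lucas-Kanade (Section 3.1); structural sketch only."""
    pyr1, pyr2 = [frame1], [frame2]
    for _ in range(levels - 1):
        pyr1.append(downsample(pyr1[-1]))
        pyr2.append(downsample(pyr2[-1]))
    u = np.zeros(pyr1[-1].shape)  # zero flow at the coarsest level
    v = np.zeros(pyr1[-1].shape)
    for level in range(levels - 1, -1, -1):  # coarsest to finest
        f1, f2 = pyr1[level], pyr2[level]
        if level < levels - 1:
            # Propagate to the finer level, doubling magnitudes to
            # compensate for the increased resolution.
            u = 2.0 * upsample_flow(u, f1.shape)
            v = 2.0 * upsample_flow(v, f1.shape)
        # Hypothetical helper: residual flow after compensating (u, v).
        du, dv = lk_dense(f1, f2, u, v)
        u, v = u + du, v + dv
    return u, v
```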

Figure 6: Lucas-Kanade with pyramids algorithm.

4 Global Affine Flow

So far we have looked at computing optical flow at every pixel independently. Often, the sequence of images is such that the entire image is being deformed in a consistent manner. Such sequences are generated mostly by camera motion. For example, if the camera is translating or zooming, then the optical flow of an image in this sequence has global consistency. Assume that the deformation between frames $I$ and $I'$ is given by the affine transformation:

$$\begin{aligned} x' &= a_1' x + a_2 y + b_1 \\ y' &= a_3 x + a_4' y + b_2. \end{aligned} \tag{16}$$

Then the optical flow at every pixel is also related to the pixel coordinates:

$$\begin{aligned} x' - x = u &= a_1 x + a_2 y + b_1 \\ y' - y = v &= a_3 x + a_4 y + b_2, \end{aligned} \tag{17}$$

where $a_1 = a_1' - 1$ and $a_4 = a_4' - 1$. This equation gives a global model for optical flow, i.e., the $(u, v)$ values over the entire image are related by this model. Given two images, the model parameters $\mathbf{a} = (a_1, a_2, b_1, a_3, a_4, b_2)^T$ can be recovered by finding the value of $\mathbf{a}$ that minimizes the error given by the brightness constancy equation. We define the error term over the image as:

$$e = \sum_{\text{pixels}} \left(f_t + \mathbf{f}_x^T \mathbf{u}\right)^2, \tag{18}$$

where $\mathbf{f}_x = [f_x, f_y]^T$ and $\mathbf{u} = [u, v]^T$. Note that Eq. 17 can be rewritten in terms of the unknowns as follows:

$$\mathbf{u} = \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} x & y & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & x & y & 1 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ b_1 \\ a_3 \\ a_4 \\ b_2 \end{bmatrix} = X\mathbf{a} \tag{19}$$

Substituting $\mathbf{u}$ in Eq. 18, we get

$$e = \sum_{\text{pixels}} \left(f_t + \mathbf{f}_x^T X \mathbf{a}\right)^2. \tag{20}$$

This equation represents the combined deviation of the whole image from the brightness constancy equation when an affine deformation $\mathbf{a}$ is assumed. The optimal value of $\mathbf{a}$ is the one that minimizes $e$, which can be obtained by solving $\frac{\partial e}{\partial \mathbf{a}} = 0$. This gives the following equation:

$$\left(\sum_{\text{pixels}} X^T \mathbf{f}_x \mathbf{f}_x^T X\right) \mathbf{a} = -\sum_{\text{pixels}} X^T \mathbf{f}_x f_t \tag{21}$$
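A compact way to solve Eq. 21: at each pixel, $X^T \mathbf{f}_x$ is the 6-vector $(x f_x,\, y f_x,\, f_x,\, x f_y,\, y f_y,\, f_y)$, so the normal equations can be accumulated with one matrix product. A minimal sketch, assuming derivative images fx, fy, ft of equal shape:

```python
import numpy as np

def estimate_affine_flow(fx, fy, ft):
    """Solve Eq. 21 for a = (a1, a2, b1, a3, a4, b2)."""
    h, w = fx.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    x, y = xs.ravel(), ys.ravel()
    gx, gy, gt = fx.ravel(), fy.ravel(), ft.ravel()
    # Rows of G are the per-pixel 6-vectors X^T f_x.
    G = np.stack([x * gx, y * gx, gx, x * gy, y * gy, gy], axis=1)
    M = G.T @ G      # sum over pixels of X^T f_x f_x^T X (6x6 matrix)
    r = -G.T @ gt    # right-hand side of Eq. 21
    return np.linalg.solve(M, r)
```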

Figure 7: Iterative method for computing global flow. W denotes the warping module, M denotes the global motion estimation module (Eq. 21), and + denotes the process of combining two transformations. In practice, several iterations may be done at each level; these are not shown here for the sake of simplicity.

The term $\sum_{\text{pixels}} X^T \mathbf{f}_x \mathbf{f}_x^T X$ is a 6x6 matrix, which can be inverted to solve this linear system for the unknown parameters $\mathbf{a}$. In practice this process is also done using pyramids: the 2x2 derivative masks cannot capture large motion, so the process is carried out iteratively at multiple resolutions. We are given two images $I_1$ and $I_2$. At each level $l$ of the pyramid, the transformation from the previous level, $\mathbf{a}^{l-1}$, is used as the initial estimate. Image $I_1$ at this level is warped using this transformation (note that for warping, 1 has to be added back to $a_1$ and $a_4$). The remaining transformation $\delta\mathbf{a}^l$ between the warped image $I_1'$ and $I_2$ is recovered using Eq. 21. The final transformation at this level is then the product of the homogeneous transformation matrices of $\mathbf{a}^{l-1}$ and $\delta\mathbf{a}^l$. This process is illustrated in Figure 7. The initial estimate at the highest level is taken to be the identity transformation.
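The combination step (the + module in Figure 7) multiplies homogeneous matrices. A sketch of this bookkeeping under the parameterization of Eq. 19; the function names are illustrative, not from the original notes:

```python
import numpy as np

def to_homogeneous(a):
    """3x3 warp matrix for flow parameters a = (a1, a2, b1, a3, a4, b2);
    1 is added to a1 and a4, as noted for warping in the text."""
    a1, a2, b1, a3, a4, b2 = a
    return np.array([[a1 + 1.0, a2,       b1],
                     [a3,       a4 + 1.0, b2],
                     [0.0,      0.0,      1.0]])

def from_homogeneous(H):
    """Inverse of to_homogeneous: subtract 1 from the diagonal terms."""
    return np.array([H[0, 0] - 1.0, H[0, 1], H[0, 2],
                     H[1, 0], H[1, 1] - 1.0, H[1, 2]])

def combine(a_prev, delta_a):
    """Apply a_prev first, then the correction delta_a (the '+' module)."""
    return from_homogeneous(to_homogeneous(delta_a) @ to_homogeneous(a_prev))
```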