Midterm Exam Solutions

Computer Vision (J. Košecká)
October 27, 2009

HONOR SYSTEM: This examination is strictly individual. You are not allowed to talk, discuss, or exchange solutions with other fellow students. Furthermore, you are only allowed to use the book and your class notes. You may only ask questions of the class instructor. Any violation of the honor system, or of any of the ethics regulations, will be immediately reported to the George Mason University honor court.

1. (15) Filtering. The image below is an image of a 3 pixel thick vertical line.

(a) Show the resulting image obtained after convolution of the original with the following approximation of the derivative filter, [-1, 0, 1], in the horizontal direction. How many local maxima of the filter response do you obtain?

Solution: After convolution with the above filter, each row of the response is

    0 0 1 1 0 -1 -1 0 0

There are two local extrema, at 1 and -1.

(b) Suggest a filter which, when convolved with the same image, would yield a single maximum in the middle of the line. Demonstrate the result of the convolution on the original image.

Solution: Convolving the image with the filter g = [1, 2, 1], applied in the horizontal direction, yields a single maximum in the middle of the line: each row 0 0 0 1 1 1 0 0 0 becomes 0 0 1 3 4 3 1 0 0, with a single maximum of 4 at the center of the line.

(c) What is the difference between Gaussian smoothing and median filtering? How would you decide to use one vs. the other?

Solution: The median filter is a non-linear filter used mainly for noise removal; it is well suited to salt-and-pepper (impulsive) noise, which it removes without blurring edges. Gaussian filtering is a linear smoothing filter, and is the better choice when the noise is not spiky (e.g., approximately Gaussian).
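As a quick numerical check of parts (a) and (b), here is a small Python/NumPy sketch; the 9-pixel row and the zero-padded "same"-size convolution are assumptions about the figure, which is not reproduced in this transcription.

```python
import numpy as np

# One row of the line image: a 3-pixel-thick vertical line (assumed 9 pixels wide).
row = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0], dtype=float)

# (a) Derivative filter [-1, 0, 1]. np.convolve flips the kernel, so to get the
# correlation response I(x+1) - I(x-1) we convolve with the flipped kernel [1, 0, -1].
deriv = np.convolve(row, [1, 0, -1], mode="same")
print(deriv)   # [ 0.  0.  1.  1.  0. -1. -1.  0.  0.] -> two extrema, +1 and -1

# (b) Smoothing filter g = [1, 2, 1] (symmetric, so kernel flipping does not matter).
smooth = np.convolve(row, [1, 2, 1], mode="same")
print(smooth)  # [0. 0. 1. 3. 4. 3. 1. 0. 0.] -> single maximum of 4 at the line center
```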

2. (10) Image Motion. Consider an image motion model (related to the affine and translational models considered in class) which properly models the following local changes of image appearance: image translation d = [d_1, d_2]^T, change in image contrast λ, intensity offset δ, and simple isotropic expansion modelled by a parameter a. Derive a method for estimating the unknown parameters of this model and discuss conditions under which such a method would fail.

Solution: The image motion model which we will consider is derived from a modified version of the brightness constancy constraint introduced for the affine motion model, and has the following form:

    I(x, t) = λ I(Ax + d, t + dt) + δ,

where the matrix A is

    A = [a 0; 0 a],

with a being the expansion parameter and d = [d_1, d_2]^T. Similarly to the pure translational case, we expand the right-hand side of the above constraint using a Taylor series. The RHS can be written as a function of the additional parameters λ and δ, denoted by

    g(x + dx, y + dy, t + dt, λ + dλ, δ + dδ) = λ I(x + (a-1)x + d_1, y + (a-1)y + d_2, t + dt) + δ
                                              = λ I(x + dx, y + dy, t + dt) + δ,        (1)

with dx = (a-1)x + d_1 and dy = (a-1)y + d_2. Expanding g around (x, y, t, 1, 0) we get

    g(x + dx, y + dy, t + dt, λ + dλ, δ + dδ) = g(x, y, t, 1, 0) + ∂g/∂x dx + ∂g/∂y dy + ∂g/∂t dt + ∂g/∂λ dλ + ∂g/∂δ dδ + ε.

Evaluating the partial derivatives at the point of expansion and keeping only the first-order terms, we get

    g(x, y, t, λ, δ) = I(x, y, t) + I_x dx + I_y dy + I_t dt + I (λ - 1) + (δ - 0).

Substituting the RHS back into the original equation and substituting back for dx and dy, we get the following version of the brightness constancy constraint:

    I_x ((a-1)x + d_1) + I_y ((a-1)y + d_2) + I_t dt + I λ_e + δ = 0,

where λ_e = λ - 1. Separating the unknowns from the measurements and denoting a' = a - 1, we can rewrite the above constraint in the following form:

    [x I_x + y I_y, I_x, I_y, I, 1] [a', d_1, d_2, λ_e, δ]^T = -I_t,

where I_x = ∂I/∂x, I_y = ∂I/∂y, I_t = ∂I/∂t. The least squares solution for the unknowns of this model follows in the same way as in the purely affine (or translational) case: stack one such equation per pixel over a window and solve the resulting overdetermined linear system. The method fails when that system is rank deficient or ill-conditioned, e.g., in windows with little texture or with image gradients in a single direction (the aperture problem). More details are on page 383 of the book and in Chapter 4.
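The least squares step can be sketched as follows in Python/NumPy; the function name, the window handling, and the availability of precomputed derivatives I_x, I_y, I_t are assumptions for illustration.

```python
import numpy as np

def estimate_motion_params(Ix, Iy, It, I, xs, ys):
    """Least squares estimate of [a', d1, d2, lambda_e, delta] over one window.

    Ix, Iy, It, I : flattened arrays of spatial/temporal derivatives and
                    intensities at the window pixels.
    xs, ys        : pixel coordinates of those pixels.
    """
    # One row [x*Ix + y*Iy, Ix, Iy, I, 1] per pixel, as derived above.
    M = np.stack([xs * Ix + ys * Iy, Ix, Iy, I, np.ones_like(I)], axis=1)
    b = -It
    # The method fails when M is rank deficient (little texture / aperture
    # problem); lstsq then returns a minimum-norm solution that is not meaningful.
    params, _, rank, _ = np.linalg.lstsq(M, b, rcond=None)
    if rank < 5:
        raise ValueError("degenerate window: parameters not identifiable")
    a_prime, d1, d2, lambda_e, delta = params
    return a_prime + 1.0, d1, d2, lambda_e + 1.0, delta  # a, d1, d2, lambda, delta
```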

3. (15) Paracatadioptric Cameras. A paracatadioptric camera combines a paraboloidal mirror of focal length 1/2 and focus at the origin with an orthographic projection. The equation of such a paraboloidal mirror is given by

    Z = (X² + Y² - 1)/2.        (2)

Therefore, the projection (image) x = (x, y, 0)^T of a 3-D point q = (X, Y, Z)^T is obtained by intersecting a parameterized ray with the equation of the paraboloid to yield b (see Figure 1), and then orthographically projecting b onto the image plane Z = 0.

[Figure 1: The projection model for paracatadioptric cameras, showing the mirror point b and the back-projection ray associated with image point x on the image plane.]

(a) Show that the image of q = (X, Y, Z)^T is given by

    [x; y] = 1/(-Z + sqrt(X² + Y² + Z²)) [X; Y] = (1/λ) [X; Y].        (3)

(b) The back-projection ray b ∈ R³ is a ray from the optical center in the direction of the 3-D point q ∈ R³ being imaged (see Figure 1). Show that λb = q, where b = (x, y, z)^T and z = (x² + y² - 1)/2.

(c) Given two views {(x_1^i, x_2^i)}, i = 1..N, related by a discrete motion (R, T) ∈ SE(3), can one apply the 8-point algorithm to the corresponding back-projection rays {(b_1^i, b_2^i)} to compute R and T?

Solution:

(a)-(b) The intersection of the ray from the optical center in the direction of the 3-D point q with the paraboloid is a point b = (x, y, z)^T = q/λ, where λ is a scale factor. Since b lies on the paraboloid, we have z = (x² + y² - 1)/2, hence Z/λ = ((X² + Y²)/λ² - 1)/2. After multiplying by 2λ² and rearranging the terms we get λ² + 2Zλ - (X² + Y²) = 0, from which λ = -Z ± sqrt(X² + Y² + Z²). Since λ > 0, we have λ = -Z + sqrt(X² + Y² + Z²).

(c) We have q_2 = Rq_1 + T, hence λ_2 b_2 = λ_1 R b_1 + T and b_2^T T̂ R b_1 = 0. Therefore, one can apply the 8-point algorithm directly to the back-projection rays {(b_1^i, b_2^i)} to compute R and T. The only difference is that the third entry of each b need not be equal to 1, since the back-projection rays are obtained from the given images {(x_1^i, x_2^i)} by setting the third entry to z = (x² + y² - 1)/2. Notice that one may as well divide each b by its third coordinate b_z and then treat x = b/b_z as perspective images; the 8-point algorithm can then be applied as usual.
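The projection and back-projection equations are easy to verify numerically; below is a Python/NumPy sketch (the test point is arbitrary, and the function names are illustrative).

```python
import numpy as np

def para_project(q):
    """Paracatadioptric projection of a 3-D point q = (X, Y, Z), eq. (3)."""
    X, Y, Z = q
    lam = -Z + np.sqrt(X**2 + Y**2 + Z**2)   # lambda = -Z + ||q||, lambda > 0
    return np.array([X / lam, Y / lam])

def back_project(x, y):
    """Lift an image point to its back-projection ray b, with b_z = (x^2+y^2-1)/2."""
    return np.array([x, y, (x**2 + y**2 - 1) / 2])

q = np.array([1.0, 2.0, 3.0])
x, y = para_project(q)
b = back_project(x, y)
lam = -q[2] + np.linalg.norm(q)
# b is parallel to q: lambda * b == q, up to floating point error
print(np.allclose(lam * b, q))   # True
```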

4. (10) Perspective projection. Consider a person standing on the road, viewing the road at a viewing angle α.

(a) How would you compute the viewing angle, provided that you can observe the y image coordinate of the horizon line? (The horizon line is the line in the image where all parallel lines in the road plane intersect.)

(b) Suppose that the computed viewing angle is 30° and that there is an obstacle in front of the person at a distance of 2 meters from the feet. Assume that the parameters of the imaging system can be well approximated by a pinhole camera, where the resulting image has resolution 400 x 400, the focal length is f = 30, and the image of the projection center is the center of the image. The y-coordinate of the obstacle in the image is 300 pixels. How tall is the person?

[Figure: side view of the geometry; the camera looks down at the road at 30 degrees, the obstacle has image coordinate y = 300, and the obstacle is 2 m from the feet.]

Solution:

a) The image coordinate of the horizon is related to the coordinate of the vanishing point, which is the intersection of the images of two parallel lines in the ground plane. I assume that the y-axis points downwards, the z-axis forward (to the right in the figure), and the x-axis out of the plane. The coordinates of the vanishing point are the limits of

    x' = (X_c + λv_1)/(Z_c + λv_3),   y' = (Y_c + λv_2)/(Z_c + λv_3),

where v_c = [v_1, v_2, v_3]^T is the direction vector of a line in the camera coordinate frame and X = [X_c, Y_c, Z_c]^T is the base point of the line. Consider lines in the world coordinate frame which lie in the ground plane, with direction vector v_w = [0, 0, 1]^T. The same direction vector expressed in the camera frame is v_c = [0, sin α, cos α]^T (see the relationship between the camera and world coordinate frames in part b). Let y' be the actual retinal coordinate of the horizon, where y' = (y - 200)/f. Then the y'-coordinate of the vanishing point (and hence of the horizon) is obtained by letting λ → ∞:

    y' = lim_{λ→∞} (Y_c + λ sin α)/(Z_c + λ cos α) = sin α / cos α = tan α.

So α = atan(y') can be computed directly from the y'-coordinate of the horizon.

b) There are a couple of ways this can be done. The coordinate transformation between the feet coordinate frame {w} and the eye coordinate frame {c} has the following form:

    X_w = R X_c + T = [1 0 0; 0 cos α -sin α; 0 sin α cos α] X_c + [0; t_y; 0],

where t_y is the height of the person and α is the viewing angle. The inverse transformation is then expressed as

    X_c = R^T X_w - R^T T = [1 0 0; 0 cos α sin α; 0 -sin α cos α] X_w + [0; -t_y cos α; t_y sin α].

Suppose that the coordinates of the obstacle in the feet coordinate frame are [X_w, 0, Z_w]^T. Then the retinal y-coordinate y' (computed as in part a) is related to its 3-D counterpart as

    y' = Y_c / Z_c = (sin α Z_w - cos α t_y) / (cos α Z_w + sin α t_y).

Since all the above quantities except t_y are known, we can solve for it:

    t_y = (sin α - y' cos α) Z_w / (cos α + y' sin α).
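Plugging the numbers from part (b) into the formula above (a Python sketch; with the sign conventions chosen in the solution the formula yields a signed value, and the person's height is its magnitude):

```python
import numpy as np

f = 30.0                      # focal length (assumed to be in pixels)
cy = 200.0                    # principal point y (center of a 400 x 400 image)
alpha = np.deg2rad(30.0)      # viewing angle
Zw = 2.0                      # distance of the obstacle from the feet (meters)

y_prime = (300.0 - cy) / f    # retinal y-coordinate of the obstacle

t_y = (np.sin(alpha) - y_prime * np.cos(alpha)) * Zw / (
      np.cos(alpha) + y_prime * np.sin(alpha))

# The sign of t_y depends on the chosen axis orientations; the height of the
# person is its magnitude, roughly 1.88 m for these numbers.
print(abs(t_y))
```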

5. (15) Rotational motion. Consider a set of corresponding points x_1 and x_2 in pixel coordinates in two views, which are related by a pure rotation R. Assume that all the parameters of the calibration matrix K are known except the focal length f. Describe in detail the following steps of an algorithm for recovering R and the focal length f of the camera.

(a) The projected points in the two views are related by an unknown 3 x 3 matrix H. Write down the parametrization of the matrix H in terms of the rotation matrix entries r_ij and the focal length f.

(b) Describe a linear least squares algorithm for the estimation of the matrix H. What is the minimal number of corresponding points needed in order to solve for H?

(c) Given the parametrization of H derived in (a), describe a method for estimating the actual rotation and the focal length of the camera.

Solution: The rigid body motion equation in the case of pure rotation and partial calibration can be written as

    λ_2 x_2 = K_f R K_f^{-1} λ_1 x_1,

where the intrinsic calibration matrix K_f has the following form (the only unknown is the focal length):

    K_f = [f 0 0; 0 f 0; 0 0 1],

and x_1, x_2 are the partially calibrated image coordinates (the center of the image has been subtracted and the aspect ratio is assumed to be 1). Eliminating the unknown scales λ_i and denoting H = K_f R K_f^{-1}, we obtain the following constraint between the measurements and the unknown H:

    x̂_2 H x_1 = 0,

where x̂ denotes the skew-symmetric matrix of x. This constraint gives us two linearly independent equations per corresponding pair, which can be written in the form

    a_1i^T H^s = 0        (4)
    a_2i^T H^s = 0        (5)

where H^s = [h_11, h_12, h_13, h_21, h_22, h_23, h_31, h_32, h_33]^T is the stacked form of the matrix H. Given at least 4 points (two equations each), we can collect all the constraints as rows of a matrix A and solve for H^s from

    A H^s = 0.

The least squares solution of this system of homogeneous equations is the eigenvector of A^T A associated with the smallest eigenvalue (equivalently, the right singular vector of A associated with the smallest singular value). This gives us Ĥ only up to an unknown scale γ. Given this Ĥ, we need to recover K_f and R. Note that Ĥ has the following form:

    H = γ [r_11, r_12, f r_13; r_21, r_22, f r_23; r_31/f, r_32/f, r_33].

Given Ĥ, the focal length can then be estimated using the orthogonality constraints between the columns of the rotation matrix, r_1^T r_2 = 0, where r_i denotes the i-th column. From this constraint f can be recovered as

    f = sqrt( -(h_11 h_12 + h_21 h_22) / (h_31 h_32) ).

Given f, we can divide the entries h_13, h_23 by f and multiply the entries h_31, h_32 by f to eliminate the unknown focal length factor. The unknown scale γ is then obtained by normalizing the rows of the corrected Ĥ to unit norm, so as to yield a proper rotation matrix.
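The whole pipeline can be sketched as follows in Python/NumPy; the DLT construction via Kronecker products and the function names are illustrative choices consistent with the derivation above.

```python
import numpy as np

def skew(v):
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

def estimate_H(x1, x2):
    """DLT: solve skew(x2) @ H @ x1 = 0 for H from >= 4 correspondences (3xN arrays)."""
    rows = []
    for p1, p2 in zip(x1.T, x2.T):
        # kron gives the 9 coefficients of H^s for each of the 3 (rank-2) equations
        rows.append(np.kron(skew(p2), p1.reshape(1, 3)))
    A = np.vstack(rows)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)   # right singular vector of smallest singular value

def recover_f_R(H):
    """Recover focal length and rotation from H ~ gamma * K_f R K_f^{-1}."""
    f = np.sqrt(-(H[0, 0] * H[0, 1] + H[1, 0] * H[1, 1]) / (H[2, 0] * H[2, 1]))
    R = H.copy()
    R[0, 2] /= f; R[1, 2] /= f          # divide h13, h23 by f
    R[2, 0] *= f; R[2, 1] *= f          # multiply h31, h32 by f
    R /= np.linalg.norm(R[0])           # fix the scale gamma: rows of R are unit norm
    if np.linalg.det(R) < 0:            # choose the sign so that det(R) = +1
        R = -R
    return f, R
```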

6. (10) 6-point Algorithm for the Recovery of the Fundamental (Essential) Matrix. The relationship between a planar homography H and the fundamental (essential) matrix, F = T̂H (Section 5.3.4), suggests a simple alternative algorithm for the recovery of F. Outline the steps of the algorithm, assuming that you have available correspondences between at least 4 planar points and at least two points which do not lie in the plane. Proceed by first recovering H, followed by T.

Solution: First recover the homography H using the 4 planar point correspondences. Since x_2^T F x_1 = 0, we have x_2^T T̂ H x_1 = 0, hence l^T T = 0, where l = x_2 × (H x_1). Therefore, one can find the epipole T as the intersection of the two epipolar lines l_1 and l_2 defined by the two non-planar points in the second view and their correspondences from the first view warped by the homography. That is, T = l_2 × l_1. Given H and T, the fundamental matrix is F = T̂H.
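A minimal sketch of these steps in Python/NumPy (function names and input format are illustrative; H itself can be estimated from the 4 planar correspondences with the same DLT used for pure rotation in Problem 5, since the constraint x̂_2 H x_1 = 0 is identical):

```python
import numpy as np

def skew(v):
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

def fundamental_from_plane(H, x1_off, x2_off):
    """Recover F = skew(T) @ H from a plane homography H and two off-plane
    correspondences x1_off, x2_off (each a list of two homogeneous 3-vectors)."""
    # Epipolar line through the observed point x2 and the warped point H x1.
    l1 = np.cross(x2_off[0], H @ x1_off[0])
    l2 = np.cross(x2_off[1], H @ x1_off[1])
    T = np.cross(l2, l1)          # epipole: intersection of the two epipolar lines
    return skew(T) @ H
```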