Srikumar Ramalingam. Review: 3D Reconstruction, Pose Estimation Revisited. School of Computing, University of Utah

Presentation Outline

Forward Projection (Reminder)
(u, v, 1)^T ~ K R (I | -t) (X_m, Y_m, Z_m, 1)^T

Backward Projection (Reminder)
Q ~ K^{-1} q (the direction of the back-projected ray in camera coordinates)
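Both reminders can be checked numerically. A minimal numpy sketch (K, R and t are illustrative values chosen to match the sample problems later in these notes):

```python
import numpy as np

K = np.array([[200, 0, 320],
              [0, 200, 240],
              [0,   0,   1]], dtype=float)
R = np.eye(3)          # identity rotation for simplicity
t = np.zeros(3)        # camera at the world origin

# Forward projection: world point -> pixel coordinates (up to scale).
Qm = np.array([1000.0, 1000.0, 1000.0])
q = K @ R @ (Qm - t)   # with R = I, t = 0 this is just K @ Qm
u, v = q[0] / q[2], q[1] / q[2]
print(u, v)            # -> 520.0 440.0

# Backward projection: pixel -> ray direction in camera coordinates.
ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
# Qm lies on this ray: Qm = t + lam * R.T @ ray for some depth lam.
```

Here the depth along the ray is lam = 1000, recovering Qm = (1000, 1000, 1000)^T.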

Presentation Outline

Sample Problem Compute the solution for pose estimation when λ1 is given.
(λ1 X1 - λ2 X2)^2 + (λ1 Y1 - λ2 Y2)^2 + (λ1 Z1 - λ2 Z2)^2 = d12^2
(λ2 X2 - λ3 X3)^2 + (λ2 Y2 - λ3 Y3)^2 + (λ2 Z2 - λ3 Z3)^2 = d23^2
(λ3 X3 - λ1 X1)^2 + (λ3 Y3 - λ1 Y1)^2 + (λ3 Z3 - λ1 Z1)^2 = d31^2
Compute λ2 from the first equation (a quadratic, giving up to two solutions). Compute λ3 from the third equation. Use the second equation to remove incorrect solutions for λ2 and λ3.

We consider that the camera is calibrated, i.e., we know its calibration matrix K:
K = [200 0 320; 0 200 240; 0 0 1],   K^{-1} = (1/200) [1 0 -320; 0 1 -240; 0 0 200]
We are given three 2D image to 3D object correspondences. Let the three 2D points be given by:
q1 = (320, 140, 1)^T,  q2 = (320 - 50√3, 290, 1)^T,  q3 = (320 + 50√3, 290, 1)^T
Let the inter-point distances be given by {d12 = 1000, d23 = 1000, d31 = 1000}. Is it possible to have λ1 = λ2?
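A quick numerical check of this question (a sketch in numpy; the values are the ones given above):

```python
import numpy as np

# Calibration matrix from the slide.
K = np.array([[200, 0, 320],
              [0, 200, 240],
              [0,   0,   1]], dtype=float)

# The three image points q1, q2, q3 as homogeneous columns.
q = np.array([[320,                 140, 1],
              [320 - 50 * np.sqrt(3), 290, 1],
              [320 + 50 * np.sqrt(3), 290, 1]], dtype=float).T

# Back-projected ray directions v_i = K^{-1} q_i.
v = np.linalg.inv(K) @ q

# If λ1 = λ2 = λ3 = lam, each distance equation reduces to
# lam^2 * |v_i - v_j|^2 = d_ij^2, so one common lam works iff the
# three direction differences have equal length.
n12 = np.linalg.norm(v[:, 0] - v[:, 1])
n23 = np.linalg.norm(v[:, 1] - v[:, 2])
n31 = np.linalg.norm(v[:, 2] - v[:, 0])
# For these particular points all three norms come out equal, so yes:
# λ1 = λ2 (in fact λ1 = λ2 = λ3 = 1000 / n12) is a valid solution.
lam = 1000 / n12
```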

Pose estimation using n correspondences We can compute the pose using 3 correct correspondences. How do we compute the pose from n correspondences that may contain outliers? Use RANSAC to identify m inliers, where m ≤ n. Then use least squares to find the best pose using all the inliers - the basic idea is to stack the forward projection equations for all the inliers and solve for R and t.

General Version - RANSAC (Reminder)
1. Randomly choose s samples. Typically s is the minimum sample size that lets you fit a model.
2. Fit a model (e.g., a line) to those samples.
3. Count the number of inliers that approximately fit the model.
4. Repeat N times.
5. Choose the model that has the largest set of inliers.
Slide: Noah Snavely
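The steps above can be sketched concretely for line fitting (the function name, thresholds and test data below are illustrative, not from the slides):

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_thresh=0.5, rng=None):
    """Fit a line y = m*x + b to 2D points with RANSAC (s = 2)."""
    rng = np.random.default_rng(rng)
    best_model, best_inliers = None, np.zeros(len(points), bool)
    for _ in range(n_iters):                       # step 4: repeat N times
        # Step 1: randomly choose s = 2 samples.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if abs(x2 - x1) < 1e-12:
            continue                               # degenerate sample, skip
        # Step 2: fit a line to the two samples.
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # Step 3: count inliers within inlier_thresh of the line.
        resid = np.abs(points[:, 1] - (m * points[:, 0] + b))
        inliers = resid < inlier_thresh
        # Step 5: keep the model with the largest inlier set.
        if inliers.sum() > best_inliers.sum():
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers

# Usage: 80 points near y = 2x + 1 plus 20 gross outliers.
data_rng = np.random.default_rng(0)
x = data_rng.uniform(0, 10, 80)
good = np.column_stack([x, 2 * x + 1 + data_rng.normal(0, 0.1, 80)])
bad = data_rng.uniform(0, 10, (20, 2)) * [1, 5]
(m, b), inliers = ransac_line(np.vstack([good, bad]), rng=1)
```

Despite 20% outliers, the recovered line stays close to y = 2x + 1 because only models supported by a large inlier set survive.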

Let us do RANSAC!

Matching Images We match keypoints from left and right images. 2D-to-2D image matching using descriptors such as SIFT.

Kinect Sample Frames Sequences of RGBD frames (I1, D1 ), (I2, D2 ), (I3, D3 ),..., (In, Dn ). How to register Kinect depth data for reconstructing large scenes? We have 2D-3D pose estimators and 2D-2D image matchers.

Kinect Sample Frames

Matching Images We match keypoints from left and right images. One of the matches is incorrect! In a general image matching problem with thousands of matches, we can have hundreds of incorrect matches.

Presentation Outline

(Two-view triangulation) Given: calibration matrices (K1, K2). Given: camera poses {(R1, t1), (R2, t2)}. Given: a 2D point correspondence (q1, q2). Our goal is to find the associated 3D point Q^m.

(Two-view triangulation) Due to noise, the back-projected rays don't intersect. The required point is given by the mid-point Q^m = (Q1^m + Q2^m)/2. The 3D point on the first back-projected ray satisfies: q1 ~ K1 R1 (I | -t1) Q1^m. The 3D point on the second back-projected ray satisfies: q2 ~ K2 R2 (I | -t2) Q2^m.

(Two-view triangulation) Let us parametrize the 3D points using λ1 and λ2:
Q1^m = t1 + λ1 R1^T K1^{-1} q1
Q2^m = t2 + λ2 R2^T K2^{-1} q2
We rewrite using 3x1 constant vectors a, b, c and d for simplicity:
Q1^m = a + λ1 b,   Q2^m = c + λ2 d

(Two-view triangulation) We can compute λ1 and λ2 as follows:
[λ1, λ2] = arg min_{λ1, λ2} dist(Q1^m, Q2^m)

(Two-view triangulation)
dist(Q1^m, Q2^m)^2 = Σ_{i=1}^{3} (a_i + λ1 b_i - c_i - λ2 d_i)^2
[λ1, λ2] = arg min_{λ1, λ2} D_sqr,   where D_sqr = Σ_{i=1}^{3} (a_i + λ1 b_i - c_i - λ2 d_i)^2

(Two-view triangulation) At the minimum:
∂D_sqr/∂λ1 = Σ_{i=1}^{3} 2 (a_i + λ1 b_i - c_i - λ2 d_i) b_i = 0
∂D_sqr/∂λ2 = -Σ_{i=1}^{3} 2 (a_i + λ1 b_i - c_i - λ2 d_i) d_i = 0
We have two linear equations in the two variables λ1 and λ2. This can be solved!

Once the λ's are computed, we can obtain:
Q1^m = a + λ1 b,   Q2^m = c + λ2 d
We can then compute the required intersection point Q^m from the mid-point equation: Q^m = (Q1^m + Q2^m)/2

Sample Problem
Calibration matrices: K1 = K2 = [200 0 320; 0 200 240; 0 0 1]
Rotation matrices: R1 = R2 = I.
Translation vectors: t1 = 0, t2 = (100, 0, 0)^T.
Correspondence: q1 = (520, 440, 1)^T,  q2 = (500, 440, 1)^T
Compute the 3D point Q^m.
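Setting the two derivatives of D_sqr to zero gives a 2x2 linear system. A minimal numpy sketch of the mid-point method, applied to this sample problem:

```python
import numpy as np

K = np.array([[200, 0, 320],
              [0, 200, 240],
              [0,   0,   1]], dtype=float)
R1 = R2 = np.eye(3)
t1 = np.zeros(3)
t2 = np.array([100.0, 0.0, 0.0])
q1 = np.array([520.0, 440.0, 1.0])
q2 = np.array([500.0, 440.0, 1.0])

# Parametrize the two back-projected rays: Q1 = a + lam1*b, Q2 = c + lam2*d.
a, b = t1, R1.T @ np.linalg.inv(K) @ q1
c, d = t2, R2.T @ np.linalg.inv(K) @ q2

# Setting the derivatives of D_sqr to zero gives:
#   lam1*(b.b) - lam2*(b.d) = (c - a).b
#   lam1*(b.d) - lam2*(d.d) = (c - a).d
M = np.array([[b @ b, -(b @ d)],
              [b @ d, -(d @ d)]])
rhs = np.array([(c - a) @ b, (c - a) @ d])
lam1, lam2 = np.linalg.solve(M, rhs)

# Mid-point of the two closest points on the rays.
Qm = ((a + lam1 * b) + (c + lam2 * d)) / 2
print(Qm)  # ≈ [1000, 1000, 1000]
```

For these particular values the two rays meet exactly, so Q1^m = Q2^m = Q^m = (1000, 1000, 1000)^T.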

Simple Pipeline
1. Given a sequence of images {I1, I2, ..., In} with known calibration, obtain a 3D reconstruction.
2. Compute correspondences for the image pair (I1, I2).
3. Find the motion between I1 and I2 using a motion estimation algorithm (next class).
4. Compute a partial 3D point cloud P3D using the point correspondences from (I1, I2).
5. Initialize k = 3.
6. Compute correspondences for the pair (I_{k-1}, I_k) and compute the pose of I_k with respect to P3D.
7. Augment P3D with the 3D reconstruction from (I_{k-1}, I_k).
8. k = k + 1.
9. If k ≤ n, go to Step 6.

Three-view triangulation
Q1^m = a + λ1 b,   Q2^m = c + λ2 d,   Q3^m = e + λ3 f
We can compute the required point Q^m from the intersection of the three rays. What is the cost function to minimize?

Problem
Calibration matrices: K1 = K2 = K3 = [200 0 320; 0 200 240; 0 0 1]
Rotation matrices: R1 = R2 = R3 = I.
Translation vectors: t1 = 0, t2 = (100, 0, 0)^T, t3 = (200, 0, 0)^T.
Correspondence: q1 = (520, 440, 1)^T,  q2 = (500, 440, 1)^T,  q3 = (480, 440, 1)^T
Compute the 3D point Q^m.
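One natural cost is the sum of squared pairwise distances between points on the three rays, which is still linear in (λ1, λ2, λ3). A numpy sketch on this problem's values (stacking the pairwise residuals into one least-squares system is one possible formulation, not necessarily the one used in the lecture):

```python
import numpy as np

K = np.array([[200, 0, 320],
              [0, 200, 240],
              [0,   0,   1]], dtype=float)
Kinv = np.linalg.inv(K)
ts = [np.zeros(3), np.array([100.0, 0, 0]), np.array([200.0, 0, 0])]
qs = [np.array([520.0, 440, 1]),
      np.array([500.0, 440, 1]),
      np.array([480.0, 440, 1])]

# Ray i: Q_i = ts[i] + lam_i * dirs[i]   (all R_i = I here).
dirs = [Kinv @ q for q in qs]
b, d, f = dirs

# Minimize |Q1-Q2|^2 + |Q2-Q3|^2 + |Q3-Q1|^2 over (lam1, lam2, lam3).
# Each pairwise residual is linear in the lam's; stack and solve by lstsq.
A = np.vstack([np.column_stack([b, -d, np.zeros(3)]),   # Q1 - Q2 = 0
               np.column_stack([np.zeros(3), d, -f]),   # Q2 - Q3 = 0
               np.column_stack([-b, np.zeros(3), f])])  # Q3 - Q1 = 0
rhs = np.concatenate([ts[1] - ts[0], ts[2] - ts[1], ts[0] - ts[2]])
lams, *_ = np.linalg.lstsq(A, rhs, rcond=None)

# Final point: average of the three per-ray points.
Qm = np.mean([ts[i] + lams[i] * dirs[i] for i in range(3)], axis=0)
```

Here all three rays meet exactly, so the residual is zero and Q^m = (1000, 1000, 1000)^T.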

Acknowledgments Some presentation slides are adapted from the following materials: Peter Sturm, Some lecture notes on geometric computer vision (available online).