Deep Learning: Image Registration. Steven Chen and Ty Nguyen

Lecture Outline 1. Brief Introduction to Deep Learning 2. Case Study 1: Unsupervised Deep Homography 3. Case Study 2: Deep Lucas-Kanade

What is Deep Learning? Machine learning with a deep neural network. Variants include supervised learning, unsupervised learning, reinforcement learning, and more. Define: 1. Inputs 2. Architecture 3. Output 4. Loss Function. Then optimize by performing gradient descent on the network parameters.
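The define-then-optimize recipe above can be sketched in a few lines. Below is a toy, hypothetical one-parameter "network" fitted by gradient descent on a squared loss; all values are illustrative:

```python
import numpy as np

# Toy version of the recipe: define inputs, a trivial architecture
# y = w * x, an output, and a squared-error loss, then optimize the
# parameter w by gradient descent.
x = np.array([1.0, 2.0, 3.0])
y_true = 2.0 * x          # ground-truth labels for supervised learning

w = 0.0                   # the network "parameters" (a single weight)
lr = 0.05                 # learning rate
for _ in range(200):
    y_pred = w * x
    # loss = mean((y_pred - y_true)^2); its gradient w.r.t. w:
    grad = np.mean(2.0 * (y_pred - y_true) * x)
    w -= lr * grad        # gradient descent step

print(round(w, 3))        # converges toward the true weight 2.0
```

Real deep learning replaces the single weight with millions of parameters and the hand-written gradient with automatic differentiation, but the loop is the same.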

Deep Learning = Learning Features. For classification and regression: 1) Linear Methods: linear in the input space 2) Kernel Methods: linear in a pre-defined feature space 3) Neural Network Methods: linear in a learned feature space

Deep Learning = Learning Hierarchical Representations

Training Neural Networks

Convolutional Neural Network (CNN): used for images; achieves parameter reduction by exploiting spatial locality. Building blocks for CNNs: Convolutional Layer, Nonlinear Activation Function, Max-Pooling Layer. Convolution Layer: convolution instead of matrix multiplication; usually implemented as cross-correlation/filtering (the kernel is not flipped), e.g. in TensorFlow. The weights of the kernel/filter are learned.
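A minimal sketch of the cross-correlation that convolution layers actually implement, with no padding ("valid" output). The toy image and kernel values are illustrative:

```python
import numpy as np

# Convolutional layers are usually implemented as cross-correlation:
# the kernel slides over the image WITHOUT being flipped, unlike a
# textbook convolution.  Single-channel case for clarity.
def cross_correlate2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1          # "valid" output size
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])  # in a CNN these weights are learned
result = cross_correlate2d(image, kernel)
print(result)
```

Frameworks vectorize this over channels and batches, but the arithmetic per output pixel is exactly this windowed dot product.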

Deep Learning in Computer Vision. Applications: (1) Classification, (2) Segmentation, (3) Image Registration, and more. Example: Fruit Segmentation. Standard Deep Learning Template: 1) Collect image data and ground-truth labels 2) Design a network architecture 3) Train via supervised learning by minimizing a loss function against the ground truth. This template does not utilize prior knowledge about the problem (besides labels) and applies to any deep learning problem. It works well but has potential drawbacks: 1. Requires ground truth (not always available) 2. Limited generalization ability (needs a lot of diverse data) 3. Long training times. Chen et al. Counting Apples and Oranges With Deep Learning: A Data-Driven Approach. 2017

Deep Learning in Image Registration. Classification and segmentation have a lot of semantic problem structure; image registration is interesting because it has a lot of semantic and geometric structure. Key Theme of Lecture: Incorporating problem structure and utilizing insights from traditional techniques can lead to more powerful/efficient deep learning algorithms. Applications: 1. Estimating Homographies (supervised → unsupervised improves performance and data efficiency) 2. Object Tracking (offline → hybrid offline/online improves generalization) 3. Stereo Vision 4. Visual Odometry 5. Image Mosaicing 6. much more!

Case #1: Unsupervised Deep Homography Estimating Homographies

Case #1 Outline 1) Deep Homography (Supervised) 2) Deep Homography (Unsupervised) 3) Traditional Feature-Based Methods

Homography Definition: projects points on one plane to points on another plane. A 3x3 matrix defined up to scale, so it has 8 free parameters.
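A quick illustration of applying a homography to a point in homogeneous coordinates. The matrix below is a made-up pure translation with h33 fixed to 1:

```python
import numpy as np

# A homography maps plane points in homogeneous coordinates: x' ~ H x.
# H is 3x3 but defined only up to scale, hence 8 free parameters
# (here we fix h33 = 1).
H = np.array([[1.0, 0.0, 5.0],   # illustrative H: translation by (5, 3)
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])

def apply_homography(H, pt):
    u, v = pt
    x = H @ np.array([u, v, 1.0])  # lift to homogeneous coordinates
    return x[:2] / x[2]            # divide out the scale

pt_out = apply_homography(H, (2.0, 2.0))
print(pt_out)                      # -> [7. 5.]
```

For a general H the bottom row is nonzero, and the division by x[2] is what makes the mapping projective rather than affine.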

Deep Learning Based Method. Recap: Deep Learning = Learning Hierarchical Representations. Use these features to compute the homography: H = f(features on image 1, features on image 2). Network input: a pair of images. Output: a vector of 8 parameters. Training approaches: supervised or unsupervised. (Hierarchical representations of a deep CNN.)

Supervised Deep Homography. Needs ground truth, i.e., homography parameters in the training data; expensive to obtain on real data. Loss function: squared error against the ground-truth parameters. (Diagram of the supervised approach.)

Regression Model (a Deep CNN model)

Geometric Loss vs Supervised Loss. Supervised Loss: sum of squared errors in the homography parameters. Requires ground truth, which is not always available, and does not capture any properties of the transformation. Geometric Loss: sum of squared errors in pixel intensity values; the same objective as direct methods of homography estimation. Requires no ground truth => great! Captures the consistency property of the transformation => great! Idea: make use of the geometric loss => a better, unsupervised approach.
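The two losses can be contrasted in miniature; all numbers below are hypothetical toy values:

```python
import numpy as np

# Supervised loss: squared error between predicted and ground-truth
# homography parameters -- requires knowing the true H.
H_true = np.array([1.0, 0.0, 5.0, 0.0, 1.0, 3.0, 0.0, 0.0])
H_pred = np.array([1.0, 0.1, 4.0, 0.0, 1.0, 3.5, 0.0, 0.0])
supervised_loss = np.sum((H_pred - H_true) ** 2)

# Geometric (photometric) loss: squared error in pixel intensities
# between the warped first image and the second image -- no ground
# truth needed, the same objective as direct methods.
warped_img1 = np.array([[10.0, 20.0], [30.0, 40.0]])
img2 = np.array([[12.0, 20.0], [30.0, 44.0]])
geometric_loss = np.sum((warped_img1 - img2) ** 2)

print(supervised_loss, geometric_loss)
```

Note the supervised loss needs H_true, while the geometric loss only needs the two images and a differentiable warp.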

Unsupervised Deep Homography: Idea. The first and second images are fed into a deep regression model, which outputs an estimate of H; subtracting the warped first image from the second image gives the error, and the resulting geometric loss is used to train the network.

Unsupervised Deep Homography: Diagram. Uses an unsupervised loss function: does not use ground truth! Output of the deep CNN: an estimate of the homography parameters. Two new network structures: 1. Tensor Direct Linear Transform (DLT) 2. Spatial Transformation Layer

Direct Linear Transform (DLT). Recap: given 4 pairs of correspondences, recover H. Consider one pair of correspondences x_i = (u_i, v_i, 1) and x'_i = (u'_i, v'_i, 1). Use the cross-product to get rid of the unknown scale. One pair gives 2 independent equations.

Direct Linear Transform (DLT). H can be determined with 4 pairs of correspondences (8 unknown parameters). One pair gives 2 independent equations, A_i h = 0; 4 pairs give 8 independent equations, A h = 0. Solving for h is equivalent to finding the null space of A (an 8x9 matrix), e.g. via SVD. Alternative to SVD: rewrite the equation in the form A' h' = b by moving the last column of A to the right-hand side (since h33 = 1). This is straightforward to solve and makes it easy to compute gradients w.r.t. the elements of H.
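A sketch of the h33 = 1 trick with 4 toy correspondences; `dlt_homography` is an illustrative helper name, not from the paper:

```python
import numpy as np

# Build the 8x9 DLT system A h = 0 from 4 correspondences, move A's
# last column to the right-hand side (since h33 = 1), and solve the
# resulting 8x8 linear system -- no SVD required.
def dlt_homography(src, dst):
    A = []
    for (u, v), (up, vp) in zip(src, dst):
        A.append([-u, -v, -1, 0, 0, 0, u * up, v * up, up])
        A.append([0, 0, 0, -u, -v, -1, u * vp, v * vp, vp])
    A = np.array(A)                              # 8x9 matrix
    h8 = np.linalg.solve(A[:, :8], -A[:, 8])     # fix h33 = 1
    return np.append(h8, 1.0).reshape(3, 3)

src = [(0, 0), (1, 0), (1, 1), (0, 1)]           # unit square
dst = [(2, 1), (4, 1), (4, 5), (2, 5)]           # a warped version
H = dlt_homography(src, dst)
# check: H maps the first source corner onto the first target corner
p = H @ np.array([0.0, 0.0, 1.0])
print(p[:2] / p[2])
```

Because the solve is a plain linear system, gradients w.r.t. the correspondence coordinates flow through it, which is exactly what the Tensor DLT layer needs.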

Spatial Transformer. A warping algorithm that transforms an input image I into a new image I' given a transformation matrix H (H can be a homography, an affine transform, ...). Differentiable w.r.t. the elements of H. Two steps: grid generator and differentiable sampling. Grid generator: create a grid G in which each element is a pixel of the output image I', and apply the inverse of H to G. Differentiable sampling: sample image I at the warped grid locations to obtain image I'.
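A rough sketch of the two steps, using nearest-neighbor rather than bilinear sampling for brevity (the real layer uses bilinear sampling so that gradients flow w.r.t. the elements of H); the image and H below are toy values:

```python
import numpy as np

# Grid generator: apply H^-1 to every pixel of the output grid G.
# Sampler: read the input image at those warped locations.
def warp(image, H):
    h, w = image.shape
    Hinv = np.linalg.inv(H)
    out = np.zeros_like(image)
    for y in range(h):                       # every pixel (x, y) of the
        for x in range(w):                   # output grid G
            sx, sy, s = Hinv @ np.array([x, y, 1.0])
            sx, sy = sx / s, sy / s          # source location in I
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:  # sample inside the image
                out[y, x] = image[iy, ix]
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
H = np.array([[1.0, 0.0, 1.0],   # pure translation by 1 pixel in x
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
warped = warp(img, H)
print(warped)
```

Using the inverse warp per output pixel guarantees every output location gets a value, which a forward warp would not.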

Results: Performance on Real Images. Evaluate the accuracy of homography estimation using 4 pairs of correspondences. Red: ground truth; yellow: estimate. Expectation: the yellow polygon overlaps the red polygon. Methods compared: Unsupervised, Supervised, SIFT, ORB.

Feature-Based Approach. Note: these are steps that you may refer to when doing your upcoming project!

Detecting and Describing Local Features. How do we find a shared pattern between two images? Pixel intensity has a short range, from 0 to 255, so a single pixel is not distinctive. Idea: use a patch rather than a single pixel.

Simple Local Features (examples)

Simple Local Features. Corner (curvature) extraction: points where the edge direction changes rapidly are corners, e.g., Harris corners, Shi-Tomasi features. Pros: fast. Cons: sensitive to changes in image scale, and as such unsuited to matching images of differing sizes. Scale-Invariant Feature Transform (SIFT): involves two stages, feature extraction and description. Pros: invariant to image scale and rotation, with partial invariance to changes in illumination. Cons: computationally expensive. There are numerous studies in the literature!
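A minimal Harris-style corner response, as a sketch of the corner extraction mentioned above; the window size, k, and test image are illustrative choices:

```python
import numpy as np

# Harris response: build the structure tensor M from image gradients
# over a window and score R = det(M) - k * trace(M)^2.  Large positive
# R indicates a corner (gradients in two directions); this is the fast
# but scale-sensitive detector described above.
def harris_response(image, k=0.04):
    Iy, Ix = np.gradient(image)
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    h, w = image.shape
    R = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):            # 3x3 window around (x, y)
            m11 = Ixx[y-1:y+2, x-1:x+2].sum()
            m22 = Iyy[y-1:y+2, x-1:x+2].sum()
            m12 = Ixy[y-1:y+2, x-1:x+2].sum()
            det = m11 * m22 - m12 * m12
            R[y, x] = det - k * (m11 + m22) ** 2
    return R

# A white square on black: its corners should score highest.
img = np.zeros((12, 12))
img[3:9, 3:9] = 1.0
R = harris_response(img)
cy, cx = np.unravel_index(np.argmax(R), R.shape)
print(cy, cx)
```

Along an edge only one gradient direction is present, so det(M) stays near zero and R goes negative; only true corners get a large positive score.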

Feature Matching. How do we define the similarity between two features f1 and f2? A simple approach is to minimize SSD(f1, f2), the sum of squared differences between the entries of the two descriptors. This does not provide a way to discard ambiguous (bad) matches. Better approach: minimize the ratio distance = SSD(f1, f2) / SSD(f1, f2'), where f2 is the best and f2' the second-best match. Decision rule: accept a match if the ratio distance < T.
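The SSD-plus-ratio-test rule can be sketched directly; the descriptors and threshold below are toy values:

```python
import numpy as np

# Match each descriptor in desc1 to its best SSD match in desc2, but
# keep the match only when the best SSD is much smaller than the
# second-best SSD -- the ratio test that discards ambiguous matches.
def match_features(desc1, desc2, ratio_thresh=0.8):
    matches = []
    for i, f1 in enumerate(desc1):
        ssd = np.sum((desc2 - f1) ** 2, axis=1)     # SSD to every f2
        best, second = np.argsort(ssd)[:2]
        if ssd[best] / ssd[second] < ratio_thresh:  # ratio distance
            matches.append((i, best))
    return matches

desc1 = np.array([[1.0, 0.0], [0.0, 1.0]])
desc2 = np.array([[0.9, 0.1],    # close to desc1[0] -> unambiguous
                  [0.5, 0.5],    # roughly equidistant -> a distractor
                  [0.1, 0.9]])   # close to desc1[1] -> unambiguous
matches = match_features(desc1, desc2)
print(matches)
```

A descriptor whose best and second-best SSDs are similar fails the ratio test, which is exactly the ambiguous case the plain SSD rule cannot reject.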

RANSAC: Motivation Example There are outliers which represent wrong matches How to find good correspondences?

RANSAC: Motivation Example Outliers disappear!
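A sketch of the RANSAC loop that makes the outliers "disappear", using 2D line fitting as a stand-in for homography fitting; all data, iteration counts, and thresholds are illustrative:

```python
import numpy as np

# RANSAC: repeatedly fit a model to a minimal random sample, count how
# many points agree (inliers), and keep the model with the most support.
def ransac_line(points, n_iters=100, inlier_thresh=0.1, seed=0):
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue                          # degenerate vertical sample
        m = (y2 - y1) / (x2 - x1)             # minimal sample: a line
        b = y1 - m * x1                       # through 2 points
        residuals = np.abs(points[:, 1] - (m * points[:, 0] + b))
        inliers = np.sum(residuals < inlier_thresh)
        if inliers > best_inliers:            # keep best-supported model
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers

xs = np.arange(10, dtype=float)
inlier_pts = np.stack([xs, 2 * xs + 1], axis=1)   # true line: y = 2x + 1
outliers = np.array([[1.0, 9.0], [7.0, 0.0]])     # "wrong matches"
points = np.vstack([inlier_pts, outliers])
(m, b), n_in = ransac_line(points)
print(m, b, n_in)
```

For homographies the minimal sample is 4 correspondences and the model is fit by DLT, but the sample-score-keep loop is identical.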

Case #2: Deep-LK. Tracking Objects

Case #2 Outline 1) Traditional Lucas-Kanade Algorithm a) Forward-Additive b) Inverse-Compositional 2) GOTURN (Offline Deep Learning Tracker w/o LK) 3) Deep-LK (Hybrid Online/Offline Deep Learning Tracker w/ LK)

Unsupervised homography uses the Lucas-Kanade objective (e.g., with an affine warp). Lucas-Kanade (a direct method) applies to general differentiable warps and is used in: 1. Tracking 2. Optical Flow 3. Mosaicing. We will introduce the Deep-LK network, which is used in object tracking.

Overview of LucasKanade Algorithm

First-Order Taylor Expansion (Gauss-Newton), e.g., with an affine warp Jacobian. Baker and Matthews. Lucas-Kanade 20 Years On: A Unifying Framework. 2004
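A one-dimensional, translation-only sketch of the Gauss-Newton update at the heart of Lucas-Kanade; a toy sinusoid stands in for the image, and the real algorithm uses a Jacobian per warp parameter:

```python
import numpy as np

xs = np.linspace(0.0, 4 * np.pi, 200)
template = np.sin(xs)                        # T(x), the template

def I(x):                                    # I(x) = T(x - 0.3):
    return np.sin(x - 0.3)                   # the image is a shifted copy

def dI(x):                                   # image gradient dI/dx
    return np.cos(x - 0.3)

p = 0.0                                      # warp parameter (a shift)
for _ in range(10):
    warped = I(xs + p)                       # I(W(x; p)) for W(x; p) = x + p
    g = dI(xs + p)                           # steepest-descent "images"
    error = template - warped
    # linearize: I(x + p + dp) ~ I(x + p) + g * dp, then least-squares:
    # delta_p = (J^T J)^{-1} J^T error, with J = g here
    delta_p = np.sum(g * error) / np.sum(g * g)
    p += delta_p                             # forward-additive update

print(round(p, 4))                           # recovers the shift: 0.3
```

The inverse-compositional variant linearizes around the template instead, which is what lets the Jacobian and Hessian be precomputed once.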

Inverse-Compositional LK Algorithm

Precomputing the Jacobian and Hessian

GOTURN [1]: regression-based, fast (100 FPS), no LK. GOTURN has poor generalization [2]. [1] D. Held, S. Thrun, and S. Savarese. Learning to Track at 100 FPS with Deep Regression Networks. 2016

Deep-LK: minimize feature distance instead of intensity distance. Can be viewed as regression. [1] Wang et al. Deep-LK for Efficient Adaptive Object Tracking. 2017

Deep-LK. Wang et al. Deep-LK for Efficient Adaptive Object Tracking. 2017

Results. Wang et al. Deep-LK for Efficient Adaptive Object Tracking. 2017

Thank you!