Local-Level 3D Deep Learning. Andy Zeng

Similar documents
Learning from 3D Data

3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions

ECE 6554:Advanced Computer Vision Pose Estimation

CVPR 2014 Visual SLAM Tutorial Kintinuous

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington

Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Building a Panorama. Matching features. Matching with Features. How do we build a panorama? Computational Photography, 6.882

CS 223B Computer Vision Problem Set 3

CS 231A Computer Vision (Winter 2014) Problem Set 3

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington

Keypoint-based Recognition and Object Search

CS 231A Computer Vision (Fall 2012) Problem Set 3

Ensemble of Bayesian Filters for Loop Closure Detection

Super-Resolution Keyframe Fusion for 3D Modeling with High-Quality Textures

3D object recognition used by team robotto

LOCAL AND GLOBAL DESCRIPTORS FOR PLACE RECOGNITION IN ROBOTICS

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs

Human Pose Estimation with Deep Learning. Wei Yang

Learning Photographic Image Synthesis With Cascaded Refinement Networks. Jonathan Louie Huy Doan Siavash Motalebi

From 3D descriptors to monocular 6D pose: what have we learned?

Computer Vision: Making machines see

Understanding Faces. Detection, Recognition, and. Transformation of Faces 12/5/17

Self-supervised Visual Descriptor Learning for Dense Correspondence

3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis

Geometric Reconstruction Dense reconstruction of scene geometry

Analysis and Synthesis of 3D Shape Families via Deep Learned Generative Models of Surfaces

Filtering and mapping systems for underwater 3D imaging sonar

Semi-Supervised Hierarchical Models for 3D Human Pose Reconstruction

Real-Time Vision-Based State Estimation and (Dense) Mapping

3D Pose Estimation using Synthetic Data over Monocular Depth Images

3D Object Representations. COS 526, Fall 2016 Princeton University

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Shape Matching. Michael Kazhdan ( /657)

3D reconstruction how accurate can it be?

Simultaneous Localization and Mapping

Colorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.

Local Features Tutorial: Nov. 8, 04

The Hilbert Problems of Computer Vision. Jitendra Malik UC Berkeley & Google, Inc.

CS 343H: Honors AI. Lecture 23: Kernels and clustering 4/15/2014. Kristen Grauman UT Austin

Human Shape from Silhouettes using Generative HKS Descriptors and Cross-Modal Neural Networks

Dense 3D Reconstruction from Autonomous Quadrocopters

Pose estimation using a variety of techniques

A System of Image Matching and 3D Reconstruction

... arxiv: v2 [cs.cv] 5 Sep Learning local shape descriptors from part correspondences with multi-view convolutional networks

Spontaneously Emerging Object Part Segmentation

Jakob Engel, Thomas Schöps, Daniel Cremers Technical University Munich. LSD-SLAM: Large-Scale Direct Monocular SLAM

Nonlinear State Estimation for Robotics and Computer Vision Applications: An Overview

Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey

Recognizing people. Deva Ramanan

Volumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material

Edges and Binary Images

Computer vision: models, learning and inference. Chapter 13 Image preprocessing and feature extraction

Urban Scene Segmentation, Recognition and Remodeling. Part III. Jinglu Wang 11/24/2016 ACCV 2016 TUTORIAL

Snakes, level sets and graphcuts. (Deformable models)

Deep Learning for Virtual Shopping. Dr. Jürgen Sturm Group Leader RGB-D

Deformable Part Models

arxiv: v1 [cs.cv] 18 Sep 2017

Seeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach

Su et al. Shape Descriptors - III

Visual Recognition and Search April 18, 2008 Joo Hyun Kim

ECCV Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016

FLaME: Fast Lightweight Mesh Estimation using Variational Smoothing on Delaunay Graphs

KinectFusion: Real-Time Dense Surface Mapping and Tracking

Scanning and Printing Objects in 3D Jürgen Sturm

A Real-time Algorithm for Atmospheric Turbulence Correction

Instance-level recognition

Application questions. Theoretical questions

Supplementary: Cross-modal Deep Variational Hand Pose Estimation

A Comparison of SIFT and SURF

Structured light 3D reconstruction

Deep Incremental Scene Understanding. Federico Tombari & Christian Rupprecht Technical University of Munich, Germany

Instance-level recognition

Instance-level recognition part 2

arxiv: v1 [cs.cv] 28 Sep 2018

Instance-level recognition II.

Object Recognition 1

Object Recognition 1

Depth from Stereo. Dominic Cheng February 7, 2018

Geometric Registration for Deformable Shapes 1.1 Introduction

Scale Invariant Feature Transform by David Lowe

3D Shape Segmentation with Projective Convolutional Networks

6. Convolutional Neural Networks

arxiv: v3 [cs.cv] 27 Jul 2018

CS 231A Computer Vision (Fall 2011) Problem Set 4

Beyond Firewalls: The Future Of Network Security

SurfNet: Generating 3D shape surfaces using deep residual networks-supplementary Material

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

Edge and Texture. CS 554 Computer Vision Pinar Duygulu Bilkent University

Deep Face Recognition. Nathan Sun

Implementation and Comparison of Feature Detection Methods in Image Mosaicing

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

Image Features: Detection, Description, and Matching and their Applications

Dense Tracking and Mapping for Autonomous Quadrocopters. Jürgen Sturm

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon

Detection III: Analyzing and Debugging Detection Methods

Local invariant features

Transcription:

Local-Level 3D Deep Learning Andy Zeng 1

Matching 3D Data Image Credits: Song and Xiao, Tevs et al. 2

Matching 3D Data Reconstruction Image Credits: Song and Xiao, Tevs et al. 3

Matching 3D Data Reconstruction Image Credits: Song and Xiao, Tevs et al. Shape retrieval Object pose estimation 4

Matching 3D Data Reconstruction Image Credits: Song and Xiao, Tevs et al. Shape retrieval Object pose estimation Aligning deformable shapes 5

Matching 3D Data Establish 3D geometric correspondences 6

Matching 3D Data Establish 3D geometric correspondences Find interesting 3D features Match 3D features 7

Matching 3D Features in Scanning Data is Hard Partial and noisy scan data 8

Matching 3D Features in Scanning Data is Hard Partial and noisy scan data Viewpoint variance 9

Matching 3D Features in Scanning Data is Hard Partial and noisy scan data Viewpoint variance 10

Matching 3D Features in Scanning Data is Hard Partial and noisy scan data Viewpoint variance Traditional hand-crafted 3D feature descriptors do not work well! 11

Solution: Let the data speak for itself! 12

Solution: Let the data speak for itself! http://3dmatch.cs.princeton.edu/ 3DMatch: 3D ConvNet that recognizes correspondences in 3D scan data 1 2 13

3D Data Representation Use truncated distance fields (TDF) Image Credits: Song and Xiao 14

3D Data Representation Use truncated distance fields (TDF) Intermediate 3D Representation Image Credits: Song and Xiao 15

3D Data Representation Use truncated distance fields (TDF) Intermediate 3D Representation Image Credits: Song and Xiao Enables 3D Convolution 16

3D Data Representation 17

3D Data Representation 18

Metric Network vs. L2 Distance 19

Metric Network vs. L2 Distance L2 contrastive loss 20

Metric Network vs. L2 Distance L2 contrastive loss Use metric network for accuracy, use L2 distance for speed 21

Generating Training Data Automatically Manually label geometric correspondences? Too much work! 22

Generating Training Data Automatically Manually label geometric correspondences? Too much work! Think of all those maps that we've built using large-scale SLAM and all those correspondences that these systems provide isn t that a clear path for building terascale image-image "association" datasets which should be able to help deep learning? The basic idea is that today's SLAM systems are large-scale correspondence engines which can be used to generate largescale datasets, precisely what needs to be fed into a deep ConvNet. Newcombe s Proposal: Use SLAM to help Deep Learning 23 Image Credits: Malisiewicz et al.

Generating Training Data Automatically Manually label geometric correspondences? Too much work! The basic idea is that today's SLAM systems are largescale correspondence engines which can be used to generate large-scale datasets, precisely what needs to be fed into a deep ConvNet. Newcombe s Proposal: Use SLAM to help Deep Learning Tomasz Malisiewicz s Computer Vision Blog ICCV s Future of Real-Time SLAM Workshop Solution: Use existing 3D reconstructions to fuel correspondence labels! 24 Image Credits: Malisiewicz et al.

Generating Training Data Automatically Solution: Use existing 3D reconstructions to fuel correspondence labels! 25 Image Credits: Shotton et al.

26

3DMatch for Reconstruction 27

3DMatch for Loop Closures 28

3DMatch for Loop Closures 29

3DMatch for Loop Closures 30

3DMatch for Loop Closures 31

3DMatch for 3D Reconstruction 32

3DMatch for Other Applications Shape retrieval Object pose estimation 33

Evaluation: 3DMatch vs. Others Correspondence 34

Evaluation: 3DMatch > Others Correspondence Geometric Registration 35

Keypoint Selection Does Not Matter 36

Keypoint Selection Does Not Matter The choice of keypoints do not matter much. 37

Conclusion 38

3DMatch http://3dmatch.cs.princeton.edu/ 3D ConvNet that recognizes correspondences in 3D scan data 39