Revisiting 3D Geometric Models for Accurate Object Shape and Pose

M.¹, Michael Stark²,³, Bernt Schiele³, Konrad Schindler¹
¹ Photogrammetry and Remote Sensing Laboratory, Swiss Federal Institute of Technology (ETH), Zurich
² Artificial Intelligence Lab, Stanford University, USA
³ Max-Planck-Institute for Informatics, Saarbrücken, Germany

Current object models: coarse-grained estimates

Our goal: finer-grained models to aid scene-level reasoning

Revival of 3D geometric representations (timeline, 1970 to 2010)
- 1970s and 1980s: [Marr, Nishihara 78] [Brooks 81] [Pentland 86] [Lowe 87]
- 1990s: [Koller, Daniilidis, Nagel 93] [Sullivan, Worrall, Ferryman 95] [Haag, Nagel 99]
- 2000s onward: [Hoiem, Efros, Hebert 08] [Ess, Leibe, Schindler, Van Gool 09] [Wang, Gould, Koller 10] [Hedau, Hoiem, Forsyth 10] [Barinova, Lempitsky, Tretyak, Kohli 10] [Gupta, Efros, Hebert 10] [Wojek, Roth, Schindler, Schiele 10]

Related work in viewpoint invariant detection
1) Multiple, viewpoint-dependent representations (connected in different ways): [Thomas et al., 06] [Yan, Khan, Shah 07] [Ozuysal, Lepetit, Fua 09] [Nachimson, Basri 09] [Su, Sun, Fei-Fei, Savarese 09] [Gu, Ren 10] [Stark, Goesele, Schiele 10]
2) Explicit 3D geometry representation: [Liebelt, Schmid 10] [Sun, Xu, Bradski, Savarese 10] [Gupta, Efros, Hebert 10] [Chen, Kim, Cipolla 10] [Gupta, Satkin, Efros, Hebert 11]

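Overview
- 3D geometry: 3D CAD models -> simplify -> PCA -> 3D Active Shape Model
- Local appearance: 3D CAD models -> render -> positive examples (per part), plus negative examples (background) -> AdaBoost
- Inference: test image -> detection maps, combined with the 3D Active Shape Model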

Representation: 3D geometry
Simplified 3D wireframes: fixed number of vertices

Learning: 3D geometry
- Eigen-Cars: Principal Components Analysis (PCA)
- Tightly constrained global geometry
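
The PCA step can be sketched roughly as follows. This is a minimal sketch, assuming each simplified wireframe is flattened into one vector of 3D vertex coordinates; the function name `fit_shape_model` and the random toy data are hypothetical and do not come from the slides.

```python
import numpy as np

def fit_shape_model(wireframes, num_components):
    """Fit a PCA shape model to wireframes of shape (n_models, n_vertices, 3)."""
    n_models = wireframes.shape[0]
    X = wireframes.reshape(n_models, -1)      # one flattened wireframe per row
    mean = X.mean(axis=0)                     # mean wireframe
    # SVD of the centered data: rows of Vt are the principal directions.
    _, singular_values, Vt = np.linalg.svd(X - mean, full_matrices=False)
    directions = Vt[:num_components]
    # Standard deviation of the data along each retained direction.
    stddevs = singular_values[:num_components] / np.sqrt(n_models - 1)
    return mean, directions, stddevs

# Random toy data standing in for 38 CAD models with 36 vertices each (illustrative only).
rng = np.random.default_rng(0)
toy_wireframes = rng.normal(size=(38, 36, 3))
mean, directions, stddevs = fit_shape_model(toy_wireframes, num_components=5)
```

Keeping only a few directions is what makes the global geometry tightly constrained: a hypothesis can only move along the retained modes of variation.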

Representation: Local appearance
- Accurate foreground shape
- Very cheap training data, dense sampling of viewpoints!

Learning: Local appearance
- Dense Shape Context features [Belongie, Malik 00]
- AdaBoost classifiers (per part-viewpoint), trained on positive (+) and negative (-) examples
- Annotated vertices are our parts.
- Related work: [Andriluka, Roth, Schiele 09]

Inference
- Test image
- Detection maps
- Sample 3D wireframes, project, compute image likelihood

Inference
Detection maps: sample 3D cars, project, compute image likelihood.
Quantities annotated on the slide:
- image evidence
- shape of wireframe
- camera focal length
- recognition hypothesis
- viewpoint parameters (azimuth and elevation)
- image space translation and scaling
- projection matrix
- local part scale
- part likelihood
- self-occlusion indicator
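
The project-and-score step might look roughly like the sketch below. It is not the authors' formula: the camera convention (rotation about a vertical y axis, a fixed `distance` to the object), the averaging of log part likelihoods over visible parts, and all function and parameter names are assumptions made for illustration.

```python
import numpy as np

def rotation(azimuth_deg, elevation_deg):
    """Rotation for an assumed camera convention: azimuth about y, then elevation about x."""
    a, e = np.radians(azimuth_deg), np.radians(elevation_deg)
    R_az = np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])
    R_el = np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(e), -np.sin(e)],
                     [0.0, np.sin(e), np.cos(e)]])
    return R_el @ R_az

def project(vertices, azimuth_deg, elevation_deg, focal, distance, scale, shift):
    """Perspective-project (n, 3) vertices, then apply image-space scale and translation."""
    cam = vertices @ rotation(azimuth_deg, elevation_deg).T
    cam[:, 2] += distance                     # place the object in front of the camera
    xy = focal * cam[:, :2] / cam[:, 2:3]     # pinhole projection
    return scale * xy + np.asarray(shift)

def hypothesis_score(points_2d, detection_maps, visible):
    """Mean log part likelihood over parts that are visible and fall inside their map."""
    total, count = 0.0, 0
    for (x, y), part_map, is_visible in zip(points_2d, detection_maps, visible):
        row, col = int(round(y)), int(round(x))
        inside = 0 <= row < part_map.shape[0] and 0 <= col < part_map.shape[1]
        if is_visible and inside:
            total += np.log(part_map[row, col] + 1e-9)   # small constant avoids log(0)
            count += 1
    return total / count if count else -np.inf
```

In a sampling scheme, many hypotheses (shape, focal length, viewpoint, translation, scale) would be drawn, each projected with `project` and ranked with `hypothesis_score`; the `visible` flags play the role of the self-occlusion indicator.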

Experimental evaluation - Test dataset
- Evaluation on the 3D Object Classes dataset [Savarese et al., 2007]
- Car class: 8 azimuth angles, 2 elevation angles, 3 distances, varying backgrounds
- 240 images, 5 cars

Experimental evaluation - Training
- 38 3D CAD models
- 36 vertices as model points, 20 annotations per model (due to symmetry)
- Separate local part shape detectors trained from:
  - 72 different azimuth angles
  - 2 different elevation angles (7.5° and 15° from the ground plane)
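
The training viewpoints can be enumerated as in the sketch below. It assumes the 72 azimuth angles are uniformly spaced (5° apart) and that one detector is trained per annotated part and viewpoint; neither detail is stated explicitly on the slide, and the variable names are hypothetical.

```python
from itertools import product

NUM_AZIMUTHS = 72
ELEVATIONS_DEG = (7.5, 15.0)
NUM_ANNOTATED_PARTS = 20

# Assumption: azimuths are sampled uniformly around the full circle.
azimuth_step = 360.0 / NUM_AZIMUTHS
azimuths = [i * azimuth_step for i in range(NUM_AZIMUTHS)]

# Every (azimuth, elevation) pair is one rendered training viewpoint.
viewpoints = list(product(azimuths, ELEVATIONS_DEG))

# Assumption: one classifier per (part, viewpoint) combination.
detector_keys = [(part, az, el)
                 for part in range(NUM_ANNOTATED_PARTS)
                 for az, el in viewpoints]
```

Under these assumptions the grid has 144 viewpoints, which illustrates why rendering from CAD models is attractive: collecting real images at such a dense sampling would be costly.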

Experimental evaluation - Initialization
Two initializations:
- Stark et al., 2010 (full system)
- True initial value (tight bounding box, rough azimuth)

Experimental evaluation - Inference (figure)

Example wireframe fits
Parts correctly localized:
- Full system: 74.2%
- True initial value: 83.4%

Fine-grained 3D geometry estimation
Accurate estimation of closest 3D CAD model, camera parameters, and ground plane

Ultra-wide baseline matching
- Ultra-wide baseline matching using only model fits (corresponding part locations)
- Impossible using interest point matching
- Related work: [Bao, Savarese 11]

Ultra-wide baseline matching

Azimuth difference   No. of image pairs   True initial value   Full system   SIFT   Part detections only
45°                  53                   91%                  55%           2%     27%
90°                  35                   91%                  60%           0%     27%
135°                 29                   69%                  52%           0%     10%
180°                 17                   59%                  41%           0%     24%

Correct fit = Sampson error < E_max on ground truth correspondences.
The 3D geometric model improves significantly over part detections only.
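
The correctness criterion can be illustrated with the standard Sampson error of a point correspondence under a fundamental matrix. This is a minimal sketch: how the fundamental matrix is estimated from the model fits is not covered here, requiring every correspondence to pass is an assumption, and the names `sampson_error`, `is_correct_fit`, and `e_max` are hypothetical.

```python
import numpy as np

def sampson_error(F, x1, x2):
    """First-order geometric error of the correspondence x1 <-> x2 under F."""
    p1 = np.array([x1[0], x1[1], 1.0])        # homogeneous point in image 1
    p2 = np.array([x2[0], x2[1], 1.0])        # homogeneous point in image 2
    Fp1 = F @ p1
    Ftp2 = F.T @ p2
    numerator = float(p2 @ Fp1) ** 2          # squared epipolar constraint residual
    denominator = Fp1[0] ** 2 + Fp1[1] ** 2 + Ftp2[0] ** 2 + Ftp2[1] ** 2
    return numerator / denominator

def is_correct_fit(F, correspondences, e_max):
    """Assumed criterion: every ground truth correspondence has error below e_max."""
    return all(sampson_error(F, x1, x2) < e_max for x1, x2 in correspondences)

# Toy check with the fundamental matrix of a pure sideways camera translation.
F_toy = np.array([[0.0, 0.0, 0.0],
                  [0.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0]])
```

For `F_toy`, corresponding points must share the same image row, so a pair such as `(10, 5)` and `(20, 5)` has zero error.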

Multiview recognition
- Rescored hypotheses
- Good 2D localization

Continuous viewpoint estimation

Method                Total images   True positives   % correct azimuth   Average error azimuth   Average error elevation
Stark et al., 2010    48             46               67.4%               4.2°                    4.0°
Full system           48             45               73.3%               3.8°                    3.6°
True initial value*   48             48               89.6%               4.2°                    3.6°

Comparison against manually labeled ground truth pose.
The full system improves % correct azimuth by about 6 percentage points over Stark et al., 2010.
* Approximate pose initialization quantized to 45° steps
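
Azimuth is a circular quantity, so errors have to be measured with wraparound. The sketch below shows one way to compute the two azimuth columns; the slides do not state what tolerance counts as a correct azimuth, so `tolerance_deg` is a hypothetical parameter, as are the function names.

```python
def azimuth_error(predicted_deg, truth_deg):
    """Smallest absolute angular difference in degrees, between 0 and 180."""
    diff = abs(predicted_deg - truth_deg) % 360.0
    return min(diff, 360.0 - diff)

def summarize(predictions, truths, tolerance_deg):
    """Return (% of azimuths within tolerance, mean azimuth error in degrees)."""
    errors = [azimuth_error(p, t) for p, t in zip(predictions, truths)]
    num_correct = sum(e <= tolerance_deg for e in errors)
    return 100.0 * num_correct / len(errors), sum(errors) / len(errors)
```

For example, a prediction of 350° against a ground truth of 10° is an error of 20°, not 340°.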

Conclusion
3D deformable object class models have potential for accurate geometric reasoning at scene level:
- accurate object localization
- geometric parts in 2D
- 3D pose estimation
Novel application examples:
- fine-grained object categorization
- ultra-wide baseline matching
Future extensions:
- efficient multi-class methods for part likelihoods
- analyze importance of geometric model vs. local appearance
- occlusion invariance

OLD SLIDES

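Learning: 3D Geometry (Eigen-Cars)
$$X = \mu + \sum_{k=1}^{r} s_k \sigma_k p_k + \epsilon$$
- $X$: any wireframe
- $\mu$: mean wireframe
- $s_k$: weight of the $k$-th principal component
- $\sigma_k$: standard deviation of the $k$-th principal component
- $p_k$: direction of the $k$-th principal component
- $\epsilon$: residual (if $r < m$)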
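
Generating a wireframe from such a model amounts to adding weighted, scaled principal directions to the mean. The following is a minimal sketch that omits the residual term; the function name `synthesize_wireframe` and the array layout (flattened x, y, z coordinates per vertex) are assumptions.

```python
import numpy as np

def synthesize_wireframe(mean, directions, stddevs, weights):
    """Return an (n_vertices, 3) wireframe from PCA weights, ignoring the residual."""
    weights = np.asarray(weights, dtype=float)
    # Each weight is expressed in units of standard deviations along its direction.
    offset = (weights * stddevs) @ directions
    return (mean + offset).reshape(-1, 3)

# Tiny hand-made model with 2 vertices and 2 principal directions (illustrative only).
toy_mean = np.zeros(6)
toy_directions = np.eye(6)[:2]
toy_stddevs = np.array([2.0, 3.0])
toy_shape = synthesize_wireframe(toy_mean, toy_directions, toy_stddevs, [1.0, -1.0])
```

Expressing weights in standard deviations makes it easy to restrict sampling to plausible shapes, for instance by drawing each weight from a bounded range.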

Part localization
A part counts as correctly localized if it lies within 4% of the car length from its ground truth position.
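
This criterion is straightforward to express in code. The sketch below assumes part positions and the car length are both measured in image pixels, which the slide does not specify; the function names are hypothetical.

```python
import math

def is_correctly_localized(predicted_xy, truth_xy, car_length_px, tolerance=0.04):
    """True if the predicted part lies within tolerance * car length of the ground truth."""
    return math.dist(predicted_xy, truth_xy) <= tolerance * car_length_px

def localization_rate(predicted, truth, car_length_px):
    """Percentage of parts that satisfy the localization criterion."""
    hits = sum(is_correctly_localized(p, t, car_length_px)
               for p, t in zip(predicted, truth))
    return 100.0 * hits / len(truth)
```

With a car that is 200 pixels long, the threshold is 8 pixels, so a part that is 5 pixels off counts as correct and one that is 10 pixels off does not.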
