Learning from 3D Data
|
|
- Earl Lamb
- 6 years ago
- Views:
Transcription
1 Learning from 3D Data Thomas Funkhouser Princeton University* * On sabbatical at Stanford and Google
2 Disclaimer: I am talking about the work of these people Shuran Song Andy Zeng Fisher Yu Yinda Zhang Maciej Halber Jerry Liu Manolis Savva Angel Chang Matthias Niessner Jianxiong Xiao Matt Fisher Angela Dai
3 Goal Learning to 3D objects in indoor environments Detect Segment Match Label Complete Synthesize Edit etc.
4 Motivation Applications Scene understanding Robot manipulation Object modeling etc.
5 Scale of region Focus of This Talk Learning patterns in 3D space Small Patch Object View Scene Large For each scale: Motivating application? Source of training data? Methods? Results?
6 Scale of region Talk Outline (Part 1 of 4) Learning patterns in 3D space Small Patch Object View Scene Large A. Zeng, S. Song, M. Niessner, M. Fisher, J. Xiao, T. Funkhouser, 3DMatch: Learning Local Geometric Descriptors from 3D Reconstructions, CVPR 2017
7 3DMatch Goal: train a discriminating 3D local shape descriptor from data Local shape descriptor Local shape descriptor Match!
8 3DMatch Challenge: where to get training data?
9 3DMatch Approach: train on wide-baseline correspondences in RGB-D reconstructions
10 3DMatch Approach: train on wide-baseline correspondences in RGB-D reconstructions Ground truth match between RGB-D Images from different views
11 3DMatch Method: sample true/false correspondences from RGB-D reconstructions, train Siamese network
12 3DMatch Result: learns to discriminate local shapes found in real-world data
13 3DMatch Results Result 1: learned feature descriptor predicts RGB-D point correspondences more accurately than hand-tuned descriptors Match classification error at 95% recall Fragment Alignment Success Rate
14 3DMatch Results Result 2: feature descriptor learned from RGB-D reconstructions provides matching for recognizing poses of small objects in Amazon Picking Challenge Object pose prediction accuracy Predicting pose of 3D object model in RGB-D scan
15 3DMatch Results Result 3: feature descriptor learned from RGB-D reconstructions provides discriminative matching of semantic correspondences on 3D meshes
16 Scale of region Talk Outline (Part 2 of 4) Learning patterns in 3D space Small Patch Object View Scene Large J. Liu, F. Yu, T. Funkhouser, Interactive 3D Modeling with a Generative Adversarial Network,, 3DV 2017
17 Interactive 3D Modeling with a GAN Goal: interactive modeling of objects with a SIMPLE user interface
18 Interactive 3D Modeling with a GAN Challenge: difficult to create complete, detailed, realistic shapes
19 Interactive 3D Modeling with a GAN Approach: learn a shape manifold, and then use it to help a user to design new realistic shapes Latent Space of Shape Manifold
20 Interactive 3D Modeling with a GAN Where to get training data? ShapeNet Chang et al., ShapeNet: An Information-Rich 3D Model Repository, arxiv, December 2015.
21 Interactive 3D Modeling with a GAN Prior work: editing images with shape manifold J-Y Zhu, P. Krähenbühl, E. Shechtman, A. Efros, Generative Visual Manipulation on the Natural Image Manifold ECCV 2016
22 Interactive 3D Modeling with a GAN New idea: allow users to snap partial edits to 3D shape manifold
23 Interactive 3D Modeling with a GAN New idea: edit, snap edit, snap
24 Interactive 3D Modeling with a GAN Method: constrain edits to shape manifold learned with 3DGAN x
25 Interactive 3D Modeling with a GAN Method: map user edits to shape manifold with learned projection operator x G(z) Projection z=p(x) x x P S (x)
26 Interactive 3D Modeling with a GAN Method: two step projection operator accounts for similarity and realism Step 1: project into latent space Step 2: optimize for realism Realism of output Similarity to input
27 Interactive 3D Modeling with a GAN Results: comparison to one-step projection network
28 Interactive 3D Modeling with a GAN Results: comparison of snap operator to retrieval of nearest neighbor
29 Interactive 3D Modeling with a GAN Results: example editing sequences for chairs
30 Interactive 3D Modeling with a GAN Results: example editing sequences for planes and tables
31 Scale of region Talk Outline (Part 3 of 4) Learning patterns in 3D space Small Patch Object View Scene Large S. Song, F. Yu, A. Zeng, A. Chang, M. Savva, and T. Funkhouser, Semantic Scene Completion from a Single Depth Image, CVPR 2017
32 Semantic Scene Completion Goal: given an RGB-D image, label all voxels by semantic class Input: Single view depth map Output: Semantic scene completion
33 Semantic Scene Completion Goal: given an RGB-D image, label all voxels by semantic class visible surface free space occluded space outside view outside room 3D Scene
34 Semantic Scene Completion Goal: given an RGB-D image, label all voxels by semantic class visible surface free space occluded space outside view outside room 3D Scene
35 Semantic Scene Completion Prior work: segmentation OR completion surface segmentation Silberman et al. scene completion Firman et al. 3D Scene The occupancy and the object This identity paper are tightly intertwined! semantic scene completion
36 Semantic Scene Completion: SSCNet Approach: end-to-end deep network Prediction: N+1 classes SSCNet 3D ConvNet Input: Single view depth map Output: Semantic scene completion
37 Semantic Scene Completion : SSCNet
38 Semantic Scene Completion : SSCNet
39 Semantic Scene Completion : SSCNet Encode 3D space using flipped TSDF
40 Semantic Scene Completion : SSCNet Encode 3D space using flipped TSDF Voxel size: 0.02 m
41 Semantic Scene Completion : SSCNet Local geometry Receptive field: 0.98 m
42 Semantic Scene Completion : SSCNet High-level 3D context via big receptive field provided by dilated convolution Receptive field: 2.26
43 Semantic Scene Completion : SSCNet Multi-scale aggregation Receptive field: 0.98 m Receptive field:1.62 m Receptive field: 2.26 m
44 Semantic Scene Completion: SSCNet Experiments Where to get training data?
45 Semantic Scene Completion: SSCNet Experiments Where to get training data? NYU: only visible surfaces SUN3D: No semantic labels No dense volumetric ground truth with semantic labels for a complete scene
46 Semantic Scene Completion: SSCNet Experiments SUNCG dataset
47 Semantic Scene Completion: SSCNet Experiments SUNCG dataset 46K houses 50K floors 400K rooms 5.6M object instances
48 Semantic Scene Completion: SSCNet Experiments SUNCG dataset synthetic camera views depth ground truth semantic scene completion
49 Semantic Scene Completion: SSCNet Experiments Train on SUNCG Test on NYU
50 Semantic Scene Completion: SSCNet Results Result: better than previous volumetric completion algorithms Comparison to previous algorithms for volumetric completion
51 Color Image Observed Surface Ground Truth Zhang et al. Firman et al. Ours(SSCNet)
52 Semantic Scene Completion: SSCNet Results Result: better than previous 3D model fitting algorithms Comparison to previous algorithms for 3D model fitting
53 Color Image Observed Surface Ground Truth Lin et al. Geiger and Wang Ours(SSCNet)
54 Color Image Observed Surface Ground Truth Lin et al. Geiger and Wang Ours(SSCNet)
55 Scale of region Talk Outline (Part 4 of 4) Learning patterns in 3D space Small Patch Object View Scene Large Y. Zhang, M. Bai, J. Xiao, P. Kohli, and S. Izadi, DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding, ICCV 2017
56 DeepContext Goal: given a RGB-D image, perform holistic 3D scene understanding wall lamp nightstand bed ceiling dresser dresser Input RGB-D Image floor All Objects Identified with 3D Bounding Boxes
57 DeepContext Goal: given a RGB-D image, perform holistic 3D scene understanding wall lamp nightstand bed ceiling dresser dresser Input RGB-D Image floor
58 DeepContext Challenge: how parse whole scene with a single deep network?? Want learned models for: Object shapes Object co-occurances Object spatial relationships Object supports etc. wall lamp nightstand floor bed ceiling dresser dresser
59 DeepContext Challenge: how parse whole scene with a single deep network? wall lamp ceiling dresser nightstand bed dresser 3D Convolutional Neural Network? floor
60 DeepContext Approach: different deep network templates for different scene types Sleeping Area Office Area Lounging Area Table & Chair Set
61 Sleeping Area dresser with mirror sofa ottoman dresser bed dresser dresser with mirror nightstand lamp wall
62 DeepContext Sleeping Area Office Area Lounging Area Table & Chair Set
63 1. Rotation 2. Translation Depth Image 3D Volume (TSDF) Transformation Network
64 1. Rotation 2. Translation Depth Image 3D Volume (TSDF) Transformation Network
65 1. Rotation 2. Translation Depth Image 3D Volume (TSDF) Transformation Network
66 1. Rotation 2. Translation Depth Image Initial Template Alignment (Top View) 3D Volume (TSDF) 3D Context Network 1. Object Existence 2. Box offset Transformation Network
67 1. Rotation 2. Translation Depth Image Initial Template Alignment (Top View) 3D Volume (TSDF) 3D Context Network 1. Object Existence 2. Box offset Transformation Network
68 1. Rotation 2. Translation Depth Image Initial Template Alignment (Top View) 3D Volume (TSDF) 3D Context Network 1. Object Existence 2. Box offset Transformation Network
69 1. Rotation 2. Translation Depth Image Initial Template Alignment (Top View) 3D Volume (TSDF) 3D Context Network 1. Object Existence 2. Box offset Transformation Network
70 Depth Image Initial Template Alignment (Top View) 3D Volume (TSDF) 3D Context Network lamp wall ceiling dresser Transformation Network 1. Object Existence 2. Box offset nightstand bed dresser 1. Rotation 2. Translation floor
71 DeepContext Results Where to get training data? SUN RGB-D S. Song, S. Lichtenberg, and J. Xiao. SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite, CVPR 2015
72 DeepContext - Results Depth + Truth Result 2D Initial Alignment Initial Alignment Top View
73 DeepContext - Results Depth + Truth Result 2D Initial Alignment Initial Alignment Top View
74 DeepContext - Results Depth + Truth Result 2D Initial Alignment Initial Alignment Top View
75 DeepContext Results Experiments with SUN RGB-D show that DeepContext provides better 3D object detections than previous methods: Depth + Ground Truth 3D Object Detections Average Precision for 3D Object Detection
76 Summary Four projects where ConvNets are trained on 3D voxel grids with different Scales Tasks Training data Loss functions Network architectures Training protocols Problem structures Application domains
77 Future Challenges More efficient 3D convolutions Larger 3D data sets Larger 3D contexts Learning 3D interactions Better 3D representations
78 Future Challenges More efficient 3D convolutions Larger 3D data sets Larger 3D contexts Learning 3D interactions OctNet Riegler et al., CVPR 2017 Better 3D representations PointNet++ Qi et al., NIPS 2017
79 Future Challenges More efficient 3D convolutions Larger 3D data sets Larger 3D contexts Learning 3D interactions Better 3D representations SUN RGB-D (Song et al, CVPR 2017) 10K labeled images ScanNet (Dai et al., CVPR 2017) 700 rooms, 36K labeled objects Synthetic RGB-D Image RGB-D Video Object ShapeNet Intel RealSense Redwood Room SUNCG SUN RGB-D ScanNet Multiroom SUNCG Matterport3D SUN3D Largest 3D datasets available today for indoor environments
80 Future Challenges More efficient 3D convolutions Matterport3D Larger 3D data sets Larger 3D contexts Learning 3D interactions Better 3D representations Whole houses Room annotations Object annotations A. Chang, A. Dai, T. Funkhouser, M. Halber, M. Niessner, M. Savva, S. Song, A. Zeng, and Y. Zhang, Matteport3D: Learning from RGB-D Data in Indoor Environments, 3DV 2017
81 Future Challenges More efficient 3D convolutions Larger 3D data sets Larger 3D contexts Learning 3D interactions Better 3D representations MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments Manolis Savva, Angel X. Chang, Alexey Dosovitskiy, Thomas Funkhouser, Vladlen Koltun, ArXiv 2017
82 Future Challenges More efficient 3D convolutions Larger 3D data sets Larger 3D contexts Learning 3D interactions Better 3D representations Geometry, materials, lighting, physical properties, support relationships, spatial relationships, affordances, activities, dynamics, etc. Ikea
83 Future Challenges More efficient 3D convolutions Larger 3D data sets Larger 3D contexts Learning 3D interactions Better 3D representations Geometry, materials, lighting, physical properties, support relationships, spatial relationships, affordances, activities, dynamics, etc. Wu et al. Visual Genome DeepContext
84 Acknowledgments Princeton: Angel X. Chang, Kyle Genova, Maciej Halber, Manolis Savva, Elena Sizikova, Shuran Song, Jianxiong Xiao, Fisher Yu, Yinda Zhang, Andy Zeng Collaborators: Angela Dai, Matt Fisher, Matthias Niessner, Jianxiong Xiao Data: SUN3D, NYU, Trimble, Planner5D, Matterport Funding: NSF, Google, Intel, Facebook, Adobe, Pixar Thank You!
3D Scene Understanding from RGB-D Images. Thomas Funkhouser
3D Scene Understanding from RGB-D Images Thomas Funkhouser Recent Ph.D. Student Current Postdocs Current Ph.D. Students Disclaimer: I am talking about the work of these people Shuran Song Yinda Zhang Andy
More informationVisual Computing TUM
Visual Computing Group @ TUM Visual Computing Group @ TUM BundleFusion Real-time 3D Reconstruction Scalable scene representation Global alignment and re-localization TOG 17 [Dai et al.]: BundleFusion Real-time
More informationECCV Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016
ECCV 2016 Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016 Fundamental Question What is a good vector representation of an object? Something that can be easily predicted from 2D
More informationFinding Surface Correspondences With Shape Analysis
Finding Surface Correspondences With Shape Analysis Sid Chaudhuri, Steve Diverdi, Maciej Halber, Vladimir Kim, Yaron Lipman, Tianqiang Liu, Wilmot Li, Niloy Mitra, Elena Sizikova, Thomas Funkhouser Motivation
More informationVolumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material
Volumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material Charles R. Qi Hao Su Matthias Nießner Angela Dai Mengyuan Yan Leonidas J. Guibas Stanford University 1. Details
More informationHolistic 3D Scene Parsing and Reconstruction from a Single RGB Image. Supplementary Material
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image Supplementary Material Siyuan Huang 1,2, Siyuan Qi 1,2, Yixin Zhu 1,2, Yinxue Xiao 1, Yuanlu Xu 1,2, and Song-Chun Zhu 1,2 1 University
More informationThree-Dimensional Object Detection and Layout Prediction using Clouds of Oriented Gradients
ThreeDimensional Object Detection and Layout Prediction using Clouds of Oriented Gradients Authors: Zhile Ren, Erik B. Sudderth Presented by: Shannon Kao, Max Wang October 19, 2016 Introduction Given an
More informationPerceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington
Perceiving the 3D World from Images and Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World 3 Understand
More informationSUN RGB-D: A RGB-D Scene Understanding Benchmark Suite Supplimentary Material
SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite Supplimentary Material Shuran Song Samuel P. Lichtenberg Jianxiong Xiao Princeton University http://rgbd.cs.princeton.edu. Segmetation Result wall
More information3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington
3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World
More informationarxiv: v1 [cs.cv] 18 Sep 2017
Matterport3D: Learning from RGB-D Data in Indoor Environments Angel Chang 1 Angela Dai 2 Thomas Funkhouser 1 Maciej Halber 1 Matthias Nießner 3 Manolis Savva 1 Shuran Song 1 Andy Zeng 1 Yinda Zhang 1 1
More informationDeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding Yinda Zhang Mingru Bai Pushmeet Kohli 2,5 Shahram Izadi 3,5 Jianxiong Xiao,4 Princeton University 2 DeepMind 3 PerceptiveIO
More informationDeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding Yinda Zhang Mingru Bai Pushmeet Kohli 2,5 Shahram Izadi 3,5 Jianxiong Xiao,4 Princeton University 2 DeepMind 3 PerceptiveIO
More informationarxiv: v1 [cs.cv] 28 Mar 2018
arxiv:1803.10409v1 [cs.cv] 28 Mar 2018 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation Angela Dai 1 Matthias Nießner 2 1 Stanford University 2 Technical University of Munich Fig.
More informationDeep Models for 3D Reconstruction
Deep Models for 3D Reconstruction Andreas Geiger Autonomous Vision Group, MPI for Intelligent Systems, Tübingen Computer Vision and Geometry Group, ETH Zürich October 12, 2017 Max Planck Institute for
More informationLearning to generate 3D shapes
Learning to generate 3D shapes Subhransu Maji College of Information and Computer Sciences University of Massachusetts, Amherst http://people.cs.umass.edu/smaji August 10, 2018 @ Caltech Creating 3D shapes
More informationarxiv: v2 [cs.cv] 28 Mar 2018
ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans Angela Dai 1,3,5 Daniel Ritchie 2 Martin Bokeloh 3 Scott Reed 4 Jürgen Sturm 3 Matthias Nießner 5 1 Stanford University
More informationObject Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning
Allan Zelener Dissertation Proposal December 12 th 2016 Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning Overview 1. Introduction to 3D Object Identification
More information3D Object Detection with Latent Support Surfaces
3D Object Detection with Latent Support Surfaces Zhile Ren Brown University ren@cs.brown.edu Erik B. Sudderth University of California, Irvine sudderth@uci.edu Abstract We develop a 3D object detection
More informationMULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION SAMBIT GHADAI XIAN LEE ADITYA BALU SOUMIK SARKAR ADARSH KRISHNAMURTHY
MULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION SAMBIT GHADAI XIAN LEE ADITYA BALU SOUMIK SARKAR ADARSH KRISHNAMURTHY Outline Object Recognition Multi-Level Volumetric Representations
More informationLocal-Level 3D Deep Learning. Andy Zeng
Local-Level 3D Deep Learning Andy Zeng 1 Matching 3D Data Image Credits: Song and Xiao, Tevs et al. 2 Matching 3D Data Reconstruction Image Credits: Song and Xiao, Tevs et al. 3 Matching 3D Data Reconstruction
More informationSupport surfaces prediction for indoor scene understanding
2013 IEEE International Conference on Computer Vision Support surfaces prediction for indoor scene understanding Anonymous ICCV submission Paper ID 1506 Abstract In this paper, we present an approach to
More informationarxiv: v1 [cs.cv] 12 Dec 2017
Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View Shuran Song 1 Andy Zeng 1 Angel X. Chang 1 Manolis Savva 1 Silvio Savarese 2 Thomas Funkhouser 1 1 Princeton University 2 Stanford
More informationObject Detection by 3D Aspectlets and Occlusion Reasoning
Object Detection by 3D Aspectlets and Occlusion Reasoning Yu Xiang University of Michigan Silvio Savarese Stanford University In the 4th International IEEE Workshop on 3D Representation and Recognition
More informationarxiv: v1 [cs.cv] 31 Mar 2018
arxiv:1804.00090v1 [cs.cv] 31 Mar 2018 FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans Chen Liu Jiaye Wu Washington University in St. Louis {chenliu,jiaye.wu}@wustl.edu Yasutaka
More information3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions
3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions Andy Zeng 1 Shuran Song 1 Matthias Nießner 2 Matthew Fisher 2,4 Jianxiong Xiao 3 Thomas Funkhouser 1 1 Princeton University 2 Stanford
More informationAn Empirical Study of Generative Adversarial Networks for Computer Vision Tasks
An Empirical Study of Generative Adversarial Networks for Computer Vision Tasks Report for Undergraduate Project - CS396A Vinayak Tantia (Roll No: 14805) Guide: Prof Gaurav Sharma CSE, IIT Kanpur, India
More informationFloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans
FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans Chen Liu 1, Jiaye Wu 1, and Yasutaka Furukawa 2 1 Washington University in St. Louis, St. Louis, USA {chenliu,jiaye.wu}@wustl.edu
More informationDeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material
DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material Yi Li 1, Gu Wang 1, Xiangyang Ji 1, Yu Xiang 2, and Dieter Fox 2 1 Tsinghua University, BNRist 2 University of Washington
More information3D Shape Segmentation with Projective Convolutional Networks
3D Shape Segmentation with Projective Convolutional Networks Evangelos Kalogerakis 1 Melinos Averkiou 2 Subhransu Maji 1 Siddhartha Chaudhuri 3 1 University of Massachusetts Amherst 2 University of Cyprus
More informationShape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis
Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis Angela Dai 1 Charles Ruizhongtai Qi 1 Matthias Nießner 1,2 1 Stanford University 2 Technical University of Munich Our method completes
More informationPhysically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks
Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks Yinda Zhang Shuran Song Ersin Yumer Manolis Savva Joon-Young Lee Hailin Jin Thomas Funkhouser Princeton University
More informationFrom 3D descriptors to monocular 6D pose: what have we learned?
ECCV Workshop on Recovering 6D Object Pose From 3D descriptors to monocular 6D pose: what have we learned? Federico Tombari CAMP - TUM Dynamic occlusion Low latency High accuracy, low jitter No expensive
More informationSeparating Objects and Clutter in Indoor Scenes
Separating Objects and Clutter in Indoor Scenes Salman H. Khan School of Computer Science & Software Engineering, The University of Western Australia Co-authors: Xuming He, Mohammed Bennamoun, Ferdous
More information3D model classification using convolutional neural network
3D model classification using convolutional neural network JunYoung Gwak Stanford jgwak@cs.stanford.edu Abstract Our goal is to classify 3D models directly using convolutional neural network. Most of existing
More informationEfficient Semantic Scene Completion Network with Spatial Group Convolution
Efficient Semantic Scene Completion Network with Spatial Group Convolution Jiahui Zhang 1, Hao Zhao 2, Anbang Yao 3, Yurong Chen 3, Li Zhang 2, and Hongen Liao 1 1 Department of Biomedical Engineering,
More informationPanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding
PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding Yinda Zhang Shuran Song Ping Tan Jianxiong Xiao Princeton University Simon Fraser University Alicia Clark PanoContext October
More information3D Deep Learning on Geometric Forms. Hao Su
3D Deep Learning on Geometric Forms Hao Su Many 3D representations are available Candidates: multi-view images depth map volumetric polygonal mesh point cloud primitive-based CAD models 3D representation
More informationAdversarial Semantic Scene Completion from a Single Depth Image
Adversarial Semantic Scene Completion from a Single Depth Image Yida Wang, David Joseph Tan, Nassir Navab, Federico Tombari Technische Universität München Boltzmannstraße 3, 85748 Garching bei München
More informationScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans
ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans Angela Dai 1,3,5 Daniel Ritchie 2 Martin Bokeloh 3 Scott Reed 4 Jürgen Sturm 3 Matthias Nießner 5 1 Stanford University
More information3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis
3D Shape Analysis with Multi-view Convolutional Networks Evangelos Kalogerakis 3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors
More informationLSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University
LSTM and its variants for visual recognition Xiaodan Liang xdliang328@gmail.com Sun Yat-sen University Outline Context Modelling with CNN LSTM and its Variants LSTM Architecture Variants Application in
More informationContexts and 3D Scenes
Contexts and 3D Scenes Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project presentation Nov 30 th 3:30 PM 4:45 PM Grading Three senior graders (30%)
More informationDeep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing
Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material Introduction In this supplementary material, Section 2 details the 3D annotation for CAD models and real
More informationCS468: 3D Deep Learning on Point Cloud Data. class label part label. Hao Su. image. May 10, 2017
CS468: 3D Deep Learning on Point Cloud Data class label part label Hao Su image. May 10, 2017 Agenda Point cloud generation Point cloud analysis CVPR 17, Point Set Generation Pipeline render CVPR 17, Point
More informationAN image is simply a grid of numbers to a machine.
1 Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey Muzammal Naseer, Salman H. Khan, Fatih Porikli Australian National University, Data61-CSIRO, Inception Institute of AI muzammal.naseer@anu.edu.au
More informationLearning Probabilistic Models from Collections of 3D Meshes
Learning Probabilistic Models from Collections of 3D Meshes Sid Chaudhuri, Steve Diverdi, Matthew Fisher, Pat Hanrahan, Vladimir Kim, Wilmot Li, Niloy Mitra, Daniel Ritchie, Manolis Savva, and Thomas Funkhouser
More informationIndoor Object Recognition of 3D Kinect Dataset with RNNs
Indoor Object Recognition of 3D Kinect Dataset with RNNs Thiraphat Charoensripongsa, Yue Chen, Brian Cheng 1. Introduction Recent work at Stanford in the area of scene understanding has involved using
More informationPointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas Big Data + Deep Representation Learning Robot Perception Augmented Reality
More informationAN image is simply a grid of numbers to a machine. Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000. Digital Object Identifier 10.1109/ACCESS.2017.DOI Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey MUZAMMAL
More informationSeeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su
Seeing the unseen Data-driven 3D Understanding from Single Images Hao Su Image world Shape world 3D perception from a single image Monocular vision a typical prey a typical predator Cited from https://en.wikipedia.org/wiki/binocular_vision
More informationDeep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images
Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images Shuran Song Jianxiong Xiao Princeton University http://dss.cs.princeton.edu Abstract We focus on the task of amodal 3D object detection
More informationDetecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA)
Detecting and Parsing of Visual Objects: Humans and Animals Alan Yuille (UCLA) Summary This talk describes recent work on detection and parsing visual objects. The methods represent objects in terms of
More information3D ShapeNets: A Deep Representation for Volumetric Shapes
3D ShapeNets: A Deep Representation for Volumetric Shapes Zhirong Wu Shuran Song Aditya Khosla Fisher Yu Linguang Zhang Xiaoou Tang Jianxiong Xiao Princeton University Chinese University of Hong Kong Massachusetts
More informationAdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation
AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation Introduction Supplementary material In the supplementary material, we present additional qualitative results of the proposed AdaDepth
More informationDeep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material
Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing Supplementary Material Chi Li, M. Zeeshan Zia 2, Quoc-Huy Tran 2, Xiang Yu 2, Gregory D. Hager, and Manmohan Chandraker 2 Johns
More information3D ShapeNets for 2.5D Object Recognition and Next-Best-View Prediction
3D ShapeNets for 2.5D Object Recognition and Next-Best-View Prediction Zhirong Wu Shuran Song Aditya Khosla Xiaoou Tang Jianxiong Xiao Princeton University MIT CUHK arxiv:1406.5670v2 [cs.cv] 1 Sep 2014
More information3D Deep Learning
3D Deep Learning Tutorial@CVPR2017 Hao Su (UCSD) Leonidas Guibas (Stanford) Michael Bronstein (Università della Svizzera Italiana) Evangelos Kalogerakis (UMass) Jimei Yang (Adobe Research) Charles Qi (Stanford)
More informationPredicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus
Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus Presented by: Rex Ying and Charles Qi Input: A Single RGB Image Estimate
More informationPOINT CLOUD DEEP LEARNING
POINT CLOUD DEEP LEARNING Innfarn Yoo, 3/29/28 / 57 Introduction AGENDA Previous Work Method Result Conclusion 2 / 57 INTRODUCTION 3 / 57 2D OBJECT CLASSIFICATION Deep Learning for 2D Object Classification
More informationLinking WordNet to 3D Shapes
Linking WordNet to 3D Shapes Angel X Chang, Rishi Mago, Pranav Krishna, Manolis Savva, and Christiane Fellbaum Department of Computer Science, Princeton University Princeton, New Jersey, USA angelx@cs.stanford.edu,
More informationLearning Semantic Environment Perception for Cognitive Robots
Learning Semantic Environment Perception for Cognitive Robots Sven Behnke University of Bonn, Germany Computer Science Institute VI Autonomous Intelligent Systems Some of Our Cognitive Robots Equipped
More informationWall Painting Reconstruction Using a Genetic Algorithm. Elena Sizikova, Thomas Funkhouser Princeton University
Wall Painting Reconstruction Using a Genetic Algorithm Elena Sizikova, Thomas Funkhouser Princeton University Motivation Wall paintings recovered from Akrotiri (Santorini), Greece Part of a reconstructed
More informationJOINT DETECTION AND SEGMENTATION WITH DEEP HIERARCHICAL NETWORKS. Zhao Chen Machine Learning Intern, NVIDIA
JOINT DETECTION AND SEGMENTATION WITH DEEP HIERARCHICAL NETWORKS Zhao Chen Machine Learning Intern, NVIDIA ABOUT ME 5th year PhD student in physics @ Stanford by day, deep learning computer vision scientist
More informationarxiv: v1 [cs.gr] 22 Sep 2017
Hierarchical Detail Enhancing Mesh-Based Shape Generation with 3D Generative Adversarial Network arxiv:1709.07581v1 [cs.gr] 22 Sep 2017 Chiyu Max Jiang University of California Berkeley, CA 94720 chiyu.jiang@berkeley.edu
More informationTeam the Amazon Robotics Challenge 1st place in stowing task
Grasping Team MIT-Princeton @ the Amazon Robotics Challenge 1st place in stowing task Andy Zeng Shuran Song Kuan-Ting Yu Elliott Donlon Francois Hogan Maria Bauza Daolin Ma Orion Taylor Melody Liu Eudald
More informationLearning Efficient Point Cloud Generation for Dense 3D Object Reconstruction
Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction Chen-Hsuan Lin Chen Kong Simon Lucey The Robotics Institute Carnegie Mellon University chlin@cmu.edu, {chenk,slucey}@cs.cmu.edu
More information3D ShapeNets: A Deep Representation for Volumetric Shape Modeling
3D ShapeNets: A Deep Representation for Volumetric Shape Modeling Zhirong Wu Shuran Song Aditya Khosla Fisher Yu Linguang Zhang Xiaoou Tang Jianxiong Xiao Princeton University Chinese University of Hong
More informationWhat are we trying to achieve? Why are we doing this? What do we learn from past history? What will we talk about today?
Introduction What are we trying to achieve? Why are we doing this? What do we learn from past history? What will we talk about today? What are we trying to achieve? Example from Scott Satkin 3D interpretation
More informationarxiv: v3 [cs.cv] 31 Oct 2017
OctNetFusion: Learning Depth Fusion from Data Gernot Riegler 1 Ali Osman Ulusoy 2 Horst Bischof 1 Andreas Geiger 2,3 1 Institute for Computer Graphics and Vision, Graz University of Technology 2 Autonomous
More informationSceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation?
SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? John McCormac, Ankur Handa, Stefan Leutenegger, Andrew J. Davison Dyson Robotics Laboratory at Imperial
More informationContexts and 3D Scenes
Contexts and 3D Scenes Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project presentation Dec 1 st 3:30 PM 4:45 PM Goodwin Hall Atrium Grading Three
More informationarxiv: v1 [cs.cv] 10 Aug 2018
Weakly supervised learning of indoor geometry by dual warping Pulak Purkait Ujwal Bonde Christopher Zach Toshiba Research Europe, Cambridge, U.K. {pulak.cv, ujwal.bonde, christopher.m.zach}@gmail.com arxiv:1808.03609v1
More informationGAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction
GAL: Geometric Adversarial Loss for Single-View 3D-Object Reconstruction Li Jiang 1, Shaoshuai Shi 1, Xiaojuan Qi 1, and Jiaya Jia 1,2 1 The Chinese University of Hong Kong 2 Tencent YouTu Lab {lijiang,
More informationarxiv: v1 [cs.cv] 7 Nov 2015
Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images Shuran Song Jianxiong Xiao Princeton University http://dss.cs.princeton.edu arxiv:1511.23v1 [cs.cv] 7 Nov 215 Abstract We focus on the
More informationLearning Adversarial 3D Model Generation with 2D Image Enhancer
The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18) Learning Adversarial 3D Model Generation with 2D Image Enhancer Jing Zhu, Jin Xie, Yi Fang NYU Multimedia and Visual Computing Lab
More informationColored Point Cloud Registration Revisited Supplementary Material
Colored Point Cloud Registration Revisited Supplementary Material Jaesik Park Qian-Yi Zhou Vladlen Koltun Intel Labs A. RGB-D Image Alignment Section introduced a joint photometric and geometric objective
More informationUrban Scene Segmentation, Recognition and Remodeling. Part III. Jinglu Wang 11/24/2016 ACCV 2016 TUTORIAL
Part III Jinglu Wang Urban Scene Segmentation, Recognition and Remodeling 102 Outline Introduction Related work Approaches Conclusion and future work o o - - ) 11/7/16 103 Introduction Motivation Motivation
More informationDeep Learning for Visual Manipulation and Synthesis
Deep Learning for Visual Manipulation and Synthesis Jun-Yan Zhu 朱俊彦 UC Berkeley 2017/01/11 @ VALSE What is visual manipulation? Image Editing Program input photo User Input result Desired output: stay
More informationEfficient Segmentation-Aided Text Detection For Intelligent Robots
Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related
More informationPaper Motivation. Fixed geometric structures of CNN models. CNNs are inherently limited to model geometric transformations
Paper Motivation Fixed geometric structures of CNN models CNNs are inherently limited to model geometric transformations Higher-level features combine lower-level features at fixed positions as a weighted
More informationarxiv: v1 [cs.cv] 25 Oct 2017
ZOU, LI, HOIEM: COMPLETE 3D SCENE PARSING FROM SINGLE RGBD IMAGE 1 arxiv:1710.09490v1 [cs.cv] 25 Oct 2017 Complete 3D Scene Parsing from Single RGBD Image Chuhang Zou http://web.engr.illinois.edu/~czou4/
More informationLecture 7: Semantic Segmentation
Semantic Segmentation CSED703R: Deep Learning for Visual Recognition (207F) Segmenting images based on its semantic notion Lecture 7: Semantic Segmentation Bohyung Han Computer Vision Lab. bhhanpostech.ac.kr
More informationCS 395T Numerical Optimization for Graphics and AI (3D Vision) Qixing Huang August 29 th 2018
CS 395T Numerical Optimization for Graphics and AI (3D Vision) Qixing Huang August 29 th 2018 3D Vision Understanding geometric relations between images and the 3D world between images Obtaining 3D information
More informationSingle Image 3D Interpreter Network
Single Image 3D Interpreter Network Jiajun Wu* Josh Tenenbaum Tianfan Xue* Joseph Lim Antonio Torralba ECCV 2016 Yuandong Tian Bill Freeman (* equal contributions) What do we see from these images? Motivation
More informationarxiv: v1 [cs.cv] 13 Feb 2018
Recurrent Slice Networks for 3D Segmentation on Point Clouds Qiangui Huang Weiyue Wang Ulrich Neumann University of Southern California Los Angeles, California {qianguih,weiyuewa,uneumann}@uscedu arxiv:180204402v1
More information3D Object Representations. COS 526, Fall 2016 Princeton University
3D Object Representations COS 526, Fall 2016 Princeton University 3D Object Representations How do we... Represent 3D objects in a computer? Acquire computer representations of 3D objects? Manipulate computer
More informationPose estimation using a variety of techniques
Pose estimation using a variety of techniques Keegan Go Stanford University keegango@stanford.edu Abstract Vision is an integral part robotic systems a component that is needed for robots to interact robustly
More informationarxiv: v1 [cs.cv] 21 Jun 2017
Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction arxiv:1706.07036v1 [cs.cv] 21 Jun 2017 Chen-Hsuan Lin, Chen Kong, Simon Lucey The Robotics Institute Carnegie Mellon University
More informationImagining the Unseen: Stability-based Cuboid Arrangements for Scene Understanding
: Stability-based Cuboid Arrangements for Scene Understanding Tianjia Shao* Aron Monszpart Youyi Zheng Bongjin Koo Weiwei Xu Kun Zhou * Niloy J. Mitra * Background A fundamental problem for single view
More informationCS395T paper review. Indoor Segmentation and Support Inference from RGBD Images. Chao Jia Sep
CS395T paper review Indoor Segmentation and Support Inference from RGBD Images Chao Jia Sep 28 2012 Introduction What do we want -- Indoor scene parsing Segmentation and labeling Support relationships
More informationarxiv: v1 [cs.cv] 23 Nov 2017
SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation arxiv:1711.08588v1 [cs.cv] 23 Nov 2017 Weiyue Wang 1 Ronald Yu 2 Qiangui Huang 1 Ulrich Neumann 1 1 University of Southern
More informationDepth from Stereo. Dominic Cheng February 7, 2018
Depth from Stereo Dominic Cheng February 7, 2018 Agenda 1. Introduction to stereo 2. Efficient Deep Learning for Stereo Matching (W. Luo, A. Schwing, and R. Urtasun. In CVPR 2016.) 3. Cascade Residual
More informationarxiv: v1 [cs.cv] 30 Sep 2018
3D-PSRNet: Part Segmented 3D Point Cloud Reconstruction From a Single Image Priyanka Mandikal, Navaneet K L, and R. Venkatesh Babu arxiv:1810.00461v1 [cs.cv] 30 Sep 2018 Indian Institute of Science, Bangalore,
More informationProceedings of the International MultiConference of Engineers and Computer Scientists 2018 Vol I IMECS 2018, March 14-16, 2018, Hong Kong
, March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong TABLE I CLASSIFICATION ACCURACY OF DIFFERENT PRE-TRAINED MODELS ON THE TEST DATA
More informationMulti-view 3D Models from Single Images with a Convolutional Network
Multi-view 3D Models from Single Images with a Convolutional Network Maxim Tatarchenko University of Freiburg Skoltech - 2nd Christmas Colloquium on Computer Vision Humans have prior knowledge about 3D
More information08 An Introduction to Dense Continuous Robotic Mapping
NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy
More informationLEARNING TO GENERATE CHAIRS WITH CONVOLUTIONAL NEURAL NETWORKS
LEARNING TO GENERATE CHAIRS WITH CONVOLUTIONAL NEURAL NETWORKS Alexey Dosovitskiy, Jost Tobias Springenberg and Thomas Brox University of Freiburg Presented by: Shreyansh Daftry Visual Learning and Recognition
More informationObject Detection in 3D Scenes Using CNNs in Multi-view Images
Object Detection in 3D Scenes Using CNNs in Multi-view Images Ruizhongtai (Charles) Qi Department of Electrical Engineering Stanford University rqi@stanford.edu Abstract Semantic understanding in 3D scenes
More informationPaired 3D Model Generation with Conditional Generative Adversarial Networks
Accepted to 3D Reconstruction in the Wild Workshop European Conference on Computer Vision (ECCV) 2018 Paired 3D Model Generation with Conditional Generative Adversarial Networks Cihan Öngün Alptekin Temizel
More information