MULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION
Sambit Ghadai, Xian Lee, Aditya Balu, Soumik Sarkar, Adarsh Krishnamurthy

Outline
- Object Recognition
- Multi-Level Volumetric Representations for CAD Models
- Object Recognition using Dense Voxels
- Object Recognition using Multi-level Voxels
March 26, 2018

Motivation
- Object recognition of 3D models from volumetric data
- Learn volumetric features from CAD models
  - Local features
  - 3D spatial features
- Memory-efficient way to learn from volumetric data

Boundary Representation (B-Rep) CAD Models
- De facto representation for CAD models
- Can be easily tessellated into triangles for rendering
- Difficult to interpret volumetric information
  - Size of a feature
  - Internal location of a feature

Voxel Representation
- Binary occupancy information
- Can be augmented with extra geometry information
- Can be used as direct input to a convolutional neural network
- A dense voxel grid at high resolution has high memory and computation requirements

Why Do We Need Multi-Resolution?
- As the resolution increases, the fraction of occupied voxels decreases
- A dense grid still needs to store the empty voxels
- A hierarchical (multi-level) representation is useful to capture key features at a finer resolution
Figure: Level 1 voxels and Level 2 voxels [2]
[2] http://openaccess.thecvf.com/content_cvpr_2017/poster/1319_poster.pdf
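
The trend in the first bullet can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a sphere as a stand-in shape (not a ModelNet10 model), and it defines a boundary voxel as an occupied voxel with at least one empty 6-neighbor. The function names are hypothetical.

```python
import numpy as np

def sphere_occupancy(n):
    """Binary occupancy of a sphere in an n x n x n grid (voxel centers tested)."""
    c = (np.arange(n) + 0.5) / n - 0.5            # voxel-center coordinates in (-0.5, 0.5)
    x, y, z = np.meshgrid(c, c, c, indexing="ij")
    return x**2 + y**2 + z**2 <= 0.45**2          # assumed radius: 0.45 of the grid width

def boundary_mask(occ):
    """Occupied voxels that have at least one empty 6-neighbor."""
    padded = np.pad(occ, 1, constant_values=False)
    interior = np.ones_like(occ)                   # bool array, all True
    for axis in range(3):
        for shift in (-1, 1):
            neighbor = np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1]
            interior &= neighbor
    return occ & ~interior

def boundary_fraction(n):
    """Fraction of all n^3 voxels that are boundary voxels."""
    return boundary_mask(sphere_occupancy(n)).sum() / n**3

for n in (8, 16, 32):
    print(n, boundary_fraction(n))
```

For a surface-like set of voxels the count grows roughly with n^2 while the grid grows with n^3, so the printed fraction shrinks as n increases.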

ModelNet10 Dataset
- 3D CAD models for objects
- 10 categories of objects: Bathtub, Chair, Dresser, Night Stand, Table, Bed, Desk, Monitor, Sofa, Toilet
- Source: Princeton ModelNet [1]
[1] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang and J. Xiao, "3D ShapeNets: A Deep Representation for Volumetric Shapes," Proceedings of the 28th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)

Volumetric Voxelization of ModelNet10
- Overlay a regular voxel grid on the object
- Test point membership of the voxel bounding-box center points and classify each as in or out
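
The two steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `voxelize` and the `inside` predicate are hypothetical names, and the predicate here is an analytic box. A real CAD model would need a point-in-solid test against its B-Rep or triangle mesh.

```python
import numpy as np

def voxelize(inside, lo, hi, n):
    """Overlay an n x n x n grid on the box [lo, hi] and test each voxel center.

    inside: function mapping an (N, 3) array of points to an (N,) bool array.
    """
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    size = (hi - lo) / n                                # voxel edge lengths
    idx = np.indices((n, n, n)).reshape(3, -1).T        # all (i, j, k) index triples
    centers = lo + (idx + 0.5) * size                   # voxel center points
    return inside(centers).reshape(n, n, n)             # True = in, False = out

# Hypothetical stand-in shape: an axis-aligned box of half-width 0.25.
box = lambda p: np.all(np.abs(p) <= 0.25, axis=1)
grid = voxelize(box, lo=(-0.5, -0.5, -0.5), hi=(0.5, 0.5, 0.5), n=4)
```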

Identifying Boundary Voxels
- Boundary voxels need to be identified in order to generate the fine-level voxel grid
- Identify the voxels that contain vertices
- Use the separating-axis test for all other voxels within the bound
Figure panels: Classify Vertices; Triangle-Box Intersection
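
A separating-axis test between one triangle and one axis-aligned voxel can be sketched as below. The slides do not give the exact formulation, so this assumes the standard 13-axis variant (3 box face normals, the triangle normal, and 9 edge cross products); the function name is hypothetical.

```python
import numpy as np

def triangle_box_overlap(tri, center, half):
    """Return True if a triangle overlaps an axis-aligned box.

    tri: (3, 3) array of vertices; center, half: box center and half-extents.
    """
    v = np.asarray(tri, float) - np.asarray(center, float)   # move the box to the origin
    half = np.asarray(half, float)
    edges = np.roll(v, -1, axis=0) - v                       # v1-v0, v2-v1, v0-v2
    box_axes = np.eye(3)
    axes = [*box_axes, np.cross(edges[0], edges[1])]         # box normals + triangle normal
    axes += [np.cross(a, e) for a in box_axes for e in edges]
    for axis in axes:
        if not np.any(axis):                                 # degenerate axis, nothing to test
            continue
        proj = v @ axis                                      # triangle projected on the axis
        r = half @ np.abs(axis)                              # box projection radius
        if proj.min() > r or proj.max() < -r:
            return False                                     # found a separating axis
    return True                                              # no separating axis: overlap
```

A voxel for which this returns True for any triangle of the model would be marked as a boundary voxel.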

Fine Level Voxelization (Level 2)
- Same method as the coarse level
- Clip the model using the axis-aligned bounding box (AABB) of the boundary voxels
- Perform a similar triangle-box intersection test to identify Level 2 boundary voxels
- All the information is stored in a flat data structure
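
The slides do not describe the layout of the flat data structure, so the following is one possible arrangement and purely an assumption: a flat index array over the coarse voxels (with -1 for voxels that have no fine block) plus a flat array holding one row of fine voxels per boundary voxel. The name `pack_multilevel` is hypothetical.

```python
import numpy as np

def pack_multilevel(boundary, dense, m):
    """Pack fine voxels of boundary coarse voxels into flat arrays.

    boundary: (n, n, n) bool mask of coarse boundary voxels.
    dense:    (n*m, n*m, n*m) bool fine-resolution occupancy.
    Returns (index, fine): index has n^3 entries (-1 = no fine block),
    fine has one row of m^3 fine voxels per boundary voxel.
    """
    n = boundary.shape[0]
    # View the dense grid as n^3 blocks of m^3 fine voxels each.
    blocks = dense.reshape(n, m, n, m, n, m).transpose(0, 2, 4, 1, 3, 5)
    flat_boundary = boundary.ravel()
    index = np.full(n**3, -1, dtype=np.int32)
    index[flat_boundary] = np.arange(flat_boundary.sum())
    fine = blocks.reshape(n**3, m**3)[flat_boundary]
    return index, fine
```

With this layout only the boundary voxels pay for fine-level storage, which is the memory saving the multi-level representation is after.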

3D CNN on Dense Voxel Grid
- Dense voxel grid as the input model
- 3D-CNN with two convolutional layers and a max-pooling layer for feature extraction
- Fully connected dense layers operate on the flattened features to produce the 10-class classification
Figure: Dense Voxel Grid -> Convolution Layer 1 -> Convolution Layer 2 -> Pooling Layer -> Dense Layer 1 -> Dense Layer 2 -> 10 Classes
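
The layer sequence in the figure can be traced with a small NumPy forward pass. This is a sketch only: the kernel size (3 x 3 x 3, no padding), the ReLU activations, the pooling window (2), and the random weights are all assumptions not stated in the slides, and the filter counts in the usage lines are kept tiny so the example runs quickly.

```python
import numpy as np

def conv3d_relu(x, w, b):
    """Valid 3D convolution + ReLU. x: (C_in, D, H, W); w: (C_out, C_in, k, k, k)."""
    k = w.shape[-1]
    windows = np.lib.stride_tricks.sliding_window_view(x, (k, k, k), axis=(1, 2, 3))
    out = np.einsum("cdhwijk,ocijk->odhw", windows, w) + b[:, None, None, None]
    return np.maximum(out, 0.0)

def max_pool3d(x, p=2):
    """Non-overlapping p x p x p max pooling on a (C, D, H, W) array."""
    c, d, h, w = x.shape
    x = x[:, : d - d % p, : h - h % p, : w - w % p]
    return x.reshape(c, d // p, p, h // p, p, w // p, p).max(axis=(2, 4, 6))

def init_params(rng, n, c1, c2, hidden, classes=10, k=3):
    """Random weights for an n^3 input (assumed initialization, for shape checking only)."""
    side = (n - 2 * (k - 1)) // 2                  # grid side after two convs and pooling
    rand = lambda *shape: rng.normal(0.0, 0.1, shape)
    return {
        "conv1": (rand(c1, 1, k, k, k), np.zeros(c1)),
        "conv2": (rand(c2, c1, k, k, k), np.zeros(c2)),
        "fc1": (rand(hidden, c2 * side**3), np.zeros(hidden)),
        "fc2": (rand(classes, hidden), np.zeros(classes)),
    }

def forward(x, params):
    """Conv -> Conv -> Pool -> Dense -> Dense -> softmax class probabilities."""
    h = conv3d_relu(x, *params["conv1"])
    h = conv3d_relu(h, *params["conv2"])
    h = max_pool3d(h).ravel()                      # flatten the pooled features
    h = np.maximum(params["fc1"][0] @ h + params["fc1"][1], 0.0)
    logits = params["fc2"][0] @ h + params["fc2"][1]
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(0)
params = init_params(rng, n=8, c1=4, c2=8, hidden=16)
voxels = (rng.random((1, 8, 8, 8)) > 0.5).astype(float)
probs = forward(voxels, params)                    # 10 class probabilities
```

In practice a deep learning framework would replace these hand-written layers; the point here is only the order of operations and the resulting shapes.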

Data Augmentation
- ModelNet10: 3991 training and 908 testing 3D models
- Dataset size is insufficient to train the parameters of the 3D-CNN
- 6 rigid body transformations on the voxel grid for data augmentation
  - Rotation (x, y, z axis)
  - Mirroring (x, y, z axis)
- 7x the original data size used for training
Figure: Original model and the model rotated 90° about the z axis (Rot-z)
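
The six transformations plus the original give the 7x training set. A minimal sketch, assuming each rotation is a single 90° turn (the figure shows a 90° Rot-z; the angles for the other axes are an assumption) and that grid axes 0, 1, 2 correspond to x, y, z:

```python
import numpy as np

def augment(grid):
    """Return the original grid, 3 rotations, and 3 mirrorings (7 grids in total)."""
    # A rotation about x turns the (y, z) plane, about y the (x, z) plane, about z the (x, y) plane.
    rotations = [np.rot90(grid, k=1, axes=ax) for ax in ((1, 2), (0, 2), (0, 1))]
    mirrors = [np.flip(grid, axis=a) for a in range(3)]
    return [grid, *rotations, *mirrors]
```

Because these operations only permute voxels, every augmented grid keeps the same occupied-voxel count as the original.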

Need to Learn from Multi-Resolution Data
- Learn efficiently from complex and intricate features of a CAD model
- Improve performance with fewer computations
- Amenable to model interpretability by learning finer features at specific spatial locations
- Low memory usage

Data Augmentation
- Similar to data augmentation at coarse level voxels
- Rigid body transformation is first applied on the coarse voxels
- The transformation is then applied on the finer voxels inside each coarse voxel
Figure: 90° rotation about the z axis (Rot-z) at Level 1 and at Level 2
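
The two-step transformation can be sketched and checked against a rotation of the equivalent dense grid. For simplicity this assumes a full array of shape (n, n, n, m, m, m), that is, a fine block for every coarse voxel, whereas the representation described earlier keeps fine voxels only for boundary voxels. The function names are hypothetical.

```python
import numpy as np

def to_dense(ml):
    """Expand an (n, n, n, m, m, m) multi-level array into an (n*m)^3 dense grid."""
    n, m = ml.shape[0], ml.shape[3]
    return ml.transpose(0, 3, 1, 4, 2, 5).reshape(n * m, n * m, n * m)

def rot90_z_multilevel(ml):
    """Rotate 90 degrees about z: first the coarse voxels, then the fine voxels inside each."""
    coarse_rotated = np.rot90(ml, k=1, axes=(0, 1))      # step 1: move the coarse voxels
    return np.rot90(coarse_rotated, k=1, axes=(3, 4))    # step 2: rotate each fine block

rng = np.random.default_rng(0)
ml = rng.random((2, 2, 2, 3, 3, 3)) > 0.5
same = np.array_equal(to_dense(rot90_z_multilevel(ml)),
                      np.rot90(to_dense(ml), k=1, axes=(0, 1)))
```

Applying only the coarse step would move each block to the right place but leave its contents in the old orientation, which is why both steps are needed.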

Multi-Level 3D CNN
Forward pass:
- Level-2 forward: boundary voxels (4 x 4 x 4 fine voxel grid) -> convolution layers -> pooling -> dense -> sigmoid output
- Linking Level-2 with Level-1: coarse level fusion
- Level-1 forward: 8 x 8 x 8 voxel grid -> Convolution Layer 1 -> Convolution Layer 2 -> Pooling Layer -> Dense Layer 1 -> Dense Layer 2 -> classification into 10 classes
Backward pass:
- Compute loss
- Compute Level-1 gradients
- Extract voxel gradients based on the forward pass
- Compute Level-2 gradients
- Update weights

Results
Multi-level training parameters:
- Batch size: 64 3D models of size 8 x 8 x 8 coarse and 4 x 4 x 4 fine voxels
- Optimizer: SGD with a learning rate of 0.001
- Loss function: softmax cross-entropy
Network (Level-1):
- Convolution: 64 filters
- Convolution: 128 filters
- Max Pooling
- Dense Layer: 256 units
Network (Level-2):
- Convolution: 8 filters
- Convolution: 16 filters
- Max Pooling
- Dense Layer: 32 units
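
The loss and optimizer named above can be written out in a few lines of NumPy. This is a generic sketch of softmax cross-entropy and a plain SGD update with the stated batch size (64), class count (10), and learning rate (0.001); the function names are hypothetical and it is not the training code behind the results.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy and its gradient with respect to the logits.

    logits: (B, C) float array; labels: (B,) integer class ids.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)          # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    b = logits.shape[0]
    loss = -log_probs[np.arange(b), labels].mean()
    grad = np.exp(log_probs)                                      # softmax probabilities
    grad[np.arange(b), labels] -= 1.0                             # probabilities minus one-hot
    return loss, grad / b

def sgd_step(weights, grad, lr=0.001):
    """Plain stochastic gradient descent update."""
    return weights - lr * grad

logits = np.zeros((64, 10))                 # an untrained network: uniform predictions
labels = np.zeros(64, dtype=int)
loss, grad = softmax_cross_entropy(logits, labels)   # loss equals ln(10) here
```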

Results (Contd.)
Dense-level training parameters:
- Batch size: 64 3D models of size 32 x 32 x 32 voxels
- Optimizer: SGD with a learning rate of 0.001
- Loss function: softmax cross-entropy
Network A:
- Convolution: 64 filters
- Max Pooling
- Convolution: 128 filters
- Max Pooling
- Dense Layer: 256 units
Network B:
- Convolution: 64 filters
- Convolution: 128 filters
- Max Pooling
- Dense Layer: 256 units

Results (Contd.)
Figure: Accuracy of (1) Coarse (8 x 8 x 8), (2) Multi-Level (8 x 8 x 8 and 4 x 4 x 4), and (3) Dense (32 x 32 x 32) voxel representations

Results (Contd.)
Figure: Memory usage in GPU (MB) of multi-resolution voxel training and equivalent single-resolution training, comparing Multi-Level, Dense with MaxPool, and Dense without MaxPool
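
Part of the gap in the figure comes from the input representation itself, which can be estimated by counting stored voxels. The sketch below covers only input voxels, not the activations and weights that dominate GPU memory during training, and the boundary-voxel count passed in is a hypothetical value rather than a number from the slides.

```python
def dense_voxels(n, m):
    """Voxels in the single-resolution grid equivalent to n^3 coarse x m^3 fine."""
    return (n * m) ** 3

def multilevel_voxels(n, m, num_boundary):
    """Coarse voxels plus one m^3 fine block per boundary voxel."""
    return n**3 + num_boundary * m**3

print(dense_voxels(8, 4))             # 32 x 32 x 32 equivalent grid
print(multilevel_voxels(8, 4, 100))   # assuming 100 of the 512 coarse voxels are boundary voxels
```

With 8 x 8 x 8 coarse and 4 x 4 x 4 fine voxels, the multi-level input stores fewer voxels than the 32 x 32 x 32 grid whenever fewer than 504 of the 512 coarse voxels are boundary voxels.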

Conclusions
- We have developed methods to represent CAD models using a multi-resolution voxel grid
- We have developed a multi-level 3D-CNN for object recognition using the multi-resolution voxel grid
- Memory usage of the multi-level 3D-CNN is much lower than that of the dense voxel 3D-CNN, without compromising accuracy

Future Work
- Efficient training algorithms for the Level-2 3D-CNN
- Explore the effect of different resolutions on training the 3D-CNN
- Build model interpretability for hierarchical learning
- Experiment with the algorithm on different datasets

Acknowledgements
- AI-based Design and Manufacturability Lab (ADAM Lab): Xian Lee, Aditya Balu, Gavin Young
- Funding sources:
  - National Science Foundation CMMI:1644441, CM: Machine-Learning Driven Decision Support in Design for Manufacturability
  - NVIDIA Titan Xp GPU for Academic Research

Thank You! Questions?