PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

Similar documents
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas

Supplementary A. Overview. C. Time and Space Complexity. B. Shape Retrieval. D. Permutation Invariant SOM. B.1. Dataset

CS468: 3D Deep Learning on Point Cloud Data. class label part label. Hao Su. image. May 10, 2017

3D Deep Learning on Geometric Forms. Hao Su

POINT CLOUD DEEP LEARNING

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

arxiv: v1 [cs.cv] 28 Nov 2018

Volumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material

AI Ukraine Point cloud labeling using machine learning

Keras: Handwritten Digit Recognition using MNIST Dataset

Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network

Deep Learning on Point Sets for 3D Classification and Segmentation

ECCV Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016

Dynamic Routing Between Capsules

arxiv: v1 [cs.cv] 6 Nov 2018 Abstract

Cross-domain Deep Encoding for 3D Voxels and 2D Images

An Empirical Study of Generative Adversarial Networks for Computer Vision Tasks

Using Faster-RCNN to Improve Shape Detection in LIDAR

Pose estimation using a variety of techniques

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Object Recognition II

Deconvolution Networks

arxiv: v1 [cs.cv] 27 Nov 2018 Abstract

Seeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su

Structured Prediction using Convolutional Neural Networks

Keras: Handwritten Digit Recognition using MNIST Dataset

arxiv: v1 [cs.cv] 17 Apr 2019

MULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION SAMBIT GHADAI XIAN LEE ADITYA BALU SOUMIK SARKAR ADARSH KRISHNAMURTHY

Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York

3D Convolutional Neural Networks for Landing Zone Detection from LiDAR

Machine Learning 13. week

Deep Learning with Tensorflow AlexNet

arxiv: v4 [cs.cv] 27 Mar 2018

MoonRiver: Deep Neural Network in C++

Deep Models for 3D Reconstruction

arxiv: v1 [cs.cv] 2 Dec 2018

Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning

3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington

6. Convolutional Neural Networks

Polytechnic University of Tirana

Deep Learning for Computer Vision II

arxiv: v1 [cs.cv] 13 Feb 2018

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

3D Deep Learning

Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018

CS 231A Computer Vision (Fall 2011) Problem Set 4

DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material

P-CNN: Pose-based CNN Features for Action Recognition. Iman Rezazadeh

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus

Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia

Learning to generate 3D shapes

arxiv: v1 [cs.cv] 4 Jun 2018

3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis

Artificial Neural Networks. Introduction to Computational Neuroscience Ardi Tampuu

Classification of objects from Video Data (Group 30)

Efficient Algorithms may not be those we think

Spatial Localization and Detection. Lecture 8-1

Is Bigger CNN Better? Samer Hijazi on behalf of IPG CTO Group Embedded Neural Networks Summit (enns2016) San Jose Feb. 9th

Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon

Introduction to Neural Networks

Supervised Hashing for Image Retrieval via Image Representation Learning

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington

Deep Convolutional Neural Network using Triplet of Faces, Deep Ensemble, and Scorelevel Fusion for Face Recognition

3D model classification using convolutional neural network

arxiv: v1 [cs.cv] 20 Dec 2016

GraphNet: Recommendation system based on language and network structure

Recurrent Convolutional Neural Networks for Scene Labeling

CENG 783. Special topics in. Deep Learning. AlchemyAPI. Week 11. Sinan Kalkan

Neural Network Optimization and Tuning / Spring 2018 / Recitation 3

Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal

Know What Your Neighbors Do: 3D Semantic Segmentation of Point Clouds

CS 340 Lec. 4: K-Nearest Neighbors

Experimental Evaluation of Point Cloud Classification using the PointNet Neural Network

Human Pose Estimation with Deep Learning. Wei Yang

Su et al. Shape Descriptors - III

From 3D descriptors to monocular 6D pose: what have we learned?

Image Classification pipeline. Lecture 2-1

The Kinect Sensor. Luís Carriço FCUL 2014/15

AUTOMATIC 3D HUMAN ACTION RECOGNITION Ajmal Mian Associate Professor Computer Science & Software Engineering

Fully-Convolutional Siamese Networks for Object Tracking

Computing the Stereo Matching Cost with CNN

Shape Context Matching For Efficient OCR

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

Contextual Dropout. Sam Fok. Abstract. 1. Introduction. 2. Background and Related Work

Computer Vision Lecture 16

Tri-modal Human Body Segmentation

SO-Net: Self-Organizing Network for Point Cloud Analysis

Object Detection Based on Deep Learning

Learning to Sample. Shai Avidan Tel-Aviv University. Oren Dovrat Tel-Aviv University. Itai Lang Tel-Aviv University. Abstract. 1.

Robust PDF Table Locator

Report: Privacy-Preserving Classification on Deep Neural Network

Correspondence. CS 468 Geometry Processing Algorithms. Maks Ovsjanikov

The Detection of Faces in Color Images: EE368 Project Report

Image Classification pipeline. Lecture 2-1

Clustering and Unsupervised Anomaly Detection with l 2 Normalized Deep Auto-Encoder Representations

Filtering and mapping systems for underwater 3D imaging sonar

CHAPTER 8 COMPOUND CHARACTER RECOGNITION USING VARIOUS MODELS

CS 523: Multimedia Systems

Learning from 3D Data

Transcription:

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space Sikai Zhong February 14, 2018 COMPUTER SCIENCE

Table of contents 1. PointNet 2. PointNet++ 3. Experiments 1

PointNet

Property of Point Cloud It is unordered,networks must be invariant to permutations; Points are not isolated and must have neighbor information; It must has invariance under transformations because all points are part of a rigid object; 2

Symmetry Function for Unordered input A symmetry function takes n vectors and outputs a result that is invariant to the order of a set; + and * are operator candidates for symmetry function; 3

Pipeline of PointNet Figure 1: The classification network takes n points as input, applies input and feature transformations, and then aggregates point features by max pooling. The output is classification scores for k classes. The segmentation network is an extension to the classification net. It concatenates global and local features and outputs per point scores. mlp stands for multi-layer perceptron, numbers in bracket are layer sizes. Batch norm is used for all layers with ReLU. Dropout layers are used for the last mlp in classification net.[1] 4

Pipeline of PointNet f (x 1, x 2,..., x n ) = γ(max i=1,...,n {h(x i )}); (1) Where γ and h are multi-layer perception(mlp) networks. 5

PointNet PointNet can not capature local structure induced by the metric space; Local structure is very important to the success of convolutional architectures; 6

PointNet++

General Ideal of PointNet++ 1. Partition the set of points into overlapping locally regions using farthest point sampling (FPS) algorithm; 2. Capture the local features using PointNet; 3. Group the features into bigger unit and calculate the higher features again; 7

Sampling Layer 1. Given input points {x 1, x 2,...x n }; 2. Using FPS to get a subset {x i1, x i2,...x im } such that x ij is the most distant point to the result of the points as centroids; 8

Grouping Layer 1. Given input points of size of N (d + C) and the coordinates of a set of centroids of size N d (d is the dimension of points, C is the dimension of features ) 2. The outputs are groups of points of size N K (d + C); 3. K varies from different levels; 9

Grouping Layer Figure 2: (a) Multi-scale grouping (MSG) (); (b) Multiresolution grouping (MRG)(features of a region at some level Li is a concatenation of two vectors. 10

Grouping Layer MRG:Features at different scales are concatenated to form a multi-scale features,computationally expensive; MSG: One vector (left in figure. 10) is obtained by summarizing the features at each subregion from the lower level L i using the set abstraction level. The other vector (right) is the feature that is obtained by directly processing all raw points in the local region using a single PointNet). 11

PointNet Layer 1. the input are N local regions of points with data size N K (d + C); K is the number of points in one region; 2. Output data size is N (d + c ) 12

Pipeline of PointNet++ Figure 3: Illustration of our hierarchical feature learning architecture and its application for set segmentation and classification using points in 2D Euclidean space as an example. Single scale point grouping is visualized here[2] 13

Experiments

Datasets MNIST : Images of handwritten digits with 60k training and 10k testing samples. ModelNet40: CAD models of 40 categories (mostly man-made). We use the official split with 9,843 shapes for training and 2,468 for testing. SHREC15 : 1200 shapes from 50 categories. Each category contains 24 shapes which are mostly organic ones with various poses such as horses, cats, etc. We use five fold cross validation to acquire classification accuracy on this dataset. ScanNet : 1513 scanned and reconstructed indoor scenes. We use 1201 scenes for training, 312 scenes for test; 14

Classification Figure 4: Left: Point cloud with random point dropout. Right: Curve showing advantage of our density adaptive strategy in dealing with non-uniform density. DP means random input dropout during training; otherwise training is on uniformly dense point 15

Point Set Segmentation for Semantic Scene Labeling Figure 5: Scannet labeling accuracy 16

References i C. R. Qi, H. Su, K. Mo, and L. J. Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. Proc. Computer Vision and Pattern Recognition (CVPR), IEEE, 1(2):4, 2017. C. R. Qi, L. Yi, H. Su, and L. J. Guibas. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In Advances in Neural Information Processing Systems, pages 5105 5114, 2017.