PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas
|
|
- Johnathan Scott
- 6 years ago
- Views:
Transcription
1 PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas
2 Big Data + Deep Representation Learning Robot Perception Augmented Reality source: Scott J Grunewald source: Google Tango Emerging 3D Applications Shape Design source: solidsolutions
3 Big Data + Deep Representation Learning Robot Perception Augmented Reality source: Scott J Grunewald Shape Design source: Google Tango Need for 3D Deep Learning! source: solidsolutions
4 3D Representations Point Cloud Mesh Volumetric Projected View RGB(D)
5 3D Representation: Point Cloud Point cloud is close to raw sensor data LiDAR Depth Sensor Point Cloud
6 3D Representation: Point Cloud Point cloud is close to raw sensor data Point cloud is canonical Mesh LiDAR Volumetric Depth Sensor Point Cloud Depth Map
7 Previous Works Most existing point cloud features are handcrafted towards specific tasks Source:
8 Previous Works Point cloud is converted to other representations before it s fed to a deep neural network Conversion Voxelization Projection/Rendering Feature extraction Deep Net 3D CNN 2D CNN Fully Connected
9 Research Question: Can we achieve effective feature learning directly on point clouds?
10 Our Work: PointNet End-to-end learning for scattered, unordered point data PointNet
11 Our Work: PointNet End-to-end learning for scattered, unordered point data Unified framework for various tasks Object Classification PointNet Object Part Segmentation Semantic Scene Parsing...
12 Our Work: PointNet End-to-end learning for scattered, unordered point data Unified framework for various tasks
13 Challenges Unordered point set as input Model needs to be invariant to N! permutations. Invariance under geometric transformations Point cloud rotations should not alter classification results.
14 Challenges Unordered point set as input Model needs to be invariant to N! permutations. Invariance under geometric transformations Point cloud rotations should not alter classification results.
15 Unordered Input Point cloud: N orderless points, each represented by a D dim vector D N
16 Unordered Input Point cloud: N orderless points, each represented by a D dim vector D D N represents the same set as N
17 Unordered Input Point cloud: N orderless points, each represented by a D dim vector D D N represents the same set as N Model needs to be invariant to N! permutations
18 Permutation Invariance: Symmetric Function f (x 1, x 2,, x n ) f (x π1, x π 2,, x π n ), x i! D
19 Permutation Invariance: Symmetric Function f (x 1, x 2,, x n ) f (x π1, x π 2,, x π n ), x i! D Examples: f (x 1, x 2,, x n ) = max{x 1, x 2,, x n } f (x 1, x 2,, x n ) = x 1 + x x n
20 Permutation Invariance: Symmetric Function f (x 1, x 2,, x n ) f (x π1, x π 2,, x π n ), x i! D Examples: f (x 1, x 2,, x n ) = max{x 1, x 2,, x n } f (x 1, x 2,, x n ) = x 1 + x x n How can we construct a family of symmetric functions by neural networks?
21 Permutation Invariance: Symmetric Function Observe: f (x 1, x 2,, x n ) = γ! g(h(x 1 ),,h(x n )) is symmetric if g is symmetric
22 Permutation Invariance: Symmetric Function Observe: f (x 1, x 2,, x n ) = γ! g(h(x 1 ),,h(x n )) is symmetric if g is symmetric (1,2,3) (1,1,1) (2,3,2) h (2,3,4)
23 Permutation Invariance: Symmetric Function Observe: f (x 1, x 2,, x n ) = γ! g(h(x 1 ),,h(x n )) is symmetric if g is symmetric (1,2,3) (1,1,1) (2,3,2) (2,3,4) h g simple symmetric function
24 Permutation Invariance: Symmetric Function Observe: f (x 1, x 2,, x n ) = γ! g(h(x 1 ),,h(x n )) is symmetric if g is symmetric (1,2,3) (1,1,1) (2,3,2) (2,3,4) h g simple symmetric function γ PointNet (vanilla)
25 Permutation Invariance: Symmetric Function What symmetric functions can be constructed by PointNet? Symmetric functions PointNet (vanilla)
26 Universal Set Function Approximator Theorem: A Hausdorff continuous symmetric function arbitrarily approximated by PointNet. f :2 X! can be S! d PointNet (vanilla)
27 Basic PointNet Architecture Empirically, we use multi-layer perceptron (MLP) and max pooling: h (1,2,3) (1,1,1) MLP MLP g γ (2,3,2) MLP max MLP (2,3,4) MLP PointNet (vanilla)
28 Challenges Unordered point set as input Model needs to be invariant to N! permutations. Invariance under geometric transformations Point cloud rotations should not alter classification results.
29 Input Alignment by Transformer Network Idea: Data dependent transformation for automatic alignment 3 T-Net transform 3 params N Transform N Data Transformed Data
30 Input Alignment by Transformer Network Idea: Data dependent transformation for automatic alignment 3 T-Net transform 3 params N Transform N Data Transformed Data
31 Input Alignment by Transformer Network Idea: Data dependent transformation for automatic alignment 3 T-Net transform 3 params N Transform N Data Transformed Data
32 Input Alignment by Transformer Network The transformation is just matrix multiplication! 3 T-Net transform 3 params: 3x3 N Matrix Mult. Data Transformed Data
33 Embedding Space Alignment T-Net transform params: 64x64 Matrix Mult. Input embeddings: Nx64 Transformed embeddings: Nx64
34 Embedding Space Alignment T-Net transform params: 64x64 Matrix Mult. Regularization: Transform matrix A 64x64 close to orthogonal: Input embeddings: Nx64 Transformed embeddings: Nx64
35 PointNet Classification Network
36 PointNet Classification Network
37 PointNet Classification Network
38 PointNet Classification Network
39 PointNet Classification Network
40 PointNet Classification Network
41 PointNet Classification Network
42 Extension to PointNet Segmentation Network local embedding global feature
43 Extension to PointNet Segmentation Network local embedding global feature
44 Results
45 Results on Object Classification 3D CNNs dataset: ModelNet40; metric: 40-class classification accuracy (%)
46 Results on Object Part Segmentation
47 Results on Object Part Segmentation dataset: ShapeNetPart; metric: mean IoU (%)
48 Results on Semantic Scene Parsing Output Input dataset: Stanford 2D-3D-S (Matterport scans)
49 Robustness to Data Corruption dataset: ModelNet40; metric: 40-class classification accuracy (%)
50 Robustness to Data Corruption Less than 2% accuracy drop with 50% missing data dataset: ModelNet40; metric: 40-class classification accuracy (%)
51 Robustness to Data Corruption dataset: ModelNet40; metric: 40-class classification accuracy (%)
52 Robustness to Data Corruption Why is PointNet so robust to missing data? 3D CNN
53 Visualizing Global Point Cloud Features n MLP shared maxpool global feature Which input points are contributing to the global feature? (critical points)
54 Visualizing Global Point Cloud Features Original Shape: Critical Point Sets:
55 Visualizing Global Point Cloud Features n MLP shared maxpool global feature Which points won t affect the global feature?
56 Visualizing Global Point Cloud Features Original Shape: Critical Point Set: Upper bound set:
57 Visualizing Global Point Cloud Features (OOS) Original Shape: Critical Point Set: Upper bound Set:
58 Conclusion PointNet is a novel deep neural network that directly consumes point cloud. A unified approach to various 3D recognition tasks. Rich theoretical analysis and experimental results. Code & Data Available! See you at Poster 9!
59 Thank you!
60 THE END
61 Speed and Model Size Inference time 11.6ms, 25.3ms GTX1080, batch size 8
62 Permutation Invariance: How about Sorting? Sort the points before feeding them into a network. Unfortunately, there is no canonical order in high dim space. (1,2,3) (1,1,1) (2,3,2) (2,3,4) lexsorted (1,1,1) (1,2,3) (2,3,2) (2,3,4) MLP
63 Permutation Invariance: How about Sorting? Sort the points before feeding them into a network. Unfortunately, there is no canonical order in high dim space. lexsorted Multi-Layer Perceptron (ModelNet shape classification) (1,2,3) (1,1,1) (2,3,2) (2,3,4) (1,1,1) (1,2,3) (2,3,2) (2,3,4) MLP Accuracy Unordered Input 12% Lexsorted Input 40% PointNet (vanilla) 87%
64 Permutation Invariance: How about RNNs? Train RNN with permutation augmentation. However, RNN forgets and order matters. LSTM LSTM LSTM LSTM MLP MLP MLP MLP (1,2,3) (1,1,1) (2,3,2) (2,3,4)
65 Permutation Invariance: How about RNNs? Train RNN with permutation augmentation. However, RNN forgets and order matters. LSTM LSTM LSTM LSTM LSTM Network (ModelNet shape classification) MLP MLP MLP MLP Accuracy LSTM 75% (1,2,3) (1,1,1) (2,3,2) (2,3,4) PointNet (vanilla) 87%
66 PointNet Classification Network ModelNet40 Accuracy PointNet (vanilla) 87.1% + input 3x3 87.9% + feature 64x % + feature 64x64 + reg 87.4% + both 89.2%
67 Visualizing Point Functions Compact View: 1x3 FCs 1x1024 Expanded View: 1x3 FC FC FC FC FC x1024 Which input point will activate neuron X? Find the top-k points in a dense volumetric grid that activates neuron X.
68 Visualizing Point Functions
CS468: 3D Deep Learning on Point Cloud Data. class label part label. Hao Su. image. May 10, 2017
CS468: 3D Deep Learning on Point Cloud Data class label part label Hao Su image. May 10, 2017 Agenda Point cloud generation Point cloud analysis CVPR 17, Point Set Generation Pipeline render CVPR 17, Point
More informationPointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space Sikai Zhong February 14, 2018 COMPUTER SCIENCE Table of contents 1. PointNet 2. PointNet++ 3. Experiments 1 PointNet Property
More information3D Deep Learning on Geometric Forms. Hao Su
3D Deep Learning on Geometric Forms Hao Su Many 3D representations are available Candidates: multi-view images depth map volumetric polygonal mesh point cloud primitive-based CAD models 3D representation
More informationAI Ukraine Point cloud labeling using machine learning
AI Ukraine - 2017 Point cloud labeling using machine learning Andrii Babii apratster@gmail.com Data sources 1. LIDAR 2. 3D IR cameras 3. Generated from 2D sources 3D dataset: http://kos.informatik.uni-osnabrueck.de/3dscans/
More information3D Deep Learning
3D Deep Learning Tutorial@CVPR2017 Hao Su (UCSD) Leonidas Guibas (Stanford) Michael Bronstein (Università della Svizzera Italiana) Evangelos Kalogerakis (UMass) Jimei Yang (Adobe Research) Charles Qi (Stanford)
More informationPOINT CLOUD DEEP LEARNING
POINT CLOUD DEEP LEARNING Innfarn Yoo, 3/29/28 / 57 Introduction AGENDA Previous Work Method Result Conclusion 2 / 57 INTRODUCTION 3 / 57 2D OBJECT CLASSIFICATION Deep Learning for 2D Object Classification
More informationDeep Learning on Point Sets for 3D Classification and Segmentation
Deep Learning on Point Sets for 3D Classification and Segmentation Charles Ruizhongtai Qi Stanford University rqi@stanford.edu Abstract Point cloud is an important type of geometric data structure. Due
More informationVolumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material
Volumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material Charles R. Qi Hao Su Matthias Nießner Angela Dai Mengyuan Yan Leonidas J. Guibas Stanford University 1. Details
More informationUsing Faster-RCNN to Improve Shape Detection in LIDAR
Using Faster-RCNN to Improve Shape Detection in LIDAR TJ Melanson Stanford University Stanford, CA 94305 melanson@stanford.edu Abstract In this paper, I propose a method for extracting objects from unordered
More informationSupplementary A. Overview. C. Time and Space Complexity. B. Shape Retrieval. D. Permutation Invariant SOM. B.1. Dataset
Supplementary A. Overview This supplementary document provides more technical details and experimental results to the main paper. Shape retrieval experiments are demonstrated with ShapeNet Core55 dataset
More informationarxiv: v1 [cs.cv] 28 Nov 2018
MeshNet: Mesh Neural Network for 3D Shape Representation Yutong Feng, 1 Yifan Feng, 2 Haoxuan You, 1 Xibin Zhao 1, Yue Gao 1 1 BNRist, KLISS, School of Software, Tsinghua University, China. 2 School of
More informationAn Empirical Study of Generative Adversarial Networks for Computer Vision Tasks
An Empirical Study of Generative Adversarial Networks for Computer Vision Tasks Report for Undergraduate Project - CS396A Vinayak Tantia (Roll No: 14805) Guide: Prof Gaurav Sharma CSE, IIT Kanpur, India
More information3D Convolutional Neural Networks for Landing Zone Detection from LiDAR
3D Convolutional Neural Networks for Landing Zone Detection from LiDAR Daniel Mataruna and Sebastian Scherer Presented by: Sabin Kafle Outline Introduction Preliminaries Approach Volumetric Density Mapping
More informationarxiv: v1 [cs.cv] 6 Nov 2018 Abstract
3DCapsule: Extending the Capsule Architecture to Classify 3D Point Clouds Ali Cheraghian Australian National University, Data61-CSIRO ali.cheraghian@anu.edu.au Lars Petersson Data61-CSIRO lars.petersson@data61.csiro.au
More informationPhoto-realistic Renderings for Machines Seong-heum Kim
Photo-realistic Renderings for Machines 20105034 Seong-heum Kim CS580 Student Presentations 2016.04.28 Photo-realistic Renderings for Machines Scene radiances Model descriptions (Light, Shape, Material,
More informationLearning from 3D Data
Learning from 3D Data Thomas Funkhouser Princeton University* * On sabbatical at Stanford and Google Disclaimer: I am talking about the work of these people Shuran Song Andy Zeng Fisher Yu Yinda Zhang
More informationCode Mania Artificial Intelligence: a. Module - 1: Introduction to Artificial intelligence and Python:
Code Mania 2019 Artificial Intelligence: a. Module - 1: Introduction to Artificial intelligence and Python: 1. Introduction to Artificial Intelligence 2. Introduction to python programming and Environment
More informationarxiv: v1 [cs.cv] 17 Apr 2019
2D Car Detection in Radar Data with PointNets Andreas Danzer, Thomas Griebel, Martin Bach, and Klaus Dietmayer arxiv:1904.08414v1 [cs.cv] 17 Apr 2019 Abstract For many automated driving functions, a highly
More informationLSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University
LSTM and its variants for visual recognition Xiaodan Liang xdliang328@gmail.com Sun Yat-sen University Outline Context Modelling with CNN LSTM and its Variants LSTM Architecture Variants Application in
More informationSeeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su
Seeing the unseen Data-driven 3D Understanding from Single Images Hao Su Image world Shape world 3D perception from a single image Monocular vision a typical prey a typical predator Cited from https://en.wikipedia.org/wiki/binocular_vision
More informationFrom 3D descriptors to monocular 6D pose: what have we learned?
ECCV Workshop on Recovering 6D Object Pose From 3D descriptors to monocular 6D pose: what have we learned? Federico Tombari CAMP - TUM Dynamic occlusion Low latency High accuracy, low jitter No expensive
More informationMULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION SAMBIT GHADAI XIAN LEE ADITYA BALU SOUMIK SARKAR ADARSH KRISHNAMURTHY
MULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION SAMBIT GHADAI XIAN LEE ADITYA BALU SOUMIK SARKAR ADARSH KRISHNAMURTHY Outline Object Recognition Multi-Level Volumetric Representations
More informationMachine Learning 13. week
Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of
More information3D model classification using convolutional neural network
3D model classification using convolutional neural network JunYoung Gwak Stanford jgwak@cs.stanford.edu Abstract Our goal is to classify 3D models directly using convolutional neural network. Most of existing
More informationGraphNet: Recommendation system based on language and network structure
GraphNet: Recommendation system based on language and network structure Rex Ying Stanford University rexying@stanford.edu Yuanfang Li Stanford University yli03@stanford.edu Xin Li Stanford University xinli16@stanford.edu
More informationObject Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning
Allan Zelener Dissertation Proposal December 12 th 2016 Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning Overview 1. Introduction to 3D Object Identification
More informationDetection and Tracking of Moving Objects Using 2.5D Motion Grids
Detection and Tracking of Moving Objects Using 2.5D Motion Grids Alireza Asvadi, Paulo Peixoto and Urbano Nunes Institute of Systems and Robotics, University of Coimbra September 2015 1 Outline: Introduction
More informationLarge-Scale Point Cloud Classification Benchmark
Large-Scale Point Cloud Classification Benchmark www.semantic3d.net IGP & CVG, ETH Zürich www.semantic3d.net, info@semantic3d.net 7/6/2016 1 Timo Hackel Nikolay Savinov Ľubor Ladický Jan Dirk Wegner Konrad
More informationObject Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR
Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization
More information3D Object Classification via Spherical Projections
3D Object Classification via Spherical Projections Zhangjie Cao 1,QixingHuang 2,andRamaniKarthik 3 1 School of Software Tsinghua University, China 2 Department of Computer Science University of Texas at
More informationKeras: Handwritten Digit Recognition using MNIST Dataset
Keras: Handwritten Digit Recognition using MNIST Dataset IIT PATNA February 9, 2017 1 / 24 OUTLINE 1 Introduction Keras: Deep Learning library for Theano and TensorFlow 2 Installing Keras Installation
More informationRelational inductive biases, deep learning, and graph networks
Relational inductive biases, deep learning, and graph networks Peter Battaglia et al. 2018 1 What The authors explore how we can combine relational inductive biases and DL. They introduce graph network
More informationarxiv: v1 [cs.cv] 13 Feb 2018
Recurrent Slice Networks for 3D Segmentation on Point Clouds Qiangui Huang Weiyue Wang Ulrich Neumann University of Southern California Los Angeles, California {qianguih,weiyuewa,uneumann}@uscedu arxiv:180204402v1
More informationPresented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey
Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey Evangelos MALTEZOS, Charalabos IOANNIDIS, Anastasios DOULAMIS and Nikolaos DOULAMIS Laboratory of Photogrammetry, School of Rural
More informationDeep Learning. Architecture Design for. Sargur N. Srihari
Architecture Design for Deep Learning Sargur N. srihari@cedar.buffalo.edu 1 Topics Overview 1. Example: Learning XOR 2. Gradient-Based Learning 3. Hidden Units 4. Architecture Design 5. Backpropagation
More information3D Attention-Driven Depth Acquisition for Object Identification
3D Attention-Driven Depth Acquisition for Object Identification Kai Xu, Yifei Shi, Lintao Zheng, Junyu Zhang, Min Liu, Hui Huang, Hao Su, Daniel Cohen-Or and Baoquan Chen National University of Defense
More informationarxiv: v1 [cs.cv] 27 Nov 2018 Abstract
Iterative Transformer Network for 3D Point Cloud Wentao Yuan David Held Christoph Mertz Martial Hebert The Robotics Institute Carnegie Mellon University {wyuan1, dheld, cmertz, mhebert}@cs.cmu.edu arxiv:1811.11209v1
More informationAutoencoder. Representation learning (related to dictionary learning) Both the input and the output are x
Deep Learning 4 Autoencoder, Attention (spatial transformer), Multi-modal learning, Neural Turing Machine, Memory Networks, Generative Adversarial Net Jian Li IIIS, Tsinghua Autoencoder Autoencoder Unsupervised
More informationLearning to generate 3D shapes
Learning to generate 3D shapes Subhransu Maji College of Information and Computer Sciences University of Massachusetts, Amherst http://people.cs.umass.edu/smaji August 10, 2018 @ Caltech Creating 3D shapes
More informationCross-domain Deep Encoding for 3D Voxels and 2D Images
Cross-domain Deep Encoding for 3D Voxels and 2D Images Jingwei Ji Stanford University jingweij@stanford.edu Danyang Wang Stanford University danyangw@stanford.edu 1. Introduction 3D reconstruction is one
More informationCS 395T Numerical Optimization for Graphics and AI (3D Vision) Qixing Huang August 29 th 2018
CS 395T Numerical Optimization for Graphics and AI (3D Vision) Qixing Huang August 29 th 2018 3D Vision Understanding geometric relations between images and the 3D world between images Obtaining 3D information
More information3D Semantic Segmentation with Submanifold Sparse Convolutional Networks
3D Semantic Segmentation with Submanifold Sparse Convolutional Networks Benjamin Graham Facebook AI Research benjamingraham@fb.com Martin Engelcke University of Oxford martin@robots.ox.ac.uk Laurens van
More information3D Object Detection with Sparse Sampling Neural Networks. Ryan Goy
3D Object Detection with Sparse Sampling Neural Networks by Ryan Goy A thesis submitted in partial satisfaction of the requirements for the degree of Master of Science, Plan II in Electrical Engineering
More informationPreparation Meeting. Recent Advances in the Analysis of 3D Shapes. Emanuele Rodolà Matthias Vestner Thomas Windheuser Daniel Cremers
Preparation Meeting Recent Advances in the Analysis of 3D Shapes Emanuele Rodolà Matthias Vestner Thomas Windheuser Daniel Cremers What You Will Learn in the Seminar Get an overview on state of the art
More informationAsynchronous Parallel Learning for Neural Networks and Structured Models with Dense Features
Asynchronous Parallel Learning for Neural Networks and Structured Models with Dense Features Xu SUN ( 孙栩 ) Peking University xusun@pku.edu.cn Motivation Neural networks -> Good Performance CNN, RNN, LSTM
More information16-785: Integrated Intelligence in Robotics: Vision, Language, and Planning. Spring 2018 Lecture 14. Image to Text
16-785: Integrated Intelligence in Robotics: Vision, Language, and Planning Spring 2018 Lecture 14. Image to Text Input Output Classification tasks 4/1/18 CMU 16-785: Integrated Intelligence in Robotics
More information3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis
3D Shape Analysis with Multi-view Convolutional Networks Evangelos Kalogerakis 3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors
More information3D Shape Segmentation with Projective Convolutional Networks
3D Shape Segmentation with Projective Convolutional Networks Evangelos Kalogerakis 1 Melinos Averkiou 2 Subhransu Maji 1 Siddhartha Chaudhuri 3 1 University of Massachusetts Amherst 2 University of Cyprus
More informationPractical Methodology. Lecture slides for Chapter 11 of Deep Learning Ian Goodfellow
Practical Methodology Lecture slides for Chapter 11 of Deep Learning www.deeplearningbook.org Ian Goodfellow 2016-09-26 What drives success in ML? Arcane knowledge of dozens of obscure algorithms? Mountains
More informationHello Edge: Keyword Spotting on Microcontrollers
Hello Edge: Keyword Spotting on Microcontrollers Yundong Zhang, Naveen Suda, Liangzhen Lai and Vikas Chandra ARM Research, Stanford University arxiv.org, 2017 Presented by Mohammad Mofrad University of
More informationarxiv: v1 [cs.cv] 2 Dec 2018
PVRNet: Point-View Relation Neural Network for 3D Shape Recognition Haoxuan You 1, Yifan Feng 2, Xibin Zhao 1, hangqing Zou 3, Rongrong Ji 2, Yue Gao 1 1 BNRist, KLISS, School of Software, Tsinghua University,
More informationFace Alignment Across Large Poses: A 3D Solution
Face Alignment Across Large Poses: A 3D Solution Outline Face Alignment Related Works 3D Morphable Model Projected Normalized Coordinate Code Network Structure 3D Image Rotation Performance on Datasets
More informationCOMP 551 Applied Machine Learning Lecture 16: Deep Learning
COMP 551 Applied Machine Learning Lecture 16: Deep Learning Instructor: Ryan Lowe (ryan.lowe@cs.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless otherwise noted, all
More informationRecurrent Convolutional Neural Networks for Scene Labeling
Recurrent Convolutional Neural Networks for Scene Labeling Pedro O. Pinheiro, Ronan Collobert Reviewed by Yizhe Zhang August 14, 2015 Scene labeling task Scene labeling: assign a class label to each pixel
More informationKeras: Handwritten Digit Recognition using MNIST Dataset
Keras: Handwritten Digit Recognition using MNIST Dataset IIT PATNA January 31, 2018 1 / 30 OUTLINE 1 Keras: Introduction 2 Installing Keras 3 Keras: Building, Testing, Improving A Simple Network 2 / 30
More informationNeural Network and Deep Learning. Donglin Zeng, Department of Biostatistics, University of North Carolina
Neural Network and Deep Learning Early history of deep learning Deep learning dates back to 1940s: known as cybernetics in the 1940s-60s, connectionism in the 1980s-90s, and under the current name starting
More information(Deep) Learning for Robot Perception and Navigation. Wolfram Burgard
(Deep) Learning for Robot Perception and Navigation Wolfram Burgard Deep Learning for Robot Perception (and Navigation) Lifeng Bo, Claas Bollen, Thomas Brox, Andreas Eitel, Dieter Fox, Gabriel L. Oliveira,
More informationSequence Modeling: Recurrent and Recursive Nets. By Pyry Takala 14 Oct 2015
Sequence Modeling: Recurrent and Recursive Nets By Pyry Takala 14 Oct 2015 Agenda Why Recurrent neural networks? Anatomy and basic training of an RNN (10.2, 10.2.1) Properties of RNNs (10.2.2, 8.2.6) Using
More informationConditional Random Fields as Recurrent Neural Networks
BIL722 - Deep Learning for Computer Vision Conditional Random Fields as Recurrent Neural Networks S. Zheng, S. Jayasumana, B. Romera-Paredes V. Vineet, Z. Su, D. Du, C. Huang, P.H.S. Torr Introduction
More informationVisual Computing TUM
Visual Computing Group @ TUM Visual Computing Group @ TUM BundleFusion Real-time 3D Reconstruction Scalable scene representation Global alignment and re-localization TOG 17 [Dai et al.]: BundleFusion Real-time
More informationPerceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington
Perceiving the 3D World from Images and Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World 3 Understand
More informationIndex. Springer Nature Switzerland AG 2019 B. Moons et al., Embedded Deep Learning,
Index A Algorithmic noise tolerance (ANT), 93 94 Application specific instruction set processors (ASIPs), 115 116 Approximate computing application level, 95 circuits-levels, 93 94 DAS and DVAS, 107 110
More information3D Environment Reconstruction
3D Environment Reconstruction Using Modified Color ICP Algorithm by Fusion of a Camera and a 3D Laser Range Finder The 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems October 11-15,
More informationEnd-To-End Spam Classification With Neural Networks
End-To-End Spam Classification With Neural Networks Christopher Lennan, Bastian Naber, Jan Reher, Leon Weber 1 Introduction A few years ago, the majority of the internet s network traffic was due to spam
More informationLSTM for Language Translation and Image Captioning. Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia
1 LSTM for Language Translation and Image Captioning Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia 2 Part I LSTM for Language Translation Motivation Background (RNNs, LSTMs) Model
More informationDEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla
DEEP LEARNING REVIEW Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature 2015 -Presented by Divya Chitimalla What is deep learning Deep learning allows computational models that are composed of multiple
More informationAll You Want To Know About CNNs. Yukun Zhu
All You Want To Know About CNNs Yukun Zhu Deep Learning Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image
More information08 An Introduction to Dense Continuous Robotic Mapping
NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy
More informationEfficient Algorithms may not be those we think
Efficient Algorithms may not be those we think Yann LeCun, Computational and Biological Learning Lab The Courant Institute of Mathematical Sciences New York University http://yann.lecun.com http://www.cs.nyu.edu/~yann
More information3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington
3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World
More informationDelivering Deep Learning to Mobile Devices via Offloading
Delivering Deep Learning to Mobile Devices via Offloading Xukan Ran*, Haoliang Chen*, Zhenming Liu 1, Jiasi Chen* *University of California, Riverside 1 College of William and Mary Deep learning on mobile
More informationRecurrent Neural Networks and Transfer Learning for Action Recognition
Recurrent Neural Networks and Transfer Learning for Action Recognition Andrew Giel Stanford University agiel@stanford.edu Ryan Diaz Stanford University ryandiaz@stanford.edu Abstract We have taken on the
More informationPoint2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network
Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network Xinhai Liu, Zhizhong Han,2, Yu-Shen Liu, Matthias Zwicker 2 School of Software,
More informationTutorial on Keras CAP ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY
Tutorial on Keras CAP 6412 - ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY Deep learning packages TensorFlow Google PyTorch Facebook AI research Keras Francois Chollet (now at Google) Chainer Company
More informationMask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018
Mask R-CNN Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018 1 Common computer vision tasks Image Classification: one label is generated for
More informationarxiv: v1 [cs.cv] 28 Nov 2017
3D Semantic Segmentation with Submanifold Sparse Convolutional Networks arxiv:1711.10275v1 [cs.cv] 28 Nov 2017 Abstract Benjamin Graham, Martin Engelcke, Laurens van der Maaten Convolutional networks are
More informationarxiv: v1 [cs.cv] 20 Dec 2016
End-to-End Pedestrian Collision Warning System based on a Convolutional Neural Network with Semantic Segmentation arxiv:1612.06558v1 [cs.cv] 20 Dec 2016 Heechul Jung heechul@dgist.ac.kr Min-Kook Choi mkchoi@dgist.ac.kr
More informationarxiv: v1 [cs.cv] 4 Jun 2018
arxiv:1806.01411v1 [cs.cv] 4 Jun 2018 Learning Scene Flow in 3D Point Clouds Xingyu Liu* Charles R. Qi* Leonidas J. Guibas Stanford University Abstract. Many applications in robotics and human-computer
More informationAutomotive 3D Object Detection Without Target Domain Annotations
Master of Science Thesis in Electrical Engineering Department of Electrical Engineering, Linköping University, 2018 Automotive 3D Object Detection Without Target Domain Annotations Erik Linder-Norén and
More information3D Object Classification using Shape Distributions and Deep Learning
3D Object Classification using Shape Distributions and Deep Learning Melvin Low Stanford University mwlow@cs.stanford.edu Abstract This paper shows that the Absolute Angle shape distribution (AAD) feature
More informationDeep Aggregation of Local 3D Geometric Features for 3D Model Retrieval
FURUYA AND OHBUCHI: DEEP AGGREGATION OF LOCAL 3D FEATURES 1 Deep Aggregation of Local 3D Geometric Features for 3D Model Retrieval Takahiko Furuya 1 takahikof AT yamanashi.ac.jp Ryutarou Ohbuchi 1 ohbuchi
More informationA spatiotemporal model with visual attention for video classification
A spatiotemporal model with visual attention for video classification Mo Shan and Nikolay Atanasov Department of Electrical and Computer Engineering University of California San Diego, La Jolla, California,
More informationObject Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal
Object Detection Lecture 10.3 - Introduction to deep learning (CNN) Idar Dyrdal Deep Learning Labels Computational models composed of multiple processing layers (non-linear transformations) Used to learn
More informationAN image is simply a grid of numbers to a machine.
1 Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey Muzammal Naseer, Salman H. Khan, Fatih Porikli Australian National University, Data61-CSIRO, Inception Institute of AI muzammal.naseer@anu.edu.au
More informationDeconvolution Networks
Deconvolution Networks Johan Brynolfsson Mathematical Statistics Centre for Mathematical Sciences Lund University December 6th 2016 1 / 27 Deconvolution Neural Networks 2 / 27 Image Deconvolution True
More informationSynscapes A photorealistic syntehtic dataset for street scene parsing Jonas Unger Department of Science and Technology Linköpings Universitet.
Synscapes A photorealistic syntehtic dataset for street scene parsing Jonas Unger Department of Science and Technology Linköpings Universitet 7D Labs VINNOVA https://7dlabs.com Photo-realistic image synthesis
More informationMultiAR Project Michael Pekel, Ofir Elmakias [GIP] [234329]
MultiAR Project Michael Pekel, Ofir Elmakias [GIP] [234329] Supervisors Dr. Matan Sela Mr. Yaron Honen Assistants Alexander Porotskiy Summary MultiAR is a multiplayer quest (Outdoor Real Time Multiplayer
More informationPaper Motivation. Fixed geometric structures of CNN models. CNNs are inherently limited to model geometric transformations
Paper Motivation Fixed geometric structures of CNN models CNNs are inherently limited to model geometric transformations Higher-level features combine lower-level features at fixed positions as a weighted
More informationPointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space Charles R. Qi Li Yi Hao Su Leonidas J. Guibas Stanford University Abstract Few prior works study deep learning on point sets.
More informationCMU Lecture 18: Deep learning and Vision: Convolutional neural networks. Teacher: Gianni A. Di Caro
CMU 15-781 Lecture 18: Deep learning and Vision: Convolutional neural networks Teacher: Gianni A. Di Caro DEEP, SHALLOW, CONNECTED, SPARSE? Fully connected multi-layer feed-forward perceptrons: More powerful
More informationDynamic Routing Between Capsules
Report Explainable Machine Learning Dynamic Routing Between Capsules Author: Michael Dorkenwald Supervisor: Dr. Ullrich Köthe 28. Juni 2018 Inhaltsverzeichnis 1 Introduction 2 2 Motivation 2 3 CapusleNet
More informationConvolutional Neural Networks: Applications and a short timeline. 7th Deep Learning Meetup Kornel Kis Vienna,
Convolutional Neural Networks: Applications and a short timeline 7th Deep Learning Meetup Kornel Kis Vienna, 1.12.2016. Introduction Currently a master student Master thesis at BME SmartLab Started deep
More informationABC-CNN: Attention Based CNN for Visual Question Answering
ABC-CNN: Attention Based CNN for Visual Question Answering CIS 601 PRESENTED BY: MAYUR RUMALWALA GUIDED BY: DR. SUNNIE CHUNG AGENDA Ø Introduction Ø Understanding CNN Ø Framework of ABC-CNN Ø Datasets
More informationDeep Learning with Tensorflow AlexNet
Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification
More information1 Overview Definitions (read this section carefully) 2
MLPerf User Guide Version 0.5 May 2nd, 2018 1 Overview 2 1.1 Definitions (read this section carefully) 2 2 General rules 3 2.1 Strive to be fair 3 2.2 System and framework must be consistent 4 2.3 System
More informationYOLO9000: Better, Faster, Stronger
YOLO9000: Better, Faster, Stronger Date: January 24, 2018 Prepared by Haris Khan (University of Toronto) Haris Khan CSC2548: Machine Learning in Computer Vision 1 Overview 1. Motivation for one-shot object
More informationScene Text Recognition for Augmented Reality. Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science
Scene Text Recognition for Augmented Reality Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science Outline Research area and motivation Finding text in natural scenes Prior art Improving
More informationSparse 3D Convolutional Neural Networks for Large-Scale Shape Retrieval
Sparse 3D Convolutional Neural Networks for Large-Scale Shape Retrieval Alexandr Notchenko, Ermek Kapushev, Evgeny Burnaev {avnotchenko,kapushev,burnaevevgeny}@gmail.com Skolkovo Institute of Science and
More informationarxiv: v4 [cs.cv] 27 Nov 2016
FusionNet: 3D Object Classification Using Multiple Data Representations Vishakh Hegde Stanford and Matroid vishakh@matroid.com Reza Zadeh Stanford and Matroid reza@matroid.com arxiv:1607.05695v4 [cs.cv]
More informationKnow your data - many types of networks
Architectures Know your data - many types of networks Fixed length representation Variable length representation Online video sequences, or samples of different sizes Images Specific architectures for
More information