# PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

Save this PDF as:

Size: px
Start display at page:

Download "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space"

## Transcription

1 PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space Sikai Zhong February 14, 2018 COMPUTER SCIENCE

3 PointNet

4 Property of Point Cloud It is unordered,networks must be invariant to permutations; Points are not isolated and must have neighbor information; It must has invariance under transformations because all points are part of a rigid object; 2

5 Symmetry Function for Unordered input A symmetry function takes n vectors and outputs a result that is invariant to the order of a set; + and * are operator candidates for symmetry function; 3

6 Pipeline of PointNet Figure 1: The classification network takes n points as input, applies input and feature transformations, and then aggregates point features by max pooling. The output is classification scores for k classes. The segmentation network is an extension to the classification net. It concatenates global and local features and outputs per point scores. mlp stands for multi-layer perceptron, numbers in bracket are layer sizes. Batch norm is used for all layers with ReLU. Dropout layers are used for the last mlp in classification net.[1] 4

7 Pipeline of PointNet f (x 1, x 2,..., x n ) = γ(max i=1,...,n {h(x i )}); (1) Where γ and h are multi-layer perception(mlp) networks. 5

8 PointNet PointNet can not capature local structure induced by the metric space; Local structure is very important to the success of convolutional architectures; 6

9 PointNet++

10 General Ideal of PointNet++ 1. Partition the set of points into overlapping locally regions using farthest point sampling (FPS) algorithm; 2. Capture the local features using PointNet; 3. Group the features into bigger unit and calculate the higher features again; 7

11 Sampling Layer 1. Given input points {x 1, x 2,...x n }; 2. Using FPS to get a subset {x i1, x i2,...x im } such that x ij is the most distant point to the result of the points as centroids; 8

12 Grouping Layer 1. Given input points of size of N (d + C) and the coordinates of a set of centroids of size N d (d is the dimension of points, C is the dimension of features ) 2. The outputs are groups of points of size N K (d + C); 3. K varies from different levels; 9

13 Grouping Layer Figure 2: (a) Multi-scale grouping (MSG) (); (b) Multiresolution grouping (MRG)(features of a region at some level Li is a concatenation of two vectors. 10

14 Grouping Layer MRG:Features at different scales are concatenated to form a multi-scale features,computationally expensive; MSG: One vector (left in figure. 10) is obtained by summarizing the features at each subregion from the lower level L i using the set abstraction level. The other vector (right) is the feature that is obtained by directly processing all raw points in the local region using a single PointNet). 11

15 PointNet Layer 1. the input are N local regions of points with data size N K (d + C); K is the number of points in one region; 2. Output data size is N (d + c ) 12

16 Pipeline of PointNet++ Figure 3: Illustration of our hierarchical feature learning architecture and its application for set segmentation and classification using points in 2D Euclidean space as an example. Single scale point grouping is visualized here[2] 13

17 Experiments

18 Datasets MNIST : Images of handwritten digits with 60k training and 10k testing samples. ModelNet40: CAD models of 40 categories (mostly man-made). We use the official split with 9,843 shapes for training and 2,468 for testing. SHREC15 : 1200 shapes from 50 categories. Each category contains 24 shapes which are mostly organic ones with various poses such as horses, cats, etc. We use five fold cross validation to acquire classification accuracy on this dataset. ScanNet : 1513 scanned and reconstructed indoor scenes. We use 1201 scenes for training, 312 scenes for test; 14

19 Classification Figure 4: Left: Point cloud with random point dropout. Right: Curve showing advantage of our density adaptive strategy in dealing with non-uniform density. DP means random input dropout during training; otherwise training is on uniformly dense point 15

20 Point Set Segmentation for Semantic Scene Labeling Figure 5: Scannet labeling accuracy 16

21 References i C. R. Qi, H. Su, K. Mo, and L. J. Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. Proc. Computer Vision and Pattern Recognition (CVPR), IEEE, 1(2):4, C. R. Qi, L. Yi, H. Su, and L. J. Guibas. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In Advances in Neural Information Processing Systems, pages , 2017.

### PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation Charles R. Qi* Hao Su* Kaichun Mo Leonidas J. Guibas Big Data + Deep Representation Learning Robot Perception Augmented Reality

### Supplementary A. Overview. C. Time and Space Complexity. B. Shape Retrieval. D. Permutation Invariant SOM. B.1. Dataset

Supplementary A. Overview This supplementary document provides more technical details and experimental results to the main paper. Shape retrieval experiments are demonstrated with ShapeNet Core55 dataset

### CS468: 3D Deep Learning on Point Cloud Data. class label part label. Hao Su. image. May 10, 2017

CS468: 3D Deep Learning on Point Cloud Data class label part label Hao Su image. May 10, 2017 Agenda Point cloud generation Point cloud analysis CVPR 17, Point Set Generation Pipeline render CVPR 17, Point

### 3D Deep Learning on Geometric Forms. Hao Su

3D Deep Learning on Geometric Forms Hao Su Many 3D representations are available Candidates: multi-view images depth map volumetric polygonal mesh point cloud primitive-based CAD models 3D representation

### POINT CLOUD DEEP LEARNING

POINT CLOUD DEEP LEARNING Innfarn Yoo, 3/29/28 / 57 Introduction AGENDA Previous Work Method Result Conclusion 2 / 57 INTRODUCTION 3 / 57 2D OBJECT CLASSIFICATION Deep Learning for 2D Object Classification

### PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space Charles R. Qi Li Yi Hao Su Leonidas J. Guibas Stanford University Abstract Few prior works study deep learning on point sets.

### arxiv: v1 [cs.cv] 28 Nov 2018

MeshNet: Mesh Neural Network for 3D Shape Representation Yutong Feng, 1 Yifan Feng, 2 Haoxuan You, 1 Xibin Zhao 1, Yue Gao 1 1 BNRist, KLISS, School of Software, Tsinghua University, China. 2 School of

### Volumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material

Volumetric and Multi-View CNNs for Object Classification on 3D Data Supplementary Material Charles R. Qi Hao Su Matthias Nießner Angela Dai Mengyuan Yan Leonidas J. Guibas Stanford University 1. Details

### AI Ukraine Point cloud labeling using machine learning

AI Ukraine - 2017 Point cloud labeling using machine learning Andrii Babii apratster@gmail.com Data sources 1. LIDAR 2. 3D IR cameras 3. Generated from 2D sources 3D dataset: http://kos.informatik.uni-osnabrueck.de/3dscans/

### Keras: Handwritten Digit Recognition using MNIST Dataset

Keras: Handwritten Digit Recognition using MNIST Dataset IIT PATNA February 9, 2017 1 / 24 OUTLINE 1 Introduction Keras: Deep Learning library for Theano and TensorFlow 2 Installing Keras Installation

### Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network

Point2Sequence: Learning the Shape Representation of 3D Point Clouds with an Attention-based Sequence to Sequence Network Xinhai Liu, Zhizhong Han,2, Yu-Shen Liu, Matthias Zwicker 2 School of Software,

### Deep Learning on Point Sets for 3D Classification and Segmentation

Deep Learning on Point Sets for 3D Classification and Segmentation Charles Ruizhongtai Qi Stanford University rqi@stanford.edu Abstract Point cloud is an important type of geometric data structure. Due

### Dynamic Routing Between Capsules

Report Explainable Machine Learning Dynamic Routing Between Capsules Author: Michael Dorkenwald Supervisor: Dr. Ullrich Köthe 28. Juni 2018 Inhaltsverzeichnis 1 Introduction 2 2 Motivation 2 3 CapusleNet

### ECCV Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016

ECCV 2016 Presented by: Boris Ivanovic and Yolanda Wang CS 331B - November 16, 2016 Fundamental Question What is a good vector representation of an object? Something that can be easily predicted from 2D

### arxiv: v1 [cs.cv] 6 Nov 2018 Abstract

3DCapsule: Extending the Capsule Architecture to Classify 3D Point Clouds Ali Cheraghian Australian National University, Data61-CSIRO ali.cheraghian@anu.edu.au Lars Petersson Data61-CSIRO lars.petersson@data61.csiro.au

### An Empirical Study of Generative Adversarial Networks for Computer Vision Tasks

An Empirical Study of Generative Adversarial Networks for Computer Vision Tasks Report for Undergraduate Project - CS396A Vinayak Tantia (Roll No: 14805) Guide: Prof Gaurav Sharma CSE, IIT Kanpur, India

### Using Faster-RCNN to Improve Shape Detection in LIDAR

Using Faster-RCNN to Improve Shape Detection in LIDAR TJ Melanson Stanford University Stanford, CA 94305 melanson@stanford.edu Abstract In this paper, I propose a method for extracting objects from unordered

### Pose estimation using a variety of techniques

Pose estimation using a variety of techniques Keegan Go Stanford University keegango@stanford.edu Abstract Vision is an integral part robotic systems a component that is needed for robots to interact robustly

### Cross-domain Deep Encoding for 3D Voxels and 2D Images

Cross-domain Deep Encoding for 3D Voxels and 2D Images Jingwei Ji Stanford University jingweij@stanford.edu Danyang Wang Stanford University danyangw@stanford.edu 1. Introduction 3D reconstruction is one

### arxiv: v1 [cs.cv] 17 Apr 2019

2D Car Detection in Radar Data with PointNets Andreas Danzer, Thomas Griebel, Martin Bach, and Klaus Dietmayer arxiv:1904.08414v1 [cs.cv] 17 Apr 2019 Abstract For many automated driving functions, a highly

### Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization

### Object Recognition II

Object Recognition II Linda Shapiro EE/CSE 576 with CNN slides from Ross Girshick 1 Outline Object detection the task, evaluation, datasets Convolutional Neural Networks (CNNs) overview and history Region-based

### Deconvolution Networks

Deconvolution Networks Johan Brynolfsson Mathematical Statistics Centre for Mathematical Sciences Lund University December 6th 2016 1 / 27 Deconvolution Neural Networks 2 / 27 Image Deconvolution True

### arxiv: v1 [cs.cv] 27 Nov 2018 Abstract

Iterative Transformer Network for 3D Point Cloud Wentao Yuan David Held Christoph Mertz Martial Hebert The Robotics Institute Carnegie Mellon University {wyuan1, dheld, cmertz, mhebert}@cs.cmu.edu arxiv:1811.11209v1

### Seeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su

Seeing the unseen Data-driven 3D Understanding from Single Images Hao Su Image world Shape world 3D perception from a single image Monocular vision a typical prey a typical predator Cited from https://en.wikipedia.org/wiki/binocular_vision

### Structured Prediction using Convolutional Neural Networks

Overview Structured Prediction using Convolutional Neural Networks Bohyung Han bhhan@postech.ac.kr Computer Vision Lab. Convolutional Neural Networks (CNNs) Structured predictions for low level computer

### MULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION SAMBIT GHADAI XIAN LEE ADITYA BALU SOUMIK SARKAR ADARSH KRISHNAMURTHY

MULTI-LEVEL 3D CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION SAMBIT GHADAI XIAN LEE ADITYA BALU SOUMIK SARKAR ADARSH KRISHNAMURTHY Outline Object Recognition Multi-Level Volumetric Representations

### Keras: Handwritten Digit Recognition using MNIST Dataset

Keras: Handwritten Digit Recognition using MNIST Dataset IIT PATNA January 31, 2018 1 / 30 OUTLINE 1 Keras: Introduction 2 Installing Keras 3 Keras: Building, Testing, Improving A Simple Network 2 / 30

### 3D Convolutional Neural Networks for Landing Zone Detection from LiDAR

3D Convolutional Neural Networks for Landing Zone Detection from LiDAR Daniel Mataruna and Sebastian Scherer Presented by: Sabin Kafle Outline Introduction Preliminaries Approach Volumetric Density Mapping

### Object detection, deep learning, and R-CNNs

Object detection, deep learning, and R-CNNs Partly from Ross Girshick Microsoft Research Guest lecture for UW CSE 455 Nov. 24, 2014 Outline Object detection the task, evaluation, datasets Convolutional

### Machine Learning 13. week

Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of

### Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York

Clustering Robert M. Haralick Computer Science, Graduate Center City University of New York Outline K-means 1 K-means 2 3 4 5 Clustering K-means The purpose of clustering is to determine the similarity

### Deep Learning with Tensorflow AlexNet

Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification

### MoonRiver: Deep Neural Network in C++

MoonRiver: Deep Neural Network in C++ Chung-Yi Weng Computer Science & Engineering University of Washington chungyi@cs.washington.edu Abstract Artificial intelligence resurges with its dramatic improvement

### arxiv: v4 [cs.cv] 27 Mar 2018

SO-Net: Self-Organizing Network for Point Cloud Analysis Jiaxin Li Ben M. Chen Gim Hee Lee National University of Singapore arxiv:1803.04249v4 [cs.cv] 27 Mar 2018 Abstract This paper presents SO-Net, a

### Deep Models for 3D Reconstruction

Deep Models for 3D Reconstruction Andreas Geiger Autonomous Vision Group, MPI for Intelligent Systems, Tübingen Computer Vision and Geometry Group, ETH Zürich October 12, 2017 Max Planck Institute for

### Polytechnic University of Tirana

1 Polytechnic University of Tirana Department of Computer Engineering SIBORA THEODHOR ELINDA KAJO M ECE 2 Computer Vision OCR AND BEYOND THE PRESENTATION IS ORGANISED IN 3 PARTS : 3 Introduction, previous

### 6. Convolutional Neural Networks

6. Convolutional Neural Networks CS 519 Deep Learning, Winter 2017 Fuxin Li With materials from Zsolt Kira Quiz coming up Next Thursday (2/2) 20 minutes Topics: Optimization Basic neural networks No Convolutional

### Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning

Allan Zelener Dissertation Proposal December 12 th 2016 Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning Overview 1. Introduction to 3D Object Identification

### arxiv: v1 [cs.cv] 2 Dec 2018

PVRNet: Point-View Relation Neural Network for 3D Shape Recognition Haoxuan You 1, Yifan Feng 2, Xibin Zhao 1, hangqing Zou 3, Rongrong Ji 2, Yue Gao 1 1 BNRist, KLISS, School of Software, Tsinghua University,

### 3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington

3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World

### Deep Learning for Computer Vision II

IIIT Hyderabad Deep Learning for Computer Vision II C. V. Jawahar Paradigm Shift Feature Extraction (SIFT, HoG, ) Part Models / Encoding Classifier Sparrow Feature Learning Classifier Sparrow L 1 L 2 L

### Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Structured Light II Johannes Köhler Johannes.koehler@dfki.de Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov Introduction Previous lecture: Structured Light I Active Scanning Camera/emitter

### arxiv: v1 [cs.cv] 13 Feb 2018

Recurrent Slice Networks for 3D Segmentation on Point Clouds Qiangui Huang Weiyue Wang Ulrich Neumann University of Southern California Los Angeles, California {qianguih,weiyuewa,uneumann}@uscedu arxiv:180204402v1

### 3D Deep Learning

3D Deep Learning Tutorial@CVPR2017 Hao Su (UCSD) Leonidas Guibas (Stanford) Michael Bronstein (Università della Svizzera Italiana) Evangelos Kalogerakis (UMass) Jimei Yang (Adobe Research) Charles Qi (Stanford)

### Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018

Mask R-CNN Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018 1 Common computer vision tasks Image Classification: one label is generated for

### CS 231A Computer Vision (Fall 2011) Problem Set 4

CS 231A Computer Vision (Fall 2011) Problem Set 4 Due: Nov. 30 th, 2011 (9:30am) 1 Part-based models for Object Recognition (50 points) One approach to object recognition is to use a deformable part-based

### DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material

DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material Yi Li 1, Gu Wang 1, Xiangyang Ji 1, Yu Xiang 2, and Dieter Fox 2 1 Tsinghua University, BNRist 2 University of Washington

### P-CNN: Pose-based CNN Features for Action Recognition. Iman Rezazadeh

P-CNN: Pose-based CNN Features for Action Recognition Iman Rezazadeh Introduction automatic understanding of dynamic scenes strong variations of people and scenes in motion and appearance Fine-grained

### Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus Presented by: Rex Ying and Charles Qi Input: A Single RGB Image Estimate

### Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia

Deep learning for dense per-pixel prediction Chunhua Shen The University of Adelaide, Australia Image understanding Classification error Convolution Neural Networks 0.3 0.2 0.1 Image Classification [Krizhevsky

### Learning to generate 3D shapes

Learning to generate 3D shapes Subhransu Maji College of Information and Computer Sciences University of Massachusetts, Amherst http://people.cs.umass.edu/smaji August 10, 2018 @ Caltech Creating 3D shapes

### 3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis

3D Shape Analysis with Multi-view Convolutional Networks Evangelos Kalogerakis 3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors

### Artificial Neural Networks. Introduction to Computational Neuroscience Ardi Tampuu

Artificial Neural Networks Introduction to Computational Neuroscience Ardi Tampuu 7.0.206 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition

### Classification of objects from Video Data (Group 30)

Classification of objects from Video Data (Group 30) Sheallika Singh 12665 Vibhuti Mahajan 12792 Aahitagni Mukherjee 12001 M Arvind 12385 1 Motivation Video surveillance has been employed for a long time

### Efficient Algorithms may not be those we think

Efficient Algorithms may not be those we think Yann LeCun, Computational and Biological Learning Lab The Courant Institute of Mathematical Sciences New York University http://yann.lecun.com http://www.cs.nyu.edu/~yann

### Spatial Localization and Detection. Lecture 8-1

Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday

### Is Bigger CNN Better? Samer Hijazi on behalf of IPG CTO Group Embedded Neural Networks Summit (enns2016) San Jose Feb. 9th

Is Bigger CNN Better? Samer Hijazi on behalf of IPG CTO Group Embedded Neural Networks Summit (enns2016) San Jose Feb. 9th Today s Story Why does CNN matter to the embedded world? How to enable CNN in

### Introduction to Neural Networks

Introduction to Neural Networks Jakob Verbeek 2017-2018 Biological motivation Neuron is basic computational unit of the brain about 10^11 neurons in human brain Simplified neuron model as linear threshold

### Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon

Deep Learning For Video Classification Presented by Natalie Carlebach & Gil Sharon Overview Of Presentation Motivation Challenges of video classification Common datasets 4 different methods presented in

### Supervised Hashing for Image Retrieval via Image Representation Learning

Supervised Hashing for Image Retrieval via Image Representation Learning Rongkai Xia, Yan Pan, Cong Liu (Sun Yat-Sen University) Hanjiang Lai, Shuicheng Yan (National University of Singapore) Finding Similar

### DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,

### Perceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington

Perceiving the 3D World from Images and Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World 3 Understand

### Deep Convolutional Neural Network using Triplet of Faces, Deep Ensemble, and Scorelevel Fusion for Face Recognition

IEEE 2017 Conference on Computer Vision and Pattern Recognition Deep Convolutional Neural Network using Triplet of Faces, Deep Ensemble, and Scorelevel Fusion for Face Recognition Bong-Nam Kang*, Yonghyun

### 3D model classification using convolutional neural network

3D model classification using convolutional neural network JunYoung Gwak Stanford jgwak@cs.stanford.edu Abstract Our goal is to classify 3D models directly using convolutional neural network. Most of existing

### arxiv: v1 [cs.cv] 20 Dec 2016

End-to-End Pedestrian Collision Warning System based on a Convolutional Neural Network with Semantic Segmentation arxiv:1612.06558v1 [cs.cv] 20 Dec 2016 Heechul Jung heechul@dgist.ac.kr Min-Kook Choi mkchoi@dgist.ac.kr

### arxiv: v1 [cs.cv] 4 Jun 2018

arxiv:1806.01411v1 [cs.cv] 4 Jun 2018 Learning Scene Flow in 3D Point Clouds Xingyu Liu* Charles R. Qi* Leonidas J. Guibas Stanford University Abstract. Many applications in robotics and human-computer

### GraphNet: Recommendation system based on language and network structure

GraphNet: Recommendation system based on language and network structure Rex Ying Stanford University rexying@stanford.edu Yuanfang Li Stanford University yli03@stanford.edu Xin Li Stanford University xinli16@stanford.edu

### Recurrent Convolutional Neural Networks for Scene Labeling

Recurrent Convolutional Neural Networks for Scene Labeling Pedro O. Pinheiro, Ronan Collobert Reviewed by Yizhe Zhang August 14, 2015 Scene labeling task Scene labeling: assign a class label to each pixel

### CENG 783. Special topics in. Deep Learning. AlchemyAPI. Week 11. Sinan Kalkan

CENG 783 Special topics in Deep Learning AlchemyAPI Week 11 Sinan Kalkan TRAINING A CNN Fig: http://www.robots.ox.ac.uk/~vgg/practicals/cnn/ Feed-forward pass Note that this is written in terms of the

### Neural Network Optimization and Tuning / Spring 2018 / Recitation 3

Neural Network Optimization and Tuning 11-785 / Spring 2018 / Recitation 3 1 Logistics You will work through a Jupyter notebook that contains sample and starter code with explanations and comments throughout.

### Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal

Object Detection Lecture 10.3 - Introduction to deep learning (CNN) Idar Dyrdal Deep Learning Labels Computational models composed of multiple processing layers (non-linear transformations) Used to learn

### CS 340 Lec. 4: K-Nearest Neighbors

CS 340 Lec. 4: K-Nearest Neighbors AD January 2011 AD () CS 340 Lec. 4: K-Nearest Neighbors January 2011 1 / 23 K-Nearest Neighbors Introduction Choice of Metric Overfitting and Underfitting Selection

### Image Classification pipeline. Lecture 2-1

Lecture 2: Image Classification pipeline Lecture 2-1 Administrative: Piazza For questions about midterm, poster session, projects, etc, use Piazza! SCPD students: Use your @stanford.edu address to register

### Experimental Evaluation of Point Cloud Classification using the PointNet Neural Network

Experimental Evaluation of Point Cloud Classification using the PointNet Neural Network Marko Filipović, Petra Ðurović and Robert Cupec Faculty of Electrical Engineering, Computer Science and Information

### Human Pose Estimation with Deep Learning. Wei Yang

Human Pose Estimation with Deep Learning Wei Yang Applications Understand Activities Family Robots American Heist (2014) - The Bank Robbery Scene 2 What do we need to know to recognize a crime scene? 3

### Su et al. Shape Descriptors - III

Su et al. Shape Descriptors - III Siddhartha Chaudhuri http://www.cse.iitb.ac.in/~cs749 Funkhouser; Feng, Liu, Gong Recap Global A shape descriptor is a set of numbers that describes a shape in a way that

### From 3D descriptors to monocular 6D pose: what have we learned?

ECCV Workshop on Recovering 6D Object Pose From 3D descriptors to monocular 6D pose: what have we learned? Federico Tombari CAMP - TUM Dynamic occlusion Low latency High accuracy, low jitter No expensive

### The Kinect Sensor. Luís Carriço FCUL 2014/15

Advanced Interaction Techniques The Kinect Sensor Luís Carriço FCUL 2014/15 Sources: MS Kinect for Xbox 360 John C. Tang. Using Kinect to explore NUI, Ms Research, From Stanford CS247 Shotton et al. Real-Time

### AUTOMATIC 3D HUMAN ACTION RECOGNITION Ajmal Mian Associate Professor Computer Science & Software Engineering

AUTOMATIC 3D HUMAN ACTION RECOGNITION Ajmal Mian Associate Professor Computer Science & Software Engineering www.csse.uwa.edu.au/~ajmal/ Overview Aim of automatic human action recognition Applications

### Fully-Convolutional Siamese Networks for Object Tracking

Fully-Convolutional Siamese Networks for Object Tracking Luca Bertinetto*, Jack Valmadre*, João Henriques, Andrea Vedaldi and Philip Torr www.robots.ox.ac.uk/~luca luca.bertinetto@eng.ox.ac.uk Tracking

### Computing the Stereo Matching Cost with CNN

University at Austin Figure. The of lefttexas column displays the left input image, while the right column displays the output of our stereo method. Examples are sorted by difficulty, with easy examples

### Contextual Dropout. Sam Fok. Abstract. 1. Introduction. 2. Background and Related Work

Contextual Dropout Finding subnets for subtasks Sam Fok samfok@stanford.edu Abstract The feedforward networks widely used in classification are static and have no means for leveraging information about

### Shape Context Matching For Efficient OCR

Matching For Efficient OCR May 14, 2012 Matching For Efficient OCR Table of contents 1 Motivation Background 2 What is a? Matching s Simliarity Measure 3 Matching s via Pyramid Matching Matching For Efficient

### 3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

3D Computer Vision Structured Light II Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Introduction

### Computer Vision Lecture 16

Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period

### Tri-modal Human Body Segmentation

Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4

### Image Classification pipeline. Lecture 2-1

Lecture 2: Image Classification pipeline Lecture 2-1 Administrative: Piazza For questions about midterm, poster session, projects, etc, use Piazza! SCPD students: Use your @stanford.edu address to register

### SO-Net: Self-Organizing Network for Point Cloud Analysis

SO-Net: Self-Organizing Network for Point Cloud Analysis Jiaxin Li Ben M. Chen Gim Hee Lee National University of Singapore Abstract This paper presents SO-Net, a permutation invariant architecture for

### Object Detection Based on Deep Learning

Object Detection Based on Deep Learning Yurii Pashchenko AI Ukraine 2016, Kharkiv, 2016 Image classification (mostly what you ve seen) http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf

### Learning to Sample. Shai Avidan Tel-Aviv University. Oren Dovrat Tel-Aviv University. Itai Lang Tel-Aviv University. Abstract. 1.

Learning to Sample Oren Dovrat Tel-Aviv University Itai Lang Tel-Aviv University Shai Avidan Tel-Aviv University We propose a simplification network, termed S-NET, that is based on the architecture of

### The Detection of Faces in Color Images: EE368 Project Report

The Detection of Faces in Color Images: EE368 Project Report Angela Chau, Ezinne Oji, Jeff Walters Dept. of Electrical Engineering Stanford University Stanford, CA 9435 angichau,ezinne,jwalt@stanford.edu

### Robust PDF Table Locator

Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records

### Correspondence. CS 468 Geometry Processing Algorithms. Maks Ovsjanikov

Shape Matching & Correspondence CS 468 Geometry Processing Algorithms Maks Ovsjanikov Wednesday, October 27 th 2010 Overall Goal Given two shapes, find correspondences between them. Overall Goal Given

### Report: Privacy-Preserving Classification on Deep Neural Network

Report: Privacy-Preserving Classification on Deep Neural Network Janno Veeorg Supervised by Helger Lipmaa and Raul Vicente Zafra May 25, 2017 1 Introduction In this report we consider following task: how

### Clustering and Unsupervised Anomaly Detection with l 2 Normalized Deep Auto-Encoder Representations

Clustering and Unsupervised Anomaly Detection with l 2 Normalized Deep Auto-Encoder Representations Caglar Aytekin, Xingyang Ni, Francesco Cricri and Emre Aksu Nokia Technologies, Tampere, Finland Corresponding

### Filtering and mapping systems for underwater 3D imaging sonar

Filtering and mapping systems for underwater 3D imaging sonar Tomohiro Koshikawa 1, a, Shin Kato 1,b, and Hitoshi Arisumi 1,c 1 Field Robotics Research Group, National Institute of Advanced Industrial

### Visual Computing TUM

Visual Computing Group @ TUM Visual Computing Group @ TUM BundleFusion Real-time 3D Reconstruction Scalable scene representation Global alignment and re-localization TOG 17 [Dai et al.]: BundleFusion Real-time