DEEP NEURAL NETWORKS FOR OBJECT DETECTION


Sergey Nikolenko
Steklov Institute of Mathematics at St. Petersburg
October 21, 2017, St. Petersburg, Russia

Outline
Bird's-eye overview of deep learning
Convolutional neural networks
From CNN to object detection and segmentation
Current state of the art
Neuromation: synthetic data

Neural networks: a brief history
Neural networks started as models of actual neurons.
It is a very old idea (McCulloch and Pitts, 1943); there were actual hardware perceptrons in the 1950s.
There were several winters and springs, but by the 1980s all the basic architectures that we use today already existed.
But researchers couldn't train them fast enough or on enough data.

The deep learning revolution
About 10 years ago, machine learning underwent a deep learning revolution.
Since 2007-2008, we have been able to train large and deep neural networks.
New ideas for training + GPUs + large datasets.
And now deep NNs yield state-of-the-art results in many fields.

What is a deep neural network
A neural network is a composition of functions, usually a linear combination followed by a nonlinearity.
These functions form a computational graph that computes the loss function for the model.
To train the model (learn the weights), you compute the gradient of the loss function w.r.t. the weights with backpropagation.
Then you can run (stochastic) gradient descent and its variations.
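As a minimal sketch of this recipe, the toy model below composes a linear map, a tanh nonlinearity, and another linear map on a single scalar input, computes the gradient of a squared-error loss by hand with backpropagation, and takes one gradient descent step. The names (`forward`, `loss_and_grads`, `sgd_step`), the starting weights, and the learning rate are illustrative assumptions, not anything from the talk.

```python
import math

def forward(params, x):
    """Compose linear -> tanh -> linear for a scalar input x."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)   # linear combination + nonlinearity
    y_hat = w2 * h + b2          # linear output layer
    return h, y_hat

def loss_and_grads(params, x, y):
    """Squared-error loss and its gradient w.r.t. each weight (backpropagation)."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    loss = 0.5 * (y_hat - y) ** 2
    d_y_hat = y_hat - y              # dL/dy_hat
    d_w2, d_b2 = d_y_hat * h, d_y_hat
    d_h = d_y_hat * w2               # gradient passed back through the output layer
    d_z = d_h * (1.0 - h * h)        # tanh'(z) = 1 - tanh(z)^2
    d_w1, d_b1 = d_z * x, d_z
    return loss, [d_w1, d_b1, d_w2, d_b2]

def sgd_step(params, grads, lr=0.1):
    """One gradient descent update: move each weight against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]

# Hypothetical starting weights and a single training pair (x, y).
params = [0.5, 0.1, -0.3, 0.2]
x, y = 1.0, 1.0
loss_before, grads = loss_and_grads(params, x, y)
params_after = sgd_step(params, grads)
loss_after, _ = loss_and_grads(params_after, x, y)
print(loss_before, loss_after)
```

Real networks do the same thing with vectors and matrices, and frameworks derive the backward pass automatically from the computational graph.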

Convolutional neural networks
Convolutional neural networks are designed specifically for image processing.
Also an old idea: LeCun's group has worked on them since the late 1980s.
Inspired by the experiments of Hubel and Wiesel, who studied (the lower layers of) the visual cortex.

Convolutional neural networks: idea
Main idea: apply the same filters to different parts of the image.
Break up the picture into windows:
Apply a small neural network to each window (processing a single tile):
Compress with max-pooling, then use the resulting features:
We can also see which parts of the image activate a specific neuron, i.e., find out what the features do for specific images:
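The windowing and max-pooling steps can be sketched in plain Python as follows. This is only an illustration on a tiny 4x4 "image": the function names and the example kernel are assumptions, and, as in most deep learning libraries, the "convolution" here slides the filter without flipping it (strictly speaking, cross-correlation).

```python
def correlate2d(image, kernel):
    """Apply the same filter to every window of the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool(feature_map, size=2):
    """Compress by keeping the maximum of each non-overlapping size x size tile."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[i * size + di][j * size + dj]
                 for di in range(size) for dj in range(size))
             for j in range(cols)]
            for i in range(rows)]

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, -1]]          # hypothetical filter: difference along the diagonal

features = correlate2d(image, kernel)   # 3x3 map, one value per window
pooled = max_pool(image)                # 2x2 map, one value per tile
print(features)
print(pooled)
```

Because the kernel weights are shared across all windows, the number of parameters does not depend on the image size.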

Deep CNNs
CNNs were deep from the start. LeNet, late 1980s:
And they started to grow quickly after the deep learning revolution. VGG:

Inception
Network in network: the small network does not have to be trivial.
Inception: a special network-in-network architecture.
GoogLeNet: extra outputs for the error function from halfway through the model.

ResNet
Residual connections provide the free gradient flow needed for really deep networks.
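A toy scalar illustration of why the skip connection helps: each residual block computes y = x + F(x), so its derivative is 1 + F'(x) and the gradient cannot shrink to zero just because F'(x) is small. The layer F(x) = w * relu(x), the weight 0.01, and the depth 50 are assumptions chosen for illustration; real ResNet blocks use convolutions and normalization.

```python
def relu(x):
    return x if x > 0 else 0.0

def plain_stack(x, w, depth):
    """Return (output, d_output/d_input) for `depth` layers y = w * relu(x)."""
    grad = 1.0
    for _ in range(depth):
        grad *= w if x > 0 else 0.0       # chain rule: multiply by F'(x)
        x = w * relu(x)
    return x, grad

def residual_stack(x, w, depth):
    """Same layers, each wrapped as y = x + F(x) with F(x) = w * relu(x)."""
    grad = 1.0
    for _ in range(depth):
        grad *= 1.0 + (w if x > 0 else 0.0)   # chain rule: multiply by 1 + F'(x)
        x = x + w * relu(x)
    return x, grad

_, plain_grad = plain_stack(1.0, 0.01, 50)
_, residual_grad = residual_stack(1.0, 0.01, 50)
print(plain_grad, residual_grad)
```

With these numbers the plain stack multiplies fifty factors of 0.01, while the residual stack multiplies fifty factors of 1.01, so the gradient stays at a usable scale.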

ResNet led to the revolution of depth

ImageNet
Modern CNNs have hundreds of layers.
They are usually trained on ImageNet, a huge dataset for image classification: >10M images, >1M bounding boxes, all labeled by hand.

Object detection
In practice we also need to know where the objects are.
PASCAL VOC dataset for detection and segmentation:
Relatively small, so recognition models are first trained on ImageNet.

YOLO
YOLO: you only look once; look for bounding boxes and objects in one pass:
YOLO v.2 has recently appeared and is one of the fastest and best object detectors right now.

YOLO
Idea: split the image into an SxS grid.
In each cell, predict both bounding boxes and class probabilities; then simply ...
The CNN architecture in YOLO is standard:
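To make the grid idea concrete, here is a small sketch of how a ground-truth box can be assigned to a grid cell and how large the output tensor is. The parameterization (5 numbers per box: x, y, w, h, confidence) and the values S=7, B=2, C=20 follow the original YOLO paper rather than these slides; the function names are hypothetical.

```python
def yolo_output_shape(S, B, C):
    """Each of the S x S cells predicts B boxes (x, y, w, h, confidence) and C class scores."""
    return (S, S, B * 5 + C)

def encode_box(cx, cy, w, h, img_w, img_h, S):
    """Find the cell containing the box center and express the box relative to it.

    Returns (row, col, (x_offset, y_offset, w_norm, h_norm)), where the offsets are
    measured inside the cell and the width/height are fractions of the image size.
    """
    col = min(S - 1, int(cx / img_w * S))   # clamp so a center on the border stays in range
    row = min(S - 1, int(cy / img_h * S))
    x_off = cx / img_w * S - col
    y_off = cy / img_h * S - row
    return row, col, (x_off, y_off, w / img_w, h / img_h)

shape = yolo_output_shape(S=7, B=2, C=20)
row, col, target = encode_box(cx=320, cy=240, w=128, h=96, img_w=640, img_h=480, S=7)
print(shape, row, col, target)
```

Only the cell that contains the center of an object is responsible for predicting it, which is what lets the whole image be handled in one pass.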

Single Shot Detectors
Further development of this idea: single-shot detectors (SSD).
A single network that predicts several class labels and several corresponding positions for anchor boxes (bounding boxes of several predefined sizes).
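A rough sketch of how such a set of anchor boxes can be laid out: one group of boxes is centered on every cell of a feature map, with widths and heights derived from a scale s and an aspect ratio a as w = s * sqrt(a), h = s / sqrt(a). The function name, the 2x2 feature map, and the particular scales and ratios are assumptions for illustration, not the settings of any published model.

```python
import math

def make_anchors(fmap_size, scales, aspect_ratios):
    """Return anchors as (cx, cy, w, h) in coordinates normalized to [0, 1]."""
    anchors = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            cx = (col + 0.5) / fmap_size     # center of the feature-map cell
            cy = (row + 0.5) / fmap_size
            for s in scales:
                for ar in aspect_ratios:
                    # Same area s*s for every aspect ratio, different shape.
                    anchors.append((cx, cy, s * math.sqrt(ar), s / math.sqrt(ar)))
    return anchors

anchors = make_anchors(fmap_size=2, scales=[0.2], aspect_ratios=[1.0, 2.0])
print(len(anchors))
print(anchors[0], anchors[1])
```

The network then predicts, for every anchor, class scores plus small offsets that adjust the anchor toward the actual object.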

R-CNN
R-CNN: Region-based ConvNet.
Find bounding boxes with some external algorithm (e.g., selective search).
Then extract CNN features (from a CNN trained on ImageNet and fine-tuned on the necessary dataset) and classify.

R-CNN
Visualizing regions of activation for a neuron from a high layer:

Fast R-CNN
But R-CNN has to be trained in several steps (first the CNN, then an SVM on CNN features, then bounding box regressors), training takes very long, and recognition is very slow (47s per image even on a GPU!).
The main reason is that we need to go through the CNN for every region.
Hence, Fast R-CNN uses an RoI (region of interest) projection that collects features from a region.
One pass of the main CNN for the whole image.
Loss = classification error + bounding box regression error.
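The step that collects features from a region can be sketched as RoI max pooling: the region, already projected onto the shared feature map, is split into a fixed grid of bins and the maximum of each bin is kept, so every region yields a feature of the same size. This is a simplified single-channel version with integer coordinates; the function name and the (x0, y0, x1, y1) convention with exclusive upper bounds are assumptions.

```python
import math

def roi_max_pool(feature_map, roi, out_h, out_w):
    """Pool the region roi = (x0, y0, x1, y1) of a 2D feature map to out_h x out_w."""
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Bin boundaries along the rows; neighbouring bins may overlap by one row.
        r_start = y0 + math.floor(i * h / out_h)
        r_end = y0 + math.ceil((i + 1) * h / out_h)
        row = []
        for j in range(out_w):
            c_start = x0 + math.floor(j * w / out_w)
            c_end = x0 + math.ceil((j + 1) * w / out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled

# A 4x4 feature map computed once for the whole image (values 0..15).
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
whole = roi_max_pool(feature_map, (0, 0, 4, 4), 2, 2)
corner = roi_max_pool(feature_map, (1, 1, 4, 4), 2, 2)
print(whole, corner)
```

Both regions reuse the same feature map, which is where the speed-up over running the CNN once per region comes from.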

Faster R-CNN
One more bottleneck left: selective search to choose bounding boxes.
Faster R-CNN embeds it into the network too, with a separate Region Proposal Network.
It evaluates each individual possibility from a set of predefined anchor boxes.
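To train a proposal network, each anchor needs an "object / not object" label, and a common way to assign it is by intersection over union (IoU) with the ground-truth boxes. The sketch below is a simplified version of that labeling: the thresholds 0.7 and 0.3 are the ones reported in the Faster R-CNN paper, not in these slides, and the function names are hypothetical. The paper additionally marks the best-matching anchor of every ground-truth box as positive, which is omitted here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_anchor(anchor, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return 1 (object), 0 (background) or -1 (ignored during training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_thr:
        return 1
    if best < neg_thr:
        return 0
    return -1

anchor = (0, 0, 2, 2)
print(iou(anchor, (1, 1, 3, 3)))
print(label_anchor(anchor, [(0, 0, 2, 2)]),
      label_anchor(anchor, [(1, 1, 3, 3)]),
      label_anchor(anchor, [(0, 0, 2, 3)]))
```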

R-FCN
We can cut the costs even further by getting rid of complicated layers that have to be computed on each region.
R-FCN (Region-based Fully Convolutional Network) cuts the features from the very last layer, immediately before classification.

How they all compare

Mask R-CNN for image segmentation
To get segmentation, just add a pixel-wise output layer.

Synthetic data
But all of this still requires lots and lots of data.
The Neuromation approach: create synthetic data ourselves.
We create a 3D model for each object and render images to train on.

Synthetic data
Synthetic data can have pixel-perfect labeling, something humans can't do.
And it is 100% correct and free.

Transfer learning
Problem: we need to do transfer learning from synthetic images to real ones.
We are successfully solving this problem from both sides.

Next step
Retail Automation Lab needs to scale up synthetic data.
Challenge: 170,000 SKUs in the Russian retail catalogue alone.

Synthetic data for industrial automation
Another great fit for synthetic data: industrial automation.
Self-driving cars, flying drones, industrial robots: labeled data is limited.
Synthetic environments can help.

OUR TEAM:
Maxim Prasolov, CEO
Fedor Savchenko, CTO
Sergey Nikolenko, Chief Research Officer
Denis Popov, Chief Information Officer
Constantine Goltsev, Investor / Chairman
Andrew Rabinovich, Adviser
Yuri Kundin, ICO Compliance Adviser
Esther Katz, VP of Communication
Kiryl Truskovskyi, Lead Researcher
David Orban, Adviser
THANK YOU FOR YOUR ATTENTION!

KNOWLEDGE MINING - A NEW ERA OF DISTRIBUTED COMPUTING THANK YOU! neuromation.io