Convolutional Neural Networks: Applications and a short timeline. 7th Deep Learning Meetup Kornel Kis Vienna,

Transcription:

Convolutional Neural Networks: Applications and a short timeline 7th Deep Learning Meetup Kornel Kis Vienna, 1.12.2016.

Introduction
- Currently a master's student
- Master's thesis at BME SmartLab
- Started deep learning 2.5 years ago

Deep multilayer perceptrons (feedforward)
- New idea: backpropagation through time -> LSTM, GRU, etc. (recurrent)
- New idea: input is highly redundant -> ConvNets (feedforward)

The naive approach
- Let's apply MLP networks to real-world images
- CIFAR-10 database: 32*32 pixels with 3 color channels
- 32*32*3 = 3072 inputs!
- Does not scale with image size
- Really inefficient knowledge representation
- MLPs work well with de-correlated input
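The scaling problem can be made concrete with a little arithmetic. The sketch below counts the weights and biases of a single fully connected layer placed directly on the pixels. The hidden-layer width of 1000 units and the larger image sizes are assumptions chosen for illustration; they are not from the talk.

```python
def image_inputs(width: int, height: int, channels: int = 3) -> int:
    """Number of input values when an image is flattened for an MLP."""
    return width * height * channels


def dense_layer_params(n_inputs: int, n_units: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


HIDDEN_UNITS = 1000  # hypothetical layer width, chosen only for illustration

for side in (32, 224, 1024):  # 32 is CIFAR-10; the other sizes are assumptions
    n_in = image_inputs(side, side)
    n_params = dense_layer_params(n_in, HIDDEN_UNITS)
    print(f"{side}x{side}x3 image: {n_in} inputs, {n_params} first-layer parameters")
```

The parameter count grows with the number of pixels, so a layer that is manageable for CIFAR-10 becomes very large for ordinary photographs.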

Key Idea
- Information in real-world images is highly redundant
- Learn patterns, not data
- Feature engineering -> feature learning

ConvNets
- Feedforward networks
- Use different types of layers:
  - Convolutional layer
  - Pooling layer
  - ReLU layer
  - (Sometimes) FC layer
  - Etc.

Convolutional Layer
- Each unit is connected only to a small part of the image: the receptive field of the neuron
- Weights are shared within a depth slice: every unit in a slice uses the same weights
- Drastically reduces parameter count
- A 3D volume of neurons: width, height, depth
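To see how much weight sharing saves, the sketch below counts parameters for one convolutional layer with and without sharing. The layer settings (5x5 filters, 16 depth slices, stride 1, padding 2) are assumptions for illustration. The output-size formula is the standard (W - F + 2P) / S + 1.

```python
def conv_output_size(input_size: int, filter_size: int,
                     stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    span = input_size - filter_size + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not tile the input with these settings")
    return span // stride + 1


def shared_params(filter_size: int, input_depth: int, n_filters: int) -> int:
    """One filter (plus one bias) per depth slice, reused at every position."""
    return filter_size * filter_size * input_depth * n_filters + n_filters


def unshared_params(out_size: int, filter_size: int,
                    input_depth: int, n_filters: int) -> int:
    """Same connectivity, but every unit keeps its own weights and bias."""
    per_unit = filter_size * filter_size * input_depth + 1
    return out_size * out_size * n_filters * per_unit


# Hypothetical layer on a CIFAR-10 image (32x32x3).
out = conv_output_size(32, filter_size=5, stride=1, padding=2)
print(out)                          # 32
print(shared_params(5, 3, 16))      # 1216
print(unshared_params(out, 5, 3, 16))  # 1245184
```

With these assumed settings, sharing weights inside each depth slice reduces the count by roughly three orders of magnitude.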

Convolutional Layer: depth slices (figure showing the input and output volumes). Image source: http://cs231n.github.io/convolutional-networks/

Pooling layer
- Downsampling -> further control over the number of parameters in later layers
- Types: max-pooling, average-pooling, etc.
- Changes the spatial structure, but adds no parameters of its own
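A minimal max-pooling sketch in plain Python, assuming a single 2D feature map stored as a list of rows and a 2x2 window with stride 2 (a common choice, but an assumption here):

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window of a 2D feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(max_pool_2d(feature_map))  # [[6, 5], [7, 9]]
```

The 4x4 map shrinks to 2x2, and the function has nothing to learn: there are no weights anywhere in it.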

ReLU layer
- Applies a non-linear function, max(0, x), to every element
- Changes neither the parameter count nor the structure
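As a small illustration, ReLU on a 2D feature map (stored here as a list of rows, which is an assumption of this sketch) looks like this:

```python
def relu(feature_map):
    """Replace every negative activation with zero; the shape is unchanged."""
    return [[max(0.0, value) for value in row] for row in feature_map]


activations = [[-1.5, 2.0], [0.0, -0.3]]
print(relu(activations))  # [[0.0, 2.0], [0.0, 0.0]]
```

The output has the same width and height as the input, and the function contains no weights.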

Important ConvNet architectures
- AlexNet (2012): 8 layers
- VGG (2014): 16/19 layers
- GoogLeNet (2014-): 22 layers (Inception v1)
- ResNet (2015-): 152 layers!
- Inception-ResNet (2016) (Inception v4)
- What about vanishing gradients?

GoogLeNet (figure: network diagram with one input and three outputs, Output1 to Output3). Image source: https://indico.io/blog/wp-content/uploads/2015/08/googlenet.png

ResNet
- Not just stacking layers on top of each other
- (Figure: a normal block compared with a residual block)
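The difference between a normal and a residual block can be sketched in a few lines. This is a simplified illustration, not the ResNet implementation: small dense layers stand in for the convolutional layers, and the helper names are hypothetical. A residual block adds its input back onto the output of its layers, y = relu(F(x) + x).

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def linear(weights, vector):
    """Matrix-vector product; a stand-in for a convolutional layer."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def plain_block(x, w1, w2):
    """Two stacked layers: y = relu(W2 * relu(W1 * x))."""
    return relu(linear(w2, relu(linear(w1, x))))


def residual_block(x, w1, w2):
    """Same layers plus a skip connection: y = relu(F(x) + x)."""
    fx = linear(w2, relu(linear(w1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]     # layers that have learned nothing
print(plain_block(x, zeros, zeros))     # [0.0, 0.0]
print(residual_block(x, zeros, zeros))  # [1.0, 2.0]
```

With all-zero weights the plain block loses the signal, while the residual block passes its input through unchanged. This identity path is one common explanation for why very deep residual networks remain trainable.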

A brain analogy (hypothesis). Image source: https://figshare.com/articles/ventral_visual_stream/106794 (Jonas Kubilius)

Spiking neural networks
- More similar to real neurons
- Neurons do not fire at each cycle, only after reaching some threshold
- Output information is carried by the frequency or timing of spikes
- Possible applications (?): direct modeling of the nervous system
- Not yet widespread due to hardware constraints and stability problems
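The threshold behaviour can be illustrated with a leaky integrate-and-fire neuron, one simple and widely used spiking model. The talk does not name a specific model, so the choice of model and the constants (threshold 1.0, leak factor 0.9) are assumptions of this sketch.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Leaky integrate-and-fire: accumulate input, spike at the threshold."""
    potential = reset
    spikes = []
    for current in input_current:
        potential = leak * potential + current  # leak, then integrate
        if potential >= threshold:
            spikes.append(1)       # fire ...
            potential = reset      # ... and start over
        else:
            spikes.append(0)       # stay silent this cycle
    return spikes


weak = simulate_lif([0.3] * 10)
strong = simulate_lif([0.6] * 10)
print(weak)    # [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
print(strong)  # [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
```

The neuron is silent on most cycles, and a stronger input produces a higher firing frequency, which is how the output carries information.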

Applications
- "Features matter." [Girshick et al. 2014]
- Meaning: better feature learning directly yields better overall performance
- Frameworks are being developed in many areas that can use any deep ConvNet as the feature learner
- Wide range of applications (images, videos, speech technology, etc.)

General approach
- Networks are rarely trained from scratch
- Pre-trained weights are often available for each network
- Fine-tuning (transfer learning) for the specific application
- Only the FC layers (the classifier) are re-trained from random initialization
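The idea of keeping the pre-trained part fixed and training only a new classifier can be sketched without any deep learning library. Everything below is a toy stand-in: `frozen_features` plays the role of a pre-trained ConvNet body, the classifier is a single logistic unit, and the four data points are made up for illustration.

```python
import math
import random


def frozen_features(x):
    """Stand-in for a pre-trained network body; it is never updated."""
    return [x[0] + x[1], x[0] - x[1]]


def predict(head, features):
    """Logistic classifier on top of the frozen features."""
    z = sum(w * f for w, f in zip(head["w"], features)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune(data, epochs=200, learning_rate=0.5, seed=0):
    """Train only the classifier head, starting from random weights."""
    rng = random.Random(seed)
    head = {"w": [rng.uniform(-0.1, 0.1) for _ in range(2)], "b": 0.0}
    for _ in range(epochs):
        for x, label in data:
            features = frozen_features(x)
            error = predict(head, features) - label  # log-loss gradient w.r.t. z
            head["w"] = [w - learning_rate * error * f
                         for w, f in zip(head["w"], features)]
            head["b"] -= learning_rate * error
    return head


# Made-up two-class toy data: (input, label).
TOY_DATA = [([0.0, 0.0], 0), ([0.2, 0.1], 0), ([1.0, 1.0], 1), ([0.9, 0.8], 1)]

head = fine_tune(TOY_DATA)
for x, label in TOY_DATA:
    print(x, label, round(predict(head, frozen_features(x)), 2))
```

In a real setting the frozen part would be a network such as ResNet loaded with published weights, and only the final fully connected layers would receive gradient updates.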

A brief example: object detection
- Problem: finding multiple objects in an image
- Output: bounding boxes, each given by 4 coordinates and a detection probability
- Massive improvements in performance in recent years, thanks to ConvNets
- Frameworks include R-CNN, Fast R-CNN, Faster R-CNN, R-FCN
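A detector's output can be represented as a small record per box. The sketch below assumes corner coordinates (x_min, y_min, x_max, y_max) and adds intersection over union (IoU), a standard measure for comparing a predicted box with a reference box. The talk does not specify the coordinate format or mention IoU, so both are assumptions here.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object: 4 box coordinates plus a detection probability."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float

    def area(self) -> float:
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two boxes, between 0 and 1."""
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    intersection = inter_w * inter_h
    union = a.area() + b.area() - intersection
    return intersection / union if union > 0 else 0.0


predicted = Detection(0.0, 0.0, 2.0, 2.0, score=0.9)
reference = Detection(1.0, 1.0, 3.0, 3.0, score=1.0)
print(round(iou(predicted, reference), 3))  # 0.143
```

The two boxes overlap in a 1x1 square and together cover an area of 7, giving an IoU of 1/7.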

R-FCN / Faster R-CNN example
Image sources:
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. Deep Residual Learning for Image Recognition. arXiv 2015.
- Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.

R-FCN
- Latest generation of object detection frameworks
- Based on Faster R-CNN
- Regression to bounding box coordinates + object classification inside the bounding box
- Uses an arbitrary ConvNet (most often ResNet) as backend
- Surpasses human-level performance
- Detection works very close to real-time (!)
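Bounding box regression in the R-CNN family usually does not predict coordinates directly. Instead, the network predicts four offsets relative to a reference box (an anchor or proposal). The sketch below decodes such offsets using the parameterization described in the Faster R-CNN paper: t_x = (x - x_a) / w_a, t_y = (y - y_a) / h_a, t_w = log(w / w_a), t_h = log(h / h_a). Whether a given R-FCN implementation uses exactly this form is an assumption, and the function name is hypothetical.

```python
import math


def decode_box(anchor, deltas):
    """Turn predicted offsets into corner coordinates.

    anchor: (center_x, center_y, width, height) of the reference box
    deltas: (t_x, t_y, t_w, t_h) predicted by the regression branch
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deltas
    x = xa + tx * wa           # shift the center by a fraction of the anchor size
    y = ya + ty * ha
    w = wa * math.exp(tw)      # scale width and height in log space
    h = ha * math.exp(th)
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)


anchor = (10.0, 10.0, 4.0, 4.0)
print(decode_box(anchor, (0.0, 0.0, 0.0, 0.0)))  # (8.0, 8.0, 12.0, 12.0)
print(decode_box(anchor, (0.5, 0.0, math.log(2.0), 0.0)))
```

Zero offsets reproduce the anchor itself. The second call moves the center two units to the right and doubles the width.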

Our current project
- Skin cancer image detection/classification
- Huge database of medical images
- Using Faster R-CNN (currently)
- Image labeling/preprocessing is necessary
- Related work: [Liao, Haofu. "A Deep Learning Approach to Universal Skin Disease Classification."]
- Already surpassing human expert performance on a much smaller dataset

Challenges
- Huge unlabeled dataset
- Many preprocessing steps
- Pre-trained models may not be applicable

Thank you for your attention! Image source: https://www.carmudi.ae/journal/wp-content/uploads/2015/06/google-self-driving-car.jpg