CS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN

Similar documents
Object detection with CNNs

Yiqi Yan. May 10, 2017

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Lecture 5: Object Detection

Spatial Localization and Detection. Lecture 8-1

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Object Detection Based on Deep Learning

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation

Deep Learning for Object detection & localization

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015)

CS 1674: Intro to Computer Vision. Object Recognition. Prof. Adriana Kovashka University of Pittsburgh April 3, 5, 2018

SSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

OBJECT DETECTION HYUNG IL KOO

Object Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018

Yield Estimation using faster R-CNN

RON: Reverse Connection with Objectness Prior Networks for Object Detection

YOLO9000: Better, Faster, Stronger

Optimizing Object Detection:

R-FCN: Object Detection with Really - Friggin Convolutional Networks

Mask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma

Unified, real-time object detection

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm

Object Detection. Computer Vision Yuliang Zou, Virginia Tech. Many slides from D. Hoiem, J. Hays, J. Johnson, R. Girshick

Rich feature hierarchies for accurate object detection and semantic segmentation

Object Detection on Self-Driving Cars in China. Lingyun Li

Martian lava field, NASA, Wikipedia

Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018

Efficient Segmentation-Aided Text Detection For Intelligent Robots

Modern Convolutional Object Detectors

Object Detection. Part1. Presenter: Dae-Yong

Computer Vision Lecture 16

Cascade Region Regression for Robust Object Detection

G-CNN: an Iterative Grid Based Object Detector

DEEP NEURAL NETWORKS FOR OBJECT DETECTION

MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection

Photo OCR ( )

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech

Rich feature hierarchies for accurate object detection and semantic segmentation

YOLO: You Only Look Once Unified Real-Time Object Detection. Presenter: Liyang Zhong Quan Zou

Computer Vision Lecture 16

Visual features detection based on deep neural network in autonomous driving tasks

Optimizing Object Detection:

Project 3 Q&A. Jonathan Krause

Why Is Recognition Hard? Object Recognizer

Abandoned Luggage Detection

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Real-time Object Detection CS 229 Course Project

Object Detection with YOLO on Artwork Dataset

arxiv: v1 [cs.cv] 26 Jun 2017

Towards Real-Time Automatic Number Plate. Detection: Dots in the Search Space

Final Report: Smart Trash Net: Waste Localization and Classification

Hybrid Learning Framework for Large-Scale Web Image Annotation and Localization

Rich feature hierarchies for accurate object detection and semant

Fine-tuning Pre-trained Large Scaled ImageNet model on smaller dataset for Detection task

Deformable Part Models

Feature Fusion for Scene Text Detection

SSD: Single Shot MultiBox Detector

R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS

Object Detection in Sports Videos

Regionlet Object Detector with Hand-crafted and CNN Feature

Self Driving. DNN * * Reinforcement * Unsupervised *

arxiv: v1 [cs.cv] 4 Jun 2015

Improving Small Object Detection

Feature-Fused SSD: Fast Detection for Small Objects

Pixel Offset Regression (POR) for Single-shot Instance Segmentation

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. Deepak Pathak, Philipp Krähenbühl and Trevor Darrell

YOLO 9000 TAEWAN KIM

REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION

arxiv: v1 [cs.cv] 18 Jun 2017

Object Recognition and Detection

Automatic detection of books based on Faster R-CNN

Instance-aware Semantic Segmentation via Multi-task Network Cascades

Optimizing CNN-based Object Detection Algorithms on Embedded FPGA Platforms

arxiv: v2 [cs.cv] 18 Jul 2017

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK. Wenjie Guan, YueXian Zou*, Xiaoqun Zhou

arxiv: v2 [cs.cv] 24 Mar 2019

Kaggle Data Science Bowl 2017 Technical Report

arxiv: v3 [cs.cv] 29 Mar 2017

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies

Detecting cars in aerial photographs with a hierarchy of deconvolution nets

Detection Method of Insulator Based on Single Shot MultiBox Detector

Classification of objects from Video Data (Group 30)

THE goal of action detection is to detect every occurrence

Joint Object Detection and Viewpoint Estimation using CNN features

G-CNN: an Iterative Grid Based Object Detector

arxiv: v2 [cs.cv] 30 Mar 2016

arxiv: v2 [cs.cv] 23 Jan 2019

arxiv: v1 [cs.cv] 15 Oct 2018

Fully Convolutional Networks for Semantic Segmentation

arxiv: v1 [cs.cv] 7 Nov 2015

Hand Detection For Grab-and-Go Groceries

arxiv: v3 [cs.cv] 15 Sep 2018

DPATCH: An Adversarial Patch Attack on Object Detectors

Scene Text Recognition for Augmented Reality. Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science

Transcription:

CS6501: Deep Learning for Visual Recognition Object Detection I: RCNN, Fast-RCNN, Faster-RCNN

Today s Class Object Detection The RCNN Object Detector (2014) The Fast RCNN Object Detector (2015) The Faster RCNN Object Detector (2016)

Object Detection deer cat

Object Detection as Classification CNN deer? cat? background?

Object Detection as Classification CNN deer? cat? background?

Object Detection as Classification CNN deer? cat? background?

Object Detection as Classification with Sliding Window CNN deer? cat? background?

Object Detection as Classification with Box Proposals

RCNN https://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf Rich feature hierarchies for accurate object detection and semantic segmentation. Girshick et al. CVPR 2014.

RCNN First stage: generate categoryindependent region proposals. 2000 Region proposals for every image Selective Search: combine the strength of both an exhaustive search and segmentation. Uijlings et al. IJCV 2013. ref

RCNN First stage: generate categoryindependent region proposals. 2000 Region proposals for every image Second stage: extracts a fixed-length feature vector from each region. a 4096-dimensional feature vector from each region proposal warp CNN feature vector Arbitrary rectangles? A fixed size input? 227 x 227 5 conv layers + 2 fully connected layers

RCNN First stage: generate categoryindependent region proposals. 2000 Region proposals for every image Second stage: extracts a fixed-length feature vector from each region. a 4096-dimensional feature vector from each region proposal Third stage: a set of class- specific linear SVMs. object category and location feature vector proposal location linear svm Bounding box regression people? horse? background? x y w h

RCNN Fast-RCNN Simple and scalable. improves map. A multistage pipeline. Training is expensive in space and time (features are extracted from each region proposal in each image and written into disk). Object detection is slow.?

Fast-RCNN https://arxiv.org/abs/1504.08083 Fast R-CNN. Girshick. ICCV 2015. Idea: No need to recompute features for every box independently

Fast-RCNN Process the whole image with several convolutional (conv) and max pooling layers to produce a conv feature map. a region of interest (RoI) pooling layer extracts a fixed-length feature vector from the region feature map. feature vector FC+ softmax K + 1 categories + FC+ regressor four real-valued numbers for each of the K object classes.

RCNN Simple and scalable. improves map. A multistage pipeline. Training is expensive in space and time (features are extracted from each region proposal in each image and written into disk). Object detection is slow. Fast-RCNN Higher map. Single stage, end-to-end training. No disk storage is required for feature caching. proposals are the computational bottleneck in detection systems. Faster-RCNN?

Faster-RCNN Idea: Integrate the Bounding Box Pro posals as part of the CNN predictions https://arxiv.org/abs/1506.01497 Ren et al. NIPS 2015.

Faster-RCNN Region Proposal Networks: 2k scores 4k coordinates k anchors boxes cls layer object or not object 1x1 conv layer bounding box proposal 1x1 conv layer reg layer RPN nxn conv layer Shared conv layers Fast-RCNN feature map sliding window, nxn

RCNN Simple and scalable. improves map. A multistage pipeline. Training is expensive in space and time (features are extracted from each region proposal in each image and written into disk). Object detection is slow. Fast-RCNN Higher map. Single stage, end-to-end training. No disk storage is required for feature caching. proposals are the computational bottleneck in detection systems. Faster-RCNN compute proposals with a deep convolutional neural network --Region Proposal Network (RPN) merge RPN and Fast R-CNN into a single network, enabling nearly cost-free region proposals.?

YOLO- You Only Look Once Idea: No bounding box proposal. A single regression problem, stra ight from image pixels to boundi ng box coordinates and class pro babilities. extremely fast reason globally learn generalizable represent ations https://arxiv.org/abs/1506.02640 Redmon et al. CVPR 2016.

YOLO- You Only Look Once Divide the image into 7x7 cells. Each cell trains a detector. The detector needs to predict the object s class distributions. The detector has 2 bounding-box predictors to predict bounding-boxes and confidence scores.

SSD: Single Shot Detector Idea: Similar to YOLO, but denser grid map, multiscale grid maps. + Data augme ntation + Hard negative mining + Other design choices in the network. Liu et al. ECCV 2016.

Questions?