Instance-aware Semantic Segmentation via Multi-task Network Cascades


1 Instance-aware Semantic Segmentation via Multi-task Network Cascades. Jifeng Dai, Kaiming He, Jian Sun (Microsoft Research, 2016). Yotam Gil, Amit Nativ

2 Agenda: Introduction; Highlights; Implementation; Further improvements; Experiments & Results; Conclusions

3 Introduction. Semantic segmentation: each pixel has a category. Labeling image pixels with both semantic categories and instance indices is a challenging task.

4 Introduction. Classification; Classification + Localization; Object Detection; Instance Segmentation

5 Introduction. This is the output we're looking for: each object is assigned a class and an instance index.

6 Introduction. Existing methods require external mask-proposal modules. They are slow at inference (~30 sec / image for MCG [CVPR 2014] proposals) and take no advantage of deeply learned features.

7 Highlights. First pure CNN-based method for instance segmentation. First place in the MS COCO 2015 segmentation challenge. Fastest CNN-based method for instance segmentation.

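8 Dividing the task into sub-tasks. Decomposition into three sub-tasks: (1) regressing box-level instances, (2) regressing mask-level instances, (3) categorizing instances.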

9 Dividing the task into sub-tasks. The tasks are dependent: each stage consumes the output of the previous one. This network structure is called a Multi-task Network Cascade. Training is done end-to-end, as elaborated next.

10 Dividing the task into sub-tasks. Cascade model (diagram).

11 Task 1: Regressing box-level instances. Region Proposal Network (RPN), based on Faster R-CNN. Input: shared features. Output: the highest-scoring boxes, passed to the next stage in the format $B_i = \{x_i, y_i, w_i, h_i, p_i\}$. Loss function: $L_1 = L_1(B(\theta))$.

12 Task 2: Regressing mask-level instances. Input: shared features and the proposed boxes $\{B_i\}$. Output: $\{M_i\}$, a list of masks, each of size $m^2$ (an $m \times m$ map), taking continuous values in [0, 1]. Each mask is fit by logistic regression to the ground-truth mask. Diagram: shared features and box proposals -> Task 2 -> one $m^2$ mask per proposed box.

13 Task 2: Regressing mask-level instances. Loss function: $L_2 = L_2(M(\theta) \mid B(\theta))$. Region-of-Interest (RoI) pooling is done with a differentiable RoI warping layer to enable end-to-end training.

14 Task 3: Categorizing instances. Input: shared features, boxes (stage 1), and masks (stage 2). Two pathways are concatenated to predict the object class. Box-based pathway: directly uses the RoI-pooled features. Mask-based pathway: masks out the background features, $F_i^{Mask}(\theta) = F_i^{RoI}(\theta) \cdot M_i(\theta)$ (element-wise product).
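The mask-based pathway above can be illustrated with a minimal NumPy sketch. The channel-first layout `(C, m, m)` and the function name `masked_features` are assumptions made for this example only; they are not taken from the slides.

```python
import numpy as np

def masked_features(f_roi, mask):
    """Element-wise product of RoI features and a soft mask.

    f_roi: (C, m, m) RoI features (layout assumed for this sketch).
    mask:  (m, m) mask with values in [0, 1].
    """
    # Broadcast the mask over the channel axis.
    return f_roi * mask[None, :, :]

f_roi = np.ones((2, 3, 3))
mask = np.zeros((3, 3))
mask[1, 1] = 1.0          # only the centre cell is foreground
f_mask = masked_features(f_roi, mask)
```

Background cells are zeroed in every channel, so only features under the predicted mask reach the classifier through this pathway.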

15 Task 3: Categorizing instances. Output: $C = \{C_i\}$, a list of category predictions for all instances. Loss function: $L_3 = L_3(C(\theta) \mid B(\theta), M(\theta))$.

16 End-to-end training. Loss function: $L = L_1 + L_2 + L_3$. Unlike traditional multi-task learning, the loss terms are dependent: a later term depends on the outputs of the earlier stages.

17 End-to-end training. Challenge: applying the chain rule to the loss function. The predicted box determines the spatial transform used in RoI pooling, so gradients must also flow through the box coordinates (unlike R-CNN, for example, where the boxes are fixed).

18 End-to-end training. Solution: perform the cropping and warping operations by bilinear interpolation.

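19 End-to-end training. $F_i^{RoI}(\theta) = G(B_i(\theta))\, F(\theta)$. $G$: cropping and warping, maps a $W \times H$ map to a $W' \times H'$ map; dimensions $n' \times n$, with $n = WH$ and $n' = W'H'$. $F$: full-image feature map, an $n$-dimensional vector. $F_i^{RoI}$: output of RoI warping, an $n'$-dimensional vector. Chain rule: $\frac{\partial L_2}{\partial B_i} = \frac{\partial L_2}{\partial F_i^{RoI}} \frac{\partial G}{\partial B_i} F$.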
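To make the linear map $G$ concrete, here is a hedged NumPy sketch that builds an explicit cropping-and-warping matrix with bilinear weights for a single-channel map. The box convention (centre `x, y` and size `w, h` in feature-map pixels), the sampling grid, and the name `warp_matrix` are assumptions of this sketch, not details confirmed by the slides.

```python
import numpy as np

def warp_matrix(box, in_hw, out_hw):
    """Build G (n' x n): crop `box` from an H x W map and warp it to H' x W'.

    box = (x, y, w, h): box centre and size in feature-map pixels (assumed).
    """
    H, W = in_hw
    Hp, Wp = out_hw
    x, y, w, h = box
    G = np.zeros((Hp * Wp, H * W))
    for vp in range(Hp):
        for up in range(Wp):
            # Input-map location sampled by output cell (vp, up).
            sx = x - w / 2 + (up + 0.5) * w / Wp - 0.5
            sy = y - h / 2 + (vp + 0.5) * h / Hp - 0.5
            # Samples outside the map are clamped to the border.
            sx = min(max(sx, 0.0), W - 1.0)
            sy = min(max(sy, 0.0), H - 1.0)
            x0, y0 = int(np.floor(sx)), int(np.floor(sy))
            x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
            ax, ay = sx - x0, sy - y0
            row = vp * Wp + up
            # Bilinear weights over the four neighbouring input cells.
            G[row, y0 * W + x0] += (1 - ax) * (1 - ay)
            G[row, y0 * W + x1] += ax * (1 - ay)
            G[row, y1 * W + x0] += (1 - ax) * ay
            G[row, y1 * W + x1] += ax * ay
    return G

F = np.arange(16, dtype=float).reshape(4, 4)
G = warp_matrix((2.0, 2.0, 4.0, 4.0), F.shape, (4, 4))
F_roi = (G @ F.ravel()).reshape(4, 4)   # F_roi = G(B) F
```

Because each weight is a piecewise-linear function of the box coordinates, the entries of `G` have well-defined derivatives with respect to the box almost everywhere, which is what lets the chain rule pass through the predicted box.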

20 Further improvements: cascades with more stages. Two more stages are added to obtain a 5-stage cascade: stages 2 and 3 are performed a second time, with box proposals derived from stage 3.

21 Experiments on PASCAL VOC 2012

22 Experiments on PASCAL VOC 2012. Object detection evaluations as a by-product.

23 Experiments on PASCAL VOC 2012

24 Experiments on MS COCO. Using VGG-16 and ResNet. The final result on the test-challenge set is 28.2% / 51.5% (mAP@[.5:.95] / mAP@.5).

25 Experiments MS COCO

26 Conclusions. Contributions: task decomposition; Multi-task Network Cascades (MNCs); solely based on CNNs, without external modules; end-to-end training; fast and accurate. To investigate in the future: the idea of exploiting network cascades in a multi-task learning framework may be useful for other recognition tasks; combining other successful strategies.

27 Multi-scale Patch Aggregation (MPA) for Simultaneous Detection and Segmentation. Shu Liu, Xiaojuan Qi, Jianping Shi, Hong Zhang, Jiaya Jia (The Chinese University of Hong Kong, SenseTime Group Limited). Amit Nativ

28 So what are we talking about? Object recognition; object detection; semantic segmentation; instance-aware semantic segmentation.

29 Previous work [B. Hariharan 2014]. Pipeline: region proposals -> feature extraction (R-CNN) -> region classification -> region refinement.

30 Patch Aggregation Method: The Basic Idea. Find different patches of the same object, find the mask in each one, and combine them in a smart way. Result: instance-aware detection and segmentation.

31 Patch Aggregation Method The Basic Idea

32 Patch Aggregation Method The Basic Idea

33 Patch Aggregation Method: The Basic Idea. Each patch belongs to a different object. Instance-aware segmentation and detection.

34 Network structure. Convolution layers; multi-scale patch generator; multi-class classification branch; segmentation branch.

35 Convolution layers. The convolution layers generate the global feature map: 13 convolution layers interleaved with ReLU and pooling, similar to the layers in the VGG-16 net. The overall downsampling factor is 16.

36 Multi-Scale Patch Generator. In the original image there are 4 different patch sizes: (48x48, 96x96, 192x192, 384x384). Sliding windows with stride 16.

37 Multi-Scale Patch Generator. Different patch scales give different patch grids: image patches (48x48, 96x96, 192x192, 384x384) correspond to cropped feature grids (3x3, 6x6, 12x12, 24x24) on the global feature map.
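The relation between grid sizes and image-space patch sizes follows from the downsampling factor of 16 stated for the convolution layers. A small sketch, assuming each feature cell corresponds to 16 image pixels and that the window moves one feature cell at a time; `crop_grids` is a hypothetical helper introduced for illustration:

```python
import numpy as np

STRIDE = 16                  # downsampling factor of the convolution layers
GRID_SIZES = (3, 6, 12, 24)  # feature-grid side lengths listed on the slide

# Image-space patch side for each grid size.
patch_sizes = [g * STRIDE for g in GRID_SIZES]

def crop_grids(feature_map, grid):
    """Return every grid x grid window of a (C, H, W) feature map.

    The window moves one feature cell at a time, i.e. STRIDE image pixels.
    """
    _, h, w = feature_map.shape
    return [
        feature_map[:, r:r + grid, c:c + grid]
        for r in range(h - grid + 1)
        for c in range(w - grid + 1)
    ]

crops = crop_grids(np.zeros((8, 5, 5)), grid=3)
```

Cropping on the shared feature map rather than on the image means the convolution layers run once per image, regardless of how many patches are generated.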

38 Multi-Scale Patch Generator. Intuitively, we could now analyze each scale separately. Diagram: cropped feature grids from the global feature map feed four sub-networks (Sub Net 3, Sub Net 6, Sub Net 12, Sub Net 24), each producing a mask and a label.

39 Multi-Scale Patch Generator. A better solution is to rescale all patches to the same size. Scale alignment (12x12): low-resolution grids are upsampled (deconv), the 12x12 grid is copied, and the high-resolution grid is downsampled (max pool). A single sub-network (Sub Net 12) then takes the aligned 12x12 grids and produces the mask and the label.
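A rough sketch of the scale-alignment step, under stated assumptions: the learned deconvolution is replaced here by plain nearest-neighbour upsampling (a stand-in, not the slides' method), the layout is `(C, g, g)`, and `align_to_12` is a hypothetical name.

```python
import numpy as np

def align_to_12(grid):
    """Rescale a (C, g, g) feature grid, g in {3, 6, 12, 24}, to (C, 12, 12)."""
    c, g, _ = grid.shape
    if g == 12:
        return grid.copy()                       # copy
    if g == 24:
        # 2x2 max pooling with stride 2 (downsample).
        return grid.reshape(c, 12, 2, 12, 2).max(axis=(2, 4))
    if g in (3, 6):
        # Stand-in for the learned deconvolution: nearest-neighbour upsampling.
        k = 12 // g
        return grid.repeat(k, axis=1).repeat(k, axis=2)
    raise ValueError(f"unsupported grid size: {g}")

big = np.arange(24 * 24, dtype=float).reshape(1, 24, 24)
aligned = align_to_12(big)
```

With every grid brought to 12x12, a single pair of classification and segmentation branches can serve all four scales instead of four separate sub-networks.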

40 Training Sample Selection. Standard criterion: Intersection over Union (IoU) value.

41 Training Sample Selection. Condition 1: the patch center is on an object.

42 Training Sample Selection. Condition 2: at least half of the object is inside the patch.

43 Training Sample Selection. Condition 3: the object size is at least 20% of the patch.

44 Training Sample Selection. Only if all three conditions are met (Condition 1: the patch center is on an object; Condition 2: at least half of the object is inside the patch; Condition 3: the object size is at least 20% of the patch) is the patch POSITIVE: the object's class is assigned to the patch, and its mask is the segmentation target.
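The three conditions can be sketched as a single predicate. This is a minimal illustration under assumptions: the object is given as a boolean instance mask, the patch is a square `(top, left, size)` lying inside the image, and "object size" in condition 3 is read as the object area inside the patch. The name `is_positive_patch` is hypothetical.

```python
import numpy as np

def is_positive_patch(instance_mask, patch):
    """Check the three selection conditions for one object and one patch.

    instance_mask: bool array (H, W), True on the object's pixels.
    patch: (top, left, size) of a square patch inside the image.
    """
    top, left, size = patch
    total = instance_mask.sum()
    if total == 0:
        return False
    inside = instance_mask[top:top + size, left:left + size].sum()
    cy, cx = top + size // 2, left + size // 2
    centre_on_object = bool(instance_mask[cy, cx])      # condition 1
    half_inside = bool(inside >= 0.5 * total)           # condition 2
    big_enough = bool(inside >= 0.2 * size * size)      # condition 3
    return centre_on_object and half_inside and big_enough

obj = np.zeros((20, 20), dtype=bool)
obj[5:15, 5:15] = True
positive = is_positive_patch(obj, (2, 2, 16))
negative = is_positive_patch(obj, (0, 0, 8))
```

A patch that passes would inherit the object's class label and use the object's mask, cropped to the patch, as its segmentation target.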

45 Distinguish individual instances. Due to condition 1, a patch is responsible only for its center object. If objects overlap in a patch, only the label and mask of the object at the center are predicted.

46 Multi-class Classification Branch. Predicts a semantic label for each patch. 2x2 max pooling to reduce complexity, followed by three fully connected layers. Output: the predicted score of patch $P_i$.

47 Segmentation Branch. Segments the object in the patch (one patch, one object).

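48 Training Loss and Strategy. The loss of the classification and segmentation branches: $L(w) = -\sum_i \left[ \log f_c^{l_i}(P_i) + \lambda \, \mathbb{1}(l_i \neq 0) \sum_{j=1}^{N} \log f_s^{j}(P_i) \right]$. The first term is the classification loss; the second is the segmentation loss, which is counted only if the patch belongs to a class label ($l_i \neq 0$).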
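A hedged sketch of the per-patch loss, assuming a negative log-likelihood form: a softmax classification term plus, for patches with a non-background label, a per-pixel log-loss over the mask. The sign convention, the unnormalized sum over the $N$ mask pixels, the binary foreground probabilities, and the default `lam` are all assumptions of this example; `patch_loss` is a hypothetical name.

```python
import numpy as np

def patch_loss(class_probs, label, mask_probs, gt_mask, lam=1.0):
    """Classification + segmentation loss for a single patch.

    class_probs: (K+1,) softmax scores, index 0 = background (assumed).
    label:       ground-truth class index of the patch.
    mask_probs:  (N,) predicted foreground probability per mask pixel.
    gt_mask:     (N,) ground-truth mask with values 0 or 1.
    """
    eps = 1e-12  # guards against log(0)
    loss = -np.log(class_probs[label] + eps)
    if label != 0:
        # Probability assigned to the correct value at each mask pixel.
        p_correct = np.where(gt_mask == 1, mask_probs, 1.0 - mask_probs)
        loss += -lam * np.sum(np.log(p_correct + eps))
    return float(loss)

probs = np.array([0.5, 0.5])
bg_loss = patch_loss(probs, 0, np.array([0.5, 0.5]), np.array([1, 0]))
fg_loss = patch_loss(probs, 1, np.array([0.5, 0.5]), np.array([1, 0]))
```

Background patches contribute only the classification term, so the segmentation branch is trained on positive patches alone.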

49 Patch Aggregation Method. After network prediction, each patch has a semantic label and a mask (one patch, one semantic label). Overlapping patches give overlapping masks; merging the masks optimizes the segmentation.

50 Patch Aggregation Method. How to merge masks. Overlap score: IoU of neighboring masks. Row search: only the left side. Column search: only the top side. Iterate over all patches; the patch pair with the highest IoU is selected and merged. Repeat until the overlap score is less than τ.
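The merging loop can be sketched as a greedy procedure. This is a simplification under assumptions: masks are full-image boolean arrays, every pair is compared (the left-only and top-only neighbor search is omitted), only masks with the same label are merged, and the default `tau` is an arbitrary illustrative value. The names `mask_iou` and `aggregate` are hypothetical.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def aggregate(masks, labels, tau=0.5):
    """Greedily merge the highest-IoU pair until no pair scores >= tau."""
    masks = [m.astype(bool) for m in masks]
    labels = list(labels)
    while True:
        best, pair = tau, None
        for i in range(len(masks)):
            for j in range(i + 1, len(masks)):
                if labels[i] != labels[j]:
                    continue  # assumption: only same-label masks merge
                score = mask_iou(masks[i], masks[j])
                if score >= best:
                    best, pair = score, (i, j)
        if pair is None:
            return masks, labels
        i, j = pair
        masks[i] = np.logical_or(masks[i], masks[j])  # union of the pair
        del masks[j]
        del labels[j]

a = np.zeros((1, 10), dtype=bool); a[0, 0:6] = True
b = np.zeros((1, 10), dtype=bool); b[0, 1:7] = True
c = np.zeros((1, 10), dtype=bool); c[0, 8:10] = True
merged, merged_labels = aggregate([a, b, c], [1, 1, 1], tau=0.5)
```

Each merge removes one mask, so the loop terminates after at most one pass per input mask.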

51 Results. Tested on different image datasets: VOC 2012 segmentation val; VOC 2012 SDS val; Microsoft COCO; VOC 2012 SDS val subset.

52 Results on VOC 2012 segmentation val. 10,582 images in train, 1,449 images in val. The comparison includes a method that is also proposal-free. Results are reported in terms of $mAP^r$ at different IoU thresholds.

53 Results on VOC 2012 SBD val. 5,623 images in train, 5,732 images in val. Also: VOC 2012 SDS val subset.

54 Results on VOC 2012 SBD val. 5,623 images in train, 5,732 images in val.

55 Running-time Analysis. Proposal-based systems take much longer. Single-scale input takes ~2 sec; three-scale input takes ~9 sec. Table groups: methods with region proposals vs. methods without region proposals.

56 Error Analysis. Mis-localization has a strong effect. Error types: localization, class confusion, background detection.

57 Take-Home Message. No region proposals. Patches are used to detect interesting areas. For a patch to be included, it must satisfy three rules. Patches are selected and merged based on mask IoU.

58 Results on MS COCO test-std/test-dev. 120k images in trainval, 20k images in test-std, 20k images in test-dev.

59 QUESTIONS??


Yield Estimation using faster R-CNN Yield Estimation using faster R-CNN 1 Vidhya Sagar, 2 Sailesh J.Jain and 2 Arjun P. 1 Assistant Professor, 2 UG Scholar, Department of Computer Engineering and Science SRM Institute of Science and Technology,Chennai,

More information

arxiv: v1 [cs.cv] 29 Nov 2018

arxiv: v1 [cs.cv] 29 Nov 2018 Grid R-CNN Xin Lu 1 Buyu Li 1 Yuxin Yue 1 Quanquan Li 1 Junjie Yan 1 1 SenseTime Group Limited {luxin,libuyu,yueyuxin,liquanquan,yanjunjie}@sensetime.com arxiv:1811.12030v1 [cs.cv] 29 Nov 2018 Abstract

More information

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015)

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015) AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015) Donggeun Yoo, Sunggyun Park, Joon-Young Lee, Anthony Paek, In So Kweon. State-of-the-art frameworks for object

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha

More information

Martian lava field, NASA, Wikipedia

Martian lava field, NASA, Wikipedia Martian lava field, NASA, Wikipedia Old Man of the Mountain, Franconia, New Hampshire Pareidolia http://smrt.ccel.ca/203/2/6/pareidolia/ Reddit for more : ) https://www.reddit.com/r/pareidolia/top/ Pareidolia

More information

arxiv: v2 [cs.cv] 10 Apr 2017

arxiv: v2 [cs.cv] 10 Apr 2017 Fully Convolutional Instance-aware Semantic Segmentation Yi Li 1,2 Haozhi Qi 2 Jifeng Dai 2 Xiangyang Ji 1 Yichen Wei 2 1 Tsinghua University 2 Microsoft Research Asia {liyi14,xyji}@tsinghua.edu.cn, {v-haoq,jifdai,yichenw}@microsoft.com

More information

Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601

Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601 Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network Nathan Sun CIS601 Introduction Face ID is complicated by alterations to an individual s appearance Beard,

More information

Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies

Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) e-isjn: A4372-3114 Impact Factor: 7.327 Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies Research Article

More information

arxiv: v1 [cs.cv] 16 Nov 2018

arxiv: v1 [cs.cv] 16 Nov 2018 Improving Rotated Text Detection with Rotation Region Proposal Networks Jing Huang 1, Viswanath Sivakumar 1, Mher Mnatsakanyan 1,2 and Guan Pang 1 1 Facebook Inc. 2 University of California, Berkeley November

More information

Semantic Segmentation

Semantic Segmentation Semantic Segmentation UCLA:https://goo.gl/images/I0VTi2 OUTLINE Semantic Segmentation Why? Paper to talk about: Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell,

More information

Pseudo Mask Augmented Object Detection

Pseudo Mask Augmented Object Detection Pseudo Mask Augmented Object Detection Xiangyun Zhao Northwestern University zhaoxiangyun915@gmail.com Shuang Liang Tongji University shuangliang@tongji.edu.cn Yichen Wei Microsoft Research yichenw@microsoft.com

More information

Feature-Fused SSD: Fast Detection for Small Objects

Feature-Fused SSD: Fast Detection for Small Objects Feature-Fused SSD: Fast Detection for Small Objects Guimei Cao, Xuemei Xie, Wenzhe Yang, Quan Liao, Guangming Shi, Jinjian Wu School of Electronic Engineering, Xidian University, China xmxie@mail.xidian.edu.cn

More information

Advanced Video Analysis & Imaging

Advanced Video Analysis & Imaging Advanced Video Analysis & Imaging (5LSH0), Module 09B Machine Learning with Convolutional Neural Networks (CNNs) - Workout Farhad G. Zanjani, Clint Sebastian, Egor Bondarev, Peter H.N. de With ( p.h.n.de.with@tue.nl

More information

Modern Convolutional Object Detectors

Modern Convolutional Object Detectors Modern Convolutional Object Detectors Faster R-CNN, R-FCN, SSD 29 September 2017 Presented by: Kevin Liang Papers Presented Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

More information

Pseudo Mask Augmented Object Detection

Pseudo Mask Augmented Object Detection Pseudo Mask Augmented Object Detection Xiangyun Zhao Northwestern University zhaoxiangyun915@gmail.com Shuang Liang Tongji University shuangliang@tongji.edu.cn Yichen Wei Microsoft Research yichenw@microsoft.com

More information

Todo before next class

Todo before next class Todo before next class Each project group should submit a short project report (4 pages presentation slides) including 1. Problem definition 2. Related work 3. Preliminary results 4. Future plan Submission:

More information

Semantic Soft Segmentation Supplementary Material

Semantic Soft Segmentation Supplementary Material Semantic Soft Segmentation Supplementary Material YAĞIZ AKSOY, MIT CSAIL and ETH Zürich TAE-HYUN OH, MIT CSAIL SYLVAIN PARIS, Adobe Research MARC POLLEFEYS, ETH Zürich and Microsoft WOJCIECH MATUSIK, MIT

More information

Feature Descriptors. CS 510 Lecture #21 April 29 th, 2013

Feature Descriptors. CS 510 Lecture #21 April 29 th, 2013 Feature Descriptors CS 510 Lecture #21 April 29 th, 2013 Programming Assignment #4 Due two weeks from today Any questions? How is it going? Where are we? We have two umbrella schemes for object recognition

More information

Inception and Residual Networks. Hantao Zhang. Deep Learning with Python.

Inception and Residual Networks. Hantao Zhang. Deep Learning with Python. Inception and Residual Networks Hantao Zhang Deep Learning with Python https://en.wikipedia.org/wiki/residual_neural_network Deep Neural Network Progress from Large Scale Visual Recognition Challenge (ILSVRC)

More information

arxiv: v2 [cs.cv] 30 Sep 2018

arxiv: v2 [cs.cv] 30 Sep 2018 A Detection and Segmentation Architecture for Skin Lesion Segmentation on Dermoscopy Images arxiv:1809.03917v2 [cs.cv] 30 Sep 2018 Chengyao Qian, Ting Liu, Hao Jiang, Zhe Wang, Pengfei Wang, Mingxin Guan

More information

Gated Bi-directional CNN for Object Detection

Gated Bi-directional CNN for Object Detection Gated Bi-directional CNN for Object Detection Xingyu Zeng,, Wanli Ouyang, Bin Yang, Junjie Yan, Xiaogang Wang The Chinese University of Hong Kong, Sensetime Group Limited {xyzeng,wlouyang}@ee.cuhk.edu.hk,

More information

Efficient Segmentation-Aided Text Detection For Intelligent Robots

Efficient Segmentation-Aided Text Detection For Intelligent Robots Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related

More information

TEXT SEGMENTATION ON PHOTOREALISTIC IMAGES

TEXT SEGMENTATION ON PHOTOREALISTIC IMAGES TEXT SEGMENTATION ON PHOTOREALISTIC IMAGES Valery Grishkin a, Alexander Ebral b, Nikolai Stepenko c, Jean Sene d Saint Petersburg State University, 7 9 Universitetskaya nab., Saint Petersburg, 199034,

More information

Multi-View 3D Object Detection Network for Autonomous Driving

Multi-View 3D Object Detection Network for Autonomous Driving Multi-View 3D Object Detection Network for Autonomous Driving Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia CVPR 2017 (Spotlight) Presented By: Jason Ku Overview Motivation Dataset Network Architecture

More information

Team G-RMI: Google Research & Machine Intelligence

Team G-RMI: Google Research & Machine Intelligence Team G-RMI: Google Research & Machine Intelligence Alireza Fathi (alirezafathi@google.com) Nori Kanazawa, Kai Yang, George Papandreou, Tyler Zhu, Jonathan Huang, Vivek Rathod, Chen Sun, Kevin Murphy, et

More information

Places Challenge 2017

Places Challenge 2017 Places Challenge 2017 Scene Parsing Task CASIA_IVA_JD Jun Fu, Jing Liu, Longteng Guo, Haijie Tian, Fei Liu, Hanqing Lu Yong Li, Yongjun Bao, Weipeng Yan National Laboratory of Pattern Recognition, Institute

More information

MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection

MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection ILSVRC 2016 Object Detection from Video Byungjae Lee¹, Songguo Jin¹, Enkhbayar Erdenee¹, Mi Young Nam², Young Gui Jung², Phill Kyu

More information

Fast scene understanding and prediction for autonomous platforms. Bert De Brabandere, KU Leuven, October 2017

Fast scene understanding and prediction for autonomous platforms. Bert De Brabandere, KU Leuven, October 2017 Fast scene understanding and prediction for autonomous platforms Bert De Brabandere, KU Leuven, October 2017 Who am I? MSc in Electrical Engineering at KU Leuven, Belgium Last year PhD student with Luc

More information

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK. Wenjie Guan, YueXian Zou*, Xiaoqun Zhou

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK. Wenjie Guan, YueXian Zou*, Xiaoqun Zhou MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK Wenjie Guan, YueXian Zou*, Xiaoqun Zhou ADSPLAB/Intelligent Lab, School of ECE, Peking University, Shenzhen,518055, China

More information