Object Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018


1 Object Detection TA : Young-geun Kim Biostatistics Lab., Seoul National University March-June, 2018 Seoul National University Deep Learning March-June, / 57

2 Index
1. Introduction
2. R-CNN
3. YOLO
4. Evaluation

3 Introduction

4 Introduction
In this session, we will learn about:
- The object detection problem.
- Regions with CNN features (R-CNN), a region proposal-based model.
- You-Only-Look-Once (YOLO), a unified model.
- Evaluation metrics for detection models.

5 Introduction: What is Object Detection?
Object detection is the task of finding where objects are and what they are (localization + classification). It is an integral part of many vision applications, such as automated driving systems, face detection, and object counting.
Figure: from (YOLO v3 clip).

6 Introduction: What is Object Detection? (cont.)
For a given image, the detector must output predicted regions together with class confidences. Each region is expressed as a rectangle called a bounding box. The number of objects in the image is not given in advance.
Figure: from Ren et al.

7 Introduction: What is Object Detection? (cont.)
The exact region (bounding box) of each object is called its ground-truth (GT) box: the minimal rectangle containing the whole object. A region is parameterized by (x, y, w, h), where (x, y) is the coordinate of the top-left (or center) point, w is the width, and h is the height of the bounding box. Intersection over Union (IoU), the ratio of the intersection area to the union area of the predicted region and the GT box, measures localization accuracy.
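As an illustration of the IoU definition above, here is a minimal Python sketch. It assumes boxes are given as (x, y, w, h) with (x, y) the top-left corner; the function name `iou` is a hypothetical helper, not something from the slides.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x, y, w, h), top-left origin (assumed)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes offset by (5, 5): intersection 25, union 175.
print(iou((0, 0, 10, 10), (5, 5, 10, 10)))
```

An IoU of 1 means the predicted region coincides with the GT box, and 0 means the two do not overlap at all.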

8 Introduction: What is Object Detection? (cont.)
There are various types of objects. For example, the VOC challenge requires detecting the following 20 object classes.

Middle level   Low level
Person         person
Animal         bird, cat, cow, dog, horse, sheep
Vehicle        aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor         bottle, chair, dining table, potted plant, sofa, tv/monitor

Mean Average Precision (mAP), an estimator of the area under the precision-recall curve (AUCPR), is the usual measure of classification accuracy.

9 Introduction: What is Object Detection? (cont.)
An object is considered detected if the detector outputs at least one region, with a predicted label, that satisfies both of the following conditions.
Condition 1: the region overlaps highly with the GT box of the object.
Condition 2: the region is correctly classified.

10 Introduction: Challenges
Infinitely imbalanced structure: background (BG) is the majority class. There are few positive regions (GT) and infinitely many negative regions (BG).

          Predicted N   Predicted P   Total
True N    TN            FP            # of BG
True P    FN            TP            # of GT
Table: The confusion matrix of object detection.

Under this structure, accuracy on the positive class suffers severely. In other words, merely finding the position of an object is already difficult.

11 Introduction: Challenges (cont.)
Dynamic scale: objects vary widely in shape. Some are tiny or huge, and some are elongated horizontally or vertically.
Figure: from VOC2012.
The model must therefore recognize regions at many scales.

12 Introduction: Challenges (cont.)
Multi-task: finding the position of an object (localization) and classifying it (classification) are each difficult, and object detection requires doing both simultaneously. In practice, the test time of a detection model should be short, which is challenging given the difficulty of the task.

13 Introduction: Approaches
- Pre-deep-learning approaches (not covered here; see "50 years of object recognition: Directions forward").
- Regions with CNN features (R-CNN), a region proposal-based model.
- You-Only-Look-Once (YOLO), a unified model.

14 R-CNN

15 R-CNN: Regions with CNN features
Regions with CNN features (R-CNN) is a region proposal-based model (Girshick et al., 2014). R-CNN selects regions using Selective Search (Uijlings et al., 2013), warps them to a common size, and extracts CNN features on which class-specific SVMs are trained.
Figure: from Girshick et al.

16 R-CNN: Selective Search
Selective Search (SS) is a hierarchical grouping algorithm that operates on a set of regions. Given an initial set of regions, SS greedily merges the most similar neighbouring regions. The similarity measure is a sum over a subset of four similarities: colour, texture, size, and fill.
Figure: from Uijlings et al.

17 R-CNN: Selective Search (cont.)
By considering several features starting from fine-level regions, SS distinguishes objects and captures their hierarchical structure. The initial regions come from a graph-based segmentation algorithm (Felzenszwalb and Huttenlocher, 2004) whose time complexity is nearly linear in the number of pixels.

18 R-CNN: Detection Network
Proposed regions pass through a CNN followed by a classifier and a bounding-box (bbox) regressor. AlexNet (Krizhevsky et al., 2012) is used, with its final FC layer replaced to match the number of classes, including BG. After fine-tuning AlexNet, class-specific linear SVMs and a bbox regressor (Felzenszwalb et al., 2010) are trained on the extracted features. The bbox regressor predicts the (x, y, w, h) of the GT box, and its prediction is used to adjust the proposed region.

19 R-CNN: Limitation of R-CNN
R-CNN is a multi-stage pipeline: it requires fine-tuning the CNN, then training multiple SVMs and a bbox regressor. Training the SVMs and the bbox regressor requires a forward pass for every region in every image and storing all extracted features. For the same reason, test time is long: detection takes 47 seconds for a single image.

20 R-CNN: Spatial Pyramid Pooling Network
Running a forward pass for every proposed region is time-consuming. Spatial Pyramid Pooling (SPP; He et al., 2014) models the spatial connectivity between the image and the feature map and converts regions of varying size into a fixed size. For ordinary CNNs, $\text{warp} \circ \text{conv} \neq \text{conv} \circ \text{warp}$, since there is no spatial connectivity between the raw image and the extracted features.
Figure: from He et al.

21 R-CNN: Spatial Pyramid Pooling Network (cont.)
The SPP network learns the spatial connectivity while still capturing semantic content. SPP reduces the computational cost, but the SPP network is still a multi-stage pipeline.
Figure: from He et al.

22 R-CNN: Fast R-CNN
Fast R-CNN (Girshick, 2015) is a variant of R-CNN that applies Region of Interest (RoI) pooling, a kind of SPP. Training is single-stage thanks to a multi-task loss, which allows all weights to be updated simultaneously.
Figure: from Girshick.

23 R-CNN: Region of Interest Pooling
RoI pooling connects the raw image to the final extracted feature map, just before the FC layers. The resulting RoI feature vector is passed to two sibling FC layers.
Figure: from Girshick.

24 R-CNN: Region of Interest Pooling (cont.)
In contrast to ordinary max-pooling, RoI pooling has a dynamic filter size: the pooling window depends on the size of the region. Backpropagation through RoI pooling requires the activated (argmax) positions for each region.
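The following NumPy sketch illustrates both points, under stated assumptions: a single-channel feature map, a region given as (x0, y0, x1, y1) in feature-map coordinates with exclusive upper bounds, and a hypothetical helper name `roi_max_pool`. It is an illustration of the idea, not the implementation used in Fast R-CNN.

```python
import numpy as np


def roi_max_pool(feature, roi, out_h, out_w):
    """Max-pool one RoI of a 2-D feature map into a fixed out_h x out_w grid.

    roi = (x0, y0, x1, y1) in feature-map coordinates, upper bounds exclusive (assumed).
    Returns the pooled values and the argmax positions needed for backpropagation.
    """
    x0, y0, x1, y1 = roi
    ys = np.linspace(y0, y1, out_h + 1)  # bin edges along the height
    xs = np.linspace(x0, x1, out_w + 1)  # bin edges along the width
    pooled = np.empty((out_h, out_w), dtype=feature.dtype)
    argmax = np.empty((out_h, out_w, 2), dtype=int)
    for i in range(out_h):
        for j in range(out_w):
            # The window size depends on the RoI size: this is the "dynamic filter".
            ya, yb = int(np.floor(ys[i])), int(np.ceil(ys[i + 1]))
            xa, xb = int(np.floor(xs[j])), int(np.ceil(xs[j + 1]))
            window = feature[ya:yb, xa:xb]
            r, c = np.unravel_index(np.argmax(window), window.shape)
            pooled[i, j] = window[r, c]
            argmax[i, j] = (ya + r, xa + c)  # position that receives the gradient
    return pooled, argmax


feature = np.arange(16).reshape(4, 4)
pooled, argmax = roi_max_pool(feature, (0, 0, 4, 4), 2, 2)
print(pooled)
```

In the backward pass, the gradient of each pooled output would be routed only to the stored argmax position, which is why those positions must be kept per region.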

25 R-CNN: Multi-task Loss
For a given region $(x^r, y^r, w^r, h^r)$ in an image, Fast R-CNN computes $p$ and $t^k = (t^k_x, t^k_y, t^k_w, t^k_h)$, the predicted class-probability vector and the predicted location for class $k$, parameterized as follows.
$$t^k_x = (x^k - x^r)/w^r, \quad t^k_y = (y^k - y^r)/h^r, \quad t^k_w = \log(w^k/w^r), \quad t^k_h = \log(h^k/h^r)$$
Let $u$ and $v$ be the true class and the location of the GT box corresponding to the given region. $v$ is parameterized in the same way, by substituting the $(x, y, w, h)$ of the GT box.
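A small Python sketch of this parameterization and its inverse may help. The names `encode` and `decode` are hypothetical, and boxes are assumed to be (x, y, w, h) tuples with positive width and height.

```python
import math


def encode(box, region):
    """Map a box (x, y, w, h) to offsets (tx, ty, tw, th) relative to a region."""
    x, y, w, h = box
    xr, yr, wr, hr = region
    return ((x - xr) / wr, (y - yr) / hr, math.log(w / wr), math.log(h / hr))


def decode(t, region):
    """Inverse of encode: recover (x, y, w, h) from offsets and the region."""
    tx, ty, tw, th = t
    xr, yr, wr, hr = region
    return (xr + tx * wr, yr + ty * hr, wr * math.exp(tw), hr * math.exp(th))


region = (10.0, 10.0, 20.0, 20.0)
gt_box = (12.0, 8.0, 20.0, 40.0)
v = encode(gt_box, region)  # regression target for this region
print(v)
print(decode(v, region))
```

Because the offsets are normalized by the region's width and height and the sizes are compared on a log scale, the targets stay on a similar scale for small and large regions.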

26 R-CNN: Multi-task Loss (cont.)
To train the multi-task model, the loss function is designed as
$$L(p, u, t^u, v) = L_{cls}(p, u) + \lambda [u \geq 1] L_{loc}(t^u, v)$$
where $L_{cls}(p, u) = -\log p_u$ is the log loss for the true class $u$ and $L_{loc}(t^u, v) = \sum_{i \in \{x, y, w, h\}} \text{huber}(t^u_i - v_i)$.
The hyper-parameter $\lambda$ controls the balance between the classification loss and the regression loss. For $u = 0$, a background region, $L_{loc}$ plays no role. $L_{loc}$ is a function of $t^u_i - v_i$, so $L$ is invariant to translation, flipping, and rescaling.
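The loss can be sketched in a few lines of Python. This is an illustration under assumptions: the Huber threshold is taken to be 1 (the smooth L1 form), `lam` defaults to 1, class 0 is background, and the function names are hypothetical.

```python
import math


def huber(x):
    """Huber loss with threshold 1 (assumed): quadratic near zero, linear elsewhere."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5


def multi_task_loss(p, u, t_u, v, lam=1.0):
    """L = L_cls + lam * [u >= 1] * L_loc for a single region."""
    l_cls = -math.log(p[u])  # log loss for the true class u
    # The localization term is switched off for background regions (u == 0).
    l_loc = sum(huber(ti - vi) for ti, vi in zip(t_u, v)) if u >= 1 else 0.0
    return l_cls + lam * l_loc


p = [0.1, 0.7, 0.2]             # predicted class probabilities, index 0 = background
t_u = (0.5, 0.0, 0.0, 2.0)      # predicted offsets for the true class
v = (0.0, 0.0, 0.0, 0.0)        # target offsets from the GT box
print(multi_task_loss(p, 1, t_u, v))  # object region: both terms contribute
print(multi_task_loss(p, 0, t_u, v))  # background region: classification only
```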

27 R-CNN: Limitation of Fast R-CNN (cont.)
Compared to R-CNN, Fast R-CNN achieves slightly higher accuracy with a nearly 100 times shorter test time, but the test time is still long. On the VOC 2007 test task, Fast R-CNN takes 1830 ms per image. The region proposal step (SS) is the dominant cost, consuming 1510 ms per image, roughly 83% of the total.

28 R-CNN: Faster R-CNN
Faster R-CNN (Ren et al., 2015) is a variant of Fast R-CNN that uses a Region Proposal Network (RPN). Roughly speaking, Faster R-CNN = RPN + Fast R-CNN. In contrast to SS, the RPN has learnable weights, trained with a multi-task loss.

29 R-CNN: Region Proposal Network
For a given point in an image, the RPN classifies the objectness of several regions centered on that point and regresses their exact locations. The pre-determined points in each image are called anchors.
Figure: adapted from VOC2012.

30 R-CNN: Region Proposal Network (cont.)
1. For a selected anchor, look at a small window around the anchor on the extracted feature map.
2. Determine the objectness of the k regions centered on the corresponding anchor in the raw image.
Figure: from Ren et al.

31 R-CNN: Region Proposal Network (cont.)
3. Adjust all regions classified as positive using the reg layer.
Figure: from Ren et al.
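To make the "k regions centered on an anchor" concrete, the sketch below enumerates candidate regions at one anchor point. The 3 scales and 3 aspect ratios (k = 9) are assumed defaults for illustration, and `regions_at_anchor` is a hypothetical helper name; the objectness classifier and the reg layer are not modeled here.

```python
import math


def regions_at_anchor(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) boxes (cx, cy, w, h) centered on one anchor.

    Each box has area scale**2 and aspect ratio h / w = ratio (assumed convention).
    """
    boxes = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            boxes.append((cx, cy, w, h))
    return boxes


boxes = regions_at_anchor(300, 200)
print(len(boxes))  # k candidate regions sharing the same center
for b in boxes[:3]:
    print(b)
```

Each of these k regions would then receive its own objectness score in step 2 and its own adjustment in step 3.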

32 R-CNN: Multi-task loss for RPN
The RPN uses a multi-task loss similar to that of Fast R-CNN. The exact formula is
$$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$$
where $i$ is the index of an anchor. Here, $p_i$ is the predicted probability of anchor $i$ being an object, $p_i^*$ is the ground-truth label, and $t$ is parameterized as in Fast R-CNN.
(Opinion) This data has a multi-label structure. Note that the input to the loss is a single anchor box, not the set of anchor boxes sharing a center. This design relaxes the issue of class correlation between anchor boxes.

33 R-CNN: Training Faster R-CNN
The RPN and Fast R-CNN share the feature extractor. This shared structure reduces test time, which is the origin of the name Faster R-CNN. The sharing is implemented by the following training sequence.

Phase                   Feature extractor                 Region proposal
1. Train RPN            Initialized from ImageNet model   -
2. Train Fast R-CNN     Initialized from ImageNet model   RPN from phase 1
3. Tune RPN             Frozen from phase 2               -
4. Tune Fast R-CNN      Frozen from phase 2               RPN from phase 3

34 R-CNN: Summary of R-CNN variations
All the models use a bbox regressor to adjust the proposed regions. R-CNN uses SVMs and the others use a softmax classifier.

                Region proposal method   Region scaling method
R-CNN           SS                       Warping
Fast R-CNN      SS                       RoI pooling
Faster R-CNN    RPN                      RoI pooling
Table: Key methodologies.

                mAP (%)   Test time (ms/image)
R-CNN           66.0      > 10^4
Fast R-CNN
Faster R-CNN
Table: Evaluation on the VOC 2007 test set, adapted from Girshick, and from Ren et al.

35 YOLO

36 YOLO: You-Only-Look-Once
You-Only-Look-Once (YOLO; Redmon et al., 2016) is a unified model. YOLO uses a single CNN to solve both the localization and the classification problem. From the introduction of the paper: "Humans glance at an image and instantly know what objects are in the image, where they are, and how they interact." For a given image, YOLO runs only one forward pass, which remarkably reduces test time.
All the figures, tables, and equations in this section are taken from Redmon et al. (2016).

37 YOLO: Terminology
An image is divided into $S \times S$ grid cells. If the center of an object falls into a grid cell, that grid cell is responsible for detecting that object.

38 YOLO: Terminology (cont.)
Each grid cell predicts $B$ bounding boxes and the corresponding objectness confidences. Each bounding box is parameterized by (x, y, w, h), as in R-CNN. The objectness confidence is $\Pr(\text{Object}) \times \text{IoU}^{\text{truth}}_{\text{pred}}$.

39 YOLO: Terminology (cont.)
All bounding boxes sharing a grid cell have the same conditional class probability, formally $\Pr(\text{Class}_i \mid \text{Object})$. At test time, the class-specific confidence $\Pr(\text{Class}_i) \times \text{IoU}^{\text{truth}}_{\text{pred}}$ is obtained by multiplying the predicted conditional class probability by the objectness confidence.

40 YOLO: Terminology (cont.)

41 YOLO: Architecture
For a given image, YOLO predicts (x, y, w, h) and the objectness confidence for every bounding box, and the conditional class probabilities for every grid cell. Considering its spatial meaning, the output can be viewed as an $S \times S \times (B \times 5 + C)$ tensor. In the VOC competition, $S = 7$, $B = 2$, and $C = 20$, so the output is $7 \times 7 \times 30$.
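The following NumPy sketch shows how class-specific confidences could be read off such an output tensor. The memory layout is an assumption made for illustration (per cell: B blocks of (x, y, w, h, confidence) followed by C class probabilities), and `class_specific_confidence` is a hypothetical helper name.

```python
import numpy as np

S, B, C = 7, 2, 20  # VOC setting from the slide


def class_specific_confidence(output):
    """Turn an S x S x (B*5 + C) output into S x S x B x C class-specific confidences.

    Assumed layout per grid cell: B blocks of (x, y, w, h, confidence), then C class probabilities.
    """
    boxes = output[..., : B * 5].reshape(S, S, B, 5)
    objectness = boxes[..., 4]        # shape (S, S, B)
    class_prob = output[..., B * 5:]  # shape (S, S, C), shared by the B boxes of a cell
    # Pr(Class_i | Object) * Pr(Object) * IoU, broadcast over boxes and classes.
    return objectness[..., :, None] * class_prob[..., None, :]


output = np.zeros((S, S, B * 5 + C))
output[3, 4, 9] = 0.8        # objectness confidence of the second box in cell (3, 4)
output[3, 4, B * 5 + 7] = 0.5  # conditional probability of class 7 in that cell
scores = class_specific_confidence(output)
print(scores.shape)          # one score per cell, box, and class
print(scores[3, 4, 1, 7])
```

With these settings the network scores S * S * B = 98 boxes per image in a single forward pass.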

42 YOLO: Architecture (cont.)
The architecture of YOLO is as follows. For a given image, the convolutional layers extract features, and the final FC layer predicts the bounding-box parameters, the objectness confidences, and the conditional class probabilities.

43 YOLO: Loss
The loss function of YOLO consists of five terms. The first two terms concern bbox regression, the next two concern objectness classification, and the last concerns class classification. Here, $\mathbb{1}_i$ and $\mathbb{1}_{ij}$ are indicators of the responsibility of the $i$th grid cell and of its $j$th bounding box, respectively.

44 YOLO: Performance
YOLO is the first deep-learning model to achieve state-of-the-art accuracy in the real-time detection setting.
Real-time: 30 frames per second or better. At 30 frames per second, a car travelling at 60 km/h (about 16.7 m/s) moves about 0.56 m between detections.

45 YOLO: Performance (cont.)
Compared with Fast R-CNN, YOLO has a high localization error and a low background error. Each detection is assigned to one of the following types.
Correct: correct class and IoU > 0.5.
Loc: correct class, 0.1 < IoU < 0.5.
Sim: class is similar, IoU > 0.1.
Other: class is wrong, IoU > 0.1.
Background: IoU < 0.1 for any object.

46 Evaluation

47 Evaluation: Non-Maximum Suppression
Some predicted regions overlap heavily. In object detection, multiple detections of a single GT box are penalized.

48 Evaluation: Non-Maximum Suppression (cont.)
Non-Maximum Suppression (NMS) is a preprocessing step for evaluation that removes overlapping regions based on confidence. Choose the most confident bounding box and remove all other boxes that have a high IoU with it. Repeat on the remaining boxes until none are left. NMS is applied to both the RPN and the detection network.
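The procedure above can be sketched as follows. This is a plain-Python illustration under assumptions: boxes are (x, y, w, h) with a top-left origin, "high IoU" means IoU above a threshold of 0.5, and the names `iou` and `nms` are hypothetical helpers.

```python
def iou(a, b):
    """IoU of two boxes given as (x, y, w, h), top-left origin (assumed)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept by greedy non-maximum suppression."""
    # Candidate indices, most confident first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # most confident remaining box
        keep.append(best)
        # Drop every remaining box that overlaps the chosen box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 10, 10)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the second box overlaps the first and is suppressed
```

In a multi-class detector this would typically be run separately for each class, so that overlapping boxes of different classes do not suppress one another.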

49 Evaluation: Evaluation measures
Under the infinitely imbalanced structure, performance measures that use TN may be unsuitable.

          Predicted N   Predicted P   Total
True N    TN            FP            # of BG
True P    FN            TP            # of GT
Table: The confusion matrix of object detection.

Detecting every object is trivially easy: just classify every region as every object class. What would the value of TN be? If the model is reasonable, TN should be very large (effectively infinite), since there are infinitely many BG regions.

50 Evaluation: Evaluation measures (cont.)
The main evaluation measures in object detection are based on precision and recall.
Precision: the proportion of TP among regions labeled positive, TP/(TP+FP).
Recall: the proportion of TP among actual positives, TP/(TP+FN).
F1 score: the harmonic mean of precision and recall.
AUCPR: the area under the precision-recall curve.
The commonly used estimator of AUCPR in object detection is Average Precision (AP).
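A short Python sketch of these three measures, computed from confusion-matrix counts. The function name `precision_recall_f1` is a hypothetical helper, and returning 0 when a denominator is zero is an assumed convention. Note that TN does not appear anywhere.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from counts; none of them uses TN."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# 10 detections of which 8 are correct, and 16 GT boxes in total (8 missed).
print(precision_recall_f1(tp=8, fp=2, fn=8))
```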

51 Evaluation: Average Precision
Let $c$ be the confidence threshold. Then AUCPR can be expressed as
$$\text{AUCPR} = \int \text{Precision}(c) \, d\text{Recall}(c)$$
where $\text{Precision}(c)$ and $\text{Recall}(c)$ are the precision and recall at threshold level $c$, respectively. By plugging in the empirical precision and recall, $\widehat{\text{Precision}}(c)$ and $\widehat{\text{Recall}}(c)$, we get an estimator of AUCPR,
$$\widehat{\text{AUCPR}} = \int \widehat{\text{Precision}}(c) \, d\widehat{\text{Recall}}(c).$$

52 Evaluation: Average Precision (cont.)
Here, by the definition of the Riemann-Stieltjes integral,
$$\widehat{\text{AUCPR}} = \int \widehat{\text{Precision}}(c) \, d\widehat{\text{Recall}}(c) = \sum_{c \in \{\text{conf}_i : i \in P\}} \widehat{\text{Precision}}(c) \cdot \frac{\#\{\text{P with confidence equal to } c\}}{\#\text{ of P}}.$$
This is the Average Precision (AP), a weighted average of the precisions at each confidence level of a GT box.
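The sum above translates almost directly into code. The sketch below is an illustration under assumptions: each detection carries a confidence and a flag saying whether it matched a GT box, P is taken to be the set of GT boxes, and GT boxes that were never detected contribute nothing to the sum. The function name `average_precision` is hypothetical.

```python
def average_precision(confidences, is_tp, n_positives):
    """AP as a weighted average of precisions at the confidence levels of matched GT boxes."""
    detections = list(zip(confidences, is_tp))
    # Distinct confidence levels at which at least one GT box is detected.
    levels = sorted({c for c, tp in detections if tp}, reverse=True)
    ap = 0.0
    for c in levels:
        selected = [tp for conf, tp in detections if conf >= c]  # detections kept at threshold c
        precision = sum(selected) / len(selected)
        weight = sum(1 for conf, tp in detections if tp and conf == c) / n_positives
        ap += precision * weight
    return ap


# Four detections sorted by confidence; the second is a false positive. Four GT boxes in total.
print(average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, True], n_positives=4))
```

In this example the precisions at the three matched levels are 1, 2/3, and 3/4, each weighted by 1/4, which gives 29/48.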

53 Evaluation: Average Precision (cont.)
Because there are several object classes, the mean of the per-class APs is used. This is called mean Average Precision (mAP). In practice, interpolated AP is used because of the wiggles in the precision-recall curve: unlike the ROC curve, the precision-recall curve need not be monotonic.
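One common form of interpolation replaces the precision at each recall level by the maximum precision achieved at that recall or higher, which removes the wiggles. The sketch below uses the 11-point variant (recall levels 0, 0.1, ..., 1) as an assumed choice; the function names and the per-class AP values in the usage line are made up for illustration.

```python
def interpolated_ap_11pt(recalls, precisions):
    """11-point interpolated AP: average of max precision at recall >= t, for t = 0, 0.1, ..., 1."""
    total = 0.0
    for t in [i / 10 for i in range(11)]:
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11


def mean_average_precision(ap_per_class):
    """mAP: the plain mean of the per-class APs."""
    return sum(ap_per_class) / len(ap_per_class)


# A non-monotonic precision-recall curve: precision dips to 0.5 and then recovers.
recalls = [0.25, 0.25, 0.5, 0.75]
precisions = [1.0, 0.5, 2 / 3, 0.75]
print(interpolated_ap_11pt(recalls, precisions))
print(mean_average_precision([0.6, 0.8, 0.7]))  # illustrative per-class APs
```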

54 Evaluation: References
Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Uijlings, J. R. R., et al. (2013). Selective search for object recognition. International Journal of Computer Vision.
Felzenszwalb, P. F., and Huttenlocher, D. P. (2004). Efficient graph-based image segmentation. International Journal of Computer Vision, 59(2).
Felzenszwalb, P. F., et al. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9).

55 Evaluation: References (cont.)
Krizhevsky, A., Sutskever, I., and Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. In NIPS.
He, K., et al. (2014). Spatial pyramid pooling in deep convolutional networks for visual recognition. In European Conference on Computer Vision. Springer, Cham.
Girshick, R. (2015). Fast R-CNN. arXiv preprint.
Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint.

56 Evaluation: References (cont.)
Ren, S., et al. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems.
Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Everingham, M., Van Gool, L., Williams, C. K., Winn, J., and Zisserman, A. (2010). The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2).

57 Evaluation: References (cont.)
Boyd, K., Eng, K. H., and Page, C. D. Area under the precision-recall curve: Point estimates and confidence intervals. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Berlin, Heidelberg.
Introduction to modern information retrieval.


CPSC340. State-of-the-art Neural Networks. Nando de Freitas November, 2012 University of British Columbia CPSC340 State-of-the-art Neural Networks Nando de Freitas November, 2012 University of British Columbia Outline of the lecture This lecture provides an overview of two state-of-the-art neural networks:

More information

Feature-Fused SSD: Fast Detection for Small Objects

Feature-Fused SSD: Fast Detection for Small Objects Feature-Fused SSD: Fast Detection for Small Objects Guimei Cao, Xuemei Xie, Wenzhe Yang, Quan Liao, Guangming Shi, Jinjian Wu School of Electronic Engineering, Xidian University, China xmxie@mail.xidian.edu.cn

More information

Efficient Segmentation-Aided Text Detection For Intelligent Robots

Efficient Segmentation-Aided Text Detection For Intelligent Robots Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related

More information

Optimizing Object Detection:

Optimizing Object Detection: Lecture 10: Optimizing Object Detection: A Case Study of R-CNN, Fast R-CNN, and Faster R-CNN and Single Shot Detection Visual Computing Systems Today s task: object detection Image classification: what

More information

Classification of objects from Video Data (Group 30)

Classification of objects from Video Data (Group 30) Classification of objects from Video Data (Group 30) Sheallika Singh 12665 Vibhuti Mahajan 12792 Aahitagni Mukherjee 12001 M Arvind 12385 1 Motivation Video surveillance has been employed for a long time

More information

Detection and Localization with Multi-scale Models

Detection and Localization with Multi-scale Models Detection and Localization with Multi-scale Models Eshed Ohn-Bar and Mohan M. Trivedi Computer Vision and Robotics Research Laboratory University of California San Diego {eohnbar, mtrivedi}@ucsd.edu Abstract

More information

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides Deep Learning in Visual Recognition Thanks Da Zhang for the slides Deep Learning is Everywhere 2 Roadmap Introduction Convolutional Neural Network Application Image Classification Object Detection Object

More information
