Automatic detection of books based on Faster R-CNN


Beibei Zhu, Xiaoyu Wu, Lei Yang, Yinghua Shen
School of Information Engineering, Communication University of China, Beijing, China

Abstract: Detection networks such as SPPnet and Fast R-CNN have improved continuously. Recently, the novel region proposal method RPN, which shares full-image convolutional features with the detection network, has enabled the state-of-the-art object detection network Faster R-CNN. In this work we apply Faster R-CNN to train a detection network on our digital image database of books and implement automatic recognition and localization of books. Experiments show that the retrained Faster R-CNN achieves good detection results in terms of both speed and accuracy, and that it also solves the problem of testing negative examples encountered in our previous study. This provides a useful basis for the study of a practical book retrieval system.

Keywords: object detection; detection of books; Faster R-CNN; deep learning

I. INTRODUCTION

New technology and networks have made our lives easier and more convenient. The intelligent retrieval and management of books has been developed for a long time and now has broad prospects given the latest deep learning methods. Identifying books is the key step of a retrieval system. So far, book identification has mostly used text-based methods or content-based shallow machine learning methods that require manually extracted features. Hao Wang et al. and Peng Ye have realized the automatic classification of Chinese books and journal articles, respectively, based on support vector machine (SVM) and back-propagation neural network algorithms [1][2]. There are also numerous studies of the text categorization of books based on machine learning. Deep learning has been a hot topic in recent years and has been applied with good results in image classification, scene recognition, and object detection and tracking. However, there are few studies on the identification of books based on deep learning.
We have previously studied the image identification of 10 classes of books based on SVM and deep learning methods [3]. The experimental results are good, but the system is unfit for testing images of other books that do not belong to those 10 classes, or images of other objects. Lately, region-based convolutional neural networks have made good advances in object detection. The proposal step, which is the bottleneck in state-of-the-art detection systems, has had its computational cost drastically reduced by the novel network Faster R-CNN [4], which enables a fast and effective object detection system. For the very deep VGG-16 model, the detection system has a frame rate of 5 fps on a GPU while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007 and 2012. This paper studies the image detection of books based on deep learning: we apply Faster R-CNN to recognize 10 classes of books in images and predict their bounding boxes. Experiments show that the detection network retrained on our database achieves good detection accuracy, which lays the foundation for an integrated intelligent book retrieval system.

II. THE DETECTION OF BOOKS BASED ON FASTER R-CNN

Faster R-CNN is a region-based object detection network proposed by Shaoqing Ren et al. [4]. Object detection networks depend on region proposal methods to hypothesize object locations, but the original methods typically rely on inexpensive features and are implemented on the CPU, which makes the running time of the proposal step intolerable. Faster R-CNN offers an elegant and effective solution to this problem. It consists of a Region Proposal Network (RPN) and an object detection network. Ren et al. introduce the RPN into the convolutional neural network and train an end-to-end network to enable nearly cost-free region proposals. The architecture of Faster R-CNN is shown in Figure 1. The top branch is Fast R-CNN [5] and the bottom branch is the RPN.
Note that Figure 1 only shows how the two networks share convolutional layers, along with their structural similarities, which are the foundation of the network convergence in the training scheme; the final Faster R-CNN model is the upper branch. The intermediate layer plays the same role as ROI (Region of Interest) pooling: it pools a small window of the convolutional feature map into a fixed-size vector. The other layers in the network are fully connected layers.

In the training phase, the RPN takes an image of any size as input and outputs a set of rectangular object proposals, each with the 4 coordinates of a predicted object bounding box and 2 scores that estimate the probability of object versus not-object. By minimizing the classification loss and the regression loss for learning region proposals, the RPN is trained to generate high-quality proposals. The object detection network is Fast R-CNN: it takes the proposals generated by the RPN as input and performs fine-grained classification and localization by calculating softmax probabilities and bounding-box regression offsets for each proposal.

The network is trained with a four-step scheme. In the first two steps, the RPN and Fast R-CNN are trained independently. In the last two steps, the two networks are trained alternately so that they share convolutional layers and fine-tune their respective fully connected layers for the final model.
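The RPN training described above depends on giving each anchor box an object or not-object label according to its overlap with the ground-truth boxes. The following is a minimal sketch of that labeling rule, using the 0.7 and 0.3 overlap thresholds stated later in Section II.C. The function names, the [x1, y1, x2, y2] box layout, and the -1 "ignored" label are assumptions made for illustration, not part of the authors' MATLAB code. The original Faster R-CNN method [4] also marks the highest-overlap anchor for each ground-truth box as positive, which this sketch omits.

```python
import numpy as np

def iou_matrix(anchors, gt_boxes):
    """Pairwise IoU between (N, 4) anchors and (M, 4) ground-truth boxes."""
    a = anchors[:, None, :]   # shape (N, 1, 4)
    g = gt_boxes[None, :, :]  # shape (1, M, 4)
    inter_w = np.clip(np.minimum(a[..., 2], g[..., 2]) - np.maximum(a[..., 0], g[..., 0]), 0, None)
    inter_h = np.clip(np.minimum(a[..., 3], g[..., 3]) - np.maximum(a[..., 1], g[..., 1]), 0, None)
    inter = inter_w * inter_h
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_g = (g[..., 2] - g[..., 0]) * (g[..., 3] - g[..., 1])
    return inter / (area_a + area_g - inter)

def label_anchors(anchors, gt_boxes, fg_thresh=0.7, bg_thresh=0.3):
    """Label each anchor: 1 = object, 0 = not-object, -1 = ignored (assumed convention)."""
    best = iou_matrix(anchors, gt_boxes).max(axis=1)  # best overlap per anchor
    labels = np.full(len(anchors), -1, dtype=int)
    labels[best < bg_thresh] = 0    # below 0.3 with every ground-truth box
    labels[best >= fg_thresh] = 1   # at least 0.7 with some ground-truth box
    return labels

# Hypothetical boxes, for illustration only.
anchors = np.array([[0, 0, 10, 10], [0, 0, 10, 8], [20, 20, 30, 30], [0, 0, 10, 5]], dtype=float)
gt_boxes = np.array([[0, 0, 10, 10]], dtype=float)
labels = label_anchors(anchors, gt_boxes)
```

Anchors whose best overlap falls between the two thresholds receive neither label here, so they would not contribute to the classification loss.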

In the test phase, the retrained model takes test images as input and outputs the predicted category label and bounding box for each target object.

Figure 1. The architecture of Faster R-CNN. [Diagram labels: input; convolutional layers; RPN; intermediate layer; 256-d vector; 2k scores; 4k coordinates; ROI pooling; ROI feature vector; softmax probabilities; bbox regressor offsets]

A. Database preparation

In this paper we train a network on our database of books to identify 10 classes of books and predict their bounding boxes simultaneously. We use the same image database as in [3]. Figure 2 shows some examples from the database. The database contains more than 4000 images of 10 classes of books, taken against different natural backgrounds under 4 lighting conditions: outdoor natural light at noon, outdoor natural light at dusk, indoor lamplight, and indoor natural light at dusk. Because different users may have different shooting habits, the images were taken by different users with their own equipment to simulate actual use. For more details about the construction of the database, please refer to [3]. We label the classes of books 0 to 9 and randomly select 1500 images for training, 1500 for validation, and 1000 for testing, that is, 150 training images, 150 validation images, and 100 test images per class. Following the datasets of the PASCAL VOC challenges, we rescale the images so that their longer side is 500 pixels while preserving their aspect ratios.

Figure 2. Examples from the image database: (a) the 4 lighting conditions; (b) the 10 classes of books.

B. Documents preparation

In addition to the image database, we need to prepare two kinds of documents in order to use the Faster R-CNN framework to train the network on our database.
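Before turning to those documents, the rescaling and per-class split rules from the database preparation can be sketched as follows. This is only an illustration under stated assumptions: the function names, the fixed random seed, and the rounding rule are hypothetical, and the source does not say how the authors implemented either step.

```python
import random

def rescaled_size(width, height, longer_side=500):
    """Return (new_width, new_height) with the longer side set to `longer_side`,
    keeping the aspect ratio (rounded to whole pixels)."""
    scale = longer_side / max(width, height)
    return round(width * scale), round(height * scale)

def split_class(image_names, n_train=150, n_val=150, n_test=100, seed=0):
    """Randomly split one class's image names into train/val/test lists."""
    if len(image_names) < n_train + n_val + n_test:
        raise ValueError("not enough images for the requested split")
    shuffled = list(image_names)
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test

# Hypothetical file names for one class of books.
names = [f"class0_{i:04d}.jpg" for i in range(400)]
train, val, test = split_class(names)
```

Applying `split_class` to each of the 10 classes gives the 1500/1500/1000 totals reported above.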
The localization task requires a set of images with manual annotations indicating the ground-truth locations of books within the images. We use the graphical image annotation tool LabelImg, created by tzutalin [6], for the annotation work. LabelImg is written in Python and uses Qt for its graphical interface. We first predefine the classes of books as 0 to 9 in LabelImg. To annotate an image, we draw a tight rectangular region of interest, also called a bounding box, around each book and then add a label to it. Examples of annotation are shown in Figure 3. LabelImg records mouse clicks and saves the coordinates and label of each bounding box in an annotation file. Each image has one annotation file, which also records information such as the name, size, and number of channels of the image. The annotation file is saved as an XML file in the same format as that adopted in the PASCAL VOC challenges.
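To make the annotation format concrete, here is a minimal sketch that reads one PASCAL VOC style annotation with Python's standard library. The XML content (file name, image size, box coordinates) is invented for illustration, and `parse_voc_annotation` is a hypothetical helper, not part of LabelImg or the Faster R-CNN code.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in PASCAL VOC layout; values are illustrative only.
EXAMPLE_XML = """<annotation>
  <filename>book_0001.jpg</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>7</name>
    <bndbox><xmin>120</xmin><ymin>60</ymin><xmax>380</xmax><ymax>330</ymax></bndbox>
  </object>
</annotation>"""

def parse_voc_annotation(xml_text):
    """Extract the image name, size, channel count, and labeled boxes."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        box = tuple(int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append({"label": obj.findtext("name"), "box": box})
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "depth": int(size.findtext("depth")),  # number of channels
        "objects": objects,
    }

annotation = parse_voc_annotation(EXAMPLE_XML)
```

An image containing several books would simply have several `object` elements, each producing one entry in `objects`.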

Figure 3. Examples of annotation.

Apart from the annotation files, the Faster R-CNN code framework requires 4 TXT files for each class of books and for the whole database under the /datasets/VOCdevkit2007/VOC2007/ImageSets/Main directory. These files are named train.txt, test.txt, val.txt, and trainval.txt. As their names imply, these files indicate how each image is used by listing the image names followed by a mark. For example, in the train.txt file that corresponds to a class, the training images of that class are marked 1 after their names, while the other images are marked -1. We wrote a simple MATLAB program to generate these files.

C. Implementation details

We adopt the open-source Faster R-CNN code framework created by Shaoqing Ren et al. [7] to implement the detection of books. The official code is written in MATLAB, and a Python reimplementation is also available on GitHub. We use the MATLAB version since we are running the 64-bit Windows 7 operating system. Our graphics card is an NVIDIA Quadro K2200 with 4 GB of GDDR5 memory. The framework also requires Caffe [8], into which the data about proposals and labels are fed through the MATLAB interface to perform computation and weight updates. We download the ready-made Caffe mex file that the developers of Faster R-CNN provide in the code repository under the /external/caffe directory.

1) RPN training

In the first step, we download an ImageNet-pre-trained ZF net [9] to initialize the RPN. The ImageNet-trained ZF model is an 8-layer convnet that generalizes well to other datasets. The RPN is fine-tuned end-to-end for the region proposal task. The architecture and training process of the RPN are shown in Figure 4.
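The RPN predicts 2k scores and 4k coordinates at each location, one set per anchor box. As a rough illustration of where k comes from, the sketch below builds the anchors centred on a single location. The scales and aspect ratios are the defaults reported in the original Faster R-CNN paper [4] (3 scales and 3 ratios, so k = 9); the source does not say which values the authors used here, so treat them, and the helper name `make_anchors`, as assumptions.

```python
import numpy as np

def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return a (k, 4) array of [x1, y1, x2, y2] anchors centred on (cx, cy).

    Each anchor has area scale**2 and height/width equal to ratio.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            w = scale / np.sqrt(ratio)
            h = scale * np.sqrt(ratio)
            anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)

# Hypothetical location near the centre of a 500 x 375 image.
anchors = make_anchors(250.0, 187.5)
k = len(anchors)
print(k, "anchors ->", 2 * k, "scores and", 4 * k, "coordinates per location")
```

Repeating this at every position of the shared convolutional feature map gives the full set of anchors from which training samples are drawn.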
In the RPN, we use one image per mini-batch, whose data is fed into Caffe through the MATLAB interface to perform forward and backward propagation each time. We randomly sample 256 anchor boxes in an image, with a 1:1 ratio of positive to negative anchors. We set the overlap threshold for an anchor box to be considered foreground to 0.7, and an anchor box whose overlap with every ground-truth box is lower than 0.3 is considered a negative example. The ground-truth label is 1 for a positive anchor and 0 for a negative one. These examples, with their labels and the coordinates of the ground-truth boxes, are used for supervised training of the RPN. Note that negative anchors do not contribute to the regression loss at this stage.

After training the RPN, we feed test images into the fine-tuned RPN, which outputs a set of predicted proposal boxes, each with 2 scores that estimate the probability of object versus not-object and 4 coordinates. We apply non-maximum suppression (NMS) to the proposal boxes based on their scores, which reduces the number of proposals to about 2000 per image.

Figure 4. The architecture and training process of the RPN. [Diagram labels: (a) input, convolutional layers, intermediate layer, 256-d vector, cls layer with 2k scores, reg layer with 4k coordinates, k anchor boxes; (b) images, generate anchor boxes, assign class labels to anchor boxes, training]

2) Fast R-CNN training

In the second step, we use the proposals generated above to train a separate Fast R-CNN detection network, which is also initialized by the ImageNet-pre-trained ZF model. In this step we use 2 images per mini-batch. For each image in the mini-batch we randomly select 64 proposals, comprising 16 positive examples and 48 negative examples. Unlike in the RPN, we set the overlap threshold for a proposal to be considered positive to 0.5, and the rest are background examples. The ground-truth labels of positive examples are their class labels, and those of negative examples are 0. As before, we pass the data to Caffe through the MATLAB interface to train Fast R-CNN by back-propagation and stochastic gradient descent (SGD).

3) Network convergence

In the third step, we use Fast R-CNN to initialize the RPN, fix the shared convolutional layers, and fine-tune only the layers unique to the RPN using the training samples. In the final step, we use the region proposals generated in step 3 to fine-tune the fully connected layers of Fast R-CNN while keeping the shared convolutional layers fixed. At this point the two networks share the same convolutional layers and form a unified network.

D. Experimental Results and Analysis

1) Comparison and analysis of two networks

The test results of Faster R-CNN retrained on our database of 10 classes of books are shown in Table 1. The database includes 3000 images for training and validation and 1000 images for testing. We evaluate the performance of the network in two ways: the accuracy of classification and the accuracy of bounding box prediction, also called regression accuracy.

TABLE 1. DETECTION RESULTS ON OUR DATABASE USING FASTER R-CNN
[Columns: label; classification accuracy; regression accuracy; mAP (%); test time (s). Cell values not recoverable.]

The mean Average Precision (mAP) of the classification is up to 98.6%, which outperforms the network we used in [3]. Our system takes [value missing] seconds to test the 1000 images.

In [3] we used Caffe to retrain a CNN model with 3 convolutional layers and 2 fully connected layers. We used the same database with 4000 training images and 400 test images, and we rescaled and cropped the images to 32*32 and 96*96 resolution respectively. The results of book recognition are shown in Table 2.

TABLE 2. THE ACCURACY OF BOOK RECOGNITION BASED ON CNN
[Headers: Number; Resolution. Only two accuracy values survive, 97.53% and 97.79%, the latter in the row for the 96* resolution; the remaining cells are not recoverable.]

The network in this paper uses 5 convolutional layers to compute convolutional feature maps of images, and the resolution of its input is much higher than in [3]. The RPN also helps increase the number of training examples. All these factors contribute to the improvement in recognition accuracy. We should also be aware that, for a network of a given depth, increasing the input resolution may lead to a loss in accuracy since it brings in too many parameters.

2) Tests on negative examples

The recognition rate in [3] is quite good since our classification task is relatively simple for deep learning methods. However, there is a significant problem: any test image that does not include a target object is still wrongly classified as one of the 10 classes. Faster R-CNN solves this problem by introducing a class named background. In the training phase, the network randomly selects patches from the background of images and uses them as negative samples. Thus, when we use the retrained network to test images that do not include target objects, the network classifies them as background.

We ran a test on 300 negative examples using our retrained network. The negative examples are 150 natural images randomly selected from the PASCAL VOC datasets and 150 images of another 10 classes of books taken in circumstances similar to those of our training examples. The results on the images from the VOC datasets are very good: the detection accuracy is 1, which means the network detects no books in those natural images. However, the results on the other 10 classes of books are less accurate. Examples from the negative test are shown in Figure 5. We can see from Figure 5(a) that the book cover on the right closely resembles our positive examples labeled 7, so the false recognition rate is relatively high for this kind of negative example.
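The negative-example test above amounts to checking, for each image that contains no target book, whether the detector reports any book at all. A minimal sketch of that check follows. The score threshold of 0.5, the (label, score) detection format, and the function names are assumptions for illustration; the source does not state the threshold the authors used, and the detections below are invented rather than taken from the paper.

```python
def has_book(detections, score_thresh=0.5):
    """True if any (label, score) detection reaches the score threshold."""
    return any(score >= score_thresh for _label, score in detections)

def rejection_accuracy(detections_per_image, score_thresh=0.5):
    """Fraction of negative images on which no book is detected."""
    if not detections_per_image:
        raise ValueError("no images given")
    rejected = sum(not has_book(d, score_thresh) for d in detections_per_image)
    return rejected / len(detections_per_image)

# Made-up detector outputs for four negative images (not data from the paper).
negatives = [[], [("7", 0.31)], [("7", 0.92)], []]
print(rejection_accuracy(negatives))  # 0.75
```

Under this definition, the accuracy of 1 reported for the VOC images corresponds to every image having no detection above the threshold.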
The system also shows a relatively high false recognition rate for a few examples that have simple cover designs similar to our positive examples labeled 3, as shown in Figure 5(b). Apart from these, the network performs well and makes few mistakes. The results suggest that we need to improve our database, since the images contain only a little background without books, and that background lacks diversity. As a result, the negative examples do not provide fully effective information for network learning, which is important for open-set testing. The network performs well on the current datasets, but we will need to expand their capacity and diversity as the detection task becomes more complex.

Figure 5. Examples of testing negative examples: panels (a) and (b).

III. CONCLUSION

This paper studies the recent object detection network Faster R-CNN and adopts the code framework created by its authors to implement efficient and accurate detection of books. We improve the classification accuracy for books and solve the problem of testing negative samples that existed in our previous study. In future work, we may consider increasing the capacity and diversity of the database and using deeper networks to train a more complex detection model suited to practical applications.

ACKNOWLEDGMENT

This work was supported by the National Key Technology R&D Program (2015BAK22B02) and (2014BAH10F02).

REFERENCES

[1] Hao Wang, Ming Yan and Xinning Su, The automatic classification of Chinese books title based on machine learning, Journal of Library Science in China, Vol. 36, No. 190, pp. 28-39.
[2] Peng Ye, The automatic classification of Chinese journal articles based on machine learning, Nanjing University.
[3] Beibei Zhu, Lei Yang, Xiaoyu Wu and Tianchu Guo, Automatic Recognition of Books Based on Machine Learning, International Symposium on Computational and Business Intelligence (ISCBI), pp. 74-78.
[4] Shaoqing Ren, Kaiming He, Ross Girshick and Jian Sun, Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Neural Information Processing Systems (NIPS).
[5] Ross Girshick, Fast R-CNN, International Conference on Computer Vision (ICCV).
[6] Tzutalin, labelimg: A graphical image annotation tool.
[7] Shaoqing Ren, Kaiming He, Ross Girshick and Jian Sun, Faster R-CNN:
[8] Yangqing Jia, Caffe: An Open Source Convolutional Architecture for Fast Feature Embedding, 2013.
[9] Matthew D. Zeiler and Rob Fergus, Visualizing and Understanding Convolutional Networks, European Conference on Computer Vision (ECCV), Vol. 8689, 2013.
AUTHORS' BACKGROUND

Name; Title; Research field; Contact
Beibei Zhu; master student; Image processing; zhubeibei@cuc.edu.cn
Xiaoyu Wu; associate professor; Image processing; wuxiaoyu@cuc.edu.cn
Lei Yang; full professor; Digital media technology; young-lad@263.net
Yinghua Shen; associate professor; Digital media technology; shenyinghua@cuc.edu.cn

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren Kaiming He Ross Girshick Jian Sun Present by: Yixin Yang Mingdong Wang 1 Object Detection 2 1 Applications Basic

More information

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun Presented by Tushar Bansal Objective 1. Get bounding box for all objects

More information

Spatial Localization and Detection. Lecture 8-1

Spatial Localization and Detection. Lecture 8-1 Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday

More information

Yiqi Yan. May 10, 2017

Yiqi Yan. May 10, 2017 Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field

More information

Final Report: Smart Trash Net: Waste Localization and Classification

Final Report: Smart Trash Net: Waste Localization and Classification Final Report: Smart Trash Net: Waste Localization and Classification Oluwasanya Awe oawe@stanford.edu Robel Mengistu robel@stanford.edu December 15, 2017 Vikram Sreedhar vsreed@stanford.edu Abstract Given

More information

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN

More information

Kaggle Data Science Bowl 2017 Technical Report

Kaggle Data Science Bowl 2017 Technical Report Kaggle Data Science Bowl 2017 Technical Report qfpxfd Team May 11, 2017 1 Team Members Table 1: Team members Name E-Mail University Jia Ding dingjia@pku.edu.cn Peking University, Beijing, China Aoxue Li

More information

Yield Estimation using faster R-CNN

Yield Estimation using faster R-CNN Yield Estimation using faster R-CNN 1 Vidhya Sagar, 2 Sailesh J.Jain and 2 Arjun P. 1 Assistant Professor, 2 UG Scholar, Department of Computer Engineering and Science SRM Institute of Science and Technology,Chennai,

More information

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network Liwen Zheng, Canmiao Fu, Yong Zhao * School of Electronic and Computer Engineering, Shenzhen Graduate School of

More information

Object Detection Based on Deep Learning

Object Detection Based on Deep Learning Object Detection Based on Deep Learning Yurii Pashchenko AI Ukraine 2016, Kharkiv, 2016 Image classification (mostly what you ve seen) http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf

More information

A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS. Kuan-Chuan Peng and Tsuhan Chen

A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS. Kuan-Chuan Peng and Tsuhan Chen A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS Kuan-Chuan Peng and Tsuhan Chen School of Electrical and Computer Engineering, Cornell University, Ithaca, NY

More information

Object Detection on Self-Driving Cars in China. Lingyun Li

Object Detection on Self-Driving Cars in China. Lingyun Li Object Detection on Self-Driving Cars in China Lingyun Li Introduction Motivation: Perception is the key of self-driving cars Data set: 10000 images with annotation 2000 images without annotation (not

More information

Deep Learning for Object detection & localization

Deep Learning for Object detection & localization Deep Learning for Object detection & localization RCNN, Fast RCNN, Faster RCNN, YOLO, GAP, CAM, MSROI Aaditya Prakash Sep 25, 2018 Image classification Image classification Whole of image is classified

More information

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab. [ICIP 2017] Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab., POSTECH Pedestrian Detection Goal To draw bounding boxes that

More information

Optimizing Object Detection:

Optimizing Object Detection: Lecture 10: Optimizing Object Detection: A Case Study of R-CNN, Fast R-CNN, and Faster R-CNN Visual Computing Systems Today s task: object detection Image classification: what is the object in this image?

More information

Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies

Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) e-isjn: A4372-3114 Impact Factor: 7.327 Volume 6, Issue 12, December 2018 International Journal of Advance Research in Computer Science and Management Studies Research Article

More information

Object detection with CNNs

Object detection with CNNs Object detection with CNNs 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before CNNs After CNNs 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 year Region proposals

More information

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK. Wenjie Guan, YueXian Zou*, Xiaoqun Zhou

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK. Wenjie Guan, YueXian Zou*, Xiaoqun Zhou MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK Wenjie Guan, YueXian Zou*, Xiaoqun Zhou ADSPLAB/Intelligent Lab, School of ECE, Peking University, Shenzhen,518055, China

More information

Regionlet Object Detector with Hand-crafted and CNN Feature

Regionlet Object Detector with Hand-crafted and CNN Feature Regionlet Object Detector with Hand-crafted and CNN Feature Xiaoyu Wang Research Xiaoyu Wang Research Ming Yang Horizon Robotics Shenghuo Zhu Alibaba Group Yuanqing Lin Baidu Overview of this section Regionlet

More information

Rich feature hierarchies for accurate object detection and semantic segmentation

Rich feature hierarchies for accurate object detection and semantic segmentation Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification

More information

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Deep learning for object detection. Slides from Svetlana Lazebnik and many others Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep

More information

OBJECT DETECTION HYUNG IL KOO

OBJECT DETECTION HYUNG IL KOO OBJECT DETECTION HYUNG IL KOO INTRODUCTION Computer Vision Tasks Classification + Localization Classification: C-classes Input: image Output: class label Evaluation metric: accuracy Localization Input:

More information

Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018

Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018 Mask R-CNN Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018 1 Common computer vision tasks Image Classification: one label is generated for

More information

Real-time Object Detection CS 229 Course Project

Real-time Object Detection CS 229 Course Project Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection

More information

Instance-aware Semantic Segmentation via Multi-task Network Cascades

Instance-aware Semantic Segmentation via Multi-task Network Cascades Instance-aware Semantic Segmentation via Multi-task Network Cascades Jifeng Dai, Kaiming He, Jian Sun Microsoft research 2016 Yotam Gil Amit Nativ Agenda Introduction Highlights Implementation Further

More information

Visual features detection based on deep neural network in autonomous driving tasks

Visual features detection based on deep neural network in autonomous driving tasks 430 Fomin I., Gromoshinskii D., Stepanov D. Visual features detection based on deep neural network in autonomous driving tasks Ivan Fomin, Dmitrii Gromoshinskii, Dmitry Stepanov Computer vision lab Russian

More information

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing 3 Object Detection BVM 2018 Tutorial: Advanced Deep Learning Methods Paul F. Jaeger, of Medical Image Computing What is object detection? classification segmentation obj. detection (1 label per pixel)

More information

R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS

R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS JIFENG DAI YI LI KAIMING HE JIAN SUN MICROSOFT RESEARCH TSINGHUA UNIVERSITY MICROSOFT RESEARCH MICROSOFT RESEARCH SPEED/ACCURACY TRADE-OFFS

More information

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization

More information

Object Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018

Object Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018 Object Detection TA : Young-geun Kim Biostatistics Lab., Seoul National University March-June, 2018 Seoul National University Deep Learning March-June, 2018 1 / 57 Index 1 Introduction 2 R-CNN 3 YOLO 4

More information

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm Instructions This is an individual assignment. Individual means each student must hand in their

More information
