arxiv: v1 [cs.cv] 31 Mar 2016


Object Boundary Guided Semantic Segmentation

Qin Huang, Chunyang Xia, Wenchao Zheng, Yuhang Song, Hao Xu and C.-C. Jay Kuo

University of Southern California

Abstract. Semantic segmentation has been a major topic in computer vision and plays an important role in understanding object classes as well as object localization. Recent developments in deep learning, especially fully convolutional networks (FCN), have enabled pixel-level labeling with more accurate results. However, most previous works, including FCN, did not take the object boundary into consideration. In fact, since the originally labeled ground truth does not provide a clean object boundary, both the labeled contours and the background objects have been ignored as the background class. In this work, we propose an elegant object boundary guided FCN (OBG-FCN) network, which uses prior knowledge of object boundaries from training to achieve better class accuracy and segmentation details. To this end, we first relabel the object contours and use the FCN network to learn specifically where the objects and contours are. We then transform the output of this branch to 21 classes and use it as a mask to refine the detailed shapes of the objects. End-to-end learning is then applied to finetune the transform parameters, which reconsider the combination of object, background and boundary in the final segmentation decision. We apply the proposed method to the PASCAL VOC segmentation benchmark and achieve 87.4% mean IU (a 15% relative improvement over FCN and around a 10% improvement over CRF-RNN), and our edge model is shown to be stable and accurate even at the accuracy level of FCN-2s.

1 Introduction

Recently, semantic segmentation has played an important role in understanding object classes as well as object localization [1]. The introduction of fully convolutional networks [2] has brought a large improvement in image semantic segmentation. A number of recent approaches, including DeepLab [3] and CRF-RNN [4], have also achieved good segmentation performance. However, while they use feature representations to make pixel-wise classifications, it is also important to take another factor into consideration, namely the object boundary, to make the labeling more accurate and natural [5].

Fig. 1. Examples of segmentation results. Columns, from left to right: input image, FCN-8s, CRF-RNN, the proposed OBG-FCN, and ground truth.

It is a significant challenge to adapt ConvNets to the pixel-wise classification task. First, the convolution filters and max-pooling operations of traditional CNNs make the object boundary prediction quite coarse. Furthermore, although recent algorithms such as FCN [2] use intermediate convolutional layers to finetune the pixel-wise prediction, they cannot predict object boundaries well [6]. CRF-RNN [4] formulates mean-field approximate inference, which partially improves the prediction on object boundaries, but a number of problems remain, for example merging nearby objects and misclassifying objects at the boundaries. In most cases, the lack of boundary constraints prevents a good prediction of the boundaries in the image.

To deal with this problem, we introduce an object boundary guided FCN network (OBG-FCN), which uses pre-trained prior knowledge of object boundaries to enhance the performance of semantic segmentation. In this work, we first relabel the object contours based on the PASCAL annotation. Although the ground truth provided by the PASCAL annotation already offers contour information, it also labels some background objects as boundaries, which would mislead our training. Therefore, we relabel the object boundary by shifting the positions of the object regions and derive an accurate ground truth with object proposals and boundaries. Then, we follow the FCN network structure and conduct step-by-step learning from FCN-32s to FCN-2s to train our 3-class OB-FCN segmenter (object, boundary, background). We then use the output of the OB-FCN branch as a mask layer, which is transformed from a 3-class output to a 21-class output, and conduct an element-wise multiplication with the original FCN-8s network. End-to-end training then follows on the object boundary guided FCN (OBG-FCN) to finetune the network.

The results show a large improvement over the previous state of the art in the mean IU on the PASCAL VOC benchmark, with better class accuracy and object details. The following sections explain our implementation details and introduce our architecture, which combines the information of two distinct network branches to make pixel-wise predictions. In the experiment section, we demonstrate state-of-the-art results on the PASCAL VOC benchmark.

2 Object Boundary FCN (OB-FCN) with Re-labeled Boundaries

One of our main ideas is to use boundary information as a guideline in the training stage of semantic segmentation. Previous works, such as [7,8,2,4], all treated the boundary as background. Therefore, their segmentation results show little relation to the boundary in the ground truth. A subjective comparison of their segmentation results also shows that many regions cross the boundaries when the boundaries are overlaid on them, which is a good indication that the boundary can be a crucial part of semantic segmentation. The first stage of our research is to make the boundary prediction as precise as possible. Preprocessing the ground truth is one of the key parts of our work.
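The relabeling idea, shifting the object region horizontally and vertically and keeping the original object pixels unchanged, might be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' code: the function name `relabel_boundary`, the output IDs (0 background, 1 object, 2 boundary), and the choice to treat the PASCAL void label 255 as non-object are illustrative, and `width` is the maximum shift in pixels.

```python
import numpy as np

# Hypothetical output label IDs for the 3-class ground truth.
BACKGROUND, OBJECT, BOUNDARY = 0, 1, 2


def relabel_boundary(label, width=4):
    """Build a 3-class map (background/object/boundary) from a class-ID map.

    label: (H, W) integer array, 0 = background, 1..20 = object classes,
           255 = PASCAL void label (treated here as non-object).
    width: maximum horizontal/vertical shift in pixels.
    """
    obj = (label >= 1) & (label <= 20)
    h, w_img = obj.shape
    # Zero-pad so that shifted views never wrap around the image border.
    padded = np.pad(obj, width, mode="constant", constant_values=False)
    grown = obj.copy()
    for d in range(1, width + 1):
        for dy, dx in ((d, 0), (-d, 0), (0, d), (0, -d)):
            grown |= padded[width + dy:width + dy + h,
                            width + dx:width + dx + w_img]
    out = np.full(label.shape, BACKGROUND, dtype=np.uint8)
    out[grown & ~obj] = BOUNDARY  # extended area around the object
    out[obj] = OBJECT             # original object region kept as it is
    return out


# Tiny example: a single object pixel in the centre of a 7x7 map.
demo = np.zeros((7, 7), dtype=np.uint8)
demo[3, 3] = 5
relabeled = relabel_boundary(demo, width=1)
```

With `width=1`, the four axis-aligned neighbours of the object pixel become boundary, and diagonal neighbours stay background because only horizontal and vertical shifts are used.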

By dividing the ground truth into three classes (objects, boundaries, and background), we created our own ground truth and followed the FCN network structure to learn the corresponding features and finetune the network. The class-label ground truth for semantic segmentation does include an object boundary label, but it is sometimes confused with background objects. Therefore, we relabel the object boundaries by shifting the objects horizontally and vertically to extend the object area, and then keep the central object region as it is. In this way, we obtain a clear edge between objects and background and between different objects.

Fig. 2. Examples of re-labeled object boundary for an image in PASCAL VOC. Panels: sample image, original ground truth, relabeled object boundary.

An example is shown in Fig. 2, where the original ground truth labels some background objects with the same class as the object boundaries. In contrast, our relabeled ground truth keeps the exact object information and accurate boundary information. Since we are working at FCN-4s in the current stage of the OB-FCN network, we set the maximum boundary width to 4, which is the pixel-wise accuracy interval. Currently we are working on combining the result with OB-FCN-2s, and we expect even better results from it.

Fig. 3. Flow chart of OB-FCN network structure.

In the FCN work, it was observed that accuracy improves only up to combining pool3, while further combining pool2 or pool1 confuses the segmenter. However, by making the object boundary FCN (OB-FCN) branch learn only 3 classes (object, boundary, background), we are able to reach the detail level of FCN-4s and FCN-2s without confusing the network with small-scale information. The flow chart of the OB-FCN network is shown in Fig. 3, where our final model incorporates all pooling information.

Fig. 4. Step-by-step boundary learning results. Columns, from left to right: input image, OB-FCN-32s, OB-FCN-16s, OB-FCN-8s, OB-FCN-4s, and labeled boundary.

A step-by-step boundary learning result is shown in Fig. 4, where each step of the evolution shows finer details of the object and its boundaries. Serving as important prior knowledge of object proposals, our work shows that much more precise semantic segmentation can be achieved even with only an FCN-4s OB-FCN branch. We will further evaluate the object matching area against the ground truth in future experiments.

3 Object Boundary Guided FCN (OBG-FCN) for Semantic Segmentation

Now that we have a precise model of the object information, it is important to combine it with the class information derived from the original FCN-8s. As mentioned in [9], a masking method applies the output of one branch to the other branch. We followed this method and tried to combine the object information and class information by using the output of OB-FCN as a mask. We first tried sharing layers from Conv-5 and even down to Conv-3; however, the results were not satisfactory. This is mostly because, in looking for boundary information, the shallow layers of OB-FCN are likely to differ from those of FCN. Therefore, we decided to keep the two pre-trained branches completely separate. The system network flow is shown in Fig. 5, where we feed the data to two different branches and design a masking layer to combine the outputs.
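As a rough illustration of the masking layer, the sketch below maps a 3-class OB-FCN output to 21 channels with a 1x1 convolution and multiplies it element-wise with the 21-class FCN-8s scores. It is a NumPy sketch under assumptions rather than the authors' network code: the function name `apply_mask`, the `(channels, H, W)` array layout, and the `(21, 3)` weight layout are all hypothetical.

```python
import numpy as np


def apply_mask(fcn_scores, ob_scores, transform):
    """Mask 21-class FCN scores with the 3-class OB-FCN output.

    fcn_scores: (21, H, W) class scores from the FCN-8s branch
    ob_scores:  (3, H, W) output of the OB-FCN branch
    transform:  (21, 3) weights of a 1x1 convolution (assumed layout)
    """
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    mask = np.einsum("km,mhw->khw", transform, ob_scores)
    # The element-wise product needs both operands to be (21, H, W).
    return fcn_scores * mask


# Random stand-ins for the two branch outputs and the transform weights.
rng = np.random.default_rng(0)
fcn = rng.random((21, 4, 5))
ob = rng.random((3, 4, 5))
weights = rng.random((21, 3))
masked = apply_mask(fcn, ob, weights)
```

In a trainable network the `transform` weights would be the parameters finetuned end to end; here they are plain arrays so that the data flow is easy to inspect.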

Fig. 5. End-to-end two-branch network of OBG-FCN.

Here, we use an element-wise product to apply the mask. Since this operation requires two bottom layers with exactly the same dimensions, we need to transform the 3-class output of OB-FCN to 21 classes. Therefore, we apply a convolution layer between the output of OB-FCN and the element-wise product, which takes the 3-class input and outputs a 21-class masking map.

Fig. 6. Demonstration of the convolutional transform layer to map 3-class output of OB-FCN to 21 classes.

One crucial issue here is the initialization of the transform layer. Since we want most of the object area to be highlighted and combined with the original FCN, and we do not want the background and detected boundaries to be confused with the object, we do not initialize the layer randomly but set the parameters as

$$
\omega(k, m, 1, 1) =
\begin{cases}
1, & \text{if } k = C(\text{background}),\ m \neq C(\text{object}), \\
1, & \text{if } k = C(\text{object}),\ m = C(\text{object}), \\
0, & \text{otherwise},
\end{cases}
\tag{1}
$$

where ω denotes the parameters of the transform convolution layer, k is the first index, corresponding to the output depth, m is the corresponding input channel of the OB-FCN result, and C denotes the class ID of background (0) and objects (1-20). As a result, we apply the pre-trained object area directly to the original FCN-8s and derive a primitive masking layer, as shown in the first two columns of Fig. 7. The corresponding combined layer output is shown in the third column, with the segmentation result in the last column. As shown in the result, our masking layer does a good job of highlighting the object area, whose segmentation boundary is already more accurate than that of FCN-8s itself.

Fig. 7. Results of initialization of convolutional transform layer. Columns, from left to right: 3-class output, 21-class initialization, combined initialization, initialized result.

4 End-to-End Object Boundary Guided FCN (OBG-FCN) Training

Based on the proposed model, we then conduct end-to-end training to refine the network. Our current results show that enabling back-propagation to both networks significantly disturbs the pretrained features. In fact, the constraint of the original FCN-8s still applies here: even if we fix the OB-FCN branch, the back-propagated gradients lead to scattered segmentation results, which shows that the FCN looks for too many detail patterns. Therefore, we currently keep the two branches fixed and finetune only the masking layer with a large step size.

Fig. 8. Results of end-to-end training of OBG-FCN, compared with FCN, on the test image of Fig. 4. Panels, from left to right: FCN-8s layer output, FCN-8s result, OBG-FCN layer output, OBG result.
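The initialization rule of Eq. (1) for the transform layer can be written out as a small weight-building routine. The sketch below is an illustration under assumptions, not the authors' code: the function name `init_transform` and the OB-FCN channel order (0 background, 1 object, 2 boundary) are hypothetical, and the 1x1 kernel is stored as a plain `(21, 3)` matrix.

```python
import numpy as np

# Assumed channel order of the 3-class OB-FCN output (hypothetical).
BACKGROUND, OBJECT, BOUNDARY = 0, 1, 2
NUM_CLASSES = 21  # class 0 is background, classes 1..20 are objects


def init_transform():
    """Return the (21, 3) initial weights of the 1x1 transform layer."""
    w = np.zeros((NUM_CLASSES, 3), dtype=np.float32)
    # Background output channel: fed by every non-object input channel.
    w[0, BACKGROUND] = 1.0
    w[0, BOUNDARY] = 1.0
    # Each of the 20 object output channels: fed by the object channel.
    w[1:, OBJECT] = 1.0
    return w


weights = init_transform()
```

With this starting point every object class initially receives the same object mask, so any class-specific weighting of background, object and boundary has to come from the later finetuning of the layer.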

The end-to-end training results are shown in Fig. 8, where the masking layer for each class now has a specific weighting that combines the background, object and boundary information, and the segmentation results look even finer. Currently we are working on enabling global finetuning of the feature layers for better results.

5 Experiment Results

In this section, we evaluate the proposed OBG-FCN method on the PASCAL VOC dataset and compare it with the previous state-of-the-art FCN [2] and CRF-RNN [4], using their newest available models. We currently use only the 1112 training images from the PASCAL VOC 2011 segmentation dataset to train our OB-FCN branch and finetune the OBG-FCN network.

We first evaluate on the PASCAL VOC 2011 dataset. Since the models trained in [2] and [4] both use the training images in PASCAL VOC 2011 and the extra data in [10], there is some overlap between the validation set and the extra data. However, we first present this result as an indication of our improved performance. We will later derive a more solid evaluation on a non-overlapping validation set and submit the results to the PASCAL challenge server. As shown in Table 1, we present four different evaluations on the validation set. The initialized network without further finetuning already reaches state-of-the-art performance, and the final result of our proposed OBG-FCN network significantly outperforms the other methods.

Table 1. Comparison of semantic segmentation on complete PASCAL VOC 2011 dataset. Columns: pixel accuracy, mean accuracy, mean IU, f.w. IU. Rows: FCN-8s, CRF-as-RNN, OBG-FCN (initialization), OBG-FCN.

We then follow the steps of [4] and derive a reduced subset of the VOC 2012 validation data with 346 images by removing images that overlap with the training set. The results are shown in Table 2: the initialized OBG-FCN already outperforms FCN-8s and CRF-RNN, while the final result further improves the accuracy and mean IU.

Table 2. Comparison of semantic segmentation on reduced PASCAL VOC 2012 validation set. Columns: pixel accuracy, mean accuracy, mean IU, f.w. IU. Rows: FCN-8s, CRF-as-RNN, OBG-FCN (initialization), OBG-FCN.

In Fig. 1, we present several sets of segmentation results. The first six sets of examples are referred to as general or failure cases in [4], and the rest are examples of typical good-quality results of the previous methods. As shown in the results, our method achieves finer details of object boundaries even where CRF-RNN already does a good job. As for easily confused classes and the occlusion problem, the proposed OBG-FCN significantly improves the class accuracy and object completeness.

References

1. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. (2014)
2. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2015)
3. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint (2014)
4. Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.H.: Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE International Conference on Computer Vision. (2015)
5. Dai, J., He, K., Sun, J.: BoxSup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. arXiv preprint (2015)
6. Bertasius, G., Shi, J., Torresani, L.: Semantic segmentation with boundary neural fields. arXiv preprint (2015)
7. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2014)
8. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision. (2015)
9. Dai, J., He, K., Sun, J.: Instance-aware semantic segmentation via multi-task network cascades. arXiv preprint (2015)
10. Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Simultaneous detection and segmentation. In: Computer Vision, ECCV 2014. Springer (2014)

arxiv: v4 [cs.cv] 6 Jul 2016

arxiv: v4 [cs.cv] 6 Jul 2016 Object Boundary Guided Semantic Segmentation Qin Huang, Chunyang Xia, Wenchao Zheng, Yuhang Song, Hao Xu, C.-C. Jay Kuo (qinhuang@usc.edu) arxiv:1603.09742v4 [cs.cv] 6 Jul 2016 Abstract. Semantic segmentation

More information

HIERARCHICAL JOINT-GUIDED NETWORKS FOR SEMANTIC IMAGE SEGMENTATION

HIERARCHICAL JOINT-GUIDED NETWORKS FOR SEMANTIC IMAGE SEGMENTATION HIERARCHICAL JOINT-GUIDED NETWORKS FOR SEMANTIC IMAGE SEGMENTATION Chien-Yao Wang, Jyun-Hong Li, Seksan Mathulaprangsan, Chin-Chin Chiang, and Jia-Ching Wang Department of Computer Science and Information

More information

Fully Convolutional Networks for Semantic Segmentation

Fully Convolutional Networks for Semantic Segmentation Fully Convolutional Networks for Semantic Segmentation Jonathan Long* Evan Shelhamer* Trevor Darrell UC Berkeley Chaim Ginzburg for Deep Learning seminar 1 Semantic Segmentation Define a pixel-wise labeling

More information

Conditional Random Fields as Recurrent Neural Networks

Conditional Random Fields as Recurrent Neural Networks BIL722 - Deep Learning for Computer Vision Conditional Random Fields as Recurrent Neural Networks S. Zheng, S. Jayasumana, B. Romera-Paredes V. Vineet, Z. Su, D. Du, C. Huang, P.H.S. Torr Introduction

More information

Efficient Segmentation-Aided Text Detection For Intelligent Robots

Efficient Segmentation-Aided Text Detection For Intelligent Robots Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related

More information

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,

More information

Semantic Segmentation

Semantic Segmentation Semantic Segmentation UCLA:https://goo.gl/images/I0VTi2 OUTLINE Semantic Segmentation Why? Paper to talk about: Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell,

More information

Lecture 7: Semantic Segmentation

Lecture 7: Semantic Segmentation Semantic Segmentation CSED703R: Deep Learning for Visual Recognition (207F) Segmenting images based on its semantic notion Lecture 7: Semantic Segmentation Bohyung Han Computer Vision Lab. bhhanpostech.ac.kr

More information

arxiv: v2 [cs.cv] 18 Jul 2017

arxiv: v2 [cs.cv] 18 Jul 2017 PHAM, ITO, KOZAKAYA: BISEG 1 arxiv:1706.02135v2 [cs.cv] 18 Jul 2017 BiSeg: Simultaneous Instance Segmentation and Semantic Segmentation with Fully Convolutional Networks Viet-Quoc Pham quocviet.pham@toshiba.co.jp

More information

arxiv: v1 [cs.cv] 1 Feb 2018

arxiv: v1 [cs.cv] 1 Feb 2018 Learning Semantic Segmentation with Diverse Supervision Linwei Ye University of Manitoba yel3@cs.umanitoba.ca Zhi Liu Shanghai University liuzhi@staff.shu.edu.cn Yang Wang University of Manitoba ywang@cs.umanitoba.ca

More information

A MULTI-RESOLUTION FUSION MODEL INCORPORATING COLOR AND ELEVATION FOR SEMANTIC SEGMENTATION

A MULTI-RESOLUTION FUSION MODEL INCORPORATING COLOR AND ELEVATION FOR SEMANTIC SEGMENTATION A MULTI-RESOLUTION FUSION MODEL INCORPORATING COLOR AND ELEVATION FOR SEMANTIC SEGMENTATION Wenkai Zhang a, b, Hai Huang c, *, Matthias Schmitz c, Xian Sun a, Hongqi Wang a, Helmut Mayer c a Key Laboratory

More information

Presentation Outline. Semantic Segmentation. Overview. Presentation Outline CNN. Learning Deconvolution Network for Semantic Segmentation 6/6/16

Presentation Outline. Semantic Segmentation. Overview. Presentation Outline CNN. Learning Deconvolution Network for Semantic Segmentation 6/6/16 6/6/16 Learning Deconvolution Network for Semantic Segmentation Hyeonwoo Noh, Seunghoon Hong,Bohyung Han Department of Computer Science and Engineering, POSTECH, Korea Shai Rozenberg 6/6/2016 1 2 Semantic

More information

Deconvolutions in Convolutional Neural Networks

Deconvolutions in Convolutional Neural Networks Overview Deconvolutions in Convolutional Neural Networks Bohyung Han bhhan@postech.ac.kr Computer Vision Lab. Convolutional Neural Networks (CNNs) Deconvolutions in CNNs Applications Network visualization

More information

Encoder-Decoder Networks for Semantic Segmentation. Sachin Mehta

Encoder-Decoder Networks for Semantic Segmentation. Sachin Mehta Encoder-Decoder Networks for Semantic Segmentation Sachin Mehta Outline > Overview of Semantic Segmentation > Encoder-Decoder Networks > Results What is Semantic Segmentation? Input: RGB Image Output:

More information

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun Presented by Tushar Bansal Objective 1. Get bounding box for all objects

More information

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Deep learning for object detection. Slides from Svetlana Lazebnik and many others Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep

More information

Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia

Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia Deep learning for dense per-pixel prediction Chunhua Shen The University of Adelaide, Australia Image understanding Classification error Convolution Neural Networks 0.3 0.2 0.1 Image Classification [Krizhevsky

More information

Supplementary Material: Pixelwise Instance Segmentation with a Dynamically Instantiated Network

Supplementary Material: Pixelwise Instance Segmentation with a Dynamically Instantiated Network Supplementary Material: Pixelwise Instance Segmentation with a Dynamically Instantiated Network Anurag Arnab and Philip H.S. Torr University of Oxford {anurag.arnab, philip.torr}@eng.ox.ac.uk 1. Introduction

More information

arxiv: v1 [cs.cv] 13 Mar 2016

arxiv: v1 [cs.cv] 13 Mar 2016 Deep Interactive Object Selection arxiv:63.442v [cs.cv] 3 Mar 26 Ning Xu University of Illinois at Urbana-Champaign ningxu2@illinois.edu Jimei Yang Adobe Research jimyang@adobe.com Abstract Interactive

More information

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation : Multi-Path Refinement Networks for High-Resolution Semantic Segmentation Guosheng Lin 1 Anton Milan 2 Chunhua Shen 2,3 Ian Reid 2,3 1 Nanyang Technological University 2 University of Adelaide 3 Australian

More information

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation : Multi-Path Refinement Networks for High-Resolution Semantic Segmentation Guosheng Lin 1,2, Anton Milan 1, Chunhua Shen 1,2, Ian Reid 1,2 1 The University of Adelaide, 2 Australian Centre for Robotic

More information

Channel Locality Block: A Variant of Squeeze-and-Excitation

Channel Locality Block: A Variant of Squeeze-and-Excitation Channel Locality Block: A Variant of Squeeze-and-Excitation 1 st Huayu Li Northern Arizona University Flagstaff, United State Northern Arizona University hl459@nau.edu arxiv:1901.01493v1 [cs.lg] 6 Jan

More information

Finding Tiny Faces Supplementary Materials

Finding Tiny Faces Supplementary Materials Finding Tiny Faces Supplementary Materials Peiyun Hu, Deva Ramanan Robotics Institute Carnegie Mellon University {peiyunh,deva}@cs.cmu.edu 1. Error analysis Quantitative analysis We plot the distribution

More information

JOINT DETECTION AND SEGMENTATION WITH DEEP HIERARCHICAL NETWORKS. Zhao Chen Machine Learning Intern, NVIDIA

JOINT DETECTION AND SEGMENTATION WITH DEEP HIERARCHICAL NETWORKS. Zhao Chen Machine Learning Intern, NVIDIA JOINT DETECTION AND SEGMENTATION WITH DEEP HIERARCHICAL NETWORKS Zhao Chen Machine Learning Intern, NVIDIA ABOUT ME 5th year PhD student in physics @ Stanford by day, deep learning computer vision scientist

More information

SSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang

SSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang SSD: Single Shot MultiBox Detector Author: Wei Liu et al. Presenter: Siyu Jiang Outline 1. Motivations 2. Contributions 3. Methodology 4. Experiments 5. Conclusions 6. Extensions Motivation Motivation

More information

Gradient of the lower bound

Gradient of the lower bound Weakly Supervised with Latent PhD advisor: Dr. Ambedkar Dukkipati Department of Computer Science and Automation gaurav.pandey@csa.iisc.ernet.in Objective Given a training set that comprises image and image-level

More information

A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS. Kuan-Chuan Peng and Tsuhan Chen

A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS. Kuan-Chuan Peng and Tsuhan Chen A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS Kuan-Chuan Peng and Tsuhan Chen School of Electrical and Computer Engineering, Cornell University, Ithaca, NY

More information

arxiv: v1 [cs.cv] 15 Oct 2018

arxiv: v1 [cs.cv] 15 Oct 2018 Instance Segmentation and Object Detection with Bounding Shape Masks Ha Young Kim 1,2,*, Ba Rom Kang 2 1 Department of Financial Engineering, Ajou University Worldcupro 206, Yeongtong-gu, Suwon, 16499,

More information

TEXT SEGMENTATION ON PHOTOREALISTIC IMAGES

TEXT SEGMENTATION ON PHOTOREALISTIC IMAGES TEXT SEGMENTATION ON PHOTOREALISTIC IMAGES Valery Grishkin a, Alexander Ebral b, Nikolai Stepenko c, Jean Sene d Saint Petersburg State University, 7 9 Universitetskaya nab., Saint Petersburg, 199034,

More information

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network Liwen Zheng, Canmiao Fu, Yong Zhao * School of Electronic and Computer Engineering, Shenzhen Graduate School of

More information

Deep Interactive Object Selection

Deep Interactive Object Selection Deep Interactive Object Selection Ning Xu 1, Brian Price 2, Scott Cohen 2, Jimei Yang 2, and Thomas Huang 1 1 University of Illinois at Urbana-Champaign 2 Adobe Research ningxu2@illinois.edu, {bprice,scohen,jimyang}@adobe.com,

More information

Mask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma

Mask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN Background Related Work Architecture Experiment Mask R-CNN Background Related Work Architecture Experiment Background From left

More information

BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation

BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation Jifeng Dai Kaiming He Jian Sun Microsoft Research {jifdai,kahe,jiansun}@microsoft.com Abstract Recent leading

More information

arxiv: v1 [cs.cv] 8 Mar 2017 Abstract

arxiv: v1 [cs.cv] 8 Mar 2017 Abstract Large Kernel Matters Improve Semantic Segmentation by Global Convolutional Network Chao Peng Xiangyu Zhang Gang Yu Guiming Luo Jian Sun School of Software, Tsinghua University, {pengc14@mails.tsinghua.edu.cn,

More information

arxiv: v1 [cs.cv] 14 Dec 2015

arxiv: v1 [cs.cv] 14 Dec 2015 Instance-aware Semantic Segmentation via Multi-task Network Cascades Jifeng Dai Kaiming He Jian Sun Microsoft Research {jifdai,kahe,jiansun}@microsoft.com arxiv:1512.04412v1 [cs.cv] 14 Dec 2015 Abstract

More information

Rich feature hierarchies for accurate object detection and semantic segmentation

Rich feature hierarchies for accurate object detection and semantic segmentation Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification

More information

Detecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA)

Detecting and Parsing of Visual Objects: Humans and Animals. Alan Yuille (UCLA) Detecting and Parsing of Visual Objects: Humans and Animals Alan Yuille (UCLA) Summary This talk describes recent work on detection and parsing visual objects. The methods represent objects in terms of

More information

Lecture 5: Object Detection
