arxiv: v1 [cs.cv] 31 Mar 2016
Object Boundary Guided Semantic Segmentation

Qin Huang, Chunyang Xia, Wenchao Zheng, Yuhang Song, Hao Xu and C.-C. Jay Kuo

University of Southern California

Abstract. Semantic segmentation has been a major topic in computer vision, and has played an important role in understanding object classes as well as object localizations. Recent developments in deep learning, especially in fully convolutional neural networks, have enabled pixel-level labeling for more accurate results. However, most previous works, including FCN, did not take the object boundary into consideration. In fact, since the originally labeled ground truth does not provide a clean object boundary, the labeled contours and background objects have both been ignored as the background class. In this work, we propose an elegant object boundary guided FCN (OBG-FCN) network, which uses the prior knowledge of object boundaries from training to achieve better class accuracy and segmentation details. To this end, we first relabel the object contours, and use the FCN network to specifically learn to locate the objects and their contours. Then we transform the output of this branch to 21 classes and use it as a mask to refine the detailed shapes of the objects. End-to-end learning is then applied to finetune the transforming parameters, which reconsider the combination of object, background and boundary in the final segmentation decision. We apply the proposed method to the PASCAL VOC segmentation benchmark and achieve 87.4% mean IU (15% relative improvement compared to FCN and around 10% improvement compared to CRF-RNN), and our edge model is shown to be stable and accurate even at the accuracy level of FCN-2s.

1 Introduction

Recently, semantic segmentation has played an important role in understanding object classes as well as object localizations[1]. The introduction of fully convolutional networks[2] has brought a large improvement in image semantic segmentation. In fact, there are also a number of recent approaches, including DeepLab[3] and CRF-RNN[4], which achieved good segmentation performance. However, while they use feature representations to make pixel-wise classifications, it is of great importance to take another factor into consideration, i.e. the object boundary, to make the labeling more accurate and natural[5].
Fig. 1. Examples of segmentation results with the first column as input images and the last column as segmentation ground truth. The 2nd to 4th columns are the results from FCN-8s, CRF-RNN and our proposed OBG-FCN.
It is a significant challenge to adapt ConvNets to the pixel-wise classification task. Firstly, the convolution filters and the max-pooling operations of traditional CNNs make the object boundary prediction quite coarse. Furthermore, although recent algorithms, such as FCN[2], have made use of the intermediate convolutional layers to finetune the pixel-wise prediction, they could not make a good prediction on the object boundaries[6]. CRF-RNN[4] formulates mean-field approximate inference that partially improves the prediction on object boundaries, but a number of problems remain, for example, mixing nearby objects together and misclassifying the objects on the boundaries. In most of these cases, the lack of boundary constraints prevents a good prediction at the object boundary.

To deal with this problem, we introduce an object boundary guided FCN network (OBG-FCN), which uses the pre-trained prior knowledge of object boundaries to enhance the performance of semantic segmentation. In this work, we first relabel the object contours based on the PASCAL annotation. Although the ground truth provided by the PASCAL annotation already offers contour information, it also includes some background objects as boundaries, which would mislead our training. Therefore, we relabel the object boundary by shifting the positions of the object regions and derive an accurate ground truth with object proposals and boundaries. Then, we follow the FCN network structure and conduct step-by-step learning from FCN-32s to FCN-2s to train our 3-class OB-FCN segmenter (Object-Boundary-Background). Next, we use the output of the OB-FCN branch as a mask layer, which is transformed from a 3-class output to 21 classes, and conduct an element-wise multiplication with the original FCN-8s network. End-to-end training then follows on the object boundary guided FCN (OBG-FCN) to finetune the network.
The results show a great improvement over the previous state of the art in the mean IU on the PASCAL VOC benchmark, with better class accuracy and object details. The following sections explain our implementation details and introduce our architecture, which combines the information of two distinct network branches to make pixel-wise predictions. In the experiment section, we demonstrate the state-of-the-art results on PASCAL VOC.

2 Object Boundary FCN (OB-FCN) with Re-labeled Boundaries

One of the main ideas is to utilize the boundary information as a guideline for the training stage of semantic segmentation. Researchers in previous works, such as [7,8,2,4], all treated the boundary as background. Therefore, their segmentation results show little relation to the boundary in the ground truth. Subjective comparison of their segmentation results also demonstrates that there are many crossing regions when the boundaries are overlaid on them, which is a good indication that the boundary can be a crucial part of semantic segmentation. The first stage of our research is to make the boundary prediction as precise as possible. Preprocessing the ground truth is one of the key parts of our work.
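The relabeling idea introduced above (shifting the object regions to obtain object, boundary and background labels) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' preprocessing code: the class IDs `BACKGROUND`, `OBJECT`, `BOUNDARY`, the function names, the `width` parameter and the handling of the PASCAL void label (255) as "not object" are all hypothetical choices.

```python
import numpy as np

# Hypothetical IDs for the 3-class ground truth (not taken from the paper).
BACKGROUND, OBJECT, BOUNDARY = 0, 1, 2


def shift(mask, dy, dx):
    """Shift a boolean mask by (dy, dx) pixels; exposed pixels become False."""
    h, w = mask.shape
    py, px = abs(dy), abs(dx)
    padded = np.pad(mask, ((py, py), (px, px)),
                    mode="constant", constant_values=False)
    return padded[py - dy:py - dy + h, px - dx:px - dx + w]


def relabel_three_class(label_map, width=4, void_id=255):
    """Turn a PASCAL-style class label map into object/boundary/background.

    Assumption: every pixel with a class ID other than 0 or `void_id`
    counts as object, and void pixels are treated as non-object.
    """
    obj = (label_map != 0) & (label_map != void_id)
    extended = obj.copy()
    # Move the object region horizontally and vertically by up to `width`.
    for step in range(1, width + 1):
        for dy, dx in ((step, 0), (-step, 0), (0, step), (0, -step)):
            extended |= shift(obj, dy, dx)
    out = np.full(label_map.shape, BACKGROUND, dtype=np.uint8)
    out[extended] = BOUNDARY   # extended area becomes boundary ...
    out[obj] = OBJECT          # ... and the center object region is kept
    return out
```

Because only horizontal and vertical shifts are used in this sketch, diagonal neighbors of an object pixel stay background; a different structuring element would be an equally plausible reading of the text.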
By dividing the ground truth into three classes (objects, boundaries, and background), we created our own ground truth, and followed the network structure of FCN to learn the corresponding features and finetune the network. In fact, the ground truth class labels for semantic segmentation already include an object boundary label; however, it is sometimes confused with the background objects. Therefore, we relabel the object boundaries by moving the objects horizontally and vertically so that we can extend the object area, and we then keep the center object region as it is. In this way, we get a clear edge between objects and background and between different objects.

Fig. 2. Examples of re-labeled object boundary for an image in PASCAL VOC. Panels: Sample Image, Original Ground Truth, Relabeled Object Boundary.

An example is shown in Fig. 2, where the original ground truth has some background objects labeled with the same class as the object boundaries. In contrast, our relabeled ground truth keeps the exact information of objects and accurate boundary information. Since we are working at FCN-4s in the current stage of the OB-FCN network, we set the maximum boundary width to 4, which is the pixel-wise accuracy interval. Currently we are working on combining the result with OB-FCN-2s, and we expect to have even better results from it.

Fig. 3. Flow chart of OB-FCN network structure.

Previously, in the work of FCN, it was observed that the accuracy level can only reach up to combining pool3, while further combining with pool2 or pool1 will confuse the segmenter. However, by making the object boundary FCN (OB-FCN) branch learn only 3 classes (object, boundary, background), we are able to achieve the detail level of FCN-4s and FCN-2s without confusing the network with small-scale information. The flow chart of the OB-FCN network is shown in Fig. 3, where our final model consists of all pooling information.

Fig. 4. Examples of step-by-step boundary learning results, with the first column as input images and the last column as the labeled boundary. The 2nd to 5th columns are the results from OB-FCN-32s, OB-FCN-16s, OB-FCN-8s and OB-FCN-4s.

A step-by-step boundary learning result is shown in Fig. 4, where the evolution of each step shows finer details of the object and its boundaries. Serving as important prior knowledge of object proposals, our work shows that a much more precise semantic segmentation can be achieved even with the help of an FCN-4s OB-FCN branch. We will further evaluate the object matching area with the ground truth in future experiments.

3 Object Boundary Guided FCN (OBG-FCN) for Semantic Segmentation

Now that we have a precise model with the object information, it is important to combine it with the class information derived from the original FCN-8s. As mentioned in [9], a masking method is adopted by applying the output of one branch to the other branch. We followed this method and tried to combine the object information and class information by using the output of OB-FCN as a mask. We first followed the method by using shared layers from Conv-5 and even going down to Conv-3; however, the results were not satisfying. This is most likely because, when looking for boundary information, the shallow layers of FCN and OB-FCN tend to differ from each other. Therefore, we decided to use the two pre-trained branches completely separately.
The system network flow is shown in Fig. 5, where we feed the data to two different branches and design a masking layer to combine their outputs.
Fig. 5. End-to-end two-branch network of OBG-FCN.

Here, we use an element-wise product to apply the masking. However, since this operation requires two bottom layers with exactly the same dimensions, we need to transform the 3-class output of OB-FCN to 21 classes. Therefore, we apply a convolution layer between the element-wise product and the output of OB-FCN, which takes a 3-channel input and outputs a 21-class masking map.

Fig. 6. Demonstration of the convolutional transform layer to map the 3-class output of OB-FCN to 21 classes.

One crucial issue here is how to initialize the transform layer. Since we want most of the object area to be highlighted and combined with the original FCN, and want the background and detected boundaries not to be confused with the object, we do not randomly initialize the layer, but set the parameters as

$$
\omega(k, m, 1, 1) =
\begin{cases}
1, & \text{if } k = C(\text{background}),\ m \neq C(\text{object}), \\
1, & \text{if } k = C(\text{object}),\ m = C(\text{object}), \\
0, & \text{otherwise},
\end{cases}
\tag{1}
$$

where ω denotes the parameters of the transforming convolution layer, k is the first parameter, corresponding to the output depth, and m is the corresponding input channel of OB-FCN's result, while C represents the class ID of background (0) and objects (1-20). As a result, we apply the pre-trained object area directly onto the original FCN-8s, and derive a primitive masking layer as shown in the first two columns of Fig. 7. The corresponding combined layer output is shown in the third column, with the segmentation result in the last column. As shown in the result, our masking layer did a good job in highlighting the object area, whose segmentation boundary is already more accurate than that of FCN-8s itself.

Fig. 7. Results of initialization of convolutional transform layer. Panels: 3-Class Output, 21-Class Initialization, Combined Initialization, Initialized Result.

4 End-to-End Object Boundary Guided FCN (OBG-FCN) Training

Based on the proposed model, we then conduct end-to-end training to refine the network. Our current results show that enabling back-propagation to both networks would significantly influence the pretrained features. In fact, the constraint of the original FCN-8s still exists here: even if we fix the OB-FCN branch, the back-propagated gradients result in scattered segmentation results, which shows that the FCN looks for too many detail patterns. Therefore, we currently keep the two branches fixed and conduct the finetuning on the masking layer with a large step size.

Fig. 8. Results of end-to-end training of OBG-FCN, compared with FCN, on the test image of Fig. 4. Panels: FCN-8s Layer Output, FCN-8s Result, OBG-FCN Layer Output, OBG Result.
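The transform layer and the element-wise masking of Section 3 can be illustrated with a small NumPy sketch. It rests on several assumptions that are not stated in the paper: the first case of Eq. (1) is read as "m is not the object channel", the OB-FCN output channels are ordered background, object, boundary, both branches emit per-pixel scores of shape (channels, H, W), and the names `init_transform_weights` and `apply_mask` are hypothetical.

```python
import numpy as np

NUM_CLASSES = 21  # background (0) plus object classes 1-20, as in the text
# Assumed channel order of the 3-class OB-FCN output (hypothetical).
OB_BACKGROUND, OB_OBJECT, OB_BOUNDARY = 0, 1, 2


def init_transform_weights():
    """1x1 transform weights of shape (21 outputs, 3 inputs), per Eq. (1)."""
    w = np.zeros((NUM_CLASSES, 3), dtype=np.float32)
    # Background output (k = 0): driven by every non-object input channel.
    w[0, OB_BACKGROUND] = 1.0
    w[0, OB_BOUNDARY] = 1.0
    # Object outputs (k = 1..20): driven by the object input channel only.
    w[1:, OB_OBJECT] = 1.0
    return w


def apply_mask(fcn_scores, ob_scores, w):
    """Mask 21-class FCN scores with the transformed 3-class OB-FCN output.

    fcn_scores: array of shape (21, H, W); ob_scores: array of shape (3, H, W).
    """
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    mask = np.einsum("km,mhw->khw", w, ob_scores)
    # Element-wise product needs both inputs to have the same shape.
    return fcn_scores * mask
```

In this reading, only `w` would be updated during the finetuning described in Section 4, while the two branches that produce `fcn_scores` and `ob_scores` stay fixed.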
The end-to-end training results are shown in Fig. 8, where the masking layer for each class now has a specific weighting that combines the background, object and boundary information, and the segmentation results now look even finer. Currently we are working on enabling global finetuning of the feature layers for better results.

5 Experiment Results

In this section, we evaluate the proposed OBG-FCN method on the PASCAL VOC dataset, and compare it with the previous state-of-the-art FCN [2] and CRF-RNN [4] using their newest available models. We currently use only the 1112 training images from the PASCAL VOC 2011 segmentation dataset to train our OB-FCN branch and finetune the OBG-FCN network.

We first evaluate on the PASCAL VOC 2011 dataset. Since the models trained in [2] and [4] both use the training images in PASCAL VOC 2011 and the extra data in [10], there is some overlap between the validation set and the extra data. However, we first present this result as an indication of our improved performance. We will later derive a more solid evaluation on a non-overlapping validation dataset, as well as submit it to the PASCAL challenge server. As shown in Table 1, we present four different evaluations on the validation set. The initialized network without further finetuning already reaches state-of-the-art performance, and the final result of our proposed OBG-FCN network outperforms the other methods significantly.

Table 1. Comparison of semantic segmentation on complete PASCAL VOC 2011 dataset.
Columns: pixel accuracy, mean accuracy, mean IU, f.w. IU. Rows: FCN-8s, CRF-as-RNN, OBG-FCN (initialization), OBG-FCN.

We then follow the steps of [4] and derive a reduced subset of the VOC 2012 validation data with 346 images by removing images that overlap with the training set. The results are shown in Table 2, and the initialized OBG-FCN already outperforms FCN-8s and CRF-RNN, while the final result shows a further improvement in accuracy and mean IU. In Fig. 1, we present several sets of segmentation results. The first six sets of examples are referred to as general or failure cases according to [4], and the rest are examples of typical good-quality results of the previous methods. As shown in the results, our method manages to achieve finer details of object boundaries even where CRF-RNN already did a good job. As for the confusing classes, as well as the occlusion problem, the proposed OBG-FCN can significantly improve the class accuracy and object completeness.

Table 2. Comparison of semantic segmentation on reduced PASCAL VOC 2012 validation set.
Columns: pixel accuracy, mean accuracy, mean IU, f.w. IU. Rows: FCN-8s, CRF-as-RNN, OBG-FCN (initialization), OBG-FCN.

References

1. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. (2014)
2. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2015)
3. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint (2014)
4. Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.H.: Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE International Conference on Computer Vision. (2015)
5. Dai, J., He, K., Sun, J.: BoxSup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. arXiv preprint (2015)
6. Bertasius, G., Shi, J., Torresani, L.: Semantic segmentation with boundary neural fields. arXiv preprint (2015)
7. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. (2014)
8. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision. (2015)
9. Dai, J., He, K., Sun, J.: Instance-aware semantic segmentation via multi-task network cascades. arXiv preprint (2015)
10. Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Simultaneous detection and segmentation. In: Computer Vision - ECCV 2014. Springer (2014)
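As a supplementary note on the four measures listed in Tables 1 and 2 (pixel accuracy, mean accuracy, mean IU and frequency-weighted IU), the sketch below shows how they are commonly computed from a class confusion matrix. It assumes the definitions used in the FCN evaluation [2] and that pixels with the PASCAL void label 255 are skipped; the function names are hypothetical and this is not the authors' evaluation script.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes, ignore_id=255):
    """n[i, j] = number of pixels of ground-truth class i predicted as class j."""
    keep = gt != ignore_id
    idx = gt[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_scores(n):
    """Compute the four measures from a confusion matrix n."""
    n = n.astype(np.float64)
    t = n.sum(axis=1)             # pixels per ground-truth class
    p = n.sum(axis=0)             # pixels per predicted class
    diag = np.diag(n)             # correctly labeled pixels per class
    present = t > 0               # classes that occur in the ground truth
    union = t + p - diag
    valid = union > 0             # classes with a defined IU
    iu = diag[valid] / union[valid]
    return {
        "pixel_accuracy": diag.sum() / t.sum(),
        "mean_accuracy": (diag[present] / t[present]).mean(),
        "mean_iu": iu.mean(),
        "fw_iu": (t[valid] * iu).sum() / t.sum(),
    }
```

Accumulating `confusion_matrix` over all validation images before calling `segmentation_scores` gives dataset-level numbers rather than per-image averages.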
More informationDepth Estimation from a Single Image Using a Deep Neural Network Milestone Report
Figure 1: The architecture of the convolutional network. Input: a single view image; Output: a depth map. 3 Related Work In [4] they used depth maps of indoor scenes produced by a Microsoft Kinect to successfully
More informationLearning to Segment Instances in Videos with Spatial Propagation Network
The 2017 DAVIS Challenge on Video Object Segmentation - CVPR 2017 Workshops Learning to Segment Instances in Videos with Spatial Propagation Network Jingchun Cheng 1,2 Sifei Liu 2 Yi-Hsuan Tsai 2 Wei-Chih
More informationRSRN: Rich Side-output Residual Network for Medial Axis Detection
RSRN: Rich Side-output Residual Network for Medial Axis Detection Chang Liu, Wei Ke, Jianbin Jiao, and Qixiang Ye University of Chinese Academy of Sciences, Beijing, China {liuchang615, kewei11}@mails.ucas.ac.cn,
More informationPredicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus
Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus Presented by: Rex Ying and Charles Qi Input: A Single RGB Image Estimate
More informationFine-tuning Pre-trained Large Scaled ImageNet model on smaller dataset for Detection task
Fine-tuning Pre-trained Large Scaled ImageNet model on smaller dataset for Detection task Kyunghee Kim Stanford University 353 Serra Mall Stanford, CA 94305 kyunghee.kim@stanford.edu Abstract We use a
More informationarxiv: v2 [cs.cv] 23 Nov 2016 Abstract
Simple Does It: Weakly Supervised Instance and Semantic Segmentation Anna Khoreva 1 Rodrigo Benenson 1 Jan Hosang 1 Matthias Hein 2 Bernt Schiele 1 1 Max Planck Institute for Informatics, Saarbrücken,
More informationBoundary-aware Fully Convolutional Network for Brain Tumor Segmentation
Boundary-aware Fully Convolutional Network for Brain Tumor Segmentation Haocheng Shen, Ruixuan Wang, Jianguo Zhang, and Stephen J. McKenna Computing, School of Science and Engineering, University of Dundee,
More informationPerceiving the 3D World from Images and Videos. Yu Xiang Postdoctoral Researcher University of Washington
Perceiving the 3D World from Images and Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World 3 Understand
More informationMartian lava field, NASA, Wikipedia
Martian lava field, NASA, Wikipedia Old Man of the Mountain, Franconia, New Hampshire Pareidolia http://smrt.ccel.ca/203/2/6/pareidolia/ Reddit for more : ) https://www.reddit.com/r/pareidolia/top/ Pareidolia
More informationPARTIAL STYLE TRANSFER USING WEAKLY SUPERVISED SEMANTIC SEGMENTATION. Shin Matsuo Wataru Shimoda Keiji Yanai
PARTIAL STYLE TRANSFER USING WEAKLY SUPERVISED SEMANTIC SEGMENTATION Shin Matsuo Wataru Shimoda Keiji Yanai Department of Informatics, The University of Electro-Communications, Tokyo 1-5-1 Chofugaoka,
More informationSemantic Soft Segmentation Supplementary Material
Semantic Soft Segmentation Supplementary Material YAĞIZ AKSOY, MIT CSAIL and ETH Zürich TAE-HYUN OH, MIT CSAIL SYLVAIN PARIS, Adobe Research MARC POLLEFEYS, ETH Zürich and Microsoft WOJCIECH MATUSIK, MIT
More informationarxiv: v1 [cs.cv] 22 Nov 2017
W-Net: A Deep Model for Fully Unsupervised Image Segmentation Xide Xia Boston University xidexia@bu.edu Brian Kulis Boston University bkulis@bu.edu arxiv:1711.08506v1 [cs.cv] 22 Nov 2017 Abstract While
More informationPixelwise Instance Segmentation with a Dynamically Instantiated Network
Pixelwise Instance Segmentation with a Dynamically Instantiated Network Anurag Arnab and Philip H.S Torr University of Oxford {anurag.arnab, philip.torr}@eng.ox.ac.uk Abstract Semantic segmentation and
More informationLaplacian Pyramid Reconstruction and Refinement for Semantic Segmentation
Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation Golnaz Ghiasi (B) and Charless C. Fowlkes Department of Computer Science, University of California, Irvine, USA {gghiasi,fowlkes}@ics.uci.edu
More informationIterative Multi-domain Regularized Deep Learning for Anatomical Structure Detection and Segmentation from Ultrasound Images
Iterative Multidomain Regularized Deep Learning for Anatomical Structure Detection and Segmentation from Ultrasound Images Hao Chen 1,2, Yefeng Zheng 2, JinHyeong Park 2, PhengAnn Heng 1, and S. Kevin
More informationSemi Supervised Semantic Segmentation Using Generative Adversarial Network
Semi Supervised Semantic Segmentation Using Generative Adversarial Network Nasim Souly Concetto Spampinato Mubarak Shah nsouly@eecs.ucf.edu cspampin@dieei.unict.it shah@crcv.ucf.edu Abstract Unlabeled
More informationJoint Calibration for Semantic Segmentation
CAESAR ET AL.: JOINT CALIBRATION FOR SEMANTIC SEGMENTATION 1 Joint Calibration for Semantic Segmentation Holger Caesar holger.caesar@ed.ac.uk Jasper Uijlings jrr.uijlings@ed.ac.uk Vittorio Ferrari vittorio.ferrari@ed.ac.uk
More informationarxiv: v4 [cs.cv] 12 Aug 2015
CAESAR ET AL.: JOINT CALIBRATION FOR SEMANTIC SEGMENTATION 1 arxiv:1507.01581v4 [cs.cv] 12 Aug 2015 Joint Calibration for Semantic Segmentation Holger Caesar holger.caesar@ed.ac.uk Jasper Uijlings jrr.uijlings@ed.ac.uk
More informationarxiv: v3 [cs.cv] 8 May 2017
Convolutional Random Walk Networks for Semantic Image Segmentation Gedas Bertasius 1, Lorenzo Torresani 2, Stella X. Yu 3, Jianbo Shi 1 1 University of Pennsylvania, 2 Dartmouth College, 3 UC Berkeley
More information3D Object Recognition and Scene Understanding from RGB-D Videos. Yu Xiang Postdoctoral Researcher University of Washington
3D Object Recognition and Scene Understanding from RGB-D Videos Yu Xiang Postdoctoral Researcher University of Washington 1 2 Act in the 3D World Sensing & Understanding Acting Intelligent System 3D World
More informationPerson Part Segmentation based on Weak Supervision
JIANG, CHI: PERSON PART SEGMENTATION BASED ON WEAK SUPERVISION 1 Person Part Segmentation based on Weak Supervision Yalong Jiang 1 yalong.jiang@connect.polyu.hk Zheru Chi 1 chi.zheru@polyu.edu.hk 1 Department
More informationarxiv: v1 [cs.cv] 24 Nov 2016
Recalling Holistic Information for Semantic Segmentation arxiv:1611.08061v1 [cs.cv] 24 Nov 2016 Hexiang Hu UCLA Los Angeles, CA hexiang.frank.hu@gmail.com Abstract Fei Sha UCLA Los Angeles, CA feisha@cs.ucla.edu
More informationFinal Report: Smart Trash Net: Waste Localization and Classification
Final Report: Smart Trash Net: Waste Localization and Classification Oluwasanya Awe oawe@stanford.edu Robel Mengistu robel@stanford.edu December 15, 2017 Vikram Sreedhar vsreed@stanford.edu Abstract Given
More informationWebly Supervised Semantic Segmentation
Webly Supervised Semantic Segmentation Bin Jin IC, EPFL bin.jin@epfl.ch Maria V. Ortiz Segovia Océ Print Logic Technologies Maria.Ortiz@oce.com Sabine Süsstrunk IC, EPFL sabine.susstrunk@epfl.ch Abstract
More informationYiqi Yan. May 10, 2017
Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field
More informationarxiv: v2 [cs.cv] 10 Apr 2017
Fully Convolutional Instance-aware Semantic Segmentation Yi Li 1,2 Haozhi Qi 2 Jifeng Dai 2 Xiangyang Ji 1 Yichen Wei 2 1 Tsinghua University 2 Microsoft Research Asia {liyi14,xyji}@tsinghua.edu.cn, {v-haoq,jifdai,yichenw}@microsoft.com
More informationReal-time Object Detection CS 229 Course Project
Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection
More informationLearning to Segment Human by Watching YouTube
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. X, X 20XX 1 Learning to Segment Human by Watching YouTube Xiaodan Liang, Yunchao Wei, Yunpeng Chen, Xiaohui Shen, Jianchao Yang,
More informationarxiv: v1 [cs.cv] 29 Sep 2016
arxiv:1609.09545v1 [cs.cv] 29 Sep 2016 Two-stage Convolutional Part Heatmap Regression for the 1st 3D Face Alignment in the Wild (3DFAW) Challenge Adrian Bulat and Georgios Tzimiropoulos Computer Vision
More informationSpatial Localization and Detection. Lecture 8-1
Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday
More informationMulti-Glance Attention Models For Image Classification
Multi-Glance Attention Models For Image Classification Chinmay Duvedi Stanford University Stanford, CA cduvedi@stanford.edu Pararth Shah Stanford University Stanford, CA pararth@stanford.edu Abstract We
More informationDeep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks
Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin
More informationKnow your data - many types of networks
Architectures Know your data - many types of networks Fixed length representation Variable length representation Online video sequences, or samples of different sizes Images Specific architectures for
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren Kaiming He Ross Girshick Jian Sun Present by: Yixin Yang Mingdong Wang 1 Object Detection 2 1 Applications Basic
More informationarxiv: v2 [cs.cv] 8 Apr 2018
Single-Shot Object Detection with Enriched Semantics Zhishuai Zhang 1 Siyuan Qiao 1 Cihang Xie 1 Wei Shen 1,2 Bo Wang 3 Alan L. Yuille 1 Johns Hopkins University 1 Shanghai University 2 Hikvision Research
More informationDataset Augmentation with Synthetic Images Improves Semantic Segmentation
Dataset Augmentation with Synthetic Images Improves Semantic Segmentation P. S. Rajpura IIT Gandhinagar param.rajpura@iitgn.ac.in M. Goyal IIT Varanasi manik.goyal.cse15@iitbhu.ac.in H. Bojinov Innit Inc.
More informationarxiv: v2 [cs.cv] 29 Nov 2016 Abstract
Object Detection Free Instance Segmentation With Labeling Transformations Long Jin 1, Zeyu Chen 1, Zhuowen Tu 2,1 1 Dept. of CSE and 2 Dept. of CogSci, University of California, San Diego 9500 Gilman Drive,
More informationComputer Vision Lecture 16
Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period
More informationFlow-Based Video Recognition
Flow-Based Video Recognition Jifeng Dai Visual Computing Group, Microsoft Research Asia Joint work with Xizhou Zhu*, Yuwen Xiong*, Yujie Wang*, Lu Yuan and Yichen Wei (* interns) Talk pipeline Introduction