arxiv: v3 [cs.cv] 2 Jun 2017
Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for the Diagnosis of Skin Lesions

Iván González-Díaz
Department of Signal Theory and Communications
Universidad Carlos III de Madrid
Leganés, 28911, Spain
igonzalez@tsc.uc3m.es

Abstract

This report describes our submission to the ISIC 2017 Challenge in Skin Lesion Analysis Towards Melanoma Detection. We participated in Part 3: Lesion Classification, with a system for the automatic diagnosis of nevus, melanoma and seborrheic keratosis. Our approach aims to incorporate the expert knowledge of dermatologists into the well-known framework of Convolutional Neural Networks (CNNs), which have shown impressive performance in many visual recognition tasks. In particular, we have designed several networks providing lesion area identification, lesion segmentation into structural patterns, and final diagnosis of clinical cases. Furthermore, novel blocks for CNNs have been designed to integrate this information into the diagnosis processing pipeline.

Figure 1: Main processing pipeline of our Automatic Diagnosis System

1 General description of the system

The main pipeline of our system is depicted in Fig. 1. It comprises the following steps:

1. For each clinical case $c$, a dermoscopic image $X_c$ feeds a Lesion Segmentation Network that generates a binary mask $M_c$ outlining the area of the image that corresponds to the lesion. This module is described in section 2.

2. Each clinical case $c$, now defined by the image-mask pair $\{X_c, M_c\}$, goes through the Data Augmentation Module. This module aims to extend the initial visual support of the lesion by generating new views $v$ corresponding to different rotations and
cropped areas. Hence, the output of this module is an extended set of images $X_c^v$ related to the lesion. Section 3 provides a detailed description of this data augmentation process.

3. The next step is the Structure Segmentation Network. It segments each view of the lesion $X_c^v$ into a set of eight global and local structures that have proven very important for dermatologists in their daily diagnosis. Examples of these structures are dots/globules, regression areas and streaks. Hence, the output of this system is a set of 8 segmentation maps $S_c^{vs}$, $s = 1, \ldots, 8$, each one associated with a particular structure $s$ of interest. This module is introduced in section 4.

4. Finally, the augmented set $\{X_c^v, S_c^{vs}\}$ is passed to the Diagnosis Network, which is in charge of providing the final diagnosis $Y_c$ for the clinical case. This network is described in section 5.

2 Lesion Segmentation Network

The Lesion Segmentation Network has been developed by training a Fully Convolutional Network (FCN) [Shelhamer et al., 2016]. FCNs have achieved state-of-the-art results on general-content semantic image segmentation, as demonstrated in the PASCAL VOC Segmentation challenge [Everingham et al., 2015]. To train a network for our particular task of lesion/skin segmentation, we have used the training set of the lesion segmentation task in the 2017 ISBI challenge. Note that the goal of this module is not to generate very accurate segmentation maps of a lesion, but to broadly identify the area of the image that corresponds to the lesion, yielding a binary map $M_c$ for each clinical case.

Figure 2: Example of a rotated and cropped view of a lesion and its Normalized Polar Coordinates.
(Left) View of the lesion. (Middle) Normalized ratio. (Right) Angle.

3 Data Augmentation Module and Normalized Polar Coordinates

It is well known that data augmentation notably boosts the performance of deep neural networks, especially when the amount of training data is limited. Among all the potential image variations and artifacts, invariance to orientation is probably the main requirement of our method, as dermatologists do not follow a specific protocol when capturing a lesion. More complex geometric transformations, such as affine or projective transforms, are less interesting here, as the dermatoscope is normally placed just over, and orthogonal to, the lesion surface. The data augmentation process is as follows:

1. First, starting from the pair $\{X_c, M_c\}$, we generate a set of rotated versions.

2. As rotating an image without losing any visual information requires incorporating new areas that were not present in the original view, we find and crop the largest inner rectangle ensuring that all pixels belong to the original image.

3. Finally, as our subsequent CNNs (Structure Segmentation and Diagnosis) require square input images of 256x256 pixels, we perform various square crops, which are in turn resized to the required dimensions.

Considering the aforementioned rotations and crops, for each clinical case $c$ we generate an augmented set of 24 images, represented by a tensor $X_c^v$, with $v = 1, \ldots, 24$.

In addition, for each generated view $X_c^v$, we compute the Normalized Polar Coordinates from the lesion mask. The goal of these alternative coordinates is to support subsequent processing blocks by providing invariance against shifts, rotations, changes in size and even irregular shapes of the lesions. To do so, we transform pixel Cartesian coordinates $(x_i, y_i)$ into normalized polar coordinates $(\rho_i, \theta_i)$, where $\rho_i \in [0, 1]$ and $\theta_i \in [0, 2\pi)$ stand for the normalized ratio (radius) and angle, respectively.
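To make the definition of $(\rho_i, \theta_i)$ concrete, the following is a minimal NumPy sketch of one way such coordinates could be obtained from a binary mask. It assumes the mask is normalized through the ellipse that shares its second-order moments, which is the procedure this section describes; the function name `normalized_polar_coords`, the `clip` option and the synthetic elliptical mask are illustrative assumptions, not the author's released code. The sketch also assumes a non-degenerate mask (the lesion is not a single line of pixels).

```python
import numpy as np

def normalized_polar_coords(mask, clip=True):
    """Return (rho, theta) maps for a binary lesion mask (hypothetical helper)."""
    mask = np.asarray(mask, dtype=bool)
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    mu = pts.mean(axis=0)                       # lesion centroid
    cov = np.cov((pts - mu).T, bias=True)       # second-order central moments (2x2)

    # A filled ellipse with semi-axes (a, b) has covariance eigenvalues a^2/4 and b^2/4,
    # so whitening the coordinates and scaling by 1/2 maps that ellipse to the unit circle.
    evals, evecs = np.linalg.eigh(cov)
    affine = evecs @ np.diag(0.5 / np.sqrt(evals)) @ evecs.T

    h, w = mask.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    centred = np.stack([gx, gy], axis=-1).astype(float) - mu
    uv = centred @ affine.T                     # coordinates in the normalized frame

    rho = np.hypot(uv[..., 0], uv[..., 1])
    theta = np.mod(np.arctan2(uv[..., 1], uv[..., 0]), 2 * np.pi)
    if clip:
        rho = np.clip(rho, 0.0, 1.0)            # assumption: saturate pixels outside the ellipse
    return rho, theta

# Synthetic elliptical "lesion" with semi-axes 30 (horizontal) and 15 (vertical).
yy, xx = np.mgrid[0:101, 0:101]
demo_mask = ((xx - 50) / 30.0) ** 2 + ((yy - 50) / 15.0) ** 2 <= 1.0
demo_rho, demo_theta = normalized_polar_coords(demo_mask, clip=False)
```

With the unclipped output, pixels on the border of the synthetic ellipse receive a radius close to 1 regardless of which axis they lie on, which is the size and shape invariance the text aims for.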
The process to compute this transformation is as follows: first, the mask of the lesion is approximated by an ellipse with the same second-order moments. Then, we learn the affine matrix that transforms the ellipse into a normalized (unit-ratio) circle centered at (0, 0). Figure 2 shows an example of a rotated and cropped view of a lesion and its corresponding normalized polar coordinates.

4 Structure Segmentation Network

Given an input view of the lesion $X_c^v$, the goal of this module is to provide a corresponding segmentation into a predefined set of textural patterns and local structures that are of special interest for dermatologists in their diagnosis. In particular, we have considered a set of eight structures: 1) dots, globules and cobblestone pattern, 2) reticular patterns and pigmented networks, 3) homogeneous areas, 4) regression areas, 5) blue-white veil, 6) streaks, 7) vascular structures and 8) unspecific patterns.

The main challenge in developing this module is the generation of a strongly labeled training dataset, in which each image has an associated ground-truth pixel-wise segmentation. This kind of annotation is often hard to obtain, as it requires a huge effort from dermatologists to manually outline the segmentations. In contrast, providing weak image-level labels that indicate only which structural patterns are present in each lesion is much easier for dermatologists and therefore more realistic. Hence, following this latter approach, we asked dermatologists of a collaborating medical institution, the Hospital Doce de Octubre in Madrid, to annotate the ISIC 2016 training dataset with the presence or absence of the 8 considered structures. In particular, we asked them to provide one label for each structure: 0 if the structure is not present, 1 if it is locally present, and 2 if it is present and large enough to be considered a global pattern in the lesion.
Given this weakly annotated dataset, we have built our approach on the work of [Pathak et al., 2015], where the authors introduced a novel constrained optimization for weakly labeled segmentation using CNNs. The output of this network is a reduced version of the input image (64x64 in our case) where, for each pixel location $i$, a softmax transforms the network outputs $f_i(x_i; \theta)$ for the structure label $x_i$ into probabilities as follows:

$$p_i(x_i \mid \theta) = \frac{1}{Z_i} \exp\left(f_i(x_i; \theta)\right) \qquad (1)$$

where $\theta$ represents the parameters of the CNN, and $Z_i = \sum_{s=1}^{8} \exp\left(f_i(s; \theta)\right)$ is the partition function at location $i$. The presence or absence of a class, as well as an estimate of its size in the image, lead to particular constraints over the probability $P_s = \sum_i p_i(s \mid \theta)$ accumulated over all pixel locations in the segmentation map:

- If a structure $s$ is not present in an image, the constraint acts as an upper bound on the accumulated probability $P_s$, which has to be nearly zero.
- If a structure $s$ is local in an image, we impose a lower and an upper bound on $P_s$ to control the total area of the structure in the lesion.
- If a structure $s$ is global in an image, we impose a lower bound on $P_s$ to ensure a minimum area corresponding to the structure.

To adapt this approach to our particular scenario, we have made the following modifications to the original approach:

We observed that using a simple softmax function led to situations in which many constraints over local structures were satisfied by assigning some residual probability to every location in the segmentation map. From our point of view, this is undesirable behavior, as one would rather expect a small set of pixels showing large probabilities of belonging to the structure of interest. To overcome this limitation, we have used a parametric softmax:

$$p_i(x_i \mid \gamma, \theta) = \frac{1}{Z_i} \exp\left(\gamma f_i(x_i; \theta)\right)$$
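The effect of the scaling factor in this parametric softmax can be checked numerically. The sketch below is only an illustration on synthetic scores, not the trained network: it assumes $\gamma$ multiplies the network outputs before exponentiation, uses random values in place of $f_i(s; \theta)$, and the helper names `parametric_softmax` and `accumulated_probability` are hypothetical. It compares the plain softmax ($\gamma = 1$) with $\gamma = 20$, the value used in this work, and computes the accumulated probabilities $P_s$ on which the constraints act.

```python
import numpy as np

def parametric_softmax(scores, gamma=1.0):
    """Softmax over the last axis with the scores scaled by gamma."""
    z = gamma * np.asarray(scores, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)     # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def accumulated_probability(probs):
    """P_s: probability of structure s summed over all pixel locations."""
    return probs.reshape(-1, probs.shape[-1]).sum(axis=0)

rng = np.random.default_rng(0)
scores = rng.normal(size=(64, 64, 8))         # synthetic stand-in for f_i(s; theta)

p_plain = parametric_softmax(scores, gamma=1.0)
p_sharp = parametric_softmax(scores, gamma=20.0)

P_plain = accumulated_probability(p_plain)    # one value per structure
P_sharp = accumulated_probability(p_sharp)

# Average confidence of the winning structure at each location.
peak_plain = p_plain.max(axis=-1).mean()
peak_sharp = p_sharp.max(axis=-1).mean()
```

On these synthetic scores, `peak_sharp` is larger than `peak_plain`: with the larger scaling factor each location concentrates its probability mass on few structures, so a bound on $P_s$ can no longer be met by spreading small residual probabilities over the whole map.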
The parameter $\gamma$ acts as a soft approximation to the max function: large values lead to scenarios in which each location shows high probability for just a very reduced set of structures. In our case, we have used a value of $\gamma = 20$.

We added a new constraint that helps to learn structures that appear in specific spatial locations of the lesion: e.g., streaks tend to appear at the borders of a lesion. To that end, we accumulate probabilities $P_s$ only in those locations that will likely contain the intended structure. At
this point, we have defined these areas of interest over the Normalized Polar Coordinates described in section 3, which are better suited than the original Cartesian coordinates.

We have implemented this module by taking the well-known vgg-vdd [Simonyan and Zisserman, 2014] (the same network used as initialization for the lesion segmentation module), removing the top layers, and training it on the ISIC 2016 training dataset with the described constrained optimization for weak annotations [Pathak et al., 2015]. The output of this module is, for each view $v$ of a clinical case $c$, a tensor $S_c^v$ that contains the 8 probability maps of the considered structures.

Figure 3: Processing pipeline of the Diagnosis Network

5 Diagnosis Network

The Diagnosis Network gathers the information from the previous modules to generate a diagnosis for each clinical case. As in the previous modules, our approach takes a well-known CNN as a starting point and modifies the top layers to better adapt to our problem. The network chosen as a basis is resnet-50 [He et al., 2015], which uses residual layers to avoid the degradation problem that arises when more and more layers are stacked in the network. When applied to our 256x256 images, the last convolutional block (conv_5x) of this network produces a tensor $T_c$, which hopefully behaves as a detector of high-level concepts (objects in ImageNet, the dataset for which it was originally designed). In the original work, an average pooling layer transformed this tensor into a single value per channel and image, which was followed by a fully connected layer and a softmax to generate the final probabilities of the image containing the classes being detected. Hence, the goal of the average pooling was to fuse detections at various locations of the input image and generate a unified score for each high-level concept.
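The original head described above (average pooling, fully connected layer, softmax) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the 8x8 spatial size follows from the standard ResNet-50 downsampling factor of 32 applied to a 256x256 input, the 3 outputs stand for the three diagnoses of the challenge, and the feature tensor and weights are random placeholders rather than trained values. The name `baseline_head` is hypothetical.

```python
import numpy as np

def baseline_head(features, weights, bias):
    """Global average pooling -> fully connected layer -> softmax."""
    pooled = features.mean(axis=(0, 1))        # one value per channel
    logits = weights @ pooled + bias
    logits = logits - logits.max()             # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

rng = np.random.default_rng(0)
T = rng.normal(size=(8, 8, 2048))              # assumed conv_5x output (H, W, channels)
W = rng.normal(scale=0.01, size=(3, 2048))     # placeholder weights for 3 classes
b = np.zeros(3)

probs = baseline_head(T, W, b)

# Shifting the feature map spatially does not change the pooled vector,
# so the head discards where in the image a concept was detected.
probs_shifted = baseline_head(np.roll(T, 3, axis=0), W, b)
```

Because the average is taken over all locations, `probs` and `probs_shifted` coincide up to rounding: this head keeps no spatial layout, which is the property the pooling variants introduced next are designed to change.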
In our approach, however, we have modified the top layers of the network, yielding the structure presented in Figure 3. Essentially, we split the top fully connected layer providing the lesion diagnosis into three arms: a) the original arm, with an average pooling followed by a fully connected layer (FC1); b) a second arm that performs a normalized polar pooling (3x6 rings by angles) followed by a fully connected layer (FC2); and c) a third arm that estimates the asymmetry of the lesion from the previous polar pooling and then applies a fully connected layer (FC3). The results of the three arms are then linearly combined using a Sum block. We next describe the novel blocks required by this new structure, which have been specifically developed in this work:

1. Modulation block: The goal of this block is to take advantage of the previous segmentations of the lesion into global and local structures, which are of great interest for dermatologists in their daily diagnosis. To do so, this block fuses the previous structure segmentation maps $S_c^v$ with the filter outputs of the conv_5x layer in resnet-50. In particular, we modulate the outputs of the layer (2048 channels in our case) using the probabilities of the 8 local and global structures described in section 4. By concatenating the resulting modulations with the original set of outputs, we finally generate a set of channels that is 9 times the original one (18432 in our case).

2. Polar Pooling: This block performs pooling operations over data (average or max pooling) but, rather than using rectangular spatial regions, it employs sectors defined in polar coordinates. Hence, this block is defined for a given number of radial rings $R$ (radius
ranging from 0 to 1) and angular sectors $A$ (angles ranging between 0 and $2\pi$), producing an output of size $R \times A \times$ channels. Furthermore, to adapt to the irregular shapes of the lesions, we use the normalized polar coordinates described in section 3. Since, depending on the shape of the lesion and the size of the tensor being pooled, some combinations $(r, a)$ may not contain pixels in the image, we can also define overlaps between adjacent radii and angles to regularize the outputs. In addition, the division of the lesion into rings is non-uniform and ensures that every ring contains the same number of pixels for a perfectly circular lesion.

3. Asymmetry: This block computes metrics that evaluate the asymmetry of a lesion for a given angle. In particular, given a polar division of the lesion into $R \times A$ sectors, we compute the asymmetry for $A/2$ angles by folding the lesion over each angle and computing the accumulated absolute difference between corresponding sectors.

As shown in Figure 3, we combine these modules to generate a final output $Y_c^v$ for each considered view of a clinical case. Finally, to generate a final output $Y_c$ for each clinical case, we assume independence between views, leading to the factorization:

$$Y_c = \prod_{v=1}^{V} Y_c^v \qquad (2)$$

It is also worth noting that our final submission also incorporated into the factorization an extra classifier that depends only on external information about the clinical case, such as patient gender and age, and lesion area.

6 Code

The code that implements this paper, as well as the Lesion Segmentation and Diagnosis Networks, is provided at the following link:

Acknowledgments

We kindly thank the dermatologists of Hospital 12 de Octubre of Madrid for their invaluable help in annotating the data with the weak labels of structural patterns. This work was supported in part by the National Grant TEC P and National Grant TEC EXP of the Spanish Ministry of Economy and Competitiveness.
In addition, we gratefully acknowledge the support of NVIDIA Corporation with the donation of the TITAN X GPU used for this research.

References

M. Everingham, S. M. A. Eslami, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes challenge: A retrospective. International Journal of Computer Vision, 111(1):98-136, Jan. 2015.

K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. CoRR, 2015.

D. Pathak, P. Krähenbühl, and T. Darrell. Constrained convolutional neural networks for weakly supervised segmentation. In ICCV, 2015.

E. Shelhamer, J. Long, and T. Darrell. Fully convolutional networks for semantic segmentation. CoRR, 2016.

K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, 2014.
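As a supplementary illustration of the Polar Pooling and Asymmetry blocks of section 5, the following NumPy sketch shows one possible reading of their description. It is not the released implementation and makes several simplifying assumptions: only average pooling is shown, bins do not overlap, empty bins are left at zero, ring edges are placed at $\sqrt{k/R}$ so that rings have equal area for a perfect circle, and the asymmetry is returned as one score per folding axis and channel. The function names and the synthetic circular lesion are hypothetical; in the described system, `rho` and `theta` would be the Normalized Polar Coordinates of section 3.

```python
import numpy as np

def polar_pool(features, rho, theta, rings=3, sectors=6):
    """Average-pool an (H, W, C) tensor into (rings, sectors, C) polar bins."""
    # Equal-area rings for a perfect circle: edges at sqrt(k / rings).
    edges = np.sqrt(np.arange(rings + 1) / rings)
    r_idx = np.clip(np.searchsorted(edges, rho, side="right") - 1, 0, rings - 1)
    a_idx = np.floor(theta / (2 * np.pi) * sectors).astype(int) % sectors
    inside = rho <= 1.0                         # ignore pixels outside the lesion support
    out = np.zeros((rings, sectors, features.shape[-1]))
    for r in range(rings):
        for a in range(sectors):
            sel = inside & (r_idx == r) & (a_idx == a)
            if sel.any():                       # empty bins stay at zero
                out[r, a] = features[sel].mean(axis=0)
    return out

def asymmetry(pooled):
    """Fold the (R, A, C) polar map over A/2 axes; return (A/2, C) scores."""
    rings, sectors, channels = pooled.shape
    assert sectors % 2 == 0, "this sketch assumes an even number of sectors"
    a = np.arange(sectors)
    scores = np.zeros((sectors // 2, channels))
    for k in range(sectors // 2):
        # Axis k lies on the boundary at angle k * 2*pi / A; reflecting across it
        # sends sector a to sector (2k - 1 - a) mod A.
        mirror = (2 * k - 1 - a) % sectors
        diff = np.abs(pooled - pooled[:, mirror, :])
        scores[k] = 0.5 * diff.sum(axis=(0, 1))  # each sector pair is counted twice
    return scores

# Synthetic circular "lesion" of radius 30 centred in a 64x64 map.
yy, xx = np.mgrid[0:64, 0:64]
u, v = xx - 31.5, yy - 31.5
rho = np.hypot(u, v) / 30.0
theta = np.mod(np.arctan2(v, u), 2 * np.pi)

# Channel 0 (u) is symmetric about the horizontal axis; channel 1 (v) is not.
feats = np.stack([u, v], axis=-1)
pooled = polar_pool(feats, rho, theta, rings=3, sectors=6)
scores = asymmetry(pooled)
```

In this toy example the first folding axis is horizontal, so `scores[0, 0]` is numerically zero for the symmetric channel while `scores[0, 1]` is clearly positive for the antisymmetric one, which is the behavior expected from a folding-based asymmetry measure.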
More informationarxiv: v1 [cs.cv] 26 Jun 2017
Detecting Small Signs from Large Images arxiv:1706.08574v1 [cs.cv] 26 Jun 2017 Zibo Meng, Xiaochuan Fan, Xin Chen, Min Chen and Yan Tong Computer Science and Engineering University of South Carolina, Columbia,
More informationMask R-CNN. By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi
Mask R-CNN By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi Types of Computer Vision Tasks http://cs231n.stanford.edu/ Semantic vs Instance Segmentation Image
More informationReal-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor Supplemental Document
Real-time Hand Tracking under Occlusion from an Egocentric RGB-D Sensor Supplemental Document Franziska Mueller 1,2 Dushyant Mehta 1,2 Oleksandr Sotnychenko 1 Srinath Sridhar 1 Dan Casas 3 Christian Theobalt
More informationSupplementary Material: Unconstrained Salient Object Detection via Proposal Subset Optimization
Supplementary Material: Unconstrained Salient Object via Proposal Subset Optimization 1. Proof of the Submodularity According to Eqns. 10-12 in our paper, the objective function of the proposed optimization
More informationHuman Pose Estimation with Deep Learning. Wei Yang
Human Pose Estimation with Deep Learning Wei Yang Applications Understand Activities Family Robots American Heist (2014) - The Bank Robbery Scene 2 What do we need to know to recognize a crime scene? 3
More informationDEEP NEURAL NETWORKS FOR OBJECT DETECTION
DEEP NEURAL NETWORKS FOR OBJECT DETECTION Sergey Nikolenko Steklov Institute of Mathematics at St. Petersburg October 21, 2017, St. Petersburg, Russia Outline Bird s eye overview of deep learning Convolutional
More informationReal-time Object Detection CS 229 Course Project
Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection
More informationDeep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks
Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin
More informationOptimizing CNN-based Object Detection Algorithms on Embedded FPGA Platforms
Optimizing CNN-based Object Detection Algorithms on Embedded FPGA Platforms Ruizhe Zhao 1, Xinyu Niu 1, Yajie Wu 2, Wayne Luk 1, and Qiang Liu 3 1 Imperial College London {ruizhe.zhao15,niu.xinyu10,w.luk}@imperial.ac.uk
More informationRyerson University CP8208. Soft Computing and Machine Intelligence. Naive Road-Detection using CNNS. Authors: Sarah Asiri - Domenic Curro
Ryerson University CP8208 Soft Computing and Machine Intelligence Naive Road-Detection using CNNS Authors: Sarah Asiri - Domenic Curro April 24 2016 Contents 1 Abstract 2 2 Introduction 2 3 Motivation
More informationTransfer Learning. Style Transfer in Deep Learning
Transfer Learning & Style Transfer in Deep Learning 4-DEC-2016 Gal Barzilai, Ram Machlev Deep Learning Seminar School of Electrical Engineering Tel Aviv University Part 1: Transfer Learning in Deep Learning
More informationOBJECT detection in general has many applications
1 Implementing Rectangle Detection using Windowed Hough Transform Akhil Singh, Music Engineering, University of Miami Abstract This paper implements Jung and Schramm s method to use Hough Transform for
More informationAutomatic detection of books based on Faster R-CNN
Automatic detection of books based on Faster R-CNN Beibei Zhu, Xiaoyu Wu, Lei Yang, Yinghua Shen School of Information Engineering, Communication University of China Beijing, China e-mail: zhubeibei@cuc.edu.cn,
More informationTodo before next class
Todo before next class Each project group should submit a short project report (4 pages presentation slides) including 1. Problem definition 2. Related work 3. Preliminary results 4. Future plan Submission:
More informationarxiv: v1 [cs.cv] 29 Sep 2016
arxiv:1609.09545v1 [cs.cv] 29 Sep 2016 Two-stage Convolutional Part Heatmap Regression for the 1st 3D Face Alignment in the Wild (3DFAW) Challenge Adrian Bulat and Georgios Tzimiropoulos Computer Vision
More informationDeepBIBX: Deep Learning for Image Based Bibliographic Data Extraction
DeepBIBX: Deep Learning for Image Based Bibliographic Data Extraction Akansha Bhardwaj 1,2, Dominik Mercier 1, Sheraz Ahmed 1, Andreas Dengel 1 1 Smart Data and Services, DFKI Kaiserslautern, Germany firstname.lastname@dfki.de
More informationFace Recognition using SURF Features and SVM Classifier
International Journal of Electronics Engineering Research. ISSN 0975-6450 Volume 8, Number 1 (016) pp. 1-8 Research India Publications http://www.ripublication.com Face Recognition using SURF Features
More informationA Deep Learning Approach to Vehicle Speed Estimation
A Deep Learning Approach to Vehicle Speed Estimation Benjamin Penchas bpenchas@stanford.edu Tobin Bell tbell@stanford.edu Marco Monteiro marcorm@stanford.edu ABSTRACT Given car dashboard video footage,
More informationSSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang
SSD: Single Shot MultiBox Detector Author: Wei Liu et al. Presenter: Siyu Jiang Outline 1. Motivations 2. Contributions 3. Methodology 4. Experiments 5. Conclusions 6. Extensions Motivation Motivation
More informationGeneric Face Alignment Using an Improved Active Shape Model
Generic Face Alignment Using an Improved Active Shape Model Liting Wang, Xiaoqing Ding, Chi Fang Electronic Engineering Department, Tsinghua University, Beijing, China {wanglt, dxq, fangchi} @ocrserv.ee.tsinghua.edu.cn
More informationStudy of Residual Networks for Image Recognition
Study of Residual Networks for Image Recognition Mohammad Sadegh Ebrahimi Stanford University sadegh@stanford.edu Hossein Karkeh Abadi Stanford University hosseink@stanford.edu Abstract Deep neural networks
More informationComputer aided diagnosis of melanoma using Computer Vision and Machine Learning
Computer aided diagnosis of melanoma using Computer Vision and Machine Learning Jabeer Ahmed Biomedical Engineering Oregon Health & Science University This paper presents a computer-aided analysis of pigmented
More information3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis
3D Shape Analysis with Multi-view Convolutional Networks Evangelos Kalogerakis 3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors
More informationarxiv: v1 [cs.cv] 20 Dec 2016
End-to-End Pedestrian Collision Warning System based on a Convolutional Neural Network with Semantic Segmentation arxiv:1612.06558v1 [cs.cv] 20 Dec 2016 Heechul Jung heechul@dgist.ac.kr Min-Kook Choi mkchoi@dgist.ac.kr
More informationScene Text Recognition for Augmented Reality. Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science
Scene Text Recognition for Augmented Reality Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science Outline Research area and motivation Finding text in natural scenes Prior art Improving
More information3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing
3 Object Detection BVM 2018 Tutorial: Advanced Deep Learning Methods Paul F. Jaeger, of Medical Image Computing What is object detection? classification segmentation obj. detection (1 label per pixel)
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren Kaiming He Ross Girshick Jian Sun Present by: Yixin Yang Mingdong Wang 1 Object Detection 2 1 Applications Basic
More informationDisguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601
Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network Nathan Sun CIS601 Introduction Face ID is complicated by alterations to an individual s appearance Beard,
More informationConvolutional Neural Networks: Applications and a short timeline. 7th Deep Learning Meetup Kornel Kis Vienna,
Convolutional Neural Networks: Applications and a short timeline 7th Deep Learning Meetup Kornel Kis Vienna, 1.12.2016. Introduction Currently a master student Master thesis at BME SmartLab Started deep
More informationSupplementary Material for Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains
Supplementary Material for Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains Jiahao Pang 1 Wenxiu Sun 1 Chengxi Yang 1 Jimmy Ren 1 Ruichao Xiao 1 Jin Zeng 1 Liang Lin 1,2 1 SenseTime Research
More informationDeep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon
Deep Learning For Video Classification Presented by Natalie Carlebach & Gil Sharon Overview Of Presentation Motivation Challenges of video classification Common datasets 4 different methods presented in
More informationConvolutional Neural Network Layer Reordering for Acceleration
R1-15 SASIMI 2016 Proceedings Convolutional Neural Network Layer Reordering for Acceleration Vijay Daultani Subhajit Chaudhury Kazuhisa Ishizaka System Platform Labs Value Co-creation Center System Platform
More informationInternational Journal of Computer Engineering and Applications, Volume XII, Special Issue, September 18,
REAL-TIME OBJECT DETECTION WITH CONVOLUTION NEURAL NETWORK USING KERAS Asmita Goswami [1], Lokesh Soni [2 ] Department of Information Technology [1] Jaipur Engineering College and Research Center Jaipur[2]
More informationMask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma
Mask R-CNN presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN Background Related Work Architecture Experiment Mask R-CNN Background Related Work Architecture Experiment Background From left
More informationPhoto OCR ( )
Photo OCR (2017-2018) Xiang Bai Huazhong University of Science and Technology Outline VALSE2018, DaLian Xiang Bai 2 Deep Direct Regression for Multi-Oriented Scene Text Detection [He et al., ICCV, 2017.]
More informationDeconvolutions in Convolutional Neural Networks
Overview Deconvolutions in Convolutional Neural Networks Bohyung Han bhhan@postech.ac.kr Computer Vision Lab. Convolutional Neural Networks (CNNs) Deconvolutions in CNNs Applications Network visualization
More informationMachine vision. Summary # 6: Shape descriptors
Machine vision Summary # : Shape descriptors SHAPE DESCRIPTORS Objects in an image are a collection of pixels. In order to describe an object or distinguish between objects, we need to understand the properties
More information