Skin Lesion Classification and Segmentation for Imbalanced Classes using Deep Learning

Mohammed K. Amro, Baljit Singh, and Avez Rizvi
mamro@sidra.org, bsingh@sidra.org, arizvi@sidra.org

Abstract - This paper summarizes our work for the ISIC 2018 challenge on Skin Lesion Analysis Towards Melanoma Detection [1], covering both Task 1: Lesion Boundary Segmentation and Task 3: Disease Classification. For Task 1 we used a modified version of U-Net called dilated U-Net, with a 4-fold training scheme and test-time augmentation during the prediction phase, reaching a final threshold Jaccard score of 0.77 on the ISIC 2018 validation set. For Task 3 we built two approaches. The first is a one-step classifier over the seven lesion types; the second works in two steps, first binary-classifying a lesion as either Nevus (the majority class in the dataset) or non-Nevus (the remaining classes), and then classifying among the remaining six lesion types. The average accuracy was 88.8% for approach one and 89.8% for approach two, and reached 91.8% when combining both approaches.

INTRODUCTION
Melanoma is one of the deadliest forms of skin cancer. Even though it is the least common, the disease is responsible for around 91,000 deaths this year so far [2]. Early detection of skin lesions can aid treatment and improve outcomes. Dermoscopy refers to the examination of the skin under a microscope to identify skin abnormalities and then classify them [3]. Dermoscopic images are free from skin surface reflections, and these enhanced images help dermatologists diagnose melanoma accurately. Deep CNNs are known to produce state-of-the-art results in medical diagnosis. We participated in the ISIC 2018: Skin Lesion Analysis Towards Melanoma Detection challenge to train models that could act as a screening aid for dermatologists by segmenting and classifying skin lesions. ISIC 2018 is based on the HAM10000 (Human Against Machine with 10000 training images) dataset [4], with 10,015 images in the training set, 193 images in the validation set, and 1,000 images in the test set. The dataset contains lesions from seven classes (Melanoma, Melanocytic nevus, Basal cell carcinoma, Actinic keratosis, Benign keratosis, Dermatofibroma, and Vascular lesions).

TASK 1: LESION BOUNDARY SEGMENTATION

TASK GOAL
Submit automated predictions of lesion segmentation boundaries within dermoscopic images.

EVALUATION METRIC
Predictions are scored with a thresholded Jaccard index, which compares pixel-wise agreement between a predicted segmentation and its corresponding ground truth: the per-image score is zero if the Jaccard index is below 0.65 and equal to the Jaccard index otherwise. The mean of all per-image scores is the final metric value for the entire set.
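For concreteness, the scoring rule can be expressed as the following short sketch (our illustration of the metric as described above, not the official challenge implementation):

```python
import numpy as np

def thresholded_jaccard(pred_mask, true_mask, threshold=0.65):
    """Per-image score: the Jaccard index, zeroed out below the 0.65 threshold."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    jaccard = intersection / union if union > 0 else 1.0
    return jaccard if jaccard >= threshold else 0.0

def challenge_score(pred_masks, true_masks):
    """Final metric: mean of the per-image thresholded Jaccard scores."""
    scores = [thresholded_jaccard(p, t) for p, t in zip(pred_masks, true_masks)]
    return float(np.mean(scores))
```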

DATASET PROCESSING
The original ISIC 2018 Task 1 dataset contains 2594 lesion images with their 2594 corresponding masks. As a first step, we reviewed the dataset and discovered some issues in the ground truth masks, as shown in Figure 1. To clean the dataset and train our network on accurate data, we trained a sample network on the whole training set without validation, allowing the network to overfit, and then inspected only the images with a low score and compared them against their masks. As a result of this step we removed 106 images from the dataset and developed the model using the remaining 2488 images.

Figure 1: Sample of issues in the ground truth masks

NETWORK ARCHITECTURE
For Task 1 segmentation we used a modified version of the U-Net architecture called dilated U-Net. A traditional U-Net consists of three parts: an encoder network, bottleneck layers, and a decoder network; in our solution we used six convolutional layers in the bottleneck part.

TRAINING STRATEGY
We trained our model using 4-fold cross-validation, with 1866 images for training and 622 images for validation in each fold, using the RMSprop optimizer with a learning rate of 0.0001. K-fold cross-validation allows the model to train on the whole dataset, which reduces the effect of any remaining errors in the provided masks by producing four different models, each validated on a different set of images.

PREDICTION STRATEGY
During the prediction phase, for each of the four fold models we used test-time augmentation (TTA), conducting four predictions per image (the original image, the horizontally flipped image, the vertically flipped image, and the horizontally and vertically flipped image), as shown in Figure 2. We then calculated the average, minimum, and maximum for each fold, ending with twelve different predictions per lesion. In our final submission we used the three best-performing predictions on the challenge validation set, which reached a 0.77 threshold Jaccard index, for the final test predictions.
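The TTA step above can be sketched as follows (a minimal illustration assuming a Keras-style model whose predict call maps an image batch to per-pixel mask probabilities; the actual implementation may differ):

```python
import numpy as np

def tta_predict(model, image):
    """Predict on the original image and its horizontal, vertical, and combined
    flips, undo each flip, then aggregate the four masks per fold."""
    variants = [
        (image,                        lambda m: m),
        (np.fliplr(image),             np.fliplr),
        (np.flipud(image),             np.flipud),
        (np.flipud(np.fliplr(image)),  lambda m: np.fliplr(np.flipud(m))),
    ]
    masks = []
    for img, undo in variants:
        # Assumes a Keras-style predict returning shape (1, H, W, 1).
        pred = model.predict(img[np.newaxis, ...])[0, ..., 0]
        masks.append(undo(pred))
    masks = np.stack(masks)
    # For each fold we keep the average, minimum, and maximum of the four predictions.
    return masks.mean(axis=0), masks.min(axis=0), masks.max(axis=0)
```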

Figure 2: Prediction using test-time augmentation

TASK 1 RESULTS
Table 1 shows the best score for each fold during model training, evaluated against its 622 validation images without applying the challenge threshold of 0.65.

Table 1: Task 1 segmentation performance

TASK 3: LESION DIAGNOSIS

Figure 3: Different types of lesions

TASK GOAL
The primary objective of this task is to classify dermoscopic images into one of the following seven lesion types: Melanoma (MEL), Melanocytic nevus (NV), Basal cell carcinoma (BCC), Actinic keratosis (AKIEC), Benign keratosis (BKL), Dermatofibroma (DF), and Vascular lesion (VASC).

EVALUATION METRIC
Final predicted results are scored using a multi-class accuracy metric balanced across categories.

DATASET PROCESSING
According to the HAM10000 paper [4], the training dataset contains multiple images of similar skin lesions taken from different camera angles. The percentage distribution of each class is shown in Table 2.
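As we understand it, this balanced metric corresponds to the mean per-class recall, so rare classes count as much as the dominant NV class; a minimal sketch:

```python
import numpy as np

def balanced_multiclass_accuracy(y_true, y_pred, num_classes=7):
    """Mean per-class recall: accuracy is computed for each lesion class
    separately and then averaged over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for c in range(num_classes):
        in_class = y_true == c
        if in_class.any():
            recalls.append((y_pred[in_class] == c).mean())
    return float(np.mean(recalls))
```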

Table 2: HAM10000 data distribution

The imbalance between classes is apparent, especially for Dermatofibroma and Vascular lesions; most lesions belong to the Melanocytic nevus class. In total there were 5515 unique images and 4500 images with variants.

NETWORK ARCHITECTURE
For classification we used two models, Xception [5] and DenseNet121 [6], pre-trained on the ImageNet dataset. On top of the pre-trained weights we added three convolutional layers with 2048, 512, and 512 neurons and two dropout layers with a drop factor of 0.5.

TRAINING STRATEGY
We implemented two approaches. The first builds a one-step classifier over all seven classes at once (7C model). The second builds a two-step classifier: a binary classifier between Melanocytic nevus (the dominant class) and the other six classes together (2C model), whose output is passed to a second classifier that distinguishes the remaining six classes (6C model). We split the data into a 70% training set and a 30% testing set. Because lesions have different numbers of variants, the split was done manually: all the unique skin lesion images were placed in the testing split, and the rest, which includes images with two or more variants, were placed in the training split to prevent the models from overfitting during training. We used the Adam optimizer with a learning rate of 0.0001 and image sizes of 224, 256, 299, and 512.
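As a rough illustration of these classification models (a sketch assuming a Keras-style setup; dense layers after global average pooling stand in for the 2048/512/512 head, which is an assumption about the exact layer types, and the compile settings only mirror the Adam learning rate given in the text):

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import Xception

def build_classifier(num_classes=7, input_shape=(299, 299, 3)):
    # ImageNet-pretrained Xception backbone with global average pooling.
    backbone = Xception(include_top=False, weights="imagenet",
                        input_shape=input_shape, pooling="avg")
    x = backbone.output
    # Custom head: 2048/512/512 units with two 0.5-dropout layers
    # (dense layers here are an assumption; the paper calls them CONV layers).
    x = layers.Dense(2048, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(512, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(backbone.input, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# The 7C, 2C, and 6C variants differ only in num_classes (7, 2, and 6).
```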

As can be observed from Table 2, the data is largely imbalanced: the majority class, NV, makes up 66% of the entire dataset, while several classes make up only 1-3% of it. To deal with this issue we upsampled the data, duplicating the images in each class until their number equaled the number of images in the majority class (NV for the 7C and 2C models and MEL for the 6C model). Both the training and validation splits were upsampled to obtain a balanced set of images, letting the models train on balanced classes while optimizing the average accuracy.

PREDICTION STRATEGY
During prediction, we used ensemble prediction over all generated models from the two networks (Xception and DenseNet121) and the different image sizes (224, 256, 299, and 512).

TASK 3 RESULTS
We tested our models on the selected testing set of 3005 images. The 7C classifier reached an average accuracy of 84.2% (Figure 4), the 2C classifier 95.16% (Figure 5), and the 6C classifier 87.69% (Figure 6).

Figure 4: 7C model confusion matrix for the local test dataset
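The upsampling-by-duplication step described above can be sketched as follows (a minimal illustration over sample/label lists; the helper name is ours):

```python
import random
from collections import defaultdict

def upsample_by_duplication(samples, labels):
    """Balance classes by duplication: repeat images of each minority class
    until every class has as many samples as the largest (majority) class."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())
    balanced_samples, balanced_labels = [], []
    for label, items in by_class.items():
        # Repeat the class list, then top up with a random subset to reach the target.
        repeated = items * (target // len(items)) + random.sample(items, target % len(items))
        balanced_samples.extend(repeated)
        balanced_labels.extend([label] * target)
    return balanced_samples, balanced_labels
```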

Figure 5: 2C model confusion matrix for the local test dataset

Figure 6: 6C model confusion matrix for the local test dataset

Figure 7: 2C + 6C confusion matrix for the test dataset

Figure 8: 2,6C + 7C confusion matrix for the test dataset

For the final submission to the challenge we submitted the result of the 7C model as approach one, the output of the 2C and 6C models as approach two, and the average of both predictions as approach three. On the ISIC 2018 validation set, which consists of 193 images, we scored 88.8% using the 7C model, 89.8% using the 2C+6C models, and 91.8% with the average of the 7C and 2,6C predictions.
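The text does not spell out how the 2C and 6C outputs are merged into a single seven-class prediction before averaging with the 7C model; the following is one plausible sketch under that assumption (class ordering and the NV index are illustrative, not taken from the paper):

```python
import numpy as np

NV = 1                      # assumed index of the Melanocytic nevus class
OTHER = [0, 2, 3, 4, 5, 6]  # assumed indices of the remaining six classes

def combine_two_step(p_2c, p_6c):
    """Turn two-step (2C + 6C) outputs into a seven-class distribution:
    p_2c has shape (N, 2) with columns [non-NV, NV]; p_6c has shape (N, 6)."""
    p_nv = p_2c[:, 1]
    combined = np.zeros((len(p_nv), 7))
    combined[:, NV] = p_nv
    # Spread the remaining probability mass over the six classes via the 6C model.
    combined[:, OTHER] = (1.0 - p_nv)[:, None] * p_6c
    return combined

def average_approaches(p_7c, p_2c, p_6c):
    """Approach three: average the one-step 7C prediction with the combined
    2C + 6C prediction before taking the final argmax."""
    return 0.5 * (p_7c + combine_two_step(p_2c, p_6c))
```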

RESULTS AND CONCLUSION
For Task 1 (segmentation), our model scored a 0.77 threshold Jaccard index on the ISIC validation set, while for Task 3 (classification) our ensemble models scored 88.8%, 89.8%, and 91.8%, respectively. The primary challenge in the segmentation task was noise in the ground truth masks, which we addressed by cleaning the dataset before training; the challenge in the classification task was the imbalance between classes, which we addressed by upsampling.

REFERENCES
[1] "ISIC 2018: Skin Lesion Analysis Towards Melanoma Detection," 2018. [Online]. Available: https://challenge2018.isic-archive.com/.
[2] "Melanoma Stats, Facts, and Figures," 2018. [Online]. Available: https://www.aimatmelanoma.org/about-melanoma/melanoma-stats-facts-and-figures.
[3] "Dermoscopy & Mole Scans in Perth and Regional WA." [Online]. Available: https://myskincentre.com.au/service/dermoscopy/.
[4] P. Tschandl, C. Rosendahl, and H. Kittler, "The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions," Sci. Data, vol. 5, 180161, 2018.
[5] F. Chollet, "Xception: Deep Learning with Depthwise Separable Convolutions," CoRR, vol. abs/1610.02357, 2016.
[6] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely Connected Convolutional Networks," CoRR, vol. abs/1608.06993, 2016.