Segmenting Lesions in Multiple Sclerosis Patients James Chen, Jason Su

Similar documents
Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging

MR IMAGE SEGMENTATION

8/3/2017. Contour Assessment for Quality Assurance and Data Mining. Objective. Outline. Tom Purdie, PhD, MCCPM

Detecting Thoracic Diseases from Chest X-Ray Images Binit Topiwala, Mariam Alawadi, Hari Prasad { topbinit, malawadi, hprasad

Janitor Bot - Detecting Light Switches Jiaqi Guo, Haizi Yu December 10, 2010

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman

CS 221: Object Recognition and Tracking

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati

Identifying Important Communications

Tumor Detection and classification of Medical MRI UsingAdvance ROIPropANN Algorithm

Classification of Abdominal Tissues by k-means Clustering for 3D Acoustic and Shear-Wave Modeling

Supervised Learning (contd) Linear Separation. Mausam (based on slides by UW-AI faculty)

Multi-Sectional Views Textural Based SVM for MS Lesion Segmentation in Multi-Channels MRIs

FACE DETECTION AND RECOGNITION OF DRAWN CHARACTERS HERMAN CHAU

Machine Learning for Medical Image Analysis. A. Criminisi

Reconstructing Broadcasts on Trees

CS 229 Final Project Report Learning to Decode Cognitive States of Rat using Functional Magnetic Resonance Imaging Time Series

Fast or furious? - User analysis of SF Express Inc

Accelerometer Gesture Recognition

Improving Positron Emission Tomography Imaging with Machine Learning David Fan-Chung Hsu CS 229 Fall

NIH Public Access Author Manuscript Proc IEEE Int Symp Biomed Imaging. Author manuscript; available in PMC 2014 November 15.

Texture. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors

Available Online through

Ulrik Söderström 16 Feb Image Processing. Segmentation

COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor

Introduction to Digital Image Processing

Automatic Fatigue Detection System

Estimating Noise and Dimensionality in BCI Data Sets: Towards Illiteracy Comprehension

Dimension Reduction for Big Data Analysis. Dan Shen. Department of Mathematics & Statistics University of South Florida.

Tutorials Case studies

Mapping of Hierarchical Activation in the Visual Cortex Suman Chakravartula, Denise Jones, Guillaume Leseur CS229 Final Project Report. Autumn 2008.

Distance Weighted Discrimination Method for Parkinson s for Automatic Classification of Rehabilitative Speech Treatment for Parkinson s Patients

3D Surface Reconstruction of the Brain based on Level Set Method

White Matter Lesion Segmentation (WMLS) Manual

Machine Learning for Pre-emptive Identification of Performance Problems in UNIX Servers Helen Cunningham

Image Segmentation and Registration

Training-Free, Generic Object Detection Using Locally Adaptive Regression Kernels

CS229 Final Project: Predicting Expected Response Times

Automatic Colorization of Grayscale Images

Supplementary methods

Lecture 7: Most Common Edge Detectors

Including the Size of Regions in Image Segmentation by Region Based Graph

Norbert Schuff VA Medical Center and UCSF

Hybrid Approach for MRI Human Head Scans Classification using HTT based SFTA Texture Feature Extraction Technique

ANALYSIS OF PULMONARY FIBROSIS IN MRI, USING AN ELASTIC REGISTRATION TECHNIQUE IN A MODEL OF FIBROSIS: Scleroderma

Machine Learning with MATLAB --classification

Louis Fourrier Fabien Gaie Thomas Rolf

Digital Image Processing. Prof. P. K. Biswas. Department of Electronic & Electrical Communication Engineering

Evaluation Measures. Sebastian Pölsterl. April 28, Computer Aided Medical Procedures Technische Universität München

Application of Support Vector Machine Algorithm in Spam Filtering

Correction of Partial Volume Effects in Arterial Spin Labeling MRI

In this lecture. Background. Background. Background. PAM3012 Digital Image Processing for Radiographers

The Automation of the Feature Selection Process. Ronen Meiri & Jacob Zahavi

Object Identification in Ultrasound Scans

Artificial Neural Networks (Feedforward Nets)

5 Learning hypothesis classes (16 points)

ADAPTIVE GRAPH CUTS WITH TISSUE PRIORS FOR BRAIN MRI SEGMENTATION

Applying Supervised Learning

A New GPU-Based Level Set Method for Medical Image Segmentation

Lecture 6: Edge Detection

Equation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation.

What Are Edges? Lecture 5: Gradients and Edge Detection. Boundaries of objects. Boundaries of Lighting. Types of Edges (1D Profiles)

Lecture 4 Image Enhancement in Spatial Domain

A Keypoint Descriptor Inspired by Retinal Computation

Contents Machine Learning concepts 4 Learning Algorithm 4 Predictive Model (Model) 4 Model, Classification 4 Model, Regression 4 Representation

Prostate Detection Using Principal Component Analysis

3D RECONSTRUCTION OF BRAIN TISSUE

The organization of the human cerebral cortex estimated by intrinsic functional connectivity

CHAPTER 2. Morphometry on rodent brains. A.E.H. Scheenstra J. Dijkstra L. van der Weerd

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

Machine Learning: Think Big and Parallel

MEDICAL IMAGE COMPUTING (CAP 5937) LECTURE 10: Medical Image Segmentation as an Energy Minimization Problem

Improved Non-Local Means Algorithm Based on Dimensionality Reduction

Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models

Classifying Depositional Environments in Satellite Images

MEDICAL IMAGE COMPUTING (CAP 5937) LECTURE 10: Medical Image Segmentation as an Energy Minimization Problem

Automated Canvas Analysis for Painting Conservation. By Brendan Tobin

Sharpening through spatial filtering

The Comparative Study of Machine Learning Algorithms in Text Data Classification*

New Technology Allows Multiple Image Contrasts in a Single Scan

Lesion Segmentation and Bias Correction in Breast Ultrasound B-mode Images Including Elastography Information

American Football Route Identification Using Supervised Machine Learning

Using Machine Learning to Optimize Storage Systems

CRF Based Point Cloud Segmentation Jonathan Nation

1 Machine Learning System Design

Introduction to Automated Text Analysis. bit.ly/poir599

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

ISSN: X Impact factor: 4.295

2. On classification and related tasks

Business Club. Decision Trees

The Anatomical Equivalence Class Formulation and its Application to Shape-based Computational Neuroanatomy

Understanding Andrew Ng s Machine Learning Course Notes and codes (Matlab version)

Fraud Detection using Machine Learning

VECTOR SPACE CLASSIFICATION

FINDING THE TRUE EDGE IN CTA

CS249: ADVANCED DATA MINING

Dynamic Routing Between Capsules

Applying Synthetic Images to Learning Grasping Orientation from Single Monocular Images

A Practical Guide to Support Vector Classification

Using Machine Learning to Identify Security Issues in Open-Source Libraries. Asankhaya Sharma Yaqin Zhou SourceClear

Transcription:

Segmenting Lesions in Multiple Sclerosis Patients James Chen, Jason Su Radiologists and researchers spend countless hours tediously segmenting white matter lesions to diagnose and study brain diseases. This mundane task is ideal for a machine. Our project builds a binary classifier to identify lesions using a novel data set. We use a set of 3-D brain volumes for each subject; it contains both the typical clinical images used for segmentation, as well as 0 additional quantitative maps that measure physical tissue properties. Using these maps for lesion segmentation has never been done before. This work has direct application for studies we are involved in. It will enable us to collect and process extensive longitudinal data for gaining a greater understanding of multiple sclerosis, a disease which affects up to 0.% of the population. A Primer on Brains Voxels in a brain are usually classified into categories like white matter, gray matter, and lesions. It is relatively easy to differentiate between gray and white matter, and between lesions and white matter. However, gray matter and lesions are hard to differentiate. They have similar pixel intensities so an algorithm that tries to segment gray and white matter will often label lesions as gray. Most gray matter is located at the surface/cortex of the brain. We can easily ignore this. The difficulty lies in differentiating between lesions and deep gray matter structures. Data Set Each patient has four clinical images (which we ll refer to as Clinical) and ten quantitative maps (which we ll refer to as mcdespot maps). Each image or map is a 28x28x76 matrix of values, corresponding to voxels in the brain volume. Our ing set includes data from five subjects. The data can be completely expressed in a 80,000,000x4 matrix. However, non-lesion voxels vastly outnumber those of lesions, so ing on all observations would result in the naive prediction of non-lesion for everything. To fix this, we decided to balance the data set by randomly sampling rows of the full matrix so that the ratio of non-lesion to lesion voxels was 2-to-. All lesion voxels were kept since they represent the most important data. The resulting matrix s size is 7,25x4. The set contains data from one other patient with 724 true lesion voxels. We do not balance this patient s data. The set is a 8484x4 matrix. Preprocessing - White Matter Mask We use an image mask that discards the brain s gray matter from our data set. Even though this white matter mask is edited by a human, the task is much easier and faster than selecting lesions. It is standard procedure to produce such a mask when processing MRI images. Thus our problem is simplified to finding lesions in white matter. Learning Algorithm We built our hypotheses on support vector machines (SVM) using the radial kernel. We also experimented with linear regression and SVM with a linear kernel, but they did not perform as well with about a 0% drop using a linear kernel. We used the libsvm library.

Learning the Hypothesis Our initial feature space uses the clinical images that are typically employed by radiologists to segment lesions by hand. We wanted to evaluate how well they worked alone. Each voxel is treated as an independent sample with no information about its spatial neighbors. We looked at its misclassification, recall, precision, and similarity index as a function of ing set size. Misclassification Error Recall TP/(FN+TP) Precision TP/(FP+TP) 0. 0.05 ing 0 0 000 2000 3000 4000 5000 6000 7000 8000 ing 0 000 2000 3000 4000 5000 6000 7000 8000 ing 0.2 0 000 2000 3000 4000 5000 6000 7000 8000 Figure. Misclassfication error, recall, precision (left) and similarity index (right) for Clinical features We then added the novel quantitative imaging maps to the feature space. This did not improve the ing SI or SI. Figure 2. Learning curve for Clinical + mcdespot features After the initial evaluation of our data set, we chose to use similarity index (SI) as the primary metric of the performance of our features and algorithms. SI is a standard measure for

the performance of lesion segmentation in literature, and we wanted to be able to compare our results. 2 It is the harmonic mean of precision and recall. By optimizing SI, one is balancing between reducing false negatives and false positives, both of which are valuable in this task. Adding features Locally-weighted Gaussian neighborhood We wanted to have a feature that represented the aggregate characteristics of a voxel s immediate neighborhood. Before this, each voxel was treated independently with no spatial context. For each voxel, we generate a weighted average of the 3-D neighborhood with a 3-D Gaussian kernel, akin to the kernel used in locally-weighted linear regression. With a finite kernel size of 7x7x7, the Gaussian tails are cut off. We apply a Hamming window to mitigate Gibbs ringing. The kernel is normalized to have a sum of one. This adds 4 features to the data, one for each original image/map. Edge handling The weighted neighborhood feature is not valid at the edges of the image, where the background is zero. To account for this, we have a simple binary flag. The flag equals zero if the Gaussian neighborhood is computed with any background zeros values, and one otherwise, i.e. if the neighborhood contains all valid data. This adds one binary feature to the data. Edge detection Edges of lesions are particularly hard to segment because they often fade smoothly into their surroundings. We used a high pass filter to boost the contrast for each of the original 4 features, highlighting the edges, and added these as additional features. Here we used a simple 3x3x3 kernel that summed to zero with equal weights on all but the center value. This adds another 4 features to the data. Figure 3. Learning curve for Clinical + mcdespot + Neighborhood + Edge-related features

Feature selection PCA A problem we have encountered thus far was overfitting. The ing SI improves monotonically, while the SI would decrease with more ing samples. So we performed principal component analysis on all 43 features. Then we determined the optimal number of principal components to use by computing the SI for each. The top 8 principal components were best and were kept as input features. As shown by the learning curve in below, SI is now much more stable as ing samples are increased. The first principal component lies primarily along the axes of several mcdespot quantitative features. The weight coefficients are large for these features. 5 0.2 5 5 5 0 0 20 30 40 Number of principal components 5 Number of ing samples Figure 4. Determing the optimal number of principal components. (left) Figure 5. Learning curve with top 8 PCA features. (right) Training features Test SI Training SI Clinical 56.5 90.3 Clinical + mcdespot 54.4 89.4 Clinical + mcdespot + N&E 59.6 9.0 PCA-Top8 (Clinical + mcdespot + N&E) 62.4 9.2 Table. A comparison of learning between different feature sets Final Results We validated our final hypothesis on three new patients with different types of MS. Patient number Validation SI P008 65.5% P08 73.6% P029 45.8% Table 2. Final model on three new patients. Discussion State-of-the-art lesion segmentation algorithms have an SI of about 70% with similar variations in performance between subjects. 2 With the preliminary results of this study, we are

confident that the novel quantitative maps can complement clinical images and contribute additional information to identifying lesions. A large gap remains between the ing and actual performance. This is not due to a lack of data since even with the addition of 0 more brains to the ing set, there was still a comparable disparity. We believe that the variability between the brains of individuals as well as the subjectivity of the radiologists who defined the true lesion labels are partially responsible for this difference. The boundaries of lesions are poorly defined and vary greatly even between experts doing the manual segmentation. Additional work can be done to tune the system such as optimizing the C-parameter, radial kernel parameter, and Gaussian neighborhood size and width. We can also do better to clean the clinical images, which are typically on an inconsistent scale between patients. Conclusion We have achieved a lesion segmentation system that is fast and performs nearly as well as state-of-the-art methods. With so many avenues still yet to be explored, we believe that there is great potential for an even better final system. Figure 6. Patient 08 with lesions marked by hand in white. (left) Figure 7. Patient 08 with lesions marked by this paper s final model in white. (right) References [] Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:--27:27, 20. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm [2] H. Soltanian-zadeh, "Segmentation of multiple sclerosis lesions in MR images : a review," Brain, 20.