Manifold Learning-based Data Sampling for Model Training

Similar documents
Hierarchical Multi structure Segmentation Guided by Anatomical Correlations

Using Probability Maps for Multi organ Automatic Segmentation

PROSTATE CANCER DETECTION USING LABEL IMAGE CONSTRAINED MULTIATLAS SELECTION

Fully Automatic Multi-organ Segmentation based on Multi-boost Learning and Statistical Shape Model Search

Discriminative Dictionary Learning for Abdominal Multi-Organ Segmentation

Automatic multi-organ segmentation in dual energy CT using 3D fully convolutional network

A Review on Label Image Constrained Multiatlas Selection

Population Modeling in Neuroscience Using Computer Vision and Machine Learning to learn from Brain Images. Overview. Image Registration

Patient surface model and internal anatomical landmarks embedding

Prototype of Silver Corpus Merging Framework

Adaptive Local Multi-Atlas Segmentation: Application to Heart Segmentation in Chest CT Scans

Sparsity Based Spectral Embedding: Application to Multi-Atlas Echocardiography Segmentation!

Edge-Preserving Denoising for Segmentation in CT-Images

Learning a Manifold as an Atlas Supplementary Material

Time-of-Flight Surface De-noising through Spectral Decomposition

Rule-Based Ventral Cavity Multi-organ Automatic Segmentation in CT Scans

Machine Learning for Medical Image Analysis. A. Criminisi

Moving Metal Artifact Reduction for Cone-Beam CT (CBCT) Scans of the Thorax Region

Keypoint Transfer Segmentation

Supervoxel Classification Forests for Estimating Pairwise Image Correspondences

Segmentation of the Pectoral Muscle in Breast MRI Using Atlas-Based Approaches

Automatic Detection of Multiple Organs Using Convolutional Neural Networks

3D-CNN and SVM for Multi-Drug Resistance Detection

Detecting Bone Lesions in Multiple Myeloma Patients using Transfer Learning

Manifold Learning of Brain MRIs by Deep Learning

Scaling Calibration in the ATRACT Algorithm

Hybrid Spline-based Multimodal Registration using a Local Measure for Mutual Information

MedGIFT projects in medical imaging. Henning Müller

Comparison of Default Patient Surface Model Estimation Methods

2 Michael E. Leventon and Sarah F. F. Gibson a b c d Fig. 1. (a, b) Two MR scans of a person's knee. Both images have high resolution in-plane, but ha

Applied Statistics for Neuroscientists Part IIa: Machine Learning

Modeling and preoperative planning for kidney surgery

Manifold Learning for Image-Based Breathing Gating with Application to 4D Ultrasound

Fully Automatic Model Creation for Object Localization utilizing the Generalized Hough Transform

Separate CT-Reconstruction for Orientation and Position Adaptive Wavelet Denoising

NON OB ULTRASOUND MORPHOMETRICS AS BIOMARKERS

Iterative CT Reconstruction Using Curvelet-Based Regularization

3D Statistical Shape Model Building using Consistent Parameterization

Atlas-Based Segmentation of Abdominal Organs in 3D Ultrasound, and its Application in Automated Kidney Segmentation

Ensemble registration: Combining groupwise registration and segmentation

Manifold Learning: Applications in Neuroimaging

Segmentation of 3D CT Volume Images Using a Single 2D Atlas

Robust Linear Registration of CT images using Random Regression Forests

Figure 1. Overview of a semantic-based classification-driven image retrieval framework. image comparison; and (3) Adaptive image retrieval captures us

Segmentation of Brain MR Images via Sparse Patch Representation

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 26, NO. 10, OCTOBER

A Multiple-Layer Flexible Mesh Template Matching Method for Nonrigid Registration between a Pelvis Model and CT Images

Deformable Segmentation using Sparse Shape Representation. Shaoting Zhang

On the Characteristics of Helical 3-D X-ray Dark-field Imaging

Single Breath-hold Abdominal T 1 Mapping using 3-D Cartesian Sampling and Spatiotemporally Constrained Reconstruction

Kaggle Data Science Bowl 2017 Technical Report

Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging

Joint Supervoxel Classification Forest for Weakly-Supervised Organ Segmentation

Automatic Segmentation of Parotids from CT Scans Using Multiple Atlases

2D Rigid Registration of MR Scans using the 1d Binary Projections

Learning based face hallucination techniques: A survey

Detecting Anatomical Landmarks from Limited Medical Imaging Data using Two-Stage Task-Oriented Deep Neural Networks

Organ Analysis and Classification using Principal Component and Linear Discriminant Analysis. William H. Horsthemke Daniela S. Raicu DePaul University

Medical Image Segmentation

Multi-atlas labeling with population-specific template and non-local patch-based label fusion

COMPUTATIONAL INTELLIGENCE SEW (INTRODUCTION TO MACHINE LEARNING) SS18. Lecture 6: k-nn Cross-validation Regularization

arxiv: v2 [cs.cv] 28 Jan 2019

Context-sensitive Classification Forests for Segmentation of Brain Tumor Tissues

Automated segmentation methods for liver analysis in oncology applications

Artefakt-resistente Bewegungsschätzung für die bewegungskompensierte CT

The Anatomical Equivalence Class Formulation and its Application to Shape-based Computational Neuroanatomy

Respiratory Motion Compensation for C-arm CT Liver Imaging

Multi-Label Whole Heart Segmentation Using CNNs and Anatomical Label Configurations

Mapping Multi-Modal Routine Imaging Data to a Single Reference via Multiple Templates

Depth-Layer-Based Patient Motion Compensation for the Overlay of 3D Volumes onto X-Ray Sequences

A new calibration-free beam hardening reduction method for industrial CT

Prostate Detection Using Principal Component Analysis

doi: /

arxiv: v1 [cs.cv] 11 Apr 2018

The organization of the human cerebral cortex estimated by intrinsic functional connectivity

Generation of Triangle Meshes from Time-of-Flight Data for Surface Registration

Combination of Markerless Surrogates for Motion Estimation in Radiation Therapy

Improvement and Evaluation of a Time-of-Flight-based Patient Positioning System

Towards an Estimation of Acoustic Impedance from Multiple Ultrasound Images

Facial Expression Classification with Random Filters Feature Extraction

Regional Manifold Learning for Deformable Registration of Brain MR Images

Methodological progress in image registration for ventilation estimation, segmentation propagation and multi-modal fusion

Deep Fusion Net for Multi-Atlas Segmentation: Application to Cardiac MR Images

Utilizing Salient Region Features for 3D Multi-Modality Medical Image Registration

Norbert Schuff VA Medical Center and UCSF

Image Registration. Prof. Dr. Lucas Ferrari de Oliveira UFPR Informatics Department

Computer-Aided Diagnosis in Abdominal and Cardiac Radiology Using Neural Networks

8/3/2017. Contour Assessment for Quality Assurance and Data Mining. Objective. Outline. Tom Purdie, PhD, MCCPM

EXPECTED LABEL VALUE COMPUTATION FOR ATLAS-BASED IMAGE SEGMENTATION. Iman Aganj and Bruce Fischl

Atlas of Classifiers for Brain MRI Segmentation

SELF SUPERVISED DEEP REPRESENTATION LEARNING FOR FINE-GRAINED BODY PART RECOGNITION

Computer Aided Diagnosis Based on Medical Image Processing and Artificial Intelligence Methods

LEARNING COMPRESSED IMAGE CLASSIFICATION FEATURES. Qiang Qiu and Guillermo Sapiro. Duke University, Durham, NC 27708, USA

A COMPARISON OF WAVELET-BASED AND RIDGELET- BASED TEXTURE CLASSIFICATION OF TISSUES IN COMPUTED TOMOGRAPHY

Pose estimation using a variety of techniques

GLOBAL SAMPLING OF IMAGE EDGES. Demetrios P. Gerogiannis, Christophoros Nikou, Aristidis Likas

Non-rigid Registration of Preprocedural MRI and Intra-procedural CT for CT-guided Cryoablation Therapy of Liver Cancer

Data fusion and multi-cue data matching using diffusion maps

Learning-based Neuroimage Registration

Automatic Intrinsic Cardiac and Respiratory Gating from Cone-Beam CT Scans of the Thorax Region

Transcription:

Manifold Learning-based Data Sampling for Model Training Shuqing Chen 1, Sabrina Dorn 2, Michael Lell 3, Marc Kachelrieß 2,Andreas Maier 1 1 Pattern Recognition Lab, FAU Erlangen-Nürnberg 2 German Cancer Research Center (DKFZ), Heidelberg, Germany 3 University Hospital Nürnberg, Paracelsus Medical University, Nürnberg, Germany shuqing.chen@fau.de Abstract. Training data sampling is an important task in machine learning especially for data with small sample size and data with nonuniform sample distribution. Dividing data into different data sets randomly can cause the problem that, the training model covers only parts of the sampled cases and works inaccurately for weakly sampled cases. Recent research showed the benefit of manifold learning techniques in medical image processing. In this work, we propose a manifold learning based approach to improve the data division and the model training. We evaluated the proposed approach using an atlas registration framework and a deep learning framework. The final segmentation results using methods with and without data balancing were compared. All of the final segmentations were improved after implementing the manifold learning based approach into the frameworks. The largest improvement was 24.4%. Thus, the proposed manifold learning based approach is effective for the model training. 1 Introduction To model common properties and variations from many individuals over an entire task domain is important for many machine learning methods, such as atlas registration and deep learning. These methods can be used for varied applications, e.g. multi-organ segmentation in the field of medical image processing. Model training is an important part of such methods. A well-trained model works for most cases within the domain. However, due to the limited data sample, size and nonuniform sample distribution, problems arise in practical implementations. For many applications, the data has to be sampled into different balanced sets before starting the modeling. With small sample size or nonuniform sample distribution, random data selection may not work robustly. Aljabar et al. [1] discussed the power of manifold learning in the field of medical image processing. Using manifold learning techniques, high dimensional medical data can be converted to lower dimensional representation while respecting the intrinsic geometry [2], so that it is possible to facilitate the application of machine learning techniques such as clustering and regression. Manifold learning

2 Chen et al. techniques can be used in the preprocessing to extract features for registration [3]. They can also improve segmentation successfully by allowing the propagation of multiple atlases to a diverse set of images [4]. Furthermore, manifold learning techniques can be used for clinical classification [1]. However, few research showed the benefit of the manifold learning to reduce the problem caused by data selection in many machine learning methods. In this paper, we proposed one manifold learning based approach to solve the data selection problem and to improve the model training. 2 Materials and Methods In order to avoid the bias due to the data selection and to keep the distribution of the data sets (i.e. training data set, test data set as well as validation data set if required) similar, three steps were employed to improve the data selection: Data Representation: Initially, the high-dimensional volumetric medical data will be projected into a low-dimensional (e.g. 2-D) visualization plane using manifold learning techniques. Each point in such visualization plane will denote a volumetric image. To improve the performance, the data should be resampled to the same image spacing before using manifold learning. Data Clustering: Afterwards, the data points can be divided into different classes using clustering techniques. Data Selection: Finally, the training data set, the validation data set and the test dataset can be built by selecting data samples from each class randomly. To show the effectivity of this datatype-independent approach in general machine learning methods, the evaluation is designed with the multi-organ segmentation using atlas registration on computed tomography (CT) images [5] and the multi-organ segmentation using deep learning on dual-energy CT (DECT) images [6]. For our cases, the clusters are well distributed with less overlap on 2-D space. Thus, data projection on 2-D space is sufficient. The method of the multi-organ segmentation using atlas registration is described in [5]. In this method, the data selection for the atlas modeling should focus on the inter-subject organ shape variation. Therefore, the data selection approach is applied after the step of the affine registration, in order to avoid the effect of the position variance. The atlas construction part described in [5] is improved with data selection in following steps. First of all, the data is cropped to get the region of interest (ROI) for the redundancy reduction of the atlas. Then a reference volume is selected which is the most similar to the mean volume of all samples. Subsequently, affine registration is used to reduce the variation of the rotation, the translation, the scaling and the shearing. The results after the affine registration are then split into training data set and test data set using the manifold learning based approach mentioned previously. Average volume and atlas is constructed finally after the fine alignment based on B-Spline registration as described in [5].

Manifold Learning-based Data Sampling for Model Training 3 Fig. 1. Data representation and clustering Fig. 2. Data representation and clustering using LLE and k-means for atlas registration data. Colors denote classes, numbers data. Colors denote classes, volume indexes using LLE and k-means for deep learning denote volume indexes. are ignored for clarity. A 3-D cascaded fully connected network [7] was used for the multi-organ segmentation using deep learning technique [6]. In this network, the input training data will be augmented with more translation and rotation. Thus, unlike the atlas registration, it is not required to remove the position variation for the data selection. The data selection approach can be applied directly to the original data set without any preprocessing. Training data set, validation data set and test data set are selected using the proposed selection approach. The whole framework is kept in same as the description in [6]. 3 Results The Matlab toolbox provided by van der Maaten et al. [8] was used for manifold learning in our implementation. Dice coefficient was utilized to measure the performance of the segmentation result. To show the effect of the data selection on atlas registration, 20 VISCERAL non-enhanced CT volumes [9] were used for the evaluation. Fig. 1 plots the 20 CT volumes by characterizing the shape variation using the manifold learning approach. Each point denotes one CT volume. Locally linear embedding (LLE) [10] was chosen to reduce the dimensionality because LLE showed the best performance by experimenting the different manifold learning methods provided in the toolbox. To construct balanced training, test and validation sets, the data was clustered into k classes that should be evenly represented in each set. k- Means was used here for the data clustering with k = 3, because k = 3 is most reasonable for these volumes. This is indicated by color in Figs. 1 and 2. A 10- fold cross validation was tested for 6 target organs including left and right lung, liver, spleen, as well as left and right kidney. To construct balanced sets, one volume was selected randomly from each class as test volume, in total 3 test

4 Chen et al. Table 1. Comparison of atlas registration on CT images without and with data selection. Without Data Selection Right Lung Left Lung Right Kidney Left Kidney Liver Spleen Avg. 0.960 0.957 0.794 0.731 0.900 0.813 Std. Dev. 0.014 0.015 0.140 0.214 0.034 0.104 With Data Selection Avg. 0.965 0.960 0.834 0.821 0.912 0.842 Std. Dev. 0.009 0.010 0.080 0.121 0.024 0.051 Improvements of the average Diff. Avg. 0.005 0.003 0.040 0.090 0.012 0.029 Fig. 3. Comparision test for atlas registration with/without data selection with/without data Fig. 4. Comparison test for U-Net selection volumes were selected as test data set. The remaining 17 volumes were used as training data set. Test data sets were segmented using the segmentation method mentioned in [5]. For comparison, a 10-fold cross validation was tested using the 20 CT volumes for the same target organs based on the atlas construction method described in [5]. The same reference volume was used for the registrations, but training data set and test data set was selected randomly. The amount of the data sets were kept in same, i.e. 17 for training and 3 for the test. The results were summarized in Table 1 and shown in Fig. 3. The final segmentation of all target organs was improved with the data selection approach. Right lung and left lung has slight improvement with 0.5% and 0.3%, respectively. Liver has small improvement with about 1%. Other organs have a significant improvement from around 3% to around 9%. The distributions of the Dice coefficients are also converged significantly. To show the effect of the data selection on deep learning, 42 clinical DECT volumes were used for the evaluation. The data representation and the data clustering is illustrated in Fig. 2. The dimensionality was reduced using LLE. Each point denotes one DECT volume. The data points were clustered into 3

Manifold Learning-based Data Sampling for Model Training 5 Table 2. Comparison of U-Net on DECT images without and with data selection. Right Kidney Left Kidney Liver Spleen Without Data Selection Avg. 0.905 0.856 0.919 0.652 Std. Dev. 0.020 0.047 0.015 0.188 With Data Selection Avg. 0.905 0.863 0.934 0.896 Std. Dev. 0.034 0.071 0.011 0.032 Improvements of the average Diff. Avg. 0.000 0.007 0.015 0.244 classes by using k-means with k = 3. Like described in [6], the ratio 5:1:1 was used for the data selection. That means, 2 volumes from each class were selected randomly for validation and test. In total, validation data set and test data set was built with 6 volumes, respectively. The remaining 30 volumes were used as training data set. The segmentation of liver, spleen, as well as left and right kidney, were evaluated. 0.9 was taken as training weight and test weight. A comparison model was experimented using same framework and same condition but with randomly generated data sets. The results were summarized in Table 2 and presented in Fig. 4. The data selection approach improved all final segmentation. Right kidney and left kidney has slight improvement with 0.00% and 0.7%, respectively. Liver has small improvement with about 1.5%. Spleen has significant improvement around 24.4%. 4 Discussion We proposed a manifold learning based approach to reduce the bias of the data selection. The proposed approach was implemented into an atlas registration framework and a deep learning framework. The comparison evaluation showed the benefit of the data selection approach. Both of the machine learning methods can be improved by adding the proposed approach. The approach can be tested for more machine learning methods in future. Moreover, more manifold learning techniques can be investigated further. Furthermore, other clustering techniques can be used. ACKNOWLEDGMENTS This work was supported by the German Research Foundation (DFG) through research grant No. KA 1678/20, LE 2763/2-1 and MA 4898/5-1. References 1. Aljabar P, Wolz R, Rueckert D. Manifold learning for medical image registration, segmentation, and classification. Machine Learning in Computer-Aided Diagnosis.

6 Chen et al. 2012; p. 351. 2. Maier A, Schuster M, Eysholdt U, et al. QMOS - A Robust Visualization Method for Speaker Dependencies with Different Microphones. J Pattern Recognit Res. 2009;4(1):32 51. 3. Wachinger C, Navab N. Manifold Learning for Multi-Modal Image Registration. Proc 11th BMVC. 2010 01; p. 1 12. 4. Wolz R, Aljabar P, Hajnal JV, et al. LEAP: Learning embeddings for atlas propagation. NeuroImage. 2010;49(2):1316 1325. 5. Chen S, Endres J, Dorn S, et al. A Feasibility Study of Automatic Multi-Organ Segmentation Using Probabilistic Atlas. Proc BVM. 2017; p. 218 223. 6. Chen S, Roth H, Dorn S, et al. Towards Automatic Abdominal Multi-Organ Segmentation in Dual Energy CT using Cascaded 3D Fully Convolutional Network;. 7. Roth HR, Oda H, Hayashi Y, et al. Hierarchical 3D fully convolutional networks for multi-organ segmentation;. 8. van der Maaten LJP, Postma EO, van den Herik HJ. Dimensionality Reduction: A Comparative Review; 2008. 9. Jiménez-del Toro OA, Dicente Cid Y, Depeursinge A, et al. Hierarchic Anatomical Structure Segmentation Guided by Spatial Correlations (AnatSeg Gspac): VIS- CERAL Anatomy3. Proc Visc Chall ISBI. 2015 Apr; p. 22 26. 10. Roweis ST, Saul LK. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science. 2000;290(5500):2323 2326.