A Comparison of wavelet and curvelet for lung cancer diagnosis with a new Cluster K-Nearest Neighbor classifier


HAMADA R. H. AL-ABSI (1) AND BRAHIM BELHAOUARI SAMIR (2)
(1) Department of Computer and Information Sciences, Faculty of Science and Information Technology, Universiti Teknologi PETRONAS, Bandar Seri Iskandar, 31750 Tronoh, Perak, Malaysia
(2) College of Science, ALFAISAL University, P.O. Box 50927, Riyadh 11533, Kingdom of Saudi Arabia
(1) Hamada.it@gmail.com (2) sbelhaouari@alfaisal.edu

Abstract: - This paper presents a comparison of wavelet and curvelet for lung cancer diagnosis in terms of diagnostic accuracy when each is applied separately with the Cluster K-Nearest Neighbor classifier. Lung cancer is among the diseases that lead to a high mortality rate globally. The computer aided diagnosis system presented in this paper consists of a preprocessing stage, a feature extraction stage (wavelet or curvelet), a feature selection stage and finally a classification stage. The results obtained on the x-ray dataset used suggest that wavelet produces better accuracy, with lower false positives and false negatives, than curvelet.

Key-Words: - lung cancer; curvelet; wavelet; computer aided diagnosis; feature selection; Cluster-k-NN

1 Introduction

Lung cancer is considered one of the main causes of death globally [1]. The GLOBOCAN project [2] reported approximately 1.61 million new cases in 2008 in both sexes (12.7% of all cancers) and 1,376,579 deaths (18.2% of all cancers) in the same year, also in both sexes. Figure 1 compares lung cancer with other types of cancer in terms of incidence and mortality in both sexes.

Fig. 1: Incidence and mortality of cancer types affecting both sexes (all ages) [2]

The figure shows that this cancer affects humans more than any other type of cancer and consequently leads to more deaths. Computer aided diagnosis (CAD) systems could assist in the early detection of lung cancer.
Methods for the detection and diagnosis of lung cancer in CAD systems have been developed in previous studies. A CAD system for pulmonary nodule detection in chest radiography was presented in [3]. The system employed an adaptive distance-based threshold algorithm for nodule segmentation. Geometric, intensity and gradient features were calculated for each segmented nodule, and a Fisher linear discriminant classifier was used for classification. The system correctly detected 78.1% of the nodules on a dataset of 167 chest radiographs containing 181 lung nodules.

Sousa et al. [4] presented a system for automatic lung cancer detection. The system consists of six stages, each of which performs a specific task such as segmentation, reconstruction or false-positive reduction. It achieved a sensitivity of 84.84% and a specificity of 96.15%. A system that detects lung nodules using shape-based genetic algorithm template matching (GATM) was proposed by Dehmeshki et al. [5]. A dataset of 70 CT images with 178 nodules was used to evaluate the system; 160 of those nodules were detected, an accuracy of 90%. A system with an ensemble classifier aided by clustering was proposed by Lee et al. [6]. The task of the system is to detect lung nodules. Scans of the lungs of 32 patients, comprising 5721 images, were used for evaluation. A sensitivity of 98.33% and a specificity of 97.11% were obtained.

Orbán et al. [7] developed a method for lung nodule detection. The method starts with an algorithm that preprocesses the radiograph by removing the ribs that surround the lungs, which increases the visibility of any nodule that might exist. Another algorithm, the Constrained Sliding Band Filter (CSBF), is then used to increase the intensity of round-shaped objects. Finally, an SVM based on texture features is used to decrease the number of false detections. The JSRT dataset was used together with a private database, and the method achieved 61% sensitivity at 2.5 false positives per image.

Pereira et al. [8] introduced an approach for the classification of lung nodules with multiple classifiers. The approach starts by filtering the image with a multi-scale filter bank composed of 36 filters at different scales and orientations, including isotropic, Gaussian and Laplacian of Gaussian filters. Multiple classifiers based on different multi-layer perceptrons (MLP) were used to classify the images. The authors evaluated the approach on the JSRT dataset with 19 classifier combinations; the Borda count combination produced 97% sensitivity with 43% error. Other combinations produced lower errors (except 1 and 9), but not lower than 16.21%. The authors concluded that the low performance is due to the large number of combinations.

ISBN: 978-1-6184-148-7
As this is an active area of research, many other CAD systems have been developed for lung cancer and for many other types of cancer. However, CAD systems still have limitations, such as difficulty in diagnosing subtle regions and high false-positive rates. In this paper, we address these issues by comparing the performance of wavelets and curvelets.

2 Method

The CAD system presented in this paper consists of a preprocessing stage in which all images are filtered with a Laplacian filter, a feature extraction stage using wavelet or curvelet, a two-step feature selection stage and, finally, classification with the Cluster K-Nearest Neighbor classifier. Figure 2 illustrates the system. The method is first trained on a dataset of regions that are normal and abnormal, benign and malignant; once training is done, the method is tested on a set of images that were not used in training.

Fig. 2: CAD system overview (flowchart boxes: Dataset; Preprocessing: apply Laplacian filter; Feature Extraction: wavelet (1...6 levels) or curvelet (2...7 scales); Feature Selection: calculate statistical energy, calculate statistical metric; Testing Image; Classification: Cluster K-Nearest Neighbor; Result)

The following subsections explain each step of the CAD system.

2.1 Preprocessing

For each image, in either training or testing, a Laplacian filter is applied to enhance the image by sharpening it. The Laplacian is defined as follows [9].

Linear form:

\[ \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \tag{1} \]

Discrete form, in the x-direction:

\[ \frac{\partial^2 f}{\partial x^2} = f(x+1, y) + f(x-1, y) - 2 f(x, y) \tag{2} \]

In the y-direction:

\[ \frac{\partial^2 f}{\partial y^2} = f(x, y+1) + f(x, y-1) - 2 f(x, y) \tag{3} \]

Combining the two directions:

\[ \nabla^2 f = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y) \tag{4} \]

The filtering process eliminates the image background; to recover it, we subtract the filtered image from the original image. Figure 3 shows an example of applying the Laplacian filter to image 1 of the dataset.

Fig. 3: (a) Original image segment, (b) Laplacian-filtered image, (c) Laplacian-filtered image after recovering the background

2.2 Feature Extraction

The next step in the process is feature extraction. Wavelet and curvelet are used separately in order to compare their performance and determine which achieves the better result in the diagnosis of lung cancer.

2.2.1 Wavelet Transform

The wavelet transform can be interpreted as a decomposition of the signal into a set of independent, spatially oriented frequency channels. Suppose that Φ(x) and Ψ(x) are, respectively, a perfect low-pass and a perfect band-pass filter. In the frequency domain, the image is decomposed into four components: one corresponds to the lowest frequencies (the approximation), one gives the vertical high frequencies (horizontal edges), one the horizontal high frequencies (vertical edges) and one the high frequencies in both directions (the diagonal).

Fig. 4: Example of wavelet decomposition with level 3

For the purpose of this paper, the db1 wavelet with 6 levels is used.

2.2.2 Curvelet Transform

The discrete curvelet transform is an image representation approach [10, 11]. It is based on the idea of representing a curve as a superposition of functions of various lengths and widths obeying the curvelet scaling law width ≈ length^2 [10, 11]. Fig. 5 presents the curvelet analysis method.
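As a rough illustration of the preprocessing step (Section 2.1) and the db1 decomposition (Section 2.2.1), the following sketch sharpens an image with the 4-neighbour discrete Laplacian and then applies a multi-level Haar transform (db1 coincides with the Haar wavelet). It is not the authors' implementation. The function names, the list-of-lists image representation, the replicated border, the orthonormal scaling and the requirement that the image sides be divisible by 2^levels are all assumptions made for this example; in practice a library such as PyWavelets would normally be used instead of hand-written loops.

```python
"""Sketch: Laplacian sharpening followed by a multi-level Haar (db1) decomposition."""


def laplacian(img):
    """4-neighbour discrete Laplacian; border pixels are replicated (assumption)."""
    h, w = len(img), len(img[0])

    def px(r, c):
        # Clamp the indices so that out-of-range neighbours reuse the border pixel.
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    return [
        [px(r + 1, c) + px(r - 1, c) + px(r, c + 1) + px(r, c - 1) - 4 * px(r, c)
         for c in range(w)]
        for r in range(h)
    ]


def sharpen(img):
    """Subtract the Laplacian-filtered image from the original image."""
    lap = laplacian(img)
    return [[img[r][c] - lap[r][c] for c in range(len(img[0]))]
            for r in range(len(img))]


def haar_step(img):
    """One level of the 2-D Haar transform on an image with even side lengths."""
    h, w = len(img) // 2, len(img[0]) // 2
    approx = [[0.0] * w for _ in range(h)]
    horiz = [[0.0] * w for _ in range(h)]
    vert = [[0.0] * w for _ in range(h)]
    diag = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            d, e = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            approx[r][c] = (a + b + d + e) / 2.0  # low frequencies
            horiz[r][c] = (a + b - d - e) / 2.0   # top row minus bottom row: horizontal edges
            vert[r][c] = (a - b + d - e) / 2.0    # left column minus right column: vertical edges
            diag[r][c] = (a - b - d + e) / 2.0    # diagonal detail
    return approx, (horiz, vert, diag)


def haar_decompose(img, levels):
    """Return the coarsest approximation and the detail bands, coarsest level first."""
    details = []
    approx = img
    for _ in range(levels):
        approx, bands = haar_step(approx)
        details.append(bands)
    return approx, details[::-1]


# Toy 8 x 8 "image"; the paper uses 128 x 128 sub-images and 6 levels.
image = [[float((r * 8 + c) % 7) for c in range(8)] for r in range(8)]
sharpened = sharpen(image)
approx, details = haar_decompose(sharpened, levels=3)
print(len(approx), len(approx[0]), len(details))
```

With a 128 x 128 sub-image and 6 levels, the same function would leave a 2 x 2 approximation plus six sets of detail bands, which is the kind of coefficient set the feature selection stage then has to reduce.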
Fig. 5: Curvelet method

Further reading about the curvelet method can be found in [11].

2.3 Feature Selection

The feature extraction stage produces a huge number of coefficients. It is important to reduce them by selecting the coefficients that contain the most important information and contribute to high accuracy, and ignoring the rest. For this reason, we use two steps for feature selection: calculating the statistical energy and then the statistical metric. The statistical energy is calculated as follows:

[Equation (5) not recoverable]

A statistical metric for feature selection is introduced in this paper. It can be calculated as follows. Suppose m1, m2 and m3 are the means of class 1, class 2 and class 3, respectively, and consider also the mean of all the classes. A quantity built from these means alone [definition not recoverable] is not sufficient to quantify the classification contribution of the coefficients, because it may give the same value in the two cases. Therefore, another metric is needed to quantify the contribution of the coefficients. We introduce it as follows:

[Equation (6) not recoverable]

where the symbols denote the statistical metric of a class, the mean of that class, and the number of classes. One of these terms is calculated using a further formula [not recoverable] that involves the number of features in the class.

The desired feature coefficients are selected as follows: if the statistical metric of a feature is less than a certain threshold, the feature is removed; otherwise it is kept.

2.4 Classification

A classifier based on the combination of a modified K-means algorithm and the K-Nearest Neighbor (K-NN) algorithm is applied in this research. This classifier was developed by Brahim Belhaouari Samir [12].

2.4.1 Cluster-K-Nearest Neighbor Classifier (C-K-NN)

The Cluster-K-Nearest Neighbor is a classifier that combines two algorithms: the modified K-means algorithm and the K-Nearest Neighbor. K-means is used to cluster the data into classes and sub-classes, with a centre point representing each class, and K-Nearest Neighbor is used to classify new data by calculating the Euclidean distance between the centre point of each class and the new data. With this combination, classification is more accurate and takes less time. A full explanation of the algorithm can be found in [12].

3 Dataset

The JSRT (Japanese Society of Radiological Technology) standard chest radiograph dataset [13] is used to evaluate the methods for lung cancer. There are 247 chest radiographs in this dataset: 154 contain nodules (100 malignant and 54 benign) and 93 do not. A 128 x 128 sub-image containing the nodule was selected from each original image, with the centre of the nodule at the centre of the sub-image. Figure 6 shows an example of one chest radiograph.

Fig. 6: An example from the JSRT dataset (JPCLN1.IMG): (a) original chest radiograph, (b) extracted sub-image [13]

4 Results

This section reports the results obtained. There were two experiments: the first on the classification of normal vs. abnormal images and the second on the classification of benign vs. malignant images.

4.1 Normal vs. Abnormal
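Both experiments rely on the Cluster-K-NN classifier of Section 2.4.1. As a rough illustration of that idea, and not the authors' implementation, the sketch below clusters each class separately and assigns a new sample the label of the nearest centre point by Euclidean distance. Plain k-means with a deterministic start is used here as a stand-in for the modified K-means of [12], whose details are not given in this paper; the function names, the toy two-dimensional features and the choice of two centres per class are likewise assumptions.

```python
"""Sketch of the Cluster-K-NN idea: per-class k-means, then nearest-centre labelling."""
import math


def distance(p, q):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iterations=20):
    """Plain k-means (stand-in for the modified variant); starts from the first k points."""
    centres = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: distance(p, centres[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centre.
        centres = [centroid(g) if g else c for g, c in zip(groups, centres)]
    return centres


def fit(samples, labels, k):
    """Cluster every class on its own and return a list of (centre, label) pairs."""
    model = []
    for label in sorted(set(labels)):
        members = [s for s, lab in zip(samples, labels) if lab == label]
        for centre in kmeans(members, min(k, len(members))):
            model.append((centre, label))
    return model


def predict(model, sample):
    """Label of the centre point closest to the sample."""
    _, label = min(model, key=lambda pair: distance(sample, pair[0]))
    return label


# Toy features: each class consists of two sub-clusters.
train = [(0.1, 0.2), (0.9, 1.0), (0.2, 0.1), (1.0, 0.9),
         (3.0, 3.1), (4.0, 4.2), (3.1, 3.0), (4.1, 4.0)]
labels = ["normal"] * 4 + ["abnormal"] * 4
model = fit(train, labels, k=2)
print(predict(model, (0.8, 0.8)), predict(model, (3.5, 3.4)))
```

Because a new sample is compared only with a handful of centre points rather than with every training sample, prediction cost depends on the number of centres, which is consistent with the speed advantage claimed for the combination.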

Table 1 shows the results obtained for the classification of normal vs. abnormal with the db1 wavelet function, and Table 2 shows those for curvelet.

Table 1: Normal vs. Abnormal (db1 wavelet)
[Flattened table; zeros and column alignment lost. Surviving headers: Function, Level, False Negatives. Values as extracted: Db1 .9915 1 .9829 .9915 .9915 .9915 .9915]

Table 2: Normal vs. Abnormal (curvelet)
[Flattened table; zeros and column alignment lost. Surviving headers: Curvelet Scale, False Negatives. Values as extracted: .6239 .3919 .3488 .79 .2568 .2791 .7863 .811 .698 .7692 .811 .93 .767 .1351 .93 .767 7 .1216 .93]

As shown in Tables 1 and 2, the wavelet (db1) function outperforms curvelet: an accuracy of 99.15% is reached with db1, whereas the highest accuracy of curvelet is 78.63%, at scale 4.

4.2 Benign vs. Malignant

Tables 3 and 4 report the performance of wavelet and curvelet when applied to the classification of benign vs. malignant cases.

Table 3: Benign vs. Malignant (db1 wavelet)
[Flattened table; zeros and column alignment lost. Surviving headers: Function, Level, False Negatives. Values as extracted: Db1 .9481 1 .9481 .9351 .9481 .961 .9481]

Table 4: Benign vs. Malignant (curvelet)
[Flattened table; zeros and column alignment lost. Surviving headers: Curvelet Scale, False Negatives. Values as extracted: .7143 .8 .1111 .991 .4 .991 .2 .991 .4 .991 .2 7 .991 .4]

As shown in Tables 3 and 4, the wavelet (db1) function performed better than curvelet when both were applied separately to the classification of benign vs. malignant cases. Moreover, the performance of wavelet with other functions, such as haar, was also better than that of curvelet. This is evidence that wavelet may be better suited to such classification tasks; however, it should be pointed out that further testing with multiple datasets of different modalities is needed to confirm this. As noted before, the images used are x-ray images; further testing with CT images will also be required.

5 Conclusion

The paper presented a comparison of the performance of wavelet and curvelet when applied separately for lung cancer diagnosis. The JSRT dataset was used in the experiments, and the results obtained suggest that wavelet performs better than curvelet in this CAD system. Although only one wavelet function was reported in this paper, the performance of other wavelet functions (e.g. haar) was also better than that of curvelet.

References:
[1] World Health Organization (WHO). (2012). World Health Statistics 2012. Available: http://www.who.int/gho/publications/world_health_statistics/en_whs212_full.pdf
[2] J. Ferlay, et al. (2010, Oct. 4). GLOBOCAN 2008 v2.0, Cancer Incidence and Mortality Worldwide: IARC CancerBase No. 10. Available: http://globocan.iarc.fr
[3] R. C. Hardie, et al., "Performance analysis of a new computer aided detection system for identifying lung nodules on chest radiographs," Medical Image Analysis, vol. 12, pp. 240-258, 2008.
[4] J. R. F. da Silva Sousa, et al., "Methodology for automatic detection of lung nodules in computerized tomography images," Computer Methods and Programs in Biomedicine, vol. 98, pp. 1-14, 2010.
[5] J. Dehmeshki, et al., "Automated detection of lung nodules in CT images using shape-based genetic algorithm," Computerized Medical Imaging and Graphics, vol. 31, pp. 408-417, 2007.
[6] S. L. A. Lee, et al., "Random forest based lung nodule classification aided by clustering," Computerized Medical Imaging and Graphics, vol. 34, pp. 535-542, 2010.
[7] G. Orbán, et al., "Lung Nodule Detection on Rib Eliminated Radiographs," in XII Mediterranean Conference on Medical and Biological Engineering and Computing 2010, vol. 29, P. Bamidis and N. Pallikarakis, Eds. Springer Berlin Heidelberg, 2010, pp. 363-366.
[8] C. Pereira, et al., "A Multiclassifier Approach for Lung Nodule Classification," in Image Analysis and Recognition, vol. 4142, A. Campilho and M. Kamel, Eds. Springer Berlin Heidelberg, 2006, pp. 612-623.
[9] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd ed. New Jersey: Prentice Hall, 2008.
[10] M. M. Eltoukhy, et al., "A comparison of wavelet and curvelet for breast cancer diagnosis in digital mammogram," Computers in Biology and Medicine, 2010.
[11] E. Candes and D. Donoho, "Curvelets: multiresolution representation, and scaling laws," in Wavelet Applications in Signal and Image Processing VIII, Proceedings of the SPIE, 2000.
[12] B. B. Samir, "Fast and Control Chart Pattern Recognition using a New cluster-k-Nearest Neighbor," Journals of World Academy of Science, Engineering and Technology, 2009.
[13] J. Shiraishi, et al., "Development of a Digital Image Database for Chest Radiographs With and Without a Lung Nodule," American Journal of Roentgenology, vol. 174, pp. 71-74, January 1, 2000.