Chapter 7 UNSUPERVISED LEARNING TECHNIQUES FOR MAMMOGRAM CLASSIFICATION

Similar documents
Chapter 6 CLASSIFICATION ALGORITHMS FOR DETECTION OF ABNORMALITIES IN MAMMOGRAM IMAGES

CHAPTER 6 DETECTION OF MASS USING NOVEL SEGMENTATION, GLCM AND NEURAL NETWORKS

CHAPTER 6 MODIFIED FUZZY TECHNIQUES BASED IMAGE SEGMENTATION

CHAPTER 4 SEGMENTATION

Global Journal of Engineering Science and Research Management

Medical Image Feature, Extraction, Selection And Classification

Available Online through

Mass Classification Method in Mammogram Using Fuzzy K-Nearest Neighbour Equality

Clustering and Visualisation of Data

ISSN: (Online) Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies

CHAPTER 3 TUMOR DETECTION BASED ON NEURO-FUZZY TECHNIQUE

SVM-based CBIR of Breast Masses on Mammograms

Hybrid Approach for MRI Human Head Scans Classification using HTT based SFTA Texture Feature Extraction Technique

EE 589 INTRODUCTION TO ARTIFICIAL NETWORK REPORT OF THE TERM PROJECT REAL TIME ODOR RECOGNATION SYSTEM FATMA ÖZYURT SANCAR

Diagnosis of Breast Cancer using Wavelet Entropy Features

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

An Enhanced Mammogram Image Classification Using Fuzzy Association Rule Mining

Texture Image Segmentation using FCM

Change Detection in Remotely Sensed Images Based on Image Fusion and Fuzzy Clustering

Handwritten Script Recognition at Block Level

Computer-aided detection of clustered microcalcifications in digital mammograms.

CHAPTER 6 ENHANCEMENT USING HYPERBOLIC TANGENT DIRECTIONAL FILTER BASED CONTOURLET

Mammogram Segmentation using Region based Method with Split and Merge Technique

A ranklet-based CAD for digital mammography

Tumor Detection and classification of Medical MRI UsingAdvance ROIPropANN Algorithm

CHAPTER-1 INTRODUCTION

2. LITERATURE REVIEW

MR IMAGE SEGMENTATION

IMPLEMENTATION OF FUZZY C MEANS AND SNAKE MODEL FOR BRAIN TUMOR DETECTION

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

PERFORMANCE EVALUATION OF A-TROUS WAVELET-BASED FILTERS FOR ENHANCING DIGITAL MAMMOGRAPHY IMAGES

Improving the detection of excessive activation of ciliaris muscle by clustering thermal images

Norbert Schuff VA Medical Center and UCSF

6. Dicretization methods 6.1 The purpose of discretization

Image Processing Techniques for Brain Tumor Extraction from MRI Images using SVM Classifier

Efficient Image Compression of Medical Images Using the Wavelet Transform and Fuzzy c-means Clustering on Regions of Interest.

Fuzzy C-MeansC. By Balaji K Juby N Zacharias

Unsupervised Learning : Clustering

Fuzzy Segmentation. Chapter Introduction. 4.2 Unsupervised Clustering.

CHAPTER 3 WAVELET DECOMPOSITION USING HAAR WAVELET

A CRITIQUE ON IMAGE SEGMENTATION USING K-MEANS CLUSTERING ALGORITHM

10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors

Feature Extraction from Wavelet Coefficients for Pattern Recognition Tasks. Rajat Aggarwal Chandu Sharvani Koteru Gopinath

The Anatomical Equivalence Class Formulation and its Application to Shape-based Computational Neuroanatomy

Available online Journal of Scientific and Engineering Research, 2019, 6(1): Research Article

AN IMPROVED K-MEANS CLUSTERING ALGORITHM FOR IMAGE SEGMENTATION

Harish Kumar N et al,int.j.computer Techology & Applications,Vol 3 (1),

Tumor Detection in Breast Ultrasound images

Using Machine Learning for Classification of Cancer Cells

NORMALIZATION INDEXING BASED ENHANCED GROUPING K-MEAN ALGORITHM

Comparative Study of Clustering Algorithms using R

Fuzzy C-means Clustering with Temporal-based Membership Function

Facial Expression Recognition Using Non-negative Matrix Factorization

CHAPTER 6. 6 Huffman Coding Based Image Compression Using Complex Wavelet Transform. 6.3 Wavelet Transform based compression technique 106

Gene Clustering & Classification

Breast Cancer Detection and Classification Using Ultrasound and Ultrasound Elastography Images

A WAVELET BASED BIOMEDICAL IMAGE COMPRESSION WITH ROI CODING

Extraction and Features of Tumour from MR brain images

CS 521 Data Mining Techniques Instructor: Abdullah Mueen

MRI Brain Image Analysis and Classification for Computer-Assisted Diagnosis

CHAPTER 6 HYBRID AI BASED IMAGE CLASSIFICATION TECHNIQUES

Neural Network based textural labeling of images in multimedia applications

CAD SYSTEM FOR AUTOMATIC DETECTION OF BRAIN TUMOR THROUGH MRI BRAIN TUMOR DETECTION USING HPACO CHAPTER V BRAIN TUMOR DETECTION USING HPACO

ISSN: X Impact factor: 4.295

International Research Journal of Engineering and Technology (IRJET) e-issn:

Best First and Greedy Search Based CFS and Naïve Bayes Algorithms for Hepatitis Diagnosis

Content based Image Retrievals for Brain Related Diseases

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

Clustering in Ratemaking: Applications in Territories Clustering

Feature Extraction and Texture Classification in MRI

A COMPARATIVE STUDY OF DIMENSION REDUCTION METHODS COMBINED WITH WAVELET TRANSFORM APPLIED TO THE CLASSIFICATION OF MAMMOGRAPHIC IMAGES

Classification of Microcalcification in Digital Mammogram using Stochastic Neighbor Embedding and KNN Classifier

A Survey on Image Segmentation Using Clustering Techniques

Semi-Supervised Clustering with Partial Background Information

CHAPTER 8 COMPOUND CHARACTER RECOGNITION USING VARIOUS MODELS

IMAGE ANALYSIS, CLASSIFICATION, and CHANGE DETECTION in REMOTE SENSING

Study Of Spatial Biological Systems Using a Graphical User Interface

MRI Classification and Segmentation of Cervical Cancer to Find the Area of Tumor

Classification Performance related to Intrinsic Dimensionality in Mammographic Image Analysis

Analysis of classifier to improve Medical diagnosis for Breast Cancer Detection using Data Mining Techniques A.subasini 1

Introduction to Mobile Robotics

Classification of Microcalcification Clusters via PSO-KNN Heuristic Parameter Selection and GLCM Features

Image Compression Algorithm for Different Wavelet Codes

Computer-Aided Detection system for Hemorrhage contained region

Image Analysis - Lecture 5

Classification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University

1. Introduction. 2. Motivation and Problem Definition. Volume 8 Issue 2, February Susmita Mohapatra

AN IMPROVED HYBRIDIZED K- MEANS CLUSTERING ALGORITHM (IHKMCA) FOR HIGHDIMENSIONAL DATASET & IT S PERFORMANCE ANALYSIS

Datasets Size: Effect on Clustering Results

Automated Detection of Breast Cancer Using SAXS Data and Wavelet Features

Local Independent Projection Based Classification Using Fuzzy Clustering

Improving Positron Emission Tomography Imaging with Machine Learning David Fan-Chung Hsu CS 229 Fall

Adaptive Quantization for Video Compression in Frequency Domain

Unsupervised Learning

CHAPTER 3 A FAST K-MODES CLUSTERING ALGORITHM TO WAREHOUSE VERY LARGE HETEROGENEOUS MEDICAL DATABASES

Image Analysis, Classification and Change Detection in Remote Sensing

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1

Texture Based Image Segmentation and analysis of medical image

IMAGE DIGITIZATION BY WAVELET COEFFICIENT WITH HISTOGRAM SHAPING AND SPECIFICATION

Keywords Clustering, Goals of clustering, clustering techniques, clustering algorithms.

Transcription:

UNSUPERVISED LEARNING TECHNIQUES FOR MAMMOGRAM CLASSIFICATION Supervised and unsupervised learning are the two prominent machine learning algorithms used in pattern recognition and classification. In this chapter performance of K-means and Fuzzy C mean clustering algorithms in classifying mammogram images is analysed. The GLCM and wavelet features of the mammogram images are extracted from the images in the Mini-MIAS database. Using this, cluster centres corresponding to the two classes-benign and Malignant are computed, employing K-means as well as Fuzzy C means. These cluster centres are then used for classifying a set of mammogram images obtained locally. The classification results are compared with the opinions of two radiologists, and found that they are in agreement.

7.1 Introduction Learning and adaptation is the prominent key feature of any machine intelligence system. Any method that incorporates information from the training samples in the design of a classifier employs learning. All pattern recognition systems build models based on the training samples used for learning. This model acts as the base of a classifier. Creating classifiers involves posting some general form of model, or form of the classifier, and using training patterns to learn or estimate the unknown parameters of the model. The model of a classifier can be constructed mainly using two different machine learning techniques supervised and unsupervised learning. In supervised learning, a teacher provides a category label or cost for each pattern in a training set, and seeks to reduce the sum of the costs for the patterns. In unsupervised learning, which is also known as clustering, have no explicit teacher, and the system forms clusters or natural groups of the input patterns and form particular set of patterns or cost function [Duda et.al, 2001]. An automatic CAD system can provide this sort of learning ability for the detection and classification of benign and malignant categories of mammograms images. In the previous chapters we implemented certain classification algorithms aiding the development of CAD system using the supervised learning methodology. In those systems we have used a pre labeled set of features from the dataset for classification. But in unsupervised learning, different clustering techniques are adopted for the classification of the mammogram images. In literature there are many studies conducted in the area of clustering for analysis and classification of digital mammograms. Most of these studies produced significant results too. Concluding our works without mentioning the clustering algorithms will not fulfill the intention of our thesis. So in this chapter we focus on a pilot study on two different 196 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification unsupervised classification algorithms such as K-means and Fuzzy C mean clustering algorithms and its implementation for classifying the mammogram images. For the formation of clusters, features are extracted from the standard database. Subsequently the performance of the models is evaluated with a set of real data obtained from local hospitals. 7.2 Unsupervised Learning Methods Unsupervised learning or Clustering is a data division process which groups set of data points into non overlapping groups where data points in a group are more similar to one another than the data points in other groups. Each group is called as a cluster. By grouping data into cluster build a data model that puts the cluster in historical perspective rooted in mathematical, statistics and numerical analysis. In machine intelligence, cluster corresponds to the hidden patterns in the data. This type of learning evolves a data concept. Clustering plays a prominent role in data mining applications such as scientific data exploration, information retrieval, text mining, spatial database applications, marketing, web analysis, computation biology and medical diagnosis. Clustering algorithms are generally classified into two categories hard clustering and fuzzy clustering. In hard clustering all data points belonging to a cluster have full membership in that data whereas in fuzzy clustering a data point belongs to more than one cluster at a time. Hence, in fuzzy clustering a data point can have partial membership. [Halawani et.al, 2012]. 7.2.1 K-means Clustering The K-means clustering is a simple clustering method which uses iterative technique to partition n observation into k clusters. The partition of n observation into k clusters is based on the nearest mean principle. Even Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 197

though it is fast and simple in execution, the clustering will not converge if the selection of initial cluster center is not made properly. K-means algorithm is an unsupervised clustering algorithm that classifies the input data points into multiple classes based on their inherent distance from each other. The algorithm assumes that the data features form a vector space and tries to find natural clustering in them. [Dalmiya et.al, 2012]. The basic k- means clustering algorithm is as follows: Step 1 : Choose k = # of clusters. Step 2 : Pick k data points randomly from the dataset. These data points act as the initial cluster centers Step 3 : Assign each data point from the n observation into a cluster with the minimum distance between the data point and cluster centre. Step 4 Step 5 : Re-compute the cluster centre by averaging all of the data points in the cluster. : Repeat step 3 and step 4 until there is no change in cluster centers Therefore K-means clustering, the key endeavor is to partitions the n observation into k sets (k<n) s={s1,s2,s3,..sk} so as to minimize the within cluster sum of squares. K arg min. i= 1 x s j i x j u i 2 (7.1) Where u i is the mean of points in S i, K is the number of clusters and x j is the j th data point in the observations [Ramani et.al, 2013][Gumaei et.al,2012]. 198 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification 7.2.2 Fuzzy C-means clustering The Fuzzy C-means (FCM) algorithm is a method of clustering which allows one of the n observations belongs to two or more clusters. It is a frequently used method in pattern recognition [Thangavel and Mohideen, 2010]. It is based on the minimization of the following objective function to achieve a good classification. J m K c = i= 1 j= 1 u xi c j 2, 1 m < (7.2) Where m is any real number greater than 1, u is the degree of membership of x in the cluster j, x i i is the i th of the d-dimensional measured data, c j is the d-dimensional center of the cluster and * is any norm expressing the similarity between any measured data and the center. Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of member ship u in equation (7.3) and the cluster centers C j by equation (7.4) u = 1 xi c j xi ck 2 m 1 (7.3) c j k i = 1 = k u i = 1 u. x i (7.4) Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 199

The iteration will stop when max k + 1 k = u u < (7.5) Where the termination criterion and k is is the iteration steps. This procedure converges to a local minimum or a saddle point of J m.. The fuzzy c-mean algorithm is defined as the following steps Step 1: Initialize U = [U ] matrix, U (0) Step 2: At k th step calculate the center vectors using equation C N m u. xi i= 1 j = N m u i= 1 C k = [C j ] with U (k) (7.6) Step 3: Update U (k) and U (k+1) using equation u = 1 xi c j xi ck 2 m 1 (7.7) k+ 1 k Step 4: If u u < then stop, otherwise return to step 2. 7.3 Related Work Basha and Prasad [Basha and Prasad, 2009] proposed a method using morphological operators and fuzzy C-mean clustering algorithm for the detection of breast tumor mass segmentation from the mammogram images. Initially this algorithm isolated masses and microcalcification from background tissues using morphological operators and then used fuzzy C- mean (FCM) algorithm for the intensity-based segmentation. The results 200 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification indicated that this system could facilitate the doctor to detect breast cancer in the early stage of diagnosis. Wang et.al [Wang et.al, 2011] proposed a clustering algorithm which utilized the extended Fuzzy clustering techniques for classifying mammogram images into three categories of images such as image with no lesions, image with microcalcifications and image with tumors. They used 60 mammogram images from each group comprised of total 180 images for the study. Twenty images from each group were taken for training and the remaining for testing purpose. The analysis of clustering algorithm achieved a mean accuracy of 99.7% compared with the findings of the radiologists. They also compared the result with other classification techniques like Multilayer Perceptron (MLP), Support Vector Machine (SVM) and fuzzy clustering. The result obtained by extended fuzzy clustering technique was far better than all other methods. Singh and Mohapatra [Singh and Mohapatra,2011] made a comparative study on K-means and Fuzzy C means clustering algorithm by accounting the masses in the mammogram images. In the K-means clustering algorithm they did simple distance calculation for isolating pixels in the images into different clusters whereas Fuzzy C means clustering algorithm used inverse distance calculation to the center point of the cluster. They used both clustering techniques as a segmentation strategy for better classification that aimed to group data into separate groups according to the characteristics of the image. Using these methods, authors claimed that K-means algorithm can easily identifies the masses or the origin of the cancer point in the image and Fuzzy C-means locate how much area of the image was spread out of the cancer. Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 201

Ahirwar and Jadon [Ahirwar and Jadon,2011] proposed a method for detecting the tumor area of the mammogram images using an unsupervised learning methods like Self Organizing Map(SOM) and Fuzzy C mean clustering. These two methods perform clustering of high dimensional dataset into different clusters with similar features. Initially they constructed different Gray Level Co-occurrence Matrices (GLCM) in various orientations of the mammogram images and then extracted different GLCM features from that matrix for distinguishing abnormal and normal images in the dataset. Using the above two techniques they evaluated the different properties of the mammogram image such as name of the tumor region, type of region, average gray scale value of the region, area of the region and centroid of the region to distinguish the normal and abnormal images in the dataset. The authors argued that the above two methods detect higher pixel values that indicate the area of tumor in the image. Balakumaran et.al [Balakumaran et.al, 2010] proposed a novel method named fuzzy shell clustering for the detection of microcalcifications in the mammogram images. This method comprises image enhancement followed by the detection of microcalcification. Image enhancement was achieved by using wavelet transformation and the detection of microcalcification was done by fuzzy shell clustering techniques. Original Mammogram images were decomposed by applying dyadic wavelet transformations. Enhancement was achieved by setting lowest frequency sub bands of the decomposition to zero. Then image was reconstructed from the weighted detailed sub bands. The visibility of the resultant s image was greatly improved. Experimental results were confirmed that microcalcification could be accurately detected by fuzzy shell clustering algorithms. Out of 112 images in the dataset 95% of 202 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification the microcalcifications were identified correctly and 5% of the microcalcifications were failed to detect. Funmilola et.al [Funmilola et.al, 2012] proposed a novel clustering method fuzzy k-c means clustering algorithm for segmenting MRI images of human brain. Fuzzy k-c mean clustering is a combination of both k-means and fuzzy c means clustering algorithms which has better result in time utilization. In this method they utilized the number of iteration to that of fuzzy c-means clustering algorithm and still received optimum result. The performance of the above algorithm was tested by taking three different MRI images. The time, accuracy and number of iterations were focused on this method and they were found to be far better than k-means and fuzzy C means. Martins et.al [Martins et.al, 2009] proposed a mammogram segmentation algorithm for mass detection using K-means and Support Vector Machine classifier. Images were segmented using K-means clustering technique followed by the formation of Gray Level Co-Occurrence Matrices of the segmented images were formed at different orientations. Different GLCM features along with certain structural features were extracted. All these features were used for training and testing of the DDSM mammogram images and achieved 85% classification accuracy for identifying masses in the mammogram images. Barnathan [Barnathan, 2012] proposed a novel clustering algorithm called wave cluster for segmenting the mammogram images using wavelet features. It is a grid and density based clustering technique unique for performing clustering in wavelet space. This is an automatic clustering algorithm in which there is no need for specifying the number of clusters Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 203

compared to all other conventional clustering algorithms. The images were decomposed into different frequency bands or coefficients using wavelet transformations. They used biorthogonal wavelet for decomposition. After decomposition, approximation coefficients were retained for further processing. Using specific threshold value, non significant data in the decomposition coefficients were suppressed. Then the connected component algorithm was used to discover and label clusters for the significant cells. Finally cells are mapped back to the data using a lookup table built during quantization. Using this approach, they were able to segment the breast profile from all 150 images, leaving minor residual noise adjacent to the breast. The performance on ROI extraction of this method was excellent, with 81% sensitivity and 0.96 false positives per image against manually segmented ground truth ROIs. Dalmiya et.al [Dalmiya et.al, 2012] proposed a method using wavelet and K-means clustering algorithm for cancer tumor segmentation from MRI images. This method was purely intensity based segmentation and robust against noise. Discrete Wavelet transformation was used to extract high level details of the MRI images. The processed image was then added to the original image to sharpen the image. Then K-means algorithm was applied on the sharpened image to locate tumor regions using thresholding technique. This method provides better result compared to conventional K-means and fuzzy C-mean clustering algorithms. Halawani et.al [Halawani et.al, 2012] presented three different clustering algorithms to classify the digital mammogram images into different category. They performed various clustering algorithms such as Expectation Maximization (EM), Hierarchical clustering, K-means 204 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification algorithms for the classification of 961 images with two cluster center. The performance of the above clustering techniques was very promising. Patel and Sinha [Patel and Sinha, 2010] proposed a modified K-means clustering algorithm named as adaptive K-means clustering for breast image segmentation for the detection of microcalcifications. The feature selection was based on the number, color and shape of objects found in the image. The number of Bins values, number of classes and sizes of the objects was considered as appropriate features for retrieval of the image information. The detection accuracy was estimated and compared with existing works and reported that the accuracy was improved by adaptive K-means clustering instead of K-means algorithm. The results were found to be satisfactory when subjected to validation by Radiologists. The classification accuracy obtained by this method was 77% for benign and 91% for malignant. Al-Olfe et.al [Al-Olfe et.al, 2010] proposed a bi-clustering algorithm for classifying mammogram images into normal, benign, malignant, masses and microcalcification types based on the unsupervised clustering methods. The bi-clustering method cluster the data points in two dimensional spaces simultaneously instead of conventional cluster methods that cluster the data points either column wise or row wise. This method also reduces the number of features from high dimensional feature space using feature dimension reduction method. 100% specificity and sensitivity was achieved by this method for classifying normal and microcalcification tissues from the images. The classification accuracy of benign and malignant images was 91.7% sensitivity and 100% specificity respectively. Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 205

7.4 Classification of Mammogram Images Using Clustering Techniques The primary intention of the clustering algorithms discussed in this chapter is to label the digital mammogram images obtained from the local hospitals to benign and malignant category, which are not labeled so far. This is accomplished by making use of cluster center that are computed using the standard digital mammogram images (Mini-MIAS dataset) labeled by the experts. To label the local mammogram images, we used two different clustering algorithms viz. K-means and Fuzzy C means techniques. Two prominent feature sets GLCM and DWT wavelet transformation coefficients extracted from the datasets are used for the clustering. The algorithm for labeling the images is implemented in two phases. In the first phase, all the mammogram images in the standard dataset are enhanced using the method WSTA proposed in this thesis (Chapter 4). After the enhancement of the mammogram images, all the benign and malignant images available in the dataset are undergone segmentation for extracting the ROIs of suspected abnormal regions of the images. To extract the ROIs, an automated segmentation algorithm developed in this thesis is employed (Chapter 5). This is followed by the extraction of GLCM and Wavelet features for all the benign and malignant images in the dataset. For GLCM feature, a feature set of 16 features from the four different orientations of the GLCM are constructed. In the case of Wavelet transformation, the reduced wavelet approximation coefficients obtained after the three level wavelet decompositions are used. In order to reduce the transformation coefficients PCA is applied on the approximation coefficients obtained after the third level of DWT decomposition. The Eigen values obtained as the principal components of transformation is used as the wavelet feature. The coefficients obtained in GLCM as well as DWT are treated as the class core vectors for 206 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification computing the cluster centers of the both clustering algorithms. In the second phase, the same feature sets are extracted for the local digital mammogram images as explained in the first phase. After extracting the feature vectors, local digital mammogram images are labeled with reference to the cluster center computed in the first phase of the algorithm. 7.5 Results and Discussion The clustering algorithms discussed above are implemented in MATLAB. Initially the cluster centre is computed using the 116 abnormal images available in the Mini-MIAS dataset. Out of 116 abnormal images, the experts has already identified and labelled 64 benign and 53 malignant images. All these images are enhanced using WSTA. The enhanced images are then automatically segmented and ROIs of abnormal area of the images are extracted. From these ROIs, GLCM and Wavelet features are extracted and constructed the feature vector for each ROI. Based on these feature vector cluster centres are computed using K-means and Fuzzy C mean clustering algorithm. While computing the cluster centres of the standard mammogram images, the images in the dataset are labelled into benign and malignant category as shown in Table 7.1. Table 7.1: Classification details of clustering algorithms on Mini-MIAS database K-means Fuzzy C Means Feature Set Cluster 1 Cluster 2 Cluster 1 Cluster 2 (Benign) (Malignant) (Benign) (Malignant) GLCM 60 56 65 51 Wavelet (db4) 74 42 74 42 Wavelet (db8) 72 44 72 44 Wavelet(db16) 75 41 76 40 Wavelet(Haar) 74 42 73 43 Wavelet(Bior) 77 39 79 37 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 207

In the next level, feature vector for both features are extracted for the segmented ROIs of the locally available dataset. Using the cluster centre obtained for Mini-MIAS database for each feature set, the clustering algorithms are implemented using the ROIs of local dataset. While clustering all the images in the local dataset are labelled in two clusters as benign and malignant respectively. The clustering results obtained for these two clusters are shown in Table 7.2. Table 7.2: Classification details of clustering algorithms on locally available mammograms images. K-means Fuzzy C Means Feature Set Cluster 1 (Benign) Cluster 2 (Malignant) Cluster 1 (Benign) Cluster 2 (Malignant) GLCM 22 10 24 08 Wavelet (db4) 24 08 24 08 Wavelet (db8) 25 07 26 06 Wavelet(db16) 27 05 25 07 Wavelet(Haar) 27 05 26 06 Wavelet(Bior) 26 06 27 05 After consulting the local mammogram images with two Radiologists, they unanimously identified and labelled all the 7 malignant images from the dataset. By comparing our results with the expert decisions, we made following confusion matrix for the result obtained in K-means and Fuzzy C means clustering algorithms. 208 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Table 7.3: Clustering of mammogram images using K-means and Fuzzy C means clustering algorithms with GLCM and different wavelet features. Unsupervised Learning Techniques For Mammogram Classification Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 209

Performance of the above two algorithms are evaluated based on the confusion matrix shown in Table 7.3 is given in Table 7.4. Table 7.4: Performance of K-means and Fuzzy C means clustering algorithm using GLCM and different Wavelet features. Features K-means (in %) Fuzzy C means (in %) Accuracy Sensitivity Specificity Accuracy Sensitivity Specificity GLCM 90.63 88.00 96.88 96.00 Wavelet (db4) Wavelet (db8) Wavelet (db16) Wavelet (Haar) Wavelet(Bio) 96.88 93.75 93.75 96.88 71.43 71.43 85.71 96.00 96.88 96.88 96.88 93.75 85.71 85.71 71.43 96.00 From the above table, we can make following conclusions on the basis of performance two clustering algorithms. i) K-means Clustering Using db8 wavelet features, we obtained 100% accuracy, specificity and sensitivity for clustering the mammogram images into two different clusters such as malignant and benign. Using db4 wavelet features, we obtained 96.88 % accuracy and at the same time 100% sensitivity and 96% specificity whereas in Biorthogonal filter, even though the accuracy is 96.88% but sensitivity and specificity are 85.71 % and 100% respectively. Compared to GLCM feature, better classification accuracy is obtained for wavelet transformation features. 100% sensitivity was obtained for GLCM, db4 and db8 wavelet features. 210 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification 100% specificity is obtained for db16, Haar and Biorthogonal wavelet features. The overall classification results obtained for K-means clustering using wavelet and GLCM features are good for classifying the mammogram images in the local dataset. ii) Fuzzy C means Clustering Using db16 wavelet features, we obtained 100% accuracy, sensitivity and specificity for classifying mammogram images. Same percentage of accuracy, sensitivity and specificity are obtained for GLCM and Wavelet db4 feature set. The accuracy, sensitivity and specificity obtained for these features are 96.88%, 100% and 96% respectively. 100% sensitivity is obtained for GLCM and Wavelet features ( db4 and db16). 100% specificity is obtained for db8, db16, Haar and Biorthogonal wavelet filters. Using Biorthogonal wavelet feature, we obtained less classification performance compared to all other feature sets. The overall performance of the K-means and Fuzzy C mean clustering algorithms for classifying the mammogram images are shown in Figure 7.1 and 7.2. Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 211

Classification performance in% 100 90 80 70 60 50 40 30 20 10 0 GL CM db4 db8 db16 Haar Bio rthog onal Accuracy S ensiti vity Specificity Figure 7.1: The performance (in %) of K-means clustering on different feature sets 100 90 C l a s s i f i c a t io n p e r f o r m a n c e i n % 80 70 60 50 40 30 20 Accuracy S ensiti vity Specificity 10 0 GL CM db4 db8 db16 Haar Bio rthogonal Figure: 7.2: The performance (in %) of Fuzzy C-means clustering on different feature sets 212 Wavelet and soft computing techniques in detection of Abnormalities in Medical Images

Unsupervised Learning Techniques For Mammogram Classification From the above tables and figures, it is clear that all the malignant images in the dataset are correctly identified and labelled by K-means cluster algorithm which uses Daubechies (db18) family and Daubechies (db18) for Fuzzy C means clustering. These two wavelet features identified and labelled all the 7 malignant images that are identified and labelled by the two Radiologists from the 32 mammogram images provided for them. All other feature sets are also produced better classification results for identifying the malignant images using both clustering methods. It is also found that both GLCM and Daubechies (db4) wavelet feature same percentage classification accuracy as well as the same percentage sensitivity and specificity for classifying all the malignant images using Fuzzy C means clustering algorithms. 7.6 Summary In this chapter we made a pilot study to classify the abnormal mammogram images in the dataset using clustering methods. Two important clustering algorithms such as K-means and Fuzzy C mean clustering algorithms are implemented using the GLCM and wavelet features. Two dataset, Mini-MIAS database and locally available mammogram images are used for the experiments. The WSTA method is used for enhancing the mammogram images. This is followed by a fully automated segmentation method proposed in this thesis is used for extracting the tumour or abnormal region of the mammogram images. From these ROIs, GLCM and wavelet features are extracted for both dataset. Using the standard dataset cluster centres are computed for benign and malignant category of images in the set. Then the local mammogram images are labelled into benign and malignant category based on the cluster center computed using the standard mammogram images in Mini-MIAS. The classification results obtained are in agreement with that given by two Radiologists... Wavelet and soft computing techniques in detection of Abnormalities in Medical Images 213