International Journal of Pharma and Bio Sciences: HYBRID CLUSTERING ALGORITHM USING POSSIBILISTIC ROUGH C-MEANS

Int J Pharm Bio Sci 2015 Oct; 6(4): (B) 799-810

Research Article, Biotechnology

ISSN 0975-6299

*ANURADHA J, ANVESHA SINHA AND TRIPATHY B K
School of Computing Science and Engineering, VIT University, Vellore 632014, Tamil Nadu, India

ABSTRACT

A cluster is a collection of data objects that are similar to one another within the same cluster but dissimilar to objects in other clusters. A clustering mechanism should ensure high intra-class similarity and low inter-class similarity, which can be achieved by the c-means architecture. Although many clustering algorithms have evolved since Hard C-Means, Fuzzy C-Means (FCM) and Rough C-Means (RCM) are widely used in applications for their superiority in handling vague and uncertain data. The possibilistic approach to clustering can handle noisy data. In this paper we propose a hybrid clustering model, Possibilistic Rough C-Means, that can handle uncertainty even in the presence of noisy data. The theory of rough sets, which has emerged as one of the most efficient tools of the decade, generates a pair of sets, the lower and upper approximations of the objects, that together establish the cluster. Possibilistic theory applied to a cluster generates the typicality or possibility value of the objects that belong to the rough cluster. This model is expected to be more efficient and robust than each of the individual models. Experimental analysis reveals that the proposed algorithm streamlines the clustering process and yields more precise clusters.

KEYWORDS: Clustering, Fuzzy C-Means, Rough C-Means, Possibilistic C-Means.

*ANURADHA J, School of Computing Science and Engineering, VIT University, Vellore 632014, Tamil Nadu, India

This article can be downloaded from www.ijpbs.net

1 INTRODUCTION

Clustering is one of the most important techniques to have developed in the history of computing. It is the process of grouping data in such a way that the objects in a cluster are as similar as possible and those in different clusters are as dissimilar as possible. Figure 1.1 shows the formation of clusters with two groups, viz. cluster 1 and cluster 2. The quality of a clustering is determined by the nature of this partition: the degree of similarity within a cluster and the dissimilarity across clusters together determine the validity of the clusters. The degree of similarity is measured in terms of intra-cluster distance and the degree of dissimilarity is measured by inter-cluster distance. The C-means algorithm is one such clustering mechanism, which ensures high intra-class similarity but low inter-class similarity.

Figure 1.1 Basic clustering mechanism

There exist different approaches to the formation of clusters, broadly classified as divisive and agglomerative clustering. Among divisive clustering techniques, starting with Hard C-means, a number of algorithms have been suggested and reviewed in the literature, each trying to overcome the disadvantages of its predecessors. The fuzzy C-means, rough C-means and possibilistic C-means algorithms are more efficient, and enhanced versions such as dynamic, adaptable and hybrid models compete with each other in performance, each showing superiority in different dimensions by overcoming the disadvantages of the others. A collaborative approach has also been used to derive more efficient algorithms: two or more algorithms are combined to create an algorithm that benefits from more than one technique [14,15]. For example, rough-fuzzy collaborative clustering [13] (Mitra et al., 2006) is a collaboration between the rough and fuzzy C-means algorithms and seeks to overcome the disadvantages of both.
The latest in line has been the possibilistic fuzzy c-means algorithm [12], an amalgamation of the possibilistic C-means and fuzzy C-means algorithms. The above-mentioned algorithms are sensitive to noisy data and may form coincident clusters. In this direction we propose a Possibilistic Rough C-Means (PRCM) algorithm that overcomes the drawbacks of the previous algorithms. It applies the concept of typicality to the rough technique in the formation of clusters. The superiority of the proposed algorithm is measured using various cluster validity indices. The results are analyzed considering different parameter settings in cluster formation.

2 LITERATURE SURVEY

In the past few decades, clustering algorithms have evolved tremendously. However, the concept of high intra-cluster similarity and low inter-cluster similarity remains the same. Hard C-Means (HCM) [5], the foremost of the clustering algorithms, is of limited use in real-life situations because it uses crisp values, i.e. 0 or 1, to represent the belongingness of objects to a cluster. The cluster boundary is rigid and an object may belong to at most one cluster. This constraint pulls down the performance of HCM on real-life data sets, so HCM is not well suited to real-life applications.

The Fuzzy C-Means (FCM) algorithm introduced in [8] forms overlapping clusters based on the membership values of the objects in all the clusters. Many improvements over the basic FCM algorithm have made it more flexible and adaptable and have enhanced its performance [7,9]. FCM is widely used in image processing, pattern recognition, data mining and various applications of science and engineering. In this algorithm, the membership of each object to each of the clusters is calculated. The membership of an object $k$ in the $i$-th cluster is given by (2.1):

$$\mu_{ik} = \frac{1}{\sum_{j=1}^{c} \left( d_{ik} / d_{jk} \right)^{2/(m-1)}} \tag{2.1}$$

where $d_{ik}$ is the distance of object $k$ from the centroid of cluster $i$, $c$ is the number of clusters and $m > 1$ is the fuzzifier.

Clearly, the membership of an object in a cluster depends on its distance from that cluster as well as its distances from all the other clusters. These memberships therefore do not represent the belongingness of the object to a cluster; rather, they represent a degree of sharing. The memberships are relative quantities because they depend on the distance of the object to all the other clusters and therefore on the total number of clusters. FCM requires the number of clusters to be specified initially and forms exactly that many clusters even though some clusters may be identical. This leads to low cluster validity and the formation of coincident clusters. Another significant problem with FCM is the presence of noisy data [10]: it fails to identify outliers in the data, which can be explained with the help of the following figure.

Figure 2.1 Presence of noisy data

From Figure 2.1, it is quite obvious that A and B are outliers, separated by different distances from each cluster. However, FCM assigns a membership value of 0.5 to both A and B [10]. This is because, in calculating the membership of B, the distances from both clusters are taken instead of the smaller of the two. FCM therefore has a probabilistic constraint, i.e. the sum of the memberships of an object to all the clusters must be 1:

$$\sum_{i=1}^{c} \mu_{ik} = 1 \tag{2.2}$$

To overcome the drawbacks of FCM, the probabilistic constraint was removed by introducing the concept of typicality or possibility. The only constraint in this approach is that the typicality values must lie in the interval [0, 1]. Typicality values are absolute and denote degrees of belongingness of an object to a cluster. This algorithm is called Possibilistic C-Means (PCM). It has been referred to as a mode-seeking algorithm because, even if the number of clusters is unknown, good clusters (dense regions) can be formed with proper estimation of the parameters. This ensures that the clusters formed are valid and not identical.

Another clustering algorithm that has gained attention recently is Rough C-Means (RCM), introduced by Lingras and West (2004). RCM is popular for its capability in handling uncertain objects, and it does not require any prior knowledge about the data. It has been presented as a collaborative clustering technique [13] along with FCM, jointly referred to as RFCM [15]. Rough sets are used extensively in data mining. Rough C-Means uses the concept of rough sets to address the indiscernibility relation between objects. It classifies the data into two distinct groups for each cluster, namely the lower approximation and the boundary region. Thus, objects belonging completely to a cluster can be separated from those that are ambiguous. The boundaries of clusters overlap, which ensures that the clustering does not become crisp, so further improvement in cluster formation is possible.

This technique can be diversified further by combining Rough C-means (RCM) and possibilistic C-means (PCM). The theory of rough sets, which has emerged as one of the most important tools of the decade, can be applied to refine the C-means concept so that the lower approximation and boundary region of the object set are determined and consequently establish the cluster. Possibilistic C-means is also a C-means method, in which a number of possibilities are generated and then evaluated through the concept of typicality to determine the clusters. We attempted to create a hybrid of these two prominent models, termed Possibilistic Rough C-means (PRCM), such that it incorporates the best of both models.
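The contrast drawn in this section between relative FCM memberships and absolute typicality values can be illustrated with a minimal sketch. It assumes two fixed centroids, Euclidean distance, a fuzzifier m = 2 and a typicality of the form 1/(1 + d²/γ) with a hand-picked scale γ. The function names, point coordinates and the value of γ are hypothetical and serve only as an illustration, not as the authors' implementation.

```python
import math

def fcm_memberships(point, centroids, m=2.0):
    """Relative FCM memberships of one point to every centroid, as in (2.1)."""
    dists = [math.dist(point, v) for v in centroids]
    # A point lying exactly on one or more centroids is shared among them.
    if any(d == 0.0 for d in dists):
        hits = dists.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]

def typicalities(point, centroids, gamma=4.0):
    """Absolute typicality of one point to every centroid.

    gamma is a hypothetical scale parameter (assumption of this sketch).
    """
    return [1.0 / (1.0 + math.dist(point, v) ** 2 / gamma) for v in centroids]

# Two hypothetical cluster centroids and two outliers that are both
# equidistant from the centroids, one near and one far away.
centroids = [(0.0, 0.0), (4.0, 0.0)]
near_outlier = (2.0, 1.0)
far_outlier = (2.0, 10.0)

for name, p in (("near", near_outlier), ("far", far_outlier)):
    print(name, "FCM:", fcm_memberships(p, centroids),
          "typicality:", typicalities(p, centroids))
```

Both outliers receive the same FCM memberships because only distance ratios matter, whereas the typicality of the far outlier is much smaller than that of the near one.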
This model is expected to be more efficient than each of the individual models. It will also streamline the clustering process and yield more precise clusters.

3 THE PROPOSED ALGORITHM

The Possibilistic Rough C-Means (PRCM) algorithm is a combination of PCM and RCM. It applies the concept of typicality to rough sets for the formation of new clusters. Basically, we focus on clustering objects on the basis of typicality; each object is then assigned to either the lower approximation or the boundary region of a cluster. This is done to ensure that all the objects that are atypical (decided on the basis of a threshold value) have chances to be assigned to any of the clusters rather than to both. The PRCM algorithm is given below.

PRCM Algorithm

1. Assign the initial centroids. These are chosen randomly from the dataset.
2. Calculate the distances of all the other data points from the centroids.
3. Assign each data object $x_k$ to the lower approximation $\underline{A}X_i$ or to the boundary regions $\overline{A}X_i - \underline{A}X_i$, $\overline{A}X_j - \underline{A}X_j$ of the cluster pair $X_i$ and $X_j$ by computing the difference in its distances $d_{ik}$ and $d_{jk}$ from the cluster centroid pair $v_i$ and $v_j$, respectively, as follows.
4. Let $d_{ik}$ be the first minimum and $d_{jk}$ be the next minimum. If $|d_{ik} - d_{jk}|$ is less than $\theta$ (threshold value), then $x_k \in \overline{A}X_i - \underline{A}X_i$ and $x_k \in \overline{A}X_j - \underline{A}X_j$, and $x_k$ cannot be a member of the lower approximation of any cluster; else $x_k \in \underline{A}X_i$.
5. For each data point belonging to the cluster, a typicality value is calculated by

$$t_{ik} = \frac{1}{1 + d_{ik}^{2} / \gamma_i} \tag{3.1}$$

where $\gamma_i$ is calculated by

$$\gamma_i = \frac{\sum_{k} d_{ik}}{|\overline{A}X_i - \underline{A}X_i|} \quad \text{(for the objects in the boundary region)} \tag{3.2}$$

$$\gamma_i = \frac{\sum_{k} d_{ik}}{|\underline{A}X_i|} \quad \text{(for the objects in the lower approximation)} \tag{3.3}$$

6. Compute the new centre for each cluster $X_i$, applying (3.4), (3.5) and (3.6):

$$v_i = \frac{\sum_{x_k \in \overline{A}X_i - \underline{A}X_i} x_k \, t_{ik}}{|\overline{A}X_i - \underline{A}X_i|}, \quad \text{if } \overline{A}X_i - \underline{A}X_i \neq \emptyset \text{ and } \underline{A}X_i = \emptyset \tag{3.4}$$

$$v_i = \frac{\sum_{x_k \in \underline{A}X_i} x_k}{|\underline{A}X_i|}, \quad \text{if } \overline{A}X_i - \underline{A}X_i = \emptyset \text{ and } \underline{A}X_i \neq \emptyset \tag{3.5}$$

$$v_i = w_{low} \frac{\sum_{x_k \in \underline{A}X_i} x_k}{|\underline{A}X_i|} + w_{up} \frac{\sum_{x_k \in \overline{A}X_i - \underline{A}X_i} x_k \, t_{ik}}{|\overline{A}X_i - \underline{A}X_i|}, \quad \text{if } \overline{A}X_i - \underline{A}X_i \neq \emptyset \text{ and } \underline{A}X_i \neq \emptyset \tag{3.6}$$

7. Repeat Steps 2 to 6 until there is no change in centroids.

The objective function for the proposed algorithm is given by (3.7):

$$J_{PRCM} = \sum_{k=1}^{n} \sum_{l=1}^{n} \sum_{i=1}^{c} \Big( \tau_{ik} \, \mu_{il} \, (w_{low} d_{ik} + w_{up} d_{il}) + (\tau_{ik} \tau_{il} + \mu_{ik} \mu_{il}) \, D_{kl} \Big) \tag{3.7}$$

where $\tau_{ik}$ is the typical belongingness of an object to the lower approximation $\underline{A}X_i$ and $\mu_{ik}$ is the typical belongingness of an object to the boundary $\overline{A}X_i - \underline{A}X_i$. The typicality of an object to a cluster helps to identify its proper cluster, and it works well even in the presence of noisy data. The experimental analysis of the proposed PRCM is conducted on various data sets with different cluster validity measures.

4 TESTING AND VALIDATION

The validity of the clusters formed by PRCM is measured against various metrics and tested on the data sets given in Table 4.5.1.

4.1 DB INDEX

The Davies-Bouldin (DB) index and the Dunn (D) index given by Bezdek [8] are used for validating the formation of the clusters. DB is the ratio of the sum of within-cluster distances to the between-cluster distance. This factor should be low because the objects within a cluster should be close to each other and far from those belonging to other clusters. It is calculated using (4.1.1).

$$DB_{pr} = \frac{1}{c} \sum_{i=1}^{c} \max_{j \neq i} \left\{ \frac{S_{pr}(X_i) + S_{pr}(X_j)}{d(X_i, X_j)} \right\} \tag{4.1.1}$$

$$d(X_i, X_j) = \frac{\sum_{i,j} \| x_i - x_j \|}{|c_i| \, |c_j|} \tag{4.1.2}$$

$$S_{pr}(X_i) = \begin{cases} \dfrac{\sum_{x_k \in (\overline{B}X_i - \underline{B}X_i)} \| x_k - v_i \|^2 \, t_{ik}}{|\overline{B}X_i - \underline{B}X_i|}, & \text{if } \overline{B}X_i - \underline{B}X_i \neq \emptyset \text{ and } \underline{B}X_i = \emptyset \\[2ex] \dfrac{\sum_{x_k \in \underline{B}X_i} \| x_k - v_i \|^2 \, t_{ik}}{|\underline{B}X_i|}, & \text{if } \overline{B}X_i - \underline{B}X_i = \emptyset \text{ and } \underline{B}X_i \neq \emptyset \\[2ex] w_{low} \dfrac{\sum_{x_k \in \underline{B}X_i} \| x_k - v_i \|^2 \, t_{ik}}{|\underline{B}X_i|} + w_{up} \dfrac{\sum_{x_k \in (\overline{B}X_i - \underline{B}X_i)} \| x_k - v_i \|^2 \, t_{ik}}{|\overline{B}X_i - \underline{B}X_i|}, & \text{otherwise} \end{cases} \tag{4.1.3}$$

where $d(X_i, X_j)$ is the distance between the clusters $X_i$ and $X_j$, calculated using (4.1.2); $S_{pr}(X_i)$ is the average distance between objects within cluster $X_i$, given by (4.1.3); $t_{ik}$ is the typicality value given by (3.1); and $v_i$ is the centroid of cluster $X_i$. Here $\underline{B}X_i$ and $\overline{B}X_i$ denote the lower and upper approximations of cluster $X_i$. The parameters $w_{low}$ and $w_{up}$ can be adjusted based on the importance of the lower approximation and the boundary, subject to the constraint $w_{low} + w_{up} = 1$. These parameters are not taken into consideration when either the lower approximation or the boundary region is empty [13].

4.2 D INDEX

The D index is an indication of the distance between the clusters, i.e. how compact and separated the clusters are from each other. This factor should be high. D is given by (4.2.1):

$$D_{pr} = \min_{i} \left\{ \min_{j \neq i} \left\{ \frac{d(X_i, X_j)}{\max_{k} S_{pr}(X_k)} \right\} \right\} \tag{4.2.1}$$

where $S_{pr}(X_k)$ is given by equation (4.1.3).

4.3 QUANTITATIVE ANALYSIS

Various quantitative measures are used to evaluate the performance of the rough-fuzzy algorithm [15]. We present below the definitions of these indices, after introducing some notation. Let $\underline{A}(X_i)$ and $\overline{A}(X_i)$ be the lower and upper approximations of cluster $X_i$, and let $B(X_i) = \overline{A}(X_i) - \underline{A}(X_i)$ denote the boundary region of cluster $X_i$. The parameters $\omega$ and $\tilde{\omega}$ represent $w_{low}$ and $w_{up}$, the weights corresponding to the relative importance of the lower approximation and the boundary region. Also, $\tilde{\omega} = 1 - \omega$.

Definition 4.3.1 (α index): It is given by the expression

$$\alpha = \frac{1}{c} \sum_{i=1}^{c} \frac{\omega A_i}{\omega A_i + \tilde{\omega} B_i} \tag{4.3.1}$$

where

$$A_i = \sum_{x_j \in \underline{A}(X_i)} x_j = |\underline{A}(X_i)| \quad \text{and} \quad B_i = \sum_{x_j \in B(X_i)} x_j \tag{4.3.2}$$

The α index represents the average accuracy of the $c$ clusters. It is the average of the ratio of the number of objects in the lower approximation to that in the upper approximation of each cluster. In effect, it captures the average degree of completeness of knowledge about all clusters. A good clustering procedure should make all objects as similar to their centroids as possible. The index increases with increasing similarity within a cluster. Therefore, for a given data set and value of $c$, the higher the similarity of objects within the clusters, the higher the index value. The value of α also increases with the value of $c$. In the extreme case when the number of clusters is maximum, i.e. $c = n$, the total number of objects in the data set, the value of α is 1. Likewise α = 1 when $\underline{A}(X_i) = \overline{A}(X_i)$, that is, when all the clusters $\{X_i\}$ are exact or definable, whereas α = 0 if $\overline{A}(X_i) = B(X_i)$. Thus, $0 \leq \alpha \leq 1$.

Definition 4.3.2 (ρ index): It represents the average roughness of the $c$ clusters and is obtained by subtracting the average accuracy α from 1.

$$\rho = 1 - \alpha = 1 - \frac{1}{c} \sum_{i=1}^{c} \frac{\omega A_i}{\omega A_i + \tilde{\omega} B_i} \tag{4.3.3}$$

Definition 4.3.3 (α* index): This measure represents the accuracy of approximation.

$$\alpha^{*} = \frac{\sum_{i=1}^{c} \omega A_i}{\sum_{i=1}^{c} \{ \omega A_i + \tilde{\omega} B_i \}} \tag{4.3.4}$$

Definition 4.3.4 (γ index): It represents the approximation of the clustering algorithm, obtained as the ratio of the total number of objects in the lower approximations to the cardinality of the universe of objects.

$$\gamma = \frac{\sum_{i=1}^{c} |\underline{A}(X_i)|}{|U|} \tag{4.3.5}$$

4.5 RESULTS AND DISCUSSION

For testing purposes we use eight data sets of varied dimensions, seven of them from the UCI repository. The data sets and their dimensions are given in Table 4.5.1. The data sets have been chosen from a wide range of domains such as medicine, astrology, biotechnology, organisms etc., so that the experimentation may be done across data from different fields and to check whether our results hold true across domains. The attributes of a data set refer to its dimensions. For example, the LIVER data set has 5 attributes, based on which class labels are recorded. The number of instances refers to the total number of entries in the data set.
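Before turning to the experiments, the PRCM steps of Section 3 can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses Euclidean distance, compares the two smallest distances with the threshold θ, reads γ as the mean distance of a region's objects to the current centroid, takes fixed initial centroids instead of random ones so that the run is reproducible, and adds an iteration cap and a tolerance for convergence. The function names `assign`, `region_mean`, `update` and `prcm`, and the guards for empty regions and γ = 0, are hypothetical.

```python
import math

def assign(points, centroids, theta):
    """Steps 2-4: place each object in a lower approximation or in two boundaries."""
    c = len(centroids)
    lower = [[] for _ in range(c)]
    boundary = [[] for _ in range(c)]
    for x in points:
        d = [math.dist(x, v) for v in centroids]
        i, j = sorted(range(c), key=d.__getitem__)[:2]  # nearest, next nearest
        if abs(d[i] - d[j]) < theta:   # ambiguous: boundary of both clusters
            boundary[i].append(x)
            boundary[j].append(x)
        else:                          # unambiguous: lower approximation
            lower[i].append(x)
    return lower, boundary

def region_mean(region, v, weighted):
    """Region sum divided by region size; boundary objects carry typicality (3.1)."""
    dists = [math.dist(x, v) for x in region]
    gamma = (sum(dists) / len(region)) or 1.0  # guard for gamma = 0 (assumption)
    t = [1.0 / (1.0 + dk ** 2 / gamma) if weighted else 1.0 for dk in dists]
    return tuple(sum(tk * x[a] for tk, x in zip(t, region)) / len(region)
                 for a in range(len(v)))

def update(lower, boundary, centroids, w_low, w_up):
    """Step 6: new centroids following the three cases (3.4)-(3.6)."""
    new = []
    for v, lo, bd in zip(centroids, lower, boundary):
        if lo and bd:
            a, b = region_mean(lo, v, False), region_mean(bd, v, True)
            new.append(tuple(w_low * p + w_up * q for p, q in zip(a, b)))
        elif bd:
            new.append(region_mean(bd, v, True))
        elif lo:
            new.append(region_mean(lo, v, False))
        else:
            new.append(v)  # empty cluster keeps its centroid (assumption)
    return new

def prcm(points, centroids, theta=0.3, w_low=0.9, w_up=0.1,
         max_iter=100, tol=1e-9):
    """Step 7: iterate until the centroids stop moving (within tol)."""
    centroids = [tuple(v) for v in centroids]
    for _ in range(max_iter):
        lower, boundary = assign(points, centroids, theta)
        new = update(lower, boundary, centroids, w_low, w_up)
        moved = max(math.dist(a, b) for a, b in zip(new, centroids))
        centroids = new
        if moved < tol:
            break
    return centroids, lower, boundary

# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, lower, boundary = prcm(points, [(0, 0), (11, 11)])
print(centroids, [len(r) for r in lower], [len(r) for r in boundary])
```

The default values of θ, w_low and w_up are the ones reported for the experiments. On well-separated data every object falls in a lower approximation, so the update reduces to the plain mean of (3.5).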

Table 4.5.1 Datasets considered for experimental analysis

| Data Set | Instances | Attributes |
|---|---|---|
| ADHD | | 20 |
| Liver | 583 | 5 |
| Iris | 150 | 4 |
| Wine | 178 | 13 |
| Cancer | 32 | 57 |
| Libra Movements | 360 | 91 |
| E coli | 336 | 7 |
| Abalone | 4177 | 8 |

The ADHD data set concerns Attention Deficit Hyperactivity Disorder and was collected by us, under the supervision of a pediatric doctor, from the schools in our locality. When executed on the ADHD data set, PRCM generated good clusters, whose performance is tabulated in Table 4.5.2. The table compares the clusters formed by PRCM and RCM, measured with the various indices explained in Section 4. From the table, it is clear that PRCM shows superiority over RCM. The DB value is reduced from 0.744 to 0.6898. Similarly, the D value is increased from 1.52 to 1.59. The other indices α, ρ, α* and γ show only minor differences between the two algorithms, as is clear from Figure 4.5.1.

Table 4.5.2 Comparison of PRCM and RCM on ADHD data set

| Index | PRCM | RCM |
|---|---|---|
| DB | 0.68985 | 0.7440504 |
| D | 1.593892 | 1.5200309 |
| α | 0.435678904 | 0.44345808 |
| ρ | 0.56432096 | 0.5565492 |
| α* | 0.59590275 | 0.59686976 |
| γ | 0.23762377 | 0.24752475 |

Figure 4.5.1 Comparison of PRCM and RCM on ADHD data set

Further, to study the superiority of the proposed algorithm, it is tested on seven real data sets from the UCI repository. Clusters formed using Rough C-means (RCM) and Possibilistic Rough C-Means (PRCM) are compared. The cluster formation procedure is iterated until there is no more change in the cluster objects. The performance of clustering by both algorithms is compared against the various measures described in Section 4. The results are summarized in Tables 4.5.3, 4.5.4 and 4.5.5.

The algorithm uses the parameters $w_{low}$, $w_{up}$ and $\theta$, which are initialized at the beginning of the algorithm. These parameters play a major role in cluster formation: $w_{low}$ and $w_{up}$ represent the importance or weightage given to the lower approximation and the boundary region respectively, and $\theta$ determines the degree of similarity between the objects falling in the same cluster, i.e. whether an object lies in the lower approximation or in the boundary region. The parameter values should be fixed with utmost care as they highly influence the cluster formation. They are fixed based on the literature and by trial. According to the literature, $w_{up} = 1 - w_{low}$, with $w_{low}$ between 0.5 and 1 and $\theta$ between 0 and 0.5 [13]. For our experiments we have taken $w_{low}$ as 0.9, $w_{up}$ as 0.1 and $\theta$ as 0.3.

Table 4.5.3 Comparison of PRCM and RCM based on number of objects and iterations

| Data Set | # Lower approx (RCM) | # Lower approx (PRCM) | # Boundary (RCM) | # Boundary (PRCM) | # Iterations (RCM) | # Iterations (PRCM) |
|---|---|---|---|---|---|---|
| Liver | 304 | 32 | 556 | 540 | 8 | 3 |
| Iris | 39 | 35 | 20 | 28 | 6 | 26 |
| Wine | 70 | 59 | 4 | 36 | 3 | 2 |
| Cancer | 0 | 32 | 64 | 0 | 5 | 3 |
| Libra Movements | 0 | 360 | 720 | 0 | 6 | 32 |
| E coli | 290 | 279 | 92 | 4 | 22 | 36 |
| Abalone | 3743 | 373 | 866 | 890 | 4 | 69 |

Table 4.5.4 Comparison of the two algorithms based on DB index and D index

| Dataset | DB Index (RCM) | DB Index (PRCM) | D Index (RCM) | D Index (PRCM) |
|---|---|---|---|---|
| Liver | 0.4388 | 0.232295 | 2.862 | 4.48895 |
| Iris | 0.479248 | 0.082232 | 0.40243 | 2.54093 |
| Wine | 0.62923 | 0.448224 | 2.65363 | 2.400465 |
| Cancer | 1.8272 | 2.43243 | 0.64753 | 0.602659 |
| Libra Movements | 4.683 | 1.299986 | 0.054484 | 0.685768 |
| E coli | 0.3595 | 0.97267 | 2.0633 | 4.2478 |
| Abalone | 0.7704 | 0.8933 | 3.65404 | 6.003452 |

From Table 4.5.3, one can see that PRCM takes more iterations to converge. In some cases PRCM has more objects in the lower approximation than RCM, and vice versa. The DB and D values on the seven data sets are summarized in Table 4.5.4. From this table it is evident that PRCM performs better than RCM, minimizing the DB value while maximizing the D value.
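The way the DB and D indices of Table 4.5.4 reward compact, well-separated clusters can be seen in a simplified sketch of (4.1.1) and (4.2.1). It assumes crisp clusters (every object in a lower approximation, so the weights drop out), typicality fixed to 1, within-cluster spread taken as the mean squared distance to the centroid, and between-cluster distance taken as the mean pairwise distance between the objects of two clusters. The function names and the toy partitions are hypothetical.

```python
import math
from itertools import product

def centroid(cluster):
    """Component-wise mean of the objects in a cluster."""
    n = len(cluster)
    return tuple(sum(x[a] for x in cluster) / n for a in range(len(cluster[0])))

def spread(cluster):
    """Within-cluster spread S(X_i): mean squared distance to the centroid."""
    v = centroid(cluster)
    return sum(math.dist(x, v) ** 2 for x in cluster) / len(cluster)

def separation(a, b):
    """Between-cluster distance d(X_i, X_j): mean pairwise object distance."""
    return sum(math.dist(x, y) for x, y in product(a, b)) / (len(a) * len(b))

def db_index(clusters):
    """Average over clusters of the worst (largest) spread-to-separation ratio."""
    s = [spread(x) for x in clusters]
    c = len(clusters)
    worst = [max((s[i] + s[j]) / separation(clusters[i], clusters[j])
                 for j in range(c) if j != i) for i in range(c)]
    return sum(worst) / c

def d_index(clusters):
    """Smallest separation divided by the largest spread (undefined if spread is 0)."""
    max_spread = max(spread(x) for x in clusters)
    c = len(clusters)
    return min(separation(clusters[i], clusters[j]) / max_spread
               for i in range(c) for j in range(c) if j != i)

# Hypothetical partitions: tight and far apart versus loose and close together.
compact = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
loose = [[(0, 0), (0, 4)], [(3, 0), (3, 4)]]
print("compact:", db_index(compact), d_index(compact))
print("loose:  ", db_index(loose), d_index(loose))
```

The compact partition yields the lower DB value and the higher D value, which is the direction in which the two indices are read in the comparison of RCM and PRCM.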
Table 4.5.5 Comparison of PRCM and RCM based on α, ρ, α* and γ

| Data Set | Alpha Index α (RCM) | Alpha Index α (PRCM) | Rho Index ρ (RCM) | Rho Index ρ (PRCM) | Alpha* Index α* (RCM) | Alpha* Index α* (PRCM) | Gamma Index γ (RCM) | Gamma Index γ (PRCM) |
|---|---|---|---|---|---|---|---|---|
| Liver | 0.752823 | 0.79332 | 0.24777 | 0.208668 | 0.8306 | 0.865343 | 0.522337 | 0.536082 |
| Iris | 0.76572 | 0.936256 | 0.234828 | 0.063744 | 0.984264 | 0.98375 | 0.932886 | 0.92752 |
| Wine | 0.987987 | 0.967337 | 0.0203 | 0.032663 | 0.990933 | 0.978623 | 0.960452 | 0.898305 |
| E coli | 0.95282 | 0.944534 | 0.04788 | 0.055466 | 0.96595 | 0.959932 | 0.863095 | 0.830357 |
| Abalone | 0.97625 | 0.982947 | 0.023875 | 0.07053 | 0.974937 | 0.982335 | 0.89632 | 0.893439 |

Partial rows: Cancer 0.28E-08, 0, 0; Libra Movements 0.87E-08, 0, 0.
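The four indices reported in Table 4.5.5 depend only on the sizes of the lower approximations and boundary regions, so Definitions 4.3.1 to 4.3.4 can be sketched directly from counts. The sketch below assumes that $A_i$ and $B_i$ are the object counts of the lower approximation and the boundary region of cluster $i$, that the boundary weight is one minus the lower weight, and that every cluster contains at least one object. The function name and the example counts are hypothetical and are not taken from the tables.

```python
def rough_indices(lower_sizes, boundary_sizes, n_objects, w_low=0.9):
    """Return (alpha, rho, alpha_star, gamma) from per-cluster object counts.

    lower_sizes[i]    -- number of objects in the lower approximation of cluster i
    boundary_sizes[i] -- number of objects in the boundary region of cluster i
    n_objects         -- cardinality of the universe of objects
    """
    w_up = 1.0 - w_low
    c = len(lower_sizes)
    num = [w_low * a for a in lower_sizes]
    den = [w_low * a + w_up * b for a, b in zip(lower_sizes, boundary_sizes)]
    alpha = sum(n / d for n, d in zip(num, den)) / c   # (4.3.1)
    rho = 1.0 - alpha                                  # (4.3.3)
    alpha_star = sum(num) / sum(den)                   # (4.3.4)
    gamma = sum(lower_sizes) / n_objects               # (4.3.5)
    return alpha, rho, alpha_star, gamma

# Hypothetical two-cluster result: 10 ambiguous objects shared by both boundaries.
print(rough_indices([40, 45], [10, 10], 95))

# Extreme cases from Definition 4.3.1: exact clusters and boundary-only clusters.
print(rough_indices([50, 45], [0, 0], 95))
print(rough_indices([0, 0], [95, 95], 95))
```

With empty boundaries the sketch returns α = 1 and ρ = 0, and with empty lower approximations it returns α = 0, matching the bounds stated for the α index.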

On studying Table 4.5.4, we note that the DB indices obtained with the Rough C-means algorithm are fairly higher. For the Libra Movements and Cancer data sets, the DB index values are considerably higher than those of their Possibilistic Rough C-means counterparts. All of the DB indices for the PRCM algorithm are comfortably lower than those for the RCM algorithm, indicating that PRCM forms clusters having lower within-cluster distance. Hence PRCM performed better. When the D indices of Table 4.5.4 are compared, those for the Rough C-means algorithm are considerably lower than those for Possibilistic Rough C-means, again confirming that the Possibilistic Rough C-means algorithm has fared better. Although the number of iterations required to cluster is higher for the hybrid algorithm, this factor can be ignored when compared with the gains on the other performance metrics.

Figure 4.5.2 Comparison of DB index

Figure 4.5.3 Comparison of D index

From Table 4.5.5, we note that the α and α* indices for PRCM are higher for some data sets and lower for others compared to RCM, indicating that PRCM forms clusters that are good qualitatively rather than quantitatively. Since the ρ index is again a quantitative measurement of clusters, it is higher for some data sets and lower for others when the PRCM and RCM results are compared. Similarly, the γ index is the ratio of the number of objects in the lower approximation to the cardinality of the universe. Since the number of objects in the lower approximation for PRCM is lower in some cases and higher in others compared to RCM, it gives mixed results.

Furthermore, we varied the values of $w_{low}$ and $w_{up}$ for both algorithms and observed their respective performance on the DB and D indices. Graphs are generated for both algorithms on these indices with $w_{low}$ values of 0.9, 0.8 and 0.5 for the data sets given in Table 4.5.1. Figure 4.5.4 shows the comparison on the DB index and Figure 4.5.5 shows the comparison on the D index.

Figure 4.5.4 Comparison of DB index for $w_{low}$ = 0.9, 0.8 and 0.5

On careful examination of Figure 4.5.4, we conclude that the DB of PRCM is lower than that of RCM irrespective of the $w_{low}$ value. Similarly, we compare the D values of RCM and PRCM for each data set in Figure 4.5.5. Here again, the value of D is higher for PRCM for all $w_{low}$ values. It should be noted that the value of $w_{up}$ changes with $w_{low}$ and need not be considered separately.

Figure 4.5.5 Comparison of D index for $w_{low}$ = 0.9, 0.8 and 0.5

From Figures 4.5.4 and 4.5.5, it is very clear that the performance of PRCM is superior to that of RCM in both the DB and D indices. The performance of PRCM is better when the $w_{low}$ and $w_{up}$ values are 0.9 and 0.1.

4.6 CONCLUSION

In this work, we proposed a clustering algorithm called Possibilistic Rough C-Means (PRCM) that is capable of forming good clusters even in the presence of noisy data. The typicality or possibility value resolves the dilemma in placing noisy data in a cluster. After analyzing the results and comparisons, we conclude that our proposed PRCM algorithm forms more cohesive clusters: the DB index for all the data sets is lower than that of Rough C-Means. PRCM also separates the clusters from each other more efficiently than the conventional RCM algorithm: the D index for PRCM is higher than that for RCM for all the data sets, indicating that the quality of the clusters formed by PRCM is better than that of RCM. Also, the performance of the algorithm remains the same even for varied values of $w_{low}$ and $w_{up}$.

REFERENCES

1. Jain, A. K. and Dubes, R. C., Algorithms for Clustering Data, Englewood Cliffs, N.J.: Prentice Hall, 1988.
2. Duda, R. O., Hart, P. E. and Stork, D. G., Pattern Classification and Scene Analysis, John Wiley & Sons, New York, 1999.
3. Jain, A. K., Murthy, M. N. and Flynn, P. J., Data Clustering: A Review, ACM Computing Surveys, 31(3), pp. 264-323, 1999.
4. Huntsberger, T. and Ajjimarangsee, P., Parallel self-organizing feature maps for unsupervised pattern recognition, International Journal of General Systems, vol. 16, pp. 357-372, 1990.
5. MacQueen, J., Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of Fifth Berkeley Symp. Math. Statistics and Probability, 281-297, 1967.
6. Lingras, P. and West, C., Interval Set Clustering of Web Users with Rough K-Means, Journal of Intelligent Information Systems, 23(1), pp. 5-16, 2004.
7. Gath, I. and Geva, A., Unsupervised optimal fuzzy clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, pp. 773-781, 1989.
8. Bezdek, J. C., Pattern Recognition with Fuzzy Objective Function Algorithms, Kluwer Academic Publishers, Norwell, MA, USA, 1981.
9. Bezdek, J. C. and Pal, S. K. (Eds.), Fuzzy Models for Pattern Recognition, New York: IEEE Press, 1992.
10. Krishnapuram, R. and Keller, J. M., A Possibilistic Approach to Clustering, IEEE Transactions on Fuzzy Systems, vol. 1(2), pp. 98-110, 1993.
11. Krishnapuram, R. and Keller, J. M., Correspondence - The Possibilistic C-Means Algorithm: Insights and Recommendations, IEEE Transactions on Fuzzy Systems, vol. 4(3), pp. 385-393, 1996.
12. Pal, N. R., Pal, K., Keller, J. M. and Bezdek, J. C., A Possibilistic Fuzzy c-means Clustering Algorithm, IEEE Transactions on Fuzzy Systems, vol. 13(4), pp. 517-530, 2005.
13. Mitra, S., Banka, H. and Pedrycz, W., Rough-Fuzzy Collaborative Clustering, IEEE Transactions on Systems, Man and Cybernetics - Part B: Cybernetics, Vol. 36, No. 4, pp. 795-805, 2006.
14. Anuradha J, Sinha, A. and Ramesh, R., Clustering Based on Possibilistic Fuzzy C-Means for Web Caching, Proceedings of Conference on Computing Cybernetics and Intelligent Information Systems (CCIIS), 2014.
15. Maji, P. and Pal, S. K., RFCM: A Hybrid Clustering Algorithm Using Rough and Fuzzy Sets, Fundamenta Informaticae, 80, pp. 475-496, 2007.
16. Endo, Y. and Kinoshita, N., On Objective-Based Rough C-Means Clustering, GrC 2012, IEEE Conference on Granular Computing, pp. 1-6, 2012.