A ROUGH SET APPROACH FOR CUSTOMER SEGMENTATION

Size: px
Start display at page:

Download "A ROUGH SET APPROACH FOR CUSTOMER SEGMENTATION"

Transcription

1 A ROUGH SET APPROACH FOR CUSTOMER SEGMENTATION Prabha Dhadayudam * ad Ilago Krishamurthi Departmet of CSE, Sri Krisha College of Egieerig ad Techology, Coimbatore, Idia * prabhadhadayudam@gmail.com ik@skcet.ac.i ABSTRACT Customer segmetatio is a process that divides a busiess s total customers ito groups accordig to their diversity of purchasig behavior ad characteristics. The data miig clusterig techique ca be used to accomplish this customer segmetatio. This techique clusters the customers i such a way that the customers i oe group behave similarly whe compared to the customers i other groups. The customer related data are categorical i ature. However, the clusterig algorithms for categorical data are few ad are uable to hadle ucertaity. Rough set theory (RST) is a mathematical approach that hadles ucertaity ad is capable of discoverig kowledge from a database. This paper proposes a ew clusterig techique called MADO (Miimum Average Dissimilarity betwee Obects) for categorical data based o elemets of RST. The proposed algorithm is compared with other RST based clusterig algorithms, such as MMR (Mi-Mi Roughess), MMeR (Mi Mea Roughess), SDR (Stadard Deviatio Roughess), SSDR (Stadard deviatio of Stadard Deviatio Roughess), ad MADE (Maximal Attributes DEpedecy). The results show that for the real customer data cosidered, the MADO algorithm achieves clusters with higher cohesio, lower couplig, ad less computatioal complexity whe compared to the above metioed algorithms. The proposed algorithm has also bee tested o a sythetic data set to prove that it is also suitable for high dimesioal data. Keywords: Customer Relatioship Maagemet, Customer segmetatio, Clusterig, Categorical data, Rough set theory 1 INTRODUCTION Customer Relatioship Maagemet (CRM) is a busiess methodology used to build a relatioship with log term profitable customers by aalyzig customer eeds ad behaviors. It is a importat techology i every busiess because busiess is customer cetric. CRM helps busiess leaders gai isight ito customer behavior ad life time value to icrease profit by actig accordig to the customer characteristics. Customer segmetatio plays a importat role i CRM. It divides customers ito groups accordig to their purchasig behavior, allowig busiess leaders to desig ad establish differet strategies for each group of customers ad thus maximize their value (Lig & Ye, 2001). Rececy (R), Frequecy (F), ad Moetary (M) are the attributes chose for describig the purchasig customer characteristics. The RFM model works very effectively i customer segmetatio (Wu & Li, 2005). R idicates the time iterval betwee the preset ad previous trasactio dates of a customer. F idicates the umber of trasactios that the customer has made i a particular iterval of time. M idicates the total value of the customer s trasactio (Cheg & Che, 2009). I this paper, the modified RFM model, called RFMP, is itroduced. This model cosiders the customers paymet details. P idicates the average time iterval betwee paymet ad purchase date. The customers paymet details are a importat attribute because ay two customers with the same R, F, M values but a differet P value caot be treated equally by the eterprise. The RFMP model esures that the customer segmetatio is doe obectively. The values for R, F, M, ad P attributes are cotiuous. These cotiuous values are ormalized to categorical values as beig very low, low, middle, high, ad very high for effective aalysis. The data available for customer segmetatio is ow categorical data. The data miig clusterig techique is widely used to accomplish customer segmetatio (Ngai, Xiu, & Chau, 2009). The clusterig techique is used to segmet customers i such a way that the customers i oe group behave similarly whe compared to the customers i other groups based o their trasactio details. The traditioal approaches for clusterig, such as partitioig ad hierarchical algorithms, deal with umerical data whose iheret geometric properties ca be exploited to aturally defie distace fuctios betwee data poits. Distace fuctios, such as Mahatta, Euclidea, ad Mikowski, are used for allocatig a data poit to the appropriate clusterig. However, customer data available for clusterig is categorical so the above procedures are ot feasible. The computatio of similarity or dissimilarity is essetial for categorical data (Che, Chuag, & Che 2008). 1

2 Simple matchig, co-occurrece, probabilistic ad distace hierarchy are the approaches for computig similarity or dissimilarity measures (Cao, Liag, Li, Bai, & Dag, 2011). Simple matchig is a commo approach i which the compariso of two idetical categorical values yields either zero or oe. k-modes, fuzzy k-modes, ad k-prototype algorithms are based o simple matchig. These algorithms produce clusters with weak itrasimilarity ad have stability problems (Cao, Liag, Li, Bai, & Dag, 2011). ROCK (Robust clusterig usig liks) ad CACTUS (Clusterig categorical data usig summaries) are algorithms based o the co-occurrece approach. ROCK uses the cocept of a lik to measure the similarity betwee categorical patters. Here lik is defied as the umber of commo eighbors betwee two patters (Guha, Rastogi, & Shim, 2000). CACTUS calculates the frequecy of two values appearig i the patters together. It fids clusters i subsets of all attributes ad thus performs subspace clusterig. It geeralizes the cluster defiitio of umerical data so that it is suitable for categorical data (Gati, Gehrke, & Ramakrisha, 1999). The limitatios of ROCK are that it is sesitive to threshold value ad sometimes does ot produce the required umber of clusters (Parmar, Wu, & Blackhurst, 2007). Probabilistic approaches use coditioal probability estimatio to defie relatios betwee clusters. COBWEB, AUTOCLASS, DECA, ad COOLCAT are based o probabilistic models ad require a log traiig time. COBWEB uses the category utility fuctio, ad AUTOCLASS uses a Bayesia method to derive probable class distributio. DECA is a discrete valued clusterig algorithm, ad COOLCAT is a etropy based algorithm. Distace hierarchy associates each lik with a weight ad requires the domai experts to icorporate kowledge (Cao, Liag, Li, Bai, & Dag, 2011). Rough set theory (RST) by Pawlak (1982) received a great deal of attetio for dealig with categorical data i clusterig algorithms due to its stable results ad o requiremet of domai expertise. It uses global data properties to establish similarity betwee the obects (Bea & Kambhampati, 2008). The existig clusterig algorithms (Parmar et al., 2007; Mazlack, He, Zhu, & Coppock, 2000; Kumar & Tripathy, 2009; Tripathy & Ghosh, 2011a; Tripathy & Ghosh, 2011b; Herawa, Deris, & Abaway, 2010; Herawa, Ghazali, Yato, & Deris, 2010) based o RST utilize the correlatio of attributes. If the attributes are depedet, the the clusterig algorithms based o the correlatio of attributes produce correct results. The attributes R, F, M, ad P chose for describig the purchasig characteristics of customers are idepedet so that there is o proportioal relatioship betwee the attributes. This cocept led to the proposal of MADO (Miimum Average Dissimilarity betwee Obects), a clusterig algorithm based o RST. The MADO algorithm calculates the dissimilarity betwee obects without cosiderig the depedecy betwee attributes. The rest of the paper is orgaized as follows: Sectio 2 discusses the basic cocepts of rough set theory. Sectio 3 summarizes rough set theory based clusterig algorithms. Sectio 4 explais the MADO clusterig algorithm. Sectio 5 compares the clusterig results obtaied for real customer data ad sythetic data. Fially, Sectio 6 provides cocludig remarks. 2 BACKGROUND Rough set theory (RST) by Pawlak classifies imprecise, ucertai, or icomplete iformatio or kowledge expressed by data acquired from experiece (Pawlak, 1982). It gaied importace i the areas of machie learig, kowledge acquisitio, decisio aalysis, kowledge discovery from databases, expert systems, decisio support systems, iductive reasoig, ad patter recogitio (Pawlak, 1992). It is suitable for processig qualitative iformatio that is difficult to aalyze by stadard statistical techiques. It maages vague ad ucertai data or problems related to iformatio systems (Shyg, Wag, Tzeg, & Wu, 2007). The iformatio system is a 4-tuple (quadruple) S = (U, A, V, f), where U ad A are a o-empty fiite sets of obects ad attributes, respectively, ad V is the set cotaiig the domai of each attribute, where V a deotes the domai of attribute a that belogs to A. The fuctio f: U A V is a total fuctio such that f(u, a) V a, for every (u, a) U A ad is called the iformatio fuctio. RST is maily based o the idisceribility relatio, equivalece class, lower approximatio, ad upper approximatio (Pawlak & Skowro, 2007). The idisceribility relatio (Id(B)) is a relatio o U. Give two obects, x i, x U, they are idiscerible by the set of attributes B i A if ad oly if a(x i ) = a (x ) for every a B. That is, (x i, x ) Id (B) if ad oly if a B where B A ad a(x i ) = a (x ). The equivalece class ([x i ] Id(B) ) is the set of obects x i havig the same values for the set of attributes i B. This is also kow as a elemetary set with respect to B. The lower approximatio (X LB ) is the uio of all the elemetary sets with respect to B that are cotaied i X. The upper approximatio (X UB ) is the 2

3 uio of all the elemetary sets with respect to B that have a o-empty itersectio with X. Eqs. (1) ad (2) calculate the lower ad upper approximatio, respectively. XLB { xi [ xi] Id(B) X} (1) XUB { xi [ xi] Id(B) X } (2) The lower approximatio cosists of all obects that defiitely belog to the cocept while the upper approximatio cotais all obects that possibly belog to the cocept. The differece betwee the upper ad the lower approximatio costitutes the boudary regio of the vague cocept. Approximatios are two basic operatios i rough set theory; thus it expresses vagueess ot by meas of membership but by employig a boudary regio of a set. If the boudary regio of a set is empty, the set is crisp. Otherwise the set is rough (iexact) (Pawlak & Skowro, 2007). The ratio of the cardiality of the lower approximatio ad the cardiality of the upper approximatio is defied as the accuracy of estimatio, a measure of roughess (Pawlak, 1992). May clusterig algorithms based o RST have bee developed, ad they are overviewed i the ext sectio. 3 RELATED WORK Mazlack et al. (2000) proposed two techiques based o rough set theory to select clusterig attributes. These are bi-clusterig (BC) ad total roughess (TR) techiques. BC is applicable oly for bi-valued attributes. TR hadles multi-valued attributes ad is based o the total average of the mea roughess of a attribute with respect to the set of all attributes i a iformatio system. The attribute with the higher TR is chose as the clusterig attribute. However, for partitioig, the method starts with biary valued attributes ad uses the total roughess criterio oly for multi-valued attributes. Therefore, partitioig is doe o a multi-valued attribute oly whe all the biary valued attributes have already bee partitioed. This reduces the efficiecy of the algorithm because the partitioig is doe o a biary valued attribute eve whe the total roughess value for the multi-valued attribute is high. The roughess, mea roughess, ad total roughess i TR are calculated usig Eqs. (3), (4), ad (5), respectively. XLB R(X) (3) XUB Rough(i) = R(X i ) i 1 (4) Total roughess(i) = m Rough(i) i 1 m (5) X idicates the set of obects i the data set; X i idicates the set of obects i the sub-partitio of attribute I; idicates the umber of sub-partitios of attribute I, ad m idicates the umber of attributes. I Eq. (3), X LB ad X UB are calculated usig Eqs. (1) ad (2), respectively. I Eq. (4), Rough(i) is the mea roughess of all sub-partitios of attribute i. I order to choose the partitioig attribute i, the Total roughess(i) towards all the attributes is calculated usig Eq. (5). The attribute with the highest total roughess is chose as the clusterig attribute. However, for partitioig, the method starts with biary valued attributes ad uses the total roughess criterio oly for multi-valued attributes. This creates a disadvatage due to the fact that the partitioig is doe o a biary attribute eve though the total roughess for a multi-valued attribute is higher. Parmar et al. (2007) proposed a ew techique called mi-mi roughess (MMR) for multi-valued attributes. The MMR techique is based o the miimum value of mea roughess ad does ot require total roughess for calculatig as i TR. The roughess ad mea roughess i MMR is calculated usig Eqs. (6) ad (7), respectively. Give a i, a A (set of attributes), V( ) ai is the set of values of attributes a i, ad the a i a. X is a subset of obects havig oe specific value for the attribute a i,, that is X (a i = ). ( X ( a )) a i is the lower approximatio of L 3

4 X with respect to {a }, ad ( Xa ( ai )) is the upper approximatio of X with respect to {a }. Thus, R (X) U a is defied as the roughess of X with respect to {a }, which is give by Eq. (6). Also, the mea roughess o attribute a i with respect to {a } is defied i Eq. (7). ( Xa ( ai )) L Ra (X ai ) 1 ( Xa ( ai )) U Rough a ( a i ) R a ( X ai 1)... R a ( X ai V( i ) (7) V( ai ) The maximum value of roughess is oe because the umber of obects i the lower approximatio is less tha or equal to the umber of obects i the upper approximatio. The mi-roughess (MR) of each attribute refers to the miimum of the mea roughess with respect to a sigle attribute. The mi-mi-roughess (MMR) is defied as the miimum MR of attributes. The MMR value determies the splittig attribute. The ode with the maximum umber of elemets is chose for further splittig. From the experimetal aalysis i Parmar et al. (2007), MMR achieves better results whe compared to BC, TR, fuzzy set based algorithms, ad other dissimilarity approaches. Kumar ad Tripathy (2009) proposed MMeR by makig a slight alteratio i MMR. I their algorithm, the mea-roughess (MeR) of each attribute is calculated as the mea of the mea roughess. The mi-mea-roughess (MMeR) is defied as the miimum of MeR of attributes. The MMeR value determies the splittig attribute. The ode that has the maximum average distace betwee its elemets is chose for further splittig. Tripathy ad Gosh (2011a) proposed a algorithm for clusterig categorical data where the stadard-deviatio-roughess (SDR) of each attribute is calculated as the stadard deviatio of the mea roughess. The attribute with the miimum SDR is chose as the splittig attribute. Tripathy ad Gosh (2011b) also proposed a algorithm called SSDR. Here the stadard deviatio of SDR (SSDR) is calculated, ad the splittig attribute is chose based o this value. The ode chose for further splittig i the SDR ad SSDR algorithm is the same as that of the MMeR algorithm. SSDR achieves better results tha SDR for small data sets while for large data sets it achieves the same results as SDR. The roughess used i MMeR, SDR, ad SSDR is calculated usig the same formula as i MMR. All the algorithms TR, MMR, MMeR, SDR, ad SSDR i the above examples (Parmar et al., 2007; Mazlack et al., 2000; Kumar & Tripathy, 2009; Tripathy & Ghosh, 2011a; Tripathy & Ghosh, 2011b) have bee tested o data sets available i the UCI machie learig repository. The purity of clusters was used as a measure to test the quality of the clusters. The purity ratios of MMR, MMeR, SDR, ad SSDR are i icreasig order. All these clusterig algorithms are based o the roughess of a attribute with other attributes. Herawa et al. (2010a) proposed a ew techique called maximal depedecy attributes (MDA) for selectig clusterig attributes. Based o MDA, Herawa et al. (2010b) proposed MADE for categorical data clusterig. MDA selects the clusterig attribute by determiig the depedecies betwee attributes. The attribute with the maximum degree of depedecy is selected as the partitioig attribute. 4 PROPOSED ALGORITHM The calculatio of roughess ad determiatio of depedecies betwee attributes i real customer data are ot appropriate due to the dyamic behavior of the customers. The MADO clusterig algorithm overcomes this disadvatage by utilizig the equivalece class property of rough set theory. The splittig attribute is chose based o the dissimilarity betwee the obects i the same equivalece class. Let A idicate the set cotaiig the attributes i the data ad B be aother set cotaiig the obects whose value for a particular attribute is same. Equivalece class is calculated for each attribute with a specific value. I each calculatio, the set B cotais the obects of that equivalece class. The dissimilarity betwee the obects withi the set B is calculated usig Eq. (8). The average dissimilarity betwee the obects i a equivalece class or set B is calculated usig Eq. (9). a ) (6) 4

5 V( xi, x) dissim( xi, x) 1 (8) Here x i, x belog to the same equivalece class; is the umber of attributes i A; ad V( xi, x ) attributes havig same value for the obects x i, x. m 1 m dissim( xi, x ) i 1 i 1 avgdissim(b) m(m 1) / 2 is the umber of (9) Here B idicates a equivalece class; m is the umber of obects i set B, ad x i, x belogs to set B. The equivalece class B of a attribute a i havig value v is represeted as [a i ]. Set B, which has the miimum average dissimilarity ad has at least two obects, determies the splittig attribute a i o its value v for the paret ode. Iitially, the paret ode cotais all the obects. After partitio, the leaf ode havig more obects is selected as the paret ode for further partitioig i subsequet iteratios. The algorithm termiates whe it reaches a pre-defied umber of clusters. The procedure for the MADO clusterig algorithm is give i Figure 1. Procedure (U,k) Lie 1 Begi Lie 2 Set curret umber of cluster CNC = 1 Lie 3 Set k as required umber of clusters Lie 4 Set ParetNode = U Lie 5 Do Lie 6 For each a i from A (i = 1 to, where is the umber of attributes i A) Lie 7 For =1 to l, where l is the umber of differet values i a i Lie 8 I the ParetNode, determie family of equivalece classes for a i with value which is deoted as set B Lie 9 Calculate avgdissim (B) Lie 10 Next Lie 11 Next Lie 12 Set Mi-Avg-Dissim = Mi (avgdissim (B)) for each B >1 Lie 13 Determie splittig attribute a i o its value v (a i ) correspodig to the Mi-Avg-Dissim Lie 14 CNC = CNC + 1 Lie 15 ParetNode = ProcParetNode (CNC) Lie 16 While (CNC<k) Lie 17 Ed Lie 18 ProcParetNode (CNC) Lie 19 Begi Lie 20 For i = 1 to CNC Lie 21 Size (i) = Cout (Set of Elemets i Cluster i) Lie 22 Next Lie 23 Determie Max (Size (i)) Lie 24 Retur (Set of Elemets i cluster i) correspodig to Max (Size (i)) Lie 25 Ed Figure 1. MADO algorithm 5 EXPERIMENTAL RESULTS I this sectio, real data sets of customer trasactio details are used for clusterig or segmetig the customers. Customer trasactio details for a period of six moths have bee collected from four differet eterprises. Data set1 cosists of records; data set2 cosists of records; data set3 cosists of records; ad data set4 cosists of records. For each trasactio, party id, date of purchase, amout of purchase, ad paymet of purchase are used to defie R, F, M, ad P values. The distict party id represets the idividual customer. For each distict party id, R is calculated as the iterval betwee the time that the latest cosumig behavior occurred ad the preset; F is calculated as the umber of trasactio records; M is calculated as the total purchase amout; ad P is calculated as the average time iterval (i terms of days) betwee the paymet date ad purchase date for 5

6 each trasactio i the data set. The data set ow has oly four attributes, amely R, F, M, ad P, for each customer. Dataset1 has 5062 customers; data set2 has 1420 customers; data set3 has 3811 customers; ad data set4 has 2675 customers. The values of R, F, M, ad P are ormalized as give below: For ormalizig R or P: 1) Sort the data set i descedig order of R or P. 2) Divide the data set ito five equal parts with 20% of the records i each. 3) Assig categorical values as very low, low, middle, high, ad very high to the first, secod, third, fourth, ad fifth parts of the records, respectively. For ormalizig F or M: 1) Sort the data set i ascedig order of F or M 2) Divide the data set ito five equal parts with 20% of the records i each. 3) Assig categorical values as very low, low, middle, high, ad very high to first, secod, third, fourth, ad fifth part of the records, respectively. The ormalized data set is ow used by the MADO clusterig algorithm to segmet the customers ito various groups. MMR, MMeR, SDR, SSDR, ad MADE algorithms are applied to the same four data sets so that the customers are divided ito various groups. Criteria cohesio ad couplig are used to measure the iteral quality of the cluster (Satos, Heuser, Moreira, & Wives, 2011). Cohesio expresses the average similarity betwee the elemets of a cluster. Couplig expresses the average similarity betwee all pairs of elemets, where oe elemet belogs to cluster C ad the other does ot. Ideally, cohesio should be high ad couplig should be low (Kuz & Black, 1995). The formulas for cohesio ad couplig for a cluster C are give by Eqs. (10) ad (11), respectively. The total cohesio ad total couplig of clusters are give by Eqs. (12) ad (13), respectively. sim( ci, c) i Cohesio(C) (10) m*(m 1) 2 sim( ci, q ) i, Couplig(C) (11) m* k Total cohesio Cohesio(C) (12) 1 C k Total couplig Couplig(C) (13) C 1 Here sim(c i,c ) is the similarity score betwee elemets c i ad c belogig to cluster C; sim(c i,q ) is the similarity betwee elemet c i from cluster C ad elemet q from aother cluster; m is the umber of elemets i C; is the umber of elemets outside C; ad k is the umber of clusters. The results of MMR, MMeR, SDR, SSDR, MADE, ad MADO algorithms are compared by varyig the umber of clusters to be produced from three to seve. I each case, the total couplig ad the total cohesio of the clusters are calculated usig Eqs. (12) ad (13). The results of four data sets for all the five cases produced by the clusterig algorithms are tabularized i Tables 1 through 8. The MADE algorithm produces the value zero for the degree of depedecy calculated for each attribute. This is because the lower approximatio of each attribute for each value is a ull set. Therefore this algorithm could ot be applied further to produce the clusterig result for the cosidered data set. The results of SDR ad SSDR are the same because for large data sets, SSDR achieves the same result as SDR (Tripathy & Ghosh, 2011b). Table 1. Total cohesio value for data set1 Total cohesio MADO MMR MMeR SDR & SSDR

7 Table 2. Total couplig value for data set1 Total couplig MADO MMR MMeR SDR & SSDR Table 3. Total cohesio value for data set2 Total cohesio MADO MMR MMeR SDR & SSDR Table 4. Total couplig value for data set2 Total couplig MADO MMR MMeR SDR & SSDR Table 5. Total cohesio value for data set3 Total cohesio MADO MMR MMeR SDR & SSDR Table 6. Total couplig value for data set3 Total couplig MADO MMR MMeR SDR & SSDR Table 7. Total cohesio value for data set4 Total cohesio MADO MMR MMeR SDR & SSDR

8 Table 8. Total couplig value for data set4 Total couplig MADO MMR MMeR SDR & SSDR Tables 1 through 8 illustrate that the clusters produced by the MADO algorithm have o average high cohesio ad low couplig values for all five cases with respect to the other algorithms. Because the clusterig algorithm performace depeds o producig clusters with high cohesio ad low couplig values, the proposed algorithm out performs the other rough set based clusterig algorithms for the cosidered four real customer data sets. The reaso behid this is that the proposed algorithm cosiders the dissimilarity betwee the obects i the same equivalece class whe choosig the splittig attribute istead of fidig the correlatio betwee attributes as the other rough set based clusterig algorithms do. The proposed clusterig algorithm is also applicable for high dimesioal data. It has bee tested for soyabea ad zoo data sets obtaied from the UCI Machie Learig Repository. The soybea data set cotais 47 obects with 35 categorical attributes. The zoo data set cotais 101 obects with 18 categorical attributes. The purity of clusters is used to test the quality of the clusters if the class label is already kow. I a real data set, because the class label is ot kow, cohesio ad couplig are used to test the quality of the clusters. I the sythetic data set obtaied from the repository, the class label is available i the data set ad so purity is used as a measure to test the quality of the clusters. The purity of a cluster i is defied by Eq. (14). The overall purity is defied by Eq. (15). i Purity(i) = r (14) r Overall purity = i 1 purity( i ) (15) Here, i r idicates the umbers of obects occurrig i cluster i ad its correspodig class r; r idicates the umber of obects i the class r; ad idicates the umber of clusters i the data set. A higher value of overall purity idicates a better clusterig result, with perfect clusterig yieldig a value of 1. I the zoo data set, each obect is classified as belogig to oe of the 7 classes. The MMR, MMeR, SDR, SSDR, ad proposed MADO clusterig algorithms are executed to obtai 7 classes. Purity for each cluster is calculated usig Eq. (14), ad the overall purity is calculated usig Eq. (15). The results obtaied from the MMR, SSDR, ad MADO algorithms are give i Tables 9, 10, ad 11, respectively. From Table 9, it is observed that out of 101 obects, 92 obects are classified correctly. From Table 10, it is observed that out of 101 obects, 79 obects are classified correctly. From Table 11, it is observed that out of 101 obects, 95 obects are classified correctly. Thus the overall purity of the clusters obtaied usig MMR, SSDR, ad the proposed algorithm is 0.91, , ad respectively. The overall purity of the clusters obtaied usig MMeR ad SDR is ad respectively (Tripathy & Ghosh, 2011b). Thus, the proposed clusterig algorithm produces good clusters for a high dimesioal sythetic data set. Table 9. MMR output for the zoo data set Cluster C1 C2 C3 C4 C5 C6 C7 Purity Number

9 Table 10. SSDR output for the zoo data set Cluster C1 C2 C3 C4 C5 C6 C7 Purity Number Table 11. MADO output for the zoo data set Cluster C1 C2 C3 C4 C5 C6 C7 Purity Number I the zoo data set, the MMR algorithm performs better whe compared to the MMeR, SDR, ad SSDR algorithms. Next, the MMR ad proposed clusterig algorithms are used for the soybea data set. I the soybea data set, each obect is classified as belogig to oe of the 4 diseases or 4 classes. The MMR ad proposed clusterig algorithms are executed to obtai 4 clusters. Purity for each cluster is calculated usig Eq. (14). The results obtaied from the MMR ad the proposed algorithm are give i Tables 12 ad 13, respectively. From Table 12, it is observed that out of 47 obects, 39 obects are classified correctly. From Table 13, it is observed that out of 47 obects, 46 obects are classified correctly. Thus the overall purity of the clusters obtaied usig the MMR ad the proposed algorithm is ad , respectively. Table 12. MMR output for soyabea data set Cluster Disease 1 Disease 2 Disease 3 Disease 4 Purity Number Table 13. MADO output for soybea data set Cluster Disease 1 Disease 2 Disease 3 Disease 4 Purity Number The results obtaied for the sythetic data set show that the proposed algorithm produces improved clusterig results i terms of purity whe compared to other clusterig algorithms. Therefore, the proposed clusterig algorithm ca cluster data i a better way irrespective of the umber of attributes cosidered. 9

10 6 CONCLUSION Customer segmetatio for CRM is achieved usig a data miig clusterig techique. This techique is tested i four various real data sets. Data related to customers are categorical i ature so the usual hierarchical ad partitioig algorithms are ot applicable. The importace of rough set theory for categorical clusterig is discussed, ad the algorithms based o this techique are studied. The proposed MADO algorithm cosiders the dissimilarity betwee obects i the same equivalece class. The cluster quality for a real data set is measured usig cohesio ad couplig. The experimetal results for the cosidered four customer data sets show that the MADO algorithm produces clusters with high cohesio ad low couplig values for all four cases with respect to other algorithms. The applicability of the MADO algorithm to high dimesioal data sets has bee prove by testig it with sythetic data sets. The experimetal results show that the MADO algorithm clusters high dimesioal data i a better way whe compared to other clusterig algorithms. I the future, the behavior or characteristics of customers i each segmet ca be aalyzed so that marketers ca employ differet strategies for each group. Furthermore, the life time value of a customer ca be icreased by adoptig suitable techiques for each customer segmet. 7 REFERENCES Bea, C. & Kambhampati, C. (2008) Autoomous clusterig usig rough set theory. Iteratioal Joural of Automatio ad Computig, 5(1), Cao, F., Liag, J., Li, D., Bai, L., & Dag, C. (2011) A dissimilarity measure for the k-modes clusterig algorithm. Kowledge Based Systems, 26, Che, H., Chuag, K., & Che, M. (2008) O data labelig for clusterig categorical data. IEEE Trasactios o Kowledge ad Data Egieerig, 20(11), Cheg, C. & Che, Y. (2009) Classifyig the segmetatio of customer value via RFM model ad RS theory. Expert Systems with Applicatios, 36(3), Gati, V., Gehrke, J., & Ramakrisha, R. (1999) CACTUS-Clusterig categorical data usig summaries. Proceedigs of 5th ACM SIGKDD Iteratioal coferece o Kowledge Discovery ad Data Miig (pp ). Sa Diego, CA, USA. Guha, S., Rastogi, R., & Shim, K. (2000) ROCK: A robust clusterig algorithm for categorical attributes. Iformatio Systems, 25(5), Herawa, T., Deris, M.M., & Abaway, J.H. (2010) A rough set approach for selectig clusterig attribute. Kowledge Based Systems, 23(3), Herawa, T., Ghazali, R., Yato, I.T.R, & Deris, M.M. (2010) Rough set approach for categorical data clusterig. Iteratioal Joural of Database Theory ad Applicatio, 3(1), Kumar, P. & Tripathy, B.K. (2009) MMeR: a algorithm for clusterig heterogeeous data usig rough set theory. Iteratioal Joural of Rapid Maufacturig, 1(2), Kuz, T. & Black, J.P. (1995) Usig automatic process clusterig for desig recovery ad distributed debuggig. IEEE Trasactios o Software Egieerig, 21(6), Lig, R. & Ye, D.C. (2001) Customer relatioship maagemet: A aalysis framework ad strategies. Joural of Computer Iformatio Systems, 41(3), implemetatio Mazlack, L.J., He, A., Zhu, Y., & Coppock, S. (2000) A rough set approach i choosig clusterig attributes. Proceedigs of the 13th iteratioal coferece o Computer Applicatios i Idustry ad Egieerig (pp. 1-6). Hoolulu, Hawaii, USA. Ngai, E.W.T., Xiu, L., & Chau, D.C.K. (2009) Applicatio of data miig techiques i customer relatioship maagemet: A literature review ad classificatio. Expert Systems with Applicatios, 36(2),

11 Data Sciece Joural, Volume 13, 9 April 2014 Parmar, D., Wu, T., & Blackhurst, J. (2007) MMR:a algorithm for clusterig categorical data usig rough set theory. Data ad Kowledge Egieerig, 63(3), Pawlak, Z. (1982) Rough set. Iteratioal Joural of Computer ad Iformatio Scieces, 11(5), Pawlak, Z. (1992) Rough sets: Theoretical aspects of reasoig about data, Norwell, MA, USA: Kluwer Academic Publishers. Pawlak, Z. & Skowro, A. (2007) Rudimets of rough sets. Iformatio Scieces, 177(1), Satos, J.B., Heuser, C.A., Moreira, V.P., & Wives, L.K. (2011) Automatic threshold estimatio for data matchig applicatios. Iformatio Scieces, 181(13), Shyg, J.Y., Wag, F.K., Tzeg, G.H., & Wu, K.S. (2007) Rough set theory i aalyzig the attributes of combiatio values for the isurace market. Expert Systems with Applicatios, 32(1), Tripathy, B.K. & Ghosh, A. (2011) SDR: A algorithm for clusterig categorical data usig rough set theory. Proceedigs of IEEE coferece o Recet Advaces i Itelliget Computatioal Systems (pp ). Trivadrum, Idia. Tripathy, B.K. & Ghosh, A. (2011) SSDR: A algorithm for clusterig categorical data usig rough set theory. Advaces i Applied Sciece Research, 2(3), Wu, J. & Li, Z. (2005) Research o Customer Segmetatio Model by Clusterig. Proceedigs of the ACM 7th iteratioal coferece o Electroic commerce (pp ). Xi-a, Chia. (Article history: Received 26 March 2013, Accepted 19 February 2014, Available olie 3 April 2014) 11

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le Fudametals of Media Processig Shi'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dih Le Today's topics Noparametric Methods Parze Widow k-nearest Neighbor Estimatio Clusterig Techiques k-meas Agglomerative Hierarchical

More information

Journal of Chemical and Pharmaceutical Research, 2013, 5(12): Research Article

Journal of Chemical and Pharmaceutical Research, 2013, 5(12): Research Article Available olie www.jocpr.com Joural of Chemical ad Pharmaceutical Research, 2013, 5(12):745-749 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 K-meas algorithm i the optimal iitial cetroids based

More information

Pruning and Summarizing the Discovered Time Series Association Rules from Mechanical Sensor Data Qing YANG1,a,*, Shao-Yu WANG1,b, Ting-Ting ZHANG2,c

Pruning and Summarizing the Discovered Time Series Association Rules from Mechanical Sensor Data Qing YANG1,a,*, Shao-Yu WANG1,b, Ting-Ting ZHANG2,c Advaces i Egieerig Research (AER), volume 131 3rd Aual Iteratioal Coferece o Electroics, Electrical Egieerig ad Iformatio Sciece (EEEIS 2017) Pruig ad Summarizig the Discovered Time Series Associatio Rules

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

New HSL Distance Based Colour Clustering Algorithm

New HSL Distance Based Colour Clustering Algorithm The 4th Midwest Artificial Itelligece ad Cogitive Scieces Coferece (MAICS 03 pp 85-9 New Albay Idiaa USA April 3-4 03 New HSL Distace Based Colour Clusterig Algorithm Vasile Patrascu Departemet of Iformatics

More information

Mining from Quantitative Data with Linguistic Minimum Supports and Confidences

Mining from Quantitative Data with Linguistic Minimum Supports and Confidences Miig from Quatitative Data with Liguistic Miimum Supports ad Cofideces Tzug-Pei Hog, Mig-Jer Chiag ad Shyue-Liag Wag Departmet of Electrical Egieerig Natioal Uiversity of Kaohsiug Kaohsiug, 8, Taiwa, R.O.C.

More information

3D Model Retrieval Method Based on Sample Prediction

3D Model Retrieval Method Based on Sample Prediction 20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer

More information

New Fuzzy Color Clustering Algorithm Based on hsl Similarity

New Fuzzy Color Clustering Algorithm Based on hsl Similarity IFSA-EUSFLAT 009 New Fuzzy Color Clusterig Algorithm Based o hsl Similarity Vasile Ptracu Departmet of Iformatics Techology Tarom Compay Bucharest Romaia Email: patrascu.v@gmail.com Abstract I this paper

More information

Cluster Analysis. Andrew Kusiak Intelligent Systems Laboratory

Cluster Analysis. Andrew Kusiak Intelligent Systems Laboratory Cluster Aalysis Adrew Kusiak Itelliget Systems Laboratory 2139 Seamas Ceter The Uiversity of Iowa Iowa City, Iowa 52242-1527 adrew-kusiak@uiowa.edu http://www.icae.uiowa.edu/~akusiak Two geeric modes of

More information

Our second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees.

Our second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees. Comp 135 Machie Learig Computer Sciece Tufts Uiversity Fall 2017 Roi Khardo Some of these slides were adapted from previous slides by Carla Brodley Our secod algorithm Let s look at a simple dataset for

More information

Bayesian Network Structure Learning from Attribute Uncertain Data

Bayesian Network Structure Learning from Attribute Uncertain Data Bayesia Network Structure Learig from Attribute Ucertai Data Wetig Sog 1,2, Jeffrey Xu Yu 3, Hog Cheg 3, Hogya Liu 4, Ju He 1,2,*, ad Xiaoyog Du 1,2 1 Key Labs of Data Egieerig ad Kowledge Egieerig, Miistry

More information

DATA MINING II - 1DL460

DATA MINING II - 1DL460 DATA MINING II - 1DL460 Sprig 2017 A secod course i data miig http://www.it.uu.se/edu/course/homepage/ifoutv2/vt17/ Kjell Orsbor Uppsala Database Laboratory Departmet of Iformatio Techology, Uppsala Uiversity,

More information

CHAPTER IV: GRAPH THEORY. Section 1: Introduction to Graphs

CHAPTER IV: GRAPH THEORY. Section 1: Introduction to Graphs CHAPTER IV: GRAPH THEORY Sectio : Itroductio to Graphs Sice this class is called Number-Theoretic ad Discrete Structures, it would be a crime to oly focus o umber theory regardless how woderful those topics

More information

Theory of Fuzzy Soft Matrix and its Multi Criteria in Decision Making Based on Three Basic t-norm Operators

Theory of Fuzzy Soft Matrix and its Multi Criteria in Decision Making Based on Three Basic t-norm Operators Theory of Fuzzy Soft Matrix ad its Multi Criteria i Decisio Makig Based o Three Basic t-norm Operators Md. Jalilul Islam Modal 1, Dr. Tapa Kumar Roy 2 Research Scholar, Dept. of Mathematics, BESUS, Howrah-711103,

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

Keywords Software Architecture, Object-oriented metrics, Reliability, Reusability, Coupling evaluator, Cohesion, efficiency

Keywords Software Architecture, Object-oriented metrics, Reliability, Reusability, Coupling evaluator, Cohesion, efficiency Volume 3, Issue 9, September 2013 ISSN: 2277 128X Iteratioal Joural of Advaced Research i Computer Sciece ad Software Egieerig Research Paper Available olie at: www.ijarcsse.com Couplig Evaluator to Ehace

More information

HADOOP: A NEW APPROACH FOR DOCUMENT CLUSTERING

HADOOP: A NEW APPROACH FOR DOCUMENT CLUSTERING Y.K. Patil* Iteratioal Joural of Advaced Research i ISSN: 2278-6244 IT ad Egieerig Impact Factor: 4.54 HADOOP: A NEW APPROACH FOR DOCUMENT CLUSTERING Prof. V.S. Nadedkar** Abstract: Documet clusterig is

More information

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised

More information

Analysis of Different Similarity Measure Functions and their Impacts on Shared Nearest Neighbor Clustering Approach

Analysis of Different Similarity Measure Functions and their Impacts on Shared Nearest Neighbor Clustering Approach Aalysis of Differet Similarity Measure Fuctios ad their Impacts o Shared Nearest Neighbor Clusterig Approach Ail Kumar Patidar School of IT, Rajiv Gadhi Techical Uiversity, Bhopal (M.P.), Idia Jitedra

More information

Criterion in selecting the clustering algorithm in Radial Basis Functional Link Nets

Criterion in selecting the clustering algorithm in Radial Basis Functional Link Nets WSEAS TRANSACTIONS o SYSTEMS Ag Sau Loog, Og Hog Choo, Low Heg Chi Criterio i selectig the clusterig algorithm i Radial Basis Fuctioal Lik Nets ANG SAU LOONG 1, ONG HONG CHOON 2 & LOW HENG CHIN 3 Departmet

More information

Bayesian approach to reliability modelling for a probability of failure on demand parameter

Bayesian approach to reliability modelling for a probability of failure on demand parameter Bayesia approach to reliability modellig for a probability of failure o demad parameter BÖRCSÖK J., SCHAEFER S. Departmet of Computer Architecture ad System Programmig Uiversity Kassel, Wilhelmshöher Allee

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

CSCI 5090/7090- Machine Learning. Spring Mehdi Allahyari Georgia Southern University

CSCI 5090/7090- Machine Learning. Spring Mehdi Allahyari Georgia Southern University CSCI 5090/7090- Machie Learig Sprig 018 Mehdi Allahyari Georgia Souther Uiversity Clusterig (slides borrowed from Tom Mitchell, Maria Floria Balca, Ali Borji, Ke Che) 1 Clusterig, Iformal Goals Goal: Automatically

More information

Solving Fuzzy Assignment Problem Using Fourier Elimination Method

Solving Fuzzy Assignment Problem Using Fourier Elimination Method Global Joural of Pure ad Applied Mathematics. ISSN 0973-768 Volume 3, Number 2 (207), pp. 453-462 Research Idia Publicatios http://www.ripublicatio.com Solvig Fuzzy Assigmet Problem Usig Fourier Elimiatio

More information

Dynamic Programming and Curve Fitting Based Road Boundary Detection

Dynamic Programming and Curve Fitting Based Road Boundary Detection Dyamic Programmig ad Curve Fittig Based Road Boudary Detectio SHYAM PRASAD ADHIKARI, HYONGSUK KIM, Divisio of Electroics ad Iformatio Egieerig Chobuk Natioal Uiversity 664-4 Ga Deokji-Dog Jeoju-City Jeobuk

More information

Performance Plus Software Parameter Definitions

Performance Plus Software Parameter Definitions Performace Plus+ Software Parameter Defiitios/ Performace Plus Software Parameter Defiitios Chapma Techical Note-TG-5 paramete.doc ev-0-03 Performace Plus+ Software Parameter Defiitios/2 Backgroud ad Defiitios

More information

Relational Interpretations of Neighborhood Operators and Rough Set Approximation Operators

Relational Interpretations of Neighborhood Operators and Rough Set Approximation Operators Relatioal Iterpretatios of Neighborhood Operators ad Rough Set Approximatio Operators Y.Y. Yao Departmet of Computer Sciece, Lakehead Uiversity, Thuder Bay, Otario, Caada P7B 5E1, E-mail: yyao@flash.lakeheadu.ca

More information

An Algorithm to Solve Multi-Objective Assignment. Problem Using Interactive Fuzzy. Goal Programming Approach

An Algorithm to Solve Multi-Objective Assignment. Problem Using Interactive Fuzzy. Goal Programming Approach It. J. Cotemp. Math. Scieces, Vol. 6, 0, o. 34, 65-66 A Algorm to Solve Multi-Objective Assigmet Problem Usig Iteractive Fuzzy Goal Programmig Approach P. K. De ad Bharti Yadav Departmet of Mathematics

More information

Study on effective detection method for specific data of large database LI Jin-feng

Study on effective detection method for specific data of large database LI Jin-feng Iteratioal Coferece o Automatio, Mechaical Cotrol ad Computatioal Egieerig (AMCCE 205) Study o effective detectio method for specific data of large database LI Ji-feg (Vocatioal College of DogYig, Shadog

More information

Clustering and Classifying Diabetic Data Sets Using K-Means Algorithm

Clustering and Classifying Diabetic Data Sets Using K-Means Algorithm Article ca be accessed olie at http://www.publishigidia.com Clusterig ad Classifyig Diabetic Data Sets Usig K-Meas Algorithm M. Kothaiayaki*, P. Thagaraj** Abstract The k-meas algorithm is well kow for

More information

Task scenarios Outline. Scenarios in Knowledge Extraction. Proposed Framework for Scenario to Design Diagram Transformation

Task scenarios Outline. Scenarios in Knowledge Extraction. Proposed Framework for Scenario to Design Diagram Transformation 6-0-0 Kowledge Trasformatio from Task Scearios to View-based Desig Diagrams Nima Dezhkam Kamra Sartipi {dezhka, sartipi}@mcmaster.ca Departmet of Computig ad Software McMaster Uiversity CANADA SEKE 08

More information

Optimization for framework design of new product introduction management system Ma Ying, Wu Hongcui

Optimization for framework design of new product introduction management system Ma Ying, Wu Hongcui 2d Iteratioal Coferece o Electrical, Computer Egieerig ad Electroics (ICECEE 2015) Optimizatio for framework desig of ew product itroductio maagemet system Ma Yig, Wu Hogcui Tiaji Electroic Iformatio Vocatioal

More information

arxiv: v2 [cs.ds] 24 Mar 2018

arxiv: v2 [cs.ds] 24 Mar 2018 Similar Elemets ad Metric Labelig o Complete Graphs arxiv:1803.08037v [cs.ds] 4 Mar 018 Pedro F. Felzeszwalb Brow Uiversity Providece, RI, USA pff@brow.edu March 8, 018 We cosider a problem that ivolves

More information

1.2 Binomial Coefficients and Subsets

1.2 Binomial Coefficients and Subsets 1.2. BINOMIAL COEFFICIENTS AND SUBSETS 13 1.2 Biomial Coefficiets ad Subsets 1.2-1 The loop below is part of a program to determie the umber of triagles formed by poits i the plae. for i =1 to for j =

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

Fuzzy Linear Regression Analysis

Fuzzy Linear Regression Analysis 12th IFAC Coferece o Programmable Devices ad Embedded Systems The Iteratioal Federatio of Automatic Cotrol September 25-27, 2013. Fuzzy Liear Regressio Aalysis Jaa Nowaková Miroslav Pokorý VŠB-Techical

More information

Fuzzy Rule Selection by Data Mining Criteria and Genetic Algorithms

Fuzzy Rule Selection by Data Mining Criteria and Genetic Algorithms Fuzzy Rule Selectio by Data Miig Criteria ad Geetic Algorithms Hisao Ishibuchi Dept. of Idustrial Egieerig Osaka Prefecture Uiversity 1-1 Gakue-cho, Sakai, Osaka 599-8531, JAPAN E-mail: hisaoi@ie.osakafu-u.ac.jp

More information

A Novel Feature Extraction Algorithm for Haar Local Binary Pattern Texture Based on Human Vision System

A Novel Feature Extraction Algorithm for Haar Local Binary Pattern Texture Based on Human Vision System A Novel Feature Extractio Algorithm for Haar Local Biary Patter Texture Based o Huma Visio System Liu Tao 1,* 1 Departmet of Electroic Egieerig Shaaxi Eergy Istitute Xiayag, Shaaxi, Chia Abstract The locality

More information

A New Bit Wise Technique for 3-Partitioning Algorithm

A New Bit Wise Technique for 3-Partitioning Algorithm Special Issue of Iteratioal Joural of Computer Applicatios (0975 8887) o Optimizatio ad O-chip Commuicatio, No.1. Feb.2012, ww.ijcaolie.org A New Bit Wise Techique for 3-Partitioig Algorithm Rajumar Jai

More information

Data Analysis. Concepts and Techniques. Chapter 2. Chapter 2: Getting to Know Your Data. Data Objects and Attribute Types

Data Analysis. Concepts and Techniques. Chapter 2. Chapter 2: Getting to Know Your Data. Data Objects and Attribute Types Data Aalysis Cocepts ad Techiques Chapter 2 1 Chapter 2: Gettig to Kow Your Data Data Objects ad Attribute Types Basic Statistical Descriptios of Data Data Visualizatio Measurig Data Similarity ad Dissimilarity

More information

Cubic Polynomial Curves with a Shape Parameter

Cubic Polynomial Curves with a Shape Parameter roceedigs of the th WSEAS Iteratioal Coferece o Robotics Cotrol ad Maufacturig Techology Hagzhou Chia April -8 00 (pp5-70) Cubic olyomial Curves with a Shape arameter MO GUOLIANG ZHAO YANAN Iformatio ad

More information

Descriptive Data Mining Modeling in Telecom Systems

Descriptive Data Mining Modeling in Telecom Systems Descriptive Data Miig Modelig i Telecom Systems Ivo Pejaović, Zora Sočir, Damir Medved 2 Faculty of Electrical Egieerig ad Computig, Uiversity of Zagreb Usa 3, HR-0000 Zagreb, Croatia Tel: +385 629 763;

More information

Assignment Problems with fuzzy costs using Ones Assignment Method

Assignment Problems with fuzzy costs using Ones Assignment Method IOSR Joural of Mathematics (IOSR-JM) e-issn: 8-8, p-issn: 9-6. Volume, Issue Ver. V (Sep. - Oct.06), PP 8-89 www.iosrjourals.org Assigmet Problems with fuzzy costs usig Oes Assigmet Method S.Vimala, S.Krisha

More information

Harris Corner Detection Algorithm at Sub-pixel Level and Its Application Yuanfeng Han a, Peijiang Chen b * and Tian Meng c

Harris Corner Detection Algorithm at Sub-pixel Level and Its Application Yuanfeng Han a, Peijiang Chen b * and Tian Meng c Iteratioal Coferece o Computatioal Sciece ad Egieerig (ICCSE 015) Harris Corer Detectio Algorithm at Sub-pixel Level ad Its Applicatio Yuafeg Ha a, Peijiag Che b * ad Tia Meg c School of Automobile, Liyi

More information

INTERSECTION CORDIAL LABELING OF GRAPHS

INTERSECTION CORDIAL LABELING OF GRAPHS INTERSECTION CORDIAL LABELING OF GRAPHS G Meea, K Nagaraja Departmet of Mathematics, PSR Egieerig College, Sivakasi- 66 4, Virudhuagar(Dist) Tamil Nadu, INDIA meeag9@yahoocoi Departmet of Mathematics,

More information

A Study on the Performance of Cholesky-Factorization using MPI

A Study on the Performance of Cholesky-Factorization using MPI A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio

More information

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig

More information

Service Oriented Enterprise Architecture and Service Oriented Enterprise

Service Oriented Enterprise Architecture and Service Oriented Enterprise Approved for Public Release Distributio Ulimited Case Number: 09-2786 The 23 rd Ope Group Eterprise Practitioers Coferece Service Orieted Eterprise ad Service Orieted Eterprise Ya Zhao, PhD Pricipal, MITRE

More information

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON Roberto Lopez ad Eugeio Oñate Iteratioal Ceter for Numerical Methods i Egieerig (CIMNE) Edificio C1, Gra Capitá s/, 08034 Barceloa, Spai ABSTRACT I this work

More information

Empirical Validate C&K Suite for Predict Fault-Proneness of Object-Oriented Classes Developed Using Fuzzy Logic.

Empirical Validate C&K Suite for Predict Fault-Proneness of Object-Oriented Classes Developed Using Fuzzy Logic. Empirical Validate C&K Suite for Predict Fault-Proeess of Object-Orieted Classes Developed Usig Fuzzy Logic. Mohammad Amro 1, Moataz Ahmed 1, Kaaa Faisal 2 1 Iformatio ad Computer Sciece Departmet, Kig

More information

Accuracy Improvement in Camera Calibration

Accuracy Improvement in Camera Calibration Accuracy Improvemet i Camera Calibratio FaJie L Qi Zag ad Reihard Klette CITR, Computer Sciece Departmet The Uiversity of Aucklad Tamaki Campus, Aucklad, New Zealad fli006, qza001@ec.aucklad.ac.z r.klette@aucklad.ac.z

More information

Fuzzy Minimal Solution of Dual Fully Fuzzy Matrix Equations

Fuzzy Minimal Solution of Dual Fully Fuzzy Matrix Equations Iteratioal Coferece o Applied Mathematics, Simulatio ad Modellig (AMSM 2016) Fuzzy Miimal Solutio of Dual Fully Fuzzy Matrix Equatios Dequa Shag1 ad Xiaobi Guo2,* 1 Sciece Courses eachig Departmet, Gasu

More information

Available online at ScienceDirect. Procedia CIRP 53 (2016 ) 21 28

Available online at   ScienceDirect. Procedia CIRP 53 (2016 ) 21 28 vailable olie at www.sciecedirect.com ScieceDirect Procedia CIRP 53 (206 ) 2 28 The 0th Iteratioal Coferece o xiomatic Desig, ICD 206 systematic approach to couplig disposal of product family desig (part

More information

Euclidean Distance Based Feature Selection for Fault Detection Prediction Model in Semiconductor Manufacturing Process

Euclidean Distance Based Feature Selection for Fault Detection Prediction Model in Semiconductor Manufacturing Process Vol.133 (Iformatio Techology ad Computer Sciece 016), pp.85-89 http://dx.doi.org/10.1457/astl.016. Euclidea Distace Based Feature Selectio for Fault Detectio Predictio Model i Semicoductor Maufacturig

More information

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0

Polynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0 Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity

More information

A Polynomial Interval Shortest-Route Algorithm for Acyclic Network

A Polynomial Interval Shortest-Route Algorithm for Acyclic Network A Polyomial Iterval Shortest-Route Algorithm for Acyclic Network Hossai M Akter Key words: Iterval; iterval shortest-route problem; iterval algorithm; ucertaity Abstract A method ad algorithm is preseted

More information

Intermediate Statistics

Intermediate Statistics Gait Learig Guides Itermediate Statistics Data processig & display, Cetral tedecy Author: Raghu M.D. STATISTICS DATA PROCESSING AND DISPLAY Statistics is the study of data or umerical facts of differet

More information

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a 4. [10] Usig a combiatorial argumet, prove that for 1: = 0 = Let A ad B be disjoit sets of cardiality each ad C = A B. How may subsets of C are there of cardiality. We are selectig elemets for such a subset

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Octahedral Graph Scaling

Octahedral Graph Scaling Octahedral Graph Scalig Peter Russell Jauary 1, 2015 Abstract There is presetly o strog iterpretatio for the otio of -vertex graph scalig. This paper presets a ew defiitio for the term i the cotext of

More information

On (K t e)-saturated Graphs

On (K t e)-saturated Graphs Noame mauscript No. (will be iserted by the editor O (K t e-saturated Graphs Jessica Fuller Roald J. Gould the date of receipt ad acceptace should be iserted later Abstract Give a graph H, we say a graph

More information

BASED ON ITERATIVE ERROR-CORRECTION

BASED ON ITERATIVE ERROR-CORRECTION A COHPARISO OF CRYPTAALYTIC PRICIPLES BASED O ITERATIVE ERROR-CORRECTIO Miodrag J. MihaljeviC ad Jova Dj. GoliC Istitute of Applied Mathematics ad Electroics. Belgrade School of Electrical Egieerig. Uiversity

More information

Performance Comparisons of PSO based Clustering

Performance Comparisons of PSO based Clustering Performace Comparisos of PSO based Clusterig Suresh Chadra Satapathy, 2 Guaidhi Pradha, 3 Sabyasachi Pattai, 4 JVR Murthy, 5 PVGD Prasad Reddy Ail Neeruoda Istitute of Techology ad Scieces, Sagivalas,Vishaapatam

More information

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve Advaces i Computer, Sigals ad Systems (2018) 2: 19-25 Clausius Scietific Press, Caada Aalysis of Server Resource Cosumptio of Meteorological Satellite Applicatio System Based o Cotour Curve Xiagag Zhao

More information

Masking based Segmentation of Diseased MRI Images

Masking based Segmentation of Diseased MRI Images Maskig based Segmetatio of Diseased MRI Images Aruava De Dept. of Electroics ad Commuicatio Egg. Dr. B.C. Roy Egg College Durgapur,Idia aruavade@yahoo.com Raib Locha Das Dept. Of Electroics ad Commuicatio

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

Stone Images Retrieval Based on Color Histogram

Stone Images Retrieval Based on Color Histogram Stoe Images Retrieval Based o Color Histogram Qiag Zhao, Jie Yag, Jigyi Yag, Hogxig Liu School of Iformatio Egieerig, Wuha Uiversity of Techology Wuha, Chia Abstract Stoe images color features are chose

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

Data diverse software fault tolerance techniques

Data diverse software fault tolerance techniques Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the

More information

Big-O Analysis. Asymptotics

Big-O Analysis. Asymptotics Big-O Aalysis 1 Defiitio: Suppose that f() ad g() are oegative fuctios of. The we say that f() is O(g()) provided that there are costats C > 0 ad N > 0 such that for all > N, f() Cg(). Big-O expresses

More information

What are Information Systems?

What are Information Systems? Iformatio Systems Cocepts What are Iformatio Systems? Roma Kotchakov Birkbeck, Uiversity of Lodo Based o Chapter 1 of Beett, McRobb ad Farmer: Object Orieted Systems Aalysis ad Desig Usig UML, (4th Editio),

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies

More information

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP

Introduction. Nature-Inspired Computing. Terminology. Problem Types. Constraint Satisfaction Problems - CSP. Free Optimization Problem - FOP Nature-Ispired Computig Hadlig Costraits Dr. Şima Uyar September 2006 Itroductio may practical problems are costraied ot all combiatios of variable values represet valid solutios feasible solutios ifeasible

More information

*Corresponding author. Keywords: Power quality, Assessment system, Harmonic evaluation, Comprehensive evaluation.

*Corresponding author. Keywords: Power quality, Assessment system, Harmonic evaluation, Comprehensive evaluation. 7 Iteratioal Coferece o Eergy, Power ad Evirometal Egieerig (ICEPEE 7) ISBN: 978--6595-456- Study of the Power Quality Comprehesive Evaluatio Method Zhi-mi ZHAN, Peg-fei CHAI, Bi LUO, Xig-bo LIU, Yua-li

More information

Rapid Frequent Pattern Growth and Possibilistic Fuzzy C-means Algorithms for Improving the User Profiling Personalized Web Page Recommendation System

Rapid Frequent Pattern Growth and Possibilistic Fuzzy C-means Algorithms for Improving the User Profiling Personalized Web Page Recommendation System Received: November 21, 2017 237 Rapid Frequet Patter Growth ad Possibilistic Fuzzy C-meas Algorithms for Improvig the User Profilig Persoalized Web Page Recommedatio System Sipra Sahoo 1 * Bikram Kesari

More information

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured

More information

Exponential intuitionistic fuzzy entropy measure based image edge detection.

Exponential intuitionistic fuzzy entropy measure based image edge detection. Expoetial ituitioistic fuzzy etropy measure based image edge detectio. Abdulmajeed Alsufyai 1, Hassa Badry M. El-Owy 2 1 Departmet of Computer Sciece, College of Computers ad Iformatio Techology, Taif

More information

Which movie we can suggest to Anne?

Which movie we can suggest to Anne? ECOLE CENTRALE SUPELEC MASTER DSBI DECISION MODELING TUTORIAL COLLABORATIVE FILTERING AS A MODEL OF GROUP DECISION-MAKING You kow that the low-tech way to get recommedatios for products, movies, or etertaiig

More information

A Method of Malicious Application Detection

A Method of Malicious Application Detection 5th Iteratioal Coferece o Educatio, Maagemet, Iformatio ad Medicie (EMIM 2015) A Method of Malicious Applicatio Detectio Xiao Cheg 1,a, Ya Hui Guo 2,b, Qi Li 3,c 1 Xiao Cheg, Beijig Uiv Posts & Telecommu,

More information

The Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana

The Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana The Closest Lie to a Data Set i the Plae David Gurey Southeaster Louisiaa Uiversity Hammod, Louisiaa ABSTRACT This paper looks at three differet measures of distace betwee a lie ad a data set i the plae:

More information

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein 068.670 Subliear Time Algorithms November, 0 Lecture 6 Lecturer: Roitt Rubifeld Scribes: Che Ziv, Eliav Buchik, Ophir Arie, Joatha Gradstei Lesso overview. Usig the oracle reductio framework for approximatig

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

Analysis of Documents Clustering Using Sampled Agglomerative Technique

Analysis of Documents Clustering Using Sampled Agglomerative Technique Aalysis of Documets Clusterig Usig Sampled Agglomerative Techique Omar H. Karam, Ahmed M. Hamad, ad Sheri M. Moussa Abstract I this paper a clusterig algorithm for documets is proposed that adapts a samplig-based

More information

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem A Improved Shuffled Frog-Leapig Algorithm for Kapsack Problem Zhoufag Li, Ya Zhou, ad Peg Cheg School of Iformatio Sciece ad Egieerig Hea Uiversity of Techology ZhegZhou, Chia lzhf1978@126.com Abstract.

More information

Intrusion Detection Method Using Protocol Classification and Rough Set Based Support Vector Machine

Intrusion Detection Method Using Protocol Classification and Rough Set Based Support Vector Machine Computer ad formatio Sciece trusio Detectio Method Usig Protocol Classificatio ad Rough Set Based Support Vector Machie Xuyi Re College of Computer Sciece, Najig Uiversity of Post & Telecommuicatios Najig

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

Mobile terminal 3D image reconstruction program development based on Android Lin Qinhua

Mobile terminal 3D image reconstruction program development based on Android Lin Qinhua Iteratioal Coferece o Automatio, Mechaical Cotrol ad Computatioal Egieerig (AMCCE 05) Mobile termial 3D image recostructio program developmet based o Adroid Li Qihua Sichua Iformatio Techology College

More information

EMPIRICAL ANALYSIS OF FAULT PREDICATION TECHNIQUES FOR IMPROVING SOFTWARE PROCESS CONTROL

EMPIRICAL ANALYSIS OF FAULT PREDICATION TECHNIQUES FOR IMPROVING SOFTWARE PROCESS CONTROL Iteratioal Joural of Iformatio Techology ad Kowledge Maagemet July-December 2012, Volume 5, No. 2, pp. 371-375 EMPIRICAL ANALYSIS OF FAULT PREDICATION TECHNIQUES FOR IMPROVING SOFTWARE PROCESS CONTROL

More information

Text Feature Selection based on Feature Dispersion Degree and Feature Concentration Degree

Text Feature Selection based on Feature Dispersion Degree and Feature Concentration Degree Available olie at www.ijpe-olie.com vol. 13, o. 7, November 017, pp. 1159-1164 DOI: 10.3940/ijpe.17.07.p19.11591164 Text Feature Selectio based o Feature Dispersio Degree ad Feature Cocetratio Degree Zhifeg

More information

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most

More information

An Estimation of Distribution Algorithm for solving the Knapsack problem

An Estimation of Distribution Algorithm for solving the Knapsack problem Vol.4,No.5, 214 Published olie: May 25, 214 DOI: 1.7321/jscse.v4.5.1 A Estimatio of Distributio Algorithm for solvig the Kapsack problem 1 Ricardo Pérez, 2 S. Jös, 3 Arturo Herádez, 4 Carlos A. Ochoa *1,

More information

Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

Decision Support Systems

Decision Support Systems Decisio Support Systems 50 (010) 93 10 Cotets lists available at ScieceDirect Decisio Support Systems joural homepage: www.elsevier.com/locate/dss The data complexity idex to costruct a efficiet cross-validatio

More information

Research Article A Self-Adaptive Fuzzy c-means Algorithm for Determining the Optimal Number of Clusters

Research Article A Self-Adaptive Fuzzy c-means Algorithm for Determining the Optimal Number of Clusters Computatioal Itelligece ad Neurosciece Volume 216, Article ID 2647389, 12 pages http://dx.doi.org/1.1155/216/2647389 Research Article A Self-Adaptive Fuzzy c-meas Algorithm for Determiig the Optimal Number

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms

More information

ISSN (Print) Research Article. *Corresponding author Nengfa Hu

ISSN (Print) Research Article. *Corresponding author Nengfa Hu Scholars Joural of Egieerig ad Techology (SJET) Sch. J. Eg. Tech., 2016; 4(5):249-253 Scholars Academic ad Scietific Publisher (A Iteratioal Publisher for Academic ad Scietific Resources) www.saspublisher.com

More information